rohrpost

A commandline mail client to change the world as we see it.
git clone git://r-36.net/rohrpost

commit a070db77245acdab7cee5aa2d67e145959aed08c
parent c1a076f8bfe69c7c5f741798f8831de09f9f102a
Author: Christoph Lohmann <20h@r-36.net>
Date:   Fri, 21 Dec 2012 18:01:59 +0100

Moving the RFCs to their own rfc folder.

Diffstat:
add.c | 2+-
proto/rfc1341.txt | 5265-------------------------------------------------------------------------------
proto/rfc2045.txt | 1739-------------------------------------------------------------------------------
proto/rfc2046.txt | 2467------------------------------------------------------------------------------
proto/rfc2047.txt | 843-------------------------------------------------------------------------------
proto/rfc2048.txt | 1180-------------------------------------------------------------------------------
proto/rfc2049.txt | 1347-------------------------------------------------------------------------------
proto/rfc2183.txt | 675-------------------------------------------------------------------------------
proto/rfc2231.txt | 563-------------------------------------------------------------------------------
proto/rfc2387.txt | 563-------------------------------------------------------------------------------
proto/rfc2425.txt | 1851-------------------------------------------------------------------------------
proto/rfc2426.txt | 2355-------------------------------------------------------------------------------
proto/rfc2595.txt | 843-------------------------------------------------------------------------------
proto/rfc2646.txt | 787-------------------------------------------------------------------------------
proto/rfc2822.txt | 2859-------------------------------------------------------------------------------
proto/rfc3501.txt | 6051-------------------------------------------------------------------------------
proto/rfc4616.txt | 619-------------------------------------------------------------------------------
proto/rfc5256.txt | 1067-------------------------------------------------------------------------------
proto/rfc5322.txt | 3195-------------------------------------------------------------------------------
proto/rfc5804.txt | 2747-------------------------------------------------------------------------------
proto/rfc822.txt | 2901-------------------------------------------------------------------------------
proto/sieve/rfc3028.txt | 2019-------------------------------------------------------------------------------
proto/sieve/rfc3431.txt | 451-------------------------------------------------------------------------------
proto/sieve/rfc5231.txt | 507-------------------------------------------------------------------------------
proto/sieve/rfc5260.txt | 731-------------------------------------------------------------------------------
proto/sieve/rfc5437.txt | 787-------------------------------------------------------------------------------
proto/sieve/rfc5804.txt | 2747-------------------------------------------------------------------------------
rfc/rfc1341.txt | 5265+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2045.txt | 1739+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2046.txt | 2467++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2047.txt | 843+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2048.txt | 1180+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2049.txt | 1347+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2183.txt | 675+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2231.txt | 563+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2387.txt | 563+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2425.txt | 1851+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2426.txt | 2355+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2595.txt | 843+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2646.txt | 787+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2821.txt | 4427+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc2822.txt | 2859+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc3501.txt | 6051+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc4616.txt | 619+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc5256.txt | 1067+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc5322.txt | 3195+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc5804.txt | 2747+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/rfc822.txt | 2901+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/sieve/rfc3028.txt | 2019+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/sieve/rfc3431.txt | 451+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/sieve/rfc5231.txt | 507+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/sieve/rfc5260.txt | 731+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/sieve/rfc5437.txt | 787+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rfc/sieve/rfc5804.txt | 2747+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
54 files changed, 51587 insertions(+), 47160 deletions(-)

diff --git a/add.c b/add.c
@@ -88,7 +88,7 @@ addmain(int argc, char *argv[])
 	if (flags != NULL) {
 		flagl = flag_sanitize(flags);
 		if (flagl == NULL)
-			die("Flag parameter seems to be invalid.");
+			die("Flag parameter seems to be invalid.\n");
 	}
 
 	cfg = config_init(cfgn);
diff --git a/proto/rfc1341.txt b/proto/rfc1341.txt
@@ -1,5265 +0,0 @@
- - - - - - - Network Working Group N. Borenstein, Bellcore - Request for Comments: 1341 N. Freed, Innosoft - June 1992 - - - - MIME (Multipurpose Internet Mail Extensions): - - - Mechanisms for Specifying and Describing - the Format of Internet Message Bodies - - - Status of this Memo - - This RFC specifies an IAB standards track protocol for the - Internet community, and requests discussion and suggestions - for improvements. Please refer to the current edition of - the "IAB Official Protocol Standards" for the - standardization state and status of this protocol. - Distribution of this memo is unlimited. - - Abstract - - RFC 822 defines a message representation protocol which - specifies considerable detail about message headers, but - which leaves the message content, or message body, as flat - ASCII text. This document redefines the format of message - bodies to allow multi-part textual and non-textual message - bodies to be represented and exchanged without loss of - information. This is based on earlier work documented in - RFC 934 and RFC 1049, but extends and revises that work. - Because RFC 822 said so little about message bodies, this - document is largely orthogonal to (rather than a revision - of) RFC 822. 
- - In particular, this document is designed to provide - facilities to include multiple objects in a single message, - to represent body text in character sets other than US- - ASCII, to represent formatted multi-font text messages, to - represent non-textual material such as images and audio - fragments, and generally to facilitate later extensions - defining new types of Internet mail for use by cooperating - mail agents. - - This document does NOT extend Internet mail header fields to - permit anything other than US-ASCII text data. It is - recognized that such extensions are necessary, and they are - the subject of a companion document [RFC -1342]. - - A table of contents appears at the end of this document. - - - - - - - Borenstein & Freed [Page i] - - - - - - - - 1 Introduction - - Since its publication in 1982, RFC 822 [RFC-822] has defined - the standard format of textual mail messages on the - Internet. Its success has been such that the RFC 822 format - has been adopted, wholly or partially, well beyond the - confines of the Internet and the Internet SMTP transport - defined by RFC 821 [RFC-821]. As the format has seen wider - use, a number of limitations have proven increasingly - restrictive for the user community. - - RFC 822 was intended to specify a format for text messages. - As such, non-text messages, such as multimedia messages that - might include audio or images, are simply not mentioned. - Even in the case of text, however, RFC 822 is inadequate for - the needs of mail users whose languages require the use of - character sets richer than US ASCII [US-ASCII]. Since RFC - 822 does not specify mechanisms for mail containing audio, - video, Asian language text, or even text in most European - languages, additional specifications are needed - - One of the notable limitations of RFC 821/822 based mail - systems is the fact that they limit the contents of - electronic mail messages to relatively short lines of - seven-bit ASCII. 
This forces users to convert any non- - textual data that they may wish to send into seven-bit bytes - representable as printable ASCII characters before invoking - a local mail UA (User Agent, a program with which human - users send and receive mail). Examples of such encodings - currently used in the Internet include pure hexadecimal, - uuencode, the 3-in-4 base 64 scheme specified in RFC 1113, - the Andrew Toolkit Representation [ATK], and many others. - - The limitations of RFC 822 mail become even more apparent as - gateways are designed to allow for the exchange of mail - messages between RFC 822 hosts and X.400 hosts. X.400 [X400] - specifies mechanisms for the inclusion of non-textual body - parts within electronic mail messages. The current - standards for the mapping of X.400 messages to RFC 822 - messages specify that either X.400 non-textual body parts - should be converted to (not encoded in) an ASCII format, or - that they should be discarded, notifying the RFC 822 user - that discarding has occurred. This is clearly undesirable, - as information that a user may wish to receive is lost. - Even though a user's UA may not have the capability of - dealing with the non-textual body part, the user might have - some mechanism external to the UA that can extract useful - information from the body part. Moreover, it does not allow - for the fact that the message may eventually be gatewayed - back into an X.400 message handling system (i.e., the X.400 - message is "tunneled" through Internet mail), where the - non-textual information would definitely become useful - again. - - - - - Borenstein & Freed [Page 1] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - This document describes several mechanisms that combine to - solve most of these problems without introducing any serious - incompatibilities with the existing world of RFC 822 mail. - In particular, it describes: - - 1. 
A MIME-Version header field, which uses a version number - to declare a message to be conformant with this - specification and allows mail processing agents to - distinguish between such messages and those generated - by older or non-conformant software, which is presumed - to lack such a field. - - 2. A Content-Type header field, generalized from RFC 1049 - [RFC-1049], which can be used to specify the type and - subtype of data in the body of a message and to fully - specify the native representation (encoding) of such - data. - - 2.a. A "text" Content-Type value, which can be used to - represent textual information in a number of - character sets and formatted text description - languages in a standardized manner. - - 2.b. A "multipart" Content-Type value, which can be - used to combine several body parts, possibly of - differing types of data, into a single message. - - 2.c. An "application" Content-Type value, which can be - used to transmit application data or binary data, - and hence, among other uses, to implement an - electronic mail file transfer service. - - 2.d. A "message" Content-Type value, for encapsulating - a mail message. - - 2.e An "image" Content-Type value, for transmitting - still image (picture) data. - - 2.f. An "audio" Content-Type value, for transmitting - audio or voice data. - - 2.g. A "video" Content-Type value, for transmitting - video or moving image data, possibly with audio as - part of the composite video data format. - - 3. A Content-Transfer-Encoding header field, which can be - used to specify an auxiliary encoding that was applied - to the data in order to allow it to pass through mail - transport mechanisms which may have data or character - set limitations. - - 4. Two optional header fields that can be used to further - describe the data in a message body, the Content-ID and - Content-Description header fields. 
- - - - Borenstein & Freed [Page 2] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - MIME has been carefully designed as an extensible mechanism, - and it is expected that the set of content-type/subtype - pairs and their associated parameters will grow - significantly with time. Several other MIME fields, notably - including character set names, are likely to have new values - defined over time. In order to ensure that the set of such - values is developed in an orderly, well-specified, and - public manner, MIME defines a registration process which - uses the Internet Assigned Numbers Authority (IANA) as a - central registry for such values. Appendix F provides - details about how IANA registration is accomplished. - - Finally, to specify and promote interoperability, Appendix A - of this document provides a basic applicability statement - for a subset of the above mechanisms that defines a minimal - level of "conformance" with this document. - - HISTORICAL NOTE: Several of the mechanisms described in - this document may seem somewhat strange or even baroque at - first reading. It is important to note that compatibility - with existing standards AND robustness across existing - practice were two of the highest priorities of the working - group that developed this document. In particular, - compatibility was always favored over elegance. - - 2 Notations, Conventions, and Generic BNF Grammar - - This document is being published in two versions, one as - plain ASCII text and one as PostScript. The latter is - recommended, though the textual contents are identical. An - Andrew-format copy of this document is also available from - the first author (Borenstein). - - Although the mechanisms specified in this document are all - described in prose, most are also described formally in the - modified BNF notation of RFC 822. 
Implementors will need to - be familiar with this notation in order to understand this - specification, and are referred to RFC 822 for a complete - explanation of the modified BNF notation. - - Some of the modified BNF in this document makes reference to - syntactic entities that are defined in RFC 822 and not in - this document. A complete formal grammar, then, is obtained - by combining the collected grammar appendix of this document - with that of RFC 822. - - The term CRLF, in this document, refers to the sequence of - the two ASCII characters CR (13) and LF (10) which, taken - together, in this order, denote a line break in RFC 822 - mail. - - The term "character set", wherever it is used in this - document, refers to a coded character set, in the sense of - ISO character set standardization work, and must not be - - - - Borenstein & Freed [Page 3] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - misinterpreted as meaning "a set of characters." - - The term "message", when not further qualified, means either - the (complete or "top-level") message being transferred on a - network, or a message encapsulated in a body of type - "message". - - The term "body part", in this document, means one of the - parts of the body of a multipart entity. A body part has a - header and a body, so it makes sense to speak about the body - of a body part. - - The term "entity", in this document, means either a message - or a body part. All kinds of entities share the property - that they have a header and a body. - - The term "body", when not further qualified, means the body - of an entity, that is the body of either a message or of a - body part. - - Note : the previous four definitions are clearly circular. - This is unavoidable, since the overal structure of a MIME - message is indeed recursive. - - In this document, all numeric and octet values are given in - decimal notation. 
- - It must be noted that Content-Type values, subtypes, and - parameter names as defined in this document are case- - insensitive. However, parameter values are case-sensitive - unless otherwise specified for the specific parameter. - - FORMATTING NOTE: This document has been carefully formatted - for ease of reading. The PostScript version of this - document, in particular, places notes like this one, which - may be skipped by the reader, in a smaller, italicized, - font, and indents it as well. In the text version, only the - indentation is preserved, so if you are reading the text - version of this you might consider using the PostScript - version instead. However, all such notes will be indented - and preceded by "NOTE:" or some similar introduction, even - in the text version. - - The primary purpose of these non-essential notes is to - convey information about the rationale of this document, or - to place this document in the proper historical or - evolutionary context. Such information may be skipped by - those who are focused entirely on building a compliant - implementation, but may be of use to those who wish to - understand why this document is written as it is. - - For ease of recognition, all BNF definitions have been - placed in a fixed-width font in the PostScript version of - this document. - - - - Borenstein & Freed [Page 4] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - 3 The MIME-Version Header Field - - Since RFC 822 was published in 1982, there has really been - only one format standard for Internet messages, and there - has been little perceived need to declare the format - standard in use. This document is an independent document - that complements RFC 822. 
Although the extensions in this - document have been defined in such a way as to be compatible - with RFC 822, there are still circumstances in which it - might be desirable for a mail-processing agent to know - whether a message was composed with the new standard in - mind. - - Therefore, this document defines a new header field, "MIME- - Version", which is to be used to declare the version of the - Internet message body format standard in use. - - Messages composed in accordance with this document MUST - include such a header field, with the following verbatim - text: - - MIME-Version: 1.0 - - The presence of this header field is an assertion that the - message has been composed in compliance with this document. - - Since it is possible that a future document might extend the - message format standard again, a formal BNF is given for the - content of the MIME-Version field: - - MIME-Version := text - - Thus, future format specifiers, which might replace or - extend "1.0", are (minimally) constrained by the definition - of "text", which appears in RFC 822. - - Note that the MIME-Version header field is required at the - top level of a message. It is not required for each body - part of a multipart entity. It is required for the embedded - headers of a body of type "message" if and only if the - embedded message is itself claimed to be MIME-compliant. - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 5] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - 4 The Content-Type Header Field - - The purpose of the Content-Type field is to describe the - data contained in the body fully enough that the receiving - user agent can pick an appropriate agent or mechanism to - present the data to the user, or otherwise deal with the - data in an appropriate manner. - - HISTORICAL NOTE: The Content-Type header field was first - defined in RFC 1049. 
RFC 1049 Content-types used a simpler - and less powerful syntax, but one that is largely compatible - with the mechanism given here. - - The Content-Type header field is used to specify the nature - of the data in the body of an entity, by giving type and - subtype identifiers, and by providing auxiliary information - that may be required for certain types. After the type and - subtype names, the remainder of the header field is simply a - set of parameters, specified in an attribute/value notation. - The set of meaningful parameters differs for the different - types. The ordering of parameters is not significant. - Among the defined parameters is a "charset" parameter by - which the character set used in the body may be declared. - Comments are allowed in accordance with RFC 822 rules for - structured header fields. - - In general, the top-level Content-Type is used to declare - the general type of data, while the subtype specifies a - specific format for that type of data. Thus, a Content-Type - of "image/xyz" is enough to tell a user agent that the data - is an image, even if the user agent has no knowledge of the - specific image format "xyz". Such information can be used, - for example, to decide whether or not to show a user the raw - data from an unrecognized subtype -- such an action might be - reasonable for unrecognized subtypes of text, but not for - unrecognized subtypes of image or audio. For this reason, - registered subtypes of audio, image, text, and video, should - not contain embedded information that is really of a - different type. Such compound types should be represented - using the "multipart" or "application" types. - - Parameters are modifiers of the content-subtype, and do not - fundamentally affect the requirements of the host system. - Although most parameters make sense only with certain - content-types, others are "global" in the sense that they - might apply to any subtype. 
For example, the "boundary" - parameter makes sense only for the "multipart" content-type, - but the "charset" parameter might make sense with several - content-types. - - An initial set of seven Content-Types is defined by this - document. This set of top-level names is intended to be - substantially complete. It is expected that additions to - the larger set of supported types can generally be - - - - Borenstein & Freed [Page 6] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - accomplished by the creation of new subtypes of these - initial types. In the future, more top-level types may be - defined only by an extension to this standard. If another - primary type is to be used for any reason, it must be given - a name starting with "X-" to indicate its non-standard - status and to avoid a potential conflict with a future - official name. - - In the Extended BNF notation of RFC 822, a Content-Type - header field value is defined as follows: - - Content-Type := type "/" subtype *[";" parameter] - - type := "application" / "audio" - / "image" / "message" - / "multipart" / "text" - / "video" / x-token - - x-token := <The two characters "X-" followed, with no - intervening white space, by any token> - - subtype := token - - parameter := attribute "=" value - - attribute := token - - value := token / quoted-string - - token := 1*<any CHAR except SPACE, CTLs, or tspecials> - - tspecials := "(" / ")" / "<" / ">" / "@" ; Must be in - / "," / ";" / ":" / "\" / <"> ; quoted-string, - / "/" / "[" / "]" / "?" / "." ; to use within - / "=" ; parameter values - - Note that the definition of "tspecials" is the same as the - RFC 822 definition of "specials" with the addition of the - three characters "/", "?", and "=". - - Note also that a subtype specification is MANDATORY. There - are no default subtypes. - - The type, subtype, and parameter names are not case - sensitive. For example, TEXT, Text, and TeXt are all - equivalent. 
Parameter values are normally case sensitive, - but certain parameters are interpreted to be case- - insensitive, depending on the intended use. (For example, - multipart boundaries are case-sensitive, but the "access- - type" for message/External-body is not case-sensitive.) - - Beyond this syntax, the only constraint on the definition of - subtype names is the desire that their uses must not - conflict. That is, it would be undesirable to have two - - - - Borenstein & Freed [Page 7] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - different communities using "Content-Type: - application/foobar" to mean two different things. The - process of defining new content-subtypes, then, is not - intended to be a mechanism for imposing restrictions, but - simply a mechanism for publicizing the usages. There are, - therefore, two acceptable mechanisms for defining new - Content-Type subtypes: - - 1. Private values (starting with "X-") may be - defined bilaterally between two cooperating - agents without outside registration or - standardization. - - 2. New standard values must be documented, - registered with, and approved by IANA, as - described in Appendix F. Where intended for - public use, the formats they refer to must - also be defined by a published specification, - and possibly offered for standardization. - - The seven standard initial predefined Content-Types are - detailed in the bulk of this document. They are: - - text -- textual information. The primary subtype, - "plain", indicates plain (unformatted) text. No - special software is required to get the full - meaning of the text, aside from support for the - indicated character set. Subtypes are to be used - for enriched text in forms where application - software may enhance the appearance of the text, - but such software must not be required in order to - get the general idea of the content. Possible - subtypes thus include any readable word processor - format. 
A very simple and portable subtype, - richtext, is defined in this document. - multipart -- data consisting of multiple parts of - independent data types. Four initial subtypes - are defined, including the primary "mixed" - subtype, "alternative" for representing the same - data in multiple formats, "parallel" for parts - intended to be viewed simultaneously, and "digest" - for multipart entities in which each part is of - type "message". - message -- an encapsulated message. A body of - Content-Type "message" is itself a fully formatted - RFC 822 conformant message which may contain its - own different Content-Type header field. The - primary subtype is "rfc822". The "partial" - subtype is defined for partial messages, to permit - the fragmented transmission of bodies that are - thought to be too large to be passed through mail - transport facilities. Another subtype, - "External-body", is defined for specifying large - bodies by reference to an external data source. - - - - Borenstein & Freed [Page 8] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - image -- image data. Image requires a display device - (such as a graphical display, a printer, or a FAX - machine) to view the information. Initial - subtypes are defined for two widely-used image - formats, jpeg and gif. - audio -- audio data, with initial subtype "basic". - Audio requires an audio output device (such as a - speaker or a telephone) to "display" the contents. - video -- video data. Video requires the capability to - display moving images, typically including - specialized hardware and software. The initial - subtype is "mpeg". - application -- some other kind of data, typically - either uninterpreted binary data or information to - be processed by a mail-based application. 
The - primary subtype, "octet-stream", is to be used in - the case of uninterpreted binary data, in which - case the simplest recommended action is to offer - to write the information into a file for the user. - Two additional subtypes, "ODA" and "PostScript", - are defined for transporting ODA and PostScript - documents in bodies. Other expected uses for - "application" include spreadsheets, data for - mail-based scheduling systems, and languages for - "active" (computational) email. (Note that active - email entails several securityconsiderations, - which are discussed later in this memo, - particularly in the context of - application/PostScript.) - - Default RFC 822 messages are typed by this protocol as plain - text in the US-ASCII character set, which can be explicitly - specified as "Content-type: text/plain; charset=us-ascii". - If no Content-Type is specified, either by error or by an - older user agent, this default is assumed. In the presence - of a MIME-Version header field, a receiving User Agent can - also assume that plain US-ASCII text was the sender's - intent. In the absence of a MIME-Version specification, - plain US-ASCII text must still be assumed, but the sender's - intent might have been otherwise. - - RATIONALE: In the absence of any Content-Type header field - or MIME-Version header field, it is impossible to be certain - that a message is actually text in the US-ASCII character - set, since it might well be a message that, using the - conventions that predate this document, includes text in - another character set or non-textual data in a manner that - cannot be automatically recognized (e.g., a uuencoded - compressed UNIX tar file). Although there is no fully - acceptable alternative to treating such untyped messages as - "text/plain; charset=us-ascii", implementors should remain - aware that if a message lacks both the MIME-Version and the - Content-Type header fields, it may in practice contain - almost anything. 
- - - - Borenstein & Freed [Page 9] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - It should be noted that the list of Content-Type values - given here may be augmented in time, via the mechanisms - described above, and that the set of subtypes is expected to - grow substantially. - - When a mail reader encounters mail with an unknown Content- - type value, it should generally treat it as equivalent to - "application/octet-stream", as described later in this - document. - - 5 The Content-Transfer-Encoding Header Field - - Many Content-Types which could usefully be transported via - email are represented, in their "natural" format, as 8-bit - character or binary data. Such data cannot be transmitted - over some transport protocols. For example, RFC 821 - restricts mail messages to 7-bit US-ASCII data with 1000 - character lines. - - It is necessary, therefore, to define a standard mechanism - for re-encoding such data into a 7-bit short-line format. - This document specifies that such encodings will be - indicated by a new "Content-Transfer-Encoding" header field. - The Content-Transfer-Encoding field is used to indicate the - type of transformation that has been used in order to - represent the body in an acceptable manner for transport. - - Unlike Content-Types, a proliferation of Content-Transfer- - Encoding values is undesirable and unnecessary. However, - establishing only a single Content-Transfer-Encoding - mechanism does not seem possible. There is a tradeoff - between the desire for a compact and efficient encoding of - largely-binary data and the desire for a readable encoding - of data that is mostly, but not entirely, 7-bit data. For - this reason, at least two encoding mechanisms are necessary: - a "readable" encoding and a "dense" encoding. 
-
-    The Content-Transfer-Encoding field is designed to specify
-    an invertible mapping between the "native" representation of
-    a type of data and a representation that can be readily
-    exchanged using 7 bit mail transport protocols, such as
-    those defined by RFC 821 (SMTP). This field has not been
-    defined by any previous standard. The field's value is a
-    single token specifying the type of encoding, as enumerated
-    below. Formally:
-
-    Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" /
-                                 "8BIT" / "7BIT" /
-                                 "BINARY" / x-token
-
-    These values are not case sensitive. That is, Base64 and
-    BASE64 and bAsE64 are all equivalent. An encoding type of
-    7BIT requires that the body is already in a seven-bit
-    mail-ready representation. This is the default value -- that
-    is, "Content-Transfer-Encoding: 7BIT" is assumed if the
-    Content-Transfer-Encoding header field is not present.
-
-    The values "8bit", "7bit", and "binary" all imply that NO
-    encoding has been performed. However, they are potentially
-    useful as indications of the kind of data contained in the
-    object, and therefore of the kind of encoding that might
-    need to be performed for transmission in a given transport
-    system. "7bit" means that the data is all represented as
-    short lines of US-ASCII data. "8bit" means that the lines
-    are short, but there may be non-ASCII characters (octets
-    with the high-order bit set). "Binary" means that not only
-    may non-ASCII characters be present, but also that the lines
-    are not necessarily short enough for SMTP transport.
-
-    The difference between "8bit" (or any other conceivable
-    bit-width token) and the "binary" token is that "binary"
-    does not require adherence to any limits on line length or
-    to the SMTP CRLF semantics, while the bit-width tokens do
-    require such adherence.
-    If the body contains data in any
-    bit-width other than 7-bit, the appropriate bit-width
-    Content-Transfer-Encoding token must be used (e.g., "8bit"
-    for unencoded 8 bit wide data). If the body contains binary
-    data, the "binary" Content-Transfer-Encoding token must be
-    used.
-
-    NOTE: The distinction between the Content-Transfer-Encoding
-    values of "binary," "8bit," etc. may seem unimportant, in
-    that all of them really mean "none" -- that is, there has
-    been no encoding of the data for transport. However, clear
-    labeling will be of enormous value to gateways between
-    future mail transport systems with differing capabilities in
-    transporting data that do not meet the restrictions of RFC
-    821 transport.
-
-    As of the publication of this document, there are no
-    standardized Internet transports for which it is legitimate
-    to include unencoded 8-bit or binary data in mail bodies.
-    Thus there are no circumstances in which the "8bit" or
-    "binary" Content-Transfer-Encoding is actually legal on the
-    Internet. However, in the event that 8-bit or binary mail
-    transport becomes a reality in Internet mail, or when this
-    document is used in conjunction with any other 8-bit or
-    binary-capable transport mechanism, 8-bit or binary bodies
-    should be labeled as such using this mechanism.
-
-    NOTE: The five values defined for the
-    Content-Transfer-Encoding field imply nothing about the
-    Content-Type other than the algorithm by which it was encoded
-    or the transport system requirements if unencoded.
-
-    Implementors may, if necessary, define new
-    Content-Transfer-Encoding values, but must use an x-token,
-    which is a name prefixed by "X-" to indicate its non-standard
-    status, e.g., "Content-Transfer-Encoding: x-my-new-encoding".
-    However, unlike Content-Types and subtypes, the creation of
-    new Content-Transfer-Encoding values is explicitly and
-    strongly discouraged, as it seems likely to hinder
-    interoperability with little potential benefit. Their use
-    is allowed only as the result of an agreement between
-    cooperating user agents.
-
-    If a Content-Transfer-Encoding header field appears as part
-    of a message header, it applies to the entire body of that
-    message. If a Content-Transfer-Encoding header field
-    appears as part of a body part's headers, it applies only to
-    the body of that body part. If an entity is of type
-    "multipart" or "message", the Content-Transfer-Encoding is
-    not permitted to have any value other than a bit width
-    (e.g., "7bit", "8bit", etc.) or "binary".
-
-    It should be noted that email is character-oriented, so that
-    the mechanisms described here are mechanisms for encoding
-    arbitrary byte streams, not bit streams. If a bit stream is
-    to be encoded via one of these mechanisms, it must first be
-    converted to an 8-bit byte stream using the network standard
-    bit order ("big-endian"), in which the earlier bits in a
-    stream become the higher-order bits in a byte. A bit stream
-    not ending at an 8-bit boundary must be padded with zeroes.
-    This document provides a mechanism for noting the addition
-    of such padding in the case of the application Content-Type,
-    which has a "padding" parameter.
-
-    The encoding mechanisms defined here explicitly encode all
-    data in ASCII. Thus, for example, suppose an entity has
-    header fields such as:
-
-         Content-Type: text/plain; charset=ISO-8859-1
-         Content-transfer-encoding: base64
-
-    This should be interpreted to mean that the body is a base64
-    ASCII encoding of data that was originally in ISO-8859-1,
-    and will be in that character set again after decoding.
-
-    The following sections will define the two standard encoding
-    mechanisms.
-    The definition of new content-transfer-encodings
-    is explicitly discouraged and should only occur
-    when absolutely necessary. All content-transfer-encoding
-    namespace except that beginning with "X-" is explicitly
-    reserved to the IANA for future use. Private agreements
-    about content-transfer-encodings are also explicitly
-    discouraged.
-
-    Certain Content-Transfer-Encoding values may only be used on
-    certain Content-Types. In particular, it is expressly
-    forbidden to use any encodings other than "7bit", "8bit", or
-    "binary" with any Content-Type that recursively includes
-    other Content-Type fields, notably the "multipart" and
-    "message" Content-Types. All encodings that are desired for
-    bodies of type multipart or message must be done at the
-    innermost level, by encoding the actual body that needs to
-    be encoded.
-
-    NOTE ON ENCODING RESTRICTIONS: Though the prohibition
-    against using content-transfer-encodings on data of type
-    multipart or message may seem overly restrictive, it is
-    necessary to prevent nested encodings, in which data are
-    passed through an encoding algorithm multiple times, and
-    must be decoded multiple times in order to be properly
-    viewed. Nested encodings add considerable complexity to
-    user agents: aside from the obvious efficiency problems
-    with such multiple encodings, they can obscure the basic
-    structure of a message. In particular, they can imply that
-    several decoding operations are necessary simply to find out
-    what types of objects a message contains. Banning nested
-    encodings may complicate the job of certain mail gateways,
-    but this seems less of a problem than the effect of nested
-    encodings on user agents.
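The restriction just described lends itself to a small check. The following is a minimal sketch, assuming a hypothetical helper named cte_allowed that receives the Content-Type and Content-Transfer-Encoding values as plain strings; it treats any "x-" token as permitted on non-composite types, which is a reading of the text above rather than a rule stated in those words.

```python
# Values that imply NO encoding has been performed.
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}
# The two real encodings defined by the memo.
REAL_ENCODINGS = {"base64", "quoted-printable"}


def cte_allowed(content_type: str, cte: str = "7bit") -> bool:
    """Return True if the Content-Transfer-Encoding may label this type.

    Hypothetical helper: "multipart" and "message" entities may only carry
    "7bit", "8bit" or "binary"; other types may also use base64,
    quoted-printable, or a private "x-" token.
    """
    major = content_type.split("/", 1)[0].strip().lower()
    encoding = cte.strip().lower()  # values are not case sensitive
    if major in ("multipart", "message"):
        return encoding in IDENTITY_ENCODINGS
    return (
        encoding in IDENTITY_ENCODINGS
        or encoding in REAL_ENCODINGS
        or encoding.startswith("x-")
    )


if __name__ == "__main__":
    print(cte_allowed("multipart/mixed", "base64"))   # nested encoding
    print(cte_allowed("image/gif", "BASE64"))
```

A composing agent would apply such a check before emitting headers and push any base64 or quoted-printable encoding down to the innermost body parts.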
-
-    NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND
-    CONTENT-TRANSFER-ENCODING: It may seem that the
-    Content-Transfer-Encoding could be inferred from the
-    characteristics of the Content-Type that is to be encoded,
-    or, at the very least, that certain
-    Content-Transfer-Encodings could be mandated
-    for use with specific Content-Types. There are several
-    reasons why this is not the case. First, given the varying
-    types of transports used for mail, some encodings may be
-    appropriate for some Content-Type/transport combinations and
-    not for others. (For example, in an 8-bit transport, no
-    encoding would be required for text in certain character
-    sets, while such encodings are clearly required for 7-bit
-    SMTP.) Second, certain Content-Types may require different
-    types of transfer encoding under different circumstances.
-    For example, many PostScript bodies might consist entirely
-    of short lines of 7-bit data and hence require little or no
-    encoding. Other PostScript bodies (especially those using
-    Level 2 PostScript's binary encoding mechanism) may only be
-    reasonably represented using a binary transport encoding.
-    Finally, since Content-Type is intended to be an open-ended
-    specification mechanism, strict specification of an
-    association between Content-Types and encodings effectively
-    couples the specification of an application protocol with a
-    specific lower-level transport. This is not desirable since
-    the developers of a Content-Type should not have to be aware
-    of all the transports in use and what their limitations are.
-
-    NOTE ON TRANSLATING ENCODINGS: The quoted-printable and
-    base64 encodings are designed so that conversion between
-    them is possible. The only issue that arises in such a
-    conversion is the handling of line breaks. When converting
-    from quoted-printable to base64 a line break must be
-    converted into a CRLF sequence.
-    Similarly, a CRLF sequence
-    in base64 data should be converted to a quoted-printable
-    line break, but ONLY when converting text data.
-
-    NOTE ON CANONICAL ENCODING MODEL: There was some
-    confusion, in earlier drafts of this memo, regarding the
-    model for when email data was to be converted to canonical
-    form and encoded, and in particular how this process would
-    affect the treatment of CRLFs, given that the representation
-    of newlines varies greatly from system to system. For this
-    reason, a canonical model for encoding is presented as
-    Appendix H.
-
-    5.1 Quoted-Printable Content-Transfer-Encoding
-
-    The Quoted-Printable encoding is intended to represent data
-    that largely consists of octets that correspond to printable
-    characters in the ASCII character set. It encodes the data
-    in such a way that the resulting octets are unlikely to be
-    modified by mail transport. If the data being encoded are
-    mostly ASCII text, the encoded form of the data remains
-    largely recognizable by humans. A body which is entirely
-    ASCII may also be encoded in Quoted-Printable to ensure the
-    integrity of the data should the message pass through a
-    character-translating, and/or line-wrapping gateway.
-
-    In this encoding, octets are to be represented as determined
-    by the following rules:
-
-         Rule #1: (General 8-bit representation) Any octet,
-         except those indicating a line break according to the
-         newline convention of the canonical form of the data
-         being encoded, may be represented by an "=" followed by
-         a two digit hexadecimal representation of the octet's
-         value. The digits of the hexadecimal alphabet, for this
-         purpose, are "0123456789ABCDEF". Uppercase letters must
-         be used when sending hexadecimal data, though a robust
-         implementation may choose to recognize lowercase
-         letters on receipt.
-         Thus, for example, the value 12
-         (ASCII form feed) can be represented by "=0C", and the
-         value 61 (ASCII EQUAL SIGN) can be represented by
-         "=3D". Except when the following rules allow an
-         alternative encoding, this rule is mandatory.
-
-         Rule #2: (Literal representation) Octets with decimal
-         values of 33 through 60 inclusive, and 62 through 126,
-         inclusive, MAY be represented as the ASCII characters
-         which correspond to those octets (EXCLAMATION POINT
-         through LESS THAN, and GREATER THAN through TILDE,
-         respectively).
-
-         Rule #3: (White Space): Octets with values of 9 and 32
-         MAY be represented as ASCII TAB (HT) and SPACE
-         characters, respectively, but MUST NOT be so
-         represented at the end of an encoded line. Any TAB (HT)
-         or SPACE characters on an encoded line MUST thus be
-         followed on that line by a printable character. In
-         particular, an "=" at the end of an encoded line,
-         indicating a soft line break (see rule #5) may follow
-         one or more TAB (HT) or SPACE characters. It follows
-         that an octet with value 9 or 32 appearing at the end
-         of an encoded line must be represented according to
-         Rule #1. This rule is necessary because some MTAs
-         (Message Transport Agents, programs which transport
-         messages from one user to another, or perform a part of
-         such transfers) are known to pad lines of text with
-         SPACEs, and others are known to remove "white space"
-         characters from the end of a line. Therefore, when
-         decoding a Quoted-Printable body, any trailing white
-         space on a line must be deleted, as it will necessarily
-         have been added by intermediate transport agents.
-
-         Rule #4 (Line Breaks): A line break in a text body
-         part, independent of what its representation is
-         following the canonical representation of the data
-         being encoded, must be represented by a (RFC 822) line
-         break, which is a CRLF sequence, in the
-         Quoted-Printable encoding. If isolated CRs and LFs, or
-         LF CR and CR LF sequences are allowed to appear in binary
-         data according to the canonical form, they must be
-         represented using the "=0D", "=0A", "=0A=0D" and
-         "=0D=0A" notations respectively.
-
-         Note that many implementations may elect to encode the
-         local representation of various content types directly.
-         In particular, this may apply to plain text material on
-         systems that use newline conventions other than CRLF
-         delimiters. Such an implementation is permissible, but
-         the generation of line breaks must be generalized to
-         account for the case where alternate representations of
-         newline sequences are used.
-
-         Rule #5 (Soft Line Breaks): The Quoted-Printable
-         encoding REQUIRES that encoded lines be no more than 76
-         characters long. If longer lines are to be encoded with
-         the Quoted-Printable encoding, 'soft' line breaks must
-         be used. An equal sign as the last character on an
-         encoded line indicates such a non-significant ('soft')
-         line break in the encoded text. Thus if the "raw" form
-         of the line is a single unencoded line that says:
-
-              Now's the time for all folk to come to the aid of
-              their country.
-
-         This can be represented, in the Quoted-Printable
-         encoding, as
-
-              Now's the time =
-              for all folk to come=
-               to the aid of their country.
-
-         This provides a mechanism with which long lines are
-         encoded in such a way as to be restored by the user
-         agent. The 76 character limit does not count the
-         trailing CRLF, but counts all other characters,
-         including any equal signs.
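Rules #1, #2, #3 and #5 can be sketched for a single line of input as follows. This is an illustrative sketch only: the name qp_encode_line is hypothetical, the caller is assumed to have already split the data at its canonical line breaks (Rule #4), and the soft-break placement is one conservative choice among many that satisfy the 76 character limit.

```python
def qp_encode_line(line: bytes, limit: int = 76) -> list:
    """Encode one line (without its line break) as Quoted-Printable.

    Returns the encoded physical lines; every line but the last ends in
    "=" to mark a soft line break.
    """
    tokens = []
    for index, octet in enumerate(line):
        is_last = index == len(line) - 1
        if 33 <= octet <= 60 or 62 <= octet <= 126:
            tokens.append(chr(octet))          # Rule #2: literal
        elif octet in (9, 32) and not is_last:
            tokens.append(chr(octet))          # Rule #3: inner white space
        else:
            tokens.append("=%02X" % octet)     # Rule #1: uppercase hex

    encoded, current = [], ""
    for token in tokens:
        # Rule #5: keep one column free for the "=" of a soft line break,
        # and never split an "=XX" triplet across lines.
        if len(current) + len(token) > limit - 1:
            encoded.append(current + "=")
            current = ""
        current += token
    encoded.append(current)
    return encoded


if __name__ == "__main__":
    for physical in qp_encode_line(b"caf\xe9 = coffee, " + b"x" * 80):
        print(physical)
```

Joining the returned lines with CRLF, and the results for successive input lines with CRLF as well, would give a complete body under these assumptions.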
-
-    Since the hyphen character ("-") is represented as itself in
-    the Quoted-Printable encoding, care must be taken, when
-    encapsulating a quoted-printable encoded body in a multipart
-    entity, to ensure that the encapsulation boundary does not
-    appear anywhere in the encoded body. (A good strategy is to
-    choose a boundary that includes a character sequence such as
-    "=_" which can never appear in a quoted-printable body. See
-    the definition of multipart messages later in this
-    document.)
-
-    NOTE: The quoted-printable encoding represents something of
-    a compromise between readability and reliability in
-    transport. Bodies encoded with the quoted-printable
-    encoding will work reliably over most mail gateways, but may
-    not work perfectly over a few gateways, notably those
-    involving translation into EBCDIC. (In theory, an EBCDIC
-    gateway could decode a quoted-printable body and re-encode
-    it using base64, but such gateways do not yet exist.) A
-    higher level of confidence is offered by the base64
-    Content-Transfer-Encoding. A way to get reasonably reliable
-    transport through EBCDIC gateways is to also quote the ASCII
-    characters
-
-         !"#$@[\]^`{|}~
-
-    according to rule #1. See Appendix B for more information.
-
-    Because quoted-printable data is generally assumed to be
-    line-oriented, it is to be expected that the breaks between
-    the lines of quoted printable data may be altered in
-    transport, in the same manner that plain text mail has
-    always been altered in Internet mail when passing between
-    systems with differing newline conventions. If such
-    alterations are likely to constitute a corruption of the
-    data, it is probably more sensible to use the base64
-    encoding rather than the quoted-printable encoding.
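The receiving side of the rules in this section can be sketched in a few lines. The function name qp_decode is hypothetical, the input is assumed to be an ASCII string whose encoded lines are separated by CRLF, and a stray "=" that is not followed by two hexadecimal digits is passed through unchanged, which is a choice made here for brevity rather than something the text prescribes.

```python
import re

# Accept lowercase hex digits on receipt, as a robust implementation may.
_HEX_ESCAPE = re.compile(r"=([0-9A-Fa-f]{2})")


def qp_decode(encoded: str) -> bytes:
    """Decode a Quoted-Printable body whose lines are separated by CRLF."""
    lines = encoded.split("\r\n")
    decoded = bytearray()
    for index, line in enumerate(lines):
        # Trailing white space was necessarily added in transport.
        line = line.rstrip(" \t")
        soft_break = line.endswith("=")
        if soft_break:
            line = line[:-1]
        position = 0
        for match in _HEX_ESCAPE.finditer(line):
            decoded += line[position:match.start()].encode("ascii")
            decoded.append(int(match.group(1), 16))
            position = match.end()
        decoded += line[position:].encode("ascii")
        # A hard line break is kept as CRLF; a soft one disappears.
        if not soft_break and index < len(lines) - 1:
            decoded += b"\r\n"
    return bytes(decoded)


if __name__ == "__main__":
    body = "Now's the time =\r\nfor all folk to come=\r\n to the aid of their country."
    print(qp_decode(body).decode("ascii"))
```

Run on the soft line break example of Rule #5, this restores the single raw line.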
-
-    5.2 Base64 Content-Transfer-Encoding
-
-    The Base64 Content-Transfer-Encoding is designed to
-    represent arbitrary sequences of octets in a form that is
-    not humanly readable. The encoding and decoding algorithms
-    are simple, but the encoded data are consistently only about
-    33 percent larger than the unencoded data. This encoding is
-    based on the one used in Privacy Enhanced Mail applications,
-    as defined in RFC 1113. The base64 encoding is adapted
-    from RFC 1113, with one change: base64 eliminates the "*"
-    mechanism for embedded clear text.
-
-    A 65-character subset of US-ASCII is used, enabling 6 bits
-    to be represented per printable character. (The extra 65th
-    character, "=", is used to signify a special processing
-    function.)
-
-    NOTE: This subset has the important property that it is
-    represented identically in all versions of ISO 646,
-    including US ASCII, and all characters in the subset are
-    also represented identically in all versions of EBCDIC.
-    Other popular encodings, such as the encoding used by the
-    UUENCODE utility and the base85 encoding specified as part
-    of Level 2 PostScript, do not share these properties, and
-    thus do not fulfill the portability requirements a binary
-    transport encoding for mail must meet.
-
-    The encoding process represents 24-bit groups of input bits
-    as output strings of 4 encoded characters. Proceeding from
-    left to right, a 24-bit input group is formed by
-    concatenating 3 8-bit input groups. These 24 bits are then
-    treated as 4 concatenated 6-bit groups, each of which is
-    translated into a single digit in the base64 alphabet. When
-    encoding a bit stream via the base64 encoding, the bit
-    stream must be presumed to be ordered with the
-    most-significant-bit first.
-    That is, the first bit in the stream
-    will be the high-order bit in the first byte, and the eighth
-    bit will be the low-order bit in the first byte, and so on.
-
-    Each 6-bit group is used as an index into an array of 64
-    printable characters. The character referenced by the index
-    is placed in the output string. These characters, identified
-    in Table 1, below, are selected so as to be universally
-    representable, and the set excludes characters with
-    particular significance to SMTP (e.g., ".", "CR", "LF") and
-    to the encapsulation boundaries defined in this document
-    (e.g., "-").
-
-                    Table 1: The Base64 Alphabet
-
-     Value Encoding  Value Encoding  Value Encoding  Value Encoding
-         0 A            17 R            34 i            51 z
-         1 B            18 S            35 j            52 0
-         2 C            19 T            36 k            53 1
-         3 D            20 U            37 l            54 2
-         4 E            21 V            38 m            55 3
-         5 F            22 W            39 n            56 4
-         6 G            23 X            40 o            57 5
-         7 H            24 Y            41 p            58 6
-         8 I            25 Z            42 q            59 7
-         9 J            26 a            43 r            60 8
-        10 K            27 b            44 s            61 9
-        11 L            28 c            45 t            62 +
-        12 M            29 d            46 u            63 /
-        13 N            30 e            47 v
-        14 O            31 f            48 w         (pad) =
-        15 P            32 g            49 x
-        16 Q            33 h            50 y
-
-    The output stream (encoded bytes) must be represented in
-    lines of no more than 76 characters each. All line breaks
-    or other characters not found in Table 1 must be ignored by
-    decoding software. In base64 data, characters other than
-    those in Table 1, line breaks, and other white space
-    probably indicate a transmission error, about which a
-    warning message or even a message rejection might be
-    appropriate under some circumstances.
-
-    Special processing is performed if fewer than 24 bits are
-    available at the end of the data being encoded. A full
-    encoding quantum is always completed at the end of a body.
-    When fewer than 24 input bits are available in an input
-    group, zero bits are added (on the right) to form an
-    integral number of 6-bit groups.
-    Output character positions
-    which are not required to represent actual input data are
-    set to the character "=". Since all base64 input is an
-    integral number of octets, only the following cases can
-    arise: (1) the final quantum of encoding input is an
-    integral multiple of 24 bits; here, the final unit of
-    encoded output will be an integral multiple of 4 characters
-    with no "=" padding, (2) the final quantum of encoding input
-    is exactly 8 bits; here, the final unit of encoded output
-    will be two characters followed by two "=" padding
-    characters, or (3) the final quantum of encoding input is
-    exactly 16 bits; here, the final unit of encoded output will
-    be three characters followed by one "=" padding character.
-
-    Care must be taken to use the proper octets for line breaks
-    if base64 encoding is applied directly to text material that
-    has not been converted to canonical form. In particular,
-    text line breaks should be converted into CRLF sequences
-    prior to base64 encoding. The important thing to note is
-    that this may be done directly by the encoder rather than in
-    a prior canonicalization step in some implementations.
-
-    NOTE: There is no need to worry about quoting apparent
-    encapsulation boundaries within base64-encoded parts of
-    multipart entities because no hyphen characters are used in
-    the base64 encoding.
-
-    6 Additional Optional Content- Header Fields
-
-    6.1 Optional Content-ID Header Field
-
-    In constructing a high-level user agent, it may be desirable
-    to allow one body to make reference to another.
-    Accordingly, bodies may be labeled using the "Content-ID"
-    header field, which is syntactically identical to the
-    "Message-ID" header field:
-
-    Content-ID := msg-id
-
-    Like the Message-ID values, Content-ID values must be
-    generated to be as unique as possible.
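Because a Content-ID has the same syntax as a Message-ID, one way to obtain a reasonably unique value is to reuse a Message-ID generator. The sketch below assumes Python's standard email.utils.make_msgid for that purpose; the function name content_id_header and the domain "example.org" are placeholders, not anything the memo specifies.

```python
from email.utils import make_msgid


def content_id_header(domain: str = "example.org") -> str:
    """Build a Content-ID header line with a freshly generated msg-id.

    make_msgid() combines the current time, the process id and random
    bits, and returns the value already enclosed in angle brackets.
    """
    return "Content-ID: " + make_msgid(domain=domain)


if __name__ == "__main__":
    print(content_id_header())
    print(content_id_header())  # a second call yields a different value
```

Another body in the same message could then refer to this part by quoting the generated value.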
-
-    6.2 Optional Content-Description Header Field
-
-    The ability to associate some descriptive information with a
-    given body is often desirable. For example, it may be useful
-    to mark an "image" body as "a picture of the Space Shuttle
-    Endeavour." Such text may be placed in the
-    Content-Description header field.
-
-    Content-Description := *text
-
-    The description is presumed to be given in the US-ASCII
-    character set, although the mechanism specified in
-    [RFC-1342] may be used for non-US-ASCII Content-Description
-    values.
-
-    7 The Predefined Content-Type Values
-
-    This document defines seven initial Content-Type values and
-    an extension mechanism for private or experimental types.
-    Further standard types must be defined by new published
-    specifications. It is expected that most innovation in new
-    types of mail will take place as subtypes of the seven types
-    defined here. The most essential characteristics of the
-    seven content-types are summarized in Appendix G.
-
-    7.1 The Text Content-Type
-
-    The text Content-Type is intended for sending material which
-    is principally textual in form. It is the default
-    Content-Type. A "charset" parameter may be used to indicate
-    the character set of the body text. The primary subtype of
-    text is "plain". This indicates plain (unformatted) text. The
-    default Content-Type for Internet mail is "text/plain;
-    charset=us-ascii".
-
-    Beyond plain text, there are many formats for representing
-    what might be known as "extended text" -- text with embedded
-    formatting and presentation information. An interesting
-    characteristic of many such representations is that they are
-    to some extent readable even without the software that
-    interprets them.
-    It is useful, then, to distinguish them,
-    at the highest level, from such unreadable data as images,
-    audio, or text represented in an unreadable form. In the
-    absence of appropriate interpretation software, it is
-    reasonable to show subtypes of text to the user, while it is
-    not reasonable to do so with most nontextual data.
-
-    Such formatted textual data should be represented using
-    subtypes of text. Plausible subtypes of text are typically
-    given by the common name of the representation format, e.g.,
-    "text/richtext".
-
-    7.1.1 The charset parameter
-
-    A critical parameter that may be specified in the
-    Content-Type field for text data is the character set. This
-    is specified with a "charset" parameter, as in:
-
-         Content-type: text/plain; charset=us-ascii
-
-    Unlike some other parameter values, the values of the
-    charset parameter are NOT case sensitive. The default
-    character set, which must be assumed in the absence of a
-    charset parameter, is US-ASCII.
-
-    An initial list of predefined character set names can be
-    found at the end of this section. Additional character sets
-    may be registered with IANA as described in Appendix F,
-    although the standardization of their use requires the usual
-    IAB review and approval. Note that if the specified
-    character set includes 8-bit data, a
-    Content-Transfer-Encoding header field and a corresponding
-    encoding on the data are required in order to transmit the
-    body via some mail transfer protocols, such as SMTP.
-
-    The default character set, US-ASCII, has been the subject of
-    some confusion and ambiguity in the past. Not only were
-    there some ambiguities in the definition, there have been
-    wide variations in practice.
-    In order to eliminate such
-    ambiguity and variations in the future, it is strongly
-    recommended that new user agents explicitly specify a
-    character set via the Content-Type header field. "US-ASCII"
-    does not indicate an arbitrary seven-bit character code, but
-    specifies that the body uses character coding that uses the
-    exact correspondence of codes to characters specified in
-    ASCII. National use variations of ISO 646 [ISO-646] are NOT
-    ASCII and their use in Internet mail is explicitly
-    discouraged. The omission of the ISO 646 character set is
-    deliberate in this regard. The character set name of
-    "US-ASCII" explicitly refers to ANSI X3.4-1986 [US-ASCII]
-    only. The character set name "ASCII" is reserved and must not
-    be used for any purpose.
-
-    NOTE: RFC 821 explicitly specifies "ASCII", and references
-    an earlier version of the American Standard. Insofar as one
-    of the purposes of specifying a Content-Type and character
-    set is to permit the receiver to unambiguously determine how
-    the sender intended the coded message to be interpreted,
-    assuming anything other than "strict ASCII" as the default
-    would risk unintentional and incompatible changes to the
-    semantics of messages now being transmitted. This also
-    implies that messages containing characters coded according
-    to national variations on ISO 646, or using code-switching
-    procedures (e.g., those of ISO 2022), as well as 8-bit or
-    multiple octet character encodings MUST use an appropriate
-    character set specification to be consistent with this
-    specification.
-
-    The complete US-ASCII character set is listed in [US-ASCII].
-    Note that the control characters including DEL (0-31, 127)
-    have no defined meaning apart from the combination CRLF
-    (ASCII values 13 and 10) indicating a new line.
-    Two of the
-    characters have de facto meanings in wide use: FF (12) often
-    means "start subsequent text on the beginning of a new
-    page"; and TAB or HT (9) often (though not always) means
-    "move the cursor to the next available column after the
-    current position where the column number is a multiple of 8
-    (counting the first column as column 0)." Apart from this,
-    any use of the control characters or DEL in a body must be
-    part of a private agreement between the sender and
-    recipient. Such private agreements are discouraged and
-    should be replaced by the other capabilities of this
-    document.
-
-    NOTE: Beyond US-ASCII, an enormous proliferation of
-    character sets is possible. It is the opinion of the IETF
-    working group that a large number of character sets is NOT a
-    good thing. We would prefer to specify a single character
-    set that can be used universally for representing all of the
-    world's languages in electronic mail. Unfortunately,
-    existing practice in several communities seems to point to
-    the continued use of multiple character sets in the near
-    future. For this reason, we define names for a small number
-    of character sets for which a strong constituent base
-    exists. It is our hope that ISO 10646 or some other
-    effort will eventually define a single world character set
-    which can then be specified for use in Internet mail, but in
-    advance of that definition we cannot specify the use of
-    ISO 10646, Unicode, or any other character set whose
-    definition is, as of this writing, incomplete.
-
-    The defined charset values are:
-
-         US-ASCII -- as defined in [US-ASCII].
-
-         ISO-8859-X -- where "X" is to be replaced, as
-              necessary, for the parts of ISO-8859
-              [ISO-8859].
-              Note that the ISO 646 character sets
-              have deliberately been omitted in favor of
-              their 8859 replacements, which are the
-              designated character sets for Internet mail.
-              As of the publication of this document, the
-              legitimate values for "X" are the digits 1
-              through 9.
-
-    Note that the character set used, if anything other than
-    US-ASCII, must always be explicitly specified in the
-    Content-Type field.
-
-    No other character set name may be used in Internet mail
-    without the publication of a formal specification and its
-    registration with IANA as described in Appendix F, or by
-    private agreement, in which case the character set name must
-    begin with "X-".
-
-    Implementors are discouraged from defining new character
-    sets for mail use unless absolutely necessary.
-
-    The "charset" parameter has been defined primarily for the
-    purpose of textual data, and is described in this section
-    for that reason. However, it is conceivable that
-    non-textual data might also wish to specify a charset value
-    for some purpose, in which case the same syntax and values
-    should be used.
-
-    In general, mail-sending software should always use the
-    "lowest common denominator" character set possible. For
-    example, if a body contains only US-ASCII characters, it
-    should be marked as being in the US-ASCII character set, not
-    ISO-8859-1, which, like all the ISO-8859 family of character
-    sets, is a superset of US-ASCII. More generally, if a
-    widely-used character set is a subset of another character
-    set, and a body contains only characters in the widely-used
-    subset, it should be labeled as being in that subset. This
-    will increase the chances that the recipient will be able to
-    view the mail correctly.
-
-    7.1.2 The Text/plain subtype
-
-    The primary subtype of text is "plain". This indicates
-    plain (unformatted) text.
-    The default Content-Type for
-    Internet mail, "text/plain; charset=us-ascii", describes
-    existing Internet practice, that is, it is the type of body
-    defined by RFC 822.
-
-    7.1.3 The Text/richtext subtype
-
-    In order to promote the wider interoperability of simple
-    formatted text, this document defines an extremely simple
-    subtype of "text", the "richtext" subtype. This subtype was
-    designed to meet the following criteria:
-
-         1. The syntax must be extremely simple to parse,
-         so that even teletype-oriented mail systems can
-         easily strip away the formatting information and
-         leave only the readable text.
-
-         2. The syntax must be extensible to allow for new
-         formatting commands that are deemed essential.
-
-         3. The capabilities must be extremely limited, to
-         ensure that it can represent no more than is
-         likely to be representable by the user's primary
-         word processor. While this limits what can be
-         sent, it increases the likelihood that what is
-         sent can be properly displayed.
-
-         4. The syntax must be compatible with SGML, so
-         that, with an appropriate DTD (Document Type
-         Definition, the standard mechanism for defining a
-         document type using SGML), a general SGML parser
-         could be made to parse richtext. However, despite
-         this compatibility, the syntax should be far
-         simpler than full SGML, so that no SGML knowledge
-         is required in order to implement it.
-
-    The syntax of "richtext" is very simple. It is assumed, at
-    the top-level, to be in the US-ASCII character set, unless
-    of course a different charset parameter was specified in the
-    Content-type field. All characters represent themselves,
-    with the exception of the "<" character (ASCII 60), which is
-    used to mark the beginning of a formatting command.
-    Formatting instructions consist of formatting commands
-    surrounded by angle brackets ("<>", ASCII 60 and 62).
Each - formatting command may be no more than 40 characters in - length, all in US-ASCII, restricted to the alphanumeric and - hyphen ("-") characters. Formatting commands may be preceded - by a forward slash or solidus ("/", ASCII 47), making them - negations, and such negations must always exist to balance - the initial opening commands, except as noted below. Thus, - if the formatting command "<bold>" appears at some point, - there must later be a "</bold>" to balance it. There are - only three exceptions to this "balancing" rule: First, the - command "<lt>" is used to represent a literal "<" character. - Second, the command "<nl>" is used to represent a required - line break. (Otherwise, CRLFs in the data are treated as - equivalent to a single SPACE character.) Finally, the - command "<np>" is used to represent a page break. (NOTE: - The 40 character limit on formatting commands does not - include the "<", ">", or "/" characters that might be - attached to such commands.) - - Initially defined formatting commands, not all of which will - be implemented by all richtext implementations, include: - - Bold -- causes the subsequent text to be in a bold - font. - Italic -- causes the subsequent text to be in an italic - font. - Fixed -- causes the subsequent text to be in a fixed - width font. - Smaller -- causes the subsequent text to be in a - smaller font. - Bigger -- causes the subsequent text to be in a bigger - font. - Underline -- causes the subsequent text to be - underlined. - Center -- causes the subsequent text to be centered. - FlushLeft -- causes the subsequent text to be left - justified. - FlushRight -- causes the subsequent text to be right - justified. - Indent -- causes the subsequent text to be indented at - the left margin. - IndentRight -- causes the subsequent text to be - indented at the right margin. - Outdent -- causes the subsequent text to be outdented - at the left margin. 
- OutdentRight -- causes the subsequent text to be - outdented at the right margin. - SamePage -- causes the subsequent text to be grouped, - if possible, on one page. - Subscript -- causes the subsequent text to be - interpreted as a subscript. - - - - - - Borenstein & Freed [Page 24] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Superscript -- causes the subsequent text to be - interpreted as a superscript. - Heading -- causes the subsequent text to be interpreted - as a page heading. - Footing -- causes the subsequent text to be interpreted - as a page footing. - ISO-8859-X (for any value of X that is legal as a - "charset" parameter) -- causes the subsequent text - to be interpreted as text in the appropriate - character set. - US-ASCII -- causes the subsequent text to be - interpreted as text in the US-ASCII character set. - Excerpt -- causes the subsequent text to be interpreted - as a textual excerpt from another source. - Typically this will be displayed using indentation - and an alternate font, but such decisions are up - to the viewer. - Paragraph -- causes the subsequent text to be - interpreted as a single paragraph, with - appropriate paragraph breaks (typically blank - space) before and after. - Signature -- causes the subsequent text to be - interpreted as a "signature". Some systems may - wish to display signatures in a smaller font or - otherwise set them apart from the main text of the - message. - Comment -- causes the subsequent text to be interpreted - as a comment, and hence not shown to the reader. - No-op -- has no effect on the subsequent text. - lt -- <lt> is replaced by a literal "<" character. No - balancing </lt> is allowed. - nl -- <nl> causes a line break. No balancing </nl> is - allowed. - np -- <np> causes a page break. No balancing </np> is - allowed. - - Each positive formatting command affects all subsequent text - until the matching negative formatting command. 
Such pairs - of formatting commands must be properly balanced and nested. - Thus, a proper way to describe text in bold italics is: - - <bold><italic>the-text</italic></bold> - - or, alternately, - - <italic><bold>the-text</bold></italic> - - but, in particular, the following is illegal - richtext: - - <bold><italic>the-text</bold></italic> - - NOTE: The nesting requirement for formatting commands - imposes a slightly higher burden upon the composers of - - - - Borenstein & Freed [Page 25] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - richtext bodies, but potentially simplifies richtext - displayers by allowing them to be stack-based. The main - goal of richtext is to be simple enough to make multifont, - formatted email widely readable, so that those with the - capability of sending it will be able to do so with - confidence. Thus slightly increased complexity in the - composing software was deemed a reasonable tradeoff for - simplified reading software. Nonetheless, implementors of - richtext readers are encouraged to follow the general - Internet guidelines of being conservative in what you send - and liberal in what you accept. Those implementations that - can do so are encouraged to deal reasonably with improperly - nested richtext. - - Implementations must regard any unrecognized formatting - command as equivalent to "No-op", thus facilitating future - extensions to "richtext". Private extensions may be defined - using formatting commands that begin with "X-", by analogy - to Internet mail header field names. - - It is worth noting that no special behavior is required for - the TAB (HT) character. It is recommended, however, that, at - least when fixed-width fonts are in use, the common - semantics of the TAB (HT) character should be observed, - namely that it moves to the next column position that is a - multiple of 8. 
(In other words, if a TAB (HT) occurs in - column n, where the leftmost column is column 0, then that - TAB (HT) should be replaced by 8-(n mod 8) SPACE - characters.) - - Richtext also differentiates between "hard" and "soft" line - breaks. A line break (CRLF) in the richtext data stream is - interpreted as a "soft" line break, one that is included - only for purposes of mail transport, and is to be treated as - white space by richtext interpreters. To include a "hard" - line break (one that must be displayed as such), the "<nl>" - or "<paragraph>" formatting constructs should be used. In - general, a soft line break should be treated as white space, - but when soft line breaks immediately follow a <nl> or a - </paragraph> tag they should be ignored rather than treated - as white space. - - Putting all this together, the following "text/richtext" - body fragment: - - <bold>Now</bold> is the time for - <italic>all</italic> good men - <smaller>(and <lt>women>)</smaller> to - <ignoreme></ignoreme> come - - to the aid of their - <nl> - beloved <nl><nl>country. <comment> Stupid - quote! </comment> -- the end - - represents the following formatted text (which will, no - doubt, look cryptic in the text-only version of this - document): - - Now is the time for all good men (and <women>) to - come to the aid of their - beloved - - country. -- the end - - Richtext conformance: A minimal richtext implementation is - one that simply converts "<lt>" to "<", converts CRLFs to - SPACE, converts <nl> to a newline according to local newline - convention, removes everything between a <comment> command - and the next balancing </comment> command, and removes all - other formatting commands (all text enclosed in angle - brackets).
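The minimal conformance rule above can be sketched in a few lines. The following Python function is an illustration only and is not the C program of Appendix D. The name richtext_to_plain and its newline parameter are hypothetical, and the sketch assumes that command names are matched case-insensitively and that soft line breaks arrive as CRLF.

```python
import re

# A formatting command: "<", optional "/", 1 to 40 alphanumerics or hyphens, ">".
_COMMAND = re.compile(r"<(/?)([A-Za-z0-9-]{1,40})>")


def richtext_to_plain(data: str, newline: str = "\n") -> str:
    """Reduce text/richtext to plain text (minimal conformance sketch)."""
    data = data.replace("\r\n", " ")  # soft line breaks become a single SPACE
    out = []
    comment_depth = 0  # greater than 0 while inside <comment> ... </comment>
    pos = 0
    for match in _COMMAND.finditer(data):
        if comment_depth == 0:
            out.append(data[pos:match.start()])  # keep text outside comments
        pos = match.end()
        negated = match.group(1) == "/"
        name = match.group(2).lower()  # assumption: names are case-insensitive
        if name == "comment":
            if negated:
                comment_depth = max(0, comment_depth - 1)
            else:
                comment_depth += 1
        elif comment_depth == 0 and not negated:
            if name == "lt":
                out.append("<")
            elif name == "nl":
                out.append(newline)
        # Every other command is simply dropped, as if it were "No-op".
    if comment_depth == 0:
        out.append(data[pos:])
    return "".join(out)
```

The sketch does not implement the refinement that a soft line break directly following <nl> or </paragraph> is ignored rather than turned into a space.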
- - NOTE ON THE RELATIONSHIP OF RICHTEXT TO SGML: Richtext is - decidedly not SGML, and must not be used to transport - arbitrary SGML documents. Those who wish to use SGML - document types as a mail transport format must define a new - text or application subtype, e.g., "text/sgml-dtd-whatever" - or "application/sgml-dtd-whatever", depending on the - perceived readability of the DTD in use. Richtext is - designed to be compatible with SGML, and specifically so - that it will be possible to define a richtext DTD if one is - needed. However, this does not imply that arbitrary SGML - can be called richtext, nor that richtext implementors have - any need to understand SGML; the description in this - document is a complete definition of richtext, which is far - simpler than complete SGML. - - NOTE ON THE INTENDED USE OF RICHTEXT: It is recognized that - implementors of future mail systems will want rich text - functionality far beyond that currently defined for - richtext. The intent of richtext is to provide a common - format for expressing that functionality in a form in which - much of it, at least, will be understood by interoperating - software. Thus, in particular, software with a richer - notion of formatted text than richtext can still use - richtext as its basic representation, but can extend it with - new formatting commands and by hiding information specific - to that software system in richtext comments. As such - systems evolve, it is expected that the definition of - richtext will be further refined by future published - specifications, but richtext as defined here provides a - platform on which evolutionary refinements can be based. - - IMPLEMENTATION NOTE: In some environments, it might be - impossible to combine certain richtext formatting commands, - - - - Borenstein & Freed [Page 27] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - whereas in others they might be combined easily. 
For - example, the combination of <bold> and <italic> might - produce bold italics on systems that support such fonts, but - there exist systems that can make text bold or italicized, - but not both. In such cases, the most recently issued - recognized formatting command should be preferred. - - One of the major goals in the design of richtext was to make - it so simple that even text-only mailers will implement - richtext-to-plain-text translators, thus increasing the - likelihood that multifont text will become "safe" to use - very widely. To demonstrate this simplicity, an extremely - simple 35-line C program that converts richtext input into - plain text output is included in Appendix D. - - 7.2 The Multipart Content-Type - - In the case of multiple part messages, in which one or more - different sets of data are combined in a single body, a - "multipart" Content-Type field must appear in the entity's - header. The body must then contain one or more "body parts," - each preceded by an encapsulation boundary, and the last one - followed by a closing boundary. Each part starts with an - encapsulation boundary, and then contains a body part - consisting of header area, a blank line, and a body area. - Thus a body part is similar to an RFC 822 message in syntax, - but different in meaning. - - A body part is NOT to be interpreted as actually being an - RFC 822 message. To begin with, NO header fields are - actually required in body parts. A body part that starts - with a blank line, therefore, is allowed and is a body part - for which all default values are to be assumed. In such a - case, the absence of a Content-Type header field implies - that the encapsulation is plain US-ASCII text.
The only - header fields that have defined meaning for body parts are - those the names of which begin with "Content-". All other - header fields are generally to be ignored in body parts. - Although they should generally be retained in mail - processing, they may be discarded by gateways if necessary. - Such other fields are permitted to appear in body parts but - should not be depended on. "X-" fields may be created for - experimental or private purposes, with the recognition that - the information they contain may be lost at some gateways. - - The distinction between an RFC 822 message and a body part - is subtle, but important. A gateway between Internet and - X.400 mail, for example, must be able to tell the difference - between a body part that contains an image and a body part - that contains an encapsulated message, the body of which is - an image. In order to represent the latter, the body part - must have "Content-Type: message", and its body (after the - blank line) must be the encapsulated message, with its own - "Content-Type: image" header field. The use of similar - syntax facilitates the conversion of messages to body parts, - and vice versa, but the distinction between the two must be - understood by implementors. (For the special case in which - all parts actually are messages, a "digest" subtype is also - defined.) - - As stated previously, each body part is preceded by an - encapsulation boundary. The encapsulation boundary MUST NOT - appear inside any of the encapsulated parts. Thus, it is - crucial that the composing agent be able to choose and - specify the unique boundary that will separate the parts. - - All present and future subtypes of the "multipart" type must - use an identical syntax. 
Subtypes may differ in their - semantics, and may impose additional restrictions on syntax, - - - - Borenstein & Freed [Page 29] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - but must conform to the required syntax for the multipart - type. This requirement ensures that all conformant user - agents will at least be able to recognize and separate the - parts of any multipart entity, even of an unrecognized - subtype. - - As stated in the definition of the Content-Transfer-Encoding - field, no encoding other than "7bit", "8bit", or "binary" is - permitted for entities of type "multipart". The multipart - delimiters and header fields are always 7-bit ASCII in any - case, and data within the body parts can be encoded on a - part-by-part basis, with Content-Transfer-Encoding fields - for each appropriate body part. - - Mail gateways, relays, and other mail handling agents are - commonly known to alter the top-level header of an RFC 822 - message. In particular, they frequently add, remove, or - reorder header fields. Such alterations are explicitly - forbidden for the body part headers embedded in the bodies - of messages of type "multipart." - - 7.2.1 Multipart: The common syntax - - All subtypes of "multipart" share a common syntax, defined - in this section. A simple example of a multipart message - also appears in this section. An example of a more complex - multipart message is given in Appendix C. - - The Content-Type field for multipart entities requires one - parameter, "boundary", which is used to specify the - encapsulation boundary. The encapsulation boundary is - defined as a line consisting entirely of two hyphen - characters ("-", decimal code 45) followed by the boundary - parameter value from the Content-Type header field. - - NOTE: The hyphens are for rough compatibility with the - earlier RFC 934 method of message encapsulation, and for - ease of searching for the boundaries in some - implementations. 
However, it should be noted that multipart - messages are NOT completely compatible with RFC 934 - encapsulations; in particular, they do not obey RFC 934 - quoting conventions for embedded lines that begin with - hyphens. This mechanism was chosen over the RFC 934 - mechanism because the latter causes lines to grow with each - level of quoting. The combination of this growth with the - fact that SMTP implementations sometimes wrap long lines - made the RFC 934 mechanism unsuitable for use in the event - that deeply-nested multipart structuring is ever desired. - - Thus, a typical multipart Content-Type header field might - look like this: - - Content-Type: multipart/mixed; - - - - - Borenstein & Freed [Page 30] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - boundary=gc0p4Jq0M2Yt08jU534c0p - - This indicates that the entity consists of several parts, - each itself with a structure that is syntactically identical - to an RFC 822 message, except that the header area might be - completely empty, and that the parts are each preceded by - the line - - --gc0p4Jq0M2Yt08jU534c0p - - Note that the encapsulation boundary must occur at the - beginning of a line, i.e., following a CRLF, and that that - initial CRLF is considered to be part of the encapsulation - boundary rather than part of the preceding part. The - boundary must be followed immediately either by another CRLF - and the header fields for the next part, or by two CRLFs, in - which case there are no header fields for the next part (and - it is therefore assumed to be of Content-Type text/plain). - - NOTE: The CRLF preceding the encapsulation line is - considered part of the boundary so that it is possible to - have a part that does not end with a CRLF (line break). 
- Body parts that must be considered to end with line breaks, - therefore, should have two CRLFs preceding the encapsulation - line, the first of which is part of the preceding body part, - and the second of which is part of the encapsulation - boundary. - - The requirement that the encapsulation boundary begins with - a CRLF implies that the body of a multipart entity must - itself begin with a CRLF before the first encapsulation line - -- that is, if the "preamble" area is not used, the entity - headers must be followed by TWO CRLFs. This is indeed how - such entities should be composed. A tolerant mail reading - program, however, may interpret a body of type multipart - that begins with an encapsulation line NOT initiated by a - CRLF as also being an encapsulation boundary, but a - compliant mail sending program must not generate such - entities. - - Encapsulation boundaries must not appear within the - encapsulations, and must be no longer than 70 characters, - not counting the two leading hyphens. - - The encapsulation boundary following the last body part is a - distinguished delimiter that indicates that no further body - parts will follow. Such a delimiter is identical to the - previous delimiters, with the addition of two more hyphens - at the end of the line: - - --gc0p4Jq0M2Yt08jU534c0p-- - - There appears to be room for additional information prior to - the first encapsulation boundary and following the final - - - - Borenstein & Freed [Page 31] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - boundary. These areas should generally be left blank, and - implementations should ignore anything that appears before - the first boundary or after the last one. - - NOTE: These "preamble" and "epilogue" areas are not used - because of the lack of proper typing of these parts and the - lack of clear semantics for handling these areas at - gateways, particularly X.400 gateways. 
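The delimiter rules described so far (leading CRLF as part of the boundary, a trailing "--" on the last one, preamble and epilogue ignored) can be sketched as a small splitter. This is a hedged illustration in Python; the function name split_multipart is hypothetical, and the sketch assumes the body is already available as a string with CRLF line endings.

```python
def split_multipart(body: str, boundary: str) -> list:
    """Return the body parts of a multipart body; preamble and epilogue are dropped."""
    if not 1 <= len(boundary) <= 70:
        raise ValueError("boundary must be 1 to 70 characters long")
    delimiter = "\r\n--" + boundary  # the leading CRLF belongs to the delimiter
    # Tolerant reading: prepend a CRLF so that a body which starts directly
    # with "--boundary" is accepted as well.
    text = "\r\n" + body
    parts = []
    start = None  # index where the current body part begins
    pos = 0
    while True:
        hit = text.find(delimiter, pos)
        if hit == -1:
            raise ValueError("closing boundary not found")
        after = hit + len(delimiter)
        is_close = text.startswith("--", after)
        if not is_close and not text.startswith("\r\n", after):
            pos = after  # a longer line that merely starts like the boundary
            continue
        if start is not None:
            parts.append(text[start:hit])
        if is_close:
            if not parts:
                raise ValueError("no body parts before the closing boundary")
            return parts
        start = after + 2  # skip the CRLF that ends the delimiter line
        pos = start
```

Each returned part still contains its own header area, blank line, and body area. A part that begins with CRLF has no header fields, and a part ends with a line break only if a second CRLF preceded the next delimiter.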
- - NOTE: Because encapsulation boundaries must not appear in - the body parts being encapsulated, a user agent must - exercise care to choose a unique boundary. The boundary in - the example above could have been the result of an algorithm - designed to produce boundaries with a very low probability - of already existing in the data to be encapsulated without - having to prescan the data. Alternate algorithms might - result in more 'readable' boundaries for a recipient with an - old user agent, but would require more attention to the - possibility that the boundary might appear in the - encapsulated part. The simplest boundary possible is - something like "---", with a closing boundary of "-----". - - As a very simple example, the following multipart message - has two parts, both of them plain text, one of them - explicitly typed and one of them implicitly typed: - - From: Nathaniel Borenstein <nsb@bellcore.com> - To: Ned Freed <ned@innosoft.com> - Subject: Sample message - MIME-Version: 1.0 - Content-type: multipart/mixed; boundary="simple - boundary" - - This is the preamble. It is to be ignored, though it - is a handy place for mail composers to include an - explanatory note to non-MIME compliant readers. - --simple boundary - - This is implicitly typed plain ASCII text. - It does NOT end with a linebreak. - --simple boundary - Content-type: text/plain; charset=us-ascii - - This is explicitly typed plain ASCII text. - It DOES end with a linebreak. - - --simple boundary-- - This is the epilogue. It is also to be ignored. - - The use of a Content-Type of multipart in a body part within - another multipart entity is explicitly allowed. In such - cases, for obvious reasons, care must be taken to ensure - that each nested multipart entity uses a different - boundary delimiter. See Appendix C for an example of nested - multipart entities.
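On the composing side, the boundary choice and the layout of the simple example can be sketched as follows. This is a hedged Python illustration: choose_boundary and compose_multipart are hypothetical names, the alphabet is deliberately limited to letters and digits, and the scan of the parts is only a safety net that the NOTE above says a sufficiently random boundary makes unnecessary.

```python
import secrets
import string

# Letters and digits only: a conservative subset of the permitted boundary characters.
_ALPHABET = string.ascii_letters + string.digits


def choose_boundary(parts, length=24):
    """Pick a random boundary that does not occur in any of the parts."""
    if not 1 <= length <= 70:
        raise ValueError("boundary must be 1 to 70 characters long")
    while True:
        candidate = "".join(secrets.choice(_ALPHABET) for _ in range(length))
        if not any("--" + candidate in part for part in parts):
            return candidate


def compose_multipart(parts, boundary):
    """Join body parts into a multipart body with an empty preamble and epilogue.

    Each part is "header lines CRLF CRLF body", or starts with CRLF when it
    has no header fields at all.
    """
    pieces = ["\r\n--" + boundary + "\r\n" + part for part in parts]
    pieces.append("\r\n--" + boundary + "--")  # closing boundary
    return "".join(pieces)
```

The composed body starts with a CRLF, so when it is placed after the entity headers and their terminating CRLF, the first encapsulation line is preceded by two CRLFs as required.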
- - The use of the multipart Content-Type with only a single - body part may be useful in certain contexts, and is - explicitly permitted. - - The only mandatory parameter for the multipart Content-Type - is the boundary parameter, which consists of 1 to 70 - characters from a set of characters known to be very robust - through email gateways, and NOT ending with white space. - (If a boundary appears to end with white space, the white - space must be presumed to have been added by a gateway, and - should be deleted.) It is formally specified by the - following BNF: - - boundary := 0*69<bchars> bcharsnospace - - bchars := bcharsnospace / " " - - bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / - "_" - / "," / "-" / "." / "/" / ":" / "=" / "?" - - Overall, the body of a multipart entity may be specified as - follows: - - multipart-body := preamble 1*encapsulation - close-delimiter epilogue - - encapsulation := delimiter CRLF body-part - - delimiter := CRLF "--" boundary ; taken from Content-Type - field. - ; when content-type is - multipart - ; There must be no space - ; between "--" and boundary. - - close-delimiter := delimiter "--" ; Again, no space before - "--" - - preamble := *text ; to be ignored upon - receipt. - - epilogue := *text ; to be ignored upon - receipt. - - body-part = <"message" as defined in RFC 822, - with all header fields optional, and with the - specified delimiter not occurring anywhere in - the message body, either on a line by itself - or as a substring anywhere. Note that the - - - - - - Borenstein & Freed [Page 33] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - semantics of a part differ from the semantics - of a message, as described in the text.> - - NOTE: Conspicuously missing from the multipart type is a - notion of structured, related body parts. In general, it - seems premature to try to standardize interpart structure - yet. 
It is recommended that those wishing to provide a more - structured or integrated multipart messaging facility should - define a subtype of multipart that is syntactically - identical, but that always expects the inclusion of a - distinguished part that can be used to specify the structure - and integration of the other parts, probably referring to - them by their Content-ID field. If this approach is used, - other implementations will not recognize the new subtype, - but will treat it as the primary subtype (multipart/mixed) - and will thus be able to show the user the parts that are - recognized. - - 7.2.2 The Multipart/mixed (primary) subtype - - The primary subtype for multipart, "mixed", is intended for - use when the body parts are independent and intended to be - displayed serially. Any multipart subtypes that an - implementation does not recognize should be treated as being - of subtype "mixed". - - 7.2.3 The Multipart/alternative subtype - - The multipart/alternative type is syntactically identical to - multipart/mixed, but the semantics are different. In - particular, each of the parts is an "alternative" version of - the same information. User agents should recognize that the - content of the various parts are interchangeable. The user - agent should either choose the "best" type based on the - user's environment and preferences, or offer the user the - available alternatives. In general, choosing the best type - means displaying only the LAST part that can be displayed. - This may be used, for example, to send mail in a fancy text - format in such a way that it can easily be displayed - anywhere: - - From: Nathaniel Borenstein <nsb@bellcore.com> - To: Ned Freed <ned@innosoft.com> - Subject: Formatted text mail - MIME-Version: 1.0 - Content-Type: multipart/alternative; boundary=boundary42 - - - --boundary42 - Content-Type: text/plain; charset=us-ascii - - ...plain text version of message goes here.... 
- - --boundary42 - Content-Type: text/richtext - - .... richtext version of same message goes here ... - --boundary42 - Content-Type: text/x-whatever - - .... fanciest formatted version of same message goes here - ... - --boundary42-- - - In this example, users whose mail system understood the - "text/x-whatever" format would see only the fancy version, - while other users would see only the richtext or plain text - version, depending on the capabilities of their system. - - In general, user agents that compose multipart/alternative - entities should place the body parts in increasing order of - preference, that is, with the preferred format last. For - fancy text, the sending user agent should put the plainest - format first and the richest format last. Receiving user - agents should pick and display the last format they are - capable of displaying. In the case where one of the - alternatives is itself of type "multipart" and contains - unrecognized sub-parts, the user agent may choose either to - show that alternative, an earlier alternative, or both. - - NOTE: From an implementor's perspective, it might seem more - sensible to reverse this ordering, and have the plainest - alternative last. However, placing the plainest alternative - first is the friendliest possible option when - multipart/alternative entities are viewed using a - non-MIME-compliant mail reader. While this approach does impose some - burden on compliant mail readers, interoperability with - older mail readers was deemed to be more important in this - case. - - It may be the case that some user agents, if they can - recognize more than one of the formats, will prefer to offer - the user the choice of which format to view. This makes - sense, for example, if mail includes both a nicely-formatted - image version and an easily-edited text version.
What is - most critical, however, is that the user not automatically - be shown multiple versions of the same data. Either the - user should be shown the last recognized version or should - explicitly be given the choice. - - - - - - - - - - - - Borenstein & Freed [Page 35] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - 7.2.4 The Multipart/digest subtype - - This document defines a "digest" subtype of the multipart - Content-Type. This type is syntactically identical to - multipart/mixed, but the semantics are different. In - particular, in a digest, the default Content-Type value for - a body part is changed from "text/plain" to - "message/rfc822". This is done to allow a more readable - digest format that is largely compatible (except for the - quoting convention) with RFC 934. - - A digest in this format might, then, look something like - this: - - From: Moderator-Address - MIME-Version: 1.0 - Subject: Internet Digest, volume 42 - Content-Type: multipart/digest; - boundary="---- next message ----" - - - ------ next message ---- - - From: someone-else - Subject: my opinion - - ...body goes here ... - - ------ next message ---- - - From: someone-else-again - Subject: my different opinion - - ... another body goes here... - - ------ next message ------ - - 7.2.5 The Multipart/parallel subtype - - This document defines a "parallel" subtype of the multipart - Content-Type. This type is syntactically identical to - multipart/mixed, but the semantics are different. In - particular, in a parallel entity, all of the parts are - intended to be presented in parallel, i.e., simultaneously, - on hardware and software that are capable of doing so. - Composing agents should be aware that many mail readers will - lack this capability and will show the parts serially in any - event. 
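The multipart subtypes defined in sections 7.2.2 through 7.2.5 share one syntax and differ only in how a reader treats the parts. Two of those differences are easy to sketch: the changed default Content-Type inside a digest, and the "last displayable part" rule for multipart/alternative. The Python below is an illustration under stated assumptions; default_part_type and pick_alternative are hypothetical names, and the set of displayable types is assumed to hold lowercase type/subtype strings.

```python
def default_part_type(multipart_subtype: str) -> str:
    """Content-Type assumed for a body part that carries no Content-Type field."""
    if multipart_subtype.lower() == "digest":
        return "message/rfc822"
    return "text/plain"


def pick_alternative(part_types, displayable):
    """Return the index of the last part the reader can display, or None.

    part_types are the Content-Type values of the alternatives in message
    order, plainest first. displayable holds lowercase "type/subtype" strings.
    """
    for index in range(len(part_types) - 1, -1, -1):
        media_type = part_types[index].split(";", 1)[0].strip().lower()
        if media_type in displayable:
            return index
    return None
```

A reader that prefers to offer the user a choice would list every displayable index instead of returning only the last one. Either way, it should not automatically show several versions of the same data.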
- - - - - - - - - - Borenstein & Freed [Page 36] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - 7.3 The Message Content-Type - - It is frequently desirable, in sending mail, to encapsulate - another mail message. For this common operation, a special - Content-Type, "message", is defined. The primary subtype, - message/rfc822, has no required parameters in the Content- - Type field. Additional subtypes, "partial" and "External- - body", do have required parameters. These subtypes are - explained below. - - NOTE: It has been suggested that subtypes of message might - be defined for forwarded or rejected messages. However, - forwarded and rejected messages can be handled as multipart - messages in which the first part contains any control or - descriptive information, and a second part, of type - message/rfc822, is the forwarded or rejected message. - Composing rejection and forwarding messages in this manner - will preserve the type information on the original message - and allow it to be correctly presented to the recipient, and - hence is strongly encouraged. - - As stated in the definition of the Content-Transfer-Encoding - field, no encoding other than "7bit", "8bit", or "binary" is - permitted for messages or parts of type "message". The - message header fields are always US-ASCII in any case, and - data within the body can still be encoded, in which case the - Content-Transfer-Encoding header field in the encapsulated - message will reflect this. Non-ASCII text in the headers of - an encapsulated message can be specified using the - mechanisms described in [RFC-1342]. - - Mail gateways, relays, and other mail handling agents are - commonly known to alter the top-level header of an RFC 822 - message. In particular, they frequently add, remove, or - reorder header fields. Such alterations are explicitly - forbidden for the encapsulated headers embedded in the - bodies of messages of type "message." 
- - 7.3.1 The Message/rfc822 (primary) subtype - - A Content-Type of "message/rfc822" indicates that the body - contains an encapsulated message, with the syntax of an RFC - 822 message. - - 7.3.2 The Message/Partial subtype - - A subtype of message, "partial", is defined in order to - allow large objects to be delivered as several separate - pieces of mail and automatically reassembled by the - receiving user agent. (The concept is similar to IP - fragmentation/reassembly in the basic Internet Protocols.) - This mechanism can be used when intermediate transport - agents limit the size of individual messages that can be - sent. Content-Type "message/partial" thus indicates that - - - - Borenstein & Freed [Page 37] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - the body contains a fragment of a larger message. - - Three parameters must be specified in the Content-Type field - of type message/partial: The first, "id", is a unique - identifier, as close to a world-unique identifier as - possible, to be used to match the parts together. (In - general, the identifier is essentially a message-id; if - placed in double quotes, it can be any message-id, in - accordance with the BNF for "parameter" given earlier in - this specification.) The second, "number", an integer, is - the part number, which indicates where this part fits into - the sequence of fragments. The third, "total", another - integer, is the total number of parts. This third subfield - is required on the final part, and is optional on the - earlier parts. Note also that these parameters may be given - in any order. 
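The three message/partial parameters can be read with a short parser. This is a minimal Python sketch, not a full Content-Type parser: parse_partial is a hypothetical name, the identifier in the usage line is made up, and the regular expression handles only simple quoted and unquoted values without escapes or comments.

```python
import re

# ";" name "=" value, where value is either a quoted string or a bare token.
_PARAM = re.compile(r';\s*([A-Za-z0-9-]+)\s*=\s*(?:"([^"]*)"|([^;\s]+))')


def parse_partial(content_type: str):
    """Return (id, number, total) from a message/partial Content-Type value.

    total is None when the parameter is absent, which is allowed on every
    part except the final one.
    """
    media_type, _, rest = content_type.partition(";")
    if media_type.strip().lower() != "message/partial":
        raise ValueError("not a message/partial Content-Type")
    params = {}
    for match in _PARAM.finditer(";" + rest):  # parameters may come in any order
        quoted, bare = match.group(2), match.group(3)
        params[match.group(1).lower()] = quoted if quoted is not None else bare
    if "id" not in params or "number" not in params:
        raise ValueError('"id" and "number" are required')
    number = int(params["number"])
    total = int(params["total"]) if "total" in params else None
    # Part numbering starts at 1; a known total bounds the part number.
    if number < 1 or (total is not None and number > total):
        raise ValueError("part number out of range")
    return params["id"], number, total
```

For example, parse_partial('message/partial; id="abc@example.org"; number=1') yields the identifier, the part number 1, and None for the missing total. A receiver cannot tell how many parts to expect until it has seen a part that carries "total".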
- - Thus, part 2 of a 3-part message may have either of the - following header fields: - - Content-Type: Message/Partial; - number=2; total=3; - id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; - - Content-Type: Message/Partial; - id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; - number=2 - - But part 3 MUST specify the total number of parts: - - Content-Type: Message/Partial; - number=3; total=3; - id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; - - Note that part numbering begins with 1, not 0. - - When the parts of a message broken up in this manner are put - together, the result is a complete RFC 822 format message, - which may have its own Content-Type header field, and thus - may contain any other data type. - - Message fragmentation and reassembly: The semantics of a - reassembled partial message must be those of the "inner" - message, rather than of a message containing the inner - message. This makes it possible, for example, to send a - large audio message as several partial messages, and still - have it appear to the recipient as a simple audio message - rather than as an encapsulated message containing an audio - message. That is, the encapsulation of the message is - considered to be "transparent". - - When generating and reassembling the parts of a - message/partial message, the headers of the encapsulated - message must be merged with the headers of the enclosing - - - - Borenstein & Freed [Page 38] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - entities. In this process the following rules must be - observed: - - (1) All of the headers from the initial enclosing - entity (part one), except those that start with - "Content-" and "Message-ID", must be copied, in - order, to the new message. - - (2) Only those headers in the enclosed message - which start with "Content-" and "Message-ID" must - be appended, in order, to the headers of the new - message. 
Any headers in the enclosed message - which do not start with "Content-" (except for - "Message-ID") will be ignored. - - (3) All of the headers from the second and any - subsequent messages will be ignored. - - For example, if an audio message is broken into two parts, - the first part might look something like this: - - X-Weird-Header-1: Foo - From: Bill@host.com - To: joe@otherhost.com - Subject: Audio mail - Message-ID: id1@host.com - MIME-Version: 1.0 - Content-type: message/partial; - id="ABC@host.com"; - number=1; total=2 - - X-Weird-Header-1: Bar - X-Weird-Header-2: Hello - Message-ID: anotherid@foo.com - Content-type: audio/basic - Content-transfer-encoding: base64 - - ... first half of encoded audio data goes here... - - and the second half might look something like this: - - From: Bill@host.com - To: joe@otherhost.com - Subject: Audio mail - MIME-Version: 1.0 - Message-ID: id2@host.com - Content-type: message/partial; - id="ABC@host.com"; number=2; total=2 - - ... second half of encoded audio data goes here... - - Then, when the fragmented message is reassembled, the - resulting message to be displayed to the user should look - something like this: - - - - Borenstein & Freed [Page 39] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - X-Weird-Header-1: Foo - From: Bill@host.com - To: joe@otherhost.com - Subject: Audio mail - Message-ID: anotherid@foo.com - MIME-Version: 1.0 - Content-type: audio/basic - Content-transfer-encoding: base64 - - ... first half of encoded audio data goes here... - ... second half of encoded audio data goes here... - - It should be noted that, because some message transfer - agents may choose to automatically fragment large messages, - and because such agents may use different fragmentation - thresholds, it is possible that the pieces of a partial - message, upon reassembly, may prove themselves to comprise a - partial message. This is explicitly permitted. 
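The three merging rules above can be sketched in Python. This is a minimal illustration and not part of RFC 1341. The helper names (split_entity, partial_params, reassemble) are hypothetical, fragments are assumed to be strings with LF line endings, and every fragment is assumed to carry a well-formed message/partial Content-Type.

```python
from email.message import Message


def split_entity(raw):
    """Split raw text into ([(name, value), ...], body) at the first blank line."""
    head, _, body = raw.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if not line:
            continue
        if line[0] in " \t" and headers:
            # Continuation line: unfold it onto the previous header.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))
    return headers, body


def partial_params(headers):
    """Return (id, number, total) from a message/partial Content-Type."""
    probe = Message()
    for name, value in headers:
        if name.lower() == "content-type":
            probe["Content-Type"] = value
    total = probe.get_param("total")
    return (probe.get_param("id"),
            int(probe.get_param("number")),
            None if total is None else int(total))


def is_content_header(name):
    """True for the headers that rules (1) and (2) single out."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered == "message-id"


def reassemble(fragments):
    """Merge message/partial fragments into one message string."""
    entities = [split_entity(raw) for raw in fragments]
    # Part numbering begins with 1; sort on the "number" parameter.
    entities.sort(key=lambda entity: partial_params(entity[0])[1])
    _, last_number, total = partial_params(entities[-1][0])
    if total is None or total != last_number or len(entities) != total:
        raise ValueError("fragment set is incomplete")

    outer_headers, first_body = entities[0]
    inner_headers, inner_body = split_entity(first_body)

    # Rule (1): outer headers of part one, minus Content-* and Message-ID.
    merged = [h for h in outer_headers if not is_content_header(h[0])]
    # Rule (2): only Content-* and Message-ID from the enclosed header.
    merged += [h for h in inner_headers if is_content_header(h[0])]
    # Rule (3): later fragments contribute their bodies only.
    body = inner_body + "".join(b for _, b in entities[1:])
    return "".join(f"{n}: {v}\n" for n, v in merged) + "\n" + body
```

Applied to the two audio fragments shown above, this sketch yields the same set of header fields as the reassembled example. It appends the enclosed Message-ID after the copied outer headers, following a literal reading of rule (2), so the field order differs slightly from the example.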
- - It should also be noted that the inclusion of a "References" - field in the headers of the second and subsequent pieces of - a fragmented message that references the Message-Id on the - previous piece may be of benefit to mail readers that - understand and track references. However, the generation of - such "References" fields is entirely optional. - - 7.3.3 The Message/External-Body subtype - - The external-body subtype indicates that the actual body - data are not included, but merely referenced. In this case, - the parameters describe a mechanism for accessing the - external data. - - When a message body or body part is of type - "message/external-body", it consists of a header, two - consecutive CRLFs, and the message header for the - encapsulated message. If another pair of consecutive CRLFs - appears, this of course ends the message header for the - encapsulated message. However, since the encapsulated - message's body is itself external, it does NOT appear in the - area that follows. For example, consider the following - message: - - Content-type: message/external-body; - access-type=local-file; - name="/u/nsb/Me.gif" - - Content-type: image/gif - - THIS IS NOT REALLY THE BODY! - - The area at the end, which might be called the "phantom - body", is ignored for most external-body messages. However, - it may be used to contain auxiliary information for some - such messages, as indeed it is when the access-type is - "mail-server". Of the access-types defined by this - document, the phantom body is used only when the access-type - is "mail-server". In all other cases, the phantom body is - ignored. - - The only always-mandatory parameter for message/external- - body is "access-type"; all of the other parameters may be - mandatory or optional depending on the value of access-type. 
- - ACCESS-TYPE -- One or more case-insensitive words, - comma-separated, indicating supported access - mechanisms by which the file or data may be - obtained. Values include, but are not limited to, - "FTP", "ANON-FTP", "TFTP", "AFS", "LOCAL-FILE", - and "MAIL-SERVER". Future values, except for - experimental values beginning with "X-", must be - registered with IANA, as described in Appendix F. - - In addition, the following three parameters are optional for - ALL access-types: - - EXPIRATION -- The date (in the RFC 822 "date-time" - syntax, as extended by RFC 1123 to permit 4 digits - in the date field) after which the existence of - the external data is not guaranteed. - - SIZE -- The size (in octets) of the data. The - intent of this parameter is to help the recipient - decide whether or not to expend the necessary - resources to retrieve the external data. - - PERMISSION -- A field that indicates whether or - not it is expected that clients might also attempt - to overwrite the data. By default, or if - permission is "read", the assumption is that they - are not, and that if the data is retrieved once, - it is never needed again. If PERMISSION is "read- - write", this assumption is invalid, and any local - copy must be considered no more than a cache. - "Read" and "Read-write" are the only defined - values of permission. - - The precise semantics of the access-types defined here are - described in the sections that follow. - - 7.3.3.1 The "ftp" and "tftp" access-types - - An access-type of FTP or TFTP indicates that the message - body is accessible as a file using the FTP [RFC-959] or TFTP - [RFC-783] protocols, respectively. For these access-types, - the following additional parameters are mandatory: - - NAME -- The name of the file that contains the - actual body data. 
- - SITE -- A machine from which the file may be - obtained, using the given protocol - - Before the data is retrieved, using these protocols, the - user will generally need to be asked to provide a login id - and a password for the machine named by the site parameter. - - In addition, the following optional parameters may also - appear when the access-type is FTP or ANON-FTP: - - DIRECTORY -- A directory from which the data named - by NAME should be retrieved. - - MODE -- A transfer mode for retrieving the - information, e.g. "image". - - 7.3.3.2 The "anon-ftp" access-type - - The "anon-ftp" access-type is identical to the "ftp" access - type, except that the user need not be asked to provide a - name and password for the specified site. Instead, the ftp - protocol will be used with login "anonymous" and a password - that corresponds to the user's email address. - - 7.3.3.3 The "local-file" and "afs" access-types - - An access-type of "local-file" indicates that the actual - body is accessible as a file on the local machine. An - access-type of "afs" indicates that the file is accessible - via the global AFS file system. In both cases, only a - single parameter is required: - - NAME -- The name of the file that contains the - actual body data. - - The following optional parameter may be used to describe the - locality of reference for the data, that is, the site or - sites at which the file is expected to be visible: - - SITE -- A domain specifier for a machine or set of - machines that are known to have access to the data - file. Asterisks may be used for wildcard matching - to a part of a domain name, such as - "*.bellcore.com", to indicate a set of machines on - which the data should be directly visible, while a - single asterisk may be used to indicate a file - that is expected to be universally available, - e.g., via a global file system. 
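The wildcard matching described for the SITE parameter can be sketched as follows. This is an illustration under stated assumptions, not a definition from the specification: the function name site_matches is hypothetical, an asterisk is assumed to match any run of characters (including dots), and comparison is case-insensitive because domain names are.

```python
import re


def site_matches(site_pattern, hostname):
    """Return True if hostname is covered by the SITE domain specifier."""
    # Escape everything except "*", which becomes "match anything".
    regex = ".*".join(re.escape(piece) for piece in site_pattern.split("*"))
    return re.fullmatch(regex, hostname, re.IGNORECASE) is not None
```

With this reading, "*.bellcore.com" covers thumper.bellcore.com but not bellcore.com itself, and a single "*" covers every host.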
- - 7.3.3.4 The "mail-server" access-type - - - - - Borenstein & Freed [Page 42] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - The "mail-server" access-type indicates that the actual body - is available from a mail server. The mandatory parameter - for this access-type is: - - SERVER -- The email address of the mail server - from which the actual body data can be obtained. - - Because mail servers accept a variety of syntax, some of - which is multiline, the full command to be sent to a mail - server is not included as a parameter on the content-type - line. Instead, it may be provided as the "phantom body" - when the content-type is message/external-body and the - access-type is mail-server. - - Note that MIME does not define a mail server syntax. - Rather, it allows the inclusion of arbitrary mail server - commands in the phantom body. Implementations should - include the phantom body in the body of the message it sends - to the mail server address to retrieve the relevant data. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 43] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - 7.3.3.5 Examples and Further Explanations - - With the emerging possibility of very wide-area file - systems, it becomes very hard to know in advance the set of - machines where a file will and will not be accessible - directly from the file system. Therefore it may make sense - to provide both a file name, to be tried directly, and the - name of one or more sites from which the file is known to be - accessible. An implementation can try to retrieve remote - files using FTP or any other protocol, using anonymous file - retrieval or prompting the user for the necessary name and - password. If an external body is accessible via multiple - mechanisms, the sender may include multiple parts of type - message/external-body within an entity of type - multipart/alternative. 
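A receiving implementation that tries several message/external-body alternatives first needs to know whether each one carries the parameters its access-type requires. The sketch below collects the mandatory parameters listed in sections 7.3.3.1 through 7.3.3.4 into a table. It is only an illustration: the names REQUIRED, external_body_params and missing_params are hypothetical, a single access-type word per part is assumed, and Python's email.message.Message is used merely as a convenient parameter parser.

```python
from email.message import Message

# Mandatory parameters per access-type (sections 7.3.3.1 to 7.3.3.4).
REQUIRED = {
    "ftp": {"name", "site"},
    "tftp": {"name", "site"},
    "anon-ftp": {"name", "site"},
    "local-file": {"name"},
    "afs": {"name"},
    "mail-server": {"server"},
}


def external_body_params(content_type_value):
    """Parse a message/external-body Content-Type value into a dict."""
    probe = Message()
    probe["Content-Type"] = content_type_value
    # get_params() returns the type itself first; skip it.
    return {name.lower(): value for name, value in probe.get_params()[1:]}


def missing_params(params):
    """List mandatory parameters absent for the declared access-type."""
    access = params.get("access-type", "").lower()
    if access not in REQUIRED:
        raise ValueError(f"unknown access-type: {access!r}")
    return sorted(REQUIRED[access] - set(params))
```

A part for which missing_params returns an empty list is a candidate for retrieval; the others can be skipped in favour of the next alternative.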
- - However, the external-body mechanism is not intended to be - limited to file retrieval, as shown by the mail-server - access-type. Beyond this, one can imagine, for example, - using a video server for external references to video clips. - - If an entity is of type "message/external-body", then the - body of the entity will contain the header fields of the - encapsulated message. The body itself is to be found in the - external location. This means that if the body of the - "message/external-body" message contains two consecutive - CRLFs, everything after that pair is NOT part of the - message itself. For most message/external-body messages, - this trailing area must simply be ignored. However, it is a - convenient place for additional data that cannot be included - in the content-type header field. In particular, if the - "access-type" value is "mail-server", then the trailing area - must contain commands to be sent to the mail server at the - address given by the SERVER parameter. - - The embedded message header fields which appear in the body - of the message/external-body data can be used to declare the - Content-type of the external body. 
Thus a complete - message/external-body message, referring to a document in - PostScript format, might look like this: - - From: Whomever - Subject: whatever - MIME-Version: 1.0 - Message-ID: id1@host.com - Content-Type: multipart/alternative; boundary=42 - - - --42 - Content-Type: message/external-body; - name="BodyFormats.ps"; - site="thumper.bellcore.com"; - access-type=ANON-FTP; - directory="pub"; - mode="image"; - expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" - - Content-type: application/postscript - - --42 - Content-Type: message/external-body; - name="/u/nsb/writing/rfcs/RFC-XXXX.ps"; - site="thumper.bellcore.com"; - access-type=AFS; - expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" - - Content-type: application/postscript - - --42 - Content-Type: message/external-body; - access-type=mail-server; - server="listserv@bogus.bitnet"; - expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" - - Content-type: application/postscript - - get rfc-xxxx doc - - --42-- - - Like the message/partial type, the message/external-body - type is intended to be transparent, that is, to convey the - data type in the external body rather than to convey a - message with a body of that type. Thus the headers on the - outer and inner parts must be merged using the same rules as - for message/partial. In particular, this means that the - Content-type header is overridden, but the From and Subject - headers are preserved. - - Note that since the external bodies are not transported as - mail, they need not conform to the 7-bit and line length - requirements, but might in fact be binary files. Thus a - Content-Transfer-Encoding is not generally necessary, though - it is permitted. - - Note that the body of a message of type "message/external- - body" is governed by the basic syntax for an RFC 822 - message. 
In particular, anything before the first - consecutive pair of CRLFs is header information, while - anything after it is body information, which is ignored for - most access-types. - - - - - - - - Borenstein & Freed [Page 45] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - 7.4 The Application Content-Type - - The "application" Content-Type is to be used for data which - do not fit in any of the other categories, and particularly - for data to be processed by mail-based uses of application - programs. This is information which must be processed by an - application before it is viewable or usable to a user. - Expected uses for Content-Type application include mail- - based file transfer, spreadsheets, data for mail-based - scheduling systems, and languages for "active" - (computational) email. (The latter, in particular, can pose - security problems which should be understood by - implementors, and are considered in detail in the discussion - of the application/PostScript content-type.) - - For example, a meeting scheduler might define a standard - representation for information about proposed meeting dates. - An intelligent user agent would use this information to - conduct a dialog with the user, and might then send further - mail based on that dialog. More generally, there have been - several "active" messaging languages developed in which - programs in a suitably specialized language are sent through - the mail and automatically run in the recipient's - environment. - - Such applications may be defined as subtypes of the - "application" Content-Type. This document defines three - subtypes: octet-stream, ODA, and PostScript. - - In general, the subtype of application will often be the - name of the application for which the data are intended. - This does not mean, however, that any application program - name may be used freely as a subtype of application. Such - usages must be registered with IANA, as described in - Appendix F. 
- - 7.4.1 The Application/Octet-Stream (primary) subtype - - The primary subtype of application, "octet-stream", may be - used to indicate that a body contains binary data. The set - of possible parameters includes, but is not limited to: - - NAME -- a suggested name for the binary data if - stored as a file. - - TYPE -- the general type or category of binary - data. This is intended as information for the - human recipient rather than for any automatic - processing. - - CONVERSIONS -- the set of operations that have - been performed on the data before putting it in - the mail (and before any Content-Transfer-Encoding - that might have been applied). If multiple - conversions have occurred, they must be separated - by commas and specified in the order they were - applied -- that is, the leftmost conversion must - have occurred first, and conversions are undone - from right to left. Note that NO conversion - values are defined by this document. Any - conversion values that do not begin with "X-" - must be preceded by a published specification and - by registration with IANA, as described in - Appendix F. - - PADDING -- the number of bits of padding that were - appended to the bitstream comprising the actual - contents to produce the enclosed byte-oriented - data. This is useful for enclosing a bitstream in - a body when the total number of bits is not a - multiple of the byte size. - - The values for these attributes are left undefined at - present, but may require specification in the future. An - example of a common (though UNIX-specific) usage might be: - - Content-Type: application/octet-stream; - name=foo.tar.Z; type=tar; - conversions="x-encrypt,x-compress" - - However, it should be noted that the use of such conversions - is explicitly discouraged due to a lack of portability and - standardization. 
The use of uuencode is particularly - discouraged, in favor of the Content-Transfer-Encoding - mechanism, which is both more standardized and more portable - across mail boundaries. - - The recommended action for an implementation that receives - application/octet-stream mail is to simply offer to put the - data in a file, with any Content-Transfer-Encoding undone, - or perhaps to use it as input to a user-specified process. - - To reduce the danger of transmitting rogue programs through - the mail, it is strongly recommended that implementations - NOT implement a path-search mechanism whereby an arbitrary - program named in the Content-Type parameter (e.g., an - "interpreter=" parameter) is found and executed using the - mail body as input. - - 7.4.2 The Application/PostScript subtype - - A Content-Type of "application/postscript" indicates a - PostScript program. The language is defined in - [POSTSCRIPT]. It is recommended that Postscript as sent - through email should use Postscript document structuring - conventions if at all possible, and correctly. - - - - - - Borenstein & Freed [Page 47] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - The execution of general-purpose PostScript interpreters - entails serious security risks, and implementors are - discouraged from simply sending PostScript email bodies to - "off-the-shelf" interpreters. While it is usually safe to - send PostScript to a printer, where the potential for harm - is greatly constrained, implementors should consider all of - the following before they add interactive display of - PostScript bodies to their mail readers. - - The remainder of this section outlines some, though probably - not all, of the possible problems with sending PostScript - through the mail. - - Dangerous operations in the PostScript language include, but - may not be limited to, the PostScript operators deletefile, - renamefile, filenameforall, and file. 
File is only - dangerous when applied to something other than standard - input or output. Implementations may also define additional - nonstandard file operators; these may also pose a threat to - security. Filenameforall, the wildcard file search - operator, may appear at first glance to be harmless. Note, - however, that this operator has the potential to reveal - information about what files the recipient has access to, - and this information may itself be sensitive. Message - senders should avoid the use of potentially dangerous file - operators, since these operators are quite likely to be - unavailable in secure PostScript implementations. Message- - receiving and -displaying software should either completely - disable all potentially dangerous file operators or take - special care not to delegate any special authority to their - operation. These operators should be viewed as being done by - an outside agency when interpreting PostScript documents. - Such disabling and/or checking should be done completely - outside of the reach of the PostScript language itself; care - should be taken to insure that no method exists for - reenabling full-function versions of these operators. - - The PostScript language provides facilities for exiting the - normal interpreter, or server, loop. Changes made in this - "outer" environment are customarily retained across - documents, and may in some cases be retained semipermanently - in nonvolatile memory. The operators associated with exiting - the interpreter loop have the potential to interfere with - subsequent document processing. As such, their unrestrained - use constitutes a threat of service denial. PostScript - operators that exit the interpreter loop include, but may - not be limited to, the exitserver and startjob operators. - Message-sending software should not generate PostScript that - depends on exiting the interpreter loop to operate. 
The - ability to exit will probably be unavailable in secure - PostScript implementations. Message-receiving and - -displaying software should, if possible, disable the - ability to make retained changes to the PostScript - environment. Eliminate the startjob and exitserver commands. - - - - Borenstein & Freed [Page 48] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - If these commands cannot be eliminated, at least set the - password associated with them to a hard-to-guess value. - - PostScript provides operators for setting system-wide and - device-specific parameters. These parameter settings may be - retained across jobs and may potentially pose a threat to - the correct operation of the interpreter. The PostScript - operators that set system and device parameters include, but - may not be limited to, the setsystemparams and setdevparams - operators. Message-sending software should not generate - PostScript that depends on the setting of system or device - parameters to operate correctly. The ability to set these - parameters will probably be unavailable in secure PostScript - implementations. Message-receiving and -displaying software - should, if possible, disable the ability to change system - and device parameters. If these operators cannot be - disabled, at least set the password associated with them to - a hard-to-guess value. - - Some PostScript implementations provide nonstandard - facilities for the direct loading and execution of machine - code. Such facilities are quite obviously open to - substantial abuse. Message-sending software should not - make use of such features. Besides being totally hardware- - specific, they are also likely to be unavailable in secure - implementations of PostScript. Message-receiving and - -displaying software should not allow such operators to be - used if they exist. 
- - PostScript is an extensible language, and many, if not most, - implementations of it provide a number of their own - extensions. This document does not deal with such extensions - explicitly since they constitute an unknown factor. - Message-sending software should not make use of nonstandard - extensions; they are likely to be missing from some - implementations. Message-receiving and -displaying software - should make sure that any nonstandard PostScript operators - are secure and don't present any kind of threat. - - It is possible to write PostScript that consumes huge - amounts of various system resources. It is also possible to - write PostScript programs that loop infinitely. Both types - of programs have the potential to cause damage if sent to - unsuspecting recipients. Message-sending software should - avoid the construction and dissemination of such programs, - which is antisocial. Message-receiving and -displaying - software should provide appropriate mechanisms to abort - processing of a document after a reasonable amount of time - has elapsed. In addition, PostScript interpreters should be - limited to the consumption of only a reasonable amount of - any given system resource. - - Finally, bugs may exist in some PostScript interpreters - which could possibly be exploited to gain unauthorized - - - - Borenstein & Freed [Page 49] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - access to a recipient's system. Apart from noting this - possibility, there is no specific action to take to prevent - this, apart from the timely correction of such bugs if any - are found. - - 7.4.3 The Application/ODA subtype - - The "ODA" subtype of application is used to indicate that a - body contains information encoded according to the Office - Document Architecture [ODA] standards, using the ODIF - representation format. 
For application/oda, the Content- - Type line should also specify an attribute/value pair that - indicates the document application profile (DAP), using the - keyword "profile". Thus an appropriate header field might - look like this: - - Content-Type: application/oda; profile=Q112 - - Consult the ODA standard [ODA] for further information. - - 7.5 The Image Content-Type - - A Content-Type of "image" indicates that the body contains an - image. The subtype names the specific image format. These - names are case insensitive. Two initial subtypes are "jpeg" - for the JPEG format, JFIF encoding, and "gif" for GIF format - [GIF]. - - The list of image subtypes given here is neither exclusive - nor exhaustive, and is expected to grow as more types are - registered with IANA, as described in Appendix F. - - 7.6 The Audio Content-Type - - A Content-Type of "audio" indicates that the body contains - audio data. Although there is not yet a consensus on an - "ideal" audio format for use with computers, there is a - pressing need for a format capable of providing - interoperable behavior. - - The initial subtype of "basic" is specified to meet this - requirement by providing an absolutely minimal lowest common - denominator audio format. It is expected that richer - formats for higher quality and/or lower bandwidth audio will - be defined by a later document. - - The content of the "audio/basic" subtype is audio encoded - using 8-bit ISDN u-law [PCM]. When this subtype is present, - a sample rate of 8000 Hz and a single channel are assumed. - - 7.7 The Video Content-Type - - A Content-Type of "video" indicates that the body contains a - time-varying-picture image, possibly with color and - coordinated sound. 
The term "video" is used extremely - generically, rather than with reference to any particular - technology or format, and is not meant to preclude subtypes - such as animated drawings encoded compactly. The subtype - "mpeg" refers to video coded according to the MPEG standard - [MPEG]. - - Note that although in general this document strongly - discourages the mixing of multiple media in a single body, - it is recognized that many so-called "video" formats include - a representation for synchronized audio, and this is - explicitly permitted for subtypes of "video". - - 7.8 Experimental Content-Type Values - - A Content-Type value beginning with the characters "X-" is a - private value, to be used by consenting mail systems by - mutual agreement. Any format without a rigorous and public - definition must be named with an "X-" prefix, and publicly - specified values shall never begin with "X-". (Older - - - - Borenstein & Freed [Page 51] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - versions of the widely-used Andrew system use the "X-BE2" - name, so new systems should probably choose a different - name.) - - In general, the use of "X-" top-level types is strongly - discouraged. Implementors should invent subtypes of the - existing types whenever possible. The invention of new - types is intended to be restricted primarily to the - development of new media types for email, such as digital - odors or holography, and not for new data formats in - general. In many cases, a subtype of application will be - more appropriate than a new top-level type. 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 52] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Summary - - Using the MIME-Version, Content-Type, and Content-Transfer- - Encoding header fields, it is possible to include, in a - standardized way, arbitrary types of data objects with RFC - 822 conformant mail messages. No restrictions imposed by - either RFC 821 or RFC 822 are violated, and care has been - taken to avoid problems caused by additional restrictions - imposed by the characteristics of some Internet mail - transport mechanisms (see Appendix B). The "multipart" and - "message" Content-Types allow mixing and hierarchical - structuring of objects of different types in a single - message. Further Content-Types provide a standardized - mechanism for tagging messages or body parts as audio, - image, or several other kinds of data. A distinguished - parameter syntax allows further specification of data format - details, particularly the specification of alternate - character sets. Additional optional header fields provide - mechanisms for certain extensions deemed desirable by many - implementors. Finally, a number of useful Content-Types are - defined for general use by consenting user agents, notably - text/richtext, message/partial, and message/external-body. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 53] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Acknowledgements - - This document is the result of the collective effort of a - large number of people, at several IETF meetings, on the - IETF-SMTP and IETF-822 mailing lists, and elsewhere. - Although any enumeration seems doomed to suffer from - egregious omissions, the following are among the many - contributors to this effort: - - Harald Tveit Alvestrand Timo Lehtinen - Randall Atkinson John R. 
MacMillan - Philippe Brandon Rick McGowan - Kevin Carosso Leo Mclaughlin - Uhhyung Choi Goli Montaser-Kohsari - Cristian Constantinof Keith Moore - Mark Crispin Tom Moore - Dave Crocker Erik Naggum - Terry Crowley Mark Needleman - Walt Daniels John Noerenberg - Frank Dawson Mats Ohrman - Hitoshi Doi Julian Onions - Kevin Donnelly Michael Patton - Keith Edwards David J. Pepper - Chris Eich Blake C. Ramsdell - Johnny Eriksson Luc Rooijakkers - Craig Everhart Marshall T. Rose - Patrik Faeltstroem Jonathan Rosenberg - Erik E. Fair Jan Rynning - Roger Fajman Harri Salminen - Alain Fontaine Michael Sanderson - James M. Galvin Masahiro Sekiguchi - Philip Gladstone Mark Sherman - Thomas Gordon Keld Simonsen - Phill Gross Bob Smart - James Hamilton Peter Speck - Steve Hardcastle-Kille Henry Spencer - David Herron Einar Stefferud - Bruce Howard Michael Stein - Bill Janssen Klaus Steinberger - Olle Jaernefors Peter Svanberg - Risto Kankkunen James Thompson - Phil Karn Steve Uhler - Alan Katz Stuart Vance - Tim Kehres Erik van der Poel - Neil Katin Guido van Rossum - Kyuho Kim Peter Vanderbilt - Anders Klemets Greg Vaudreuil - John Klensin Ed Vielmetti - Valdis Kletniek Ryan Waldron - Jim Knowles Wally Wedel - Stev Knowles Sven-Ove Westberg - Bob Kummerfeld Brian Wideen - - - - - - Borenstein & Freed [Page 54] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Pekka Kytolaakso John Wobus - Stellan Lagerstr.m Glenn Wright - Vincent Lau Rayan Zachariassen - Donald Lindsay David Zimmerman - The authors apologize for any omissions from this list, - which are certainly unintentional. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 55] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Appendix A -- Minimal MIME-Conformance - - The mechanisms described in this document are open-ended. 
- It is definitely not expected that all implementations will - support all of the Content-Types described, nor that they - will all share the same extensions. In order to promote - interoperability, however, it is useful to define the - concept of "MIME-conformance" to define a certain level of - implementation that allows the useful interworking of - messages with content that differs from US ASCII text. In - this section, we specify the requirements for such - conformance. - - A mail user agent that is MIME-conformant MUST: - - 1. Always generate a "MIME-Version: 1.0" header - field. - - 2. Recognize the Content-Transfer-Encoding header - field, and decode all received data encoded with - either the quoted-printable or base64 - implementations. Encode any data sent that is - not in seven-bit mail-ready representation using - one of these transformations and include the - appropriate Content-Transfer-Encoding header - field, unless the underlying transport mechanism - supports non-seven-bit data, as SMTP does not. - - 3. Recognize and interpret the Content-Type - header field, and avoid showing users raw data - with a Content-Type field other than text. Be - able to send at least text/plain messages, with - the character set specified as a parameter if it - is not US-ASCII. - - 4. Explicitly handle the following Content-Type - values, to at least the following extents: - - Text: - -- Recognize and display "text" mail - with the character set "US-ASCII." - -- Recognize other character sets at - least to the extent of being able - to inform the user about what - character set the message uses. - -- Recognize the "ISO-8859-*" character - sets to the extent of being able to - display those characters that are - common to ISO-8859-* and US-ASCII, - namely all characters represented - by octet values 0-127. - -- For unrecognized subtypes, show or - offer to show the user the "raw" - version of the data. 
An ability at - - - - Borenstein & Freed [Page 56] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - least to convert "text/richtext" to - plain text, as shown in Appendix D, - is encouraged, but not required for - conformance. - Message: - --Recognize and display at least the - primary (822) encapsulation. - Multipart: - -- Recognize the primary (mixed) - subtype. Display all relevant - information on the message level - and the body part header level and - then display or offer to display - each of the body parts - individually. - -- Recognize the "alternative" subtype, - and avoid showing the user - redundant parts of - multipart/alternative mail. - -- Treat any unrecognized subtypes as if - they were "mixed". - Application: - -- Offer the ability to remove either of - the two types of Content-Transfer- - Encoding defined in this document - and put the resulting information - in a user file. - - 5. Upon encountering any unrecognized Content- - Type, an implementation must treat it as if it had - a Content-Type of "application/octet-stream" with - no parameter sub-arguments. How such data are - handled is up to an implementation, but likely - options for handling such unrecognized data - include offering the user to write it into a file - (decoded from its mail transport format) or - offering the user to name a program to which the - decoded data should be passed as input. - Unrecognized predefined types, which in a MIME- - conformant mailer might still include audio, - image, or video, should also be treated in this - way. - - A user agent that meets the above conditions is said to be - MIME-conformant. The meaning of this phrase is that it is - assumed to be "safe" to send virtually any kind of - properly-marked data to users of such mail systems, because - such systems will at least be able to treat the data as - undifferentiated binary, and will not simply splash it onto - the screen of unsuspecting users. 
There is another sense - in which it is always "safe" to send data in a format that - is MIME-conformant, which is that such data will not break - or be broken by any known systems that are conformant with - RFC 821 and RFC 822. User agents that are MIME-conformant - - - - Borenstein & Freed [Page 57] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - have the additional guarantee that the user will not be - shown data that were never intended to be viewed as text. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 58] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Appendix B -- General Guidelines For Sending Email Data - - Internet email is not a perfect, homogeneous system. Mail - may become corrupted at several stages in its travel to a - final destination. Specifically, email sent throughout the - Internet may travel across many networking technologies. - Many networking and mail technologies do not support the - full functionality possible in the SMTP transport - environment. Mail traversing these systems is likely to be - modified in such a way that it can be transported. - - There exist many widely-deployed non-conformant MTAs in the - Internet. These MTAs, speaking the SMTP protocol, alter - messages on the fly to take advantage of the internal data - structure of the hosts they are implemented on, or are just - plain broken. - - The following guidelines may be useful to anyone devising a - data format (Content-Type) that will survive the widest - range of networking technologies and known broken MTAs - unscathed. Note that anything encoded in the base64 - encoding will satisfy these rules, but that some well-known - mechanisms, notably the UNIX uuencode facility, will not. 
- Note also that anything encoded in the Quoted-Printable - encoding will survive most gateways intact, but possibly not - some gateways to systems that use the EBCDIC character set. - - (1) Under some circumstances the encoding used for - data may change as part of normal gateway or user - agent operation. In particular, conversion from - base64 to quoted-printable and vice versa may be - necessary. This may result in the confusion of - CRLF sequences with line breaks in text body - parts. As such, the persistence of CRLF as - something other than a line break should not be - relied on. - - (2) Many systems may elect to represent and store - text data using local newline conventions. Local - newline conventions may not match the RFC822 CRLF - convention -- systems are known that use plain CR, - plain LF, CRLF, or counted records. The result is - that isolated CR and LF characters are not well - tolerated in general; they may be lost or - converted to delimiters on some systems, and hence - should not be relied on. - - (3) TAB (HT) characters may be misinterpreted or - may be automatically converted to variable numbers - of spaces. This is unavoidable in some - environments, notably those not based on the ASCII - character set. Such conversion is STRONGLY - DISCOURAGED, but it may occur, and mail formats - should not rely on the persistence of TAB (HT) - - - - Borenstein & Freed [Page 59] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - characters. - - (4) Lines longer than 76 characters may be wrapped - or truncated in some environments. Line wrapping - and line truncation are STRONGLY DISCOURAGED, but - unavoidable in some cases. Applications which - require long lines should somehow differentiate - between soft and hard line breaks. (A simple way - to do this is to use the quoted-printable - encoding.) 
- - (5) Trailing "white space" characters (SPACE, TAB - (HT)) on a line may be discarded by some transport - agents, while other transport agents may pad lines - with these characters so that all lines in a mail - file are of equal length. The persistence of - trailing white space, therefore, should not be - relied on. - - (6) Many mail domains use variations on the ASCII - character set, or use character sets such as - EBCDIC which contain most but not all of the US- - ASCII characters. The correct translation of - characters not in the "invariant" set cannot be - depended on across character converting gateways. - For example, this situation is a problem when - sending uuencoded information across BITNET, an - EBCDIC system. Similar problems can occur without - crossing a gateway, since many Internet hosts use - character sets other than ASCII internally. The - definition of Printable Strings in X.400 adds - further restrictions in certain special cases. In - particular, the only characters that are known to - be consistent across all gateways are the 73 - characters that correspond to the upper and lower - case letters A-Z and a-z, the 10 digits 0-9, and - the following eleven special characters: - - "'" (ASCII code 39) - "(" (ASCII code 40) - ")" (ASCII code 41) - "+" (ASCII code 43) - "," (ASCII code 44) - "-" (ASCII code 45) - "." (ASCII code 46) - "/" (ASCII code 47) - ":" (ASCII code 58) - "=" (ASCII code 61) - "?" (ASCII code 63) - - A maximally portable mail representation, such as - the base64 encoding, will confine itself to - relatively short lines of text in which the only - meaningful characters are taken from this set of - - - - Borenstein & Freed [Page 60] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - 73 characters. - - Please note that the above list is NOT a list of recommended - practices for MTAs. RFC 821 MTAs are prohibited from - altering the character of white space or wrapping long - lines. 
These BAD and illegal practices are known to occur - on established networks, and implementations should be robust - in dealing with the bad effects they can cause. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 61] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Appendix C -- A Complex Multipart Example - - What follows is the outline of a complex multipart message. - This message has five parts to be displayed serially: two - introductory plain text parts, an embedded multipart - message, a richtext part, and a closing encapsulated text - message in a non-ASCII character set. The embedded - multipart message has two parts to be displayed in parallel, - a picture and an audio fragment. - - MIME-Version: 1.0 - From: Nathaniel Borenstein <nsb@bellcore.com> - Subject: A multipart example - Content-Type: multipart/mixed; - boundary=unique-boundary-1 - - This is the preamble area of a multipart message. - Mail readers that understand multipart format - should ignore this preamble. - If you are reading this text, you might want to - consider changing to a mail reader that understands - how to properly display multipart messages. - --unique-boundary-1 - - ...Some text appears here... - [Note that the preceding blank line means - no header fields were given and this is text, - with charset US ASCII. It could have been - done with explicit typing as in the next part.] - - --unique-boundary-1 - Content-type: text/plain; charset=US-ASCII - - This could have been part of the previous part, - but illustrates explicit versus implicit - typing of body parts. - - --unique-boundary-1 - Content-Type: multipart/parallel; - boundary=unique-boundary-2 - - - --unique-boundary-2 - Content-Type: audio/basic - Content-Transfer-Encoding: base64 - - ... base64-encoded 8000 Hz single-channel - u-law-format audio data goes here.... 
- - --unique-boundary-2 - Content-Type: image/gif - Content-Transfer-Encoding: Base64 - - - - - - Borenstein & Freed [Page 62] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - ... base64-encoded image data goes here.... - - --unique-boundary-2-- - - --unique-boundary-1 - Content-type: text/richtext - - This is <bold><italic>richtext.</italic></bold> - <nl><nl>Isn't it - <bigger><bigger>cool?</bigger></bigger> - - --unique-boundary-1 - Content-Type: message/rfc822 - - From: (name in US-ASCII) - Subject: (subject in US-ASCII) - Content-Type: Text/plain; charset=ISO-8859-1 - Content-Transfer-Encoding: Quoted-printable - - ... Additional text in ISO-8859-1 goes here ... - - --unique-boundary-1-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 63] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Appendix D -- A Simple Richtext-to-Text Translator in C - - One of the major goals in the design of the richtext subtype - of the text Content-Type is to make formatted text so simple - that even text-only mailers will implement richtext-to- - plain-text translators, thus increasing the likelihood that - multifont text will become "safe" to use very widely. To - demonstrate this simplicity, what follows is an extremely - simple 44-line C program that converts richtext input into - plain text output: - - #include <stdio.h> - #include <ctype.h> - main() { - int c, i; - char token[50]; - - while((c = getc(stdin)) != EOF) { - if (c == '<') { - for (i=0; (i<49 && (c = getc(stdin)) != '>' - && c != EOF); ++i) { - token[i] = isupper(c) ? 
tolower(c) : c; - } - if (c == EOF) break; - if (c != '>') while ((c = getc(stdin)) != - '>' - && c != EOF) {;} - if (c == EOF) break; - token[i] = '\0'; - if (!strcmp(token, "lt")) { - putc('<', stdout); - } else if (!strcmp(token, "nl")) { - putc('\n', stdout); - } else if (!strcmp(token, "/paragraph")) { - fputs("\n\n", stdout); - } else if (!strcmp(token, "comment")) { - int commct=1; - while (commct > 0) { - while ((c = getc(stdin)) != '<' - && c != EOF) ; - if (c == EOF) break; - for (i=0; (i<49 && (c = getc(stdin)) != '>' - && c != EOF); ++i) { - token[i] = isupper(c) ? - tolower(c) : c; - } - if (c== EOF) break; - token[i] = '\0'; - if (!strcmp(token, "/comment")) -- - commct; - if (!strcmp(token, "comment")) - ++commct; - - - - - - Borenstein & Freed [Page 64] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - } - } /* Ignore all other tokens */ - } else if (c != '\n') putc(c, stdout); - } - putc('\n', stdout); /* for good measure */ - } - It should be noted that one can do considerably better than - this in displaying richtext data on a dumb terminal. In - particular, one can replace font information such as "bold" - with textual emphasis (like *this* or _T_H_I_S_). One can - also properly handle the richtext formatting commands - regarding indentation, justification, and others. However, - the above program is all that is necessary in order to - present richtext on a dumb terminal. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 65] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Appendix E -- Collected Grammar - - This appendix contains the complete BNF grammar for all the - syntax specified by this document. - - By itself, however, this grammar is incomplete. It refers - to several entities that are defined by RFC 822. 
Rather - than reproduce those definitions here, and risk - unintentional differences between the two, this document - simply refers the reader to RFC 822 for the remaining - definitions. Wherever a term is undefined, it refers to the - RFC 822 definition. - - attribute := token - - body-part = <"message" as defined in RFC 822, - with all header fields optional, and with the - specified delimiter not occurring anywhere in - the message body, either on a line by itself - or as a substring anywhere.> - - boundary := 0*69<bchars> bcharsnospace - - bchars := bcharsnospace / " " - - bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / - "_" - / "," / "-" / "." / "/" / ":" / "=" / "?" - - close-delimiter := delimiter "--" - - Content-Description := *text - - Content-ID := msg-id - - Content-Transfer-Encoding := "BASE64" / "QUOTED- - PRINTABLE" / - "8BIT" / "7BIT" / - "BINARY" / x-token - - Content-Type := type "/" subtype *[";" parameter] - - delimiter := CRLF "--" boundary ; taken from Content-Type - field. - ; when content-type is - multipart - ; There should be no space - ; between "--" and boundary. - - encapsulation := delimiter CRLF body-part - - epilogue := *text ; to be ignored upon - receipt. - - - - - Borenstein & Freed [Page 66] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - MIME-Version := 1*text - - multipart-body := preamble 1*encapsulation close-delimiter - epilogue - - parameter := attribute "=" value - - preamble := *text ; to be ignored upon - receipt. - - subtype := token - - token := 1*<any CHAR except SPACE, CTLs, or tspecials> - - tspecials := "(" / ")" / "<" / ">" / "@" ; Must be in - / "," / ";" / ":" / "\" / <"> ; quoted-string, - / "/" / "[" / "]" / "?" / "." 
; to use within - / "=" ; parameter values - - - type := "application" / "audio" ; case- - insensitive - / "image" / "message" - / "multipart" / "text" - / "video" / x-token - - value := token / quoted-string - - x-token := <The two characters "X-" followed, with no - intervening white space, by any token> - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 67] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Appendix F -- IANA Registration Procedures - - MIME has been carefully designed to have extensible - mechanisms, and it is expected that the set of content- - type/subtype pairs and their associated parameters will grow - significantly with time. Several other MIME fields, notably - character set names, access-type parameters for the - message/external-body type, conversions parameters for the - application type, and possibly even Content-Transfer- - Encoding values, are likely to have new values defined over - time. In order to ensure that the set of such values is - developed in an orderly, well-specified, and public manner, - MIME defines a registration process which uses the Internet - Assigned Numbers Authority (IANA) as a central registry for - such values. - - In general, parameters in the content-type header field are - used to convey supplemental information for various content - types, and their use is defined when the content-type and - subtype are defined. New parameters should not be defined - as a way to introduce new functionality. - - In order to simplify and standardize the registration - process, this appendix gives templates for the registration - of new values with IANA. Each of these is given in the form - of an email message template, to be filled in by the - registering party. - - F.1 Registration of New Content-type/subtype Values - - Note that MIME is generally expected to be extended by - subtypes. 
If a new fundamental top-level type is needed, - its specification should be published as an RFC or - submitted in a form suitable to become an RFC, and be - subject to the Internet standards process. - - To: IANA@isi.edu - Subject: Registration of new MIME content-type/subtype - - MIME type name: - - (If the above is not an existing top-level MIME type, - please explain why an existing type cannot be used.) - - MIME subtype name: - - Required parameters: - - Optional parameters: - - Encoding considerations: - - Security considerations: - - - - - Borenstein & Freed [Page 68] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Published specification: - - (The published specification must be an Internet RFC or - RFC-to-be if a new top-level type is being defined, and - must be a publicly available specification in any - case.) - - Person & email address to contact for further - information: - F.2 Registration of New Character Set Values - - To: IANA@isi.edu - Subject: Registration of new MIME character set value - - MIME character set name: - - Published specification: - - (The published specification must be an Internet RFC or - RFC-to-be or an international standard.) - - Person & email address to contact for further - information: - - F.3 Registration of New Access-type Values for - Message/external-body - - To: IANA@isi.edu - Subject: Registration of new MIME Access-type for - Message/external-body content-type - - MIME access-type name: - - Required parameters: - - Optional parameters: - - Published specification: - - (The published specification must be an Internet RFC or - RFC-to-be.) 
- - Person & email address to contact for further - information: - - - F.4 Registration of New Conversions Values for Application - - To: IANA@isi.edu - Subject: Registration of new MIME Conversions value - for Application content-type - - MIME Conversions name: - - - - - Borenstein & Freed [Page 69] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Published specification: - - (The published specification must be an Internet RFC or - RFC-to-be.) - - Person & email address to contact for further - information: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 70] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Appendix G -- Summary of the Seven Content-types - - Content-type: text - - Subtypes defined by this document: plain, richtext - - Important Parameters: charset - - Encoding notes: quoted-printable generally preferred if an - encoding is needed and the character set is mostly an - ASCII superset. - - Security considerations: Rich text formats such as TeX and - Troff often contain mechanisms for executing arbitrary - commands or file system operations, and should not be - used automatically unless these security problems have - been addressed. Even plain text may contain control - characters that can be used to exploit the capabilities - of "intelligent" terminals and cause security - violations. User interfaces designed to run on such - terminals should be aware of and try to prevent such - problems. - ________________________________________________________________ - - Content-type: multipart - - Subtypes defined by this document: mixed, alternative, - digest, parallel. - - Important Parameters: boundary - - Encoding notes: No content-transfer-encoding is permitted. 
- - ________________________________________________________________ - - Content-type: message - - Subtypes defined by this document: rfc822, partial, - external-body - - Important Parameters: id, number, total - - Encoding notes: No content-transfer-encoding is permitted. - - ________________________________________________________________ - - Content-type: application - - Subtypes defined by this document: octet-stream, - postscript, oda - - Important Parameters: profile - - - - - - Borenstein & Freed [Page 71] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Encoding notes: base64 generally preferred for octet-stream - or other unreadable subtypes. - - Security considerations: This type is intended for the - transmission of data to be interpreted by locally-installed - programs. If used, for example, to transmit executable - binary programs or programs in general-purpose interpreted - languages, such as LISP programs or shell scripts, severe - security problems could result. In general, authors of - mail-reading agents are cautioned against giving their - systems the power to execute mail-based application data - without carefully considering the security implications. - While it is certainly possible to define safe application - formats and even safe interpreters for unsafe formats, each - interpreter should be evaluated separately for possible - security problems. 
- ________________________________________________________________ - - Content-type: image - - Subtypes defined by this document: jpeg, gif - - Important Parameters: none - - Encoding notes: base64 generally preferred - - ________________________________________________________________ - - Content-type: audio - - Subtypes defined by this document: basic - - Important Parameters: none - - Encoding notes: base64 generally preferred - - ________________________________________________________________ - - Content-type: video - - Subtypes defined by this document: mpeg - - Important Parameters: none - - Encoding notes: base64 generally preferred - - - - - - - - - - - - - Borenstein & Freed [Page 72] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Appendix H -- Canonical Encoding Model - - - - There was some confusion, in earlier drafts of this memo, - regarding the model for when email data was to be converted - to canonical form and encoded, and in particular how this - process would affect the treatment of CRLFs, given that the - representation of newlines varies greatly from system to - system. For this reason, a canonical model for encoding is - presented below. - - The process of composing a MIME message part can be modelled - as being done in a number of steps. Note that these steps - are roughly similar to those steps used in RFC1113: - - Step 1. Creation of local form. - - The body part to be transmitted is created in the system's - native format. The native character set is used, and where - appropriate local end of line conventions are used as well. - This may be a UNIX-style text file, or a Sun raster image, or - a VMS indexed file, or audio data in a system-dependent - format stored only in memory, or anything else that - corresponds to the local model for the representation of - some form of information. - - Step 2. Conversion to canonical form. 
- - The entire body part, including "out-of-band" information - such as record lengths and possibly file attribute - information, is converted to a universal canonical form. - The specific content type of the body part as well as its - associated attributes dictate the nature of the canonical - form that is used. Conversion to the proper canonical form - may involve character set conversion, transformation of - audio data, compression, or various other operations - specific to the various content types. - - For example, in the case of text/plain data, the text must - be converted to a supported character set and lines must be - delimited with CRLF delimiters in accordance with RFC822. - Note that the restriction on line lengths implied by RFC822 - is eliminated if the next step employs either quoted- - printable or base64 encoding. - - Step 3. Apply transfer encoding. - - A Content-Transfer-Encoding appropriate for this body part - is applied. Note that there is no fixed relationship - between the content type and the transfer encoding. In - particular, it may be appropriate to base the choice of - base64 or quoted-printable on character frequency counts - which are specific to a given instance of body part. - - - - Borenstein & Freed [Page 73] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Step 4. Insertion into message. - - The encoded object is inserted into a MIME message with - appropriate body part headers and boundary markers. - - It is vital to note that these steps are only a model; they - are specifically NOT a blueprint for how an actual system - would be built. In particular, the model fails to account - for two common designs: - - 1. In many cases the conversion to a canonical - form prior to encoding will be subsumed into the - encoder itself, which understands local formats - directly. 
For example, the local newline - convention for text bodyparts might be carried - through to the encoder itself along with knowledge - of what that format is. - - 2. The output of the encoders may have to pass - through one or more additional steps prior to - being transmitted as a message. As such, the - output of the encoder may not be compliant with - the formats specified by RFC822. In particular, - once again it may be appropriate for the - converter's output to be expressed using local - newline conventions rather than using the standard - RFC822 CRLF delimiters. - - Other implementation variations are conceivable as well. - The only important aspect of this discussion is that the - resulting messages are consistent with those produced by the - model described here. - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 74] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - References - - [US-ASCII] Coded Character Set--7-Bit American Standard Code - for Information Interchange, ANSI X3.4-1986. - - [ATK] Borenstein, Nathaniel S., Multimedia Applications - Development with the Andrew Toolkit, Prentice-Hall, 1990. - - [GIF] Graphics Interchange Format (Version 89a), Compuserve, - Inc., Columbus, Ohio, 1990. - - [ISO-2022] International Standard--Information Processing-- - ISO 7-bit and 8-bit coded character sets--Code extension - techniques, ISO 2022:1986. - - [ISO-8859] Information Processing -- 8-bit Single-Byte Coded - Graphic Character Sets -- Part 1: Latin Alphabet No. 1, ISO - 8859-1:1987. Part 2: Latin alphabet No. 2, ISO 8859-2, - 1987. Part 3: Latin alphabet No. 3, ISO 8859-3, 1988. Part - 4: Latin alphabet No. 4, ISO 8859-4, 1988. Part 5: - Latin/Cyrillic alphabet, ISO 8859-5, 1988. Part 6: - Latin/Arabic alphabet, ISO 8859-6, 1987. Part 7: - Latin/Greek alphabet, ISO 8859-7, 1987. Part 8: - Latin/Hebrew alphabet, ISO 8859-8, 1988. Part 9: Latin - alphabet No. 5, ISO 8859-9, 1990. 
- - [ISO-646] International Standard--Information Processing-- - ISO 7-bit coded character set for information interchange, - ISO 646:1983. - - [MPEG] Video Coding Draft Standard ISO 11172 CD, ISO - IEC/TJC1/SC2/WG11 (Motion Picture Experts Group), May, 1991. - - [ODA] ISO 8613; Information Processing: Text and Office - System; Office Document Architecture (ODA) and Interchange - Format (ODIF), Part 1-8, 1989. - - [PCM] CCITT, Fascicle III.4 - Recommendation G.711, Geneva, - 1972, "Pulse Code Modulation (PCM) of Voice Frequencies". - - [POSTSCRIPT] Adobe Systems, Inc., PostScript Language - Reference Manual, Addison-Wesley, 1985. - - [X400] Schicker, Pietro, "Message Handling Systems, X.400", - Message Handling Systems and Distributed Applications, E. - Stefferud, O-j. Jacobsen, and P. Schicker, eds., North- - Holland, 1989, pp. 3-41. - - [RFC-783] Sollins, K.R. TFTP Protocol (revision 2). June, - 1981, MIT, RFC-783. - - [RFC-821] Postel, J.B. Simple Mail Transfer Protocol. - August, 1982, USC/Information Sciences Institute, RFC-821. - - - - - Borenstein & Freed [Page 75] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - [RFC-822] Crocker, D. Standard for the format of ARPA - Internet text messages. August, 1982, UDEL, RFC-822. - - [RFC-934] Rose, M.T.; Stefferud, E.A. Proposed standard - for message encapsulation. January, 1985, Delaware - and NMA, RFC-934. - - [RFC-959] Postel, J.B.; Reynolds, J.K. File Transfer - Protocol. October, 1985, USC/Information Sciences - Institute, RFC-959. - - [RFC-1049] Sirbu, M.A. Content-Type header field for - Internet messages. March, 1988, CMU, RFC-1049. - - [RFC-1113] Linn, J. Privacy enhancement for Internet - electronic mail: Part I - message encipherment and - authentication procedures. August, 1989, IAB Privacy Task - Force, RFC-1113. - - [RFC-1154] Robinson, D.; Ullmann, R. Encoding header field - for Internet messages. April, 1990, Prime Computer, - Inc., RFC-1154. 
- - [RFC-1342] Moore, Keith, Representation of Non-Ascii Text in - Internet Message Headers. June, 1992, University of - Tennessee, RFC-1342. - - Security Considerations - - Security issues are discussed in Section 7.4.2 and in - Appendix G. Implementors should pay special attention to - the security implications of any mail content-types that can - cause the remote execution of any actions in the recipient's - environment. In such cases, the discussion of the - application/postscript content-type in Section 7.4.2 may - serve as a model for considering other content-types with - remote execution capabilities. - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 76] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - Authors' Addresses - - For more information, the authors of this document may be - contacted via Internet mail: - - Nathaniel S. Borenstein - MRE 2D-296, Bellcore - 445 South St. - Morristown, NJ 07962-1910 - - Phone: +1 201 829 4270 - Fax: +1 201 829 7019 - Email: nsb@bellcore.com - - - Ned Freed - Innosoft International, Inc. - 250 West First Street - Suite 240 - Claremont, CA 91711 - - Phone: +1 714 624 7907 - Fax: +1 714 621 5319 - Email: ned@innosoft.com - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page 77] - - - - - RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 - - - - - - THIS PAGE INTENTIONALLY LEFT BLANK. - - Please discard this page and place the following table of - contents after the title page. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page i] - - - - - - - - - Table of Contents - - - 1 Introduction....................................... 1 - 2 Notations, Conventions, and Generic BNF Grammar.... 3 - 3 The MIME-Version Header Field...................... 5 - 4 The Content-Type Header Field...................... 6 - 5 The Content-Transfer-Encoding Header Field......... 
10 - 5.1 Quoted-Printable Content-Transfer-Encoding......... 14 - 5.2 Base64 Content-Transfer-Encoding................... 17 - 6 Additional Optional Content- Header Fields......... 19 - 6.1 Optional Content-ID Header Field................... 19 - 6.2 Optional Content-Description Header Field.......... 19 - 7 The Predefined Content-Type Values................. 20 - 7.1 The Text Content-Type.............................. 20 - 7.1.1 The charset parameter.............................. 20 - 7.1.2 The Text/plain subtype............................. 23 - 7.1.3 The Text/richtext subtype.......................... 23 - 7.2 The Multipart Content-Type......................... 29 - 7.2.1 Multipart: The common syntax...................... 30 - 7.2.2 The Multipart/mixed (primary) subtype.............. 34 - 7.2.3 The Multipart/alternative subtype.................. 34 - 7.2.4 The Multipart/digest subtype....................... 36 - 7.2.5 The Multipart/parallel subtype..................... 36 - 7.3 The Message Content-Type........................... 37 - 7.3.1 The Message/rfc822 (primary) subtype............... 37 - 7.3.2 The Message/Partial subtype........................ 37 - 7.3.3 The Message/External-Body subtype.................. 40 - 7.4 The Application Content-Type....................... 46 - 7.4.1 The Application/Octet-Stream (primary) subtype..... 46 - 7.4.2 The Application/PostScript subtype................. 47 - 7.4.3 The Application/ODA subtype........................ 50 - 7.5 The Image Content-Type............................. 51 - 7.6 The Audio Content-Type............................. 51 - 7.7 The Video Content-Type............................. 51 - 7.8 Experimental Content-Type Values................... 51 - Summary............................................ 53 - Acknowledgements................................... 54 - Appendix A -- Minimal MIME-Conformance............. 
56 - Appendix B -- General Guidelines For Sending Email Data59 - Appendix C -- A Complex Multipart Example.......... 62 - Appendix D -- A Simple Richtext-to-Text Translator in C64 - Appendix E -- Collected Grammar.................... 66 - Appendix F -- IANA Registration Procedures......... 68 - F.1 Registration of New Content-type/subtype Values..68 - F.2 Registration of New Character Set Values...... 69 - F.3 Registration of New Access-type Values for Message/external-body69 - F.4 Registration of New Conversions Values for Application69 - Appendix G -- Summary of the Seven Content-types... 71 - Appendix H -- Canonical Encoding Model............. 73 - References......................................... 75 - Security Considerations............................ 76 - Authors' Addresses................................. 77 - - - - Borenstein & Freed [Page ii] - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Borenstein & Freed [Page iii] - diff --git a/proto/rfc2045.txt b/proto/rfc2045.txt @@ -1,1739 +0,0 @@ - - - - - - -Network Working Group N. Freed -Request for Comments: 2045 Innosoft -Obsoletes: 1521, 1522, 1590 N. Borenstein -Category: Standards Track First Virtual - November 1996 - - - Multipurpose Internet Mail Extensions - (MIME) Part One: - Format of Internet Message Bodies - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Abstract - - STD 11, RFC 822, defines a message representation protocol specifying - considerable detail about US-ASCII message headers, and leaves the - message content, or message body, as flat US-ASCII text. 
This set of - documents, collectively called the Multipurpose Internet Mail - Extensions, or MIME, redefines the format of messages to allow for - - (1) textual message bodies in character sets other than - US-ASCII, - - (2) an extensible set of different formats for non-textual - message bodies, - - (3) multi-part message bodies, and - - (4) textual header information in character sets other than - US-ASCII. - - These documents are based on earlier work documented in RFC 934, STD - 11, and RFC 1049, but extend and revise them. Because RFC 822 said - so little about message bodies, these documents are largely - orthogonal to (rather than a revision of) RFC 822. - - This initial document specifies the various headers used to describe - the structure of MIME messages. The second document, RFC 2046, - defines the general structure of the MIME media typing system and - defines an initial set of media types. The third document, RFC 2047, - describes extensions to RFC 822 to allow non-US-ASCII text data in - - - -Freed & Borenstein Standards Track [Page 1] - -RFC 2045 Internet Message Bodies November 1996 - - - Internet mail header fields. The fourth document, RFC 2048, specifies - various IANA registration procedures for MIME-related facilities. The - fifth and final document, RFC 2049, describes MIME conformance - criteria as well as providing some illustrative examples of MIME - message formats, acknowledgements, and the bibliography. - - These documents are revisions of RFCs 1521, 1522, and 1590, which - themselves were revisions of RFCs 1341 and 1342. An appendix in RFC - 2049 describes differences and changes from previous versions. - -Table of Contents - - 1. Introduction ......................................... 3 - 2. Definitions, Conventions, and Generic BNF Grammar .... 5 - 2.1 CRLF ................................................ 5 - 2.2 Character Set ....................................... 6 - 2.3 Message ............................................. 
6 - 2.4 Entity .............................................. 6 - 2.5 Body Part ........................................... 7 - 2.6 Body ................................................ 7 - 2.7 7bit Data ........................................... 7 - 2.8 8bit Data ........................................... 7 - 2.9 Binary Data ......................................... 7 - 2.10 Lines .............................................. 7 - 3. MIME Header Fields ................................... 8 - 4. MIME-Version Header Field ............................ 8 - 5. Content-Type Header Field ............................ 10 - 5.1 Syntax of the Content-Type Header Field ............. 12 - 5.2 Content-Type Defaults ............................... 14 - 6. Content-Transfer-Encoding Header Field ............... 14 - 6.1 Content-Transfer-Encoding Syntax .................... 14 - 6.2 Content-Transfer-Encodings Semantics ................ 15 - 6.3 New Content-Transfer-Encodings ...................... 16 - 6.4 Interpretation and Use .............................. 16 - 6.5 Translating Encodings ............................... 18 - 6.6 Canonical Encoding Model ............................ 19 - 6.7 Quoted-Printable Content-Transfer-Encoding .......... 19 - 6.8 Base64 Content-Transfer-Encoding .................... 24 - 7. Content-ID Header Field .............................. 26 - 8. Content-Description Header Field ..................... 27 - 9. Additional MIME Header Fields ........................ 27 - 10. Summary ............................................. 27 - 11. Security Considerations ............................. 27 - 12. Authors' Addresses .................................. 28 - A. Collected Grammar .................................... 29 - - - - - - -Freed & Borenstein Standards Track [Page 2] - -RFC 2045 Internet Message Bodies November 1996 - - -1. 
Introduction - - Since its publication in 1982, RFC 822 has defined the standard - format of textual mail messages on the Internet. Its success has - been such that the RFC 822 format has been adopted, wholly or - partially, well beyond the confines of the Internet and the Internet - SMTP transport defined by RFC 821. As the format has seen wider use, - a number of limitations have proven increasingly restrictive for the - user community. - - RFC 822 was intended to specify a format for text messages. As such, - non-text messages, such as multimedia messages that might include - audio or images, are simply not mentioned. Even in the case of text, - however, RFC 822 is inadequate for the needs of mail users whose - languages require the use of character sets richer than US-ASCII. - Since RFC 822 does not specify mechanisms for mail containing audio, - video, Asian language text, or even text in most European languages, - additional specifications are needed. - - One of the notable limitations of RFC 821/822 based mail systems is - the fact that they limit the contents of electronic mail messages to - relatively short lines (e.g. 1000 characters or less [RFC-821]) of - 7bit US-ASCII. This forces users to convert any non-textual data - that they may wish to send into seven-bit bytes representable as - printable US-ASCII characters before invoking a local mail UA (User - Agent, a program with which human users send and receive mail). - Examples of such encodings currently used in the Internet include - pure hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in - RFC 1421, the Andrew Toolkit Representation [ATK], and many others. - - The limitations of RFC 822 mail become even more apparent as gateways - are designed to allow for the exchange of mail messages between RFC - 822 hosts and X.400 hosts. X.400 [X400] specifies mechanisms for the - inclusion of non-textual material within electronic mail messages. 
- The current standards for the mapping of X.400 messages to RFC 822 - messages specify either that X.400 non-textual material must be - converted to (not encoded in) IA5Text format, or that they must be - discarded, notifying the RFC 822 user that discarding has occurred. - This is clearly undesirable, as information that a user may wish to - receive is lost. Even though a user agent may not have the - capability of dealing with the non-textual material, the user might - have some mechanism external to the UA that can extract useful - information from the material. Moreover, it does not allow for the - fact that the message may eventually be gatewayed back into an X.400 - message handling system (i.e., the X.400 message is "tunneled" - through Internet mail), where the non-textual information would - definitely become useful again. - - - - -Freed & Borenstein Standards Track [Page 3] - -RFC 2045 Internet Message Bodies November 1996 - - - This document describes several mechanisms that combine to solve most - of these problems without introducing any serious incompatibilities - with the existing world of RFC 822 mail. In particular, it - describes: - - (1) A MIME-Version header field, which uses a version - number to declare a message to be conformant with MIME - and allows mail processing agents to distinguish - between such messages and those generated by older or - non-conformant software, which are presumed to lack - such a field. - - (2) A Content-Type header field, generalized from RFC 1049, - which can be used to specify the media type and subtype - of data in the body of a message and to fully specify - the native representation (canonical form) of such - data. - - (3) A Content-Transfer-Encoding header field, which can be - used to specify both the encoding transformation that - was applied to the body and the domain of the result. 
- Encoding transformations other than the identity - transformation are usually applied to data in order to - allow it to pass through mail transport mechanisms - which may have data or character set limitations. - - (4) Two additional header fields that can be used to - further describe the data in a body, the Content-ID and - Content-Description header fields. - - All of the header fields defined in this document are subject to the - general syntactic rules for header fields specified in RFC 822. In - particular, all of these header fields except for Content-Disposition - can include RFC 822 comments, which have no semantic content and - should be ignored during MIME processing. - - Finally, to specify and promote interoperability, RFC 2049 provides a - basic applicability statement for a subset of the above mechanisms - that defines a minimal level of "conformance" with this document. - - HISTORICAL NOTE: Several of the mechanisms described in this set of - documents may seem somewhat strange or even baroque at first reading. - It is important to note that compatibility with existing standards - AND robustness across existing practice were two of the highest - priorities of the working group that developed this set of documents. - In particular, compatibility was always favored over elegance. - - - - - -Freed & Borenstein Standards Track [Page 4] - -RFC 2045 Internet Message Bodies November 1996 - - - Please refer to the current edition of the "Internet Official - Protocol Standards" for the standardization state and status of this - protocol. RFC 822 and STD 3, RFC 1123 also provide essential - background for MIME since no conforming implementation of MIME can - violate them. In addition, several other informational RFC documents - will be of interest to the MIME implementor, in particular RFC 1344, - RFC 1345, and RFC 1524. - -2. 
Definitions, Conventions, and Generic BNF Grammar - - Although the mechanisms specified in this set of documents are all - described in prose, most are also described formally in the augmented - BNF notation of RFC 822. Implementors will need to be familiar with - this notation in order to understand this set of documents, and are - referred to RFC 822 for a complete explanation of the augmented BNF - notation. - - Some of the augmented BNF in this set of documents makes named - references to syntax rules defined in RFC 822. A complete formal - grammar, then, is obtained by combining the collected grammar - appendices in each document in this set with the BNF of RFC 822 plus - the modifications to RFC 822 defined in RFC 1123 (which specifically - changes the syntax for `return', `date' and `mailbox'). - - All numeric and octet values are given in decimal notation in this - set of documents. All media type values, subtype values, and - parameter names as defined are case-insensitive. However, parameter - values are case-sensitive unless otherwise specified for the specific - parameter. - - FORMATTING NOTE: Notes, such as this one, provide additional - nonessential information which may be skipped by the reader without - missing anything essential. The primary purpose of these - nonessential notes is to convey information about the rationale of this - set of documents, or to place these documents in the proper - historical or evolutionary context. Such information may in - particular be skipped by those who are focused entirely on building a - conformant implementation, but may be of use to those who wish to - understand why certain design choices were made. - -2.1. CRLF - - The term CRLF, in this set of documents, refers to the sequence of - octets corresponding to the two US-ASCII characters CR (decimal value - 13) and LF (decimal value 10) which, taken together, in this order, - denote a line break in RFC 822 mail. 
- - - - - -Freed & Borenstein Standards Track [Page 5] - -RFC 2045 Internet Message Bodies November 1996 - - -2.2. Character Set - - The term "character set" is used in MIME to refer to a method of - converting a sequence of octets into a sequence of characters. Note - that unconditional and unambiguous conversion in the other direction - is not required, in that not all characters may be representable by a - given character set and a character set may provide more than one - sequence of octets to represent a particular sequence of characters. - - This definition is intended to allow various kinds of character - encodings, from simple single-table mappings such as US-ASCII to - complex table switching methods such as those that use ISO 2022's - techniques, to be used as character sets. However, the definition - associated with a MIME character set name must fully specify the - mapping to be performed. In particular, use of external profiling - information to determine the exact mapping is not permitted. - - NOTE: The term "character set" was originally used to describe such - straightforward schemes as US-ASCII and ISO-8859-1 which have a - simple one-to-one mapping from single octets to single characters. - Multi-octet coded character sets and switching techniques make the - situation more complex. For example, some communities use the term - "character encoding" for what MIME calls a "character set", while - using the phrase "coded character set" to denote an abstract mapping - from integers (not octets) to characters. - -2.3. Message - - The term "message", when not further qualified, means either a - (complete or "top-level") RFC 822 message being transferred on a - network, or a message encapsulated in a body of type "message/rfc822" - or "message/partial". - -2.4. Entity - - The term "entity" refers specifically to the MIME-defined header - fields and contents of either a message or one of the parts in the - body of a multipart entity. 
The specification of such entities is - the essence of MIME. Since the contents of an entity are often - called the "body", it makes sense to speak about the body of an - entity. Any sort of field may be present in the header of an entity, - but only those fields whose names begin with "content-" actually have - any MIME-related meaning. Note that this does NOT imply that they - have no meaning at all -- an entity that is also a message has non- - MIME header fields whose meanings are defined by RFC 822. - - - - - - -Freed & Borenstein Standards Track [Page 6] - -RFC 2045 Internet Message Bodies November 1996 - - -2.5. Body Part - - The term "body part" refers to an entity inside of a multipart - entity. - -2.6. Body - - The term "body", when not further qualified, means the body of an - entity, that is, the body of either a message or of a body part. - - NOTE: The previous four definitions are clearly circular. This is - unavoidable, since the overall structure of a MIME message is indeed - recursive. - -2.7. 7bit Data - - "7bit data" refers to data that is all represented as relatively - short lines with 998 octets or less between CRLF line separation - sequences [RFC-821]. No octets with decimal values greater than 127 - are allowed and neither are NULs (octets with decimal value 0). CR - (decimal value 13) and LF (decimal value 10) octets only occur as - part of CRLF line separation sequences. - -2.8. 8bit Data - - "8bit data" refers to data that is all represented as relatively - short lines with 998 octets or less between CRLF line separation - sequences [RFC-821], but octets with decimal values greater than 127 - may be used. As with "7bit data" CR and LF octets only occur as part - of CRLF line separation sequences and no NULs are allowed. - -2.9. Binary Data - - "Binary data" refers to data where any sequence of octets whatsoever - is allowed. - -2.10. Lines - - "Lines" are defined as sequences of octets separated by CRLF - sequences. 
This is consistent with both RFC 821 and RFC 822. - "Lines" only refers to a unit of data in a message, which may or may - not correspond to something that is actually displayed by a user - agent. - - - - - - - - -Freed & Borenstein Standards Track [Page 7] - -RFC 2045 Internet Message Bodies November 1996 - - -3. MIME Header Fields - - MIME defines a number of new RFC 822 header fields that are used to - describe the content of a MIME entity. These header fields occur in - at least two contexts: - - (1) As part of a regular RFC 822 message header. - - (2) In a MIME body part header within a multipart - construct. - - The formal definition of these header fields is as follows: - - entity-headers := [ content CRLF ] - [ encoding CRLF ] - [ id CRLF ] - [ description CRLF ] - *( MIME-extension-field CRLF ) - - MIME-message-headers := entity-headers - fields - version CRLF - ; The ordering of the header - ; fields implied by this BNF - ; definition should be ignored. - - MIME-part-headers := entity-headers - [ fields ] - ; Any field not beginning with - ; "content-" can have no defined - ; meaning and may be ignored. - ; The ordering of the header - ; fields implied by this BNF - ; definition should be ignored. - - The syntax of the various specific MIME header fields will be - described in the following sections. - -4. MIME-Version Header Field - - Since RFC 822 was published in 1982, there has really been only one - format standard for Internet messages, and there has been little - perceived need to declare the format standard in use. This document - is an independent specification that complements RFC 822. Although - the extensions in this document have been defined in such a way as to - be compatible with RFC 822, there are still circumstances in which it - might be desirable for a mail-processing agent to know whether a - message was composed with the new standard in mind. 
- - - -Freed & Borenstein Standards Track [Page 8] - -RFC 2045 Internet Message Bodies November 1996 - - - Therefore, this document defines a new header field, "MIME-Version", - which is to be used to declare the version of the Internet message - body format standard in use. - - Messages composed in accordance with this document MUST include such - a header field, with the following verbatim text: - - MIME-Version: 1.0 - - The presence of this header field is an assertion that the message - has been composed in compliance with this document. - - Since it is possible that a future document might extend the message - format standard again, a formal BNF is given for the content of the - MIME-Version field: - - version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT - - Thus, future format specifiers, which might replace or extend "1.0", - are constrained to be two integer fields, separated by a period. If - a message is received with a MIME-version value other than "1.0", it - cannot be assumed to conform with this document. - - Note that the MIME-Version header field is required at the top level - of a message. It is not required for each body part of a multipart - entity. It is required for the embedded headers of a body of type - "message/rfc822" or "message/partial" if and only if the embedded - message is itself claimed to be MIME-conformant. - - It is not possible to fully specify how a mail reader that conforms - with MIME as defined in this document should treat a message that - might arrive in the future with some value of MIME-Version other than - "1.0". - - It is also worth noting that version control for specific media types - is not accomplished using the MIME-Version mechanism. In particular, - some formats (such as application/postscript) have version numbering - conventions that are internal to the media format. Where such - conventions exist, MIME does nothing to supersede them. 
Where no - such conventions exist, a MIME media type might use a "version" - parameter in the content-type field if necessary. - - - - - - - - - - -Freed & Borenstein Standards Track [Page 9] - -RFC 2045 Internet Message Bodies November 1996 - - - NOTE TO IMPLEMENTORS: When checking MIME-Version values any RFC 822 - comment strings that are present must be ignored. In particular, the - following four MIME-Version fields are equivalent: - - MIME-Version: 1.0 - - MIME-Version: 1.0 (produced by MetaSend Vx.x) - - MIME-Version: (produced by MetaSend Vx.x) 1.0 - - MIME-Version: 1.(produced by MetaSend Vx.x)0 - - In the absence of a MIME-Version field, a receiving mail user agent - (whether conforming to MIME requirements or not) may optionally - choose to interpret the body of the message according to local - conventions. Many such conventions are currently in use and it - should be noted that in practice non-MIME messages can contain just - about anything. - - It is impossible to be certain that a non-MIME mail message is - actually plain text in the US-ASCII character set since it might well - be a message that, using some set of nonstandard local conventions - that predate MIME, includes text in another character set or non- - textual data presented in a manner that cannot be automatically - recognized (e.g., a uuencoded compressed UNIX tar file). - -5. Content-Type Header Field - - The purpose of the Content-Type field is to describe the data - contained in the body fully enough that the receiving user agent can - pick an appropriate agent or mechanism to present the data to the - user, or otherwise deal with the data in an appropriate manner. The - value in this field is called a media type. - - HISTORICAL NOTE: The Content-Type header field was first defined in - RFC 1049. RFC 1049 used a simpler and less powerful syntax, but one - that is largely compatible with the mechanism given here. 
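As an aside on the Section 4 NOTE TO IMPLEMENTORS above: the four equivalent MIME-Version fields can be compared by first removing RFC 822 comments and white space from the field value. The following is a minimal sketch in C, not code from rohrpost. The names strip_comments and is_mime_1_0 are hypothetical, and the sketch assumes the caller has already split off the "MIME-Version:" field name and unfolded the value.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst with RFC 822 comments and white space removed.
 * Comments may nest; a backslash inside a comment quotes the next
 * character. Returns 0 on success, -1 if dst is too small or a
 * comment is left unterminated.
 * (Hypothetical helper, for illustration only.)
 */
int
strip_comments(const char *src, char *dst, size_t dstlen)
{
	size_t o = 0;
	int depth = 0;

	for (; *src != '\0'; src++) {
		if (depth > 0) {
			if (*src == '\\' && src[1] != '\0')
				src++;          /* skip the quoted character */
			else if (*src == '(')
				depth++;
			else if (*src == ')')
				depth--;
			continue;
		}
		if (*src == '(') {
			depth = 1;
			continue;
		}
		if (*src == ' ' || *src == '\t' || *src == '\r' || *src == '\n')
			continue;
		if (o + 1 >= dstlen)
			return -1;
		dst[o++] = *src;
	}
	if (depth != 0 || dstlen == 0)
		return -1;
	dst[o] = '\0';
	return 0;
}

/* Return 1 if the MIME-Version field value denotes version 1.0. */
int
is_mime_1_0(const char *value)
{
	char buf[16];

	return strip_comments(value, buf, sizeof(buf)) == 0 &&
	    strcmp(buf, "1.0") == 0;
}
```

With this sketch, the values " 1.0", " 1.0 (produced by MetaSend Vx.x)", " (produced by MetaSend Vx.x) 1.0" and " 1.(produced by MetaSend Vx.x)0" are all accepted. Because white space is dropped as well, a value such as "1 . 0" is also accepted, which is a simplification a stricter parser might not want.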
- - The Content-Type header field specifies the nature of the data in the - body of an entity by giving media type and subtype identifiers, and - by providing auxiliary information that may be required for certain - media types. After the media type and subtype names, the remainder - of the header field is simply a set of parameters, specified in an - attribute=value notation. The ordering of parameters is not - significant. - - - - - - -Freed & Borenstein Standards Track [Page 10] - -RFC 2045 Internet Message Bodies November 1996 - - - In general, the top-level media type is used to declare the general - type of data, while the subtype specifies a specific format for that - type of data. Thus, a media type of "image/xyz" is enough to tell a - user agent that the data is an image, even if the user agent has no - knowledge of the specific image format "xyz". Such information can - be used, for example, to decide whether or not to show a user the raw - data from an unrecognized subtype -- such an action might be - reasonable for unrecognized subtypes of text, but not for - unrecognized subtypes of image or audio. For this reason, registered - subtypes of text, image, audio, and video should not contain embedded - information that is really of a different type. Such compound - formats should be represented using the "multipart" or "application" - types. - - Parameters are modifiers of the media subtype, and as such do not - fundamentally affect the nature of the content. The set of - meaningful parameters depends on the media type and subtype. Most - parameters are associated with a single specific subtype. However, a - given top-level media type may define parameters which are applicable - to any subtype of that type. Parameters may be required by their - defining content type or subtype or they may be optional. MIME - implementations must ignore any parameters whose names they do not - recognize. 
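The attribute=value parameter list described above can be searched with a small helper. This is a rough sketch under stated assumptions, not rohrpost code: ct_param and ci_eqn are hypothetical names, the input is the unfolded Content-Type field value without the field name, and RFC 822 comments are assumed to have been removed beforehand. Parameter names are matched case-insensitively, unrecognized parameters are skipped, and a quoted value is returned without its quotes.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n characters of a and b. */
static int
ci_eqn(const char *a, const char *b, size_t n)
{
	while (n-- > 0) {
		if (tolower((unsigned char)*a++) != tolower((unsigned char)*b++))
			return 0;
	}
	return 1;
}

/*
 * Look up parameter `name` in a Content-Type field value such as
 * "text/plain; charset=us-ascii" and copy its value into out
 * (truncated to outlen - 1 characters). Returns 1 if the parameter
 * was found, 0 otherwise. (Hypothetical helper, for illustration.)
 */
int
ct_param(const char *field, const char *name, char *out, size_t outlen)
{
	const char *p = strchr(field, ';');
	size_t namelen = strlen(name);

	if (outlen == 0)
		return 0;
	while (p != NULL) {
		const char *attr, *eq;
		size_t n, o = 0;
		int match;

		p++;                            /* skip ';' */
		while (*p == ' ' || *p == '\t')
			p++;
		attr = p;
		eq = attr + strcspn(attr, "=;");
		if (*eq != '=') {               /* no value: skip this item */
			p = (*eq == ';') ? eq : NULL;
			continue;
		}
		n = (size_t)(eq - attr);
		while (n > 0 && (attr[n - 1] == ' ' || attr[n - 1] == '\t'))
			n--;
		match = (n == namelen && ci_eqn(attr, name, n));

		p = eq + 1;
		while (*p == ' ' || *p == '\t')
			p++;
		if (*p == '"') {                /* quoted-string value */
			for (p++; *p != '\0' && *p != '"'; p++) {
				if (*p == '\\' && p[1] != '\0')
					p++;        /* quoted-pair */
				if (match && o + 1 < outlen)
					out[o++] = *p;
			}
		} else {                        /* token value */
			for (; *p != '\0' && *p != ';' &&
			    *p != ' ' && *p != '\t'; p++) {
				if (match && o + 1 < outlen)
					out[o++] = *p;
			}
		}
		if (match) {
			out[o] = '\0';
			return 1;
		}
		p = strchr(p, ';');             /* ignore unknown parameter */
	}
	return 0;
}
```

A caller would typically ask for "charset" on text types and "boundary" on multipart types, and fall back to a default when the function returns 0.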
- - For example, the "charset" parameter is applicable to any subtype of - "text", while the "boundary" parameter is required for any subtype of - the "multipart" media type. - - There are NO globally-meaningful parameters that apply to all media - types. Truly global mechanisms are best addressed, in the MIME - model, by the definition of additional Content-* header fields. - - An initial set of seven top-level media types is defined in RFC 2046. - Five of these are discrete types whose content is essentially opaque - as far as MIME processing is concerned. The remaining two are - composite types whose contents require additional handling by MIME - processors. - - This set of top-level media types is intended to be substantially - complete. It is expected that additions to the larger set of - supported types can generally be accomplished by the creation of new - subtypes of these initial types. In the future, more top-level types - may be defined only by a standards-track extension to this standard. - If another top-level type is to be used for any reason, it must be - given a name starting with "X-" to indicate its non-standard status - and to avoid a potential conflict with a future official name. - - - - - -Freed & Borenstein Standards Track [Page 11] - -RFC 2045 Internet Message Bodies November 1996 - - -5.1. Syntax of the Content-Type Header Field - - In the Augmented BNF notation of RFC 822, a Content-Type header field - value is defined as follows: - - content := "Content-Type" ":" type "/" subtype - *(";" parameter) - ; Matching of media type and subtype - ; is ALWAYS case-insensitive. 
- - type := discrete-type / composite-type - - discrete-type := "text" / "image" / "audio" / "video" / - "application" / extension-token - - composite-type := "message" / "multipart" / extension-token - - extension-token := ietf-token / x-token - - ietf-token := <An extension token defined by a - standards-track RFC and registered - with IANA.> - - x-token := <The two characters "X-" or "x-" followed, with - no intervening white space, by any token> - - subtype := extension-token / iana-token - - iana-token := <A publicly-defined extension token. Tokens - of this form must be registered with IANA - as specified in RFC 2048.> - - parameter := attribute "=" value - - attribute := token - ; Matching of attributes - ; is ALWAYS case-insensitive. - - value := token / quoted-string - - token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, - or tspecials> - - tspecials := "(" / ")" / "<" / ">" / "@" / - "," / ";" / ":" / "\" / <"> - "/" / "[" / "]" / "?" / "=" - ; Must be in quoted-string, - ; to use within parameter values - - - -Freed & Borenstein Standards Track [Page 12] - -RFC 2045 Internet Message Bodies November 1996 - - - Note that the definition of "tspecials" is the same as the RFC 822 - definition of "specials" with the addition of the three characters - "/", "?", and "=", and the removal of ".". - - Note also that a subtype specification is MANDATORY -- it may not be - omitted from a Content-Type header field. As such, there are no - default subtypes. - - The type, subtype, and parameter names are not case sensitive. For - example, TEXT, Text, and TeXt are all equivalent top-level media - types. Parameter values are normally case sensitive, but sometimes - are interpreted in a case-insensitive fashion, depending on the - intended use. (For example, multipart boundaries are case-sensitive, - but the "access-type" parameter for message/External-body is not - case-sensitive.) - - Note that the value of a quoted string parameter does not include the - quotes. 
That is, the quotation marks in a quoted-string are not a - part of the value of the parameter, but are merely used to delimit - that parameter value. In addition, comments are allowed in - accordance with RFC 822 rules for structured header fields. Thus the - following two forms - - Content-type: text/plain; charset=us-ascii (Plain text) - - Content-type: text/plain; charset="us-ascii" - - are completely equivalent. - - Beyond this syntax, the only syntactic constraint on the definition - of subtype names is the desire that their uses must not conflict. - That is, it would be undesirable to have two different communities - using "Content-Type: application/foobar" to mean two different - things. The process of defining new media subtypes, then, is not - intended to be a mechanism for imposing restrictions, but simply a - mechanism for publicizing their definition and usage. There are, - therefore, two acceptable mechanisms for defining new media subtypes: - - (1) Private values (starting with "X-") may be defined - bilaterally between two cooperating agents without - outside registration or standardization. Such values - cannot be registered or standardized. - - (2) New standard values should be registered with IANA as - described in RFC 2048. - - The second document in this set, RFC 2046, defines the initial set of - media types for MIME. - - - -Freed & Borenstein Standards Track [Page 13] - -RFC 2045 Internet Message Bodies November 1996 - - -5.2. Content-Type Defaults - - Default RFC 822 messages without a MIME Content-Type header are taken - by this protocol to be plain text in the US-ASCII character set, - which can be explicitly specified as: - - Content-type: text/plain; charset=us-ascii - - This default is assumed if no Content-Type header field is specified. - It is also recommended that this default be assumed when a - syntactically invalid Content-Type header field is encountered. 
In - the presence of a MIME-Version header field and the absence of any - Content-Type header field, a receiving User Agent can also assume - that plain US-ASCII text was the sender's intent. Plain US-ASCII - text may still be assumed in the absence of a MIME-Version or the - presence of a syntactically invalid Content-Type header field, but - the sender's intent might have been otherwise. - -6. Content-Transfer-Encoding Header Field - - Many media types which could be usefully transported via email are - represented, in their "natural" format, as 8bit character or binary - data. Such data cannot be transmitted over some transfer protocols. - For example, RFC 821 (SMTP) restricts mail messages to 7bit US-ASCII - data with lines no longer than 1000 characters including any trailing - CRLF line separator. - - It is necessary, therefore, to define a standard mechanism for - encoding such data into a 7bit short line format. Proper labelling - of unencoded material in less restrictive formats for direct use over - less restrictive transports is also desirable. This document - specifies that such encodings will be indicated by a new "Content- - Transfer-Encoding" header field. This field has not been defined by - any previous standard. - -6.1. Content-Transfer-Encoding Syntax - - The Content-Transfer-Encoding field's value is a single token - specifying the type of encoding, as enumerated below. Formally: - - encoding := "Content-Transfer-Encoding" ":" mechanism - - mechanism := "7bit" / "8bit" / "binary" / - "quoted-printable" / "base64" / - ietf-token / x-token - - These values are not case sensitive -- Base64 and BASE64 and bAsE64 - are all equivalent. An encoding type of 7BIT requires that the body - is already in a 7bit mail-ready representation.
This is the default - value -- that is, "Content-Transfer-Encoding: 7BIT" is assumed if the - Content-Transfer-Encoding header field is not present. - -6.2. Content-Transfer-Encodings Semantics - - This single Content-Transfer-Encoding token actually provides two - pieces of information. It specifies what sort of encoding - transformation the body was subjected to and hence what decoding - operation must be used to restore it to its original form, and it - specifies what the domain of the result is. - - The transformation part of any Content-Transfer-Encodings specifies, - either explicitly or implicitly, a single, well-defined decoding - algorithm, which for any sequence of encoded octets either transforms - it to the original sequence of octets which was encoded, or shows - that it is illegal as an encoded sequence. Content-Transfer- - Encodings transformations never depend on any additional external - profile information for proper operation. Note that while decoders - must produce a single, well-defined output for a valid encoding no - such restrictions exist for encoders: Encoding a given sequence of - octets to different, equivalent encoded sequences is perfectly legal. - - Three transformations are currently defined: identity, the "quoted- - printable" encoding, and the "base64" encoding. The domains are - "binary", "8bit" and "7bit". - - The Content-Transfer-Encoding values "7bit", "8bit", and "binary" all - mean that the identity (i.e. NO) encoding transformation has been - performed. As such, they serve simply as indicators of the domain of - the body data, and provide useful information about the sort of - encoding that might be needed for transmission in a given transport - system. The terms "7bit data", "8bit data", and "binary data" are - all defined in Section 2. - - The quoted-printable and base64 encodings transform their input from - an arbitrary domain into material in the "7bit" range, thus making it - safe to carry over restricted transports. 
The specific definition of - the transformations are given below. - - The proper Content-Transfer-Encoding label must always be used. - Labelling unencoded data containing 8bit characters as "7bit" is not - allowed, nor is labelling unencoded non-line-oriented data as - anything other than "binary" allowed. - - Unlike media subtypes, a proliferation of Content-Transfer-Encoding - values is both undesirable and unnecessary. However, establishing - only a single transformation into the "7bit" domain does not seem - - - -Freed & Borenstein Standards Track [Page 15] - -RFC 2045 Internet Message Bodies November 1996 - - - possible. There is a tradeoff between the desire for a compact and - efficient encoding of largely- binary data and the desire for a - somewhat readable encoding of data that is mostly, but not entirely, - 7bit. For this reason, at least two encoding mechanisms are - necessary: a more or less readable encoding (quoted-printable) and a - "dense" or "uniform" encoding (base64). - - Mail transport for unencoded 8bit data is defined in RFC 1652. As of - the initial publication of this document, there are no standardized - Internet mail transports for which it is legitimate to include - unencoded binary data in mail bodies. Thus there are no - circumstances in which the "binary" Content-Transfer-Encoding is - actually valid in Internet mail. However, in the event that binary - mail transport becomes a reality in Internet mail, or when MIME is - used in conjunction with any other binary-capable mail transport - mechanism, binary bodies must be labelled as such using this - mechanism. - - NOTE: The five values defined for the Content-Transfer-Encoding field - imply nothing about the media type other than the algorithm by which - it was encoded or the transport system requirements if unencoded. - -6.3. 
New Content-Transfer-Encodings - - Implementors may, if necessary, define private Content-Transfer- - Encoding values, but must use an x-token, which is a name prefixed by - "X-", to indicate its non-standard status, e.g., "Content-Transfer- - Encoding: x-my-new-encoding". Additional standardized Content- - Transfer-Encoding values must be specified by a standards-track RFC. - The requirements such specifications must meet are given in RFC 2048. - As such, all content-transfer-encoding namespace except that - beginning with "X-" is explicitly reserved to the IETF for future - use. - - Unlike media types and subtypes, the creation of new Content- - Transfer-Encoding values is STRONGLY discouraged, as it seems likely - to hinder interoperability with little potential benefit - -6.4. Interpretation and Use - - If a Content-Transfer-Encoding header field appears as part of a - message header, it applies to the entire body of that message. If a - Content-Transfer-Encoding header field appears as part of an entity's - headers, it applies only to the body of that entity. If an entity is - of type "multipart" the Content-Transfer-Encoding is not permitted to - have any value other than "7bit", "8bit" or "binary". Even more - severe restrictions apply to some subtypes of the "message" type. - - - - -Freed & Borenstein Standards Track [Page 16] - -RFC 2045 Internet Message Bodies November 1996 - - - It should be noted that most media types are defined in terms of - octets rather than bits, so that the mechanisms described here are - mechanisms for encoding arbitrary octet streams, not bit streams. If - a bit stream is to be encoded via one of these mechanisms, it must - first be converted to an 8bit byte stream using the network standard - bit order ("big-endian"), in which the earlier bits in a stream - become the higher-order bits in a 8bit byte. A bit stream not ending - at an 8bit boundary must be padded with zeroes. 
RFC 2046 provides a - mechanism for noting the addition of such padding in the case of the - application/octet-stream media type, which has a "padding" parameter. - - The encoding mechanisms defined here explicitly encode all data in - US-ASCII. Thus, for example, suppose an entity has header fields - such as: - - Content-Type: text/plain; charset=ISO-8859-1 - Content-transfer-encoding: base64 - - This must be interpreted to mean that the body is a base64 US-ASCII - encoding of data that was originally in ISO-8859-1, and will be in - that character set again after decoding. - - Certain Content-Transfer-Encoding values may only be used on certain - media types. In particular, it is EXPRESSLY FORBIDDEN to use any - encodings other than "7bit", "8bit", or "binary" with any composite - media type, i.e. one that recursively includes other Content-Type - fields. Currently the only composite media types are "multipart" and - "message". All encodings that are desired for bodies of type - multipart or message must be done at the innermost level, by encoding - the actual body that needs to be encoded. - - It should also be noted that, by definition, if a composite entity - has a transfer-encoding value such as "7bit", but one of the enclosed - entities has a less restrictive value such as "8bit", then either the - outer "7bit" labelling is in error, because 8bit data are included, - or the inner "8bit" labelling placed an unnecessarily high demand on - the transport system because the actual included data were actually - 7bit-safe. - - NOTE ON ENCODING RESTRICTIONS: Though the prohibition against using - content-transfer-encodings on composite body data may seem overly - restrictive, it is necessary to prevent nested encodings, in which - data are passed through an encoding algorithm multiple times, and - must be decoded multiple times in order to be properly viewed. 
- Nested encodings add considerable complexity to user agents: Aside - from the obvious efficiency problems with such multiple encodings, - they can obscure the basic structure of a message. In particular, - they can imply that several decoding operations are necessary simply - - - -Freed & Borenstein Standards Track [Page 17] - -RFC 2045 Internet Message Bodies November 1996 - - - to find out what types of bodies a message contains. Banning nested - encodings may complicate the job of certain mail gateways, but this - seems less of a problem than the effect of nested encodings on user - agents. - - Any entity with an unrecognized Content-Transfer-Encoding must be - treated as if it has a Content-Type of "application/octet-stream", - regardless of what the Content-Type header field actually says. - - NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT-TRANSFER- - ENCODING: It may seem that the Content-Transfer-Encoding could be - inferred from the characteristics of the media that is to be encoded, - or, at the very least, that certain Content-Transfer-Encodings could - be mandated for use with specific media types. There are several - reasons why this is not the case. First, given the varying types of - transports used for mail, some encodings may be appropriate for some - combinations of media types and transports but not for others. (For - example, in an 8bit transport, no encoding would be required for text - in certain character sets, while such encodings are clearly required - for 7bit SMTP.) - - Second, certain media types may require different types of transfer - encoding under different circumstances. For example, many PostScript - bodies might consist entirely of short lines of 7bit data and hence - require no encoding at all. Other PostScript bodies (especially - those using Level 2 PostScript's binary encoding mechanism) may only - be reasonably represented using a binary transport encoding. 
- Finally, since the Content-Type field is intended to be an open-ended - specification mechanism, strict specification of an association - between media types and encodings effectively couples the - specification of an application protocol with a specific lower-level - transport. This is not desirable since the developers of a media - type should not have to be aware of all the transports in use and - what their limitations are. - -6.5. Translating Encodings - - The quoted-printable and base64 encodings are designed so that - conversion between them is possible. The only issue that arises in - such a conversion is the handling of hard line breaks in quoted- - printable encoding output. When converting from quoted-printable to - base64 a hard line break in the quoted-printable form represents a - CRLF sequence in the canonical form of the data. It must therefore be - converted to a corresponding encoded CRLF in the base64 form of the - data. Similarly, a CRLF sequence in the canonical form of the data - obtained after base64 decoding must be converted to a quoted- - printable hard line break, but ONLY when converting text data. - - - - -Freed & Borenstein Standards Track [Page 18] - -RFC 2045 Internet Message Bodies November 1996 - - -6.6. Canonical Encoding Model - - There was some confusion, in the previous versions of this RFC, - regarding the model for when email data was to be converted to - canonical form and encoded, and in particular how this process would - affect the treatment of CRLFs, given that the representation of - newlines varies greatly from system to system, and the relationship - between content-transfer-encodings and character sets. A canonical - model for encoding is presented in RFC 2049 for this reason. - -6.7. Quoted-Printable Content-Transfer-Encoding - - The Quoted-Printable encoding is intended to represent data that - largely consists of octets that correspond to printable characters in - the US-ASCII character set. 
It encodes the data in such a way that - the resulting octets are unlikely to be modified by mail transport. - If the data being encoded are mostly US-ASCII text, the encoded form - of the data remains largely recognizable by humans. A body which is - entirely US-ASCII may also be encoded in Quoted-Printable to ensure - the integrity of the data should the message pass through a - character-translating, and/or line-wrapping gateway. - - In this encoding, octets are to be represented as determined by the - following rules: - - (1) (General 8bit representation) Any octet, except a CR or - LF that is part of a CRLF line break of the canonical - (standard) form of the data being encoded, may be - represented by an "=" followed by a two digit - hexadecimal representation of the octet's value. The - digits of the hexadecimal alphabet, for this purpose, - are "0123456789ABCDEF". Uppercase letters must be - used; lowercase letters are not allowed. Thus, for - example, the decimal value 12 (US-ASCII form feed) can - be represented by "=0C", and the decimal value 61 (US- - ASCII EQUAL SIGN) can be represented by "=3D". This - rule must be followed except when the following rules - allow an alternative encoding. - - (2) (Literal representation) Octets with decimal values of - 33 through 60 inclusive, and 62 through 126, inclusive, - MAY be represented as the US-ASCII characters which - correspond to those octets (EXCLAMATION POINT through - LESS THAN, and GREATER THAN through TILDE, - respectively). - - (3) (White Space) Octets with values of 9 and 32 MAY be - represented as US-ASCII TAB (HT) and SPACE characters, - - - -Freed & Borenstein Standards Track [Page 19] - -RFC 2045 Internet Message Bodies November 1996 - - - respectively, but MUST NOT be so represented at the end - of an encoded line. Any TAB (HT) or SPACE characters - on an encoded line MUST thus be followed on that line - by a printable character. 
In particular, an "=" at the - end of an encoded line, indicating a soft line break - (see rule #5) may follow one or more TAB (HT) or SPACE - characters. It follows that an octet with decimal - value 9 or 32 appearing at the end of an encoded line - must be represented according to Rule #1. This rule is - necessary because some MTAs (Message Transport Agents, - programs which transport messages from one user to - another, or perform a portion of such transfers) are - known to pad lines of text with SPACEs, and others are - known to remove "white space" characters from the end - of a line. Therefore, when decoding a Quoted-Printable - body, any trailing white space on a line must be - deleted, as it will necessarily have been added by - intermediate transport agents. - - (4) (Line Breaks) A line break in a text body, represented - as a CRLF sequence in the text canonical form, must be - represented by a (RFC 822) line break, which is also a - CRLF sequence, in the Quoted-Printable encoding. Since - the canonical representation of media types other than - text do not generally include the representation of - line breaks as CRLF sequences, no hard line breaks - (i.e. line breaks that are intended to be meaningful - and to be displayed to the user) can occur in the - quoted-printable encoding of such types. Sequences - like "=0D", "=0A", "=0A=0D" and "=0D=0A" will routinely - appear in non-text data represented in quoted- - printable, of course. - - Note that many implementations may elect to encode the - local representation of various content types directly - rather than converting to canonical form first, - encoding, and then converting back to local - representation. In particular, this may apply to plain - text material on systems that use newline conventions - other than a CRLF terminator sequence. 
Such an - implementation optimization is permissible, but only - when the combined canonicalization-encoding step is - equivalent to performing the three steps separately. - - (5) (Soft Line Breaks) The Quoted-Printable encoding - REQUIRES that encoded lines be no more than 76 - characters long. If longer lines are to be encoded - with the Quoted-Printable encoding, "soft" line breaks - - - -Freed & Borenstein Standards Track [Page 20] - -RFC 2045 Internet Message Bodies November 1996 - - - must be used. An equal sign as the last character on a - encoded line indicates such a non-significant ("soft") - line break in the encoded text. - - Thus if the "raw" form of the line is a single unencoded line that - says: - - Now's the time for all folk to come to the aid of their country. - - This can be represented, in the Quoted-Printable encoding, as: - - Now's the time = - for all folk to come= - to the aid of their country. - - This provides a mechanism with which long lines are encoded in such a - way as to be restored by the user agent. The 76 character limit does - not count the trailing CRLF, but counts all other characters, - including any equal signs. - - Since the hyphen character ("-") may be represented as itself in the - Quoted-Printable encoding, care must be taken, when encapsulating a - quoted-printable encoded body inside one or more multipart entities, - to ensure that the boundary delimiter does not appear anywhere in the - encoded body. (A good strategy is to choose a boundary that includes - a character sequence such as "=_" which can never appear in a - quoted-printable body. See the definition of multipart messages in - RFC 2046.) - - NOTE: The quoted-printable encoding represents something of a - compromise between readability and reliability in transport. Bodies - encoded with the quoted-printable encoding will work reliably over - most mail gateways, but may not work perfectly over a few gateways, - notably those involving translation into EBCDIC. 
A higher level of - confidence is offered by the base64 Content-Transfer-Encoding. A way - to get reasonably reliable transport through EBCDIC gateways is to - also quote the US-ASCII characters - - !"#$@[\]^`{|}~ - - according to rule #1. - - Because quoted-printable data is generally assumed to be line- - oriented, it is to be expected that the representation of the breaks - between the lines of quoted-printable data may be altered in - transport, in the same manner that plain text mail has always been - altered in Internet mail when passing between systems with differing - newline conventions. If such alterations are likely to constitute a - - - -Freed & Borenstein Standards Track [Page 21] - -RFC 2045 Internet Message Bodies November 1996 - - - corruption of the data, it is probably more sensible to use the - base64 encoding rather than the quoted-printable encoding. - - NOTE: Several kinds of substrings cannot be generated according to - the encoding rules for the quoted-printable content-transfer- - encoding, and hence are formally illegal if they appear in the output - of a quoted-printable encoder. This note enumerates these cases and - suggests ways to handle such illegal substrings if any are - encountered in quoted-printable data that is to be decoded. - - (1) An "=" followed by two hexadecimal digits, one or both - of which are lowercase letters in "abcdef", is formally - illegal. A robust implementation might choose to - recognize them as the corresponding uppercase letters. - - (2) An "=" followed by a character that is neither a - hexadecimal digit (including "abcdef") nor the CR - character of a CRLF pair is illegal. This case can be - the result of US-ASCII text having been included in a - quoted-printable part of a message without itself - having been subjected to quoted-printable encoding. 
A - reasonable approach by a robust implementation might be - to include the "=" character and the following - character in the decoded data without any - transformation and, if possible, indicate to the user - that proper decoding was not possible at this point in - the data. - - (3) An "=" cannot be the ultimate or penultimate character - in an encoded object. This could be handled as in case - (2) above. - - (4) Control characters other than TAB, or CR and LF as - parts of CRLF pairs, must not appear. The same is true - for octets with decimal values greater than 126. If - found in incoming quoted-printable data by a decoder, a - robust implementation might exclude them from the - decoded data and warn the user that illegal characters - were discovered. - - (5) Encoded lines must not be longer than 76 characters, - not counting the trailing CRLF. If longer lines are - found in incoming, encoded data, a robust - implementation might nevertheless decode the lines, and - might report the erroneous encoding to the user. - - - - - - -Freed & Borenstein Standards Track [Page 22] - -RFC 2045 Internet Message Bodies November 1996 - - - WARNING TO IMPLEMENTORS: If binary data is encoded in quoted- - printable, care must be taken to encode CR and LF characters as "=0D" - and "=0A", respectively. In particular, a CRLF sequence in binary - data should be encoded as "=0D=0A". Otherwise, if CRLF were - represented as a hard line break, it might be incorrectly decoded on - platforms with different line break conventions. 
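The encoding rules (1) through (5) and the warning about binary data can be illustrated with a short sketch. It is not part of RFC 2045: the names `qp_encode`, `qp_decode`, `HEX_DIGITS`, and the `text` flag are hypothetical, and the decoder follows the lenient handling that the notes above say a robust implementation might choose (lowercase hex digits accepted, a malformed "=" sequence passed through unchanged).

```python
HEX_DIGITS = b"0123456789ABCDEFabcdef"


def qp_encode(data: bytes, text: bool = True) -> bytes:
    """Encode octets as quoted-printable (illustrative sketch)."""
    # For text, each CRLF is a hard line break (rule 4). Binary data is
    # treated as one unit so that CR and LF become "=0D" and "=0A".
    raw_lines = data.split(b"\r\n") if text else [data]
    encoded_lines = []
    for raw in raw_lines:
        tokens = []
        for i, octet in enumerate(raw):
            at_end = i == len(raw) - 1
            if 33 <= octet <= 126 and octet != 61:
                tokens.append(chr(octet))        # rule 2: literal
            elif octet in (9, 32) and not at_end:
                tokens.append(chr(octet))        # rule 3: inner TAB or SPACE
            else:
                tokens.append("=%02X" % octet)   # rule 1: "=" plus uppercase hex
        # Rule 5: insert soft line breaks so no line exceeds 76 characters.
        line = ""
        for token in tokens:
            if len(line) + len(token) > 75:      # keep room for the final "="
                encoded_lines.append(line + "=")
                line = ""
            line += token
        encoded_lines.append(line)
    return "\r\n".join(encoded_lines).encode("ascii")


def qp_decode(encoded: bytes) -> bytes:
    """Decode quoted-printable data, tolerating some illegal input."""
    out = bytearray()
    lines = encoded.split(b"\r\n")
    for n, line in enumerate(lines):
        line = line.rstrip(b" \t")               # padding added in transit
        soft = line.endswith(b"=")               # soft line break marker
        if soft:
            line = line[:-1]
        i = 0
        while i < len(line):
            pair = line[i + 1:i + 3]
            if line[i] == 61 and len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
                out.append(int(pair, 16))
                i += 3
            else:
                out.append(line[i])              # literal, or malformed "="
                i += 1
        if not soft and n < len(lines) - 1:
            out += b"\r\n"                       # hard line break
    return bytes(out)
```

Calling `qp_encode(data, text=False)` applies the warning above: a CRLF inside binary data comes out as "=0D=0A" rather than as a hard line break. For real use, Python's standard `quopri` module is the maintained implementation; the sketch only shows how the rules fit together.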
- - For formalists, the syntax of quoted-printable data is described by - the following grammar: - - quoted-printable := qp-line *(CRLF qp-line) - - qp-line := *(qp-segment transport-padding CRLF) - qp-part transport-padding - - qp-part := qp-section - ; Maximum length of 76 characters - - qp-segment := qp-section *(SPACE / TAB) "=" - ; Maximum length of 76 characters - - qp-section := [*(ptext / SPACE / TAB) ptext] - - ptext := hex-octet / safe-char - - safe-char := <any octet with decimal value of 33 through - 60 inclusive, and 62 through 126> - ; Characters not listed as "mail-safe" in - ; RFC 2049 are also not recommended. - - hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") - ; Octet must be used for characters > 127, =, - ; SPACEs or TABs at the ends of lines, and is - ; recommended for any character not listed in - ; RFC 2049 as "mail-safe". - - transport-padding := *LWSP-char - ; Composers MUST NOT generate - ; non-zero length transport - ; padding, but receivers MUST - ; be able to handle padding - ; added by message transports. - - IMPORTANT: The addition of LWSP between the elements shown in this - BNF is NOT allowed since this BNF does not specify a structured - header field. - - - - - -Freed & Borenstein Standards Track [Page 23] - -RFC 2045 Internet Message Bodies November 1996 - - -6.8. Base64 Content-Transfer-Encoding - - The Base64 Content-Transfer-Encoding is designed to represent - arbitrary sequences of octets in a form that need not be humanly - readable. The encoding and decoding algorithms are simple, but the - encoded data are consistently only about 33 percent larger than the - unencoded data. This encoding is virtually identical to the one used - in Privacy Enhanced Mail (PEM) applications, as defined in RFC 1421. - - A 65-character subset of US-ASCII is used, enabling 6 bits to be - represented per printable character. (The extra 65th character, "=", - is used to signify a special processing function.) 
- - NOTE: This subset has the important property that it is represented - identically in all versions of ISO 646, including US-ASCII, and all - characters in the subset are also represented identically in all - versions of EBCDIC. Other popular encodings, such as the encoding - used by the uuencode utility, Macintosh binhex 4.0 [RFC-1741], and - the base85 encoding specified as part of Level 2 PostScript, do not - share these properties, and thus do not fulfill the portability - requirements a binary transport encoding for mail must meet. - - The encoding process represents 24-bit groups of input bits as output - strings of 4 encoded characters. Proceeding from left to right, a - 24-bit input group is formed by concatenating 3 8bit input groups. - These 24 bits are then treated as 4 concatenated 6-bit groups, each - of which is translated into a single digit in the base64 alphabet. - When encoding a bit stream via the base64 encoding, the bit stream - must be presumed to be ordered with the most-significant-bit first. - That is, the first bit in the stream will be the high-order bit in - the first 8bit byte, and the eighth bit will be the low-order bit in - the first 8bit byte, and so on. - - Each 6-bit group is used as an index into an array of 64 printable - characters. The character referenced by the index is placed in the - output string. These characters, identified in Table 1, below, are - selected so as to be universally representable, and the set excludes - characters with particular significance to SMTP (e.g., ".", CR, LF) - and to the multipart boundary delimiters defined in RFC 2046 (e.g., - "-"). 
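The grouping described above (three 8bit octets form one 24-bit group, which is read as four 6-bit indexes into the alphabet of Table 1) can be sketched as follows. This is an illustration rather than text from RFC 2045: the names `ALPHABET` and `b64_encode` are hypothetical, and the sketch also applies the "=" padding and the 76-character line limit that this section specifies after Table 1.

```python
# The 64 printable characters of Table 1, in index order 0..63.
ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")


def b64_encode(data: bytes) -> str:
    """Encode octets as base64 text (illustrative sketch)."""
    chunks = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Concatenate up to three octets into 24 bits, most significant
        # bit first; a short final group is filled with zero bits.
        bits = int.from_bytes(group + b"\x00" * (3 - len(group)), "big")
        # Read the 24 bits as four 6-bit indexes, left to right.
        quad = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # One input octet yields 2 characters plus "==", two octets yield
        # 3 characters plus "=", three octets yield 4 characters.
        keep = len(group) + 1
        chunks.append("".join(quad[:keep]) + "=" * (4 - keep))
    encoded = "".join(chunks)
    # Emit lines of no more than 76 characters, separated by CRLF.
    return "\r\n".join(encoded[i:i + 76] for i in range(0, len(encoded), 76))
```

For example, `b64_encode(b"Man")` gives `"TWFu"`, while the shorter inputs `b"Ma"` and `b"M"` give `"TWE="` and `"TQ=="`. Python's standard `base64` module is the maintained implementation; the sketch is only meant to make the bit grouping visible.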
-
-                    Table 1: The Base64 Alphabet
-
-     Value Encoding  Value Encoding  Value Encoding  Value Encoding
-         0 A            17 R            34 i            51 z
-         1 B            18 S            35 j            52 0
-         2 C            19 T            36 k            53 1
-         3 D            20 U            37 l            54 2
-         4 E            21 V            38 m            55 3
-         5 F            22 W            39 n            56 4
-         6 G            23 X            40 o            57 5
-         7 H            24 Y            41 p            58 6
-         8 I            25 Z            42 q            59 7
-         9 J            26 a            43 r            60 8
-        10 K            27 b            44 s            61 9
-        11 L            28 c            45 t            62 +
-        12 M            29 d            46 u            63 /
-        13 N            30 e            47 v
-        14 O            31 f            48 w         (pad) =
-        15 P            32 g            49 x
-        16 Q            33 h            50 y
-
-   The encoded output stream must be represented in lines of no more
-   than 76 characters each.  All line breaks or other characters not
-   found in Table 1 must be ignored by decoding software.  In base64
-   data, characters other than those in Table 1, line breaks, and other
-   white space probably indicate a transmission error, about which a
-   warning message or even a message rejection might be appropriate
-   under some circumstances.
-
-   Special processing is performed if fewer than 24 bits are available
-   at the end of the data being encoded.  A full encoding quantum is
-   always completed at the end of a body.  When fewer than 24 input bits
-   are available in an input group, zero bits are added (on the right)
-   to form an integral number of 6-bit groups.  Padding at the end of
-   the data is performed using the "=" character.
Since all base64 - input is an integral number of octets, only the following cases can - arise: (1) the final quantum of encoding input is an integral - multiple of 24 bits; here, the final unit of encoded output will be - an integral multiple of 4 characters with no "=" padding, (2) the - final quantum of encoding input is exactly 8 bits; here, the final - unit of encoded output will be two characters followed by two "=" - padding characters, or (3) the final quantum of encoding input is - exactly 16 bits; here, the final unit of encoded output will be three - characters followed by one "=" padding character. - - Because it is used only for padding at the end of the data, the - occurrence of any "=" characters may be taken as evidence that the - end of the data has been reached (without truncation in transit). No - - - -Freed & Borenstein Standards Track [Page 25] - -RFC 2045 Internet Message Bodies November 1996 - - - such assurance is possible, however, when the number of octets - transmitted was a multiple of three and no "=" characters are - present. - - Any characters outside of the base64 alphabet are to be ignored in - base64-encoded data. - - Care must be taken to use the proper octets for line breaks if base64 - encoding is applied directly to text material that has not been - converted to canonical form. In particular, text line breaks must be - converted into CRLF sequences prior to base64 encoding. The - important thing to note is that this may be done directly by the - encoder rather than in a prior canonicalization step in some - implementations. - - NOTE: There is no need to worry about quoting potential boundary - delimiters within base64-encoded bodies within multipart entities - because no hyphen characters are used in the base64 encoding. - -7. Content-ID Header Field - - In constructing a high-level user agent, it may be desirable to allow - one body to make reference to another. 
Accordingly, bodies may be - labelled using the "Content-ID" header field, which is syntactically - identical to the "Message-ID" header field: - - id := "Content-ID" ":" msg-id - - Like the Message-ID values, Content-ID values must be generated to be - world-unique. - - The Content-ID value may be used for uniquely identifying MIME - entities in several contexts, particularly for caching data - referenced by the message/external-body mechanism. Although the - Content-ID header is generally optional, its use is MANDATORY in - implementations which generate data of the optional MIME media type - "message/external-body". That is, each message/external-body entity - must have a Content-ID field to permit caching of such data. - - It is also worth noting that the Content-ID value has special - semantics in the case of the multipart/alternative media type. This - is explained in the section of RFC 2046 dealing with - multipart/alternative. - - - - - - - - -Freed & Borenstein Standards Track [Page 26] - -RFC 2045 Internet Message Bodies November 1996 - - -8. Content-Description Header Field - - The ability to associate some descriptive information with a given - body is often desirable. For example, it may be useful to mark an - "image" body as "a picture of the Space Shuttle Endeavor." Such text - may be placed in the Content-Description header field. This header - field is always optional. - - description := "Content-Description" ":" *text - - The description is presumed to be given in the US-ASCII character - set, although the mechanism specified in RFC 2047 may be used for - non-US-ASCII Content-Description values. - -9. Additional MIME Header Fields - - Future documents may elect to define additional MIME header fields - for various purposes. 
Any new header field that further describes - the content of a message should begin with the string "Content-" to - allow such fields which appear in a message header to be - distinguished from ordinary RFC 822 message header fields. - - MIME-extension-field := <Any RFC 822 header field which - begins with the string - "Content-"> - -10. Summary - - Using the MIME-Version, Content-Type, and Content-Transfer-Encoding - header fields, it is possible to include, in a standardized way, - arbitrary types of data with RFC 822 conformant mail messages. No - restrictions imposed by either RFC 821 or RFC 822 are violated, and - care has been taken to avoid problems caused by additional - restrictions imposed by the characteristics of some Internet mail - transport mechanisms (see RFC 2049). - - The next document in this set, RFC 2046, specifies the initial set of - media types that can be labelled and transported using these headers. - -11. Security Considerations - - Security issues are discussed in the second document in this set, RFC - 2046. - - - - - - - - -Freed & Borenstein Standards Track [Page 27] - -RFC 2045 Internet Message Bodies November 1996 - - -12. Authors' Addresses - - For more information, the authors of this document are best contacted - via Internet mail: - - Ned Freed - Innosoft International, Inc. - 1050 East Garvey Avenue South - West Covina, CA 91790 - USA - - Phone: +1 818 919 3600 - Fax: +1 818 919 3614 - EMail: ned@innosoft.com - - - Nathaniel S. Borenstein - First Virtual Holdings - 25 Washington Avenue - Morristown, NJ 07960 - USA - - Phone: +1 201 540 8967 - Fax: +1 201 993 3032 - EMail: nsb@nsb.fv.com - - - MIME is a result of the work of the Internet Engineering Task Force - Working Group on RFC 822 Extensions. The chairman of that group, - Greg Vaudreuil, may be reached at: - - Gregory M. 
Vaudreuil - Octel Network Services - 17080 Dallas Parkway - Dallas, TX 75248-1905 - USA - - EMail: Greg.Vaudreuil@Octel.Com - - - - - - - - - - - - - -Freed & Borenstein Standards Track [Page 28] - -RFC 2045 Internet Message Bodies November 1996 - - -Appendix A -- Collected Grammar - - This appendix contains the complete BNF grammar for all the syntax - specified by this document. - - By itself, however, this grammar is incomplete. It refers by name to - several syntax rules that are defined by RFC 822. Rather than - reproduce those definitions here, and risk unintentional differences - between the two, this document simply refers the reader to RFC 822 - for the remaining definitions. Wherever a term is undefined, it - refers to the RFC 822 definition. - - attribute := token - ; Matching of attributes - ; is ALWAYS case-insensitive. - - composite-type := "message" / "multipart" / extension-token - - content := "Content-Type" ":" type "/" subtype - *(";" parameter) - ; Matching of media type and subtype - ; is ALWAYS case-insensitive. - - description := "Content-Description" ":" *text - - discrete-type := "text" / "image" / "audio" / "video" / - "application" / extension-token - - encoding := "Content-Transfer-Encoding" ":" mechanism - - entity-headers := [ content CRLF ] - [ encoding CRLF ] - [ id CRLF ] - [ description CRLF ] - *( MIME-extension-field CRLF ) - - extension-token := ietf-token / x-token - - hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") - ; Octet must be used for characters > 127, =, - ; SPACEs or TABs at the ends of lines, and is - ; recommended for any character not listed in - ; RFC 2049 as "mail-safe". - - iana-token := <A publicly-defined extension token. 
Tokens - of this form must be registered with IANA - as specified in RFC 2048.> - - - - -Freed & Borenstein Standards Track [Page 29] - -RFC 2045 Internet Message Bodies November 1996 - - - ietf-token := <An extension token defined by a - standards-track RFC and registered - with IANA.> - - id := "Content-ID" ":" msg-id - - mechanism := "7bit" / "8bit" / "binary" / - "quoted-printable" / "base64" / - ietf-token / x-token - - MIME-extension-field := <Any RFC 822 header field which - begins with the string - "Content-"> - - MIME-message-headers := entity-headers - fields - version CRLF - ; The ordering of the header - ; fields implied by this BNF - ; definition should be ignored. - - MIME-part-headers := entity-headers - [fields] - ; Any field not beginning with - ; "content-" can have no defined - ; meaning and may be ignored. - ; The ordering of the header - ; fields implied by this BNF - ; definition should be ignored. - - parameter := attribute "=" value - - ptext := hex-octet / safe-char - - qp-line := *(qp-segment transport-padding CRLF) - qp-part transport-padding - - qp-part := qp-section - ; Maximum length of 76 characters - - qp-section := [*(ptext / SPACE / TAB) ptext] - - qp-segment := qp-section *(SPACE / TAB) "=" - ; Maximum length of 76 characters - - quoted-printable := qp-line *(CRLF qp-line) - - - - - -Freed & Borenstein Standards Track [Page 30] - -RFC 2045 Internet Message Bodies November 1996 - - - safe-char := <any octet with decimal value of 33 through - 60 inclusive, and 62 through 126> - ; Characters not listed as "mail-safe" in - ; RFC 2049 are also not recommended. - - subtype := extension-token / iana-token - - token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, - or tspecials> - - transport-padding := *LWSP-char - ; Composers MUST NOT generate - ; non-zero length transport - ; padding, but receivers MUST - ; be able to handle padding - ; added by message transports. 
- - tspecials := "(" / ")" / "<" / ">" / "@" / - "," / ";" / ":" / "\" / <"> - "/" / "[" / "]" / "?" / "=" - ; Must be in quoted-string, - ; to use within parameter values - - type := discrete-type / composite-type - - value := token / quoted-string - - version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT - - x-token := <The two characters "X-" or "x-" followed, with - no intervening white space, by any token> - - - - - - - - - - - - - - - - - - - - -Freed & Borenstein Standards Track [Page 31] - diff --git a/proto/rfc2046.txt b/proto/rfc2046.txt @@ -1,2467 +0,0 @@ - - - - - - -Network Working Group N. Freed -Request for Comments: 2046 Innosoft -Obsoletes: 1521, 1522, 1590 N. Borenstein -Category: Standards Track First Virtual - November 1996 - - - Multipurpose Internet Mail Extensions - (MIME) Part Two: - Media Types - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Abstract - - STD 11, RFC 822 defines a message representation protocol specifying - considerable detail about US-ASCII message headers, but which leaves - the message content, or message body, as flat US-ASCII text. This - set of documents, collectively called the Multipurpose Internet Mail - Extensions, or MIME, redefines the format of messages to allow for - - (1) textual message bodies in character sets other than - US-ASCII, - - (2) an extensible set of different formats for non-textual - message bodies, - - (3) multi-part message bodies, and - - (4) textual header information in character sets other than - US-ASCII. - - These documents are based on earlier work documented in RFC 934, STD - 11, and RFC 1049, but extends and revises them. 
Because RFC 822 said - so little about message bodies, these documents are largely - orthogonal to (rather than a revision of) RFC 822. - - The initial document in this set, RFC 2045, specifies the various - headers used to describe the structure of MIME messages. This second - document defines the general structure of the MIME media typing - system and defines an initial set of media types. The third document, - RFC 2047, describes extensions to RFC 822 to allow non-US-ASCII text - - - -Freed & Borenstein Standards Track [Page 1] - -RFC 2046 Media Types November 1996 - - - data in Internet mail header fields. The fourth document, RFC 2048, - specifies various IANA registration procedures for MIME-related - facilities. The fifth and final document, RFC 2049, describes MIME - conformance criteria as well as providing some illustrative examples - of MIME message formats, acknowledgements, and the bibliography. - - These documents are revisions of RFCs 1521 and 1522, which themselves - were revisions of RFCs 1341 and 1342. An appendix in RFC 2049 - describes differences and changes from previous versions. - -Table of Contents - - 1. Introduction ......................................... 3 - 2. Definition of a Top-Level Media Type ................. 4 - 3. Overview Of The Initial Top-Level Media Types ........ 4 - 4. Discrete Media Type Values ........................... 6 - 4.1 Text Media Type ..................................... 6 - 4.1.1 Representation of Line Breaks ..................... 7 - 4.1.2 Charset Parameter ................................. 7 - 4.1.3 Plain Subtype ..................................... 11 - 4.1.4 Unrecognized Subtypes ............................. 11 - 4.2 Image Media Type .................................... 11 - 4.3 Audio Media Type .................................... 11 - 4.4 Video Media Type .................................... 12 - 4.5 Application Media Type .............................. 
12 - 4.5.1 Octet-Stream Subtype .............................. 13 - 4.5.2 PostScript Subtype ................................ 14 - 4.5.3 Other Application Subtypes ........................ 17 - 5. Composite Media Type Values .......................... 17 - 5.1 Multipart Media Type ................................ 17 - 5.1.1 Common Syntax ..................................... 19 - 5.1.2 Handling Nested Messages and Multiparts ........... 24 - 5.1.3 Mixed Subtype ..................................... 24 - 5.1.4 Alternative Subtype ............................... 24 - 5.1.5 Digest Subtype .................................... 26 - 5.1.6 Parallel Subtype .................................. 27 - 5.1.7 Other Multipart Subtypes .......................... 28 - 5.2 Message Media Type .................................. 28 - 5.2.1 RFC822 Subtype .................................... 28 - 5.2.2 Partial Subtype ................................... 29 - 5.2.2.1 Message Fragmentation and Reassembly ............ 30 - 5.2.2.2 Fragmentation and Reassembly Example ............ 31 - 5.2.3 External-Body Subtype ............................. 33 - 5.2.4 Other Message Subtypes ............................ 40 - 6. Experimental Media Type Values ....................... 40 - 7. Summary .............................................. 41 - 8. Security Considerations .............................. 41 - 9. Authors' Addresses ................................... 42 - - - -Freed & Borenstein Standards Track [Page 2] - -RFC 2046 Media Types November 1996 - - - A. Collected Grammar .................................... 43 - -1. Introduction - - The first document in this set, RFC 2045, defines a number of header - fields, including Content-Type. The Content-Type field is used to - specify the nature of the data in the body of a MIME entity, by - giving media type and subtype identifiers, and by providing auxiliary - information that may be required for certain media types. 
After the - type and subtype names, the remainder of the header field is simply a - set of parameters, specified in an attribute/value notation. The - ordering of parameters is not significant. - - In general, the top-level media type is used to declare the general - type of data, while the subtype specifies a specific format for that - type of data. Thus, a media type of "image/xyz" is enough to tell a - user agent that the data is an image, even if the user agent has no - knowledge of the specific image format "xyz". Such information can - be used, for example, to decide whether or not to show a user the raw - data from an unrecognized subtype -- such an action might be - reasonable for unrecognized subtypes of "text", but not for - unrecognized subtypes of "image" or "audio". For this reason, - registered subtypes of "text", "image", "audio", and "video" should - not contain embedded information that is really of a different type. - Such compound formats should be represented using the "multipart" or - "application" types. - - Parameters are modifiers of the media subtype, and as such do not - fundamentally affect the nature of the content. The set of - meaningful parameters depends on the media type and subtype. Most - parameters are associated with a single specific subtype. However, a - given top-level media type may define parameters which are applicable - to any subtype of that type. Parameters may be required by their - defining media type or subtype or they may be optional. MIME - implementations must also ignore any parameters whose names they do - not recognize. - - MIME's Content-Type header field and media type mechanism has been - carefully designed to be extensible, and it is expected that the set - of media type/subtype pairs and their associated parameters will grow - significantly over time. Several other MIME facilities, such as - transfer encodings and "message/external-body" access types, are - likely to have new values defined over time. 
In order to ensure that - the set of such values is developed in an orderly, well-specified, - and public manner, MIME sets up a registration process which uses the - Internet Assigned Numbers Authority (IANA) as a central registry for - MIME's various areas of extensibility. The registration process for - these areas is described in a companion document, RFC 2048. - - - -Freed & Borenstein Standards Track [Page 3] - -RFC 2046 Media Types November 1996 - - - The initial seven standard top-level media types are defined and - described in the remainder of this document. - -2. Definition of a Top-Level Media Type - - The definition of a top-level media type consists of: - - (1) a name and a description of the type, including - criteria for whether a particular type would qualify - under that type, - - (2) the names and definitions of parameters, if any, which - are defined for all subtypes of that type (including - whether such parameters are required or optional), - - (3) how a user agent and/or gateway should handle unknown - subtypes of this type, - - (4) general considerations on gatewaying entities of this - top-level type, if any, and - - (5) any restrictions on content-transfer-encodings for - entities of this top-level type. - -3. Overview Of The Initial Top-Level Media Types - - The five discrete top-level media types are: - - (1) text -- textual information. The subtype "plain" in - particular indicates plain text containing no - formatting commands or directives of any sort. Plain - text is intended to be displayed "as-is". No special - software is required to get the full meaning of the - text, aside from support for the indicated character - set. Other subtypes are to be used for enriched text in - forms where application software may enhance the - appearance of the text, but such software must not be - required in order to get the general idea of the - content. 
Possible subtypes of "text" thus include any - word processor format that can be read without - resorting to software that understands the format. In - particular, formats that employ embedded binary - formatting information are not considered directly - readable. A very simple and portable subtype, - "richtext", was defined in RFC 1341, with a further - revision in RFC 1896 under the name "enriched". - - - - - -Freed & Borenstein Standards Track [Page 4] - -RFC 2046 Media Types November 1996 - - - (2) image -- image data. "Image" requires a display device - (such as a graphical display, a graphics printer, or a - FAX machine) to view the information. An initial - subtype is defined for the widely-used image format - JPEG. - - (3) audio -- audio data. "Audio" requires an audio output - device (such as a speaker or a telephone) to "display" - the contents. An initial subtype "basic" is defined in - this document. - - (4) video -- video data. "Video" requires the capability - to display moving images, typically including - specialized hardware and software. An initial subtype - "mpeg" is defined in this document. - - (5) application -- some other kind of data, typically - either uninterpreted binary data or information to be - processed by an application. The subtype "octet- - stream" is to be used in the case of uninterpreted - binary data, in which case the simplest recommended - action is to offer to write the information into a file - for the user. The "PostScript" subtype is also defined - for the transport of PostScript material. Other - expected uses for "application" include spreadsheets, - data for mail-based scheduling systems, and languages - for "active" (computational) messaging, and word - processing formats that are not directly readable. 
- Note that security considerations may exist for some - types of application data, most notably - "application/PostScript" and any form of active - messaging. These issues are discussed later in this - document. - - The two composite top-level media types are: - - (1) multipart -- data consisting of multiple entities of - independent data types. Four subtypes are initially - defined, including the basic "mixed" subtype specifying - a generic mixed set of parts, "alternative" for - representing the same data in multiple formats, - "parallel" for parts intended to be viewed - simultaneously, and "digest" for multipart entities in - which each part has a default type of "message/rfc822". - - - - - - -Freed & Borenstein Standards Track [Page 5] - -RFC 2046 Media Types November 1996 - - - (2) message -- an encapsulated message. A body of media - type "message" is itself all or a portion of some kind - of message object. Such objects may or may not in turn - contain other entities. The "rfc822" subtype is used - when the encapsulated content is itself an RFC 822 - message. The "partial" subtype is defined for partial - RFC 822 messages, to permit the fragmented transmission - of bodies that are thought to be too large to be passed - through transport facilities in one piece. Another - subtype, "external-body", is defined for specifying - large bodies by reference to an external data source. - - It should be noted that the list of media type values given here may - be augmented in time, via the mechanisms described above, and that - the set of subtypes is expected to grow substantially. - -4. Discrete Media Type Values - - Five of the seven initial media type values refer to discrete bodies. - The content of these types must be handled by non-MIME mechanisms; - they are opaque to MIME processors. - -4.1. Text Media Type - - The "text" media type is intended for sending material which is - principally textual in form. 
A "charset" parameter may be used to - indicate the character set of the body text for "text" subtypes, - notably including the subtype "text/plain", which is a generic - subtype for plain text. Plain text does not provide for or allow - formatting commands, font attribute specifications, processing - instructions, interpretation directives, or content markup. Plain - text is seen simply as a linear sequence of characters, possibly - interrupted by line breaks or page breaks. Plain text may allow the - stacking of several characters in the same position in the text. - Plain text in scripts like Arabic and Hebrew may also include - facilities that allow the arbitrary mixing of text segments with - opposite writing directions. - - Beyond plain text, there are many formats for representing what might - be known as "rich text". An interesting characteristic of many such - representations is that they are to some extent readable even without - the software that interprets them. It is useful, then, to - distinguish them, at the highest level, from such unreadable data as - images, audio, or text represented in an unreadable form. In the - absence of appropriate interpretation software, it is reasonable to - show subtypes of "text" to the user, while it is not reasonable to do - so with most nontextual data. Such formatted textual data should be - represented using subtypes of "text". - - - -Freed & Borenstein Standards Track [Page 6] - -RFC 2046 Media Types November 1996 - - -4.1.1. Representation of Line Breaks - - The canonical form of any MIME "text" subtype MUST always represent a - line break as a CRLF sequence. Similarly, any occurrence of CRLF in - MIME "text" MUST represent a line break. Use of CR and LF outside of - line break sequences is also forbidden. - - This rule applies regardless of format or character set or sets - involved. - - NOTE: The proper interpretation of line breaks when a body is - displayed depends on the media type. 
In particular, while it is - appropriate to treat a line break as a transition to a new line when - displaying a "text/plain" body, this treatment is actually incorrect - for other subtypes of "text" like "text/enriched" [RFC-1896]. - Similarly, whether or not line breaks should be added during display - operations is also a function of the media type. It should not be - necessary to add any line breaks to display "text/plain" correctly, - whereas proper display of "text/enriched" requires the appropriate - addition of line breaks. - - NOTE: Some protocols define a maximum line length. E.g. SMTP [RFC- - 821] allows a maximum of 998 octets before the next CRLF sequence. - To be transported by such protocols, data which includes too long - segments without CRLF sequences must be encoded with a suitable - content-transfer-encoding. - -4.1.2. Charset Parameter - - A critical parameter that may be specified in the Content-Type field - for "text/plain" data is the character set. This is specified with a - "charset" parameter, as in: - - Content-type: text/plain; charset=iso-8859-1 - - Unlike some other parameter values, the values of the charset - parameter are NOT case sensitive. The default character set, which - must be assumed in the absence of a charset parameter, is US-ASCII. - - The specification for any future subtypes of "text" must specify - whether or not they will also utilize a "charset" parameter, and may - possibly restrict its values as well. For other subtypes of "text" - than "text/plain", the semantics of the "charset" parameter should be - defined to be identical to those specified here for "text/plain", - i.e., the body consists entirely of characters in the given charset. - In particular, definers of future "text" subtypes should pay close - attention to the implications of multioctet character sets for their - subtype definitions. 
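As an illustration of the charset rules in section 4.1.2, the following Python sketch reads the "charset" parameter from a Content-Type value, normalizes its case, and falls back to US-ASCII when the parameter is absent. The helper name effective_charset is hypothetical; email.message.Message and its get_content_charset method belong to the Python standard library, not to this specification.

```python
from email.message import Message


def effective_charset(content_type_value=None):
    """Return the charset of a text entity, lowercased.

    Charset values are not case sensitive, so the result is normalized
    to lower case. When no charset parameter (or no Content-Type field
    at all) is present, the default "us-ascii" is assumed.
    """
    msg = Message()
    if content_type_value is not None:
        msg["Content-Type"] = content_type_value
    # get_content_charset() lowercases the value and returns failobj
    # when the parameter is missing.
    return msg.get_content_charset(failobj="us-ascii")


if __name__ == "__main__":
    print(effective_charset("text/plain; charset=ISO-8859-1"))
    print(effective_charset("text/plain"))
```

A receiver would typically call such a helper before decoding the body octets, so that an unlabelled "text/plain" entity is interpreted as US-ASCII rather than as a locally configured character set.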
- - - -Freed & Borenstein Standards Track [Page 7] - -RFC 2046 Media Types November 1996 - - - The charset parameter for subtypes of "text" gives a name of a - character set, as "character set" is defined in RFC 2045. The rules - regarding line breaks detailed in the previous section must also be - observed -- a character set whose definition does not conform to - these rules cannot be used in a MIME "text" subtype. - - An initial list of predefined character set names can be found at the - end of this section. Additional character sets may be registered - with IANA. - - Other media types than subtypes of "text" might choose to employ the - charset parameter as defined here, but with the CRLF/line break - restriction removed. Therefore, all character sets that conform to - the general definition of "character set" in RFC 2045 can be - registered for MIME use. - - Note that if the specified character set includes 8-bit characters - and such characters are used in the body, a Content-Transfer-Encoding - header field and a corresponding encoding on the data are required in - order to transmit the body via some mail transfer protocols, such as - SMTP [RFC-821]. - - The default character set, US-ASCII, has been the subject of some - confusion and ambiguity in the past. Not only were there some - ambiguities in the definition, there have been wide variations in - practice. In order to eliminate such ambiguity and variations in the - future, it is strongly recommended that new user agents explicitly - specify a character set as a media type parameter in the Content-Type - header field. "US-ASCII" does not indicate an arbitrary 7-bit - character set, but specifies that all octets in the body must be - interpreted as characters according to the US-ASCII character set. - National and application-oriented versions of ISO 646 [ISO-646] are - usually NOT identical to US-ASCII, and in that case their use in - Internet mail is explicitly discouraged. 
The omission of the ISO 646 - character set from this document is deliberate in this regard. The - character set name of "US-ASCII" explicitly refers to the character - set defined in ANSI X3.4-1986 [US-ASCII]. The new international - reference version (IRV) of the 1991 edition of ISO 646 is identical - to US-ASCII. The character set name "ASCII" is reserved and must not - be used for any purpose. - - NOTE: RFC 821 explicitly specifies "ASCII", and references an earlier - version of the American Standard. Insofar as one of the purposes of - specifying a media type and character set is to permit the receiver - to unambiguously determine how the sender intended the coded message - to be interpreted, assuming anything other than "strict ASCII" as the - default would risk unintentional and incompatible changes to the - semantics of messages now being transmitted. This also implies that - - - -Freed & Borenstein Standards Track [Page 8] - -RFC 2046 Media Types November 1996 - - - messages containing characters coded according to other versions of - ISO 646 than US-ASCII and the 1991 IRV, or using code-switching - procedures (e.g., those of ISO 2022), as well as 8bit or multiple - octet character encodings MUST use an appropriate character set - specification to be consistent with MIME. - - The complete US-ASCII character set is listed in ANSI X3.4-1986. - Note that the control characters including DEL (0-31, 127) have no - defined meaning apart from the combination CRLF (US-ASCII values - 13 and 10) indicating a new line. Two of the characters have de - facto meanings in wide use: FF (12) often means "start subsequent - text on the beginning of a new page"; and TAB or HT (9) often (though - not always) means "move the cursor to the next available column after - the current position where the column number is a multiple of 8 - (counting the first column as column 0)." 
Aside from these - conventions, any use of the control characters or DEL in a body must - either occur - - (1) because a subtype of text other than "plain" - specifically assigns some additional meaning, or - - (2) within the context of a private agreement between the - sender and recipient. Such private agreements are - discouraged and should be replaced by the other - capabilities of this document. - - NOTE: An enormous proliferation of character sets exists beyond US- - ASCII. A large number of partially or totally overlapping character - sets is NOT a good thing. A SINGLE character set that can be used - universally for representing all of the world's languages in Internet - mail would be preferable. Unfortunately, existing practice in - several communities seems to point to the continued use of multiple - character sets in the near future. A small number of standard - character sets are, therefore, defined for Internet use in this - document. - - The defined charset values are: - - (1) US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII]. - - (2) ISO-8859-X -- where "X" is to be replaced, as - necessary, for the parts of ISO-8859 [ISO-8859]. Note - that the ISO 646 character sets have deliberately been - omitted in favor of their 8859 replacements, which are - the designated character sets for Internet mail. As of - the publication of this document, the legitimate values - for "X" are the digits 1 through 10. - - - - -Freed & Borenstein Standards Track [Page 9] - -RFC 2046 Media Types November 1996 - - - Characters in the range 128-159 have no assigned meaning in ISO-8859- - X. Characters with values below 128 in ISO-8859-X have the same - assigned meaning as they do in US-ASCII. 
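The list of defined charset values and the note on the 128-159 range can be checked mechanically. The Python sketch below is one possible reading, not a normative test: the names is_defined_charset and unassigned_octets are hypothetical, and the pattern covers only the values named in this section (US-ASCII and ISO-8859-1 through ISO-8859-10), compared without regard to case.

```python
import re

# Charset names defined in this section; matching ignores case.
_DEFINED = re.compile(r"us-ascii|iso-8859-(?:[1-9]|10)", re.IGNORECASE)


def is_defined_charset(name):
    """True if name is US-ASCII or ISO-8859-X with X in 1..10."""
    return _DEFINED.fullmatch(name) is not None


def unassigned_octets(data):
    """Return the octets of an ISO-8859-X body that fall in 128-159,
    a range with no assigned meaning in those character sets."""
    return [octet for octet in data if 128 <= octet <= 159]


if __name__ == "__main__":
    print(is_defined_charset("ISO-8859-10"))
    print(is_defined_charset("iso-8859-11"))
    print(unassigned_octets(b"caf\xe9\x85"))
```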
- - Part 6 of ISO 8859 (Latin/Arabic alphabet) and part 8 (Latin/Hebrew - alphabet) include both characters for which the normal writing - direction is right to left and characters for which it is left to - right, but do not define a canonical ordering method for representing - bi-directional text. The charset values "ISO-8859-6" and "ISO-8859- - 8", however, specify that the visual method is used [RFC-1556]. - - All of these character sets are used as pure 7bit or 8bit sets - without any shift or escape functions. The meaning of shift and - escape sequences in these character sets is not defined. - - The character sets specified above are the ones that were relatively - uncontroversial during the drafting of MIME. This document does not - endorse the use of any particular character set other than US-ASCII, - and recognizes that the future evolution of world character sets - remains unclear. - - Note that the character set used, if anything other than US-ASCII, - must always be explicitly specified in the Content-Type field. - - No character set name other than those defined above may be used in - Internet mail without the publication of a formal specification and - its registration with IANA, or by private agreement, in which case - the character set name must begin with "X-". - - Implementors are discouraged from defining new character sets unless - absolutely necessary. - - The "charset" parameter has been defined primarily for the purpose of - textual data, and is described in this section for that reason. - However, it is conceivable that non-textual data might also wish to - specify a charset value for some purpose, in which case the same - syntax and values should be used. - - In general, composition software should always use the "lowest common - denominator" character set possible. 
For example, if a body contains - only US-ASCII characters, it SHOULD be marked as being in the US- - ASCII character set, not ISO-8859-1, which, like all the ISO-8859 - family of character sets, is a superset of US-ASCII. More generally, - if a widely-used character set is a subset of another character set, - and a body contains only characters in the widely-used subset, it - should be labelled as being in that subset. This will increase the - chances that the recipient will be able to view the resulting entity - correctly. - - - -Freed & Borenstein Standards Track [Page 10] - -RFC 2046 Media Types November 1996 - - -4.1.3. Plain Subtype - - The simplest and most important subtype of "text" is "plain". This - indicates plain text that does not contain any formatting commands or - directives. Plain text is intended to be displayed "as-is", that is, - no interpretation of embedded formatting commands, font attribute - specifications, processing instructions, interpretation directives, - or content markup should be necessary for proper display. The - default media type of "text/plain; charset=us-ascii" for Internet - mail describes existing Internet practice. That is, it is the type - of body defined by RFC 822. - - No other "text" subtype is defined by this document. - -4.1.4. Unrecognized Subtypes - - Unrecognized subtypes of "text" should be treated as subtype "plain" - as long as the MIME implementation knows how to handle the charset. - Unrecognized subtypes which also specify an unrecognized charset - should be treated as "application/octet-stream". - -4.2. Image Media Type - - A media type of "image" indicates that the body contains an image. - The subtype names the specific image format. These names are not - case sensitive. An initial subtype is "jpeg" for the JPEG format - using JFIF encoding [JPEG]. 
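The fallback rule of section 4.1.4 for unrecognized subtypes of "text" can be sketched as follows. This is a minimal illustration under stated assumptions: the set KNOWN_TEXT_SUBTYPES and the function fallback_type are hypothetical, and "knows how to handle the charset" is approximated here by asking Python's codec registry whether it has a codec under that name.

```python
import codecs

# Hypothetical: the "text" subtypes this implementation renders natively.
KNOWN_TEXT_SUBTYPES = {"plain"}


def fallback_type(maintype, subtype, charset="us-ascii"):
    """Return the media type an entity should be handled as.

    Type and subtype matching is case-insensitive. Only the "text"
    rule of section 4.1.4 is modelled; other types pass through.
    """
    maintype, subtype = maintype.lower(), subtype.lower()
    if maintype != "text" or subtype in KNOWN_TEXT_SUBTYPES:
        return f"{maintype}/{subtype}"
    try:
        # Assumption: a codec lookup stands in for "charset is known".
        codecs.lookup(charset)
    except LookupError:
        # Unrecognized subtype AND unrecognized charset.
        return "application/octet-stream"
    # Unrecognized subtype, recognized charset: treat as plain text.
    return "text/plain"


if __name__ == "__main__":
    print(fallback_type("text", "x-fancy", "iso-8859-1"))
    print(fallback_type("text", "x-fancy", "x-no-such-charset"))
```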
- - The list of "image" subtypes given here is neither exclusive nor - exhaustive, and is expected to grow as more types are registered with - IANA, as described in RFC 2048. - - Unrecognized subtypes of "image" should at a minimum be treated as - "application/octet-stream". Implementations may optionally elect to - pass subtypes of "image" that they do not specifically recognize to a - secure and robust general-purpose image viewing application, if such - an application is available. - - NOTE: Use of a generic-purpose image viewing application this way - inherits the security problems of the most dangerous type supported - by the application. - -4.3. Audio Media Type - - A media type of "audio" indicates that the body contains audio data. - Although there is not yet a consensus on an "ideal" audio format for - use with computers, there is a pressing need for a format capable of - providing interoperable behavior. - - - -Freed & Borenstein Standards Track [Page 11] - -RFC 2046 Media Types November 1996 - - - The initial subtype of "basic" is specified to meet this requirement - by providing an absolutely minimal lowest common denominator audio - format. It is expected that richer formats for higher quality and/or - lower bandwidth audio will be defined by a later document. - - The content of the "audio/basic" subtype is single channel audio - encoded using 8bit ISDN mu-law [PCM] at a sample rate of 8000 Hz. - - Unrecognized subtypes of "audio" should at a minimum be treated as - "application/octet-stream". Implementations may optionally elect to - pass subtypes of "audio" that they do not specifically recognize to a - robust general-purpose audio playing application, if such an - application is available. - -4.4. Video Media Type - - A media type of "video" indicates that the body contains a time- - varying-picture image, possibly with color and coordinated sound. 
-   The term 'video' is used in its most generic sense, rather than with
-   reference to any particular technology or format, and is not meant to
-   preclude subtypes such as animated drawings encoded compactly. The
-   subtype "mpeg" refers to video coded according to the MPEG standard
-   [MPEG].
-
-   Note that although in general this document strongly discourages the
-   mixing of multiple media in a single body, it is recognized that many
-   so-called video formats include a representation for synchronized
-   audio, and this is explicitly permitted for subtypes of "video".
-
-   Unrecognized subtypes of "video" should at a minimum be treated as
-   "application/octet-stream". Implementations may optionally elect to
-   pass subtypes of "video" that they do not specifically recognize to a
-   robust general-purpose video display application, if such an
-   application is available.
-
-4.5. Application Media Type
-
-   The "application" media type is to be used for discrete data which do
-   not fit in any of the other categories, and particularly for data to
-   be processed by some type of application program. This is
-   information which must be processed by an application before it is
-   viewable or usable by a user. Expected uses for the "application"
-   media type include file transfer, spreadsheets, data for mail-based
-   scheduling systems, and languages for "active" (computational)
-   material. (The latter, in particular, can pose security problems
-   which must be understood by implementors, and are considered in
-   detail in the discussion of the "application/PostScript" media type.)
-
-   For example, a meeting scheduler might define a standard
-   representation for information about proposed meeting dates. An
-   intelligent user agent would use this information to conduct a dialog
-   with the user, and might then send additional material based on that
-   dialog.
More generally, there have been several "active" messaging - languages developed in which programs in a suitably specialized - language are transported to a remote location and automatically run - in the recipient's environment. - - Such applications may be defined as subtypes of the "application" - media type. This document defines two subtypes: - - octet-stream, and PostScript. - - The subtype of "application" will often be either the name or include - part of the name of the application for which the data are intended. - This does not mean, however, that any application program name may be - used freely as a subtype of "application". - -4.5.1. Octet-Stream Subtype - - The "octet-stream" subtype is used to indicate that a body contains - arbitrary binary data. The set of currently defined parameters is: - - (1) TYPE -- the general type or category of binary data. - This is intended as information for the human recipient - rather than for any automatic processing. - - (2) PADDING -- the number of bits of padding that were - appended to the bit-stream comprising the actual - contents to produce the enclosed 8bit byte-oriented - data. This is useful for enclosing a bit-stream in a - body when the total number of bits is not a multiple of - 8. - - Both of these parameters are optional. - - An additional parameter, "CONVERSIONS", was defined in RFC 1341 but - has since been removed. RFC 1341 also defined the use of a "NAME" - parameter which gave a suggested file name to be used if the data - were to be written to a file. This has been deprecated in - anticipation of a separate Content-Disposition header field, to be - defined in a subsequent RFC. - - The recommended action for an implementation that receives an - "application/octet-stream" entity is to simply offer to put the data - in a file, with any Content-Transfer-Encoding undone, or perhaps to - use it as input to a user-specified process. 
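As an illustration only, the recommended handling of an "application/octet-stream" entity can be sketched in Python. The sketch assumes the caller has already extracted the raw body octets and the Content-Transfer-Encoding value; the function names undo_transfer_encoding and save_octet_stream are hypothetical and belong neither to the specification nor to rohrpost.

```python
import base64
import quopri


def undo_transfer_encoding(raw: bytes, cte: str = "7bit") -> bytes:
    """Return the body octets with the Content-Transfer-Encoding undone."""
    cte = cte.strip().lower()  # encoding names are compared case-insensitively
    if cte == "base64":
        return base64.b64decode(raw)
    if cte == "quoted-printable":
        return quopri.decodestring(raw)
    if cte in ("7bit", "8bit", "binary"):
        return raw  # identity encodings: nothing to undo
    raise ValueError(f"unknown Content-Transfer-Encoding: {cte!r}")


def save_octet_stream(raw: bytes, cte: str, path: str) -> int:
    """Write the decoded data to a path chosen by the user.

    The data is only stored, never executed or handed to a program
    named inside the message.
    """
    data = undo_transfer_encoding(raw, cte)
    with open(path, "wb") as fh:
        return fh.write(data)
```

The destination path is expected to come from the user. The sketch deliberately derives neither a file name nor a program to run from the message itself.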
-
-   To reduce the danger of transmitting rogue programs, it is strongly
-   recommended that implementations NOT implement a path-search
-   mechanism whereby an arbitrary program named in the Content-Type
-   parameter (e.g., an "interpreter=" parameter) is found and executed
-   using the message body as input.
-
-4.5.2. PostScript Subtype
-
-   A media type of "application/postscript" indicates a PostScript
-   program. Currently two variants of the PostScript language are
-   allowed; the original level 1 variant is described in [POSTSCRIPT]
-   and the more recent level 2 variant is described in [POSTSCRIPT2].
-
-   PostScript is a registered trademark of Adobe Systems, Inc. Use of
-   the MIME media type "application/postscript" implies recognition of
-   that trademark and all the rights it entails.
-
-   The PostScript language definition provides facilities for internal
-   labelling of the specific language features a given program uses.
-   This labelling, called the PostScript document structuring
-   conventions, or DSC, is very general and provides substantially more
-   information than just the language level. The use of document
-   structuring conventions, while not required, is strongly recommended
-   as an aid to interoperability. Documents which lack proper
-   structuring conventions cannot be tested to see whether or not they
-   will work in a given environment. As such, some systems may assume
-   the worst and refuse to process unstructured documents.
-
-   The execution of general-purpose PostScript interpreters entails
-   serious security risks, and implementors are discouraged from simply
-   sending PostScript bodies to "off-the-shelf" interpreters.
While it - is usually safe to send PostScript to a printer, where the potential - for harm is greatly constrained by typical printer environments, - implementors should consider all of the following before they add - interactive display of PostScript bodies to their MIME readers. - - The remainder of this section outlines some, though probably not all, - of the possible problems with the transport of PostScript entities. - - (1) Dangerous operations in the PostScript language - include, but may not be limited to, the PostScript - operators "deletefile", "renamefile", "filenameforall", - and "file". "File" is only dangerous when applied to - something other than standard input or output. - Implementations may also define additional nonstandard - file operators; these may also pose a threat to - security. "Filenameforall", the wildcard file search - operator, may appear at first glance to be harmless. - - - -Freed & Borenstein Standards Track [Page 14] - -RFC 2046 Media Types November 1996 - - - Note, however, that this operator has the potential to - reveal information about what files the recipient has - access to, and this information may itself be - sensitive. Message senders should avoid the use of - potentially dangerous file operators, since these - operators are quite likely to be unavailable in secure - PostScript implementations. Message receiving and - displaying software should either completely disable - all potentially dangerous file operators or take - special care not to delegate any special authority to - their operation. These operators should be viewed as - being done by an outside agency when interpreting - PostScript documents. Such disabling and/or checking - should be done completely outside of the reach of the - PostScript language itself; care should be taken to - insure that no method exists for re-enabling full- - function versions of these operators. 
- - (2) The PostScript language provides facilities for exiting - the normal interpreter, or server, loop. Changes made - in this "outer" environment are customarily retained - across documents, and may in some cases be retained - semipermanently in nonvolatile memory. The operators - associated with exiting the interpreter loop have the - potential to interfere with subsequent document - processing. As such, their unrestrained use - constitutes a threat of service denial. PostScript - operators that exit the interpreter loop include, but - may not be limited to, the exitserver and startjob - operators. Message sending software should not - generate PostScript that depends on exiting the - interpreter loop to operate, since the ability to exit - will probably be unavailable in secure PostScript - implementations. Message receiving and displaying - software should completely disable the ability to make - retained changes to the PostScript environment by - eliminating or disabling the "startjob" and - "exitserver" operations. If these operations cannot be - eliminated or completely disabled the password - associated with them should at least be set to a hard- - to-guess value. - - (3) PostScript provides operators for setting system-wide - and device-specific parameters. These parameter - settings may be retained across jobs and may - potentially pose a threat to the correct operation of - the interpreter. The PostScript operators that set - system and device parameters include, but may not be - - - -Freed & Borenstein Standards Track [Page 15] - -RFC 2046 Media Types November 1996 - - - limited to, the "setsystemparams" and "setdevparams" - operators. Message sending software should not - generate PostScript that depends on the setting of - system or device parameters to operate correctly. The - ability to set these parameters will probably be - unavailable in secure PostScript implementations. 
- Message receiving and displaying software should - disable the ability to change system and device - parameters. If these operators cannot be completely - disabled the password associated with them should at - least be set to a hard-to-guess value. - - (4) Some PostScript implementations provide nonstandard - facilities for the direct loading and execution of - machine code. Such facilities are quite obviously open - to substantial abuse. Message sending software should - not make use of such features. Besides being totally - hardware-specific, they are also likely to be - unavailable in secure implementations of PostScript. - Message receiving and displaying software should not - allow such operators to be used if they exist. - - (5) PostScript is an extensible language, and many, if not - most, implementations of it provide a number of their - own extensions. This document does not deal with such - extensions explicitly since they constitute an unknown - factor. Message sending software should not make use - of nonstandard extensions; they are likely to be - missing from some implementations. Message receiving - and displaying software should make sure that any - nonstandard PostScript operators are secure and don't - present any kind of threat. - - (6) It is possible to write PostScript that consumes huge - amounts of various system resources. It is also - possible to write PostScript programs that loop - indefinitely. Both types of programs have the - potential to cause damage if sent to unsuspecting - recipients. Message-sending software should avoid the - construction and dissemination of such programs, which - is antisocial. Message receiving and displaying - software should provide appropriate mechanisms to abort - processing after a reasonable amount of time has - elapsed. In addition, PostScript interpreters should be - limited to the consumption of only a reasonable amount - of any given system resource. 
- - - - - -Freed & Borenstein Standards Track [Page 16] - -RFC 2046 Media Types November 1996 - - - (7) It is possible to include raw binary information inside - PostScript in various forms. This is not recommended - for use in Internet mail, both because it is not - supported by all PostScript interpreters and because it - significantly complicates the use of a MIME Content- - Transfer-Encoding. (Without such binary, PostScript - may typically be viewed as line-oriented data. The - treatment of CRLF sequences becomes extremely - problematic if binary and line-oriented data are mixed - in a single Postscript data stream.) - - (8) Finally, bugs may exist in some PostScript interpreters - which could possibly be exploited to gain unauthorized - access to a recipient's system. Apart from noting this - possibility, there is no specific action to take to - prevent this, apart from the timely correction of such - bugs if any are found. - -4.5.3. Other Application Subtypes - - It is expected that many other subtypes of "application" will be - defined in the future. MIME implementations must at a minimum treat - any unrecognized subtypes as being equivalent to "application/octet- - stream". - -5. Composite Media Type Values - - The remaining two of the seven initial Content-Type values refer to - composite entities. Composite entities are handled using MIME - mechanisms -- a MIME processor typically handles the body directly. - -5.1. Multipart Media Type - - In the case of multipart entities, in which one or more different - sets of data are combined in a single body, a "multipart" media type - field must appear in the entity's header. The body must then contain - one or more body parts, each preceded by a boundary delimiter line, - and the last one followed by a closing boundary delimiter line. - After its boundary delimiter line, each body part then consists of a - header area, a blank line, and a body area. 
Thus a body part is - similar to an RFC 822 message in syntax, but different in meaning. - - A body part is an entity and hence is NOT to be interpreted as - actually being an RFC 822 message. To begin with, NO header fields - are actually required in body parts. A body part that starts with a - blank line, therefore, is allowed and is a body part for which all - default values are to be assumed. In such a case, the absence of a - Content-Type header usually indicates that the corresponding body has - - - -Freed & Borenstein Standards Track [Page 17] - -RFC 2046 Media Types November 1996 - - - a content-type of "text/plain; charset=US-ASCII". - - The only header fields that have defined meaning for body parts are - those the names of which begin with "Content-". All other header - fields may be ignored in body parts. Although they should generally - be retained if at all possible, they may be discarded by gateways if - necessary. Such other fields are permitted to appear in body parts - but must not be depended on. "X-" fields may be created for - experimental or private purposes, with the recognition that the - information they contain may be lost at some gateways. - - NOTE: The distinction between an RFC 822 message and a body part is - subtle, but important. A gateway between Internet and X.400 mail, - for example, must be able to tell the difference between a body part - that contains an image and a body part that contains an encapsulated - message, the body of which is a JPEG image. In order to represent - the latter, the body part must have "Content-Type: message/rfc822", - and its body (after the blank line) must be the encapsulated message, - with its own "Content-Type: image/jpeg" header field. The use of - similar syntax facilitates the conversion of messages to body parts, - and vice versa, but the distinction between the two must be - understood by implementors. 
(For the special case in which parts - actually are messages, a "digest" subtype is also defined.) - - As stated previously, each body part is preceded by a boundary - delimiter line that contains the boundary delimiter. The boundary - delimiter MUST NOT appear inside any of the encapsulated parts, on a - line by itself or as the prefix of any line. This implies that it is - crucial that the composing agent be able to choose and specify a - unique boundary parameter value that does not contain the boundary - parameter value of an enclosing multipart as a prefix. - - All present and future subtypes of the "multipart" type must use an - identical syntax. Subtypes may differ in their semantics, and may - impose additional restrictions on syntax, but must conform to the - required syntax for the "multipart" type. This requirement ensures - that all conformant user agents will at least be able to recognize - and separate the parts of any multipart entity, even those of an - unrecognized subtype. - - As stated in the definition of the Content-Transfer-Encoding field - [RFC 2045], no encoding other than "7bit", "8bit", or "binary" is - permitted for entities of type "multipart". The "multipart" boundary - delimiters and header fields are always represented as 7bit US-ASCII - in any case (though the header fields may encode non-US-ASCII header - text as per RFC 2047) and data within the body parts can be encoded - on a part-by-part basis, with Content-Transfer-Encoding fields for - each appropriate body part. - - - -Freed & Borenstein Standards Track [Page 18] - -RFC 2046 Media Types November 1996 - - -5.1.1. Common Syntax - - This section defines a common syntax for subtypes of "multipart". - All subtypes of "multipart" must use this syntax. A simple example - of a multipart message also appears in this section. An example of a - more complex multipart message is given in RFC 2049. - - The Content-Type field for multipart entities requires one parameter, - "boundary". 
The boundary delimiter line is then defined as a line - consisting entirely of two hyphen characters ("-", decimal value 45) - followed by the boundary parameter value from the Content-Type header - field, optional linear whitespace, and a terminating CRLF. - - NOTE: The hyphens are for rough compatibility with the earlier RFC - 934 method of message encapsulation, and for ease of searching for - the boundaries in some implementations. However, it should be noted - that multipart messages are NOT completely compatible with RFC 934 - encapsulations; in particular, they do not obey RFC 934 quoting - conventions for embedded lines that begin with hyphens. This - mechanism was chosen over the RFC 934 mechanism because the latter - causes lines to grow with each level of quoting. The combination of - this growth with the fact that SMTP implementations sometimes wrap - long lines made the RFC 934 mechanism unsuitable for use in the event - that deeply-nested multipart structuring is ever desired. - - WARNING TO IMPLEMENTORS: The grammar for parameters on the Content- - type field is such that it is often necessary to enclose the boundary - parameter values in quotes on the Content-type line. This is not - always necessary, but never hurts. Implementors should be sure to - study the grammar carefully in order to avoid producing invalid - Content-type fields. 
Thus, a typical "multipart" Content-Type header
-   field might look like this:
-
-     Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p
-
-   But the following is not valid:
-
-     Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p
-
-   (because of the colon) and must instead be represented as
-
-     Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p"
-
-   This Content-Type value indicates that the content consists of one or
-   more parts, each with a structure that is syntactically identical to
-   an RFC 822 message, except that the header area is allowed to be
-   completely empty, and that the parts are each preceded by the line
-
-     --gc0pJq0M:08jU534c0p
-
-   The boundary delimiter MUST occur at the beginning of a line, i.e.,
-   following a CRLF, and the initial CRLF is considered to be attached
-   to the boundary delimiter line rather than part of the preceding
-   part. The boundary may be followed by zero or more characters of
-   linear whitespace. It is then terminated by either another CRLF and
-   the header fields for the next part, or by two CRLFs, in which case
-   there are no header fields for the next part. If no Content-Type
-   field is present it is assumed to be "message/rfc822" in a
-   "multipart/digest" and "text/plain" otherwise.
-
-   NOTE: The CRLF preceding the boundary delimiter line is conceptually
-   attached to the boundary so that it is possible to have a part that
-   does not end with a CRLF (line break). Body parts that must be
-   considered to end with line breaks, therefore, must have two CRLFs
-   preceding the boundary delimiter line, the first of which is part of
-   the preceding body part, and the second of which is part of the
-   encapsulation boundary.
-
-   Boundary delimiters must not appear within the encapsulated material,
-   and must be no longer than 70 characters, not counting the two
-   leading hyphens.
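The quoting rule for the boundary parameter can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not a reference implementation: the set of characters that force quoting is the "tspecials" set of the RFC 2045 parameter grammar, the allowed boundary characters and the rule against trailing white space follow the BNF given later in this section, and the names boundary_param and delimiter_line are hypothetical.

```python
import string

# Characters that force a parameter value to be written as a quoted-string
# (the "tspecials" of the RFC 2045 grammar).
TSPECIALS = set('()<>@,;:\\"/[]?=')

# Characters permitted in a boundary ("bchars" in the BNF of this section).
BCHARS = set(string.ascii_letters + string.digits + "'()+_,-./:=? ")


def boundary_param(boundary: str) -> str:
    """Format a boundary parameter, quoting it only when the grammar requires."""
    if not 1 <= len(boundary) <= 70:
        raise ValueError("boundary must be 1 to 70 characters long")
    if boundary.endswith(" "):
        raise ValueError("boundary must not end with white space")
    if not set(boundary) <= BCHARS:
        raise ValueError("boundary contains a character outside bchars")
    # bchars excludes the double quote and the backslash, so plain
    # quoting without escapes is sufficient here.
    if any(ch in TSPECIALS or ch == " " for ch in boundary):
        return f'boundary="{boundary}"'
    return f"boundary={boundary}"


def delimiter_line(boundary: str) -> str:
    """Return the boundary delimiter line, without its terminating CRLF."""
    return "--" + boundary
```

With the values used in the text, boundary_param("gc0p4Jq0M2Yt08j34c0p") is left unquoted, while boundary_param("gc0pJq0M:08jU534c0p") is quoted because of the colon. Always quoting would also be valid, since quoting "never hurts".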
- - The boundary delimiter line following the last body part is a - distinguished delimiter that indicates that no further body parts - will follow. Such a delimiter line is identical to the previous - delimiter lines, with the addition of two more hyphens after the - boundary parameter value. - - --gc0pJq0M:08jU534c0p-- - - NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the - boundary value with the beginning of each candidate line. An exact - match of the entire candidate line is not required; it is sufficient - that the boundary appear in its entirety following the CRLF. - - There appears to be room for additional information prior to the - first boundary delimiter line and following the final boundary - delimiter line. These areas should generally be left blank, and - implementations must ignore anything that appears before the first - boundary delimiter line or after the last one. - - NOTE: These "preamble" and "epilogue" areas are generally not used - because of the lack of proper typing of these parts and the lack of - clear semantics for handling these areas at gateways, particularly - X.400 gateways. However, rather than leaving the preamble area - blank, many MIME implementations have found this to be a convenient - - - -Freed & Borenstein Standards Track [Page 20] - -RFC 2046 Media Types November 1996 - - - place to insert an explanatory note for recipients who read the - message with pre-MIME software, since such notes will be ignored by - MIME-compliant software. - - NOTE: Because boundary delimiters must not appear in the body parts - being encapsulated, a user agent must exercise care to choose a - unique boundary parameter value. The boundary parameter value in the - example above could have been the result of an algorithm designed to - produce boundary delimiters with a very low probability of already - existing in the data to be encapsulated without having to prescan the - data. 
Alternate algorithms might result in more "readable" boundary - delimiters for a recipient with an old user agent, but would require - more attention to the possibility that the boundary delimiter might - appear at the beginning of some line in the encapsulated part. The - simplest boundary delimiter line possible is something like "---", - with a closing boundary delimiter line of "-----". - - As a very simple example, the following multipart message has two - parts, both of them plain text, one of them explicitly typed and one - of them implicitly typed: - - From: Nathaniel Borenstein <nsb@bellcore.com> - To: Ned Freed <ned@innosoft.com> - Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST) - Subject: Sample message - MIME-Version: 1.0 - Content-type: multipart/mixed; boundary="simple boundary" - - This is the preamble. It is to be ignored, though it - is a handy place for composition agents to include an - explanatory note to non-MIME conformant readers. - - --simple boundary - - This is implicitly typed plain US-ASCII text. - It does NOT end with a linebreak. - --simple boundary - Content-type: text/plain; charset=us-ascii - - This is explicitly typed plain US-ASCII text. - It DOES end with a linebreak. - - --simple boundary-- - - This is the epilogue. It is also to be ignored. - - - - - - -Freed & Borenstein Standards Track [Page 21] - -RFC 2046 Media Types November 1996 - - - The use of a media type of "multipart" in a body part within another - "multipart" entity is explicitly allowed. In such cases, for obvious - reasons, care must be taken to ensure that each nested "multipart" - entity uses a different boundary delimiter. See RFC 2049 for an - example of nested "multipart" entities. - - The use of the "multipart" media type with only a single body part - may be useful in certain contexts, and is explicitly permitted. - - NOTE: Experience has shown that a "multipart" media type with a - single body part is useful for sending non-text media types. 
It has - the advantage of providing the preamble as a place to include - decoding instructions. In addition, a number of SMTP gateways move - or remove the MIME headers, and a clever MIME decoder can take a good - guess at multipart boundaries even in the absence of the Content-Type - header and thereby successfully decode the message. - - The only mandatory global parameter for the "multipart" media type is - the boundary parameter, which consists of 1 to 70 characters from a - set of characters known to be very robust through mail gateways, and - NOT ending with white space. (If a boundary delimiter line appears to - end with white space, the white space must be presumed to have been - added by a gateway, and must be deleted.) It is formally specified - by the following BNF: - - boundary := 0*69<bchars> bcharsnospace - - bchars := bcharsnospace / " " - - bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / - "+" / "_" / "," / "-" / "." / - "/" / ":" / "=" / "?" - - Overall, the body of a "multipart" entity may be specified as - follows: - - dash-boundary := "--" boundary - ; boundary taken from the value of - ; boundary parameter of the - ; Content-Type field. - - multipart-body := [preamble CRLF] - dash-boundary transport-padding CRLF - body-part *encapsulation - close-delimiter transport-padding - [CRLF epilogue] - - - - - -Freed & Borenstein Standards Track [Page 22] - -RFC 2046 Media Types November 1996 - - - transport-padding := *LWSP-char - ; Composers MUST NOT generate - ; non-zero length transport - ; padding, but receivers MUST - ; be able to handle padding - ; added by message transports. - - encapsulation := delimiter transport-padding - CRLF body-part - - delimiter := CRLF dash-boundary - - close-delimiter := delimiter "--" - - preamble := discard-text - - epilogue := discard-text - - discard-text := *(*text CRLF) *text - ; May be ignored or discarded. 
-
-     body-part := MIME-part-headers [CRLF *OCTET]
-                  ; Lines in a body-part must not start
-                  ; with the specified dash-boundary and
-                  ; the delimiter must not appear anywhere
-                  ; in the body part. Note that the
-                  ; semantics of a body-part differ from
-                  ; the semantics of a message, as
-                  ; described in the text.
-
-     OCTET := <any 0-255 octet value>
-
-   IMPORTANT: The free insertion of linear-white-space and RFC 822
-   comments between the elements shown in this BNF is NOT allowed since
-   this BNF does not specify a structured header field.
-
-   NOTE: In certain transport enclaves, RFC 822 restrictions such as
-   the one that limits bodies to printable US-ASCII characters may not
-   be in force. (That is, transport domains may exist that resemble
-   standard Internet mail transport as specified in RFC 821 and assumed
-   by RFC 822, but without certain restrictions.) The relaxation of
-   these restrictions should be construed as locally extending the
-   definition of bodies, for example to include octets outside of the
-   US-ASCII range, as long as these extensions are supported by the
-   transport and adequately documented in the Content-Transfer-Encoding
-   header field. However, in no event are headers (either message
-   headers or body part headers) allowed to contain anything other than
-   US-ASCII characters.
-
-   NOTE: Conspicuously missing from the "multipart" type is a notion of
-   structured, related body parts. It is recommended that those wishing
-   to provide more structured or integrated multipart messaging
-   facilities should define subtypes of multipart that are syntactically
-   identical but define relationships between the various parts. For
-   example, subtypes of multipart could be defined that include a
-   distinguished part which in turn is used to specify the relationships
-   between the other parts, probably referring to them by their
-   Content-ID field.
Old implementations will not recognize the new - subtype if this approach is used, but will treat it as - multipart/mixed and will thus be able to show the user the parts that - are recognized. - -5.1.2. Handling Nested Messages and Multiparts - - The "message/rfc822" subtype defined in a subsequent section of this - document has no terminating condition other than running out of data. - Similarly, an improperly truncated "multipart" entity may not have - any terminating boundary marker, and can turn up operationally due to - mail system malfunctions. - - It is essential that such entities be handled correctly when they are - themselves imbedded inside of another "multipart" structure. MIME - implementations are therefore required to recognize outer level - boundary markers at ANY level of inner nesting. It is not sufficient - to only check for the next expected marker or other terminating - condition. - -5.1.3. Mixed Subtype - - The "mixed" subtype of "multipart" is intended for use when the body - parts are independent and need to be bundled in a particular order. - Any "multipart" subtypes that an implementation does not recognize - must be treated as being of subtype "mixed". - -5.1.4. Alternative Subtype - - The "multipart/alternative" type is syntactically identical to - "multipart/mixed", but the semantics are different. In particular, - each of the body parts is an "alternative" version of the same - information. - - Systems should recognize that the content of the various parts are - interchangeable. Systems should choose the "best" type based on the - local environment and references, in some cases even through user - interaction. As with "multipart/mixed", the order of body parts is - significant. In this case, the alternatives appear in an order of - increasing faithfulness to the original content. 
In general, the
-   best choice is the LAST part of a type supported by the recipient
-   system's local environment.
-
-   "Multipart/alternative" may be used, for example, to send a message
-   in a fancy text format in such a way that it can easily be displayed
-   anywhere:
-
-     From: Nathaniel Borenstein <nsb@bellcore.com>
-     To: Ned Freed <ned@innosoft.com>
-     Date: Mon, 22 Mar 1993 09:41:09 -0800 (PST)
-     Subject: Formatted text mail
-     MIME-Version: 1.0
-     Content-Type: multipart/alternative; boundary=boundary42
-
-     --boundary42
-     Content-Type: text/plain; charset=us-ascii
-
-     ... plain text version of message goes here ...
-
-     --boundary42
-     Content-Type: text/enriched
-
-     ... RFC 1896 text/enriched version of same message
-         goes here ...
-
-     --boundary42
-     Content-Type: application/x-whatever
-
-     ... fanciest version of same message goes here ...
-
-     --boundary42--
-
-   In this example, users whose mail systems understood the
-   "application/x-whatever" format would see only the fancy version,
-   while other users would see only the enriched or plain text version,
-   depending on the capabilities of their system.
-
-   In general, user agents that compose "multipart/alternative" entities
-   must place the body parts in increasing order of preference, that is,
-   with the preferred format last. For fancy text, the sending user
-   agent should put the plainest format first and the richest format
-   last. Receiving user agents should pick and display the last format
-   they are capable of displaying. In the case where one of the
-   alternatives is itself of type "multipart" and contains unrecognized
-   sub-parts, the user agent may choose either to show that alternative,
-   an earlier alternative, or both.
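The selection rule for receiving user agents (display the last format that can be displayed) can be sketched as follows. This is a minimal illustration: it assumes the caller has already reduced each body part to its bare "type/subtype" string without parameters, and the name pick_alternative is hypothetical.

```python
from typing import Iterable, Optional, Sequence


def pick_alternative(part_types: Sequence[str],
                     supported: Iterable[str]) -> Optional[int]:
    """Return the index of the last part whose media type is supported.

    Returns None when no alternative can be displayed at all.
    """
    # Media type and subtype names are not case sensitive.
    known = {media_type.lower() for media_type in supported}
    # Walk from the richest (last) alternative back to the plainest (first).
    for index in range(len(part_types) - 1, -1, -1):
        if part_types[index].lower() in known:
            return index
    return None
```

For the example message above, a reader that supports only "text/plain" and "text/enriched" would get index 1 (the enriched part), while a reader that also understands "application/x-whatever" would get index 2. A user agent that prefers to offer the user a choice could instead collect every supported index.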
- - - - - -Freed & Borenstein Standards Track [Page 25] - -RFC 2046 Media Types November 1996 - - - NOTE: From an implementor's perspective, it might seem more sensible - to reverse this ordering, and have the plainest alternative last. - However, placing the plainest alternative first is the friendliest - possible option when "multipart/alternative" entities are viewed - using a non-MIME-conformant viewer. While this approach does impose - some burden on conformant MIME viewers, interoperability with older - mail readers was deemed to be more important in this case. - - It may be the case that some user agents, if they can recognize more - than one of the formats, will prefer to offer the user the choice of - which format to view. This makes sense, for example, if a message - includes both a nicely- formatted image version and an easily-edited - text version. What is most critical, however, is that the user not - automatically be shown multiple versions of the same data. Either - the user should be shown the last recognized version or should be - given the choice. - - THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: Each part of a - "multipart/alternative" entity represents the same data, but the - mappings between the two are not necessarily without information - loss. For example, information is lost when translating ODA to - PostScript or plain text. It is recommended that each part should - have a different Content-ID value in the case where the information - content of the two parts is not identical. And when the information - content is identical -- for example, where several parts of type - "message/external-body" specify alternate ways to access the - identical data -- the same Content-ID field value should be used, to - optimize any caching mechanisms that might be present on the - recipient's end. 
However, the Content-ID values used by the parts - should NOT be the same Content-ID value that describes the - "multipart/alternative" as a whole, if there is any such Content-ID - field. That is, one Content-ID value will refer to the - "multipart/alternative" entity, while one or more other Content-ID - values will refer to the parts inside it. - -5.1.5. Digest Subtype - - This document defines a "digest" subtype of the "multipart" Content- - Type. This type is syntactically identical to "multipart/mixed", but - the semantics are different. In particular, in a digest, the default - Content-Type value for a body part is changed from "text/plain" to - "message/rfc822". This is done to allow a more readable digest - format that is largely compatible (except for the quoting convention) - with RFC 934. - - Note: Though it is possible to specify a Content-Type value for a - body part in a digest which is other than "message/rfc822", such as a - "text/plain" part containing a description of the material in the - digest, actually doing so is undesirable. The "multipart/digest" - Content-Type is intended to be used to send collections of messages. - If a "text/plain" part is needed, it should be included as a separate - part of a "multipart/mixed" message. - - A digest in this format might, then, look something like this: - - From: Moderator-Address - To: Recipient-List - Date: Mon, 22 Mar 1994 13:34:51 +0000 - Subject: Internet Digest, volume 42 - MIME-Version: 1.0 - Content-Type: multipart/mixed; - boundary="---- main boundary ----" - - ------ main boundary ---- - - ...Introductory text or table of contents... - - ------ main boundary ---- - Content-Type: multipart/digest; - boundary="---- next message ----" - - ------ next message ---- - - From: someone-else - Date: Fri, 26 Mar 1993 11:13:32 +0200 - Subject: my opinion - - ...body goes here ...
- - ------ next message ---- - - From: someone-else-again - Date: Fri, 26 Mar 1993 10:07:13 -0500 - Subject: my different opinion - - ... another body goes here ... - - ------ next message ------ - - ------ main boundary ------ - -5.1.6. Parallel Subtype - - This document defines a "parallel" subtype of the "multipart" - Content-Type. This type is syntactically identical to - "multipart/mixed", but the semantics are different. In particular, - - - -Freed & Borenstein Standards Track [Page 27] - -RFC 2046 Media Types November 1996 - - - in a parallel entity, the order of body parts is not significant. - - A common presentation of this type is to display all of the parts - simultaneously on hardware and software that are capable of doing so. - However, composing agents should be aware that many mail readers will - lack this capability and will show the parts serially in any event. - -5.1.7. Other Multipart Subtypes - - Other "multipart" subtypes are expected in the future. MIME - implementations must in general treat unrecognized subtypes of - "multipart" as being equivalent to "multipart/mixed". - -5.2. Message Media Type - - It is frequently desirable, in sending mail, to encapsulate another - mail message. A special media type, "message", is defined to - facilitate this. In particular, the "rfc822" subtype of "message" is - used to encapsulate RFC 822 messages. - - NOTE: It has been suggested that subtypes of "message" might be - defined for forwarded or rejected messages. However, forwarded and - rejected messages can be handled as multipart messages in which the - first part contains any control or descriptive information, and a - second part, of type "message/rfc822", is the forwarded or rejected - message. Composing rejection and forwarding messages in this manner - will preserve the type information on the original message and allow - it to be correctly presented to the recipient, and hence is strongly - encouraged. 
- - Subtypes of "message" often impose restrictions on what encodings are - allowed. These restrictions are described in conjunction with each - specific subtype. - - Mail gateways, relays, and other mail handling agents are commonly - known to alter the top-level header of an RFC 822 message. In - particular, they frequently add, remove, or reorder header fields. - These operations are explicitly forbidden for the encapsulated - headers embedded in the bodies of messages of type "message." - -5.2.1. RFC822 Subtype - - A media type of "message/rfc822" indicates that the body contains an - encapsulated message, with the syntax of an RFC 822 message. - However, unlike top-level RFC 822 messages, the restriction that each - "message/rfc822" body must include a "From", "Date", and at least one - destination header is removed and replaced with the requirement that - at least one of "From", "Subject", or "Date" must be present. - - - -Freed & Borenstein Standards Track [Page 28] - -RFC 2046 Media Types November 1996 - - - It should be noted that, despite the use of the numbers "822", a - "message/rfc822" entity isn't restricted to material in strict - conformance to RFC822, nor are the semantics of "message/rfc822" - objects restricted to the semantics defined in RFC822. More - specifically, a "message/rfc822" message could well be a News article - or a MIME message. - - No encoding other than "7bit", "8bit", or "binary" is permitted for - the body of a "message/rfc822" entity. The message header fields are - always US-ASCII in any case, and data within the body can still be - encoded, in which case the Content-Transfer-Encoding header field in - the encapsulated message will reflect this. Non-US-ASCII text in the - headers of an encapsulated message can be specified using the - mechanisms described in RFC 2047. - -5.2.2. 
Partial Subtype - - The "partial" subtype is defined to allow large entities to be - delivered as several separate pieces of mail and automatically - reassembled by a receiving user agent. (The concept is similar to IP - fragmentation and reassembly in the basic Internet Protocols.) This - mechanism can be used when intermediate transport agents limit the - size of individual messages that can be sent. The media type - "message/partial" thus indicates that the body contains a fragment of - a larger entity. - - Because data of type "message" may never be encoded in base64 or - quoted-printable, a problem might arise if "message/partial" entities - are constructed in an environment that supports binary or 8bit - transport. The problem is that the binary data would be split into - multiple "message/partial" messages, each of them requiring binary - transport. If such messages were encountered at a gateway into a - 7bit transport environment, there would be no way to properly encode - them for the 7bit world, aside from waiting for all of the fragments, - reassembling the inner message, and then encoding the reassembled - data in base64 or quoted-printable. Since it is possible that - different fragments might go through different gateways, even this is - not an acceptable solution. For this reason, it is specified that - entities of type "message/partial" must always have a content- - transfer-encoding of 7bit (the default). In particular, even in - environments that support binary or 8bit transport, the use of a - content- transfer-encoding of "8bit" or "binary" is explicitly - prohibited for MIME entities of type "message/partial". This in turn - implies that the inner message must not use "8bit" or "binary" - encoding. 
- - - - - - -Freed & Borenstein Standards Track [Page 29] - -RFC 2046 Media Types November 1996 - - - Because some message transfer agents may choose to automatically - fragment large messages, and because such agents may use very - different fragmentation thresholds, it is possible that the pieces of - a partial message, upon reassembly, may prove themselves to comprise - a partial message. This is explicitly permitted. - - Three parameters must be specified in the Content-Type field of type - "message/partial": The first, "id", is a unique identifier, as close - to a world-unique identifier as possible, to be used to match the - fragments together. (In general, the identifier is essentially a - message-id; if placed in double quotes, it can be ANY message-id, in - accordance with the BNF for "parameter" given in RFC 2045.) The - second, "number", an integer, is the fragment number, which indicates - where this fragment fits into the sequence of fragments. The third, - "total", another integer, is the total number of fragments. This - third subfield is required on the final fragment, and is optional - (though encouraged) on the earlier fragments. Note also that these - parameters may be given in any order. - - Thus, the second piece of a 3-piece message may have either of the - following header fields: - - Content-Type: Message/Partial; number=2; total=3; - id="oc=jpbe0M2Yt4s@thumper.bellcore.com" - - Content-Type: Message/Partial; - id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; - number=2 - - But the third piece MUST specify the total number of fragments: - - Content-Type: Message/Partial; number=3; total=3; - id="oc=jpbe0M2Yt4s@thumper.bellcore.com" - - Note that fragment numbering begins with 1, not 0. - - When the fragments of an entity broken up in this manner are put - together, the result is always a complete MIME entity, which may have - its own Content-Type header field, and thus may contain any other - data type. - -5.2.2.1. 
Message Fragmentation and Reassembly - - The semantics of a reassembled partial message must be those of the - "inner" message, rather than of a message containing the inner - message. This makes it possible, for example, to send a large audio - message as several partial messages, and still have it appear to the - recipient as a simple audio message rather than as an encapsulated - - - -Freed & Borenstein Standards Track [Page 30] - -RFC 2046 Media Types November 1996 - - - message containing an audio message. That is, the encapsulation of - the message is considered to be "transparent". - - When generating and reassembling the pieces of a "message/partial" - message, the headers of the encapsulated message must be merged with - the headers of the enclosing entities. In this process the following - rules must be observed: - - (1) Fragmentation agents must split messages at line - boundaries only. This restriction is imposed because - splits at points other than the ends of lines in turn - depends on message transports being able to preserve - the semantics of messages that don't end with a CRLF - sequence. Many transports are incapable of preserving - such semantics. - - (2) All of the header fields from the initial enclosing - message, except those that start with "Content-" and - the specific header fields "Subject", "Message-ID", - "Encrypted", and "MIME-Version", must be copied, in - order, to the new message. - - (3) The header fields in the enclosed message which start - with "Content-", plus the "Subject", "Message-ID", - "Encrypted", and "MIME-Version" fields, must be - appended, in order, to the header fields of the new - message. Any header fields in the enclosed message - which do not start with "Content-" (except for the - "Subject", "Message-ID", "Encrypted", and "MIME- - Version" fields) will be ignored and dropped. - - (4) All of the header fields from the second and any - subsequent enclosing messages are discarded by the - reassembly process. 
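Rules (2) through (4) above amount to a small header-merging procedure. The following is a minimal sketch of that merge, assuming headers are already parsed into ordered `(name, value)` pairs; the names `SPECIAL`, `taken_from_inner` and `merge_headers` are hypothetical, and the sample header values are illustrative only.

```python
# Fields that rule (2) drops from the enclosing message and that
# rule (3) takes from the enclosed message instead.
SPECIAL = {"subject", "message-id", "encrypted", "mime-version"}


def taken_from_inner(name):
    """True for "Content-*" fields and the four specially named fields."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in SPECIAL


def merge_headers(first_outer, inner):
    """Build the header list of the reassembled message.

    first_outer: headers of the FIRST enclosing message/partial message.
    inner:       headers of the enclosed message (found in fragment 1).
    Headers of later fragments are never passed in, which is rule (4).
    """
    # Rule (2): copy, in order, everything except Content-* and SPECIAL.
    kept_outer = [(n, v) for n, v in first_outer if not taken_from_inner(n)]
    # Rule (3): append, in order, only Content-* and SPECIAL fields;
    # all other enclosed fields are dropped.
    kept_inner = [(n, v) for n, v in inner if taken_from_inner(n)]
    return kept_outer + kept_inner


outer = [
    ("X-Weird-Header-1", "Foo"),
    ("From", "Bill@host.com"),
    ("To", "joe@otherhost.com"),
    ("Subject", "Audio mail (part 1 of 2)"),
    ("Message-ID", "<id1@host.com>"),
    ("MIME-Version", "1.0"),
    ("Content-type", 'message/partial; id="ABC@host.com"; number=1; total=2'),
]
inner = [
    ("X-Weird-Header-1", "Bar"),
    ("X-Weird-Header-2", "Hello"),
    ("Message-ID", "<anotherid@foo.com>"),
    ("Subject", "Audio mail"),
    ("MIME-Version", "1.0"),
    ("Content-type", "audio/basic"),
    ("Content-transfer-encoding", "base64"),
]

merged = merge_headers(outer, inner)
for name, value in merged:
    print(f"{name}: {value}")
```

The sketch keeps the enclosed fields in their original relative order, so "Message-ID" precedes "Subject" here; the rules fix which fields survive, and a real implementation may present them differently.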
- -5.2.2.2. Fragmentation and Reassembly Example - - If an audio message is broken into two pieces, the first piece might - look something like this: - - X-Weird-Header-1: Foo - From: Bill@host.com - To: joe@otherhost.com - Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) - Subject: Audio mail (part 1 of 2) - Message-ID: <id1@host.com> - MIME-Version: 1.0 - Content-type: message/partial; id="ABC@host.com"; - - - -Freed & Borenstein Standards Track [Page 31] - -RFC 2046 Media Types November 1996 - - - number=1; total=2 - - X-Weird-Header-1: Bar - X-Weird-Header-2: Hello - Message-ID: <anotherid@foo.com> - Subject: Audio mail - MIME-Version: 1.0 - Content-type: audio/basic - Content-transfer-encoding: base64 - - ... first half of encoded audio data goes here ... - - and the second half might look something like this: - - From: Bill@host.com - To: joe@otherhost.com - Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) - Subject: Audio mail (part 2 of 2) - MIME-Version: 1.0 - Message-ID: <id2@host.com> - Content-type: message/partial; - id="ABC@host.com"; number=2; total=2 - - ... second half of encoded audio data goes here ... - - Then, when the fragmented message is reassembled, the resulting - message to be displayed to the user should look something like this: - - X-Weird-Header-1: Foo - From: Bill@host.com - To: joe@otherhost.com - Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) - Subject: Audio mail - Message-ID: <anotherid@foo.com> - MIME-Version: 1.0 - Content-type: audio/basic - Content-transfer-encoding: base64 - - ... first half of encoded audio data goes here ... - ... second half of encoded audio data goes here ... - - The inclusion of a "References" field in the headers of the second - and subsequent pieces of a fragmented message that references the - Message-Id on the previous piece may be of benefit to mail readers - that understand and track references. However, the generation of - such "References" fields is entirely optional. 
- - - - - -Freed & Borenstein Standards Track [Page 32] - -RFC 2046 Media Types November 1996 - - - Finally, it should be noted that the "Encrypted" header field has - been made obsolete by Privacy Enhanced Messaging (PEM) [RFC-1421, - RFC-1422, RFC-1423, RFC-1424], but the rules above are nevertheless - believed to describe the correct way to treat it if it is encountered - in the context of conversion to and from "message/partial" fragments. - -5.2.3. External-Body Subtype - - The external-body subtype indicates that the actual body data are not - included, but merely referenced. In this case, the parameters - describe a mechanism for accessing the external data. - - When a MIME entity is of type "message/external-body", it consists of - a header, two consecutive CRLFs, and the message header for the - encapsulated message. If another pair of consecutive CRLFs appears, - this of course ends the message header for the encapsulated message. - However, since the encapsulated message's body is itself external, it - does NOT appear in the area that follows. For example, consider the - following message: - - Content-type: message/external-body; - access-type=local-file; - name="/u/nsb/Me.jpeg" - - Content-type: image/jpeg - Content-ID: <id42@guppylake.bellcore.com> - Content-Transfer-Encoding: binary - - THIS IS NOT REALLY THE BODY! - - The area at the end, which might be called the "phantom body", is - ignored for most external-body messages. However, it may be used to - contain auxiliary information for some such messages, as indeed it is - when the access-type is "mail- server". The only access-type defined - in this document that uses the phantom body is "mail-server", but - other access-types may be defined in the future in other - specifications that use this area. - - The encapsulated headers in ALL "message/external-body" entities MUST - include a Content-ID header field to give a unique identifier by - which to reference the data. 
This identifier may be used for caching - mechanisms, and for recognizing the receipt of the data when the - access-type is "mail-server". - - Note that, as specified here, the tokens that describe external-body - data, such as file names and mail server commands, are required to be - in the US-ASCII character set. - - - - -Freed & Borenstein Standards Track [Page 33] - -RFC 2046 Media Types November 1996 - - - If this proves problematic in practice, a new mechanism may be - required as a future extension to MIME, either as newly defined - access-types for "message/external-body" or by some other mechanism. - - As with "message/partial", MIME entities of type "message/external- - body" MUST have a content-transfer-encoding of 7bit (the default). - In particular, even in environments that support binary or 8bit - transport, the use of a content- transfer-encoding of "8bit" or - "binary" is explicitly prohibited for entities of type - "message/external-body". - -5.2.3.1. General External-Body Parameters - - The parameters that may be used with any "message/external- body" - are: - - (1) ACCESS-TYPE -- A word indicating the supported access - mechanism by which the file or data may be obtained. - This word is not case sensitive. Values include, but - are not limited to, "FTP", "ANON-FTP", "TFTP", "LOCAL- - FILE", and "MAIL-SERVER". Future values, except for - experimental values beginning with "X-", must be - registered with IANA, as described in RFC 2048. - This parameter is unconditionally mandatory and MUST be - present on EVERY "message/external-body". - - (2) EXPIRATION -- The date (in the RFC 822 "date-time" - syntax, as extended by RFC 1123 to permit 4 digits in - the year field) after which the existence of the - external data is not guaranteed. This parameter may be - used with ANY access-type and is ALWAYS optional. - - (3) SIZE -- The size (in octets) of the data. 
The intent - of this parameter is to help the recipient decide - whether or not to expend the necessary resources to - retrieve the external data. Note that this describes - the size of the data in its canonical form, that is, - before any Content-Transfer-Encoding has been applied - or after the data have been decoded. This parameter - may be used with ANY access-type and is ALWAYS - optional. - - (4) PERMISSION -- A case-insensitive field that indicates - whether or not it is expected that clients might also - attempt to overwrite the data. By default, or if - permission is "read", the assumption is that they are - not, and that if the data is retrieved once, it is - never needed again. If PERMISSION is "read-write", - - - -Freed & Borenstein Standards Track [Page 34] - -RFC 2046 Media Types November 1996 - - - this assumption is invalid, and any local copy must be - considered no more than a cache. "Read" and "Read- - write" are the only defined values of permission. This - parameter may be used with ANY access-type and is - ALWAYS optional. - - The precise semantics of the access-types defined here are described - in the sections that follow. - -5.2.3.2. The 'ftp' and 'tftp' Access-Types - - An access-type of FTP or TFTP indicates that the message body is - accessible as a file using the FTP [RFC-959] or TFTP [RFC- 783] - protocols, respectively. For these access-types, the following - additional parameters are mandatory: - - (1) NAME -- The name of the file that contains the actual - body data. - - (2) SITE -- A machine from which the file may be obtained, - using the given protocol. This must be a fully - qualified domain name, not a nickname. - - (3) Before any data are retrieved, using FTP, the user will - generally need to be asked to provide a login id and a - password for the machine named by the site parameter. - For security reasons, such an id and password are not - specified as content-type parameters, but must be - obtained from the user. 
- - In addition, the following parameters are optional: - - (1) DIRECTORY -- A directory from which the data named by - NAME should be retrieved. - - (2) MODE -- A case-insensitive string indicating the mode - to be used when retrieving the information. The valid - values for access-type "TFTP" are "NETASCII", "OCTET", - and "MAIL", as specified by the TFTP protocol [RFC- - 783]. The valid values for access-type "FTP" are - "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a - decimal integer, typically 8. These correspond to the - representation types "A" "E" "I" and "L n" as specified - by the FTP protocol [RFC-959]. Note that "BINARY" and - "TENEX" are not valid values for MODE and that "OCTET" - or "IMAGE" or "LOCAL8" should be used instead. IF MODE - is not specified, the default value is "NETASCII" for - TFTP and "ASCII" otherwise. - - - -Freed & Borenstein Standards Track [Page 35] - -RFC 2046 Media Types November 1996 - - -5.2.3.3. The 'anon-ftp' Access-Type - - The "anon-ftp" access-type is identical to the "ftp" access type, - except that the user need not be asked to provide a name and password - for the specified site. Instead, the ftp protocol will be used with - login "anonymous" and a password that corresponds to the user's mail - address. - -5.2.3.4. The 'local-file' Access-Type - - An access-type of "local-file" indicates that the actual body is - accessible as a file on the local machine. Two additional parameters - are defined for this access type: - - (1) NAME -- The name of the file that contains the actual - body data. This parameter is mandatory for the - "local-file" access-type. - - (2) SITE -- A domain specifier for a machine or set of - machines that are known to have access to the data - file. This optional parameter is used to describe the - locality of reference for the data, that is, the site - or sites at which the file is expected to be visible. 
- Asterisks may be used for wildcard matching to a part - of a domain name, such as "*.bellcore.com", to indicate - a set of machines on which the data should be directly - visible, while a single asterisk may be used to - indicate a file that is expected to be universally - available, e.g., via a global file system. - -5.2.3.5. The 'mail-server' Access-Type - - The "mail-server" access-type indicates that the actual body is - available from a mail server. Two additional parameters are defined - for this access-type: - - (1) SERVER -- The addr-spec of the mail server from which - the actual body data can be obtained. This parameter - is mandatory for the "mail-server" access-type. - - (2) SUBJECT -- The subject that is to be used in the mail - that is sent to obtain the data. Note that keying mail - servers on Subject lines is NOT recommended, but such - mail servers are known to exist. This is an optional - parameter. - - Because mail servers accept a variety of syntaxes, some of which are - multiline, the full command to be sent to a mail server is not - included as a parameter in the content-type header field. Instead, - it is provided as the "phantom body" when the media type is - "message/external-body" and the access-type is mail-server. - - Note that MIME does not define a mail server syntax. Rather, it - allows the inclusion of arbitrary mail server commands in the phantom - body. Implementations must include the phantom body in the body of - the message they send to the mail server address to retrieve the - relevant data. - - Unlike other access-types, mail-server access is asynchronous and - will happen at an unpredictable time in the future. For this reason, - it is important that there be a mechanism by which the returned data - can be matched up with the original "message/external-body" entity.
- MIME mail servers must use the same Content-ID field on the returned - message that was used in the original "message/external-body" - entities, to facilitate such matching. - -5.2.3.6. External-Body Security Issues - - "Message/external-body" entities give rise to two important security - issues: - - (1) Accessing data via a "message/external-body" reference - effectively results in the message recipient performing - an operation that was specified by the message - originator. It is therefore possible for the message - originator to trick a recipient into doing something - they would not have done otherwise. For example, an - originator could specify an action that attempts - retrieval of material that the recipient is not - authorized to obtain, causing the recipient to - unwittingly violate some security policy. For this - reason, user agents capable of resolving external - references must always take steps to describe the - action they are to take to the recipient and ask for - explicit permission prior to performing it. - - The 'mail-server' access-type is particularly - vulnerable, in that it causes the recipient to send a - new message whose contents are specified by the - original message's originator. Given the potential for - abuse, any such request messages that are constructed - should contain a clear indication that they were - generated automatically (e.g. in a Comments: header - field) in an attempt to resolve a MIME - "message/external-body" reference. - - (2) MIME will sometimes be used in environments that - provide some guarantee of message integrity and - authenticity. If present, such guarantees may apply - only to the actual direct content of messages -- they - may or may not apply to data accessed through MIME's - "message/external-body" mechanism.
In particular, it - may be possible to subvert certain access mechanisms - even when the messaging system itself is secure. - - It should be noted that this problem exists either with - or without the availability of MIME mechanisms. A - casual reference to an FTP site containing a document - in the text of a secure message brings up similar - issues -- the only difference is that MIME provides for - automatic retrieval of such material, and users may - place unwarranted trust in such automatic retrieval - mechanisms. - -5.2.3.7. Examples and Further Explanations - - When the external-body mechanism is used in conjunction with the - "multipart/alternative" media type it extends the functionality of - "multipart/alternative" to include the case where the same entity is - provided in the same format but via different access mechanisms. - When this is done the originator of the message must order the parts - first in terms of preferred formats and then by preferred access - mechanisms. The recipient's viewer should then evaluate the list - both in terms of format and access mechanisms. - - With the emerging possibility of very wide-area file systems, it - becomes very hard to know in advance the set of machines where a file - will and will not be accessible directly from the file system. - Therefore it may make sense to provide both a file name, to be tried - directly, and the name of one or more sites from which the file is - known to be accessible. An implementation can try to retrieve remote - files using FTP or any other protocol, using anonymous file retrieval - or prompting the user for the necessary name and password. If an - external body is accessible via multiple mechanisms, the sender may - include multiple entities of type "message/external-body" within the - body parts of an enclosing "multipart/alternative" entity. - - However, the external-body mechanism is not intended to be limited to - file retrieval, as shown by the mail-server access-type.
Beyond - this, one can imagine, for example, using a video server for external - references to video clips. - - The embedded message header fields which appear in the body of the - "message/external-body" data must be used to declare the media type - of the external body if it is anything other than plain US-ASCII - text, since the external body does not have a header section to - declare its type. Similarly, any Content-transfer-encoding other - than "7bit" must also be declared here. Thus a complete - "message/external-body" message, referring to an object in PostScript - format, might look like this: - - From: Whomever - To: Someone - Date: Whenever - Subject: whatever - MIME-Version: 1.0 - Message-ID: <id1@host.com> - Content-Type: multipart/alternative; boundary=42 - Content-ID: <id001@guppylake.bellcore.com> - - --42 - Content-Type: message/external-body; name="BodyFormats.ps"; - site="thumper.bellcore.com"; mode="image"; - access-type=ANON-FTP; directory="pub"; - expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" - - Content-type: application/postscript - Content-ID: <id42@guppylake.bellcore.com> - - --42 - Content-Type: message/external-body; access-type=local-file; - name="/u/nsb/writing/rfcs/RFC-MIME.ps"; - site="thumper.bellcore.com"; - expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" - - Content-type: application/postscript - Content-ID: <id42@guppylake.bellcore.com> - - --42 - Content-Type: message/external-body; - access-type=mail-server; - server="listserv@bogus.bitnet"; - expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" - - Content-type: application/postscript - Content-ID: <id42@guppylake.bellcore.com> - - get RFC-MIME.DOC - - --42-- - - Note that in the above examples, the default Content-transfer- - encoding of "7bit" is assumed for the external PostScript
data. - - Like the "message/partial" type, the "message/external-body" media - type is intended to be transparent, that is, to convey the data type - in the external body rather than to convey a message with a body of - that type. Thus the headers on the outer and inner parts must be - merged using the same rules as for "message/partial". In particular, - this means that the Content-type and Subject fields are overridden, - but the From field is preserved. - - Note that since the external bodies are not transported along with - the external body reference, they need not conform to transport - limitations that apply to the reference itself. In particular, - Internet mail transports may impose 7bit and line length limits, but - these do not automatically apply to binary external body references. - Thus a Content-Transfer-Encoding is not generally necessary, though - it is permitted. - - Note that the body of a message of type "message/external-body" is - governed by the basic syntax for an RFC 822 message. In particular, - anything before the first consecutive pair of CRLFs is header - information, while anything after it is body information, which is - ignored for most access-types. - -5.2.4. Other Message Subtypes - - MIME implementations must in general treat unrecognized subtypes of - "message" as being equivalent to "application/octet-stream". - - Future subtypes of "message" intended for use with email should be - restricted to "7bit" encoding. A type other than "message" should be - used if restriction to "7bit" is not possible. - -6. Experimental Media Type Values - - A media type value beginning with the characters "X-" is a private - value, to be used by consenting systems by mutual agreement. Any - format without a rigorous and public definition must be named with an - "X-" prefix, and publicly specified values shall never begin with - "X-". 
(Older versions of the widely used Andrew system use the - "X-BE2" name, so new systems should probably choose a different name.) - - In general, the use of "X-" top-level types is strongly discouraged. - Implementors should invent subtypes of the existing types whenever - possible. In many cases, a subtype of "application" will be more - appropriate than a new top-level type. - - - - -Freed & Borenstein Standards Track [Page 40] - -RFC 2046 Media Types November 1996 - - -7. Summary - - The five discrete media types provide a standardized - mechanism for tagging entities as "audio", "image", or several other - kinds of data. The composite "multipart" and "message" media types - allow mixing and hierarchical structuring of entities of different - types in a single message. A distinguished parameter syntax allows - further specification of data format details, particularly the - specification of alternate character sets. Additional optional - header fields provide mechanisms for certain extensions deemed - desirable by many implementors. Finally, a number of useful media - types are defined for general use by consenting user agents, notably - "message/partial" and "message/external-body". - -8. Security Considerations - - Security issues are discussed in the context of the - "application/postscript" type, the "message/external-body" type, and - in RFC 2048. Implementors should pay special attention to the - security implications of any media types that can cause the remote - execution of any actions in the recipient's environment. In such - cases, the discussion of the "application/postscript" type may serve - as a model for considering other media types with remote execution - capabilities. - - - - - - - - - - - - - - - - - - - - - - - - - - - -Freed & Borenstein Standards Track [Page 41] - -RFC 2046 Media Types November 1996 - - -9. 
Authors' Addresses - - For more information, the authors of this document are best contacted - via Internet mail: - - Ned Freed - Innosoft International, Inc. - 1050 East Garvey Avenue South - West Covina, CA 91790 - USA - - Phone: +1 818 919 3600 - Fax: +1 818 919 3614 - EMail: ned@innosoft.com - - - Nathaniel S. Borenstein - First Virtual Holdings - 25 Washington Avenue - Morristown, NJ 07960 - USA - - Phone: +1 201 540 8967 - Fax: +1 201 993 3032 - EMail: nsb@nsb.fv.com - - - MIME is a result of the work of the Internet Engineering Task Force - Working Group on RFC 822 Extensions. The chairman of that group, - Greg Vaudreuil, may be reached at: - - Gregory M. Vaudreuil - Octel Network Services - 17080 Dallas Parkway - Dallas, TX 75248-1905 - USA - - EMail: Greg.Vaudreuil@Octel.Com - - - - - - - - - - - - - -Freed & Borenstein Standards Track [Page 42] - -RFC 2046 Media Types November 1996 - - -Appendix A -- Collected Grammar - - This appendix contains the complete BNF grammar for all the syntax - specified by this document. - - By itself, however, this grammar is incomplete. It refers by name to - several syntax rules that are defined by RFC 822. Rather than - reproduce those definitions here, and risk unintentional differences - between the two, this document simply refers the reader to RFC 822 - for the remaining definitions. Wherever a term is undefined, it - refers to the RFC 822 definition. - - boundary := 0*69<bchars> bcharsnospace - - bchars := bcharsnospace / " " - - bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / - "+" / "_" / "," / "-" / "." / - "/" / ":" / "=" / "?" - - body-part := <"message" as defined in RFC 822, with all - header fields optional, not starting with the - specified dash-boundary, and with the - delimiter not occurring anywhere in the - body part. 
Note that the semantics of a - part differ from the semantics of a message, - as described in the text.> - - close-delimiter := delimiter "--" - - dash-boundary := "--" boundary - ; boundary taken from the value of - ; boundary parameter of the - ; Content-Type field. - - delimiter := CRLF dash-boundary - - discard-text := *(*text CRLF) - ; May be ignored or discarded. - - encapsulation := delimiter transport-padding - CRLF body-part - - epilogue := discard-text - - multipart-body := [preamble CRLF] - dash-boundary transport-padding CRLF - body-part *encapsulation - - - -Freed & Borenstein Standards Track [Page 43] - -RFC 2046 Media Types November 1996 - - - close-delimiter transport-padding - [CRLF epilogue] - - preamble := discard-text - - transport-padding := *LWSP-char - ; Composers MUST NOT generate - ; non-zero length transport - ; padding, but receivers MUST - ; be able to handle padding - ; added by message transports. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Freed & Borenstein Standards Track [Page 44] - diff --git a/proto/rfc2047.txt b/proto/rfc2047.txt @@ -1,843 +0,0 @@ - - - - - - -Network Working Group K. Moore -Request for Comments: 2047 University of Tennessee -Obsoletes: 1521, 1522, 1590 November 1996 -Category: Standards Track - - - MIME (Multipurpose Internet Mail Extensions) Part Three: - Message Header Extensions for Non-ASCII Text - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Abstract - - STD 11, RFC 822, defines a message representation protocol specifying - considerable detail about US-ASCII message headers, and leaves the - message content, or message body, as flat US-ASCII text. 
This set of - documents, collectively called the Multipurpose Internet Mail - Extensions, or MIME, redefines the format of messages to allow for - - (1) textual message bodies in character sets other than US-ASCII, - - (2) an extensible set of different formats for non-textual message - bodies, - - (3) multi-part message bodies, and - - (4) textual header information in character sets other than US-ASCII. - - These documents are based on earlier work documented in RFC 934, STD - 11, and RFC 1049, but extends and revises them. Because RFC 822 said - so little about message bodies, these documents are largely - orthogonal to (rather than a revision of) RFC 822. - - This particular document is the third document in the series. It - describes extensions to RFC 822 to allow non-US-ASCII text data in - Internet mail header fields. - - - - - - - - - -Moore Standards Track [Page 1] - -RFC 2047 Message Header Extensions November 1996 - - - Other documents in this series include: - - + RFC 2045, which specifies the various headers used to describe - the structure of MIME messages. - - + RFC 2046, which defines the general structure of the MIME media - typing system and defines an initial set of media types, - - + RFC 2048, which specifies various IANA registration procedures - for MIME-related facilities, and - - + RFC 2049, which describes MIME conformance criteria and - provides some illustrative examples of MIME message formats, - acknowledgements, and the bibliography. - - These documents are revisions of RFCs 1521, 1522, and 1590, which - themselves were revisions of RFCs 1341 and 1342. An appendix in RFC - 2049 describes differences and changes from previous versions. - -1. Introduction - - RFC 2045 describes a mechanism for denoting textual body parts which - are coded in various character sets, as well as methods for encoding - such body parts as sequences of printable US-ASCII characters. 
This - memo describes similar techniques to allow the encoding of non-ASCII - text in various portions of a RFC 822 [2] message header, in a manner - which is unlikely to confuse existing message handling software. - - Like the encoding techniques described in RFC 2045, the techniques - outlined here were designed to allow the use of non-ASCII characters - in message headers in a way which is unlikely to be disturbed by the - quirks of existing Internet mail handling programs. In particular, - some mail relaying programs are known to (a) delete some message - header fields while retaining others, (b) rearrange the order of - addresses in To or Cc fields, (c) rearrange the (vertical) order of - header fields, and/or (d) "wrap" message headers at different places - than those in the original message. In addition, some mail reading - programs are known to have difficulty correctly parsing message - headers which, while legal according to RFC 822, make use of - backslash-quoting to "hide" special characters such as "<", ",", or - ":", or which exploit other infrequently-used features of that - specification. - - While it is unfortunate that these programs do not correctly - interpret RFC 822 headers, to "break" these programs would cause - severe operational problems for the Internet mail system. The - extensions described in this memo therefore do not rely on little- - used features of RFC 822. - - - -Moore Standards Track [Page 2] - -RFC 2047 Message Header Extensions November 1996 - - - Instead, certain sequences of "ordinary" printable ASCII characters - (known as "encoded-words") are reserved for use as encoded data. The - syntax of encoded-words is such that they are unlikely to - "accidentally" appear as normal text in message headers. - Furthermore, the characters used in encoded-words are restricted to - those which do not have special meanings in the context in which the - encoded-word appears. 
- - Generally, an "encoded-word" is a sequence of printable ASCII - characters that begins with "=?", ends with "?=", and has two "?"s in - between. It specifies a character set and an encoding method, and - also includes the original text encoded as graphic ASCII characters, - according to the rules for that encoding method. - - A mail composer that implements this specification will provide a - means of inputting non-ASCII text in header fields, but will - translate these fields (or appropriate portions of these fields) into - encoded-words before inserting them into the message header. - - A mail reader that implements this specification will recognize - encoded-words when they appear in certain portions of the message - header. Instead of displaying the encoded-word "as is", it will - reverse the encoding and display the original text in the designated - character set. - -NOTES - - This memo relies heavily on notation and terms defined in RFC 822 and - RFC 2045. In particular, the syntax for the ABNF used in this memo - is defined in RFC 822, and many of the terminal and nonterminal - symbols from RFC 822 are used in the grammar for the header - extensions defined here. Among the symbols defined in RFC 822 and - referenced in this memo are: 'addr-spec', 'atom', 'CHAR', 'comment', - 'CTLs', 'ctext', 'linear-white-space', 'phrase', 'quoted-pair', - 'quoted-string', 'SPACE', and 'word'. Successful implementation of - this protocol extension requires careful attention to the RFC 822 - definitions of these terms. - - When the term "ASCII" appears in this memo, it refers to the "7-Bit - American Standard Code for Information Interchange", ANSI X3.4-1986. - The MIME charset name for this character set is "US-ASCII". When not - specifically referring to the MIME charset name, this document uses - the term "ASCII", both for brevity and for consistency with RFC 822. 
- However, implementors are warned that the character set name must be - spelled "US-ASCII" in MIME message and body part headers. - - - - - - -Moore Standards Track [Page 3] - -RFC 2047 Message Header Extensions November 1996 - - - This memo specifies a protocol for the representation of non-ASCII - text in message headers. It specifically DOES NOT define any - translation between "8-bit headers" and pure ASCII headers, nor is - any such translation assumed to be possible. - -2. Syntax of encoded-words - - An 'encoded-word' is defined by the following ABNF grammar. The - notation of RFC 822 is used, with the exception that white space - characters MUST NOT appear between components of an 'encoded-word'. - - encoded-word = "=?" charset "?" encoding "?" encoded-text "?=" - - charset = token ; see section 3 - - encoding = token ; see section 4 - - token = 1*<Any CHAR except SPACE, CTLs, and especials> - - especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / " - <"> / "/" / "[" / "]" / "?" / "." / "=" - - encoded-text = 1*<Any printable ASCII character other than "?" - or SPACE> - ; (but see "Use of encoded-words in message - ; headers", section 5) - - Both 'encoding' and 'charset' names are case-independent. Thus the - charset name "ISO-8859-1" is equivalent to "iso-8859-1", and the - encoding named "Q" may be spelled either "Q" or "q". - - An 'encoded-word' may not be more than 75 characters long, including - 'charset', 'encoding', 'encoded-text', and delimiters. If it is - desirable to encode more text than will fit in an 'encoded-word' of - 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may - be used. - - While there is no limit to the length of a multiple-line header - field, each line of a header field that contains one or more - 'encoded-word's is limited to 76 characters. 
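The grammar and the 75-character limit above can be checked mechanically. The following is a minimal sketch, not part of the memo: the function name `parse_encoded_word` is hypothetical, the 'especials' set is taken as printed in the grammar, and the result is normalized because both names are case-independent.

```python
import re

# 'especials' as printed in the grammar above (assumption: exactly these characters).
ESPECIALS = set('()<>@,;:"/[]?.=')

# 'token': printable ASCII other than SPACE and the especials.
TOKEN_CHARS = "".join(chr(c) for c in range(33, 127) if chr(c) not in ESPECIALS)
TOKEN = "[" + re.escape(TOKEN_CHARS) + "]+"

# 'encoded-text': printable ASCII other than "?" (0x3F) and SPACE.
ENCODED_TEXT = r"[!->@-~]+"

ENCODED_WORD = re.compile(r"=\?(%s)\?(%s)\?(%s)\?=" % (TOKEN, TOKEN, ENCODED_TEXT))


def parse_encoded_word(word):
    """Return (charset, encoding, encoded_text), or None if 'word' is not
    a syntactically valid encoded-word of at most 75 characters."""
    if len(word) > 75:
        return None
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        return None
    charset, encoding, text = match.groups()
    # 'charset' and 'encoding' names are case-independent, so normalize them.
    return charset.lower(), encoding.upper(), text
```

This checks syntax only. Whether the 'encoded-text' is well-formed for the named encoding is a separate question.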
- - The length restrictions are included both to ease interoperability - through internetwork mail gateways, and to impose a limit on the - amount of lookahead a header parser must employ (while looking for a - final ?= delimiter) before it can decide whether a token is an - "encoded-word" or something else. - - - - - -Moore Standards Track [Page 4] - -RFC 2047 Message Header Extensions November 1996 - - - IMPORTANT: 'encoded-word's are designed to be recognized as 'atom's - by an RFC 822 parser. As a consequence, unencoded white space - characters (such as SPACE and HTAB) are FORBIDDEN within an - 'encoded-word'. For example, the character sequence - - =?iso-8859-1?q?this is some text?= - - would be parsed as four 'atom's, rather than as a single 'atom' (by - an RFC 822 parser) or 'encoded-word' (by a parser which understands - 'encoded-words'). The correct way to encode the string "this is some - text" is to encode the SPACE characters as well, e.g. - - =?iso-8859-1?q?this=20is=20some=20text?= - - The characters which may appear in 'encoded-text' are further - restricted by the rules in section 5. - -3. Character sets - - The 'charset' portion of an 'encoded-word' specifies the character - set associated with the unencoded text. A 'charset' can be any of - the character set names allowed in an MIME "charset" parameter of a - "text/plain" body part, or any character set name registered with - IANA for use with the MIME text/plain content-type. - - Some character sets use code-switching techniques to switch between - "ASCII mode" and other modes. If unencoded text in an 'encoded-word' - contains a sequence which causes the charset interpreter to switch - out of ASCII mode, it MUST contain additional control codes such that - ASCII mode is again selected at the end of the 'encoded-word'. (This - rule applies separately to each 'encoded-word', including adjacent - 'encoded-word's within a single header field.) 
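As an illustration of the SPACE rule and the length limit, here is a minimal composer-side sketch. It is an assumption-laden example rather than anything the memo prescribes: the name `q_encode_words` is hypothetical, and it deliberately escapes every octet that is not an ASCII letter or digit as "=XX", which is more conservative than required. Text is split per character so that no multi-octet character is divided between two 'encoded-word's.

```python
def q_encode_words(text, charset="iso-8859-1", limit=75):
    """Split 'text' into Q-encoded encoded-words of at most 'limit' characters."""
    prefix = "=?%s?q?" % charset
    suffix = "?="
    budget = limit - len(prefix) - len(suffix)  # room left for encoded-text
    words = []
    current = ""
    for ch in text:
        if ch.isascii() and ch.isalnum():
            piece = ch
        else:
            # Escape every other octet, including SPACE, as "=XX" (upper-case hex).
            piece = "".join("=%02X" % octet for octet in ch.encode(charset))
        if current and len(current) + len(piece) > budget:
            words.append(prefix + current + suffix)
            current = ""
        current += piece
    if current:
        words.append(prefix + current + suffix)
    return words


# Multiple encoded-words are joined with CRLF SPACE when written to a header.
folded = "\r\n ".join(q_encode_words("this is some text"))
```

For the string "this is some text" this yields the single word shown in the example above, with each SPACE written as "=20".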
- - When there is a possibility of using more than one character set to - represent the text in an 'encoded-word', and in the absence of - private agreements between sender and recipients of a message, it is - recommended that members of the ISO-8859-* series be used in - preference to other character sets. - -4. Encodings - - Initially, the legal values for "encoding" are "Q" and "B". These - encodings are described below. The "Q" encoding is recommended for - use when most of the characters to be encoded are in the ASCII - character set; otherwise, the "B" encoding should be used. - Nevertheless, a mail reader which claims to recognize 'encoded-word's - MUST be able to accept either encoding for any character set which it - supports. - - - -Moore Standards Track [Page 5] - -RFC 2047 Message Header Extensions November 1996 - - - Only a subset of the printable ASCII characters may be used in - 'encoded-text'. Space and tab characters are not allowed, so that - the beginning and end of an 'encoded-word' are obvious. The "?" - character is used within an 'encoded-word' to separate the various - portions of the 'encoded-word' from one another, and thus cannot - appear in the 'encoded-text' portion. Other characters are also - illegal in certain contexts. For example, an 'encoded-word' in a - 'phrase' preceding an address in a From header field may not contain - any of the "specials" defined in RFC 822. Finally, certain other - characters are disallowed in some contexts, to ensure reliability for - messages that pass through internetwork mail gateways. - - The "B" encoding automatically meets these requirements. The "Q" - encoding allows a wide range of printable characters to be used in - non-critical locations in the message header (e.g., Subject), with - fewer characters available for use in other locations. - -4.1. The "B" encoding - - The "B" encoding is identical to the "BASE64" encoding defined by RFC - 2045. - -4.2. 
The "Q" encoding - - The "Q" encoding is similar to the "Quoted-Printable" content- - transfer-encoding defined in RFC 2045. It is designed to allow text - containing mostly ASCII characters to be decipherable on an ASCII - terminal without decoding. - - (1) Any 8-bit value may be represented by a "=" followed by two - hexadecimal digits. For example, if the character set in use - were ISO-8859-1, the "=" character would thus be encoded as - "=3D", and a SPACE by "=20". (Upper case should be used for - hexadecimal digits "A" through "F".) - - (2) The 8-bit hexadecimal value 20 (e.g., ISO-8859-1 SPACE) may be - represented as "_" (underscore, ASCII 95.). (This character may - not pass through some internetwork mail gateways, but its use - will greatly enhance readability of "Q" encoded data with mail - readers that do not support this encoding.) Note that the "_" - always represents hexadecimal 20, even if the SPACE character - occupies a different code position in the character set in use. - - (3) 8-bit values which correspond to printable ASCII characters other - than "=", "?", and "_" (underscore), MAY be represented as those - characters. (But see section 5 for restrictions.) In - particular, SPACE and TAB MUST NOT be represented as themselves - within encoded words. - - - -Moore Standards Track [Page 6] - -RFC 2047 Message Header Extensions November 1996 - - -5. Use of encoded-words in message headers - - An 'encoded-word' may appear in a message header or body part header - according to the following rules: - -(1) An 'encoded-word' may replace a 'text' token (as defined by RFC 822) - in any Subject or Comments header field, any extension message - header field, or any MIME body part field for which the field body - is defined as '*text'. An 'encoded-word' may also appear in any - user-defined ("X-") message or body part header field. - - Ordinary ASCII text and 'encoded-word's may appear together in the - same header field. 
However, an 'encoded-word' that appears in a - header field defined as '*text' MUST be separated from any adjacent - 'encoded-word' or 'text' by 'linear-white-space'. - -(2) An 'encoded-word' may appear within a 'comment' delimited by "(" and - ")", i.e., wherever a 'ctext' is allowed. More precisely, the RFC - 822 ABNF definition for 'comment' is amended as follows: - - comment = "(" *(ctext / quoted-pair / comment / encoded-word) ")" - - A "Q"-encoded 'encoded-word' which appears in a 'comment' MUST NOT - contain the characters "(", ")" or "\". An - 'encoded-word' that appears in a 'comment' MUST be separated from - any adjacent 'encoded-word' or 'ctext' by 'linear-white-space'. - - It is important to note that 'comment's are only recognized inside - "structured" field bodies. In fields whose bodies are defined as - '*text', "(" and ")" are treated as ordinary characters rather than - comment delimiters, and rule (1) of this section applies. (See RFC - 822, sections 3.1.2 and 3.1.3) - -(3) As a replacement for a 'word' entity within a 'phrase', for example, - one that precedes an address in a From, To, or Cc header. The ABNF - definition for 'phrase' from RFC 822 thus becomes: - - phrase = 1*( encoded-word / word ) - - In this case the set of characters that may be used in a "Q"-encoded - 'encoded-word' is restricted to: <upper and lower case ASCII - letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_" - (underscore, ASCII 95.)>. An 'encoded-word' that appears within a - 'phrase' MUST be separated from any adjacent 'word', 'text' or - 'special' by 'linear-white-space'. - - - - - - -Moore Standards Track [Page 7] - -RFC 2047 Message Header Extensions November 1996 - - - These are the ONLY locations where an 'encoded-word' may appear. In - particular: - - + An 'encoded-word' MUST NOT appear in any portion of an 'addr-spec'. - - + An 'encoded-word' MUST NOT appear within a 'quoted-string'. - - + An 'encoded-word' MUST NOT be used in a Received header field. 
- - + An 'encoded-word' MUST NOT be used in parameter of a MIME - Content-Type or Content-Disposition field, or in any structured - field body except within a 'comment' or 'phrase'. - - The 'encoded-text' in an 'encoded-word' must be self-contained; - 'encoded-text' MUST NOT be continued from one 'encoded-word' to - another. This implies that the 'encoded-text' portion of a "B" - 'encoded-word' will be a multiple of 4 characters long; for a "Q" - 'encoded-word', any "=" character that appears in the 'encoded-text' - portion will be followed by two hexadecimal characters. - - Each 'encoded-word' MUST encode an integral number of octets. The - 'encoded-text' in each 'encoded-word' must be well-formed according - to the encoding specified; the 'encoded-text' may not be continued in - the next 'encoded-word'. (For example, "=?charset?Q?=?= - =?charset?Q?AB?=" would be illegal, because the two hex digits "AB" - must follow the "=" in the same 'encoded-word'.) - - Each 'encoded-word' MUST represent an integral number of characters. - A multi-octet character may not be split across adjacent 'encoded- - word's. - - Only printable and white space character data should be encoded using - this scheme. However, since these encoding schemes allow the - encoding of arbitrary octet values, mail readers that implement this - decoding should also ensure that display of the decoded data on the - recipient's terminal will not cause unwanted side-effects. - - Use of these methods to encode non-textual data (e.g., pictures or - sounds) is not defined by this memo. Use of 'encoded-word's to - represent strings of purely ASCII characters is allowed, but - discouraged. In rare cases it may be necessary to encode ordinary - text that looks like an 'encoded-word'. - - - - - - - - - -Moore Standards Track [Page 8] - -RFC 2047 Message Header Extensions November 1996 - - -6. Support of 'encoded-word's by mail readers - -6.1. 
Recognition of 'encoded-word's in message headers - - A mail reader must parse the message and body part headers according - to the rules in RFC 822 to correctly recognize 'encoded-word's. - - 'encoded-word's are to be recognized as follows: - - (1) Any message or body part header field defined as '*text', or any - user-defined header field, should be parsed as follows: Beginning - at the start of the field-body and immediately following each - occurrence of 'linear-white-space', each sequence of up to 75 - printable characters (not containing any 'linear-white-space') - should be examined to see if it is an 'encoded-word' according to - the syntax rules in section 2. Any other sequence of printable - characters should be treated as ordinary ASCII text. - - (2) Any header field not defined as '*text' should be parsed - according to the syntax rules for that header field. However, - any 'word' that appears within a 'phrase' should be treated as an - 'encoded-word' if it meets the syntax rules in section 2. - Otherwise it should be treated as an ordinary 'word'. - - (3) Within a 'comment', any sequence of up to 75 printable characters - (not containing 'linear-white-space'), that meets the syntax - rules in section 2, should be treated as an 'encoded-word'. - Otherwise it should be treated as normal comment text. - - (4) A MIME-Version header field is NOT required to be present for - 'encoded-word's to be interpreted according to this - specification. One reason for this is that the mail reader is - not expected to parse the entire message header before displaying - lines that may contain 'encoded-word's. - -6.2. Display of 'encoded-word's - - Any 'encoded-word's so recognized are decoded, and if possible, the - resulting unencoded text is displayed in the original character set. - - NOTE: Decoding and display of encoded-words occurs *after* a - structured field body is parsed into tokens. 
It is therefore - possible to hide 'special' characters in encoded-words which, when - displayed, will be indistinguishable from 'special' characters in the - surrounding text. For this and other reasons, it is NOT generally - possible to translate a message header containing 'encoded-word's to - an unencoded form which can be parsed by an RFC 822 mail reader. - - - - -Moore Standards Track [Page 9] - -RFC 2047 Message Header Extensions November 1996 - - - When displaying a particular header field that contains multiple - 'encoded-word's, any 'linear-white-space' that separates a pair of - adjacent 'encoded-word's is ignored. (This is to allow the use of - multiple 'encoded-word's to represent long strings of unencoded text, - without having to separate 'encoded-word's where spaces occur in the - unencoded text.) - - In the event other encodings are defined in the future, and the mail - reader does not support the encoding used, it may either (a) display - the 'encoded-word' as ordinary text, or (b) substitute an appropriate - message indicating that the text could not be decoded. - - If the mail reader does not support the character set used, it may - (a) display the 'encoded-word' as ordinary text (i.e., as it appears - in the header), (b) make a "best effort" to display using such - characters as are available, or (c) substitute an appropriate message - indicating that the decoded text could not be displayed. - - If the character set being used employs code-switching techniques, - display of the encoded text implicitly begins in "ASCII mode". In - addition, the mail reader must ensure that the output device is once - again in "ASCII mode" after the 'encoded-word' is displayed. - -6.3. Mail reader handling of incorrectly formed 'encoded-word's - - It is possible that an 'encoded-word' that is legal according to the - syntax defined in section 2, is incorrectly formed according to the - rules for the encoding being used. 
For example: - - (1) An 'encoded-word' which contains characters which are not legal - for a particular encoding (for example, a "-" in the "B" - encoding, or a SPACE or HTAB in either the "B" or "Q" encoding), - is incorrectly formed. - - (2) Any 'encoded-word' which encodes a non-integral number of - characters or octets is incorrectly formed. - - A mail reader need not attempt to display the text associated with an - 'encoded-word' that is incorrectly formed. However, a mail reader - MUST NOT prevent the display or handling of a message because an - 'encoded-word' is incorrectly formed. - -7. Conformance - - A mail composing program claiming compliance with this specification - MUST ensure that any string of non-white-space printable ASCII - characters within a '*text' or '*ctext' that begins with "=?" and - ends with "?=" be a valid 'encoded-word'. ("begins" means: at the - - - -Moore Standards Track [Page 10] - -RFC 2047 Message Header Extensions November 1996 - - - start of the field-body, immediately following 'linear-white-space', - or immediately following a "(" for an 'encoded-word' within '*ctext'; - "ends" means: at the end of the field-body, immediately preceding - 'linear-white-space', or immediately preceding a ")" for an - 'encoded-word' within '*ctext'.) In addition, any 'word' within a - 'phrase' that begins with "=?" and ends with "?=" must be a valid - 'encoded-word'. - - A mail reading program claiming compliance with this specification - must be able to distinguish 'encoded-word's from 'text', 'ctext', or - 'word's, according to the rules in section 6, anytime they appear in - appropriate places in message headers. It must support both the "B" - and "Q" encodings for any character set which it supports. The - program must be able to display the unencoded text if the character - set is "US-ASCII". For the ISO-8859-* character sets, the mail - reading program must at least be able to display the characters which - are also in the ASCII set. 
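The reader-side rules in sections 4 through 6 can be sketched for the simplest case, a field defined as '*text'. This is a minimal illustration under stated assumptions, not a complete implementation: the names `decode_word` and `decode_text_field` are hypothetical, structured fields and 'comment's are not handled, and an 'encoded-word' that cannot be decoded is shown as ordinary text, which is one of the options section 6 permits.

```python
import base64
import re

# Loose recognizer: four "?"-separated parts with no white space inside.
_EW = re.compile(r"=\?([^?\s]+)\?([^?\s]+)\?([^?\s]+)\?=")


def _decode_q(text):
    """Decode "Q" encoded-text into octets."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "_":
            out.append(0x20)  # "_" always stands for octet 0x20
            i += 1
        elif ch == "=":
            pair = text[i + 1:i + 3]
            if len(pair) != 2:
                raise ValueError("truncated =XX escape")
            out += bytes.fromhex(pair)  # ValueError on non-hex digits
            i += 3
        else:
            out.append(ord(ch))
            i += 1
    return bytes(out)


def decode_word(word):
    """Return (text, True) if 'word' decodes as an encoded-word,
    otherwise (word, False) so that display is never prevented."""
    match = _EW.fullmatch(word)
    if match is None or len(word) > 75:
        return word, False
    charset, encoding, text = match.groups()
    try:
        if encoding.upper() == "B":
            raw = base64.b64decode(text, validate=True)
        elif encoding.upper() == "Q":
            raw = _decode_q(text)
        else:
            return word, False  # unknown encoding: display as ordinary text
        return raw.decode(charset), True
    except (ValueError, LookupError):  # malformed text or unknown charset
        return word, False


def decode_text_field(value):
    """Decode a '*text' field body, dropping white space that separates
    two adjacent encoded-words."""
    out = []
    pending_ws = ""
    prev_encoded = False
    for part in re.split(r"(\s+)", value):
        if part == "":
            continue
        if part.isspace():
            pending_ws = part
            continue
        text, is_encoded = decode_word(part)
        if not (is_encoded and prev_encoded):
            out.append(pending_ws)
        out.append(text)
        pending_ws = ""
        prev_encoded = is_encoded
    out.append(pending_ws)
    return "".join(out)
```

Returning the undecoded word on any error keeps the reader from rejecting a message because one 'encoded-word' is incorrectly formed. A production reader would also need to filter control characters from the decoded text before display.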
- -8. Examples - - The following are examples of message headers containing 'encoded- - word's: - - From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu> - To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk> - CC: =?ISO-8859-1?Q?Andr=E9?= Pirard <PIRARD@vm1.ulg.ac.be> - Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?= - =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?= - - Note: In the first 'encoded-word' of the Subject field above, the - last "=" at the end of the 'encoded-text' is necessary because each - 'encoded-word' must be self-contained (the "=" character completes a - group of 4 base64 characters representing 2 octets). An additional - octet could have been encoded in the first 'encoded-word' (so that - the encoded-word would contain an exact multiple of 3 encoded - octets), except that the second 'encoded-word' uses a different - 'charset' than the first one. - - From: =?ISO-8859-1?Q?Olle_J=E4rnefors?= <ojarnef@admin.kth.se> - To: ietf-822@dimacs.rutgers.edu, ojarnef@admin.kth.se - Subject: Time for ISO 10646? - - To: Dave Crocker <dcrocker@mordor.stanford.edu> - Cc: ietf-822@dimacs.rutgers.edu, paf@comsol.se - From: =?ISO-8859-1?Q?Patrik_F=E4ltstr=F6m?= <paf@nada.kth.se> - Subject: Re: RFC-HDR care and feeding - - - - - -Moore Standards Track [Page 11] - -RFC 2047 Message Header Extensions November 1996 - - - From: Nathaniel Borenstein <nsb@thumper.bellcore.com> - (=?iso-8859-8?b?7eXs+SDv4SDp7Oj08A==?=) - To: Greg Vaudreuil <gvaudre@NRI.Reston.VA.US>, Ned Freed - <ned@innosoft.com>, Keith Moore <moore@cs.utk.edu> - Subject: Test of new header generator - MIME-Version: 1.0 - Content-type: text/plain; charset=ISO-8859-1 - - The following examples illustrate how 'encoded-word's that appear - in a structured field body are displayed. The rules are slightly - different for fields defined as '*text' because "(" and ")" are not - recognized as 'comment' delimiters. [Section 5, paragraph (1)]. 
- - In each of the following examples, if the same sequence were to occur - in a '*text' field, the "displayed as" form would NOT be treated as - encoded words, but be identical to the "encoded form". This is - because each of the encoded-words in the following examples is - adjacent to a "(" or ")" character. - - encoded form displayed as - --------------------------------------------------------------------- - (=?ISO-8859-1?Q?a?=) (a) - - (=?ISO-8859-1?Q?a?= b) (a b) - - Within a 'comment', white space MUST appear between an - 'encoded-word' and surrounding text. [Section 5, - paragraph (2)]. However, white space is not needed between - the initial "(" that begins the 'comment', and the - 'encoded-word'. - - - (=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=) (ab) - - White space between adjacent 'encoded-word's is not - displayed. - - (=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=) (ab) - - Even multiple SPACEs between 'encoded-word's are ignored - for the purpose of display. - - (=?ISO-8859-1?Q?a?= (ab) - =?ISO-8859-1?Q?b?=) - - Any amount of linear-white-space between 'encoded-word's, - even if it includes a CRLF followed by one or more SPACEs, - is ignored for the purposes of display. - - - -Moore Standards Track [Page 12] - -RFC 2047 Message Header Extensions November 1996 - - - (=?ISO-8859-1?Q?a_b?=) (a b) - - In order to cause a SPACE to be displayed within a portion - of encoded text, the SPACE MUST be encoded as part of the - 'encoded-word'. - - (=?ISO-8859-1?Q?a?= =?ISO-8859-2?Q?_b?=) (a b) - - In order to cause a SPACE to be displayed between two strings - of encoded text, the SPACE MAY be encoded as part of one of - the 'encoded-word's. - -9. References - - [RFC 822] Crocker, D., "Standard for the Format of ARPA Internet Text - Messages", STD 11, RFC 822, UDEL, August 1982. - - [RFC 2049] Borenstein, N., and N. Freed, "Multipurpose Internet Mail - Extensions (MIME) Part Five: Conformance Criteria and Examples", - RFC 2049, November 1996. 
- - [RFC 2045] Borenstein, N., and N. Freed, "Multipurpose Internet Mail - Extensions (MIME) Part One: Format of Internet Message Bodies", - RFC 2045, November 1996. - - [RFC 2046] Borenstein N., and N. Freed, "Multipurpose Internet Mail - Extensions (MIME) Part Two: Media Types", RFC 2046, - November 1996. - - [RFC 2048] Freed, N., Klensin, J., and J. Postel, "Multipurpose - Internet Mail Extensions (MIME) Part Four: Registration - Procedures", RFC 2048, November 1996. - - - - - - - - - - - - - - - - - - - -Moore Standards Track [Page 13] - -RFC 2047 Message Header Extensions November 1996 - - -10. Security Considerations - - Security issues are not discussed in this memo. - -11. Acknowledgements - - The author wishes to thank Nathaniel Borenstein, Issac Chan, Lutz - Donnerhacke, Paul Eggert, Ned Freed, Andreas M. Kirchwitz, Olle - Jarnefors, Mike Rosin, Yutaka Sato, Bart Schaefer, and Kazuhiko - Yamamoto, for their helpful advice, insightful comments, and - illuminating questions in response to earlier versions of this - specification. - -12. Author's Address - - Keith Moore - University of Tennessee - 107 Ayres Hall - Knoxville TN 37996-1301 - - EMail: moore@cs.utk.edu - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Moore Standards Track [Page 14] - -RFC 2047 Message Header Extensions November 1996 - - -Appendix - changes since RFC 1522 (in no particular order) - - + explicitly state that the MIME-Version is not requried to use - 'encoded-word's. - - + add explicit note that SPACEs and TABs are not allowed within - 'encoded-word's, explaining that an 'encoded-word' must look like an - 'atom' to an RFC822 parser.values, to be precise). - - + add examples from Olle Jarnefors (thanks!) which illustrate how - encoded-words with adjacent linear-white-space are displayed. 
- - + explicitly list terms defined in RFC822 and referenced in this memo - - + fix transcription typos that caused one or two lines and a couple of - characters to disappear in the resulting text, due to nroff quirks. - - + clarify that encoded-words are allowed in '*text' fields in both - RFC822 headers and MIME body part headers, but NOT as parameter - values. - - + clarify the requirement to switch back to ASCII within the encoded - portion of an 'encoded-word', for any charset that uses code switching - sequences. - - + add a note about 'encoded-word's being delimited by "(" and ")" - within a comment, but not in a *text (how bizarre!). - - + fix the Andre Pirard example to get rid of the trailing "_" after - the =E9. (no longer needed post-1342). - - + clarification: an 'encoded-word' may appear immediately following - the initial "(" or immediately before the final ")" that delimits a - comment, not just adjacent to "(" and ")" *within* *ctext. - - + add a note to explain that a "B" 'encoded-word' will always have a - multiple of 4 characters in the 'encoded-text' portion. - - + add note about the "=" in the examples - - + note that processing of 'encoded-word's occurs *after* parsing, and - some of the implications thereof. - - + explicitly state that you can't expect to translate between - 1522 and either vanilla 822 or so-called "8-bit headers". - - + explicitly state that 'encoded-word's are not valid within a - 'quoted-string'. - - - -Moore Standards Track [Page 15] - diff --git a/proto/rfc2048.txt b/proto/rfc2048.txt @@ -1,1180 +0,0 @@ - - - - - - -Network Working Group N. Freed -Request for Comments: 2048 Innosoft -BCP: 13 J. Klensin -Obsoletes: 1521, 1522, 1590 MCI -Category: Best Current Practice J. 
Postel - ISI - November 1996 - - - Multipurpose Internet Mail Extensions - (MIME) Part Four: - Registration Procedures - -Status of this Memo - - This document specifies an Internet Best Current Practices for the - Internet Community, and requests discussion and suggestions for - improvements. Distribution of this memo is unlimited. - -Abstract - - STD 11, RFC 822, defines a message representation protocol specifying - considerable detail about US-ASCII message headers, and leaves the - message content, or message body, as flat US-ASCII text. This set of - documents, collectively called the Multipurpose Internet Mail - Extensions, or MIME, redefines the format of messages to allow for - - (1) textual message bodies in character sets other than - US-ASCII, - - (2) an extensible set of different formats for non-textual - message bodies, - - (3) multi-part message bodies, and - - (4) textual header information in character sets other than - US-ASCII. - - These documents are based on earlier work documented in RFC 934, STD - 11, and RFC 1049, but extends and revises them. Because RFC 822 said - so little about message bodies, these documents are largely - orthogonal to (rather than a revision of) RFC 822. - - - - - - - - - -Freed, et. al. Best Current Practice [Page 1] - -RFC 2048 MIME Registration Procedures November 1996 - - - This fourth document, RFC 2048, specifies various IANA registration - procedures for the following MIME facilities: - - (1) media types, - - (2) external body access types, - - (3) content-transfer-encodings. - - Registration of character sets for use in MIME is covered elsewhere - and is no longer addressed by this document. - - These documents are revisions of RFCs 1521 and 1522, which themselves - were revisions of RFCs 1341 and 1342. An appendix in RFC 2049 - describes differences and changes from previous versions. - -Table of Contents - - 1. Introduction ......................................... 3 - 2. 
Media Type Registration .............................. 4 - 2.1 Registration Trees and Subtype Names ................ 4 - 2.1.1 IETF Tree ......................................... 4 - 2.1.2 Vendor Tree ....................................... 4 - 2.1.3 Personal or Vanity Tree ........................... 5 - 2.1.4 Special `x.' Tree ................................. 5 - 2.1.5 Additional Registration Trees ..................... 6 - 2.2 Registration Requirements ........................... 6 - 2.2.1 Functionality Requirement ......................... 6 - 2.2.2 Naming Requirements ............................... 6 - 2.2.3 Parameter Requirements ............................ 7 - 2.2.4 Canonicalization and Format Requirements .......... 7 - 2.2.5 Interchange Recommendations ....................... 8 - 2.2.6 Security Requirements ............................. 8 - 2.2.7 Usage and Implementation Non-requirements ......... 9 - 2.2.8 Publication Requirements .......................... 10 - 2.2.9 Additional Information ............................ 10 - 2.3 Registration Procedure .............................. 11 - 2.3.1 Present the Media Type to the Community for Review 11 - 2.3.2 IESG Approval ..................................... 12 - 2.3.3 IANA Registration ................................. 12 - 2.4 Comments on Media Type Registrations ................ 12 - 2.5 Location of Registered Media Type List .............. 12 - 2.6 IANA Procedures for Registering Media Types ......... 12 - 2.7 Change Control ...................................... 13 - 2.8 Registration Template ............................... 14 - 3. External Body Access Types ........................... 14 - 3.1 Registration Requirements ........................... 15 - 3.1.1 Naming Requirements ............................... 15 - - - -Freed, et. al. Best Current Practice [Page 2] - -RFC 2048 MIME Registration Procedures November 1996 - - - 3.1.2 Mechanism Specification Requirements .............. 
15 - 3.1.3 Publication Requirements .......................... 15 - 3.1.4 Security Requirements ............................. 15 - 3.2 Registration Procedure .............................. 15 - 3.2.1 Present the Access Type to the Community .......... 16 - 3.2.2 Access Type Reviewer .............................. 16 - 3.2.3 IANA Registration ................................. 16 - 3.3 Location of Registered Access Type List ............. 16 - 3.4 IANA Procedures for Registering Access Types ........ 16 - 4. Transfer Encodings ................................... 17 - 4.1 Transfer Encoding Requirements ...................... 17 - 4.1.1 Naming Requirements ............................... 17 - 4.1.2 Algorithm Specification Requirements .............. 18 - 4.1.3 Input Domain Requirements ......................... 18 - 4.1.4 Output Range Requirements ......................... 18 - 4.1.5 Data Integrity and Generality Requirements ........ 18 - 4.1.6 New Functionality Requirements .................... 18 - 4.2 Transfer Encoding Definition Procedure .............. 19 - 4.3 IANA Procedures for Transfer Encoding Registration... 19 - 4.4 Location of Registered Transfer Encodings List ...... 19 - 5. Authors' Addresses ................................... 20 - A. Grandfathered Media Types ............................ 21 - -1. Introduction - - Recent Internet protocols have been carefully designed to be easily - extensible in certain areas. In particular, MIME [RFC 2045] is an - open-ended framework and can accommodate additional object types, - character sets, and access methods without any changes to the basic - protocol. A registration process is needed, however, to ensure that - the set of such values is developed in an orderly, well-specified, - and public manner. - - This document defines registration procedures which use the Internet - Assigned Numbers Authority (IANA) as a central registry for such - values. 
- - Historical Note: The registration process for media types was - initially defined in the context of the asynchronous Internet mail - environment. In this mail environment there is a need to limit the - number of possible media types to increase the likelihood of - interoperability when the capabilities of the remote mail system are - not known. As media types are used in new environments, where the - proliferation of media types is not a hindrance to interoperability, - the original procedure was excessively restrictive and had to be - generalized. - - - - - -Freed, et. al. Best Current Practice [Page 3] - -RFC 2048 MIME Registration Procedures November 1996 - - -2. Media Type Registration - - Registration of a new media type or types starts with the - construction of a registration proposal. Registration may occur in - several different registration trees, which have different - requirements as discussed below. In general, the new registration - proposal is circulated and reviewed in a fashion appropriate to the - tree involved. The media type is then registered if the proposal is - acceptable. The following sections describe the requirements and - procedures used for each of the different registration trees. - -2.1. Registration Trees and Subtype Names - - In order to increase the efficiency and flexibility of the - registration process, different structures of subtype names may be - registered to accommodate the different natural requirements for, - e.g., a subtype that will be recommended for wide support and - implementation by the Internet Community or a subtype that is used to - move files associated with proprietary software. The following - subsections define registration "trees", distinguished by the use of - faceted names (e.g., names of the form "tree.subtree...type"). Note - that some media types defined prior to this document do not conform - to the naming conventions described below. See Appendix A for a - discussion of them. - -2.1.1. 
IETF Tree - - The IETF tree is intended for types of general interest to the - Internet Community. Registration in the IETF tree requires approval - by the IESG and publication of the media type registration as some - form of RFC. - - Media types in the IETF tree are normally denoted by names that are - not explicitly faceted, i.e., do not contain period (".", full stop) - characters. - - The "owner" of a media type registration in the IETF tree is assumed - to be the IETF itself. Modification or alteration of the - specification requires the same level of processing (e.g. standards - track) required for the initial registration. - -2.1.2. Vendor Tree - - The vendor tree is used for media types associated with commercially - available products. "Vendor" or "producer" are construed as - equivalent and very broadly in this context. - - - - - -Freed, et. al. Best Current Practice [Page 4] - -RFC 2048 MIME Registration Procedures November 1996 - - - A registration may be placed in the vendor tree by anyone who has - need to interchange files associated with the particular product. - However, the registration formally belongs to the vendor or - organization producing the software or file format. Changes to the - specification will be made at their request, as discussed in - subsequent sections. - - Registrations in the vendor tree will be distinguished by the leading - facet "vnd.". That may be followed, at the discretion of the - registration, by either a media type name from a well-known producer - (e.g., "vnd.mudpie") or by an IANA-approved designation of the - producer's name which is then followed by a media type or product - designation (e.g., vnd.bigcompany.funnypictures). - - While public exposure and review of media types to be registered in - the vendor tree is not required, using the ietf-types list for review - is strongly encouraged to improve the quality of those - specifications. Registrations in the vendor tree may be submitted - directly to the IANA. 
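The naming rules for the IETF and vendor trees can be sketched as a small classifier. This is only an illustration in Python under stated assumptions: the function name registration_tree and its return labels are hypothetical, names are compared case-insensitively, the "x-" prefix is treated as unregistered per RFC 2045, and any faceted name that does not begin with "vnd." is reported as "other" rather than guessed at. Grandfathered types that predate these conventions are not handled.

```python
def registration_tree(subtype):
    """Guess the registration tree of a MIME subtype name from its leading facet.

    Hypothetical helper for illustration; labels are not defined by the RFC.
    """
    name = subtype.lower()  # MIME type and subtype names are case-insensitive
    if name.startswith("x-"):
        return "unregistered"  # private or experimental name (RFC 2045)
    if "." not in name:
        return "ietf"  # IETF tree names are not explicitly faceted
    if name.startswith("vnd."):
        return "vendor"  # leading facet "vnd."
    return "other"  # some other faceted tree; not classified here


print(registration_tree("vnd.bigcompany.funnypictures"))  # prints: vendor
```

Returning "other" for unknown facets keeps the sketch from asserting a tree the text has not defined at this point.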
- -2.1.3. Personal or Vanity Tree - - Registrations for media types created experimentally or as part of - products that are not distributed commercially may be registered in - the personal or vanity tree. The registrations are distinguished by - the leading facet "prs.". - - The owner of "personal" registrations and associated specifications - is the person or entity making the registration, or one to whom - responsibility has been transferred as described below. - - While public exposure and review of media types to be registered in - the personal tree is not required, using the ietf-types list for - review is strongly encouraged to improve the quality of those - specifications. Registrations in the personal tree may be submitted - directly to the IANA. - -2.1.4. Special `x.' Tree - - For convenience and symmetry with this registration scheme, media - type names with "x." as the first facet may be used for the same - purposes for which names starting in "x-" are normally used. These - types are unregistered, experimental, and should be used only with - the active agreement of the parties exchanging them. - - - - - - - -Freed, et. al. Best Current Practice [Page 5] - -RFC 2048 MIME Registration Procedures November 1996 - - - However, with the simplified registration procedures described above - for vendor and personal trees, it should rarely, if ever, be - necessary to use unregistered experimental types, and as such use of - both "x-" and "x." forms is discouraged. - -2.1.5. Additional Registration Trees - - From time to time and as required by the community, the IANA may, - with the advice and consent of the IESG, create new top-level - registration trees. It is explicitly assumed that these trees may be - created for external registration and management by well-known - permanent bodies, such as scientific societies for media types - specific to the sciences they cover. 
In general, the quality of - review of specifications for one of these additional registration - trees is expected to be equivalent to that which IETF would give to - registrations in its own tree. Establishment of these new trees will - be announced through RFC publication approved by the IESG. - -2.2. Registration Requirements - - Media type registration proposals are all expected to conform to - various requirements laid out in the following sections. Note that - requirement specifics sometimes vary depending on the registration - tree, again as detailed in the following sections. - -2.2.1. Functionality Requirement - - Media types must function as an actual media format: Registration of - things that are better thought of as a transfer encoding, as a - character set, or as a collection of separate entities of another - type, is not allowed. For example, although applications exist to - decode the base64 transfer encoding [RFC 2045], base64 cannot be - registered as a media type. - - This requirement applies regardless of the registration tree - involved. - -2.2.2. Naming Requirements - - All registered media types must be assigned MIME type and subtype - names. The combination of these names then serves to uniquely - identify the media type and the format of the subtype name identifies - the registration tree. - - The choice of top-level type name must take the nature of media type - involved into account. For example, media normally used for - representing still images should be a subtype of the image content - type, whereas media capable of representing audio information belongs - - - -Freed, et. al. Best Current Practice [Page 6] - -RFC 2048 MIME Registration Procedures November 1996 - - - under the audio content type. See RFC 2046 for additional information - on the basic set of top-level types and their characteristics. - - New subtypes of top-level types must conform to the restrictions of - the top-level type, if any. 
For example, all subtypes of the - multipart content type must use the same encapsulation syntax. - - In some cases a new media type may not "fit" under any currently - defined top-level content type. Such cases are expected to be quite - rare. However, if such a case arises a new top-level type can be - defined to accommodate it. Such a definition must be done via - standards-track RFC; no other mechanism can be used to define - additional top-level content types. - - These requirements apply regardless of the registration tree - involved. - -2.2.3. Parameter Requirements - - Media types may elect to use one or more MIME content type - parameters, or some parameters may be automatically made available to - the media type by virtue of being a subtype of a content type that - defines a set of parameters applicable to any of its subtypes. In - either case, the names, values, and meanings of any parameters must - be fully specified when a media type is registered in the IETF tree, - and should be specified as completely as possible when media types - are registered in the vendor or personal trees. - - New parameters must not be defined as a way to introduce new - functionality in types registered in the IETF tree, although new - parameters may be added to convey additional information that does - not otherwise change existing functionality. An example of this - would be a "revision" parameter to indicate a revision level of an - external specification such as JPEG. Similar behavior is encouraged - for media types registered in the vendor or personal trees but is not - required. - -2.2.4. Canonicalization and Format Requirements - - All registered media types must employ a single, canonical data - format, regardless of registration tree. 
- - A precise and openly available specification of the format of each - media type is required for all types registered in the IETF tree and - must at a minimum be referenced by, if it isn't actually included in, - the media type registration proposal itself. - - - - - -Freed, et. al. Best Current Practice [Page 7] - -RFC 2048 MIME Registration Procedures November 1996 - - - The specifications of format and processing particulars may or may - not be publically available for media types registered in the vendor - tree, and such registration proposals are explicitly permitted to - include only a specification of which software and version produce or - process such media types. References to or inclusion of format - specifications in registration proposals is encouraged but not - required. - - Format specifications are still required for registration in the - personal tree, but may be either published as RFCs or otherwise - deposited with IANA. The deposited specifications will meet the same - criteria as those required to register a well-known TCP port and, in - particular, need not be made public. - - Some media types involve the use of patented technology. The - registration of media types involving patented technology is - specifically permitted. However, the restrictions set forth in RFC - 1602 on the use of patented technology in standards-track protocols - must be respected when the specification of a media type is part of a - standards-track protocol. - -2.2.5. Interchange Recommendations - - Media types should, whenever possible, interoperate across as many - systems and applications as possible. However, some media types will - inevitably have problems interoperating across different platforms. - Problems with different versions, byte ordering, and specifics of - gateway handling can and will arise. - - Universal interoperability of media types is not required, but known - interoperability issues should be identified whenever possible. 
- Publication of a media type does not require an exhaustive review of - interoperability, and the interoperability considerations section is - subject to continuing evaluation. - - These recommendations apply regardless of the registration tree - involved. - -2.2.6. Security Requirements - - An analysis of security issues is required for all types - registered in the IETF Tree. (This is in accordance with the basic - requirements for all IETF protocols.) A similar analysis for media - types registered in the vendor or personal trees is encouraged but - not required. However, regardless of what security analysis has or - has not been done, all descriptions of security issues must be as - accurate as possible regardless of registration tree. In particular, - a statement that there are "no security issues associated with this - - - -Freed, et. al. Best Current Practice [Page 8] - -RFC 2048 MIME Registration Procedures November 1996 - - - type" must not be confused with "the security issues associated with - this type have not been assessed". - - There is absolutely no requirement that media types registered in any - tree be secure or completely free from risks. Nevertheless, all - known security risks must be identified in the registration of a - media type, again regardless of registration tree. - - The security considerations section of all registrations is subject - to continuing evaluation and modification, and in particular may be - extended by use of the "comments on media types" mechanism described - in subsequent sections. - - Some of the issues that should be looked at in a security analysis of - a media type are: - - (1) Complex media types may include provisions for - directives that institute actions on a recipient's - files or other resources. In many cases provision is - made for originators to specify arbitrary actions in an - unrestricted fashion which may then have devastating - effects. 
See the registration of the - application/postscript media type in RFC 2046 for - an example of such directives and how to handle them. - - (2) Complex media types may include provisions for - directives that institute actions which, while not - directly harmful to the recipient, may result in - disclosure of information that either facilitates a - subsequent attack or else violates a recipient's - privacy in some way. Again, the registration of the - application/postscript media type illustrates how such - directives can be handled. - - (3) A media type might be targeted for applications that - require some sort of security assurance but not provide - the necessary security mechanisms themselves. For - example, a media type could be defined for storage of - confidential medical information which in turn requires - an external confidentiality service. - -2.2.7. Usage and Implementation Non-requirements - - In the asynchronous mail environment, where information on the - capabilities of the remote mail agent is frequently not available to - the sender, maximum interoperability is attained by restricting the - number of media types used to those "common" formats expected to be - widely implemented. This was asserted in the past as a reason to - - - -Freed, et. al. Best Current Practice [Page 9] - -RFC 2048 MIME Registration Procedures November 1996 - - - limit the number of possible media types and resulted in a - registration process with a significant hurdle and delay for those - registering media types. - - However, the need for "common" media types does not require limiting - the registration of new media types. If a limited set of media types - is recommended for a particular application, that should be asserted - by a separate applicability statement specific for the application - and/or environment. - - As such, universal support and implementation of a media type is NOT - a requirement for registration. 
If, however, a media type is - explicitly intended for limited use, this should be noted in its - registration. - -2.2.8. Publication Requirements - - Proposals for media types registered in the IETF tree must be - published as RFCs. RFC publication of vendor and personal media type - proposals is encouraged but not required. In all cases IANA will - retain copies of all media type proposals and "publish" them as part - of the media types registration tree itself. - - Other than in the IETF tree, the registration of a data type does not - imply endorsement, approval, or recommendation by IANA or IETF or - even certification that the specification is adequate. To become - Internet Standards, protocol, data objects, or whatever must go - through the IETF standards process. This is too difficult and too - lengthy a process for the convenient registration of media types. - - The IETF tree exists for media types that do require a - substantive review and approval process, while the vendor and personal - trees exist for those that do not. It is expected that applicability - statements for particular applications will be published from time to - time that recommend implementation of, and support for, media types - that have proven particularly useful in those contexts. - - As discussed above, registration of a top-level type requires - standards-track processing and, hence, RFC publication. - -2.2.9. Additional Information - - Various sorts of optional information may be included in the - specification of a media type if it is available: - - (1) Magic number(s) (length, octet values). Magic numbers - are byte sequences that are always present and thus can - be used to identify entities as being of a given media - - - -Freed, et. al. Best Current Practice [Page 10] - -RFC 2048 MIME Registration Procedures November 1996 - - - type. - - (2) File extension(s) commonly used on one or more - platforms to indicate that some file contains a given - type of media. 
- - (3) Macintosh File Type code(s) (4 octets) used to label - files containing a given type of media. - - Such information is often quite useful to implementors and if - available should be provided. - -2.3. Registration Procedure - - The following procedure has been implemented by the IANA for review - and approval of new media types. This is not a formal standards - process, but rather an administrative procedure intended to allow - community comment and sanity checking without excessive time delay. - For registration in the IETF tree, the normal IETF processes should - be followed, treating posting of an internet-draft and announcement - on the ietf-types list (as described in the next subsection) as a - first step. For registrations in the vendor or personal tree, the - initial review step described below may be omitted and the type - registered directly by submitting the template and an explanation - directly to IANA (at iana@iana.org). However, authors of vendor or - personal media type specifications are encouraged to seek community - review and comment whenever that is feasible. - -2.3.1. Present the Media Type to the Community for Review - - Send a proposed media type registration to the "ietf-types@iana.org" - mailing list for a two week review period. This mailing list has - been established for the purpose of reviewing proposed media and - access types. Proposed media types are not formally registered and - must not be used; the "x-" prefix specified in RFC 2045 can be used - until registration is complete. - - The intent of the public posting is to solicit comments and feedback - on the choice of type/subtype name, the unambiguity of the references - with respect to versions and external profiling information, and a - review of any interoperability or security considerations. The - submitter may submit a revised registration, or withdraw the - registration completely, at any time. - - - - - - - - -Freed, et. al. 
Best Current Practice [Page 11] - -RFC 2048 MIME Registration Procedures November 1996 - - -2.3.2. IESG Approval - - Media types registered in the IETF tree must be submitted to the IESG - for approval. - -2.3.3. IANA Registration - - Provided that the media type meets the requirements for media types - and has obtained approval that is necessary, the author may submit - the registration request to the IANA, which will register the media - type and make the media type registration available to the community. - -2.4. Comments on Media Type Registrations - - Comments on registered media types may be submitted by members of the - community to IANA. These comments will be passed on to the "owner" - of the media type if possible. Submitters of comments may request - that their comment be attached to the media type registration itself, - and if IANA approves of this the comment will be made accessible in - conjunction with the type registration itself. - -2.5. Location of Registered Media Type List - - Media type registrations will be posted in the anonymous FTP - directory "ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/" - and all registered media types will be listed in the periodically - issued "Assigned Numbers" RFC [currently STD 2, RFC 1700]. The media - type description and other supporting material may also be published - as an Informational RFC by sending it to "rfc-editor@isi.edu" (please - follow the instructions to RFC authors [RFC-1543]). - -2.6. IANA Procedures for Registering Media Types - - The IANA will only register media types in the IETF tree in response - to a communication from the IESG stating that a given registration - has been approved. Vendor and personal types will be registered by - the IANA automatically and without any formal review as long as the - following minimal conditions are met: - - (1) Media types must function as an actual media format. 
- In particular, character sets and transfer encodings - may not be registered as media types. - - (2) All media types must have properly formed type and - subtype names. All type names must be defined by a - standards-track RFC. All subtype names must be unique, - must conform to the MIME grammar for such names, and - must contain the proper tree prefix. - - - -Freed, et. al. Best Current Practice [Page 12] - -RFC 2048 MIME Registration Procedures November 1996 - - - (3) Types registered in the personal tree must either - provide a format specification or a pointer to one. - - (4) Any security considerations given must not be obviously - bogus. (It is neither possible nor necessary for the - IANA to conduct a comprehensive security review of - media type registrations. Nevertheless, IANA has the - authority to identify obviously incompetent material - and exclude it.) - -2.7. Change Control - - Once a media type has been published by IANA, the author may request - a change to its definition. The descriptions of the different - registration trees above designate the "owners" of each type of - registration. The change request follows the same procedure as the - registration request: - - (1) Publish the revised template on the ietf-types list. - - (2) Leave at least two weeks for comments. - - (3) Publish using IANA after formal review if required. - - Changes should be requested only when there are serious omission or - errors in the published specification. When review is required, a - change request may be denied if it renders entities that were valid - under the previous definition invalid under the new definition. - - The owner of a content type may pass responsibility for the content - type to another person or agency by informing IANA and the ietf-types - list; this can be done without discussion or review. - - The IESG may reassign responsibility for a media type. 
The most - common case of this will be to enable changes to be made to types - where the author of the registration has died, moved out of contact - or is otherwise unable to make changes that are important to the - community. - - Media type registrations may not be deleted; media types which are no - longer believed appropriate for use can be declared OBSOLETE by a - change to their "intended use" field; such media types will be - clearly marked in the lists published by IANA. - - - - - - - - -Freed, et. al. Best Current Practice [Page 13] - -RFC 2048 MIME Registration Procedures November 1996 - - -2.8. Registration Template - - To: ietf-types@iana.org - Subject: Registration of MIME media type XXX/YYY - - MIME media type name: - - MIME subtype name: - - Required parameters: - - Optional parameters: - - Encoding considerations: - - Security considerations: - - Interoperability considerations: - - Published specification: - - Applications which use this media type: - - Additional information: - - Magic number(s): - File extension(s): - Macintosh File Type Code(s): - - Person & email address to contact for further information: - - Intended usage: - - (One of COMMON, LIMITED USE or OBSOLETE) - - Author/Change controller: - - (Any other information that the author deems interesting may be - added below this line.) - -3. External Body Access Types - - RFC 2046 defines the message/external-body media type, whereby a MIME - entity can act as pointer to the actual body data in lieu of - including the data directly in the entity body. Each - message/external-body reference specifies an access type, which - determines the mechanism used to retrieve the actual body data. RFC - 2046 defines an initial set of access types, but allows for the - - - -Freed, et. al. Best Current Practice [Page 14] - -RFC 2048 MIME Registration Procedures November 1996 - - - registration of additional access types to accommodate new retrieval - mechanisms. - -3.1. 
Registration Requirements - - New access type specifications must conform to a number of - requirements as described below. - -3.1.1. Naming Requirements - - Each access type must have a unique name. This name appears in the - access-type parameter in the message/external-body content-type - header field, and must conform to MIME content type parameter syntax. - -3.1.2. Mechanism Specification Requirements - - All of the protocols, transports, and procedures used by a given - access type must be described, either in the specification of the - access type itself or in some other publicly available specification, - in sufficient detail for the access type to be implemented by any - competent implementor. Use of secret and/or proprietary methods in - access types is expressly prohibited. The restrictions imposed by - RFC 1602 on the standardization of patented algorithms must be - respected as well. - -3.1.3. Publication Requirements - - All access types must be described by an RFC. The RFC may be - informational rather than standards-track, although standards-track - review and approval are encouraged for all access types. - -3.1.4. Security Requirements - - Any known security issues that arise from the use of the access type - must be completely and fully described. It is not required that the - access type be secure or that it be free from risks, but that the - known risks be identified. Publication of a new access type does not - require an exhaustive security review, and the security - considerations section is subject to continuing evaluation. - Additional security considerations should be addressed by publishing - revised versions of the access type specification. - -3.2. Registration Procedure - - Registration of a new access type starts with the construction of a - draft of an RFC. - - - - - -Freed, et. al. Best Current Practice [Page 15] - -RFC 2048 MIME Registration Procedures November 1996 - - -3.2.1. 
Present the Access Type to the Community - - Send a proposed access type specification to the "ietf- - types@iana.org" mailing list for a two week review period. This - mailing list has been established for the purpose of reviewing - proposed access and media types. Proposed access types are not - formally registered and must not be used. - - The intent of the public posting is to solicit comments and feedback - on the access type specification and a review of any security - considerations. - -3.2.2. Access Type Reviewer - - When the two week period has passed, the access type reviewer, who is - appointed by the IETF Applications Area Director, either forwards the - request to iana@isi.edu, or rejects it because of significant - objections raised on the list. - - Decisions made by the reviewer must be posted to the ietf-types - mailing list within 14 days. Decisions made by the reviewer may be - appealed to the IESG. - -3.2.3. IANA Registration - - Provided that the access type has either passed review or has been - successfully appealed to the IESG, the IANA will register the access - type and make the registration available to the community. The - specification of the access type must also be published as an RFC. - Informational RFCs are published by sending them to "rfc- - editor@isi.edu" (please follow the instructions to RFC authors [RFC- - 1543]). - -3.3. Location of Registered Access Type List - - Access type registrations will be posted in the anonymous FTP - directory "ftp://ftp.isi.edu/in-notes/iana/assignments/access-types/" - and all registered access types will be listed in the periodically - issued "Assigned Numbers" RFC [currently RFC-1700]. - -3.4. IANA Procedures for Registering Access Types - - The identity of the access type reviewer is communicated to the IANA - by the IESG. 
The IANA then only acts in response to access type - definitions that either are approved by the access type reviewer and - forwarded by the reviewer to the IANA for registration, or in - response to a communication from the IESG that an access type - definition appeal has overturned the access type reviewer's ruling. - - - -Freed, et. al. Best Current Practice [Page 16] - -RFC 2048 MIME Registration Procedures November 1996 - - -4. Transfer Encodings - - Transfer encodings are transformations applied to MIME media types - after conversion to the media type's canonical form. Transfer - encodings are used for several purposes: - - (1) Many transports, especially message transports, can - only handle data consisting of relatively short lines - of text. There can also be severe restrictions on what - characters can be used in these lines of text -- some - transports are restricted to a small subset of US-ASCII - and others cannot handle certain character sequences. - Transfer encodings are used to transform binary data - into textual form that can survive such transports. - Examples of this sort of transfer encoding include the - base64 and quoted-printable transfer encodings defined - in RFC 2045. - - (2) Image, audio, video, and even application entities are - sometimes quite large. Compression algorithms are often - quite effective in reducing the size of large entities. - Transfer encodings can be used to apply general-purpose - non-lossy compression algorithms to MIME entities. - - (3) Transport encodings can be defined as a means of - representing existing encoding formats in a MIME - context. - - IMPORTANT: The standardization of a large number of different - transfer encodings is seen as a significant barrier to widespread - interoperability and is expressly discouraged. Nevertheless, the - following procedure has been defined to provide a means of defining - additional transfer encodings, should standardization actually be - justified. - -4.1. 
Transfer Encoding Requirements - - Transfer encoding specifications must conform to a number of - requirements as described below. - -4.1.1. Naming Requirements - - Each transfer encoding must have a unique name. This name appears in - the Content-Transfer-Encoding header field and must conform to the - syntax of that field. - - - - - - -Freed, et. al. Best Current Practice [Page 17] - -RFC 2048 MIME Registration Procedures November 1996 - - -4.1.2. Algorithm Specification Requirements - - All of the algorithms used in a transfer encoding (e.g. conversion - to printable form, compression) must be described in their entirety - in the transfer encoding specification. Use of secret and/or - proprietary algorithms in standardized transfer encodings is - expressly prohibited. The restrictions imposed by RFC 1602 on the - standardization of patented algorithms must be respected as well. - -4.1.3. Input Domain Requirements - - All transfer encodings must be applicable to an arbitrary sequence of - octets of any length. Dependence on particular input forms is not - allowed. - - It should be noted that the 7bit and 8bit encodings do not conform to - this requirement. Aside from the undesirability of having - specialized encodings, the intent here is to forbid the addition of - additional encodings along the lines of 7bit and 8bit. - -4.1.4. Output Range Requirements - - There is no requirement that a particular transfer encoding produce a - particular form of encoded output. However, the output format for - each transfer encoding must be fully and completely documented. In - particular, each specification must clearly state whether the output - format always lies within the confines of 7bit data, 8bit data, or is - simply pure binary data. - -4.1.5. Data Integrity and Generality Requirements - - All transfer encodings must be fully invertible on any platform; it - must be possible for anyone to recover the original data by - performing the corresponding decoding operation. 
Note that this - requirement effectively excludes all forms of lossy compression as - well as all forms of encryption from use as a transfer encoding. - -4.1.6. New Functionality Requirements - - All transfer encodings must provide some sort of new functionality. - Some degree of functionality overlap with previously defined transfer - encodings is acceptable, but any new transfer encoding must also - offer something no other transfer encoding provides. - - - - - - - - -Freed, et. al. Best Current Practice [Page 18] - -RFC 2048 MIME Registration Procedures November 1996 - - -4.2. Transfer Encoding Definition Procedure - - Definition of a new transfer encoding starts with the construction of - a draft of a standards-track RFC. The RFC must define the transfer - encoding precisely and completely, and must also provide substantial - justification for defining and standardizing a new transfer encoding. - This specification must then be presented to the IESG for - consideration. The IESG can - - (1) reject the specification outright as being - inappropriate for standardization, - - (2) approve the formation of an IETF working group to work - on the specification in accordance with IETF - procedures, or, - - (3) accept the specification as-is and put it directly on - the standards track. - - Transfer encoding specifications on the standards track follow normal - IETF rules for standards track documents. A transfer encoding is - considered to be defined and available for use once it is on the - standards track. - -4.3. IANA Procedures for Transfer Encoding Registration - - There is no need for a special procedure for registering Transfer - Encodings with the IANA. All legitimate transfer encoding - registrations must appear as a standards-track RFC, so it is the - IESG's responsibility to notify the IANA when a new transfer encoding - has been approved. - -4.4. 
Location of Registered Transfer Encodings List - - Transfer encoding registrations will be posted in the anonymous FTP - directory "ftp://ftp.isi.edu/in-notes/iana/assignments/transfer- - encodings/" and all registered transfer encodings will be listed in - the periodically issued "Assigned Numbers" RFC [currently RFC-1700]. - - - - - - - - - - - - - -Freed, et. al. Best Current Practice [Page 19] - -RFC 2048 MIME Registration Procedures November 1996 - - -5. Authors' Addresses - - For more information, the authors of this document are best - contacted via Internet mail: - - Ned Freed - Innosoft International, Inc. - 1050 East Garvey Avenue South - West Covina, CA 91790 - USA - - Phone: +1 818 919 3600 - Fax: +1 818 919 3614 - EMail: ned@innosoft.com - - - John Klensin - MCI - 2100 Reston Parkway - Reston, VA 22091 - - Phone: +1 703 715-7361 - Fax: +1 703 715-7436 - EMail: klensin@mci.net - - - Jon Postel - USC/Information Sciences Institute - 4676 Admiralty Way - Marina del Rey, CA 90292 - USA - - - Phone: +1 310 822 1511 - Fax: +1 310 823 6714 - EMail: Postel@ISI.EDU - - - - - - - - - - - - - - - -Freed, et. al. Best Current Practice [Page 20] - -RFC 2048 MIME Registration Procedures November 1996 - - -Appendix A -- Grandfathered Media Types - - A number of media types, registered prior to 1996, would, if - registered under the guidelines in this document, be placed into - either the vendor or personal trees. Reregistration of those types - to reflect the appropriate trees is encouraged, but not required. - Ownership and change control principles outlined in this document - apply to those types as if they had been registered in the trees - described above. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Freed, et. al. Best Current Practice [Page 21] - diff --git a/proto/rfc2049.txt b/proto/rfc2049.txt @@ -1,1347 +0,0 @@ - - - - - - -Network Working Group N. 
Freed -Request for Comments: 2049 Innosoft -Obsoletes: 1521, 1522, 1590 N. Borenstein -Category: Standards Track First Virtual - November 1996 - - - Multipurpose Internet Mail Extensions - (MIME) Part Five: - Conformance Criteria and Examples - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Abstract - - STD 11, RFC 822, defines a message representation protocol specifying - considerable detail about US-ASCII message headers, and leaves the - message content, or message body, as flat US-ASCII text. This set of - documents, collectively called the Multipurpose Internet Mail - Extensions, or MIME, redefines the format of messages to allow for - - (1) textual message bodies in character sets other than - US-ASCII, - - (2) an extensible set of different formats for non-textual - message bodies, - - (3) multi-part message bodies, and - - (4) textual header information in character sets other than - US-ASCII. - - These documents are based on earlier work documented in RFC 934, STD - 11, and RFC 1049, but extend and revise them. Because RFC 822 said - so little about message bodies, these documents are largely - orthogonal to (rather than a revision of) RFC 822. - - The initial document in this set, RFC 2045, specifies the various - headers used to describe the structure of MIME messages. The second - document defines the general structure of the MIME media typing - system and defines an initial set of media types. The third - document, RFC 2047, describes extensions to RFC 822 to allow non-US- - - - -Freed & Borenstein Standards Track [Page 1] - -RFC 2049 MIME Conformance November 1996 - - - ASCII text data in Internet mail header fields. 
The fourth document, - RFC 2048, specifies various IANA registration procedures for MIME- - related facilities. This fifth and final document describes MIME - conformance criteria as well as providing some illustrative examples - of MIME message formats, acknowledgements, and the bibliography. - - These documents are revisions of RFCs 1521, 1522, and 1590, which - themselves were revisions of RFCs 1341 and 1342. Appendix B of this - document describes differences and changes from previous versions. - -Table of Contents - - 1. Introduction .......................................... 2 - 2. MIME Conformance ...................................... 2 - 3. Guidelines for Sending Email Data ..................... 6 - 4. Canonical Encoding Model .............................. 9 - 5. Summary ............................................... 12 - 6. Security Considerations ............................... 12 - 7. Authors' Addresses .................................... 12 - 8. Acknowledgements ...................................... 13 - A. A Complex Multipart Example ........................... 15 - B. Changes from RFC 1521, 1522, and 1590 ................. 16 - C. References ............................................ 20 - -1. Introduction - - The first and second documents in this set define MIME header fields - and the initial set of MIME media types. The third document - describes extensions to RFC822 formats to allow for character sets - other than US-ASCII. This document describes what portions of MIME - must be supported by a conformant MIME implementation. It also - describes various pitfalls of contemporary messaging systems as well - as the canonical encoding model MIME is based on. - -2. MIME Conformance - - The mechanisms described in these documents are open-ended. It is - definitely not expected that all implementations will support all - available media types, nor that they will all share the same - extensions. 
In order to promote interoperability, however, it is - useful to define the concept of "MIME-conformance" to define a - certain level of implementation that allows the useful interworking - of messages with content that differs from US-ASCII text. In this - section, we specify the requirements for such conformance. - - - - - - - -Freed & Borenstein Standards Track [Page 2] - -RFC 2049 MIME Conformance November 1996 - - - A mail user agent that is MIME-conformant MUST: - - (1) Always generate a "MIME-Version: 1.0" header field in - any message it creates. - - (2) Recognize the Content-Transfer-Encoding header field - and decode all received data encoded by either quoted- - printable or base64 implementations. The identity - transformations 7bit, 8bit, and binary must also be - recognized. - - Any non-7bit data that is sent without encoding must be - properly labelled with a content-transfer-encoding of - 8bit or binary, as appropriate. If the underlying - transport does not support 8bit or binary (as SMTP - [RFC-821] does not), the sender is required to both - encode and label data using an appropriate Content- - Transfer-Encoding such as quoted-printable or base64. - - (3) Must treat any unrecognized Content-Transfer-Encoding - as if it had a Content-Type of "application/octet- - stream", regardless of whether or not the actual - Content-Type is recognized. - - (4) Recognize and interpret the Content-Type header field, - and avoid showing users raw data with a Content-Type - field other than text. Implementations must be able - to send at least text/plain messages, with the - character set specified with the charset parameter if - it is not US-ASCII. - - (5) Ignore any content type parameters whose names they do - not recognize. - - (6) Explicitly handle the following media type values, to - at least the following extents: - - Text: - - -- Recognize and display "text" mail with the - character set "US-ASCII." 
- - -- Recognize other character sets at least to the - extent of being able to inform the user about what - character set the message uses. - - - - - - -Freed & Borenstein Standards Track [Page 3] - -RFC 2049 MIME Conformance November 1996 - - - -- Recognize the "ISO-8859-*" character sets to the - extent of being able to display those characters that - are common to ISO-8859-* and US-ASCII, namely all - characters represented by octet values 1-127. - - -- For unrecognized subtypes in a known character - set, show or offer to show the user the "raw" version - of the data after conversion of the content from - canonical form to local form. - - -- Treat material in an unknown character set as if - it were "application/octet-stream". - - Image, audio, and video: - - -- At a minimum provide facilities to treat any - unrecognized subtypes as if they were - "application/octet-stream". - - Application: - - -- Offer the ability to remove either of the quoted- - printable or base64 encodings defined in this - document if they were used and put the resulting - information in a user file. - - Multipart: - - -- Recognize the mixed subtype. Display all relevant - information on the message level and the body part - header level and then display or offer to display - each of the body parts individually. - - -- Recognize the "alternative" subtype, and avoid - showing the user redundant parts of - multipart/alternative mail. - - -- Recognize the "multipart/digest" subtype, - specifically using "message/rfc822" rather than - "text/plain" as the default media type for body parts - inside "multipart/digest" entities. - - -- Treat any unrecognized subtypes as if they were - "mixed". 
- - - - - - - -Freed & Borenstein Standards Track [Page 4] - -RFC 2049 MIME Conformance November 1996 - - - Message: - - -- Recognize and display at least the RFC822 message - encapsulation (message/rfc822) in such a way as to - preserve any recursive structure, that is, displaying - or offering to display the encapsulated data in - accordance with its media type. - - -- Treat any unrecognized subtypes as if they were - "application/octet-stream". - - (7) Upon encountering any unrecognized Content-Type field, - an implementation must treat it as if it had a media - type of "application/octet-stream" with no parameter - sub-arguments. How such data are handled is up to an - implementation, but likely options for handling such - unrecognized data include offering the user to write it - into a file (decoded from its mail transport format) or - offering the user to name a program to which the - decoded data should be passed as input. - - (8) Conformant user agents are required, if they provide - non-standard support for non-MIME messages employing - character sets other than US-ASCII, to do so on - received messages only. Conforming user agents must not - send non-MIME messages containing anything other than - US-ASCII text. - - In particular, the use of non-US-ASCII text in mail - messages without a MIME-Version field is strongly - discouraged as it impedes interoperability when sending - messages between regions with different localization - conventions. Conforming user agents MUST include proper - MIME labelling when sending anything other than plain - text in the US-ASCII character set. - - In addition, non-MIME user agents should be upgraded if - at all possible to include appropriate MIME header - information in the messages they send even if nothing - else in MIME is supported. This upgrade will have - little, if any, effect on non-MIME recipients and will - aid MIME in correctly displaying such messages. 
It - also provides a smooth transition path to eventual - adoption of other MIME capabilities. - - (9) Conforming user agents must ensure that any string of - non-white-space printable US-ASCII characters within a - "*text" or "*ctext" that begins with "=?" and ends with - - - -Freed & Borenstein Standards Track [Page 5] - -RFC 2049 MIME Conformance November 1996 - - - "?=" be a valid encoded-word. ("begins" means: At the - start of the field-body or immediately following - linear-white-space; "ends" means: At the end of the - field-body or immediately preceding linear-white- - space.) In addition, any "word" within a "phrase" that - begins with "=?" and ends with "?=" must be a valid - encoded-word. - - (10) Conforming user agents must be able to distinguish - encoded-words from "text", "ctext", or "word"s, - according to the rules in section 4, anytime they - appear in appropriate places in message headers. It - must support both the "B" and "Q" encodings for any - character set which it supports. The program must be - able to display the unencoded text if the character set - is "US-ASCII". For the ISO-8859-* character sets, the - mail reading program must at least be able to display - the characters which are also in the US-ASCII set. - - A user agent that meets the above conditions is said to be MIME- - conformant. The meaning of this phrase is that it is assumed to be - "safe" to send virtually any kind of properly-marked data to users of - such mail systems, because such systems will at least be able to - treat the data as undifferentiated binary, and will not simply splash - it onto the screen of unsuspecting users. - - There is another sense in which it is always "safe" to send data in a - format that is MIME-conformant, which is that such data will not - break or be broken by any known systems that are conformant with RFC - 821 and RFC 822. 
User agents that are MIME-conformant have the - additional guarantee that the user will not be shown data that were - never intended to be viewed as text. - -3. Guidelines for Sending Email Data - - Internet email is not a perfect, homogeneous system. Mail may become - corrupted at several stages in its travel to a final destination. - Specifically, email sent throughout the Internet may travel across - many networking technologies. Many networking and mail technologies - do not support the full functionality possible in the SMTP transport - environment. Mail traversing these systems is likely to be modified - in order that it can be transported. - - There exist many widely-deployed non-conformant MTAs in the Internet. - These MTAs, speaking the SMTP protocol, alter messages on the fly to - take advantage of the internal data structure of the hosts they are - implemented on, or are just plain broken. - - - - -Freed & Borenstein Standards Track [Page 6] - -RFC 2049 MIME Conformance November 1996 - - - The following guidelines may be useful to anyone devising a data - format (media type) that is supposed to survive the widest range of - networking technologies and known broken MTAs unscathed. Note that - anything encoded in the base64 encoding will satisfy these rules, but - that some well-known mechanisms, notably the UNIX uuencode facility, - will not. Note also that anything encoded in the Quoted-Printable - encoding will survive most gateways intact, but possibly not some - gateways to systems that use the EBCDIC character set. - - (1) Under some circumstances the encoding used for data may - change as part of normal gateway or user agent - operation. In particular, conversion from base64 to - quoted-printable and vice versa may be necessary. This - may result in the confusion of CRLF sequences with line - breaks in text bodies. As such, the persistence of - CRLF as something other than a line break must not be - relied on. 
- - (2) Many systems may elect to represent and store text data - using local newline conventions. Local newline - conventions may not match the RFC822 CRLF convention -- - systems are known that use plain CR, plain LF, CRLF, or - counted records. The result is that isolated CR and LF - characters are not well tolerated in general; they may - be lost or converted to delimiters on some systems, and - hence must not be relied on. - - (3) The transmission of NULs (US-ASCII value 0) is - problematic in Internet mail. (This is largely the - result of NULs being used as a termination character by - many of the standard runtime library routines in the C - programming language.) The practice of using NULs as - termination characters is so entrenched now that - messages should not rely on them being preserved. - - (4) TAB (HT) characters may be misinterpreted or may be - automatically converted to variable numbers of spaces. - This is unavoidable in some environments, notably those - not based on the US-ASCII character set. Such - conversion is STRONGLY DISCOURAGED, but it may occur, - and mail formats must not rely on the persistence of - TAB (HT) characters. - - (5) Lines longer than 76 characters may be wrapped or - truncated in some environments. Line wrapping or line - truncation imposed by mail transports is STRONGLY - DISCOURAGED, but unavoidable in some cases. - Applications which require long lines must somehow - - - -Freed & Borenstein Standards Track [Page 7] - -RFC 2049 MIME Conformance November 1996 - - - differentiate between soft and hard line breaks. (A - simple way to do this is to use the quoted-printable - encoding.) - - (6) Trailing "white space" characters (SPACE, TAB (HT)) on - a line may be discarded by some transport agents, while - other transport agents may pad lines with these - characters so that all lines in a mail file are of - equal length. The persistence of trailing white space, - therefore, must not be relied on. 
- - (7) Many mail domains use variations on the US-ASCII - character set, or use character sets such as EBCDIC - which contain most but not all of the US-ASCII - characters. The correct translation of characters not - in the "invariant" set cannot be depended on across - character converting gateways. For example, this - situation is a problem when sending uuencoded - information across BITNET, an EBCDIC system. Similar - problems can occur without crossing a gateway, since - many Internet hosts use character sets other than US- - ASCII internally. The definition of Printable Strings - in X.400 adds further restrictions in certain special - cases. In particular, the only characters that are - known to be consistent across all gateways are the 73 - characters that correspond to the upper and lower case - letters A-Z and a-z, the 10 digits 0-9, and the - following eleven special characters: - - "'" (US-ASCII decimal value 39) - "(" (US-ASCII decimal value 40) - ")" (US-ASCII decimal value 41) - "+" (US-ASCII decimal value 43) - "," (US-ASCII decimal value 44) - "-" (US-ASCII decimal value 45) - "." (US-ASCII decimal value 46) - "/" (US-ASCII decimal value 47) - ":" (US-ASCII decimal value 58) - "=" (US-ASCII decimal value 61) - "?" (US-ASCII decimal value 63) - - A maximally portable mail representation will confine - itself to relatively short lines of text in which the - only meaningful characters are taken from this set of - 73 characters. The base64 encoding follows this rule. - - (8) Some mail transport agents will corrupt data that - includes certain literal strings. In particular, a - - - -Freed & Borenstein Standards Track [Page 8] - -RFC 2049 MIME Conformance November 1996 - - - period (".") alone on a line is known to be corrupted - by some (incorrect) SMTP implementations, and a line - that starts with the five characters "From " (the fifth - character is a SPACE) is commonly corrupted as well. 
- A careful composition agent can prevent these - corruptions by encoding the data (e.g., in the quoted- - printable encoding using "=46rom " in place of "From " - at the start of a line, and "=2E" in place of "." alone - on a line). - - Please note that the above list is NOT a list of recommended - practices for MTAs. RFC 821 MTAs are prohibited from altering the - character of white space or wrapping long lines. These BAD and - invalid practices are known to occur on established networks, and - implementations should be robust in dealing with the bad effects they - can cause. - -4. Canonical Encoding Model - - There was some confusion, in earlier versions of these documents, - regarding the model for when email data was to be converted to - canonical form and encoded, and in particular how this process would - affect the treatment of CRLFs, given that the representation of - newlines varies greatly from system to system. For this reason, a - canonical model for encoding is presented below. - - The process of composing a MIME entity can be modeled as being done - in a number of steps. Note that these steps are roughly similar to - those steps used in PEM [RFC-1421] and are performed for each - "innermost level" body: - - (1) Creation of local form. - - The body to be transmitted is created in the system's - native format. The native character set is used and, - where appropriate, local end of line conventions are - used as well. The body may be a UNIX-style text file, - or a Sun raster image, or a VMS indexed file, or audio - data in a system-dependent format stored only in - memory, or anything else that corresponds to the local - model for the representation of some form of - information. Fundamentally, the data is created in the - "native" form that corresponds to the type specified by - the media type. - - - - - - - -Freed & Borenstein Standards Track [Page 9] - -RFC 2049 MIME Conformance November 1996 - - - (2) Conversion to canonical form. 
- - The entire body, including "out-of-band" information - such as record lengths and possibly file attribute - information, is converted to a universal canonical - form. The specific media type of the body as well as - its associated attributes dictate the nature of the - canonical form that is used. Conversion to the proper - canonical form may involve character set conversion, - transformation of audio data, compression, or various - other operations specific to the various media types. - If character set conversion is involved, however, care - must be taken to understand the semantics of the media - type, which may have strong implications for any - character set conversion, e.g. with regard to - syntactically meaningful characters in a text subtype - other than "plain". - - For example, in the case of text/plain data, the text - must be converted to a supported character set and - lines must be delimited with CRLF delimiters in - accordance with RFC 822. Note that the restriction on - line lengths implied by RFC 822 is eliminated if the - next step employs either quoted-printable or base64 - encoding. - - (3) Apply transfer encoding. - - A Content-Transfer-Encoding appropriate for this body - is applied. Note that there is no fixed relationship - between the media type and the transfer encoding. In - particular, it may be appropriate to base the choice of - base64 or quoted-printable on character frequency - counts which are specific to a given instance of a - body. - - (4) Insertion into entity. - - The encoded body is inserted into a MIME entity with - appropriate headers. The entity is then inserted into - the body of a higher-level entity (message or - multipart) as needed. - - Conversion from entity form to local form is accomplished by - reversing these steps. Note that reversal of these steps may produce - differing results since there is no guarantee that the original and - final local forms are the same. 
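The four steps of the canonical encoding model can be sketched for a text/plain body as follows. This is only an illustration of the model, not a blueprint for a real mailer: the function names (`to_canonical_text`, `apply_base64`, `build_entity`, `decode_body`) are hypothetical, the local form of step (1) is assumed to be a Python string, and base64 is assumed as the transfer encoding.

```python
import base64


def to_canonical_text(local_text, charset):
    """Step 2: normalize any local newline convention to CRLF, then encode."""
    unified = local_text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n").encode(charset)


def apply_base64(canonical):
    """Step 3: base64 transfer encoding with CRLF-terminated output lines."""
    # encodebytes() wraps at 76 characters and ends each line with "\n".
    return base64.encodebytes(canonical).decode("ascii").replace("\n", "\r\n")


def build_entity(local_text, charset="iso-8859-1"):
    """Step 4: place the encoded body under matching header fields."""
    body = apply_base64(to_canonical_text(local_text, charset))
    return (
        f"Content-Type: text/plain; charset={charset}\r\n"
        "Content-Transfer-Encoding: base64\r\n"
        "\r\n" + body
    )


def decode_body(encoded_body, charset):
    """Reverse of steps 3 and 2; the result keeps CRLF, not the local form."""
    return base64.b64decode(encoded_body).decode(charset)
```

Decoding returns the canonical CRLF text rather than the sender's local newlines, which matches the remark that the original and final local forms need not be the same.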
- - - - -Freed & Borenstein Standards Track [Page 10] - -RFC 2049 MIME Conformance November 1996 - - - It is vital to note that these steps are only a model; they are - specifically NOT a blueprint for how an actual system would be built. - In particular, the model fails to account for two common designs: - - (1) In many cases the conversion to a canonical form prior - to encoding will be subsumed into the encoder itself, - which understands local formats directly. For example, - the local newline convention for text bodies might be - carried through to the encoder itself along with - knowledge of what that format is. - - (2) The output of the encoders may have to pass through one - or more additional steps prior to being transmitted as - a message. As such, the output of the encoder may not - be conformant with the formats specified by RFC 822. - In particular, once again it may be appropriate for the - converter's output to be expressed using local newline - conventions rather than using the standard RFC 822 CRLF - delimiters. - - Other implementation variations are conceivable as well. The vital - aspect of this discussion is that, in spite of any optimizations, - collapsings of required steps, or insertion of additional processing, - the resulting messages must be consistent with those produced by the - model described here. For example, a message with the following - header fields: - - Content-type: text/foo; charset=bar - Content-Transfer-Encoding: base64 - - must be first represented in the text/foo form, then (if necessary) - represented in the "bar" character set, and finally transformed via - the base64 algorithm into a mail-safe form. - - NOTE: Some confusion has been caused by systems that represent - messages in a format which uses local newline conventions which - differ from the RFC822 CRLF convention. It is important to note that - these formats are not canonical RFC822/MIME. 
These formats are - instead *encodings* of RFC822, where CRLF sequences in the canonical - representation of the message are encoded as the local newline - convention. Note that formats which encode CRLF sequences as, for - example, LF are not capable of representing MIME messages containing - binary data which contains LF octets not part of CRLF line separation - sequences. - - - - - - - -Freed & Borenstein Standards Track [Page 11] - -RFC 2049 MIME Conformance November 1996 - - -5. Summary - - This document defines what is meant by MIME Conformance. It also - details various problems known to exist in the Internet email system - and how to use MIME to overcome them. Finally, it describes MIME's - canonical encoding model. - -6. Security Considerations - - Security issues are discussed in the second document in this set, RFC - 2046. - -7. Authors' Addresses - - For more information, the authors of this document are best contacted - via Internet mail: - - Ned Freed - Innosoft International, Inc. - 1050 East Garvey Avenue South - West Covina, CA 91790 - USA - - Phone: +1 818 919 3600 - Fax: +1 818 919 3614 - EMail: ned@innosoft.com - - Nathaniel S. Borenstein - First Virtual Holdings - 25 Washington Avenue - Morristown, NJ 07960 - USA - - Phone: +1 201 540 8967 - Fax: +1 201 993 3032 - EMail: nsb@nsb.fv.com - - MIME is a result of the work of the Internet Engineering Task Force - Working Group on RFC 822 Extensions. The chairman of that group, - Greg Vaudreuil, may be reached at: - - Gregory M. Vaudreuil - Octel Network Services - 17080 Dallas Parkway - Dallas, TX 75248-1905 - USA - - EMail: Greg.Vaudreuil@Octel.Com - - - -Freed & Borenstein Standards Track [Page 12] - -RFC 2049 MIME Conformance November 1996 - - -8. Acknowledgements - - This document is the result of the collective effort of a large - number of people, at several IETF meetings, on the IETF-SMTP and - IETF-822 mailing lists, and elsewhere. 
Although any enumeration - seems doomed to suffer from egregious omissions, the following are - among the many contributors to this effort: - - Harald Tveit Alvestrand Marc Andreessen - Randall Atkinson Bob Braden - Philippe Brandon Brian Capouch - Kevin Carosso Uhhyung Choi - Peter Clitherow Dave Collier-Brown - Cristian Constantinof John Coonrod - Mark Crispin Dave Crocker - Stephen Crocker Terry Crowley - Walt Daniels Jim Davis - Frank Dawson Axel Deininger - Hitoshi Doi Kevin Donnelly - Steve Dorner Keith Edwards - Chris Eich Dana S. Emery - Johnny Eriksson Craig Everhart - Patrik Faltstrom Erik E. Fair - Roger Fajman Alain Fontaine - Martin Forssen James M. Galvin - Stephen Gildea Philip Gladstone - Thomas Gordon Keld Simonsen - Terry Gray Phill Gross - James Hamilton David Herron - Mark Horton Bruce Howard - Bill Janssen Olle Jarnefors - Risto Kankkunen Phil Karn - Alan Katz Tim Kehres - Neil Katin Steve Kille - Kyuho Kim Anders Klemets - John Klensin Valdis Kletniek - Jim Knowles Stev Knowles - Bob Kummerfeld Pekka Kytolaakso - Stellan Lagerstrom Vincent Lau - Timo Lehtinen Donald Lindsay - Warner Losh Carlyn Lowery - Laurence Lundblade Charles Lynn - John R. MacMillan Larry Masinter - Rick McGowan Michael J. McInerny - Leo Mclaughlin Goli Montaser-Kohsari - Tom Moore John Gardiner Myers - Erik Naggum Mark Needleman - Chris Newman John Noerenberg - - - -Freed & Borenstein Standards Track [Page 13] - -RFC 2049 MIME Conformance November 1996 - - - Mats Ohrman Julian Onions - Michael Patton David J. Pepper - Erik van der Poel Blake C. Ramsdell - Christer Romson Luc Rooijakkers - Marshall T. 
Rose Jonathan Rosenberg - Guido van Rossum Jan Rynning - Harri Salminen Michael Sanderson - Yutaka Sato Markku Savela - Richard Alan Schafer Masahiro Sekiguchi - Mark Sherman Bob Smart - Peter Speck Henry Spencer - Einar Stefferud Michael Stein - Klaus Steinberger Peter Svanberg - James Thompson Steve Uhler - Stuart Vance Peter Vanderbilt - Greg Vaudreuil Ed Vielmetti - Larry W. Virden Ryan Waldron - Rhys Weatherly Jay Weber - Dave Wecker Wally Wedel - Sven-Ove Westberg Brian Wideen - John Wobus Glenn Wright - Rayan Zachariassen David Zimmerman - - The authors apologize for any omissions from this list, which are - certainly unintentional. - - - - - - - - - - - - - - - - - - - - - - - - - - -Freed & Borenstein Standards Track [Page 14] - -RFC 2049 MIME Conformance November 1996 - - -Appendix A -- A Complex Multipart Example - - What follows is the outline of a complex multipart message. This - message contains five parts that are to be displayed serially: two - introductory plain text objects, an embedded multipart message, a - text/enriched object, and a closing encapsulated text message in a - non-ASCII character set. The embedded multipart message itself - contains two objects to be displayed in parallel, a picture and an - audio fragment. - - MIME-Version: 1.0 - From: Nathaniel Borenstein <nsb@nsb.fv.com> - To: Ned Freed <ned@innosoft.com> - Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT) - Subject: A multipart example - Content-Type: multipart/mixed; - boundary=unique-boundary-1 - - This is the preamble area of a multipart message. - Mail readers that understand multipart format - should ignore this preamble. - - If you are reading this text, you might want to - consider changing to a mail reader that understands - how to properly display multipart messages. - - --unique-boundary-1 - - ... Some text appears here ... 
- - [Note that the blank between the boundary and the start - of the text in this part means no header fields were - given and this is text in the US-ASCII character set. - It could have been done with explicit typing as in the - next part.] - - --unique-boundary-1 - Content-type: text/plain; charset=US-ASCII - - This could have been part of the previous part, but - illustrates explicit versus implicit typing of body - parts. - - --unique-boundary-1 - Content-Type: multipart/parallel; boundary=unique-boundary-2 - - --unique-boundary-2 - Content-Type: audio/basic - - - -Freed & Borenstein Standards Track [Page 15] - -RFC 2049 MIME Conformance November 1996 - - - Content-Transfer-Encoding: base64 - - ... base64-encoded 8000 Hz single-channel - mu-law-format audio data goes here ... - - --unique-boundary-2 - Content-Type: image/jpeg - Content-Transfer-Encoding: base64 - - ... base64-encoded image data goes here ... - - --unique-boundary-2-- - - --unique-boundary-1 - Content-type: text/enriched - - This is <bold><italic>enriched.</italic></bold> - <smaller>as defined in RFC 1896</smaller> - - Isn't it - <bigger><bigger>cool?</bigger></bigger> - - --unique-boundary-1 - Content-Type: message/rfc822 - - From: (mailbox in US-ASCII) - To: (address in US-ASCII) - Subject: (subject in US-ASCII) - Content-Type: Text/plain; charset=ISO-8859-1 - Content-Transfer-Encoding: Quoted-printable - - ... Additional text in ISO-8859-1 goes here ... - - --unique-boundary-1-- - -Appendix B -- Changes from RFC 1521, 1522, and 1590 - - These documents are a revision of RFC 1521, 1522, and 1590. For the - convenience of those familiar with the earlier documents, the changes - from those documents are summarized in this appendix. For further - history, note that Appendix H in RFC 1521 specified how that document - differed from its predecessor, RFC 1341. - - (1) This document has been completely reformatted and split - into multiple documents. 
This was done to improve the - quality of the plain text version of this document, - which is required to be the reference copy. - - - - -Freed & Borenstein Standards Track [Page 16] - -RFC 2049 MIME Conformance November 1996 - - - (2) BNF describing the overall structure of MIME object - headers has been added. This is a documentation change - only -- the underlying syntax has not changed in any - way. - - (3) The specific BNF for the seven media types in MIME has - been removed. This BNF was incorrect, incomplete, and - inconsistent with the type-independent BNF. And - since the type-independent BNF already fully specifies - the syntax of the various MIME headers, the type- - specific BNF was, in the final analysis, completely - unnecessary and caused more problems than it solved. - - (4) The more specific "US-ASCII" character set name has - replaced the use of the informal term ASCII in many - parts of these documents. - - (5) The informal concept of a primary subtype has been - removed. - - (6) The term "object" was being used inconsistently. The - definition of this term has been clarified, along with - the related terms "body", "body part", and "entity", - and usage has been corrected where appropriate. - - (7) The BNF for the multipart media type has been - rearranged to make it clear that the CRLF preceding - the boundary marker is actually part of the marker - itself rather than the preceding body part. - - (8) The prose and BNF describing the multipart media type - have been changed to make it clear that the body parts - within a multipart object MUST NOT contain any lines - beginning with the boundary parameter string. - - (9) In the rules on reassembling "message/partial" MIME - entities, "Subject" is added to the list of headers to - take from the inner message, and the example is - modified to clarify this point. - - (10) "Message/partial" fragmenters are restricted to - splitting MIME objects only at line boundaries.
- - (11) In the discussion of the application/postscript type, - an additional paragraph has been added warning about - possible interoperability problems caused by embedding - of binary data inside a PostScript MIME entity. - - - - -Freed & Borenstein Standards Track [Page 17] - -RFC 2049 MIME Conformance November 1996 - - - (12) Added a clarifying note to the basic syntax rules for - the Content-Type header field to make it clear that the - following two forms: - - Content-type: text/plain; charset=us-ascii (comment) - - Content-type: text/plain; charset="us-ascii" - - are completely equivalent. - - (13) The following sentence has been removed from the - discussion of the MIME-Version header: "However, - conformant software is encouraged to check the version - number and at least warn the user if an unrecognized - MIME-version is encountered." - - (14) A typo was fixed that said "application/external-body" - instead of "message/external-body". - - (15) The definition of a character set has been reorganized - to make the requirements clearer. - - (16) The definition of the "image/gif" media type has been - moved to a separate document. This change was made - because of potential conflicts with IETF rules - governing the standardization of patented technology. - - (17) The definitions of "7bit" and "8bit" have been - tightened so that use of bare CR, LF can only be used - as end-of-line sequences. The document also no longer - requires that NUL characters be preserved, which brings - MIME into alignment with real-world implementations. - - (18) The definition of canonical text in MIME has been - tightened so that line breaks must be represented by a - CRLF sequence. CR and LF characters are not allowed - outside of this usage. The definition of quoted- - printable encoding has been altered accordingly. 
- - (19) The definition of the quoted-printable encoding now - includes a number of suggestions for how quoted- - printable encoders might best handle improperly encoded - material. - - (20) Prose was added to clarify the use of the "7bit", - "8bit", and "binary" transfer-encodings on multipart or - message entities encapsulating "8bit" or "binary" data. - - - - -Freed & Borenstein Standards Track [Page 18] - -RFC 2049 MIME Conformance November 1996 - - - (21) In the section on MIME Conformance, "multipart/digest" - support was added to the list of requirements for - minimal MIME conformance. Also, the requirements for - "message/rfc822" support were strengthened to clarify - the importance of recognizing recursive structure. - - (22) The various restrictions on subtypes of "message" are - now specified entirely on a subtype by subtype basis. - - (23) The definition of "message/rfc822" was changed to - indicate that at least one of the "From", "Subject", or - "Date" headers must be present. - - (24) The required handling of unrecognized subtypes as - "application/octet-stream" has been made more explicit - in both the type definitions sections and the - conformance guidelines. - - (25) Examples using text/richtext were changed to - text/enriched. - - (26) The BNF definition of subtype has been changed to make - it clear that either an IANA registered subtype or a - nonstandard "X-" subtype must be used in a Content-Type - header field. - - (27) MIME media types that are simply registered for use and - those that are standardized by the IETF are now - distinguished in the MIME BNF. - - (28) All of the various MIME registration procedures have - been extensively revised. IANA registration procedures - for character sets have been moved to a separate - document that is not included in this set of documents.
- - (29) The use of escape and shift mechanisms in the US-ASCII - and ISO-8859-X character sets these documents define - have been clarified: Such mechanisms should never be - used in conjunction with these character sets and their - effect if they are used is undefined. - - (30) The definition of the AFS access-type for - message/external-body has been removed. - - (31) The handling of the combination of - multipart/alternative and message/external-body is now - specifically addressed. - - - - -Freed & Borenstein Standards Track [Page 19] - -RFC 2049 MIME Conformance November 1996 - - - (32) Security issues specific to message/external-body are - now discussed in some detail. - -Appendix C -- References - - [ATK] - Borenstein, Nathaniel S., Multimedia Applications - Development with the Andrew Toolkit, Prentice-Hall, 1990. - - [ISO-2022] - International Standard -- Information Processing -- - Character Code Structure and Extension Techniques, - ISO/IEC 2022:1994, 4th ed. - - [ISO-8859] - International Standard -- Information Processing -- 8-bit - Single-Byte Coded Graphic Character Sets - - Part 1: Latin Alphabet No. 1, ISO 8859-1:1987, 1st ed. - - Part 2: Latin Alphabet No. 2, ISO 8859-2:1987, 1st ed. - - Part 3: Latin Alphabet No. 3, ISO 8859-3:1988, 1st ed. - - Part 4: Latin Alphabet No. 4, ISO 8859-4:1988, 1st ed. - - Part 5: Latin/Cyrillic Alphabet, ISO 8859-5:1988, 1st - ed. - - Part 6: Latin/Arabic Alphabet, ISO 8859-6:1987, 1st ed. - - Part 7: Latin/Greek Alphabet, ISO 8859-7:1987, 1st ed. - - Part 8: Latin/Hebrew Alphabet, ISO 8859-8:1988, 1st ed. - - Part 9: Latin Alphabet No. 5, ISO/IEC 8859-9:1989, 1st - ed. - International Standard -- Information Technology -- 8-bit - Single-Byte Coded Graphic Character Sets - - Part 10: Latin Alphabet No. 6, ISO/IEC 8859-10:1992, - 1st ed. - - [ISO-646] - International Standard -- Information Technology -- ISO - 7-bit Coded Character Set for Information Interchange, - ISO 646:1991, 3rd ed.. 
- - [JPEG] - JPEG Draft Standard ISO 10918-1 CD. - - [MPEG] - Video Coding Draft Standard ISO 11172 CD, ISO - IEC/JTC1/SC2/WG11 (Motion Picture Experts Group), May, - 1991. - - - - - - -Freed & Borenstein Standards Track [Page 20] - -RFC 2049 MIME Conformance November 1996 - - - [PCM] - CCITT, Fascicle III.4 - Recommendation G.711, "Pulse Code - Modulation (PCM) of Voice Frequencies", Geneva, 1972. - - [POSTSCRIPT] - Adobe Systems, Inc., PostScript Language Reference - Manual, Addison-Wesley, 1985. - - [POSTSCRIPT2] - Adobe Systems, Inc., PostScript Language Reference - Manual, Addison-Wesley, Second Ed., 1990. - - [RFC-783] - Sollins, K.R., "TFTP Protocol (revision 2)", RFC-783, - MIT, June 1981. - - [RFC-821] - Postel, J.B., "Simple Mail Transfer Protocol", STD 10, - RFC 821, USC/Information Sciences Institute, August 1982. - - [RFC-822] - Crocker, D., "Standard for the Format of ARPA Internet - Text Messages", STD 11, RFC 822, UDEL, August 1982. - - [RFC-934] - Rose, M. and E. Stefferud, "Proposed Standard for Message - Encapsulation", RFC 934, Delaware and NMA, January 1985. - - [RFC-959] - Postel, J. and J. Reynolds, "File Transfer Protocol", STD - 9, RFC 959, USC/Information Sciences Institute, October - 1985. - - [RFC-1049] - Sirbu, M., "Content-Type Header Field for Internet - Messages", RFC 1049, CMU, March 1988. - - [RFC-1154] - Robinson, D., and R. Ullmann, "Encoding Header Field for - Internet Messages", RFC 1154, Prime Computer, Inc., April - 1990. - - [RFC-1341] - Borenstein, N., and N. Freed, "MIME (Multipurpose - Internet Mail Extensions): Mechanisms for Specifying and - Describing the Format of Internet Message Bodies", RFC - 1341, Bellcore, Innosoft, June 1992. - - - - -Freed & Borenstein Standards Track [Page 21] - -RFC 2049 MIME Conformance November 1996 - - - [RFC-1342] - Moore, K., "Representation of Non-Ascii Text in Internet - Message Headers", RFC 1342, University of Tennessee, June - 1992. 
- - [RFC-1344] - Borenstein, N., "Implications of MIME for Internet Mail - Gateways", RFC 1344, Bellcore, June 1992. - - [RFC-1345] - Simonsen, K., "Character Mnemonics & Character Sets", RFC - 1345, Rationel Almen Planlaegning, June 1992. - - [RFC-1421] - Linn, J., "Privacy Enhancement for Internet Electronic - Mail: Part I -- Message Encryption and Authentication - Procedures", RFC 1421, IAB IRTF PSRG, IETF PEM WG, - February 1993. - - [RFC-1422] - Kent, S., "Privacy Enhancement for Internet Electronic - Mail: Part II -- Certificate-Based Key Management", RFC - 1422, IAB IRTF PSRG, IETF PEM WG, February 1993. - - [RFC-1423] - Balenson, D., "Privacy Enhancement for Internet - Electronic Mail: Part III -- Algorithms, Modes, and - Identifiers", IAB IRTF PSRG, IETF PEM WG, February 1993. - - [RFC-1424] - Kaliski, B., "Privacy Enhancement for Internet Electronic - Mail: Part IV -- Key Certification and Related - Services", IAB IRTF PSRG, IETF PEM WG, February 1993. - - [RFC-1521] - Borenstein, N., and Freed, N., "MIME (Multipurpose - Internet Mail Extensions): Mechanisms for Specifying and - Describing the Format of Internet Message Bodies", RFC - 1521, Bellcore, Innosoft, September, 1993. - - [RFC-1522] - Moore, K., "Representation of Non-ASCII Text in Internet - Message Headers", RFC 1522, University of Tennessee, - September 1993. - - - - - - - -Freed & Borenstein Standards Track [Page 22] - -RFC 2049 MIME Conformance November 1996 - - - [RFC-1524] - Borenstein, N., "A User Agent Configuration Mechanism for - Multimedia Mail Format Information", RFC 1524, Bellcore, - September 1993. - - [RFC-1543] - Postel, J., "Instructions to RFC Authors", RFC 1543, - USC/Information Sciences Institute, October 1993. - - [RFC-1556] - Nussbacher, H., "Handling of Bi-directional Texts in - MIME", RFC 1556, Israeli Inter-University Computer - Center, December 1993. 
- - [RFC-1590] - Postel, J., "Media Type Registration Procedure", RFC - 1590, USC/Information Sciences Institute, March 1994. - - [RFC-1602] - Internet Architecture Board, Internet Engineering - Steering Group, Huitema, C., Gross, P., "The Internet - Standards Process -- Revision 2", March 1994. - - [RFC-1652] - Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M., - Stefferud, E., and Crocker, D., "SMTP Service Extension - for 8bit-MIME transport", RFC 1652, United Nations - University, Innosoft, Dover Beach Consulting, Inc., - Network Management Associates, Inc., The Branch Office, - March 1994. - - [RFC-1700] - Reynolds, J. and J. Postel, "Assigned Numbers", STD 2, - RFC 1700, USC/Information Sciences Institute, October - 1994. - - [RFC-1741] - Faltstrom, P., Crocker, D., and Fair, E., "MIME Content - Type for BinHex Encoded Files", December 1994. - - [RFC-1896] - Resnick, P., and A. Walker, "The text/enriched MIME - Content-type", RFC 1896, February, 1996. - - - - - - - - -Freed & Borenstein Standards Track [Page 23] - -RFC 2049 MIME Conformance November 1996 - - - [RFC-2045] - Freed, N., and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part One: Format of Internet Message - Bodies", RFC 2045, Innosoft, First Virtual Holdings, - November 1996. - - [RFC-2046] - Freed, N., and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Two: Media Types", RFC 2046, - Innosoft, First Virtual Holdings, November 1996. - - [RFC-2047] - Moore, K., "Multipurpose Internet Mail Extensions (MIME) - Part Three: Representation of Non-ASCII Text in Internet - Message Headers", RFC 2047, University of - Tennessee, November 1996. - - [RFC-2048] - Freed, N., Klensin, J., and J. Postel, "Multipurpose - Internet Mail Extensions (MIME) Part Four: MIME - Registration Procedures", RFC 2048, Innosoft, MCI, - ISI, November 1996. - - [RFC-2049] - Freed, N. and N.
Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Five: Conformance Criteria and - Examples", RFC 2049 (this document), Innosoft, First - Virtual Holdings, November 1996. - - [US-ASCII] - Coded Character Set -- 7-Bit American Standard Code for - Information Interchange, ANSI X3.4-1986. - - [X400] - Schicker, Pietro, "Message Handling Systems, X.400", - Message Handling Systems and Distributed Applications, E. - Stefferud, O-j. Jacobsen, and P. Schicker, eds., North- - Holland, 1989, pp. 3-41. - - - - - - - - - - - - - -Freed & Borenstein Standards Track [Page 24] - diff --git a/proto/rfc2183.txt b/proto/rfc2183.txt @@ -1,675 +0,0 @@ - - - - - - -Network Working Group R. Troost -Request for Comments: 2183 New Century Systems -Updates: 1806 S. Dorner -Category: Standards Track QUALCOMM Incorporated - K. Moore, Editor - University of Tennessee - August 1997 - - - Communicating Presentation Information in - Internet Messages: - The Content-Disposition Header Field - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Abstract - - This memo provides a mechanism whereby messages conforming to the - MIME specifications [RFC 2045, RFC 2046, RFC 2047, RFC 2048, RFC - 2049] can convey presentational information. It specifies the - "Content-Disposition" header field, which is optional and valid for - any MIME entity ("message" or "body part"). Two values for this - header field are described in this memo; one for the ordinary linear - presentation of the body part, and another to facilitate the use of - mail to transfer files. 
It is expected that more values will be - defined in the future, and procedures are defined for extending this - set of values. - - This document is intended as an extension to MIME. As such, the - reader is assumed to be familiar with the MIME specifications, and - [RFC 822]. The information presented herein supplements but does not - replace that found in those documents. - - This document is a revision to the Experimental protocol defined in - RFC 1806. As compared to RFC 1806, this document contains minor - editorial updates, adds new parameters needed to support the File - Transfer Body Part, and references a separate specification for the - handling of non-ASCII and/or very long parameter values. - - - - - - - -Troost, et. al. Standards Track [Page 1] - -RFC 2183 Content-Disposition August 1997 - - -1. Introduction - - MIME specifies a standard format for encapsulating multiple pieces of - data into a single Internet message. That document does not address - the issue of presentation styles; it provides a framework for the - interchange of message content, but leaves presentation issues solely - in the hands of mail user agent (MUA) implementors. - - Two common ways of presenting multipart electronic messages are as a - main document with a list of separate attachments, and as a single - document with the various parts expanded (displayed) inline. The - display of an attachment is generally construed to require positive - action on the part of the recipient, while inline message components - are displayed automatically when the message is viewed. A mechanism - is needed to allow the sender to transmit this sort of presentational - information to the recipient; the Content-Disposition header provides - this mechanism, allowing each component of a message to be tagged - with an indication of its desired presentation semantics. - - Tagging messages in this manner will often be sufficient for basic - message formatting. 
However, in many cases a more powerful and - flexible approach will be necessary. The definition of such - approaches is beyond the scope of this memo; however, such approaches - can benefit from additional Content-Disposition values and - parameters, to be defined at a later date. - - In addition to allowing the sender to specify the presentational - disposition of a message component, it is desirable to allow her to - indicate a default archival disposition; a filename. The optional - "filename" parameter provides for this. Further, the creation-date, - modification-date, and read-date parameters allow preservation of - those file attributes when the file is transmitted over MIME email. - - NB: The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, - SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this - document, are to be interpreted as described in [RFC 2119]. - -2. The Content-Disposition Header Field - - Content-Disposition is an optional header field. In its absence, the - MUA may use whatever presentation method it deems suitable. - - It is desirable to keep the set of possible disposition types small - and well defined, to avoid needless complexity. Even so, evolving - usage will likely require the definition of additional disposition - types or parameters, so the set of disposition values is extensible; - see below. - - - - -Troost, et. al. 
Standards Track [Page 2] - -RFC 2183 Content-Disposition August 1997 - - - In the extended BNF notation of [RFC 822], the Content-Disposition - header field is defined as follows: - - disposition := "Content-Disposition" ":" - disposition-type - *(";" disposition-parm) - - disposition-type := "inline" - / "attachment" - / extension-token - ; values are not case-sensitive - - disposition-parm := filename-parm - / creation-date-parm - / modification-date-parm - / read-date-parm - / size-parm - / parameter - - filename-parm := "filename" "=" value - - creation-date-parm := "creation-date" "=" quoted-date-time - - modification-date-parm := "modification-date" "=" quoted-date-time - - read-date-parm := "read-date" "=" quoted-date-time - - size-parm := "size" "=" 1*DIGIT - - quoted-date-time := quoted-string - ; contents MUST be an RFC 822 `date-time' - ; numeric timezones (+HHMM or -HHMM) MUST be used - - - - NOTE ON PARAMETER VALUE LENGTHS: A short (length <= 78 characters) - parameter value containing only non-`tspecials' characters SHOULD be - represented as a single `token'. A short parameter value containing - only ASCII characters, but including `tspecials' characters, SHOULD - be represented as `quoted-string'. Parameter values longer than 78 - characters, or which contain non-ASCII characters, MUST be encoded as - specified in [RFC 2184]. - - `Extension-token', `parameter', `tspecials' and `value' are defined - according to [RFC 2045] (which references [RFC 822] in the definition - of some of these tokens). `quoted-string' and `DIGIT' are defined in - [RFC 822]. - - - - -Troost, et. al. Standards Track [Page 3] - -RFC 2183 Content-Disposition August 1997 - - -2.1 The Inline Disposition Type - - A bodypart should be marked `inline' if it is intended to be - displayed automatically upon display of the message. Inline - bodyparts should be presented in the order in which they occur, - subject to the normal semantics of multipart messages.
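The note on parameter value lengths in Section 2 amounts to a small decision rule for the sender: emit a `token' when possible, a `quoted-string' when `tspecials' are present, and fall back to the RFC 2184 encoding otherwise. A minimal sketch of that rule follows. The helper names `format_param` and `format_disposition` are hypothetical, the `tspecials' set is taken from RFC 2045, and the RFC 2184 case is deliberately left unimplemented.

```python
# tspecials as listed in RFC 2045; SPACE and control characters are also
# excluded from a token.
TSPECIALS = set('()<>@,;:\\"/[]?=')


def format_param(name, value):
    """Render one disposition-parm as a token or a quoted-string."""
    if len(value) > 78 or not value.isascii():
        raise ValueError("long or non-ASCII value: needs RFC 2184 encoding")
    if "\r" in value or "\n" in value:
        raise ValueError("line breaks are not handled in this sketch")
    is_token = value != "" and all(
        33 <= ord(ch) <= 126 and ch not in TSPECIALS for ch in value
    )
    if is_token:
        return f"{name}={value}"
    # Escape backslash and double quote inside the quoted-string.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{name}="{escaped}"'


def format_disposition(disposition_type, params):
    """Join the disposition type and its parameters into one header line."""
    parts = [disposition_type]
    parts.extend(format_param(name, value) for name, value in params.items())
    return "Content-Disposition: " + "; ".join(parts)
```

A date value contains spaces, a comma and colons, so it is always emitted as a quoted-string, which agrees with the `quoted-date-time' rule in the grammar.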
- -2.2 The Attachment Disposition Type - - Bodyparts can be designated `attachment' to indicate that they are - separate from the main body of the mail message, and that their - display should not be automatic, but contingent upon some further - action of the user. The MUA might instead present the user of a - bitmap terminal with an iconic representation of the attachments, or, - on character terminals, with a list of attachments from which the - user could select for viewing or storage. - -2.3 The Filename Parameter - - The sender may want to suggest a filename to be used if the entity is - detached and stored in a separate file. If the receiving MUA writes - the entity to a file, the suggested filename should be used as a - basis for the actual filename, where possible. - - It is important that the receiving MUA not blindly use the suggested - filename. The suggested filename SHOULD be checked (and possibly - changed) to see that it conforms to local filesystem conventions, - does not overwrite an existing file, and does not present a security - problem (see Security Considerations below). - - The receiving MUA SHOULD NOT respect any directory path information - that may seem to be present in the filename parameter. The filename - should be treated as a terminal component only. Portable - specification of directory paths might possibly be done in the future - via a separate Content-Disposition parameter, but no provision is - made for it in this draft. - - Current [RFC 2045] grammar restricts parameter values (and hence - Content-Disposition filenames) to US-ASCII. We recognize the great - desirability of allowing arbitrary character sets in filenames, but - it is beyond the scope of this document to define the necessary - mechanisms. We expect that the basic [RFC 1521] `value' - specification will someday be amended to allow use of non-US-ASCII - characters, at which time the same mechanism should be used in the - Content-Disposition filename parameter. 
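The advice above on not blindly trusting the suggested filename could be applied roughly as in the following sketch. The function name `safe_filename`, the fallback name, the conservative character whitelist, and the `-1`, `-2` suffix scheme are all assumptions made for illustration; a real MUA would adapt them to its local filesystem conventions.

```python
import os
import re


def safe_filename(suggested, existing=frozenset(), fallback="attachment"):
    """Reduce a suggested filename to a safe terminal component."""
    # Ignore any directory path information, whichever separator is used.
    name = re.split(r"[/\\]", suggested)[-1]
    # Drop control characters and leading dots (hidden files, "." and "..").
    name = re.sub(r"[\x00-\x1f\x7f]", "", name).lstrip(".").strip()
    # Keep a conservative set of characters; replace everything else.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    if not name:
        name = fallback
    # Never overwrite an existing file: append a counter until unused.
    base, ext = os.path.splitext(name)
    candidate, counter = name, 1
    while candidate in existing:
        candidate = f"{base}-{counter}{ext}"
        counter += 1
    return candidate
```

Here `existing` stands in for the names already present in the target directory; checking the real directory at write time would still be needed to avoid races.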
- - - - - - -Troost, et. al. Standards Track [Page 4] - -RFC 2183 Content-Disposition August 1997 - - - Beyond the limitation to US-ASCII, the sending MUA may wish to bear - in mind the limitations of common filesystems. Many have severe - length and character set restrictions. Short alphanumeric filenames - are least likely to require modification by the receiving system. - - The presence of the filename parameter does not force an - implementation to write the entity to a separate file. It is - perfectly acceptable for implementations to leave the entity as part - of the normal mail stream unless the user requests otherwise. As a - consequence, the parameter may be used on any MIME entity, even - `inline' ones. These will not normally be written to files, but the - parameter could be used to provide a filename if the receiving user - should choose to write the part to a file. - -2.4 The Creation-Date parameter - - The creation-date parameter MAY be used to indicate the date at which - the file was created. If this parameter is included, the paramter - value MUST be a quoted-string which contains a representation of the - creation date of the file in [RFC 822] `date-time' format. - - UNIX and POSIX implementors are cautioned that the `st_ctime' file - attribute of the `stat' structure is not the creation time of the - file; it is thus not appropriate as a source for the creation-date - parameter value. - -2.5 The Modification-Date parameter - - The modification-date parameter MAY be used to indicate the date at - which the file was last modified. If the modification-date parameter - is included, the paramter value MUST be a quoted-string which - contains a representation of the last modification date of the file - in [RFC 822] `date-time' format. - -2.6 The Read-Date parameter - - The read-date parameter MAY be used to indicate the date at which the - file was last read. 
If the read-date parameter is included, the - parameter value MUST be a quoted-string which contains a - representation of the last-read date of the file in [RFC 822] `date- - time' format. - -2.7 The Size parameter - - The size parameter indicates an approximate size of the file in - octets. It can be used, for example, to pre-allocate space before - attempting to store the file, or to determine whether enough space - exists. - - - -Troost, et. al. Standards Track [Page 5] - -RFC 2183 Content-Disposition August 1997 - - -2.8 Future Extensions and Unrecognized Disposition Types - - In the likely event that new parameters or disposition types are - needed, they should be registered with the Internet Assigned Numbers - Authority (IANA), in the manner specified in Section 9 of this memo. - - Once new disposition types and parameters are defined, there is of - course the likelihood that implementations will see disposition types - and parameters they do not understand. Furthermore, since x-tokens - are allowed, implementations may also see entirely unregistered - disposition types and parameters. - - Unrecognized parameters should be ignored. Unrecognized disposition - types should be treated as `attachment'. The choice of `attachment' - for unrecognized types is made because a sender who goes to the - trouble of producing a Content-Disposition header with a new - disposition type is more likely aiming for something more elaborate - than inline presentation. - - Unless noted otherwise in the definition of a parameter, Content- - Disposition parameters are valid for all dispositions. (In contrast - to MIME content-type parameters, which are defined on a per-content- - type basis.) Thus, for example, the `filename' parameter still means - the name of the file to which the part should be written, even if the - disposition itself is unrecognized. 
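The rules in sections 2.4 through 2.8 can be combined into a short Python sketch: date parameters are parsed as RFC 822 date-times, `size` as a decimal number, unrecognized parameters are ignored, and an unrecognized disposition type falls back to `attachment`. The function name `interpret` and the shape of its result are assumptions for illustration; the input is a type string plus a dict of already unquoted parameter values.

```python
from email.utils import parsedate_to_datetime

KNOWN_TYPES = {"inline", "attachment"}
DATE_PARAMS = {"creation-date", "modification-date", "read-date"}


def interpret(dtype, params):
    """Apply the fallback and parameter rules to a parsed disposition.

    Hypothetical helper: returns a dict with the effective type and the
    recognized parameters converted to Python values.
    """
    dtype = dtype.lower()  # disposition types are not case-sensitive
    result = {"type": dtype if dtype in KNOWN_TYPES else "attachment"}
    for name, value in params.items():
        name = name.lower()
        if name in DATE_PARAMS:
            result[name] = parsedate_to_datetime(value)  # timezone-aware datetime
        elif name == "size":
            result[name] = int(value)  # approximate size in octets
        elif name == "filename":
            result[name] = value  # still meaningful for unknown types
        # Any other parameter is unrecognized and silently ignored.
    return result


info = interpret(
    "X-Preview",
    {
        "filename": "notes.txt",
        "size": "1024",
        "x-colour": "red",
        "modification-date": "Wed, 12 Feb 1997 16:29:51 -0500",
    },
)
print(info)
```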
- -2.9 Content-Disposition and Multipart - - If a Content-Disposition header is used on a multipart body part, it - applies to the multipart as a whole, not the individual subparts. - The disposition types of the subparts do not need to be consulted - until the multipart itself is presented. When the multipart is - displayed, then the dispositions of the subparts should be respected. - - If the `inline' disposition is used, the multipart should be - displayed as normal; however, an `attachment' subpart should require - action from the user to display. - - If the `attachment' disposition is used, presentation of the - multipart should not proceed without explicit user action. Once the - user has chosen to display the multipart, the individual subpart - dispositions should be consulted to determine how to present the - subparts. - - - - - - - - -Troost, et. al. Standards Track [Page 6] - -RFC 2183 Content-Disposition August 1997 - - -2.10 Content-Disposition and the Main Message - - It is permissible to use Content-Disposition on the main body of an - [RFC 822] message. - -3. Examples - - Here is a an example of a body part containing a JPEG image that is - intended to be viewed by the user immediately: - - Content-Type: image/jpeg - Content-Disposition: inline - Content-Description: just a small picture of me - - <jpeg data> - - The following body part contains a JPEG image that should be - displayed to the user only if the user requests it. If the JPEG is - written to a file, the file should be named "genome.jpg". The - recipient's user might also choose to set the last-modified date of - the stored file to date in the modification-date parameter: - - Content-Type: image/jpeg - Content-Disposition: attachment; filename=genome.jpeg; - modification-date="Wed, 12 Feb 1997 16:29:51 -0500"; - Content-Description: a complete map of the human genome - - <jpeg data> - - The following is an example of the use of the `attachment' - disposition with a multipart body part. 
The user should see text- - part-1 immediately, then take some action to view multipart-2. After - taking action to view multipart-2, the user will see text-part-2 - right away, and be required to take action to view jpeg-1. Subparts - are indented for clarity; they would not be so indented in a real - message. - - - - - - - - - - - - - - - -Troost, et. al. Standards Track [Page 7] - -RFC 2183 Content-Disposition August 1997 - - - Content-Type: multipart/mixed; boundary=outer - Content-Description: multipart-1 - - --outer - Content-Type: text/plain - Content-Disposition: inline - Content-Description: text-part-1 - - Some text goes here - - --outer - Content-Type: multipart/mixed; boundary=inner - Content-Disposition: attachment - Content-Description: multipart-2 - - --inner - Content-Type: text/plain - Content-Disposition: inline - Content-Description: text-part-2 - - Some more text here. - - --inner - Content-Type: image/jpeg - Content-Disposition: attachment - Content-Description: jpeg-1 - - <jpeg data> - --inner-- - --outer-- - -4. Summary - - Content-Disposition takes one of two values, `inline' and - `attachment'. `Inline' indicates that the entity should be - immediately displayed to the user, whereas `attachment' means that - the user should take additional action to view the entity. - - The `filename' parameter can be used to suggest a filename for - storing the bodypart, if the user wishes to store it in an external - file. - - - - - - - - - - -Troost, et. al. Standards Track [Page 8] - -RFC 2183 Content-Disposition August 1997 - - -5. Security Considerations - - There are security issues involved any time users exchange data. - While these are not to be minimized, neither does this memo change - the status quo in that regard, except in one instance. - - Since this memo provides a way for the sender to suggest a filename, - a receiving MUA must take care that the sender's suggested filename - does not represent a hazard. 
Using UNIX as an example, some hazards - would be: - - + Creating startup files (e.g., ".login"). - - + Creating or overwriting system files (e.g., "/etc/passwd"). - - + Overwriting any existing file. - - + Placing executable files into any command search path - (e.g., "~/bin/more"). - - + Sending the file to a pipe (e.g., "| sh"). - - In general, the receiving MUA should not name or place the file such - that it will get interpreted or executed without the user explicitly - initiating the action. - - It is very important to note that this is not an exhaustive list; it - is intended as a small set of examples only. Implementors must be - alert to the potential hazards on their target systems. - -6. References - - [RFC 2119] - Bradner, S., "Key words for use in RFCs to Indicate Requirement - Levels", RFC 2119, March 1997. - - [RFC 2184] - Freed, N. and K. Moore, "MIME Parameter value and Encoded Words: - Character Sets, Lanaguage, and Continuations", RFC 2184, August - 1997. - - [RFC 2045] - Freed, N. and N. Borenstein, "MIME (Multipurpose Internet Mail - Extensions) Part One: Format of Internet Message Bodies", RFC - 2045, December 1996. - - - - - - -Troost, et. al. Standards Track [Page 9] - -RFC 2183 Content-Disposition August 1997 - - - [RFC 2046] - Freed, N. and N. Borenstein, "MIME (Multipurpose Internet Mail - Extensions) Part Two: Media Types", RFC 2046, December 1996. - - [RFC 2047] - Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part - Three: Message Header Extensions for non-ASCII Text", RFC 2047, - December 1996. - - [RFC 2048] - Freed, N., Klensin, J. and J. Postel, "MIME (Multipurpose - Internet Mail Extensions) Part Four: Registration Procedures", - RFC 2048, December 1996. - - [RFC 2049] - Freed, N. and N. Borenstein, "MIME (Multipurpose Internet Mail - Extensions) Part Five: Conformance Criteria and Examples", RFC - 2049, December 1996. 
- - [RFC 822] - Crocker, D., "Standard for the Format of ARPA Internet Text - Messages", STD 11, RFC 822, UDEL, August 1982. - -7. Acknowledgements - - We gratefully acknowledge the help these people provided during the - preparation of this draft: - - Nathaniel Borenstein - Ned Freed - Keith Moore - Dave Crocker - Dan Pritchett - - - - - - - - - - - - - - - - - - -Troost, et. al. Standards Track [Page 10] - -RFC 2183 Content-Disposition August 1997 - - -8. Authors' Addresses - - You should blame the editor of this version of the document for any - changes since RFC 1806: - - Keith Moore - Department of Computer Science - University of Tennessee, Knoxville - 107 Ayres Hall - Knoxville TN 37996-1301 - USA - - Phone: +1 (423) 974-5067 - Fax: +1 (423) 974-8296 - Email: moore@cs.utk.edu - - - The authors of RFC 1806 are: - - Rens Troost - New Century Systems - 324 East 41st Street #804 - New York, NY, 10017 USA - - Phone: +1 (212) 557-2050 - Fax: +1 (212) 557-2049 - EMail: rens@century.com - - - Steve Dorner - QUALCOMM Incorporated - 6455 Lusk Boulevard - San Diego, CA 92121 - USA - - EMail: sdorner@qualcomm.com - - -9. Registration of New Content-Disposition Values and Parameters - - New Content-Disposition values (besides "inline" and "attachment") - may be defined only by Internet standards-track documents, or in - Experimental documents approved by the Internet Engineering Steering - Group. - - - - - - - -Troost, et. al. Standards Track [Page 11] - -RFC 2183 Content-Disposition August 1997 - - - New content-disposition parameters may be registered by supplying the - information in the following template and sending it via electronic - mail to IANA@IANA.ORG: - - To: IANA@IANA.ORG - Subject: Registration of new Content-Disposition parameter - - Content-Disposition parameter name: - - Allowable values for this parameter: - (If the parameter can only assume a small number of values, - list each of those values. 
Otherwise, describe the values - that the parameter can assume.) - Description: - (What is the purpose of this parameter and how is it used?) - -10. Changes since RFC 1806 - - The following changes have been made since the earlier version of - this document, published in RFC 1806 as an Experimental protocol: - - + Updated references to MIME documents. In some cases this - involved substituting a reference to one of the current MIME - RFCs for a reference to RFC 1521; in other cases, a reference to - RFC 1521 was simply replaced with the word "MIME". - - + Added a section on registration procedures, since none of the - procedures in RFC 2048 seemed to be appropriate. - - + Added new parameter types: creation-date, modification-date, - read-date, and size. - - - + Incorporated a reference to draft-freed-pvcsc-* for encoding - long or non-ASCII parameter values. - - + Added reference to RFC 2119 to define MUST, SHOULD, etc. - keywords. - - - - - - - - - - - - - -Troost, et. al. Standards Track [Page 12] - diff --git a/proto/rfc2231.txt b/proto/rfc2231.txt @@ -1,563 +0,0 @@ - - - - - - -Network Working Group N. Freed -Request for Comments: 2231 Innosoft -Updates: 2045, 2047, 2183 K. Moore -Obsoletes: 2184 University of Tennessee -Category: Standards Track November 1997 - - - MIME Parameter Value and Encoded Word Extensions: - Character Sets, Languages, and Continuations - - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1997). All Rights Reserved. - -1. 
Abstract - - This memo defines extensions to the RFC 2045 media type and RFC 2183 - disposition parameter value mechanisms to provide - - (1) a means to specify parameter values in character sets - other than US-ASCII, - - (2) to specify the language to be used should the value be - displayed, and - - (3) a continuation mechanism for long parameter values to - avoid problems with header line wrapping. - - This memo also defines an extension to the encoded words defined in - RFC 2047 to allow the specification of the language to be used for - display as well as the character set. - -2. Introduction - - The Multipurpose Internet Mail Extensions, or MIME [RFC-2045, RFC- - 2046, RFC-2047, RFC-2048, RFC-2049], define a message format that - allows for: - - - - - -Freed & Moore Standards Track [Page 1] - -RFC 2231 MIME Value and Encoded Word Extensions November 1997 - - - (1) textual message bodies in character sets other than - US-ASCII, - - (2) non-textual message bodies, - - (3) multi-part message bodies, and - - (4) textual header information in character sets other than - US-ASCII. - - MIME is now widely deployed and is used by a variety of Internet - protocols, including, of course, Internet email. However, MIME's - success has resulted in the need for additional mechanisms that were - not provided in the original protocol specification. - - In particular, existing MIME mechanisms provide for named media type - (content-type field) parameters as well as named disposition - (content-disposition field). A MIME media type may specify any - number of parameters associated with all of its subtypes, and any - specific subtype may specify additional parameters for its own use. A - MIME disposition value may specify any number of associated - parameters, the most important of which is probably the attachment - disposition's filename parameter. 
- - These parameter names and values end up appearing in the content-type - and content-disposition header fields in Internet email. This - inherently imposes three crucial limitations: - - (1) Lines in Internet email header fields are folded - according to RFC 822 folding rules. This makes long - parameter values problematic. - - (2) MIME headers, like the RFC 822 headers they often - appear in, are limited to 7bit US-ASCII, and the - encoded-word mechanisms of RFC 2047 are not available - to parameter values. This makes it impossible to have - parameter values in character sets other than US-ASCII - without specifying some sort of private per-parameter - encoding. - - (3) It has recently become clear that character set - information is not sufficient to properly display some - sorts of information -- language information is also - needed [RFC-2130]. For example, support for - handicapped users may require reading text string - - - - - - -Freed & Moore Standards Track [Page 2] - -RFC 2231 MIME Value and Encoded Word Extensions November 1997 - - - aloud. The language the text is written in is needed - for this to be done correctly. Some parameter values - may need to be displayed, hence there is a need to - allow for the inclusion of language information. - - The last problem on this list is also an issue for the encoded words - defined by RFC 2047, as encoded words are intended primarily for - display purposes. - - This document defines extensions that address all of these - limitations. All of these extensions are implemented in a fashion - that is completely compatible at a syntactic level with existing MIME - implementations. In addition, the extensions are designed to have as - little impact as possible on existing uses of MIME. - - IMPORTANT NOTE: These mechanisms end up being somewhat gibbous when - they actually are used. As such, these mechanisms should not be used - lightly; they should be reserved for situations where a real need for - them exists. 
- -2.1. Requirements notation - - This document occasionally uses terms that appear in capital letters. - When the terms "MUST", "SHOULD", "MUST NOT", "SHOULD NOT", and "MAY" - appear capitalized, they are being used to indicate particular - requirements of this specification. A discussion of the meanings of - these terms appears in [RFC- 2119]. - -3. Parameter Value Continuations - - Long MIME media type or disposition parameter values do not interact - well with header line wrapping conventions. In particular, proper - header line wrapping depends on there being places where linear - whitespace (LWSP) is allowed, which may or may not be present in a - parameter value, and even if present may not be recognizable as such - since specific knowledge of parameter value syntax may not be - available to the agent doing the line wrapping. The result is that - long parameter values may end up getting truncated or otherwise - damaged by incorrect line wrapping implementations. - - A mechanism is therefore needed to break up parameter values into - smaller units that are amenable to line wrapping. Any such mechanism - MUST be compatible with existing MIME processors. This means that - - (1) the mechanism MUST NOT change the syntax of MIME media - type and disposition lines, and - - - - - -Freed & Moore Standards Track [Page 3] - -RFC 2231 MIME Value and Encoded Word Extensions November 1997 - - - (2) the mechanism MUST NOT depend on parameter ordering - since MIME states that parameters are not order - sensitive. Note that while MIME does prohibit - modification of MIME headers during transport, it is - still possible that parameters will be reordered when - user agent level processing is done. - - The obvious solution, then, is to use multiple parameters to contain - a single parameter value and to use some kind of distinguished name - to indicate when this is being done. 
And this obvious solution is - exactly what is specified here: The asterisk character ("*") followed - by a decimal count is employed to indicate that multiple parameters - are being used to encapsulate a single parameter value. The count - starts at 0 and increments by 1 for each subsequent section of the - parameter value. Decimal values are used and neither leading zeroes - nor gaps in the sequence are allowed. - - The original parameter value is recovered by concatenating the - various sections of the parameter, in order. For example, the - content-type field - - Content-Type: message/external-body; access-type=URL; - URL*0="ftp://"; - URL*1="cs.utk.edu/pub/moore/bulk-mailer/bulk-mailer.tar" - - is semantically identical to - - Content-Type: message/external-body; access-type=URL; - URL="ftp://cs.utk.edu/pub/moore/bulk-mailer/bulk-mailer.tar" - - Note that quotes around parameter values are part of the value - syntax; they are NOT part of the value itself. Furthermore, it is - explicitly permitted to have a mixture of quoted and unquoted - continuation fields. - -4. Parameter Value Character Set and Language Information - - Some parameter values may need to be qualified with character set or - language information. It is clear that a distinguished parameter - name is needed to identify when this information is present along - with a specific syntax for the information in the value itself. In - addition, a lightweight encoding mechanism is needed to accommodate 8 - bit information in parameter values. - - - - - - - - -Freed & Moore Standards Track [Page 4] - -RFC 2231 MIME Value and Encoded Word Extensions November 1997 - - - Asterisks ("*") are reused to provide the indicator that language and - character set information is present and encoding is being used. A - single quote ("'") is used to delimit the character set and language - information at the beginning of the parameter value. 
Percent signs - ("%") are used as the encoding flag, which agrees with RFC 2047. - - Specifically, an asterisk at the end of a parameter name acts as an - indicator that character set and language information may appear at - the beginning of the parameter value. A single quote is used to - separate the character set, language, and actual value information in - the parameter value string, and an percent sign is used to flag - octets encoded in hexadecimal. For example: - - Content-Type: application/x-stuff; - title*=us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A - - Note that it is perfectly permissible to leave either the character - set or language field blank. Note also that the single quote - delimiters MUST be present even when one of the field values is - omitted. This is done when either character set, language, or both - are not relevant to the parameter value at hand. This MUST NOT be - done in order to indicate a default character set or language -- - parameter field definitions MUST NOT assign a default character set - or language. - -4.1. Combining Character Set, Language, and Parameter Continuations - - Character set and language information may be combined with the - parameter continuation mechanism. For example: - - Content-Type: application/x-stuff - title*0*=us-ascii'en'This%20is%20even%20more%20 - title*1*=%2A%2A%2Afun%2A%2A%2A%20 - title*2="isn't it!" - - Note that: - - (1) Language and character set information only appear at - the beginning of a given parameter value. - - (2) Continuations do not provide a facility for using more - than one character set or language in the same - parameter value. - - (3) A value presented using multiple continuations may - contain a mixture of encoded and unencoded segments. - - - - - -Freed & Moore Standards Track [Page 5] - -RFC 2231 MIME Value and Encoded Word Extensions November 1997 - - - (4) The first segment of a continuation MUST be encoded if - language and character set information are given. 
- - (5) If the first segment of a continued parameter value is - encoded the language and character set field delimiters - MUST be present even when the fields are left blank. - -5. Language specification in Encoded Words - - RFC 2047 provides support for non-US-ASCII character sets in RFC 822 - message header comments, phrases, and any unstructured text field. - This is done by defining an encoded word construct which can appear - in any of these places. Given that these are fields intended for - display, it is sometimes necessary to associate language information - with encoded words as well as just the character set. This - specification extends the definition of an encoded word to allow the - inclusion of such information. This is simply done by suffixing the - character set specification with an asterisk followed by the language - tag. For example: - - From: =?US-ASCII*EN?Q?Keith_Moore?= <moore@cs.utk.edu> - -6. IMAP4 Handling of Parameter Values - - IMAP4 [RFC-2060] servers SHOULD decode parameter value continuations - when generating the BODY and BODYSTRUCTURE fetch attributes. - -7. Modifications to MIME ABNF - - The ABNF for MIME parameter values given in RFC 2045 is: - - parameter := attribute "=" value - - attribute := token - ; Matching of attributes - ; is ALWAYS case-insensitive. 
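The continuation and character set mechanisms of sections 3 and 4 can be sketched in Python as follows. This is a simplified decoder under stated assumptions: the input is a list of (name, value) pairs whose surrounding quotes have already been removed, the name `decode_rfc2231_params` is hypothetical, and a blank character set is decoded as US-ASCII purely as a convenience of this sketch.

```python
import re
from urllib.parse import unquote_to_bytes

# attribute, optional "*<section number>", optional trailing "*" (extended value)
_NAME = re.compile(r"^(?P<attr>[^*]+)(?:\*(?P<num>0|[1-9][0-9]*))?(?P<ext>\*)?$")


def decode_rfc2231_params(pairs):
    """Reassemble continued and encoded parameters.

    Returns {attribute: (text, language_or_None)}. Hypothetical helper.
    """
    sections = {}
    for name, value in pairs:
        match = _NAME.match(name)
        if match is None:
            continue  # not a well-formed parameter name; skipped in this sketch
        num = int(match.group("num") or 0)
        extended = match.group("ext") is not None
        sections.setdefault(match.group("attr").lower(), []).append(
            (num, value, extended)
        )

    result = {}
    for attr, parts in sections.items():
        parts.sort(key=lambda part: part[0])  # parameters may arrive reordered
        charset, language, raw = "us-ascii", None, b""
        for num, value, extended in parts:
            if extended and num == 0:
                # charset'language'value; both delimiters must be present,
                # so a missing one raises ValueError here.
                cs, lang, value = value.split("'", 2)
                charset = cs or charset
                language = lang or None
            if extended:
                raw += unquote_to_bytes(value)  # %XX -> octet
            else:
                raw += value.encode("ascii")
        result[attr] = (raw.decode(charset), language)
    return result


title = decode_rfc2231_params(
    [
        ("title*1*", "%2A%2A%2Afun%2A%2A%2A%20"),
        ("title*2", "isn't it!"),
        ("title*0*", "us-ascii'en'This%20is%20even%20more%20"),
    ]
)
print(title)
```

Sections are sorted by their number rather than by arrival order, since the mechanism must not depend on parameter ordering.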
- - This specification changes this ABNF to: - - parameter := regular-parameter / extended-parameter - - regular-parameter := regular-parameter-name "=" value - - regular-parameter-name := attribute [section] - - attribute := 1*attribute-char - - - - - -Freed & Moore Standards Track [Page 6] - -RFC 2231 MIME Value and Encoded Word Extensions November 1997 - - - attribute-char := <any (US-ASCII) CHAR except SPACE, CTLs, - "*", "'", "%", or tspecials> - - section := initial-section / other-sections - - initial-section := "*0" - - other-sections := "*" ("1" / "2" / "3" / "4" / "5" / - "6" / "7" / "8" / "9") *DIGIT) - - extended-parameter := (extended-initial-name "=" - extended-value) / - (extended-other-names "=" - extended-other-values) - - extended-initial-name := attribute [initial-section] "*" - - extended-other-names := attribute other-sections "*" - - extended-initial-value := [charset] "'" [language] "'" - extended-other-values - - extended-other-values := *(ext-octet / attribute-char) - - ext-octet := "%" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") - - charset := <registered character set name> - - language := <registered language tag [RFC-1766]> - - The ABNF given in RFC 2047 for encoded-words is: - - encoded-word := "=?" charset "?" encoding "?" encoded-text "?=" - - This specification changes this ABNF to: - - encoded-word := "=?" charset ["*" language] "?" encoded-text "?=" - -8. Character sets which allow specification of language - - In the future it is likely that some character sets will provide - facilities for inline language labeling. Such facilities are - inherently more flexible than those defined here as they allow for - language switching in the middle of a string. - - - - - - - -Freed & Moore Standards Track [Page 7] - -RFC 2231 MIME Value and Encoded Word Extensions November 1997 - - - If and when such facilities are developed they SHOULD be used in - preference to the language labeling facilities specified here. 
Note - that all the mechanisms defined here allow for the omission of - language labels so as to be able to accommodate this possible future - usage. - -9. Security Considerations - - This RFC does not discuss security issues and is not believed to - raise any security issues not already endemic in electronic mail and - present in fully conforming implementations of MIME. - -10. References - - [RFC-822] - Crocker, D., "Standard for the Format of ARPA Internet - Text Messages", STD 11, RFC 822 August 1982. - - [RFC-1766] - Alvestrand, H., "Tags for the Identification of - Languages", RFC 1766, March 1995. - - [RFC-2045] - Freed, N., and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part One: Format of Internet Message - Bodies", RFC 2045, December 1996. - - [RFC-2046] - Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Two: Media Types", RFC 2046, - December 1996. - - [RFC-2047] - Moore, K., "Multipurpose Internet Mail Extensions (MIME) - Part Three: Representation of Non-ASCII Text in Internet - Message Headers", RFC 2047, December 1996. - - [RFC-2048] - Freed, N., Klensin, J. and J. Postel, "Multipurpose - Internet Mail Extensions (MIME) Part Four: MIME - Registration Procedures", RFC 2048, December 1996. - - [RFC-2049] - Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Five: Conformance Criteria and - Examples", RFC 2049, December 1996. - - - - - -Freed & Moore Standards Track [Page 8] - -RFC 2231 MIME Value and Encoded Word Extensions November 1997 - - - [RFC-2060] - Crispin, M., "Internet Message Access Protocol - Version - 4rev1", RFC 2060, December 1996. - - [RFC-2119] - Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", RFC 2119, March 1997. - - [RFC-2130] - Weider, C., Preston, C., Simonsen, K., Alvestrand, H., - Atkinson, R., Crispin, M., and P. Svanberg, "Report from the - IAB Character Set Workshop", RFC 2130, April 1997. 
- - [RFC-2183] - Troost, R., Dorner, S. and K. Moore, "Communicating - Presentation Information in Internet Messages: The - Content-Disposition Header", RFC 2183, August 1997. - -11. Authors' Addresses - - Ned Freed - Innosoft International, Inc. - 1050 Lakes Drive - West Covina, CA 91790 - USA - - Phone: +1 626 919 3600 - Fax: +1 626 919 3614 - EMail: ned.freed@innosoft.com - - - Keith Moore - Computer Science Dept. - University of Tennessee - 107 Ayres Hall - Knoxville, TN 37996-1301 - USA - - EMail: moore@cs.utk.edu - - - - - - - - - - - - -Freed & Moore Standards Track [Page 9] - -RFC 2231 MIME Value and Encoded Word Extensions November 1997 - - -12. Full Copyright Statement - - Copyright (C) The Internet Society (1997). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. 
- - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - - - - - - - - - - - - - - - - - - - - - - - - -Freed & Moore Standards Track [Page 10] - diff --git a/proto/rfc2387.txt b/proto/rfc2387.txt @@ -1,563 +0,0 @@ - - - - - - -Network Working Group E. Levinson -Request for Comments: 2387 August 1998 -Obsoletes: 2112 -Category: Standards Track - - - The MIME Multipart/Related Content-type - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1998). All Rights Reserved. - -Abstract - - The Multipart/Related content-type provides a common mechanism for - representing objects that are aggregates of related MIME body parts. - This document defines the Multipart/Related content-type and provides - examples of its use. - -1. Introduction - - Several applications of MIME, including MIME-PEM, and MIME-Macintosh - and other proposals, require multiple body parts that make sense only - in the aggregate. The present approach to these compound objects has - been to define specific multipart subtypes for each new object. In - keeping with the MIME philosophy of having one mechanism to achieve - the same goal for different purposes, this document describes a - single mechanism for such aggregate or compound objects. 
- - The Multipart/Related content-type addresses the MIME representation - of compound objects. The object is categorized by a "type" - parameter. Additional parameters are provided to indicate a specific - starting body part or root and auxiliary information which may be - required when unpacking or processing the object. - - Multipart/Related MIME entities may contain Content-Disposition - headers that provide suggestions for the storage and display of a - body part. Multipart/Related processing takes precedence over - Content-Disposition; the interaction between them is discussed in - section 4. - - - -Levinson Standards Track [Page 1] - -RFC 2387 Multipart/Related August 1998 - - - Responsibility for the display or processing of a Multipart/Related's - constituent entities rests with the application that handles the - compound object. - -2. Multipart/Related Registration Information - - The following form is copied from RFC 1590, Appendix A. - - To: IANA@isi.edu - Subject: Registration of new Media Type content-type/subtype - - Media Type name: Multipart - - Media subtype name: Related - - Required parameters: Type, a media type/subtype. - - Optional parameters: Start - Start-info - - Encoding considerations: Multipart content-types cannot have - encodings. - - Security considerations: Depends solely on the referenced type. - - Published specification: RFC-REL (this document). - - Person & email address to contact for further information: - Edward Levinson - 47 Clive Street - Metuchen, NJ 08840-1060 - +1 908 494 1606 - XIson@cnj.digex.net - -3. Intended usage - - The Multipart/Related media type is intended for compound objects - consisting of several inter-related body parts. For a - Multipart/Related object, proper display cannot be achieved by - individually displaying the constituent body parts. The content-type - of the Multipart/Related object is specified by the type parameter. 
- The "start" parameter, if given, points, via a content-ID, to the - body part that contains the object root. The default root is the - first body part within the Multipart/Related body. - - The relationships among the body parts of a compound object - distinguishes it from other object types. These relationships are - often represented by links internal to the object's components that - - - -Levinson Standards Track [Page 2] - -RFC 2387 Multipart/Related August 1998 - - - reference the other components. Within a single operating - environment the links are often file names, such links may be - represented within a MIME message using content-IDs or the value of - some other "Content-" headers. - -3.1. The Type Parameter - - The type parameter must be specified and its value is the MIME media - type of the "root" body part. It permits a MIME user agent to - determine the content-type without reference to the enclosed body - part. If the value of the type parameter and the root body part's - content-type differ then the User Agent's behavior is undefined. - -3.2. The Start Parameter - - The start parameter, if given, is the content-ID of the compound - object's "root". If not present the "root" is the first body part in - the Multipart/Related entity. The "root" is the element the - applications processes first. - -3.3. The Start-Info Parameter - - Additional information can be provided to an application by the - start-info parameter. It contains either a string or points, via a - content-ID, to another MIME entity in the message. A typical use - might be to provide additional command line parameters or a MIME - entity giving auxiliary information for processing the compound - object. - - Applications that use Multipart/Related must specify the - interpretation of start-info. User Agents shall provide the - parameter's value to the processing application. 
Processes can - distinguish a start-info reference from a token or quoted-string by - examining the first non-white-space character, "<" indicates a - reference. - -3.4. Syntax - - related-param := [ ";" "start" "=" cid ] - [ ";" "start-info" "=" - ( cid-list / value ) ] - [ ";" "type" "=" type "/" subtype ] - ; order independent - - cid-list := cid cid-list - - cid := msg-id ; c.f. [822] - - - - -Levinson Standards Track [Page 3] - -RFC 2387 Multipart/Related August 1998 - - - value := token / quoted-string ; c.f. [MIME] - ; value cannot begin with "<" - - Note that the parameter values will usually require quoting. Msg-id - contains the special characters "<", ">", "@", and perhaps other - special characters. If msg-id contains quoted-strings, those quote - marks must be escaped. Similarly, the type parameter contains the - special character "/". - -4. Handling Content-Disposition Headers - - Content-Disposition Headers [DISP] suggest presentation styles for - MIME body parts. [DISP] describes two presentation styles, called - the disposition type, INLINE and ATTACHMENT. These, used within a - multipart entity, allow the sender to suggest presentation - information. [DISP] also provides for an optional storage (file) - name. Content-Disposition headers could appear in one or more body - parts contained within a Multipart/Related entity. - - Using Content-Disposition headers in addition to Multipart/Related - provides presentation information to User Agents that do not - recognize Multipart/Related. They will treat the multipart as - Multipart/Mixed and they may find the Content-Disposition information - useful. - - With Multipart/Related however, the application processing the - compound object determines the presentation style for all the - contained parts. In that context the Content-Disposition header - information is redundant or even misleading. 
Hence, User Agents that - understand Multipart/Related shall ignore the disposition type within - a Multipart/Related body part. - - It may be possible for a User Agent capable of handling both - Multipart/Related and Content-Disposition headers to provide the - invoked application the Content-Disposition header's optional - filename parameter to the Multipart/Related. The use of that - information will depend on the specific application and should be - specified when describing the handling of the corresponding compound - object. Such descriptions would be appropriate in an RFC registering - that object's media type. - -5. Examples - -5.1 Application/X-FixedRecord - - The X-FixedRecord content-type consists of one or more octet-streams - and a list of the lengths of each record. The root, which lists the - record lengths of each record within the streams. The record length - - - -Levinson Standards Track [Page 4] - -RFC 2387 Multipart/Related August 1998 - - - list, type Application/X-FixedRecord, consists of a set of INTEGERs - in ASCII format, one per line. Each INTEGER gives the number of - octets from the octet-stream body part that constitute the next - "record". - - The example below, uses a single data block. 
- - Content-Type: Multipart/Related; boundary=example-1 - start="<950120.aaCC@XIson.com>"; - type="Application/X-FixedRecord" - start-info="-o ps" - - --example-1 - Content-Type: Application/X-FixedRecord - Content-ID: <950120.aaCC@XIson.com> - - 25 - 10 - 34 - 10 - 25 - 21 - 26 - 10 - --example-1 - Content-Type: Application/octet-stream - Content-Description: The fixed length records - Content-Transfer-Encoding: base64 - Content-ID: <950120.aaCB@XIson.com> - - T2xkIE1hY0RvbmFsZCBoYWQgYSBmYXJtCkUgSS - BFIEkgTwpBbmQgb24gaGlzIGZhcm0gaGUgaGFk - IHNvbWUgZHVja3MKRSBJIEUgSSBPCldpdGggYS - BxdWFjayBxdWFjayBoZXJlLAphIHF1YWNrIHF1 - YWNrIHRoZXJlLApldmVyeSB3aGVyZSBhIHF1YW - NrIHF1YWNrCkUgSSBFIEkgTwo= - - --example-1-- - - - - - - - - - - - - - -Levinson Standards Track [Page 5] - -RFC 2387 Multipart/Related August 1998 - - -5.2 Text/X-Okie - - The Text/X-Okie is an invented markup language permitting the - inclusion of images with text. A feature of this example is the - inclusion of two additional body parts, both picture. They are - referred to internally by the encapsulated document via each - picture's body part content-ID. Usage of "cid:", as in this example, - may be useful for a variety of compound objects. It is not, however, - a part of the Multipart/Related specification. - - Content-Type: Multipart/Related; boundary=example-2; - start="<950118.AEBH@XIson.com>" - type="Text/x-Okie" - - --example-2 - Content-Type: Text/x-Okie; charset=iso-8859-1; - declaration="<950118.AEB0@XIson.com>" - Content-ID: <950118.AEBH@XIson.com> - Content-Description: Document - - {doc} - This picture was taken by an automatic camera mounted ... - {image file=cid:950118.AECB@XIson.com} - {para} - Now this is an enlargement of the area ... 
- {image file=cid:950118:AFDH@XIson.com} - {/doc} - --example-2 - Content-Type: image/jpeg - Content-ID: <950118.AFDH@XIson.com> - Content-Transfer-Encoding: BASE64 - Content-Description: Picture A - - [encoded jpeg image] - --example-2 - Content-Type: image/jpeg - Content-ID: <950118.AECB@XIson.com> - Content-Transfer-Encoding: BASE64 - Content-Description: Picture B - - [encoded jpeg image] - --example-2-- - -5.3 Content-Disposition - - In the above example each image body part could also have a Content- - Disposition header. For example, - - - - -Levinson Standards Track [Page 6] - -RFC 2387 Multipart/Related August 1998 - - - --example-2 - Content-Type: image/jpeg - Content-ID: <950118.AECB@XIson.com> - Content-Transfer-Encoding: BASE64 - Content-Description: Picture B - Content-Disposition: INLINE - - [encoded jpeg image] - --example-2-- - - User Agents that recognize Multipart/Related will ignore the - Content-Disposition header's disposition type. Other User Agents - will process the Multipart/Related as Multipart/Mixed and may make - use of that header's information. - -6. User Agent Requirements - - User agents that do not recognize Multipart/Related shall, in - accordance with [MIME], treat the entire entity as Multipart/Mixed. - MIME User Agents that do recognize Multipart/Related entities but are - unable to process the given type should give the user the option of - suppressing the entire Multipart/Related body part shall be. - - Existing MIME-capable mail user agents (MUAs) handle the existing - media types in a straightforward manner. For discrete media types - (e.g. text, image, etc.) the body of the entity can be directly - passed to a display process. Similarly the existing composite - subtypes can be reduced to handing one or more discrete types. - Handling Multipart/Related differs in that processing cannot be - reduced to handling the individual entities. - - The following sections discuss what information the processing - application requires. 
- - It is possible that an application specific "receiving agent" will - manipulate the entities for display prior to invoking actual - application process. Okie, above, is an example of this; it may need - a receiving agent to parse the document and substitute local file - names for the originator's file names. Other applications may just - require a table showing the correspondence between the local file - names and the originator's. The receiving agent takes responsibility - for such processing. - -6.1 Data Requirements - - MIME-capable mail user agents (MUAs) are required to provide the - application: - - - - -Levinson Standards Track [Page 7] - -RFC 2387 Multipart/Related August 1998 - - - (a) the bodies of the MIME entities and the entity Content-* headers, - - (b) the parameters of the Multipart/Related Content-type header, and - - (c) the correspondence between each body's local file name, that - body's header data, and, if present, the body part's content-ID. - -6.2 Storing Multipart/Related Entities - - The Multipart/Related media type will be used for objects that have - internal linkages between the body parts. When the objects are - stored the linkages may require processing by the application or its - receiving agent. - -6.3 Recursion - - MIME is a recursive structure. Hence one must expect a - Multipart/Related entity to contain other Multipart/Related entities. - When a Multipart/Related entity is being processed for display or - storage, any enclosed Multipart/Related entities shall be processed - as though they were being stored. - -6.4 Configuration Considerations - - It is suggested that MUAs that use configuration mechanisms, see - [CFG] for an example, refer to Multipart/Related as Multi- - part/Related/<type>, were <type> is the value of the "type" - parameter. - -7. Security Considerations - - Security considerations relevant to Multipart/Related are identical - to those of the underlying content-type. - -8. 
Acknowledgments - - This proposal is the result of conversations the author has had with - many people. In particular, Harald A. Alvestrand, James Clark, - Charles Goldfarb, Gary Houston, Ned Freed, Ray Moody, and Don - Stinchfield, provided both encouragement and invaluable help. The - author, however, take full responsibility for all errors contained in - this document. - - - - - - - - - -Levinson Standards Track [Page 8] - -RFC 2387 Multipart/Related August 1998 - - -9. References - - [822] Crocker, D., "Standard for the Format of ARPA Internet - Text Messages", STD 11, RFC 822, August 1982. - - [CID] Levinson, E., and J. Clark, "Message/External-Body - Content-ID Access Type", RFC 1873, December 1995, - Levinson, E., "Message/External-Body Content-ID Access - Type", Work in Progress. - - [CFG] Borenstein, N., "A User Agent Configuration Mechanism For - Multimedia Mail Format Information", RFC 1524, September - 1993. - - [DISP] Troost, R., and S. Dorner, "Communicating Presentation - Information in Internet Messages: The Content- - Disposition Header", RFC 1806, June 1995. - - [MIME] Borenstein, N., and Freed, N., "Multipurpose Internet - Mail Extensions (MIME) Part One: Format of Internet - Message Bodies", RFC 2045, November 1996. - -9. Author's Address - - Edward Levinson - 47 Clive Street - Metuchen, NJ 08840-1060 - USA - - Phone: +1 908 494 1606 - EMail: XIson@cnj.digex.com - -10. Changes from previous draft (RFC 2112) - - Corrected cid urls to conform to RFC 2111; the angle brackets were - removed. - - - - - - - - - - - - - - - -Levinson Standards Track [Page 9] - -RFC 2387 Multipart/Related August 1998 - - -11. Full Copyright Statement - - Copyright (C) The Internet Society (1998). All Rights Reserved. 
- - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - - - - - - - - - - - - - - - - - - - - - - - - -Levinson Standards Track [Page 10] - diff --git a/proto/rfc2425.txt b/proto/rfc2425.txt @@ -1,1851 +0,0 @@ - - - - - - -Network Working Group T. Howes -Request for Comments: 2425 M. Smith -Category: Standards Track Netscape Communications Corp. - F. Dawson - Lotus Development Corporation - September 1998 - - - A MIME Content-Type for Directory Information - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. 
Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1998). All Rights Reserved. - -1. Abstract - - This document defines a MIME Content-Type for holding directory - information. The definition is independent of any particular - directory service or protocol. The text/directory Content-Type is - defined for holding a variety of directory information, for example, - name, or email address, or logo. The text/directory Content-Type can - also be used as the root body part in a multipart/related Content- - Type for handling more complicated situations, especially those in - which non-textual information that already has a natural MIME - representation, for example, a photograph or sound, is to be - represented. - - The text/directory Content-Type defines a general framework and - format for holding directory information in a simple "type:value" - form. We refer to "type" in this context meaning a property or - attribute with which the value is associated. Mechanisms are defined - to specify alternate languages, encodings and other meta-information. - This document also defines the procedure by which particular formats, - called profiles, for carrying application-specific information within - a text/directory Content-Type can be defined and registered, and the - conventions such formats must follow. It is expected that other - documents will be produced that define such formats for various - applications (e.g., white pages). - - - - - -Howes, et. al. Standards Track [Page 1] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY" and "OPTIONAL" in this - document are to be interpreted as described in [RFC-2119]. - -2. 
Table of Contents - - Status of the Memo................................................ 1 - Copyright Notice.................................................. 1 - 1. Abstract...................................................... 1 - 2. Table of Contents............................................. 2 - 3. Need for a MIME Directory Type................................ 3 - 4. Overview...................................................... 4 - 5. The text/directory Content-Type............................... 4 - 5.1. MIME media type name........................................ 4 - 5.2. MIME subtype name........................................... 5 - 5.3. Required parameters......................................... 5 - 5.4. Optional parameters......................................... 5 - 5.5. Encoding considerations..................................... 5 - 5.6. Security considerations..................................... 6 - 5.7. Interoperability considerations............................. 6 - 5.8. Published specification..................................... 6 - 5.8.1. Line delimiting and folding............................... 6 - 5.8.2. ABNF content-type definition.............................. 7 - 5.8.3. Pre-defined Parameters.................................... 9 - 5.8.4. Pre-defined Value Types...................................11 - 5.9. Applications which use this media type......................14 - 5.10. Additional information.....................................14 - 5.11. Person & email address to contact for further information..14 - 5.12. Intended usage.............................................14 - 5.13. Author/Change controller...................................15 - 6. Predefined Types..............................................15 - 6.1. SOURCE Type Definition......................................15 - 6.2. NAME Type Definition........................................16 - 6.3. PROFILE Type Definition.....................................16 - 6.4. 
BEGIN Type Definition.......................................17 - 6.5. END Type Definition.........................................17 - 7. Use of the multipart/related Content-Type.....................18 - 8. Examples.......................................................18 - 8.1. Example 1...................................................19 - 8.2. Example 2...................................................19 - 8.3. Example 3...................................................20 - 8.4. Example 4...................................................21 - 9. Registration of new profiles..................................22 - 9.1. Define the profile..........................................22 - 9.2. Post the profile definition.................................23 - 9.3. Allow a comment period......................................23 - 9.4. Submit the profile for approval.............................23 - 10. Profile Change Control.......................................23 - - - -Howes, et. al. Standards Track [Page 2] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - 11. Registration of new types....................................24 - 11.1. Define the type............................................24 - 11.2. Post the type definition...................................25 - 11.3. Allow a comment period.....................................25 - 11.4. Submit the type for approval...............................25 - 12. Type Change Control..........................................25 - 13. Registration of new parameters...............................26 - 13.1. Define the parameter.......................................26 - 13.2. Post the parameter definition..............................27 - 13.3. Allow a comment period.....................................27 - 13.4. Submit the parameter for approval..........................27 - 14. Parameter Change Control.....................................28 - 15. 
Registration of new value types..............................28 - 15.1. Define the value type......................................28 - 15.2. Post the value type definition.............................29 - 15.3. Allow a comment period.....................................29 - 15.4. Submit the value type for approval.........................29 - 16. Security Considerations......................................30 - 17. Acknowledgements..............................................30 - 18. References....................................................30 - 19. Authors' Addresses...........................................32 - 20. Full Copyright Statement......................................33 - -3. Need for a MIME Directory Type - - For purposes of this document, a directory is a special-purpose - database that contains typed information. A directory usually - supports both read and search of the information it contains, and can - support creation and modification of the information as well. - Directory information is usually accessed far more often than it is - updated. Directories can be local or global in scope. They can be - distributed or centralized. The information they contain can be - replicated, with weak or strong consistency requirements. - - There are several situations in which users of Internet mail might - wish to exchange directory information: the email analogy of a - "business card" exchange; the conveyance of directory information to - a user having only email access to the Internet; the provision of - machine-parseable address information when purchasing goods or - services over the Internet; etc. As MIME [RFC-2045, RFC-2046] is - used increasingly by other protocols, most notably HTTP, it can also - be useful for these protocols to carry directory information in MIME - format. 
Such a format, for example, could be used to represent URC - (uniform resource characteristics) information about resources on the - World Wide Web, or to provide a rudimentary directory service over - HTTP. - - - - - -Howes, et. al. Standards Track [Page 3] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - -4. Overview - - The scheme defined here for representing directory information in a - MIME Content-Type has two parts. First, the text/directory Content- - Type is defined for use in holding directory information within a - single body part, for example name, title, or email address. In its - simplest form, the format uses a "type:value" approach, which should - be easily parseable by existing MIME implementations and - understandable by users. More complicated situations can be - represented also. This document defines the general form the - information in the Content-Type should have, and the procedure by - which specific types and values (properties) for particular - applications can be defined. The framework is general enough to - handle information from any number of end directory services, - including LDAP [RFC-1777, RFC-1778], WHOIS++ [RFC-1835], and X.500 - [X500]. - - Directory entries can include far more than just textual information. - Some such information (e.g., an image or sound) overlaps with - predefined MIME Content-Types. In these cases it can be desirable to - include the information in its well-known MIME format. This situation - is handled by using a multipart/related Content-Type as defined in - [RFC-2112]. The root component of this type is a text/directory body - part specifying any in-line information, and for information - contained in other Content-Types, the Content-IDs (in URI form) of - those parts. - - In some applications, it can be useful to include a pointer (e.g, a - URI) to some directory information rather than the information - itself. 
This document defines a general mechanism for accomplishing - this. - -5. The text/directory Content-Type - - The text/directory Content-Type is used to hold basic directory - information and URIs referencing other information, including other - MIME body parts holding supplementary or non-textual directory - information, such as an image or sound. It is defined as follows, - using the MIME media type registration template from [RFC-2048]. - - To: ietf-types@uninett.no - Subject: Registration of MIME media type text/directory - -5.1. MIME media type name - - MIME media type name: text - - - - - -Howes, et. al. Standards Track [Page 4] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - -5.2. MIME subtype name - - MIME subtype name: directory - -5.3. Required parameters - - Required parameters: charset - - The "charset" parameter is as defined in [RFC-2046] for other body - parts. It is used to identify the default character set used within - the body part. - -5.4. Optional parameters - - Optional parameters: profile - - The "profile" parameter is used to convey the type(s) of entity(ies) - to which the directory information pertains and the likely set of - information associated with the entity(ies). It is intended only as a - guide to applications interpreting the information contained within - the body part. It SHOULD NOT be used to exclude or require particular - pieces of information unless a profile definition specifically calls - for this behavior. Unless specifically forbidden by a particular - profile definition, a text/directory content type can contain - arbitrary attribute/value pairs. - - The value of the "profile" parameter is defined as follows. Profile - names are case insensitive (i.e., the profile name "vCard" is the - same as "VCARD" and "vcard" and "vcArD"). 
- - profile = x-name / iana-token - - x-name = "x-" 1*(ALPHA / DIGIT / "-") - ; Names beginning with "x-" or "X-" are - ; reserved for experimental use not intended for released - ; products, or for use in bilateral agreements. - - iana-token = <a publicly-defined extension token, registered - with IANA, as specified in Section 9 of this - document> - -5.5. Encoding considerations - - The default encoding is 8bit. Otherwise, as specified by the - Content-Transfer-Encoding header field. - - - - - - -Howes, et. al. Standards Track [Page 5] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - -5.6. Security considerations - - Directory information can be public or it can be protected from - unauthorized access by the directory service in which it resides. - Once the information leaves its native service, there can be no - guarantee that the same care will be taken by all services handling - the information. Furthermore, this specification defines no access - control mechanism by which information can be protected, or by which - access control information can be conveyed. Note that the integrity - and privacy of a text/directory body part can be protected by - enclosing it within an appropriate MIME-based security mechanism. - -5.7. Interoperability considerations - - In order to make sense of directory information, applications must - share a common understanding of the types of information contained - within the Content-Type (the directory schema). This schema - information is not defined in this document, but rather in companion - documents (e.g., [MIME-VCARD]) that follow the requirements specified - in this document, or in bilateral agreements between communicating - parties. - -5.8. Published specification - - The text/directory Content-Type contains directory information, - typically pertaining to a single directory entity or group of - entities. The content consists of one or more lines in the format - given below. - -5.8.1. 
Line delimiting and folding - - Individual lines within the MIME text/directory Content Type body are - delimited by the [RFC-822] line break, which is a CRLF sequence - (ASCII decimal 13, followed by ASCII decimal 10). Long logical lines - of text can be split into a multiple-physical-line representation - using the following folding technique. - - A logical line MAY be continued on the next physical line anywhere - between two characters by inserting a CRLF immediately followed by a - single white space character (space, ASCII decimal 32, or horizontal - tab, ASCII decimal 9). At least one character must be present on the - folded line. Any sequence of CRLF followed immediately by a single - white space character is ignored (removed) when processing the - content type. For example the line: - - DESCRIPTION:This is a long description that exists on a long line. - - Can be represented as: - - - -Howes, et. al. Standards Track [Page 6] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - DESCRIPTION:This is a long description - that exists on a long line. - - It could also be represented as: - - DESCRIPTION:This is a long descrip - tion that exists o - n a long line. - - The process of moving from this folded multiple-line representation - of a type definition to its single line representation is called - unfolding. Unfolding is accomplished by regarding CRLF immediately - followed by a white space character (namely HTAB ASCII decimal 9 or - SPACE ASCII decimal 32) as equivalent to no characters at all (i.e., - the CRLF and single white space character are removed). - -5.8.2. ABNF content-type definition - - The following ABNF uses the notation of RFC 2234, which also defines - CRLF, WSP, DQUOTE, VCHAR, ALPHA, and DIGIT. 
After the unfolding of - any folded lines as described above, the syntax for a line of this - content type is as follows: - - contentline = [group "."] name *(";" param) ":" value CRLF - ; When parsing a content line, folded lines MUST first - ; be unfolded according to the unfolding procedure - ; described above. - ; When generating a content line, lines longer than 75 - ; characters SHOULD be folded according to the folding - ; procedure described above. - - group = 1*(ALPHA / DIGIT / "-") - - name = x-name / iana-token - - iana-token = 1*(ALPHA / DIGIT / "-") - ; identifier registered with IANA - - x-name = "x-" 1*(ALPHA / DIGIT / "-") - ; Names that begin with "x-" or "X-" are - ; reserved for experimental use, not intended for released - ; products, or for use in bilateral agreements. - - param = param-name "=" param-value *("," param-value) - - param-name = x-name / iana-token - - param-value = ptext / quoted-string - - - -Howes, et. al. Standards Track [Page 7] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - ptext = *SAFE-CHAR - - value = *VALUE-CHAR - / valuespec ; valuespec defined in section 5.8.4 - - quoted-string = DQUOTE *QSAFE-CHAR DQUOTE - - NON-ASCII = %x80-FF - ; use restricted by charset parameter - ; on outer MIME object (UTF-8 preferred) - - QSAFE-CHAR = WSP / %x21 / %x23-7E / NON-ASCII - ; Any character except CTLs, DQUOTE - - SAFE-CHAR = WSP / %x21 / %x23-2B / %x2D-39 / %x3C-7E / NON-ASCII - ; Any character except CTLs, DQUOTE, ";", ":", "," - - VALUE-CHAR = WSP / VCHAR / NON-ASCII - ; any textual character - - A line that begins with a white space character is a continuation of - the previous line, as described above. The white space character and - immediately preceeding CRLF should be discarded when reconstructing - the original line. 
-   Note that this line-folding convention differs
-   from that found in RFC 822, in that the sequence <CRLF><WSP> found
-   anywhere in the content indicates a continued line and should be
-   removed.
-
-   Various type names and the format of the corresponding values are
-   defined as specified in Section 11. Specifications MAY impose
-   ordering on the type constructs within a body part, though none is
-   required by default. The various x-name constructs are used for
-   bilaterally-agreed upon type names, parameter names and parameter
-   values, or for use in experimental settings.
-
-   Type names and parameter names are case insensitive (e.g., the type
-   name "fn" is the same as "FN" and "Fn"). Parameter values MAY be case
-   sensitive or case insensitive, depending on their definition.
-
-   The group construct is used to group related attributes together.
-   The group name is a syntactic convention used to indicate that all
-   type names prefaced with the same group name SHOULD be grouped
-   together when displayed by an application. It has no other
-   significance. Implementations that do not understand or support
-   grouping MAY simply strip off any text before a "." to the left of
-   the type name and present the types and values as normal.
-
-   Each attribute defined in the text/directory body MAY have multiple
-   values, if allowed in the definition of the profile in which the
-   attribute is used. The general rule for encoding multi-valued items
-   is to simply create a new content line for each value (including the
-   type name). However, it should be noted that some value types
-   support encoding multiple values in a single content line by
-   separating the values with a comma ",". This approach has been taken
-   for several of the content types defined below (date, time, integer,
-   float), for space-saving reasons.
-
-5.8.3.  Pre-defined Parameters
-
-   The following parameters and value types are defined for general use.
-
-   predefined-param = encodingparm
-                    / valuetypeparm
-                    / languageparm
-                    / contextparm
-
-   encodingparm = "encoding" "=" encodingtype
-
-   encodingtype = "b"        ; from RFC 2047
-              / iana-token   ; registered as described in
-                             ; section 15 of this document
-
-   valuetypeparm = "value" "=" valuetype
-
-   valuetype = "uri"         ; genericurl from section 5 of RFC 1738
-             / "text"
-             / "date"
-             / "time"
-             / "date-time"   ; date time
-             / "integer"
-             / "boolean"
-             / "float"
-             / x-name
-             / iana-token    ; registered as described in
-                             ; section 15 of this document
-
-   languageparm = "language" "=" Language-Tag
-       ; Language-Tag is defined in section 2 of RFC 1766
-
-   contextparm = "context" "=" context
-
-   context = x-name
-           / iana-token
-
-   The "language" type parameter is used to identify data in multiple
-   languages. There is no concept of "default" language, except as
-   specified by any "Content-Language" MIME header parameter that is
-   present. The value of the "language" type parameter is a language
-   tag as defined in Section 2 of [RFC-1766].
-
-   The "context" type parameter is used to identify a context (e.g., a
-   protocol) used in interpreting the value. This is used, for example,
-   in the "source" type, defined below.
-
-   The "encoding" type parameter is used to specify an alternate
-   encoding for a value. If the value contains a CRLF, it must be
-   encoded, since CRLF is used to separate lines in the content-type
-   itself. Currently, only the "b" encoding is supported.
-
-   The "b" encoding can also be useful for binary values that are mixed
-   with other text information in the body part (e.g., a certificate).
-   Using a per-value "b" encoding in this case leaves the other
-   information in a more readable form.
-   The encoded base 64 value can be
-   split across multiple physical lines in the content type by using the
-   line folding technique described above.
-
-   The Content-Transfer-Encoding header field is used to specify the
-   encoding used for the body part as a whole. The "encoding" type
-   parameter is used to specify an encoding for a particular value
-   (e.g., a certificate). In this case, the Content-Transfer-Encoding
-   header might specify "8bit", while the one certificate value might
-   specify an encoding of "b" via an "encoding=b" type parameter.
-
-   The Content-Transfer-Encoding and the encodings of individual types
-   given by the "encoding" type parameter are independent of one
-   another. When encoding a text/directory body part for transmission,
-   individual type encodings are performed first, then the entire body
-   part is encoded according to the Content-Transfer-Encoding. When
-   decoding a text/directory body part, the Content-Transfer-Encoding is
-   decoded first, and then any individual types with an "encoding" type
-   parameter are decoded.
-
-   The "value" parameter is optional, and is used to identify the value
-   type (data type) and format of the value. The use of these
-   predefined formats is encouraged even if the value parameter is not
-   explicitly used. By defining a standard set of value types and their
-   formats, existing parsing and processing code can be leveraged.
-
-   Including the value type explicitly as part of each property provides
-   an extra hint to keep parsing simple and support more generalized
-   applications. For example a search engine would not have to know the
-   particular value types for all of the items for which it is
-   searching. Because the value type is explicit in the definition, the
-   search engine could look for dates in any item type and provide
-   results that can still be interpreted.
-
-5.8.4.  Pre-defined Value Types
-
-   The formats for values corresponding to the predefined valuetype
-   specifications given above are defined as follows.
-
-   valuespec =  text-list
-              / genericurl       ; from section 5 of RFC 1738
-              / date-list
-              / time-list
-              / date-time-list
-              / boolean
-              / integer-list
-              / float-list
-              / iana-valuespec
-
-   text-list = *TEXT-LIST-CHAR *("," *TEXT-LIST-CHAR)
-
-   TEXT-LIST-CHAR = "\\" / "\," / "\n"
-                  / <any VALUE-CHAR except , or \ or newline>
-       ; Backslashes, newlines, and commas must be encoded.
-       ; \n or \N can be used to encode a newline.
-
-   date-list = date *("," date)
-
-   time-list = time *("," time)
-
-   date-time-list = date "T" time *("," date "T" time)
-
-   boolean = "TRUE" / "FALSE"
-
-   integer-list = integer *("," integer)
-
-   integer = [sign] 1*DIGIT
-
-   float-list = float *("," float)
-
-   float = [sign] 1*DIGIT ["." 1*DIGIT]
-
-   sign = "+" / "-"
-
-   date = date-fullyear ["-"] date-month ["-"] date-mday
-
-   date-fullyear = 4 DIGIT
-
-   date-month = 2 DIGIT     ;01-12
-
-   date-mday = 2 DIGIT      ;01-28, 01-29, 01-30, 01-31
-                            ;based on month/year
-
-   time = time-hour [":"] time-minute [":"] time-second [time-secfrac]
-           [time-zone]
-
-   time-hour = 2 DIGIT      ;00-23
-
-   time-minute = 2 DIGIT    ;00-59
-
-   time-second = 2 DIGIT    ;00-60 (leap second)
-
-   time-secfrac = "," 1*DIGIT
-
-   time-zone = "Z" / time-numzone
-
-   time-numzone = sign time-hour [":"] time-minute
-
-   iana-valuespec = <a publicly-defined valuetype format, registered
-                     with IANA, as defined in section 15 of this
-                     document>
-
-   Some specific notes on the value types and formats:
-
-   "text": The "text" value type should be used to identify values that
-   contain human-readable text. The character set and language in which
-   the text is represented is controlled by the charset content-header
-   and the language type parameter and content-header.
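The text-list rule above (backslash, comma, and newline must be encoded; \n or \N encodes a newline) can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the specification: the names `escape_text` and `split_text_list` are hypothetical, a newline is taken to be a single LF character, and an unrecognized backslash sequence is kept verbatim because the grammar does not say how to treat it.

```python
def escape_text(text):
    """Encode one text value: backslash first, then comma, then newline."""
    return text.replace("\\", "\\\\").replace(",", "\\,").replace("\n", "\\n")


def split_text_list(value):
    """Split a text-list on unescaped commas and decode each item."""
    items, current = [], []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            if nxt in "nN":
                current.append("\n")        # \n or \N is an encoded newline
            elif nxt in "\\,":
                current.append(nxt)         # \\ and \, decode to the bare character
            else:
                current.append(ch + nxt)    # assumption: unknown escapes pass through
            i += 2
        elif ch == ",":
            items.append("".join(current))  # unescaped comma ends the current item
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    items.append("".join(current))
    return items
```

Escaping the backslash before the other two characters matters: doing it last would double the backslashes that the comma and newline escapes just introduced.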
- - Examples for "text": - this is a text value - this is one value,this is another - this is a single value\, with a comma encoded - - A formatted text line break in a text value type MUST be represented - as the character sequence backslash (ASCII decimal 92) followed by a - Latin small letter n (ASCII decimal 110) or a Latin capital letter N - (ASCII decimal 78), that is "\n" or "\N". - - For example a multiple line DESCRIPTION value of: - - Mythical Manager - Hyjinx Software Division - BabsCo, Inc. - - could be represented as: - - - -Howes, et. al. Standards Track [Page 12] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - DESCRIPTION:Mythical Manager\nHyjinx Software Division\n - BabsCo\, Inc.\n - - demonstrating the \n literal formatted line break technique, the - CRLF-followed-by-space line folding technique, and the backslash - escape technique. - - "uri": The "uri" value type should be used to identify values that - are referenced by a URI (including a Content-ID URI), instead of - encoded in-line. These value references might be used if the value is - too large, or otherwise undesirable to include directly. The format - for the URI is as defined in RFC 1738. - - Examples for "uri": - http://www.foobar.com/my/picture.jpg - ldap://ldap.foobar.com/cn=babs%20jensen - - "date", "time", and "date-time": Each of these value types is based - on a subset of the definitions in ISO 8601 standard. Profiles MAY - place further restrictions on "date" and "time" values. Multiple - "date" and "time" values can be specified using the comma-separated - notation, unless restricted by a profile. 
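A rough Python sketch of how the date, time, date-time, integer, and float list grammars from section 5.8.4 might be checked with regular expressions. It is an illustration only: `PATTERNS` and `is_valid_list` are hypothetical names, day-of-month is not checked against the month, and the fraction of a second is matched with "." as in the examples for "time", although the grammar writes time-secfrac with ",", which would be ambiguous with the list separator.

```python
import re

DATE = r"\d{4}-?(?:0[1-9]|1[0-2])-?(?:0[1-9]|[12]\d|3[01])"
ZONE = r"(?:Z|[+-](?:[01]\d|2[0-3]):?[0-5]\d)"
# Seconds may be 60 (leap second); the fraction uses "." (see note above).
TIME = rf"(?:[01]\d|2[0-3]):?[0-5]\d:?(?:[0-5]\d|60)(?:\.\d+)?{ZONE}?"

PATTERNS = {
    "date": re.compile(rf"{DATE}\Z"),
    "time": re.compile(rf"{TIME}\Z"),
    "date-time": re.compile(rf"{DATE}T{TIME}\Z"),
    "integer": re.compile(r"[+-]?\d+\Z"),
    "float": re.compile(r"[+-]?\d+(?:\.\d+)?\Z"),
}


def is_valid_list(valuetype, value):
    """Return True if every comma-separated item matches the value type."""
    pattern = PATTERNS[valuetype]
    return all(pattern.match(item) for item in value.split(","))
```

A profile that restricts these types further, or forbids the comma-separated form, would need stricter checks than this sketch applies.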
-
-   Examples for "date":
-               1985-04-12
-               1996-08-05,1996-11-11
-               19850412
-
-   Examples for "time":
-               10:22:00
-               102200
-               10:22:00.33
-               10:22:00.33Z
-               10:22:33,11:22:00
-               10:22:00-08:00
-
-   Examples for "date-time":
-               1996-10-22T14:00:00Z
-               1996-08-11T12:34:56Z
-               19960811T123456Z
-               1996-10-22T14:00:00Z,1996-08-11T12:34:56Z
-
-   "boolean": The "boolean" value type is used to express boolean values.
-   These values are case insensitive.
-
-   Examples:   TRUE
-               false
-               True
-
-   "integer": The "integer" value type is used to express signed
-   integers in decimal format. If sign is not specified, the value is
-   assumed positive "+". Multiple "integer" values can be specified
-   using the comma-separated notation, unless restricted by a profile.
-
-   Examples:   1234567890
-               -1234556790
-               +1234556790,432109876
-
-   "float": The "float" value type is used to express real numbers. If
-   sign is not specified, the value is assumed positive "+". Multiple
-   "float" values can be specified using the comma-separated notation,
-   unless restricted by a profile.
-
-   Examples:   20.30
-               1000000.0000001
-               1.333,3.14
-
-5.9.  Applications which use this media type
-
-   Applications which use this media type: Various
-
-5.10.  Additional information
-
-   Additional information: None
-
-5.11.  Person & email address to contact for further information
-
-   Tim Howes
-   Netscape Communications Corp.
-   501 East Middlefield Rd.
-   Mountain View, CA 94041
-   USA
-   howes@netscape.com
-   +1 415 937 3419
-
-5.12.  Intended usage
-
-   Intended usage: COMMON
-
-5.13.  Author/Change controller
-
-   Tim Howes
-   Netscape Communications Corp.
-   501 East Middlefield Rd.
- Mountain View, CA 94041 - USA - howes@netscape.com - +1 415 937 3419 - - Mark Smith - Netscape Communications Corp. - 501 East Middlefield Rd. - Mountain View, CA 94041 - USA - mcs@netscape.com - +1 415 937 3477 - - Frank Dawson - Lotus Development Corporation - 6544 Battleford Drive - Raleigh, NC 27613-3502 - USA - frank_dawson@lotus.com - +1-919-676-9515 - -6. Predefined Types - - The following types are generally useful regardless of the profile - being carried and are defined below using the text/directory MIME - type registration template defined in Section 11.1 of this document. - These types MAY be included in any profile, unless explicitly - forbidden in the profile definition. - -6.1. SOURCE Type Definition - - To: ietf-mime-direct@imc.org - Subject: Registration of text/directory MIME type SOURCE - - Type name: SOURCE - - Type purpose: To identify the source of directory information - contained in the content type. - - Type encoding: 8bit - - Type valuetype: uri - - - - -Howes, et. al. Standards Track [Page 15] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - Type special notes: The SOURCE type is used to provide the means by - which applications knowledgable in the given directory service - protocol can obtain additional or more up-to-date information from - the directory service. It contains a URI as defined in [RFC-1738] - and/or other information referencing the directory entity or entities - to which the information pertains. When directory information is - available from more than one source, the sending entity can pick what - it considers to be the best source, or multiple SOURCE types can be - included. The interpretation of the value for a SOURCE type can - depend on the setting of the CONTEXT type parameter. The value of the - CONTEXT type parameter MUST be compatible with the value of the uri - prefix. - - Type example: - SOURCE;CONTEXT=LDAP:ldap://ldap.host/cn=Babs%20Jensen, - %20o=Babsco,%20c=US - -6.2. 
NAME Type Definition - - To: ietf-mime-direct@imc.org - Subject: Registration of text/directory MIME type NAME - - Type name: NAME - - Type purpose: To identify the displayable name of the directory - entity to which information in the content type pertains. - - Type encoding: 8bit - - Type valuetype: text - - Type special notes: The NAME type is used to convey the display name - of the entity to which the directory information pertains. - - Type example: - NAME:Babs Jensen's Contact Information - -6.3. PROFILE Type Definition - - To: ietf-mime-direct@imc.org - Subject: Registration of text/directory MIME type PROFILE - - Type name: PROFILE - - Type purpose: To identify the type of directory entity to which - information in the content type pertains. - - Type encoding: 8bit - - - -Howes, et. al. Standards Track [Page 16] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - Type valuetype: A profile name, registered as described in Section 9 - of this document or bilaterally agreed upon as described in Section - 5. - - Type special notes: The PROFILE type is used to convey the type of - the entity to which the directory information in the rest of the body - part pertains. It should be the same as the "profile" header - parameter, if present. - - Type example: - PROFILE:vCard - -6.4. BEGIN Type Definition - - To: ietf-mime-direct@imc.org - Subject: Registration of text/directory MIME type BEGIN - - Type name: BEGIN - - Type purpose: To denote the beginning of a syntactic entity within a - text/directory content-type. - - Type encoding: 8bit - - Type valuetype: text, containing a profile name, registered as - described in Section 9 of this document or bilaterally-agreed upon as - described in Section 5. - - Type special notes: The BEGIN type is used in conjunction with the - END type to delimit a profile containing a related set of properties - within an text/directory content-type. 
This construct can be used - instead of or in addition to wrapping separate sets of information - inside additional MIME headers. It is provided for applications that - wish to define content that can contain multiple entities within the - same text/directory content-type or to define content that can be - identifiable outside of a MIME environment. - - Type example: - BEGIN:VCARD - -6.5. END Type Definition - - To: ietf-mime-direct@imc.org - Subject: Registration of text/directory MIME type END - - Type name: END - - - - - -Howes, et. al. Standards Track [Page 17] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - Type purpose: To denote the end of a syntactic entity within a - text/directory content-type. - - Type encoding: 8bit - - Type valuetype: text, containing a profile name, registered as - described in Section 9 of this document or bilaterally-agreed upon as - described in Section 5. - - Type special notes: The END type is used in conjunction with the - BEGIN type to delimit a profile containing a related set of - properties within an text/directory content-type. This construct can - be used instead of or in addition to wrapping separate sets of - information inside additional MIME headers. It is provided for - applications that wish to define content that can contain multiple - entities within the same text/directory content-type or to define - content that can be identifiable outside of a MIME environment. - - Type example: - END: VCARD - -7. Use of the multipart/related Content-Type - - The multipart/related Content-Type can be used to hold directory - information comprised of both text and non-text information or - directory information that already has a natural MIME representation. - The root body part within the multipart/related body part is - specified as defined in [RFC-2112] by a "start" parameter, or it is - the first body part in the absence of such a parameter. 
The root - body part must have a Content-Type of "text/directory". This part - holds inline information and makes reference to subsequent body parts - holding additional text or non-text directory information via their - Content-ID URIs as explained in Section 5. - - The body parts referred to do not have to be in any particular order, - except as noted above for the root body part. - -8. Examples - - The following examples are for illustrative purposes only and are not - part of the definition. - - - - - - - - - - -Howes, et. al. Standards Track [Page 18] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - -8.1. Example 1 - - The first example illustrates simple use of the text/directory - Content-Type. Note that no "profile" parameter is given, so an - application may not know what kind of directory entity the - information applies to. Note also the use of both hypothetical - official and bilaterally agreed upon types. - - From: Whomever@wherever.com - To: Someone@somewhere.com - Subject: whatever - MIME-Version: 1.0 - Message-ID: <id1@host.net> - Content-Type: text/directory - Content-ID: <id2@host.com> - - cn:Babs Jensen - cn:Barbara J Jensen - sn:Jensen - email:babs@umich.edu - phone:+1 313 747-4454 - x-id:1234567890 - -8.2. Example 2 - - The next example illustrates the use of the Quoted-Printable transfer - encoding defined in [RFC 2045] to include non-ASCII character in some - of the information returned, and the use of the optional "name" and - "source" types. It also illustrates the use of an "encoding" type - parameter to encode a certificate value in "b". A "vCard" profile - [MIME- VCARD] is used for the example. 
- -Content-Type: text/directory; - charset="iso-8859-1"; - profile="vCard" -Content-ID: <id3@host.com> -Content-Transfer-Encoding: Quoted-Printable - -begin:VCARD -source:ldap://cn=bjorn%20Jensen, o=university%20of%20Michigan, c=US -name:Bjorn Jensen -fn:Bj=F8rn Jensen -n:Jensen;Bj=F8rn -email;type=internet:bjorn@umich.edu -tel;type=work,voice,msg:+1 313 747-4454 -key;type=x509;encoding=B:dGhpcyBjb3VsZCBiZSAKbXkgY2VydGlmaWNhdGUK -end:VCARD - - - - -Howes, et. al. Standards Track [Page 19] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - -8.3. Example 3 - - The next example illustrates the use of multi-valued type parameters, - the "language" type parameter, the "value" type parameter, folding of - long lines, the \n encoding for formatted lines, attribute grouping, - and the inline "b" encoding. A "vCard" profile [MIME-VCARD] is used - for the example. - -Content-Type: text/directory; profile="vcard"; charset=iso-8859-1 -Content-ID: <id3@host.com> -Content-Transfer-Encoding: Quoted-Printable - -begin:vcard -source:ldap://cn=Meister%20Berger,o=Universitaet%20Goerlitz,c=DE -name:Meister Berger -fn:Meister Berger -n:Berger;Meister -bday;value=date:1963-09-21 -o:Universit=E6t G=F6rlitz -title:Mayor -title;language=de;value=text:Burgermeister -note:The Mayor of the great city of - Goerlitz in the great country of Germany. 
-email;internet:mb@goerlitz.de -home.tel;type=fax,voice,msg:+49 3581 123456 -home.label:Hufenshlagel 1234\n - 02828 Goerlitz\n - Deutschland -key;type=X509;encoding=b:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcNAQEEBQ - AwdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENvbW11bmljYXRpb25zI - ENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0ZW1zMRwwGgYDVQQD - ExNyb290Y2EubmV0c2NhcGUuY29tMB4XDTk3MDYwNjE5NDc1OVoXDTk3MTIwMzE5NDc - 1OVowgYkxCzAJBgNVBAYTAlVTMSYwJAYDVQQKEx1OZXRzY2FwZSBDb21tdW5pY2F0aW - 9ucyBDb3JwLjEYMBYGA1UEAxMPVGltb3RoeSBBIEhvd2VzMSEwHwYJKoZIhvcNAQkBF - hJob3dlc0BuZXRzY2FwZS5jb20xFTATBgoJkiaJk/IsZAEBEwVob3dlczBcMA0GCSqG - SIb3DQEBAQUAA0sAMEgCQQC0JZf6wkg8pLMXHHCUvMfL5H6zjSk4vTTXZpYyrdN2dXc - oX49LKiOmgeJSzoiFKHtLOIboyludF90CgqcxtwKnAgMBAAGjNjA0MBEGCWCGSAGG+E - IBAQQEAwIAoDAfBgNVHSMEGDAWgBT84FToB/GV3jr3mcau+hUMbsQukjANBgkqhkiG9 - w0BAQQFAAOBgQBexv7o7mi3PLXadkmNP9LcIPmx93HGp0Kgyx1jIVMyNgsemeAwBM+M - SlhMfcpbTrONwNjZYW8vJDSoi//yrZlVt9bJbs7MNYZVsyF1unsqaln4/vy6Uawfg8V - UMk1U7jt8LYpo4YULU7UZHPYVUaSgVttImOHZIKi4hlPXBOhcUQ== -end:vcard - - - - - - - - - -Howes, et. al. Standards Track [Page 20] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - -8.4. Example 4 - - The final example illustrates the use of the multipart/related - Content-Type to include non-textual directory data via the "uri" - encoding to refer to other body parts within the same message, or to - external values. Note that no "profile" parameter is given, so an - application may not know what kind of directory entity the - information applies to. Note also the use of both hypothetical - official and bilaterally agreed upon types. 
- -Content-Type: multipart/related; - boundary=woof; - type="text/directory"; - start="<id5@host.com>" -Content-ID: <id4@host.com> - ---woof -Content-Type: text/directory; charset="iso-8859-1" -Content-ID: <id5@host.com> -Content-Transfer-Encoding: Quoted-Printable - -source:ldap://cn=Bjorn%20Jensen,o=University%20of%20Michigan,c=US -cn:Bj=F8rn Jensen -sn:Jensen -email:bjorn@umich.edu -image;value=uri:cid:id6@host.com -image;value=uri;format=jpeg:ftp://some.host/some/path.jpg -sound;value=uri:cid:id7@host.com -phone:+1 313 747-4454 - ---woof -Content-Type: image/jpeg -Content-ID: <id6@host.com> - -<...image data...> - ---woof -Content-Type: message/external-body; - name="myvoice.au"; - site="myhost.com"; - access-type=ANON-FTP; - directory="pub/myname"; - mode="image" - -Content-Type: audio/basic -Content-ID: <id7@host.com> - ---woof-- - - - -Howes, et. al. Standards Track [Page 21] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - -9. Registration of new profiles - - This section defines procedures by which new profiles are registered - with the IANA and made available to the Internet community. Note that - non-IANA profiles can be used by bilateral agreement, provided the - associated profile names follow the "X-" convention defined above. - - The procedures defined here are designed to allow public comment and - review of new profiles, while posing only a small impediment to the - definition of new profiles. - - Registration of a new profile is accomplished by the following steps. - -9.1. Define the profile - - A profile is defined by completing the following template. - - To: ietf-mime-direct@imc.org - Subject: Registration of text/directory MIME profile XXX - - Profile name: - - Profile purpose: - - Profile types: - - Profile special notes (optional): - - Intended usage: (one of COMMON, LIMITED USE or OBSOLETE) - - The explanation of what goes in each field in the template follows. 
- - Profile name: The name of the profile as it will appear in the - text/directory MIME Content-Type "profile" header parameter, or the - predefined "profile" type name. - - Profile purpose: The purpose of the profile (e.g., to represent - information about people, printers, documents, etc.). Give a short - but clear description. - - Profile types: The list of types associated with the profile. This - list of types is to be expected but not required in the profile, - unless otherwise noted in the profile definition. Other types not - mentioned in the profile definition MAY also be present. Note that - any new types referenced by the profile MUST be defined separately as - described in Section 10. - - - - - -Howes, et. al. Standards Track [Page 22] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - Profile special notes: Any special notes about the profile, how it is - to be used, etc. This section of the template can also be used to - define an ordering on the types that appear in the Content-Type, if - such an ordering is required. - -9.2. Post the profile definition - - The profile description must be posted to the new profile discussion - list, ietf-mime-direct@imc.org - -9.3. Allow a comment period - - Discussion on the new profile must be allowed to take place on the - list for a minimum of two weeks. Consensus must be reached on the - profile before proceeding to step 4. - -9.4. Submit the profile for approval - - Once the two-week comment period has elapsed, and the proposer is - convinced consensus has been reached on the profile, the registration - application should be submitted to the Profile Reviewer for approval. - The Profile Reviewer is appointed by the Application Area Directors - and can either accept or reject the profile registration. An accepted - registration is passed on by the Profile Reviewer to the IANA for - inclusion in the official IANA profile registry. 
The registration may - be rejected for any of the following reasons. 1) Insufficient comment - period; 2) Consensus not reached; 3) Technical deficiencies raised on - the list or elsewhere have not been addressed. The Profile Reviewer's - decision to reject a profile can be appealed by the proposer to the - IESG, or the objections raised can be addressed by the proposer and - the profile resubmitted. - -10. Profile Change Control - - Existing profiles can be changed using the same process by which they - were registered. - - Define the change - - Post the change - - Allow a comment period - - Submit the changed profile for approval - - - - - - - -Howes, et. al. Standards Track [Page 23] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - Note that the original author or any other interested party can - propose a change to an existing profile, but that such changes should - only be proposed when there are serious omissions or errors in the - published specification. The Profile Reviewer can object to a change - if it is not backwards compatible, but is not required to do so. - - Profile definitions can never be deleted from the IANA registry, but - profiles which are no longer believed to be useful can be declared - OBSOLETE by a change to their "intended use" field. - -11. Registration of new types - - This section defines procedures by which new types are registered - with the IANA. Note that non-IANA types can be used by bilateral - agreement, provided the associated types names follow the "X-" - convention defined above. - - The procedures defined here are designed to allow public comment and - review of new types, while posing only a small impediment to the - definition of new types. - - Registration of a new type is accomplished by the following steps. - -11.1. Define the type - - A type is defined by completing the following template. 
- - To: ietf-mime-direct@imc.org - Subject: Registration of text/directory MIME type XXX - - Type name: - - Type purpose: - - Type encoding: - - Type valuetype: - - Type special notes (optional): - - Intended usage: (one of COMMON, LIMITED USE or OBSOLETE) - - The meaning of each field in the template is as follows. - - Type name: The name of the type, as it will appear in the body of an - text/directory MIME Content-Type "type: value" line to the left of - the colon ":". - - - - -Howes, et. al. Standards Track [Page 24] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - Type purpose: The purpose of the type (e.g., to represent a name, - postal address, IP address, etc.). Give a short but clear - description. - - Type encoding: The default encoding a value of the type must have in - the body of a text/directory MIME Content-Type. - - Type valuetype: The format a value of the type must have in the body - of a text/directory MIME Content-Type. This description must be - precise and must not violate the general encoding rules defined in - section 5 of this document. - - Type special notes: Any special notes about the type, how it is to be - used, etc. - -11.2. Post the type definition - - The type description must be posted to the new type discussion list, - ietf-mime-direct@imc.org - -11.3. Allow a comment period - - Discussion on the new type must be allowed to take place on the list - for a minimum of two weeks. Consensus must be reached on the type - before proceeding to step 4. - -11.4. Submit the type for approval - - Once the two-week comment period has elapsed, and the proposer is - convinced consensus has been reached on the type, the registration - application should be submitted to the Profile Reviewer for approval. - The Profile Reviewer is appointed by the Application Area Directors - and can either accept or reject the type registration. 
-   An accepted
-   registration is passed on by the Profile Reviewer to the IANA for
-   inclusion in the official IANA profile registry. The registration can
-   be rejected for any of the following reasons. 1) Insufficient comment
-   period; 2) Consensus not reached; 3) Technical deficiencies raised on
-   the list or elsewhere have not been addressed. The Profile
-   Reviewer's decision to reject a type can be appealed by the proposer
-   to the IESG, or the objections raised can be addressed by the
-   proposer and the type resubmitted.
-
-12.  Type Change Control
-
-   Existing types can be changed using the same process by which they
-   were registered.
-
-        Define the change
-
-        Post the change
-
-        Allow a comment period
-
-        Submit the type for approval
-
-   Note that the original author or any other interested party can
-   propose a change to an existing type, but that such changes should
-   only be proposed when there are serious omissions or errors in the
-   published specification. The Profile Reviewer can object to a change
-   if it is not backwards compatible, but is not required to do so.
-
-   Type definitions can never be deleted from the IANA registry, but
-   types which are no longer believed to be useful can be declared
-   OBSOLETE by a change to their "intended use" field.
-
-13.  Registration of new parameters
-
-   This section defines procedures by which new parameters are
-   registered with the IANA and made available to the Internet
-   community. Note that non-IANA parameters can be used by bilateral
-   agreement, provided the associated parameter names follow the "X-"
-   convention defined above.
-
-   The procedures defined here are designed to allow public comment and
-   review of new parameters, while posing only a small impediment to the
-   definition of new parameters.
- - Registration of a new parameter is accomplished by the following - steps. - -13.1. Define the parameter - - A parameter is defined by completing the following template. - - To: ietf-mime-direct@imc.org - Subject: Registration of text/directory MIME type parameter XXX - - Parameter name: - - Parameter purpose: - - - -Howes, et. al. Standards Track [Page 26] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - Parameter values: - - Parameter special notes (optional): - - Intended usage: (one of COMMON, LIMITED USE or OBSOLETE) - - The explanation of what goes in each field in the template follows. - - Parameter name: The name of the parameter as it will appear in the - text/directory MIME Content-Type. - - Parameter purpose: The purpose of the parameter (e.g., to represent - the format of an image, type of a phone number, etc.). Give a short - but clear description. If defining a general paramemter like "format" - or "type" keep in mind that other applications might wish to extend - its use. - - Parameter values: The list or description of values associated with - the parameter. - - Parameter special notes: Any special notes about the parameter, how - it is to be used, etc. - -13.2. Post the parameter definition - - The parameter description must be posted to the new parameter - discussion list, ietf-mime-direct@imc.org - -13.3. Allow a comment period - - Discussion on the new parameter must be allowed to take place on the - list for a minimum of two weeks. Consensus must be reached on the - parameter before proceeding to step 4. - -13.4. Submit the parameter for approval - - Once the two-week comment period has elapsed, and the proposer is - convinced consensus has been reached on the parameter, the - registration application should be submitted to the Profile Reviewer - for approval. The Profile Reviewer is appointed by the Application - Area Directors and can either accept or reject the parameter - registration. 
-   An accepted registration is passed on by the Profile Reviewer to
-   the IANA for inclusion in the official IANA parameter registry. The
-   registration can be rejected for any of the following reasons. 1)
-   Insufficient comment period; 2) Consensus not reached; 3) Technical
-   deficiencies raised on the list or elsewhere have not been
-   addressed. The Profile Reviewer's decision to reject a parameter can
-   be appealed by the proposer to the IESG, or the objections raised
-   can be addressed by the proposer and the parameter registration
-   resubmitted.
-
-14. Parameter Change Control
-
-   Existing parameters can be changed using the same process by which
-   they were registered.
-
-        Define the change
-
-        Post the change
-
-        Allow a comment period
-
-        Submit the parameter for approval
-
-   Note that the original author or any other interested party can
-   propose a change to an existing parameter, but that such changes
-   should only be proposed when there are serious omissions or errors in
-   the published specification. The Profile Reviewer can object to a
-   change if it is not backwards compatible, but is not required to do
-   so.
-
-   Parameter definitions can never be deleted from the IANA registry,
-   but parameters which are no longer believed to be useful can be
-   declared OBSOLETE by a change to their "intended use" field.
-
-15. Registration of new value types
-
-   This section defines procedures by which new value types are
-   registered with the IANA and made available to the Internet
-   community. Note that non-IANA value types can be used by bilateral
-   agreement, provided the associated value type names follow the "X-"
-   convention defined above.
-
-   The procedures defined here are designed to allow public comment and
-   review of new value types, while posing only a small impediment to
-   the definition of new value types.
-
-   Registration of a new value type is accomplished by the following
-   steps.
-
-15.1. Define the value type
-
-   A value type is defined by completing the following template.
-
-   To: ietf-mime-direct@imc.org
-   Subject: Registration of text/directory MIME value type XXX
-
-   value type name:
-
-   value type purpose:
-
-   value type format:
-
-   value type special notes (optional):
-
-   Intended usage: (one of COMMON, LIMITED USE or OBSOLETE)
-
-   The explanation of what goes in each field in the template follows.
-
-   value type name: The name of the value type as it will appear in the
-   text/directory MIME Content-Type.
-
-   value type purpose: The purpose of the value type. Give a short but
-   clear description.
-
-   value type format: The definition of the format for the value,
-   usually using ABNF grammar.
-
-   value type special notes: Any special notes about the value type, how
-   it is to be used, etc.
-
-15.2. Post the value type definition
-
-   The value type description must be posted to the new value type
-   discussion list, ietf-mime-direct@imc.org.
-
-15.3. Allow a comment period
-
-   Discussion on the new value type must be allowed to take place on the
-   list for a minimum of two weeks. Consensus must be reached before
-   proceeding to step 4.
-
-15.4. Submit the value type for approval
-
-   Once the two-week comment period has elapsed, and the proposer is
-   convinced consensus has been reached on the value type, the
-   registration application should be submitted to the Profile Reviewer
-   for approval. The Profile Reviewer is appointed by the Application
-   Area Directors and can either accept or reject the value type
-   registration. An accepted registration should be passed on by the
-   Profile Reviewer to the IANA for inclusion in the official IANA value
-   type registry. The registration can be rejected for any of the
-   following reasons.
1) Insufficient comment period; 2) Consensus not - reached; 3) Technical deficiencies raised on the list or elsewhere - have not been addressed. The Profile Reviewer's decision to reject a - - - -Howes, et. al. Standards Track [Page 29] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - profile can be appealed by the proposer to the IESG, or the - objections raised can be addressed by the proposer and the value type - registration resubmitted. - -16. Security Considerations - - Internet mail is subject to many well known security attacks, - including monitoring, replay, and forgery. Care should be taken by - any directory service in allowing information to leave the scope of - the service itself, where any access controls can no longer be - guaranteed. Applications should also take care to display directory - data in a "safe" environment (e.g., PostScript-valued types). - -17. Acknowledgements - - The registration procedures defined here were shamelessly lifted from - the MIME registration RFC. - - The many valuable comments contributed by members of the IETF ASID - working group are gratefully acknowledged, as are the contributions - of the Versit Consortium. Chris Newman was especially helpful in - navigating the intricacies of ABNF lore. - -18. References - - [RFC-1777] Yeong, W., Howes, T., and S. Kille, "Lightweight - Directory Access Protocol", RFC 1777, March 1995. - - [RFC-1778] Howes, T., Kille, S., Yeong, W., and C. Robbins, "The - String Representation of Standard Attribute Syntaxes", - RFC 1778, March 1995. - - [RFC-822] Crocker, D., "Standard for the Format of ARPA Internet - Text Messages", STD 11, RFC 822, August 1982. - - [RFC-2045] Borenstein, N., and N. Freed, "Multipurpose Internet - Mail Extensions (MIME) Part One: Format of Internet - Message Bodies", RFC 2045, November 1996. - - [RFC-2046] Moore, K., "Multipurpose Internet Mail Extensions (MIME) - Part Two: Media Types", RFC 2046, November 1996. 
- - [RFC-2048] Freed, N., Klensin, J., and J. Postel, "Multipurpose - Internet Mail Extensions (MIME) Part Four: Registration - Procedures", RFC 2048, November 1996. - - [RFC-1766] Alvestrand, H., "Tags for the Identification of - Languages", RFC 1766, March 1995. - - - -Howes, et. al. Standards Track [Page 30] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - - [RFC-2112] Levinson, E., "The MIME Multipart/Related Content-type", - RFC 2112, March 1997. - - [X500] "Information Processing Systems - Open Systems - Interconnection - The Directory: Overview of Concepts, - Models and Services", ISO/IEC JTC 1/SC21, International - Standard 9594-1, 1988. - - [RFC-1835] Deutsch, P., Schoultz, R., Faltstrom, P., and C. Weider, - "Architecture of the WHOIS++ service", RFC 1835, August - 1995. - - [RFC-1738] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform - Resource Locators (URL)", RFC 1738, December 1994. - - [MIME-VCARD] Dawson, F., and T. Howes, "VCard MIME Directory - Profile", RFC 2426, September 1998. - - [VCARD] Internet Mail Consortium, "vCard - The Electronic - Business Card", Version 2.1, - http://www.imc.com/pdi/vcard-21.txt, September, 1996. - - [RFC-2119] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RFC-2234] Crocker, D., and P. Overell, "Augmented BNF for Syntax - Specifications: ABNF", RFC 2234, November 1997. - - - - - - - - - - - - - - - - - - - - - - - - -Howes, et. al. Standards Track [Page 31] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - -19. Authors' Addresses - - Tim Howes - Netscape Communications Corp. - 501 East Middlefield Rd. - Mountain View, CA 94041 - USA - - Phone: +1.415.937.3419 - EMail: howes@netscape.com - - - Mark Smith - Netscape Communications Corp. - 501 East Middlefield Rd. 
- Mountain View, CA 94041 - USA - - Phone: +1.415.937.3477 - EMail: mcs@netscape.com - - - Frank Dawson - Lotus Development Corporation - 6544 Battleford Drive - Raleigh, NC 27613 - USA - - Phone: +1-919-676-9515 - EMail: frank_dawson@lotus.com - - - - - - - - - - - - - - - - - - - - - -Howes, et. al. Standards Track [Page 32] - -RFC 2425 MIME Content-Type for Directory Information September 1998 - - -20. Full Copyright Statement - - Copyright (C) The Internet Society (1998). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - - - - - - - - - - - - - - - - - - - - - - - - -Howes, et. al. 
Standards Track [Page 33] - diff --git a/proto/rfc2426.txt b/proto/rfc2426.txt @@ -1,2355 +0,0 @@ - - - - - - -Network Working Group F. Dawson -Request for Comments: 2426 Lotus Development Corporation -Category: Standards Track T. Howes - Netscape Communications - September 1998 - - - vCard MIME Directory Profile - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1998). All Rights Reserved. - -Abstract - - This memo defines the profile of the MIME Content-Type [MIME-DIR] for - directory information for a white-pages person object, based on a - vCard electronic business card. The profile definition is independent - of any particular directory service or protocol. The profile is - defined for representing and exchanging a variety of information - about an individual (e.g., formatted and structured name and delivery - addresses, email address, multiple telephone numbers, photograph, - logo, audio clips, etc.). The directory information used by this - profile is based on the attributes for the person object defined in - the X.520 and X.521 directory services recommendations. The profile - also provides the method for including a [VCARD] representation of a - white-pages directory entry within the MIME Content-Type defined by - the [MIME-DIR] document. - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY" and "OPTIONAL" in this - document are to be interpreted as described in [RFC 2119]. 
- - - - - - - - - - - -Dawson & Howes Standards Track [Page 1] - -RFC 2426 vCard MIME Directory Profile September 1998 - - -Table of Contents - - Overview.........................................................3 - 1. THE VCARD MIME DIRECTORY PROFILE REGISTRATION.................4 - 2. MIME DIRECTORY FEATURES.......................................5 - 2.1 PREDEFINED TYPE USAGE ......................................5 - 2.1.1 BEGIN and END Type ......................................5 - 2.1.2 NAME Type ...............................................5 - 2.1.3 PROFILE Type ............................................5 - 2.1.4 SOURCE Type .............................................5 - 2.2 PREDEFINED TYPE PARAMETER USAGE ............................6 - 2.3 PREDEFINED VALUE TYPE USAGE ................................6 - 2.4 EXTENSIONS TO THE PREDEFINED VALUE TYPES ...................6 - 2.4.1 BINARY ..................................................6 - 2.4.2 VCARD ...................................................6 - 2.4.3 PHONE-NUMBER ............................................7 - 2.4.4 UTC-OFFSET ..............................................7 - 2.5 STRUCTURED TYPE VALUES .....................................7 - 2.6 LINE DELIMITING AND FOLDING ................................8 - 3. 
VCARD PROFILE FEATURES........................................8 - 3.1 IDENTIFICATION TYPES .......................................8 - 3.1.1 FN Type Definition ......................................8 - 3.1.2 N Type Definition .......................................9 - 3.1.3 NICKNAME Type Definition ................................9 - 3.1.4 PHOTO Type Definition ..................................10 - 3.1.5 BDAY Type Definition ...................................11 - 3.2 DELIVERY ADDRESSING TYPES .................................11 - 3.2.1 ADR Type Definition ....................................11 - 3.2.2 LABEL Type Definition ..................................13 - 3.3 TELECOMMUNICATIONS ADDRESSING TYPES .......................13 - 3.3.1 TEL Type Definition ....................................14 - 3.3.2 EMAIL Type Definition ..................................15 - 3.3.3 MAILER Type Definition .................................15 - 3.4 GEOGRAPHICAL TYPES ........................................16 - 3.4.1 TZ Type Definition .....................................16 - 3.4.2 GEO Type Definition ....................................16 - 3.5 ORGANIZATIONAL TYPES ......................................17 - 3.5.1 TITLE Type Definition ..................................17 - 3.5.2 ROLE Type Definition ...................................18 - 3.5.3 LOGO Type Definition ...................................18 - 3.5.4 AGENT Type Definition ..................................19 - 3.5.5 ORG Type Definition ....................................20 - 3.6 EXPLANATORY TYPES .........................................20 - 3.6.1 CATEGORIES Type Definition .............................20 - 3.6.2 NOTE Type Definition ...................................21 - 3.6.3 PRODID Type Definition .................................21 - 3.6.4 REV Type Definition ....................................22 - 3.6.5 SORT-STRING Type Definition ............................22 - - - -Dawson & Howes Standards Track [Page 2] - -RFC 
2426 vCard MIME Directory Profile September 1998 - - - 3.6.6 SOUND Type Definition ..................................23 - 3.6.7 UID Type Definition ....................................24 - 3.6.8 URL Type Definition ....................................25 - 3.6.9 VERSION Type Definition ................................25 - 3.7 SECURITY TYPES ............................................25 - 3.7.1 CLASS Type Definition ..................................26 - 3.7.2 KEY Type Definition ....................................26 - 3.8 EXTENDED TYPES ............................................27 - 4. FORMAL GRAMMAR...............................................27 - 5. DIFFERENCES FROM VCARD V2.1..................................37 - 6. ACKNOWLEDGEMENTS.............................................39 - 7. AUTHORS' ADDRESSES...........................................39 - 8. SECURITY CONSIDERATIONS......................................39 - 9. REFERENCES...................................................40 - 10. FULL COPYRIGHT STATEMENT....................................42 - -Overview - - The [MIME-DIR] document defines a MIME Content-Type for holding - different kinds of directory information. The directory information - can be based on any of a number of directory schemas. This document - defines a [MIME-DIR] usage profile for conveying directory - information based on one such schema; that of the white-pages type of - person object. - - The schema is based on the attributes for the person object defined - in the X.520 and X.521 directory services recommendations. The schema - has augmented the basic attributes defined in the X.500 series - recommendation in order to provide for an electronic representation - of the information commonly found on a paper business card. This - schema was first defined in the [VCARD] document. Hence, this [MIME- - DIR] profile is referred to as the vCard MIME Directory Profile. 
-
-   A directory entry based on this usage profile can include traditional
-   directory, white-pages information such as the distinguished name
-   used to uniquely identify the entry, a formatted representation of
-   the name used for user-interface or presentation purposes, both the
-   structured and presentation form of the delivery address, various
-   telephone numbers and organizational information associated with the
-   entry. In addition, traditional paper business card information such
-   as an image of an organizational logo or identity photograph can be
-   included in this person object.
-
-   The vCard MIME Directory Profile also provides support for
-   representing other important information about the person associated
-   with the directory entry. For instance, the date of birth of the
-   person; an audio clip describing the pronunciation of the name
-   associated with the directory entry, or some other application of the
-   digital sound; longitude and latitude geo-positioning information
-   related to the person associated with the directory entry; date and
-   time that the directory information was last updated; annotations
-   often written on a business card; Uniform Resource Locators (URL) for
-   a website; public key information. The profile also provides support
-   for non-standard extensions to the schema. This provides the
-   flexibility for implementations to augment the current capabilities
-   of the profile in a standardized way. More information about this
-   electronic business card format can be found in [VCARD].
-
-1. The vCard MIME Directory Profile Registration
-
-   This profile is identified by the following [MIME-DIR] registration
-   template information. Subsequent sections define the profile
-   definition.
- - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME profile VCARD - - Profile name: VCARD - - Profile purpose: To hold person object or white-pages type of - directory information. The person schema captured in the directory - entries is that commonly found in an electronic business card. - - Predefined MIME Directory value specifications used: uri, date, - date-time, float - - New value specifications: This profile places further constraints on - the [MIME-DIR] text value specification. In addition, it adds a - binary, phone-number, utc-offset and vcard value specifications. - - Predefined MIME Directory types used: SOURCE, NAME, PROFILE, BEGIN, - END. - - Predefined MIME Directory parameters used: ENCODING, VALUE, CHARSET, - LANGUAGE, CONTEXT. - - New types: FN, N, NICKNAME, PHOTO, BDAY, ADR, LABEL, TEL, EMAIL, - MAILER, TZ, GEO, TITLE, ROLE, LOGO, AGENT, ORG, CATEGORIES, NOTE, - PRODID, REV, SORT-STRING, SOUND, URL, UID, VERSION, CLASS, KEY - - New parameters: TYPE - - Profile special notes: The vCard object MUST contain the FN, N and - VERSION types. The type-grouping feature of [MIME-DIR] is supported - by this profile to group related vCard properties about a directory - - - -Dawson & Howes Standards Track [Page 4] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - entry. For example, vCard properties describing WORK or HOME related - characteristics can be grouped with a unique group label. - - The profile permits the use of non-standard types (i.e., those - identified with the prefix string "X-") as a flexible method for - implementations to extend the functionality currently defined within - this profile. - -2. MIME Directory Features - - The vCard MIME Directory Profile makes use of many of the features - defined by [MIME-DIR]. The following sections either clarify or - extend the content-type definition of [MIME-DIR]. 
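The profile special notes in the registration above require every vCard to contain the FN, N and VERSION types, allow a group label in front of a type name, and reserve the "X-" prefix for non-standard types. A minimal sketch of checking those rules follows. It is an illustration only, not part of the RFC or of rohrpost: the function names are hypothetical, and the parsing is deliberately naive (it splits each content line at the first colon and does not handle quoted parameter values that contain a colon).

```python
import re

# Types that the profile special notes say MUST be present.
REQUIRED_TYPES = {"FN", "N", "VERSION"}


def type_names(vcard_text):
    """Return the upper-cased type name of every content line.

    Hypothetical helper: unfolds continuation lines first, then strips
    parameters (";...") and an optional group label ("group.").
    """
    # Unfolding: a line break followed by one space or tab is removed.
    unfolded = re.sub(r"\r?\n[ \t]", "", vcard_text)
    names = []
    for line in unfolded.splitlines():
        if ":" not in line:
            continue  # not a content line; ignored in this sketch
        head = line.split(":", 1)[0]      # type name plus parameters
        name = head.split(";", 1)[0]      # drop the parameters
        name = name.rsplit(".", 1)[-1]    # drop a group label, if any
        names.append(name.strip().upper())
    return names


def missing_required(vcard_text):
    """Return the set of required types that the vCard does not contain."""
    return REQUIRED_TYPES - set(type_names(vcard_text))


def is_extension(name):
    """True for non-standard types, identified by the "X-" prefix."""
    return name.upper().startswith("X-")
```

A caller could reject or flag a card when `missing_required(card)` is non-empty, and pass over unknown types for which `is_extension(name)` is true instead of treating them as errors.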
- -2.1 Predefined Type Usage - - The vCard MIME Directory Profile uses the following predefined types - from [MIME-DIR]. - -2.1.1 BEGIN and END Type - - The content entity MUST begin with the BEGIN type with a value of - "VCARD". The content entity MUST end with the END type with a value - of "VCARD". - -2.1.2 NAME Type - - If the NAME type is present, then its value is the displayable, - presentation text associated with the source for the vCard, as - specified in the SOURCE type. - -2.1.3 PROFILE Type - - If the PROFILE type is present, then its value MUST be "VCARD". - -2.1.4 SOURCE Type - - If the SOURCE type is present, then its value provides information - how to find the source for the vCard. - - - - - - - - - - - - -Dawson & Howes Standards Track [Page 5] - -RFC 2426 vCard MIME Directory Profile September 1998 - - -2.2 Predefined Type Parameter Usage - - The vCard MIME Directory Profile uses the following predefined type - parameters as defined by [MIME-DIR]. - - - LANGUAGE - - - ENCODING - - - VALUE - -2.3 Predefined VALUE Type Usage - - The predefined data type values specified in [MIME-DIR] MUST NOT be - repeated in COMMA separated value lists except within the N, - NICKNAME, ADR and CATEGORIES value types. - - The text value type defined in [MIME-DIR] is further restricted such - that any SEMI-COLON character (ASCII decimal 59) in the value MUST be - escaped with the BACKSLASH character (ASCII decimal 92). - -2.4 Extensions To The Predefined VALUE Types - - The predefined data type values specified in [MIME-DIR] have been - extended by the vCard profile to include a number of value types that - are specific to this profile. - -2.4.1 BINARY - - The "binary" value type specifies that the type value is inline, - encoded binary data. This value type can be specified in the PHOTO, - LOGO, SOUND, and KEY types. - - If inline encoded binary data is specified, the ENCODING type - parameter MUST be used to specify the encoding format. 
-   The binary data MUST be encoded using the "B" encoding format. Long
-   lines of encoded binary data SHOULD be folded to 75 characters using
-   the folding method defined in [MIME-DIR].
-
-   The value type is defined by the following notation:
-
-   binary = <A "B" binary encoded string as defined by [RFC 2047].>
-
-2.4.2 VCARD
-
-   The "vcard" value type specifies that the type value is another
-   vCard. This value type can be specified in the AGENT type. The value
-   type is defined by this specification. Since each of the type
-   declarations within the vcard value type is itself specified within
-   a text value, each MUST be terminated with the backslash escape
-   sequence "\n" or "\N", instead of the normal newline character
-   sequence CRLF. In addition, any COMMA character (ASCII decimal 44),
-   SEMI-COLON character (ASCII decimal 59) and COLON character (ASCII
-   decimal 58) MUST be escaped with the BACKSLASH character (ASCII
-   decimal 92). For example, with the AGENT type a value would be
-   specified as:
-
-        AGENT:BEGIN:VCARD\nFN:Joe Friday\nTEL:+1-919-555-7878\n
-         TITLE:Area Administrator\, Assistant\n EMAIL\;TYPE=INTERN\n
-         ET:jfriday@host.com\nEND:VCARD\n
-
-2.4.3 PHONE-NUMBER
-
-   The "phone-number" value type specifies that the type value is a
-   telephone number. This value type can be specified in the TEL type.
-   The value type is a text value that has the special semantics of a
-   telephone number as defined in [CCITT E.163] and [CCITT X.121].
-
-2.4.4 UTC-OFFSET
-
-   The "utc-offset" value type specifies that the type value is a signed
-   offset from UTC. This value type can be specified in the TZ type.
-
-   The value type is an offset from Coordinated Universal Time (UTC). It
-   is specified as a positive or negative difference in units of hours
-   and minutes (e.g., +hh:mm). The time is specified as a 24-hour clock.
-   Hour values are from 00 to 23, and minute values are from 00 to 59.
-   Hours and minutes are two digits each, with high-order zeroes
-   required to maintain digit count. The extended format for ISO 8601
-   UTC offsets MUST be used. The extended format makes use of a colon
-   character as a separator of the hour and minute text fields.
-
-   The value is defined by the following notation:
-
-        time-hour       = 2DIGIT        ;00-23
-        time-minute     = 2DIGIT        ;00-59
-        utc-offset      = ("+" / "-") time-hour ":" time-minute
-
-2.5 Structured Type Values
-
-   Compound type values are delimited by a field delimiter, specified by
-   the SEMI-COLON character (ASCII decimal 59). A SEMI-COLON in a
-   component of a compound property value MUST be escaped with a
-   BACKSLASH character (ASCII decimal 92).
-
-   Lists of values are delimited by a list delimiter, specified by the
-   COMMA character (ASCII decimal 44). A COMMA character in a value MUST
-   be escaped with a BACKSLASH character (ASCII decimal 92).
-
-   This profile supports the type grouping mechanism defined in
-   [MIME-DIR]. Grouping of related types is a useful technique to
-   communicate common semantics concerning the properties of a vCard.
-
-2.6 Line Delimiting and Folding
-
-   This profile supports the same line delimiting and folding methods
-   defined in [MIME-DIR]. Specifically, when parsing a content line,
-   folded lines must first be unfolded according to the unfolding
-   procedure described in [MIME-DIR]. After generating a content line,
-   lines longer than 75 characters SHOULD be folded according to the
-   folding procedure described in [MIME-DIR].
-
-   Folding is done after any content encoding of a type value. Unfolding
-   is done before any decoding of a type value in a content line.
-
-3. vCard Profile Features
-
-   The vCard MIME Directory Profile Type contains directory information,
-   typically pertaining to a single directory entry.
The information is - described using an attribute schema that is tailored for capturing - personal contact information. The vCard can include attributes that - describe identification, delivery addressing, telecommunications - addressing, geographical, organizational, general explanatory and - security and access information about the particular object - associated with the vCard. - -3.1 Identification Types - - These types are used in the vCard profile to capture information - associated with the identification and naming of the person or - resource associated with the vCard. - -3.1.1 FN Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type FN - - Type name:FN - - Type purpose: To specify the formatted text corresponding to the name - of the object the vCard represents. - - - - -Dawson & Howes Standards Track [Page 8] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type encoding: 8bit - - Type value: A single text value. - - Type special notes: This type is based on the semantics of the X.520 - Common Name attribute. The property MUST be present in the vCard - object. - - Type example: - - FN:Mr. John Q. Public\, Esq. - -3.1.2 N Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type N - - Type name: N - - Type purpose: To specify the components of the name of the object the - vCard represents. - - Type encoding: 8bit - - Type value: A single structured text value. Each component can have - multiple values. - - Type special note: The structured type value corresponds, in - sequence, to the Family Name, Given Name, Additional Names, Honorific - Prefixes, and Honorific Suffixes. The text components are separated - by the SEMI-COLON character (ASCII decimal 59). Individual text - components can include multiple text values (e.g., multiple - Additional Names) separated by the COMMA character (ASCII decimal - 44). 
This type is based on the semantics of the X.520 individual name - attributes. The property MUST be present in the vCard object. - - Type example: - - N:Public;John;Quinlan;Mr.;Esq. - - N:Stevenson;John;Philip,Paul;Dr.;Jr.,M.D.,A.C.P. - -3.1.3 NICKNAME Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type NICKNAME - - - -Dawson & Howes Standards Track [Page 9] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type name: NICKNAME - - Type purpose: To specify the text corresponding to the nickname of - the object the vCard represents. - - Type encoding: 8bit - - Type value: One or more text values separated by a COMMA character - (ASCII decimal 44). - - Type special note: The nickname is the descriptive name given instead - of or in addition to the one belonging to a person, place, or thing. - It can also be used to specify a familiar form of a proper name - specified by the FN or N types. - - Type example: - - NICKNAME:Robbie - - NICKNAME:Jim,Jimmie - -3.1.4 PHOTO Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type PHOTO - - Type name: PHOTO - - Type purpose: To specify an image or photograph information that - annotates some aspect of the object the vCard represents. - - Type encoding: The encoding MUST be reset to "b" using the ENCODING - parameter in order to specify inline, encoded binary data. If the - value is referenced by a URI value, then the default encoding of 8bit - is used and no explicit ENCODING parameter is needed. - - Type value: A single value. The default is binary value. It can also - be reset to uri value. The uri value can be used to specify a value - outside of this MIME entity. - - Type special notes: The type can include the type parameter "TYPE" to - specify the graphic image format type. The TYPE parameter values MUST - be one of the IANA registered image formats or a non-standard image - format. 
- - - - - - -Dawson & Howes Standards Track [Page 10] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type example: - - PHOTO;VALUE=uri:http://www.abc.com/pub/photos - /jqpublic.gif - - - PHOTO;ENCODING=b;TYPE=JPEG:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcN - AQEEBQAwdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENvbW11bm - ljYXRpb25zIENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0 - <...remainder of "B" encoded binary data...> - -3.1.5 BDAY Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type BDAY - - Type name: BDAY - - Type purpose: To specify the birth date of the object the vCard - represents. - - Type encoding: 8bit - - Type value: The default is a single date value. It can also be reset - to a single date-time value. - - Type examples: - - BDAY:1996-04-15 - - BDAY:1953-10-15T23:10:00Z - - BDAY:1987-09-27T08:30:00-06:00 - -3.2 Delivery Addressing Types - - These types are concerned with information related to the delivery - addressing or label for the vCard object. - -3.2.1 ADR Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type ADR - - Type name: ADR - - - - -Dawson & Howes Standards Track [Page 11] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type purpose: To specify the components of the delivery address for - the vCard object. - - Type encoding: 8bit - - Type value: A single structured text value, separated by the - SEMI-COLON character (ASCII decimal 59). - - Type special notes: The structured type value consists of a sequence - of address components. The component values MUST be specified in - their corresponding position. The structured type value corresponds, - in sequence, to the post office box; the extended address; the street - address; the locality (e.g., city); the region (e.g., state or - province); the postal code; the country name. 
When a component value - is missing, the associated component separator MUST still be - specified. - - The text components are separated by the SEMI-COLON character (ASCII - decimal 59). Where it makes semantic sense, individual text - components can include multiple text values (e.g., a "street" - component with multiple lines) separated by the COMMA character - (ASCII decimal 44). - - The type can include the type parameter "TYPE" to specify the - delivery address type. The TYPE parameter values can include "dom" to - indicate a domestic delivery address; "intl" to indicate an - international delivery address; "postal" to indicate a postal - delivery address; "parcel" to indicate a parcel delivery address; - "home" to indicate a delivery address for a residence; "work" to - indicate delivery address for a place of work; and "pref" to indicate - the preferred delivery address when more than one address is - specified. These type parameter values can be specified as a - parameter list (i.e., "TYPE=dom;TYPE=postal") or as a value list - (i.e., "TYPE=dom,postal"). This type is based on semantics of the - X.520 geographical and postal addressing attributes. The default is - "TYPE=intl,postal,parcel,work". The default can be overridden to some - other set of values by specifying one or more alternate values. For - example, the default can be reset to "TYPE=dom,postal,work,home" to - specify a domestic delivery address for postal delivery to a - residence that is also used for work. - - Type example: In this example the post office box and the extended - address are absent. 
- - ADR;TYPE=dom,home,postal,parcel:;;123 Main - Street;Any Town;CA;91921-1234 - - - - - -Dawson & Howes Standards Track [Page 12] - -RFC 2426 vCard MIME Directory Profile September 1998 - - -3.2.2 LABEL Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type LABEL - - Type name: LABEL - - Type purpose: To specify the formatted text corresponding to delivery - address of the object the vCard represents. - - Type encoding: 8bit - - Type value: A single text value. - - Type special notes: The type value is formatted text that can be used - to present a delivery address label for the vCard object. The type - can include the type parameter "TYPE" to specify delivery label type. - The TYPE parameter values can include "dom" to indicate a domestic - delivery label; "intl" to indicate an international delivery label; - "postal" to indicate a postal delivery label; "parcel" to indicate a - parcel delivery label; "home" to indicate a delivery label for a - residence; "work" to indicate delivery label for a place of work; and - "pref" to indicate the preferred delivery label when more than one - label is specified. These type parameter values can be specified as a - parameter list (i.e., "TYPE=dom;TYPE=postal") or as a value list - (i.e., "TYPE=dom,postal"). This type is based on semantics of the - X.520 geographical and postal addressing attributes. The default is - "TYPE=intl,postal,parcel,work". The default can be overridden to some - other set of values by specifying one or more alternate values. For - example, the default can be reset to "TYPE=intl,postal,parcel,home" to - specify an international delivery label for both postal and parcel - delivery to a residential location. - - Type example: A multi-line address label. - - LABEL;TYPE=dom,home,postal,parcel:Mr. John Q. Public\, Esq.\n - Mail Drop: TNE QB\n123 Main Street\nAny Town\, CA 91921-1234 - \nU.S.A. 
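The ADR and LABEL rules above can be sketched in a few lines of Python. This is only an illustration, not part of the RFC or of rohrpost: the helper names (`split_unescaped`, `unescape`, `parse_adr`) and the field names in `ADR_FIELDS` are made up for this example, and the value is assumed to be already unfolded. Components are split on unescaped semicolons, each component on unescaped commas, and missing trailing components are treated as empty.

```python
import re

# Hypothetical field names for the seven ADR components, in RFC order.
ADR_FIELDS = ("po_box", "extended", "street", "locality",
              "region", "postal_code", "country")


def split_unescaped(text, sep):
    """Split text on sep, ignoring separators preceded by a backslash."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair intact
            i += 2
            continue
        if ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def unescape(text):
    """Undo backslash escapes; \\n and \\N become a newline.

    Lenient by assumption: any other escaped character is kept as-is.
    """
    return re.sub(r"\\(.)",
                  lambda m: "\n" if m.group(1) in "nN" else m.group(1),
                  text, flags=re.S)


def parse_adr(value):
    """Map an unfolded ADR value to a dict of component lists."""
    components = split_unescaped(value, ";")
    # Pad so that absent trailing components show up as empty.
    components += [""] * (len(ADR_FIELDS) - len(components))
    return {
        name: [unescape(v) for v in split_unescaped(comp, ",") if v]
        for name, comp in zip(ADR_FIELDS, components)
    }


if __name__ == "__main__":
    print(parse_adr(";;123 Main Street;Any Town;CA;91921-1234"))
    print(unescape(r"Any Town\, CA 91921-1234\nU.S.A."))
```

Splitting before unescaping matters here: an escaped `\;` or `\,` inside a component must not be mistaken for a separator.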
- -3.3 Telecommunications Addressing Types - - These types are concerned with information associated with the - telecommunications addressing of the object the vCard represents. - - - - - - - -Dawson & Howes Standards Track [Page 13] - -RFC 2426 vCard MIME Directory Profile September 1998 - - -3.3.1 TEL Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type TEL - - Type name: TEL - - Type purpose: To specify the telephone number for telephony - communication with the object the vCard represents. - - Type encoding: 8bit - - Type value: A single phone-number value. - - Type special notes: The value of this type is specified in a - canonical form in order to specify an unambiguous representation of - the globally unique telephone endpoint. This type is based on the - X.500 Telephone Number attribute. - - The type can include the type parameter "TYPE" to specify intended - use for the telephone number. The TYPE parameter values can include: - "home" to indicate a telephone number associated with a residence, - "msg" to indicate the telephone number has voice messaging support, - "work" to indicate a telephone number associated with a place of - work, "pref" to indicate a preferred-use telephone number, "voice" to - indicate a voice telephone number, "fax" to indicate a facsimile - telephone number, "cell" to indicate a cellular telephone number, - "video" to indicate a video conferencing telephone number, "pager" to - indicate a paging device telephone number, "bbs" to indicate a - bulletin board system telephone number, "modem" to indicate a MODEM - connected telephone number, "car" to indicate a car-phone telephone - number, "isdn" to indicate an ISDN service telephone number, "pcs" to - indicate a personal communication services telephone number. The - default type is "voice". 
These type parameter values can be specified - as a parameter list (i.e., "TYPE=work;TYPE=voice") or as a value list - (i.e., "TYPE=work,voice"). The default can be overridden to another - set of values by specifying one or more alternate values. For - example, the default TYPE of "voice" can be reset to a WORK and HOME, - VOICE and FAX telephone number by the value list - "TYPE=work,home,voice,fax". - - Type example: - - TEL;TYPE=work,voice,pref,msg:+1-213-555-1234 - - - - - - -Dawson & Howes Standards Track [Page 14] - -RFC 2426 vCard MIME Directory Profile September 1998 - - -3.3.2 EMAIL Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type EMAIL - - Type name: EMAIL - - Type purpose: To specify the electronic mail address for - communication with the object the vCard represents. - - Type encoding: 8bit - - Type value: A single text value. - - Type special notes: The type can include the type parameter "TYPE" to - specify the format or preference of the electronic mail address. The - TYPE parameter values can include: "internet" to indicate an Internet - addressing type, "x400" to indicate a X.400 addressing type or "pref" - to indicate a preferred-use email address when more than one is - specified. Another IANA registered address type can also be - specified. The default email type is "internet". A non-standard value - can also be specified. - - Type example: - - EMAIL;TYPE=internet:jqpublic@xyz.dom1.com - - EMAIL;TYPE=internet:jdoe@isp.net - - EMAIL;TYPE=internet,pref:jane_doe@abc.com - -3.3.3 MAILER Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type MAILER - - Type name: MAILER - - Type purpose: To specify the type of electronic mail software that is - used by the individual associated with the vCard. - - Type encoding: 8bit - - Type value: A single text value. 
- - - - - -Dawson & Howes Standards Track [Page 15] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type special notes: This information can provide assistance to a - correspondent regarding the type of data representation which can be - used, and how they can be packaged. This property is based on the - private MIME type X-Mailer that is generally implemented by MIME user - agent products. - - Type example: - - MAILER:PigeonMail 2.1 - -3.4 Geographical Types - - These types are concerned with information associated with - geographical positions or regions associated with the object the - vCard represents. - -3.4.1 TZ Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type TZ - - Type name: TZ - - Type purpose: To specify information related to the time zone of the - object the vCard represents. - - Type encoding: 8bit - - Type value: The default is a single utc-offset value. It can also be - reset to a single text value. - - Type special notes: The type value consists of a single value. - - Type examples: - - TZ:-05:00 - - TZ;VALUE=text:-05:00; EST; Raleigh/North America - ;This example has a single value, not a structure text value. - -3.4.2 GEO Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type GEO - - Type name: GEO - - - -Dawson & Howes Standards Track [Page 16] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type purpose: To specify information related to the global - positioning of the object the vCard represents. - - Type encoding: 8bit - - Type value: A single structured value consisting of two float values - separated by the SEMI-COLON character (ASCII decimal 59). - - Type special notes: This type specifies information related to the - global position of the object associated with the vCard. The value - specifies latitude and longitude, in that order (i.e., "LAT LON" - ordering). 
The longitude represents the location east and west of the - prime meridian as a positive or negative real number, respectively. - The latitude represents the location north and south of the equator - as a positive or negative real number, respectively. The longitude - and latitude values MUST be specified as decimal degrees and should - be specified to six decimal places. This will allow for granularity - within a meter of the geographical position. The text components are - separated by the SEMI-COLON character (ASCII decimal 59). The simple - formula for converting degrees-minutes-seconds into decimal degrees - is: - - decimal = degrees + minutes/60 + seconds/3600. - - Type example: - - GEO:37.386013;-122.082932 - -3.5 Organizational Types - - These types are concerned with information associated with - characteristics of the organization or organizational units of the - object the vCard represents. - -3.5.1 TITLE Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type TITLE - - Type name: TITLE - - Type purpose: To specify the job title, functional position or - function of the object the vCard represents. - - Type encoding: 8bit - - Type value: A single text value. - - - -Dawson & Howes Standards Track [Page 17] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type special notes: This type is based on the X.520 Title attribute. - - Type example: - - TITLE:Director\, Research and Development - -3.5.2 ROLE Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type ROLE - - Type name: ROLE - - Type purpose: To specify information concerning the role, occupation, - or business category of the object the vCard represents. - - Type encoding: 8bit - - Type value: A single text value. - - Type special notes: This type is based on the X.520 Business Category - explanatory attribute. 
This property is included as an organizational - type to avoid confusion with the semantics of the TITLE type and - incorrect usage of that type when the semantics of this type is - intended. - - Type example: - - ROLE:Programmer - -3.5.3 LOGO Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type LOGO - - Type name: LOGO - - Type purpose: To specify a graphic image of a logo associated with - the object the vCard represents. - - Type encoding: The encoding MUST be reset to "b" using the ENCODING - parameter in order to specify inline, encoded binary data. If the - value is referenced by a URI value, then the default encoding of 8bit - is used and no explicit ENCODING parameter is needed. - - - - - -Dawson & Howes Standards Track [Page 18] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type value: A single value. The default is binary value. It can also - be reset to uri value. The uri value can be used to specify a value - outside of this MIME entity. - - Type special notes: The type can include the type parameter "TYPE" to - specify the graphic image format type. The TYPE parameter values MUST - be one of the IANA registered image formats or a non-standard image - format. - - Type example: - - LOGO;VALUE=uri:http://www.abc.com/pub/logos/abccorp.jpg - - LOGO;ENCODING=b;TYPE=JPEG:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcN - AQEEBQAwdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENvbW11bm - ljYXRpb25zIENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0 - <...the remainder of "B" encoded binary data...> - -3.5.4 AGENT Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type AGENT - - Type name: AGENT - - Type purpose: To specify information about another person who will - act on behalf of the individual or resource associated with the - vCard. - - Type encoding: 8-bit - - Type value: The default is a single vcard value. 
It can also be reset - to either a single text or uri value. The text value can be used to - specify textual information. The uri value can be used to specify - information outside of this MIME entity. - - Type special notes: This type typically is used to specify an area - administrator, assistant, or secretary for the individual associated - with the vCard. A key characteristic of the Agent type is that it - represents somebody or something that is separately addressable. - - Type example: - - AGENT;VALUE=uri: - CID:JQPUBLIC.part3.960129T083020.xyzMail@host3.com - - - - - -Dawson & Howes Standards Track [Page 19] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - AGENT:BEGIN:VCARD\nFN:Susan Thomas\nTEL:+1-919-555- - 1234\nEMAIL\;INTERNET:sthomas@host.com\nEND:VCARD\n - -3.5.5 ORG Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type ORG - - Type name: ORG - - Type purpose: To specify the organizational name and units associated - with the vCard. - - Type encoding: 8bit - - Type value: A single structured text value consisting of components - separated by the SEMI-COLON character (ASCII decimal 59). - - Type special notes: The type is based on the X.520 Organization Name - and Organization Unit attributes. The type value is a structured type - consisting of the organization name, followed by one or more levels - of organizational unit names. - - Type example: A type value consisting of an organizational name, - organizational unit #1 name and organizational unit #2 name. - - ORG:ABC\, Inc.;North American Division;Marketing - -3.6 Explanatory Types - - These types are concerned with additional explanations, such as that - related to informational notes or revisions specific to the vCard. 
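The ORG example above combines a structured value with an escaped comma, which is easy to get wrong with a plain `split(";")`. The following Python sketch shows one possible way to read such a value in a single pass. The function name `parse_org` and the returned dict layout are assumptions made for this illustration; the value is assumed to be already unfolded.

```python
import re

# One token is an escape pair, a separator, or a run of plain text.
_TOKEN = re.compile(r"\\(.)|(;)|([^\\;]+)", re.S)


def parse_org(value):
    """Split an ORG value into organization name and unit names.

    Escapes are resolved while scanning, so an escaped "\\;" never
    acts as a separator. A lone trailing backslash is dropped.
    """
    units, current = [], []
    for match in _TOKEN.finditer(value):
        escaped, sep, plain = match.groups()
        if sep:
            units.append("".join(current))
            current = []
        elif escaped is not None:
            # \n and \N encode a newline; other escapes yield the character.
            current.append("\n" if escaped in "nN" else escaped)
        else:
            current.append(plain)
    units.append("".join(current))
    return {"name": units[0], "units": units[1:]}


if __name__ == "__main__":
    print(parse_org(r"ABC\, Inc.;North American Division;Marketing"))
```

For the RFC's example this yields the name `ABC, Inc.` and the two units `North American Division` and `Marketing`.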
- -3.6.1 CATEGORIES Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type CATEGORIES - - Type name: CATEGORIES - - Type purpose: To specify application category information about the - vCard. - - Type encoding: 8bit - - - - - -Dawson & Howes Standards Track [Page 20] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type value: One or more text values separated by a COMMA character - (ASCII decimal 44). - - Type example: - - CATEGORIES:TRAVEL AGENT - - CATEGORIES:INTERNET,IETF,INDUSTRY,INFORMATION TECHNOLOGY - -3.6.2 NOTE Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type NOTE - - Type name: NOTE - - Type purpose: To specify supplemental information or a comment that - is associated with the vCard. - - Type encoding: 8bit - - Type value: A single text value. - - Type special notes: The type is based on the X.520 Description - attribute. - - Type example: - - NOTE:This fax number is operational 0800 to 1715 - EST\, Mon-Fri. - -3.6.3 PRODID Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type PRODID - - Type name: PRODID - - Type purpose: To specify the identifier for the product that created - the vCard object. - - Type encoding: 8-bit - - Type value: A single text value. - - - - - -Dawson & Howes Standards Track [Page 21] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type special notes: Implementations SHOULD use a method such as that - specified for Formal Public Identifiers in ISO 9070 to assure that - the text value is unique. - - Type example: - - PRODID:-//ONLINE DIRECTORY//NONSGML Version 1//EN - -3.6.4 REV Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type REV - - Type name: REV - - Type purpose: To specify revision information about the current - vCard. 
- - Type encoding: 8-bit - - Type value: The default is a single date-time value. Can also be - reset to a single date value. - - Type special notes: The value distinguishes the current revision of - the information in this vCard for other renditions of the - information. - - Type example: - - REV:1995-10-31T22:27:10Z - - REV:1997-11-15 - -3.6.5 SORT-STRING Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type SORT-STRING - - Type Name: SORT-STRING - - Type purpose: To specify the family name or given name text to be - used for national-language-specific sorting of the FN and N types. - - Type encoding: 8bit - - Type value: A single text value. - - - -Dawson & Howes Standards Track [Page 22] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type special notes: The sort string is used to provide family name or - given name text that is to be used in locale- or national-language- - specific sorting of the formatted name and structured name types. - Without this information, sorting algorithms could incorrectly sort - this vCard within a sequence of sorted vCards. When this type is - present in a vCard, then this family name or given name value is used - for sorting the vCard. - - Type examples: For the case of family name sorting, the following - examples define common sort string usage with the FN and N types. - - FN:Rene van der Harten - N:van der Harten;Rene;J.;Sir;R.D.O.N. 
- SORT-STRING:Harten - - FN:Robert Pau Shou Chang - N:Pau;Shou Chang;Robert - SORT-STRING:Pau - - FN:Osamu Koura - N:Koura;Osamu - SORT-STRING:Koura - - FN:Oscar del Pozo - N:del Pozo Triscon;Oscar - SORT-STRING:Pozo - - FN:Christine d'Aboville - N:d'Aboville;Christine - SORT-STRING:Aboville - -3.6.6 SOUND Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type SOUND - - Type name: SOUND - - Type purpose: To specify digital sound content information that - annotates some aspect of the vCard. By default this type is used to - specify the proper pronunciation of the name type value of the vCard. - - Type encoding: The encoding MUST be reset to "b" using the ENCODING - parameter in order to specify inline, encoded binary data. If the - value is referenced by a URI value, then the default encoding of 8bit - is used and no explicit ENCODING parameter is needed. - - - - -Dawson & Howes Standards Track [Page 23] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type value: A single value. The default is binary value. It can also - be reset to uri value. The uri value can be used to specify a value - outside of this MIME entity. - - Type special notes: The type can include the type parameter "TYPE" to - specify the audio format type. The TYPE parameter values MUST be one - of the IANA registered audio formats or a non-standard audio format. - - Type example: - - SOUND;TYPE=BASIC;VALUE=uri:CID:JOHNQPUBLIC.part8. 
- 19960229T080000.xyzMail@host1.com - - SOUND;TYPE=BASIC;ENCODING=b:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcN - AQEEBQAwdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENvbW11bm - ljYXRpb25zIENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0 - <...the remainder of "B" encoded binary data...> - -3.6.7 UID Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type UID - - Type name: UID - - Type purpose: To specify a value that represents a globally unique - identifier corresponding to the individual or resource associated - with the vCard. - - Type encoding: 8bit - - Type value: A single text value. - - Type special notes: The type is used to uniquely identify the object - that the vCard represents. - - The type can include the type parameter "TYPE" to specify the format - of the identifier. The TYPE parameter value should be an IANA - registered identifier format. The value can also be a non-standard - format. - - Type example: - - UID:19950401-080045-40000F192713-0052 - - - - - - -Dawson & Howes Standards Track [Page 24] - -RFC 2426 vCard MIME Directory Profile September 1998 - - -3.6.8 URL Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type URL - - Type name: URL - - Type purpose: To specify a uniform resource locator associated with - the object that the vCard refers to. - - Type encoding: 8bit - - Type value: A single uri value. - - Type example: - - URL:http://www.swbyps.restaurant.french/~chezchic.html - -3.6.9 VERSION Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type VERSION - - Type name: VERSION - - Type purpose: To specify the version of the vCard specification used - to format this vCard. - - Type encoding: 8bit - - Type value: A single text value. - - Type special notes: The property MUST be present in the vCard object. - The value MUST be "3.0" if the vCard corresponds to this - specification. 
- - Type example: - - VERSION:3.0 - -3.7 Security Types - - These types are concerned with the security of communication pathways - or access to the vCard. - - - - - -Dawson & Howes Standards Track [Page 25] - -RFC 2426 vCard MIME Directory Profile September 1998 - - -3.7.1 CLASS Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type CLASS - - Type name: CLASS - - Type purpose: To specify the access classification for a vCard - object. - - Type encoding: 8bit - - Type value: A single text value. - - Type special notes: An access classification is only one component of - the general security model for a directory service. The - classification attribute provides a method of capturing the intent of - the owner for general access to information described by the vCard - object. - - Type examples: - - CLASS:PUBLIC - - CLASS:PRIVATE - - CLASS:CONFIDENTIAL - -3.7.2 KEY Type Definition - - To: ietf-mime-directory@imc.org - - Subject: Registration of text/directory MIME type KEY - - Type name: KEY - - Type purpose: To specify a public key or authentication certificate - associated with the object that the vCard represents. - - Type encoding: The encoding MUST be reset to "b" using the ENCODING - parameter in order to specify inline, encoded binary data. If the - value is a text value, then the default encoding of 8bit is used and - no explicit ENCODING parameter is needed. - - Type value: A single value. The default is binary. It can also be - reset to text value. The text value can be used to specify a text - key. - - - -Dawson & Howes Standards Track [Page 26] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - Type special notes: The type can also include the type parameter TYPE - to specify the public key or authentication certificate format. The - parameter type should specify an IANA registered public key or - authentication certificate format. The parameter type can also - specify a non-standard format. 
- - Type example: - - KEY;ENCODING=b:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcNAQEEBQA - wdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENbW11bmljYX - Rpb25zIENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0 - ZW1zMRwwGgYDVQQDExNyb290Y2EubmV0c2NhcGUuY29tMB4XDTk3MDYwNj - E5NDc1OVoXDTk3MTIwMzE5NDc1OVowgYkxCzAJBgNVBAYTAlVTMSYwJAYD - VQQKEx1OZXRzY2FwZSBDb21tdW5pY2F0aW9ucyBDb3JwLjEYMBYGA1UEAx - MPVGltb3RoeSBBIEhvd2VzMSEwHwYJKoZIhvcNAQkBFhJob3dlc0BuZXRz - Y2FwZS5jb20xFTATBgoJkiaJk/IsZAEBEwVob3dlczBcMA0GCSqGSIb3DQ - EBAQUAA0sAMEgCQQC0JZf6wkg8pLMXHHCUvMfL5H6zjSk4vTTXZpYyrdN2 - dXcoX49LKiOmgeJSzoiFKHtLOIboyludF90CgqcxtwKnAgMBAAGjNjA0MB - EGCWCGSAGG+EIBAQQEAwIAoDAfBgNVHSMEGDAWgBT84FToB/GV3jr3mcau - +hUMbsQukjANBgkqhkiG9w0BAQQFAAOBgQBexv7o7mi3PLXadkmNP9LcIP - mx93HGp0Kgyx1jIVMyNgsemeAwBM+MSlhMfcpbTrONwNjZYW8vJDSoi//y - rZlVt9bJbs7MNYZVsyF1unsqaln4/vy6Uawfg8VUMk1U7jt8LYpo4YULU7 - UZHPYVUaSgVttImOHZIKi4hlPXBOhcUQ== - -3.8 Extended Types - - The types defined by this document can be extended with private types - using the non-standard, private values mechanism defined in [RFC - 2045]. Non-standard, private types with a name starting with "X-" may - be defined bilaterally between two cooperating agents without outside - registration or standardization. - -4. Formal Grammar - - The following formal grammar is provided to assist developers in - building parsers for the vCard. 
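As a rough illustration of the kind of parser this section has in mind, the Python sketch below unfolds a text and splits one content line into group, name, parameters and value, following the general shape `[group "."] name *(";" param) ":" value`. It is a simplified sketch under stated assumptions, not a validating parser: the function names `unfold` and `parse_content_line` are invented here, unfolding is assumed to mean removing each line break that is immediately followed by one space or tab (the procedure from [MIME DIR]), and no per-type rules are checked.

```python
import re


def unfold(text):
    """Join folded lines: drop a line break plus the one space/tab after it."""
    return re.sub(r"\r?\n[ \t]", "", text)


def parse_content_line(line):
    """Return (group, name, params, value) for one unfolded content line."""
    # Locate the first ":" that is not inside a quoted parameter value.
    in_quotes = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == ":" and not in_quotes:
            break
    else:
        raise ValueError("content line has no ':' separator")
    head, value = line[:i], line[i + 1:]

    # Split the part before ":" on ";" while keeping quoted strings whole.
    pieces = re.findall(r'(?:[^;"]|"[^"]*")+', head)
    if not pieces:
        raise ValueError("content line has no name")
    name, raw_params = pieces[0], pieces[1:]

    group = None
    if "." in name:
        group, name = name.split(".", 1)

    # Both "TYPE=a;TYPE=b" and "TYPE=a,b" end up as one list per parameter.
    params = {}
    for raw in raw_params:
        key, _, val = raw.partition("=")
        values = re.findall(r'"[^"]*"|[^,]+', val)
        params.setdefault(key.upper(), []).extend(v.strip('"') for v in values)
    return group, name.upper(), params, value


if __name__ == "__main__":
    folded = "TEL;TYPE=work,voice,\r\n pref,msg:+1-213-555-1234"
    print(parse_content_line(unfold(folded)))
```

Type names and parameter names are upper-cased here on the assumption that they are compared case-insensitively; parameter values are left untouched so that the caller can decide how to normalize them.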
- - This syntax is written according to the form described in RFC 2234, - but it references just this small subset of RFC 2234 literals: - - ;******************************************* - ; Commonly Used Literal Definition - ;******************************************* - - ALPHA = %x41-5A / %x61-7A - ; Latin Capital Letter A-Latin Capital Letter Z / - ; Latin Small Letter a-Latin Small Letter z - - - - -Dawson & Howes Standards Track [Page 27] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - CHAR = %x01-7F - ; Any C0 Controls and Basic Latin, excluding NULL from - ; Code Charts, pages 7-6 through 7-9 in [UNICODE] - - CR = %x0D - ; Carriage Return - - LF = %x0A - ; Line Feed - - CRLF = CR LF - ; Internet standard newline - - ;CTL = %x00-1F / %x7F - ; Controls. Not used, but referenced in comments. - - DIGIT = %x30-39 - ; Digit Zero-Digit Nine - - DQUOTE = %x22 - ; Quotation Mark - - HTAB = %x09 - ; Horizontal Tabulation - - SP = %x20 - ; space - - VCHAR = %x21-7E - ; Visible (printing) characters - - WSP = SP / HTAB - ; White Space - - ;******************************************* - ; Basic vCard Definition - ;******************************************* - - vcard_entity = 1*(vcard) - - vcard = [group "."] "BEGIN" ":" "VCARD" 1*CRLF - 1*(contentline) - ;A vCard object MUST include the VERSION, FN and N types. - [group "."] "END" ":" "VCARD" 1*CRLF - - contentline = [group "."] name *(";" param ) ":" value CRLF - ; When parsing a content line, folded lines must first - ; be unfolded according to the unfolding procedure - - - -Dawson & Howes Standards Track [Page 28] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - ; described above. When generating a content line, lines - ; longer than 75 characters SHOULD be folded according to - ; the folding procedure described in [MIME DIR]. 
- - group = 1*(ALPHA / DIGIT / "-") - - name = iana-token / x-name - ; Parsing of the param and value is - ; based on the "name" or type identifier - ; as defined in ABNF sections below - - iana-token = 1*(ALPHA / DIGIT / "-") - ; vCard type or parameter identifier registered with IANA - - x-name = "X-" 1*(ALPHA / DIGIT / "-") - ; Reserved for non-standard use - - param = param-name "=" param-value *("," param-value) - - param-name = iana-token / x-name - - param-value = ptext / quoted-string - - ptext = *SAFE-CHAR - - value = *VALUE-CHAR - - quoted-string = DQUOTE QSAFE-CHAR DQUOTE - - NON-ASCII = %x80-FF - ; Use is restricted by CHARSET parameter - ; on outer MIME object (UTF-8 preferred) - - QSAFE-CHAR = WSP / %x21 / %x23-7E / NON-ASCII - ; Any character except CTLs, DQUOTE - - SAFE-CHAR = WSP / %x21 / %x23-2B / %x2D-39 / %x3C-7E / NON-ASCII - ; Any character except CTLs, DQUOTE, ";", ":", "," - - VALUE-CHAR = WSP / VCHAR / NON-ASCII - ; Any textual character - - ;******************************************* - ; vCard Type Definition - ; - ; Provides type-specific definitions for how the - ; "value" and "param" are defined. - ;******************************************* - - - -Dawson & Howes Standards Track [Page 29] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - ;For name="NAME" - param = "" - ; No parameters allowed - - value = text-value - - ;For name="PROFILE" - param = "" - ; No parameters allowed - - value = text-value - ; Value MUST be the case insensitive value "VCARD - - ;For name="SOURCE" - param = source-param - ; No parameters allowed - - value = uri - - source-param = ("VALUE" "=" "uri") - / ("CONTEXT" "=" "word") - ; Parameter value specifies the protocol context - ; for the uri value. - / (x-name "=" *SAFE-CHAR) - - ;For name="FN" - ;This type MUST be included in a vCard object. - param = text-param - ; Text parameters allowed - - value = text-value - - ;For name="N" - ;This type MUST be included in a vCard object. 
- - param = text-param - ; Text parameters allowed - - value = n-value - - n-value = 0*4(text-value *("," text-value) ";") - text-value *("," text-value) - ; Family; Given; Middle; Prefix; Suffix. - ; Example: Public;John;Quincy,Adams;Reverend Dr. III - - ;For name="NICKNAME" - param = text-param - ; Text parameters allowed - - - -Dawson & Howes Standards Track [Page 30] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - value = text-list - - ;For name="PHOTO" - param = img-inline-param - ; Only image parameters allowed - - param =/ img-refer-param - ; Only image parameters allowed - - value = img-inline-value - ; Value and parameter MUST match - - value =/ img-refer-value - ; Value and parameter MUST match - - ;For name="BDAY" - param = ("VALUE" "=" "date") - ; Only value parameter allowed - - param =/ ("VALUE" "=" "date-time") - ; Only value parameter allowed - - value = date-value - ; Value MUST match value type - - value =/ date-time-value - ; Value MUST match value type - - ;For name="ADR" - param = adr-param / text-param - ; Only adr and text parameters allowed - - value = adr-value - - ;For name="LABEL" - param = adr-param / text-param - ; Only adr and text parameters allowed - - value = text-value - - ;For name="TEL" - param = tel-param - ; Only tel parameters allowed - - value = phone-number-value - - tel-param = "TYPE" "=" tel-type *("," tel-type) - - - - -Dawson & Howes Standards Track [Page 31] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - tel-type = "HOME" / "WORK" / "PREF" / "VOICE" / "FAX" / "MSG" - / "CELL" / "PAGER" / "BBS" / "MODEM" / "CAR" / "ISDN" - / "VIDEO" / "PCS" / iana-token / x-name - ; Values are case insensitive - - ;For name="EMAIL" - param = email-param - ; Only email parameters allowed - - value = text-value - - email-param = "TYPE" "=" email-type ["," "PREF"] - ; Value is case insensitive - - email-type = "INTERNET" / "X400" / iana-token / "X-" word - ; Values are case insensitive - - ;For name="MAILER" - 
param = text-param - ; Only text parameters allowed - - value = text-value - - ;For name="TZ" - param = "" - ; No parameters allowed - - value = utc-offset-value - - ;For name="GEO" - param = "" - ; No parameters allowed - - value = float-value ";" float-value - - ;For name="TITLE" - param = text-param - ; Only text parameters allowed - - value = text-value - - ;For name="ROLE" - param = text-param - ; Only text parameters allowed - - value = text-value - - ;For name="LOGO" - - - -Dawson & Howes Standards Track [Page 32] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - param = img-inline-param / img-refer-param - ; Only image parameters allowed - - value = img-inline-value / img-refer-value - ; Value and parameter MUST match - - ;For name="AGENT" - param = agent-inline-param - - param =/ agent-refer-param - - value = agent-inline-value - ; Value and parameter MUST match - - value =/ agent-refer-value - ; Value and parameter MUST match - - agent-inline-param = "" - ; No parameters allowed - - agent-refer-param = "VALUE" "=" "uri" - ; Only value parameter allowed - - agent-inline-value = text-value - ; Value MUST be a valid vCard object - - agent-refer-value = uri - ; URI MUST refer to image content of given type - - ;For name="ORG" - - param = text-param - ; Only text parameters allowed - - value = org-value - - org-value = *(text-value ";") text-value - ; First is Organization Name, remainder are Organization Units. - - ;For name="CATEGORIES" - param = text-param - ; Only text parameters allowed - - value = text-list - - ;For name="NOTE" - param = text-param - ; Only text parameters allowed - - value = text-value - - ;For name="PRODID" - param = "" - ; No parameters allowed - - value = text-value - - ;For name="REV" - param = ["VALUE" "=" "date-time"] - ; Only value parameters allowed. Values are case insensitive. 
- - param =/ "VALUE" "=" "date" - ; Only value parameters allowed. Values are case insensitive. - - value = date-time-value - - value =/ date-value - - ;For name="SORT-STRING" - param = text-param - ; Only text parameters allowed - - value = text-value - - ;For name="SOUND" - param = snd-inline-param - ; Only sound parameters allowed - - param =/ snd-refer-param - ; Only sound parameters allowed - - value = snd-inline-value - ; Value MUST match value type - - value =/ snd-refer-value - ; Value MUST match value type - - snd-inline-value = binary-value CRLF - ; Value MUST be "b" encoded audio content - - snd-inline-param = ("VALUE" "=" "binary") - / ("ENCODING" "=" "b") - / ("TYPE" "=" *SAFE-CHAR) - ; Value MUST be an IANA registered audio type - - snd-refer-value = uri - ; URI MUST refer to audio content of given type - - - -Dawson & Howes Standards Track [Page 34] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - snd-refer-param = ("VALUE" "=" "uri") - / ("TYPE" "=" word) - ; Value MUST be an IANA registered audio type - - ;For name="UID" - param = "" - ; No parameters allowed - - value = text-value - - ;For name="URL" - param = "" - ; No parameters allowed - - value = uri - - ;For name="VERSION" - ;This type MUST be included in a vCard object.
- param = "" - ; No parameters allowed - - value = text-value - ; Value MUST be "3.0" - - ;For name="CLASS" - param = "" - ; No parameters allowed - - value = "PUBLIC" / "PRIVATE" / "CONFIDENTIAL" - / iana-token / x-name - ; Values are case insensitive - - ;For name="KEY" - param = key-txt-param - ; Only value and type parameters allowed - - param =/ key-bin-param - ; Only value and type parameters allowed - - value = text-value - - value =/ binary-value - - key-txt-param = "TYPE" "=" keytype - - key-bin-param = ("TYPE" "=" keytype) - / ("ENCODING" "=" "b") - ; Value MUST be a "b" encoded key or certificate - - - -Dawson & Howes Standards Track [Page 35] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - keytype = "X509" / "PGP" / iana-token / x-name - ; Values are case insensitive - - ;For name="X-" non-standard type - param = text-param / (x-name "=" param-value) - ; Only text or non-standard parameters allowed - - value = text-value - - ;******************************************* - ; vCard Commonly Used Parameter Definition - ;******************************************* - - text-param = ("VALUE" "=" "ptext") - / ("LANGUAGE" "=" langval) - / (x-name "=" param-value) - - langval = <a language string as defined in RFC 1766> - - img-inline-value = binary-value - ;Value MUST be "b" encoded image content - - img-inline-param = ("VALUE" "=" "binary") - / ("ENCODING" "=" "b") - / ("TYPE" "=" param-value) - ;TYPE value MUST be an IANA registered image type - - img-refer-value = uri - ;URI MUST refer to image content of given type - - img-refer-param = ("VALUE" "=" "uri") - / ("TYPE" "=" param-value) - ;TYPE value MUST be an IANA registered image type - - adr-param = ("TYPE" "=" adr-type *("," adr-type)) - / (text-param) - - adr-type = "dom" / "intl" / "postal" / "parcel" / "home" - / "work" / "pref" / iana-type / x-name - - adr-value = 0*6(text-value ";") text-value - ; PO Box, Extended Address, Street, Locality, Region, Postal - ; Code, 
Country Name - - - - - - -Dawson & Howes Standards Track [Page 36] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - ;******************************************* - ; vCard Type Value Definition - ;******************************************* - - text-value-list = 1*text-value *("," 1*text-value) - - text-value = *(SAFE-CHAR / ":" / DQUOTE / ESCAPED-CHAR) - - ESCAPED-CHAR = "\\" / "\;" / "\," / "\n" / "\N" - ; \\ encodes \, \n or \N encodes newline - ; \; encodes ;, \, encodes , - - binary-value = <A "b" encoded text value as defined in [RFC 2047]> - - date-value = <A single date value as defined in [MIME-DIR]> - - time-value = <A single time value as defined in [MIME-DIR]> - - date-time-value = <A single date-time value as defined in [MIME-DIR]> - - float-value = <A single float value as defined in [MIME-DIR]> - - phone-number-value = <A single text value as defined in [CCITT - E.163] and [CCITT X.121]> - - uri-value = <A uri value as defined in [MIME-DIR]> - - utc-offset-value = ("+" / "-") time-hour ":" time-minute - time-hour = 2DIGIT ;00-23 - time-minute = 2DIGIT ;00-59 - -5. Differences From vCard v2.1 - - This specification has been reviewed by the IETF community. The - review process introduced a number of differences from the [VCARD] - version 2.1. These differences require that vCard objects conforming - to this specification have a different version number than a vCard - conforming to [VCARD]. The differences include the following: - - . The QUOTED-PRINTABLE inline encoding has been eliminated. - Only the "B" encoding of [RFC 2047] is an allowed value for - the ENCODING parameter. - - . The method for specifying CRLF character sequences in text - type values has been changed. The CRLF character sequence in - a text type value is specified with the backslash character - sequence "\n" or "\N". - - - - -Dawson & Howes Standards Track [Page 37] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - . 
Any COMMA or SEMICOLON in a text type value must be backslash - escaped. - - . VERSION value corresponding to this specification MUST be - "3.0". - - . The [MIME-DIR] predefined types of SOURCE, NAME and PROFILE - are allowed. - - . The [MIME-DIR] VALUE type parameter for value data typing is - allowed. In addition, there are extensions made to these type - values for additional value types used in this specification. - - . The [VCARD] CHARSET type parameter has been eliminated. - Character set can only be specified on the CHARSET parameter - on the Content-Type MIME header field. - - . The [VCARD] support for non-significant WSP character has - been eliminated. - - . The "TYPE=" prefix to parameter values is required. In - [VCARD] this was optional. - - . LOGO, PHOTO and SOUND multimedia formats MUST be either IANA - registered types or non-standard types. - - . Inline binary content must be "B" encoded and folded. A blank - line after the encoded binary content is no longer required. - - . TEL values can be identified as personal communication - services telephone numbers with the PCS type parameter value. - - . The CATEGORIES, CLASS, NICKNAME, PRODID and SORT-STRING types - have been added. - - . The VERSION, N and FN types MUST be specified in a vCard. - This identifies the version of the specification that the - object was formatted to. It also assures that every vCard - will include both a structured and formatted name that can be - used to identify the object. - - - - - - - - - - - -Dawson & Howes Standards Track [Page 38] - -RFC 2426 vCard MIME Directory Profile September 1998 - - -6. Acknowledgements - - The many valuable comments contributed by members of the IETF ASID - working group are gratefully acknowledged, as are the contributions - by Roland Alden, Stephen Bartlett, Alec Dun, Patrik Faltstrom, Daniel - Gurney, Bruce Johnston, Daniel Klaussen, Pete Miller, Keith Moore, - Vinod Seraphin, Michelle Watkins. 
Chris Newman was especially helpful - in navigating the intricacies of ABNF lore. - -7. Authors' Addresses - - BEGIN:vCard - VERSION:3.0 - FN:Frank Dawson - ORG:Lotus Development Corporation - ADR;TYPE=WORK,POSTAL,PARCEL:;;6544 Battleford Drive - ;Raleigh;NC;27613-3502;U.S.A. - TEL;TYPE=VOICE,MSG,WORK:+1-919-676-9515 - TEL;TYPE=FAX,WORK:+1-919-676-9564 - EMAIL;TYPE=INTERNET,PREF:Frank_Dawson@Lotus.com - EMAIL;TYPE=INTERNET:fdawson@earthlink.net - URL:http://home.earthlink.net/~fdawson - END:vCard - - - BEGIN:vCard - VERSION:3.0 - FN:Tim Howes - ORG:Netscape Communications Corp. - ADR;TYPE=WORK:;;501 E. Middlefield Rd.;Mountain View; - CA; 94043;U.S.A. - TEL;TYPE=VOICE,MSG,WORK:+1-415-937-3419 - TEL;TYPE=FAX,WORK:+1-415-528-4164 - EMAIL;TYPE=INTERNET:howes@netscape.com - END:vCard - -8. Security Considerations - - vCards can carry cryptographic keys or certificates, as described in - Section 3.7.2. - - Section 3.7.1 specifies a desired security classification policy for - a particular vCard. That policy is not enforced in any way. - - The vCard objects have no inherent authentication or privacy, but can - easily be carried by any security mechanism that transfers MIME - objects with authentication or privacy. In cases where threats of - "spoofed" vCard information is a concern, the vCard SHOULD BE - - - -Dawson & Howes Standards Track [Page 39] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - transported using one of these secure mechanisms. - - The information in a vCard may become out of date. In cases where the - vitality of data is important to an originator of a vCard, the "URL" - type described in section 3.6.8 SHOULD BE specified. In addition, the - "REV" type described in section 3.6.4 can be specified to indicate - the last time that the vCard data was updated. - -9. 
References - - [ISO 8601] ISO 8601:1988 - Data elements and interchange formats - - Information interchange - Representation of dates and - times - The International Organization for - Standardization, June, 1988. - - [ISO 8601 TC] ISO 8601, Technical Corrigendum 1 - Data elements and - interchange formats - Information interchange - - Representation of dates and times - The International - Organization for Standardization, May, 1991. - - [ISO 9070] ISO 9070, Information Processing - SGML support - facilities - Registration Procedures for Public Text - Owner Identifiers, April, 1991. - - [CCITT E.163] Recommendation E.163 - Numbering Plan for The - International Telephone Service, CCITT Blue Book, - Fascicle II.2, pp. 128-134, November, 1988. - - [CCITT X.121] Recommendation X.121 - International Numbering Plan for - Public Data Networks, CCITT Blue Book, Fascicle VIII.3, - pp. 317-332, November, 1988. - - [CCITT X.520] Recommendation X.520 - The Directory - Selected - Attribute Types, November 1988. - - [CCITT X.521] Recommendation X.521 - The Directory - Selected Object - Classes, November 1988. - - [MIME-DIR] Howes, T., Smith, M., and F. Dawson, "A MIME Content- - Type for Directory Information", RFC 2425, September - 1998. - - [RFC 1738] Berners-Lee, T., Masinter, L., and M. McCahill, - "Uniform Resource Locators (URL)", RFC 1738, December - 1994. - - [RFC 1766] Alvestrand, H., "Tags for the Identification of - Languages", RFC 1766, March 1995. - - - -Dawson & Howes Standards Track [Page 40] - -RFC 2426 vCard MIME Directory Profile September 1998 - - - [RFC 1872] Levinson, E., "The MIME Multipart/Related Content- - type", RFC 1872, December 1995. - - [RFC 2045] Freed, N., and N. Borenstein, "Multipurpose Internet - Mail Extensions (MIME) - Part One: Format of Internet - Message Bodies", RFC 2045, November 1996. - - [RFC 2046] Freed, N., and N. Borenstein, "Multipurpose Internet - Mail Extensions (MIME) - Part Two: Media Types", RFC - 2046, November 1996. 
- - [RFC 2047] Moore, K., "Multipurpose Internet Mail Extensions - (MIME) - Part Three: Message Header Extensions for - Non-ASCII Text", RFC 2047, November 1996. - - [RFC 2048] Freed, N., Klensin, J., and J. Postel, "Multipurpose - Internet Mail Extensions (MIME) - Part Four: - Registration Procedures", RFC 2048, January 1997. - - [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RFC 2234] Crocker, D., and P. Overell, "Augmented BNF for Syntax - Specifications: ABNF", RFC 2234, November 1997. - - [UNICODE] "The Unicode Standard - Version 2.0", The Unicode - Consortium, July 1996. - - [VCARD] Internet Mail Consortium, "vCard - The Electronic - Business Card Version 2.1", - http://www.imc.org/pdi/vcard-21.txt, September 18, - 1996. - - - - - - - - - - - - - - - - - - - -Dawson & Howes Standards Track [Page 41] - -RFC 2426 vCard MIME Directory Profile September 1998 - - -10. Full Copyright Statement - - Copyright (C) The Internet Society (1998). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. 
- - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - - - - - - - - - - - - - - - - - - - - - - - - -Dawson & Howes Standards Track [Page 42] - diff --git a/proto/rfc2595.txt b/proto/rfc2595.txt @@ -1,843 +0,0 @@ - - - - - - -Network Working Group C. Newman -Request for Comments: 2595 Innosoft -Category: Standards Track June 1999 - - - Using TLS with IMAP, POP3 and ACAP - - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1999). All Rights Reserved. - -1. Motivation - - The TLS protocol (formerly known as SSL) provides a way to secure an - application protocol from tampering and eavesdropping. The option of - using such security is desirable for IMAP, POP and ACAP due to common - connection eavesdropping and hijacking attacks [AUTH]. Although - advanced SASL authentication mechanisms can provide a lightweight - version of this service, TLS is complementary to simple - authentication-only SASL mechanisms or deployed clear-text password - login commands. 
- - Many sites have a high investment in authentication infrastructure - (e.g., a large database of a one-way-function applied to user - passwords), so a privacy layer which is not tightly bound to user - authentication can protect against network eavesdropping attacks - without requiring a new authentication infrastructure and/or forcing - all users to change their password. Recognizing that such sites will - desire simple password authentication in combination with TLS - encryption, this specification defines the PLAIN SASL mechanism for - use with protocols which lack a simple password authentication - command such as ACAP and SMTP. (Note there is a separate RFC for the - STARTTLS command in SMTP [SMTPTLS].) - - There is a strong desire in the IETF to eliminate the transmission of - clear-text passwords over unencrypted channels. While SASL can be - used for this purpose, TLS provides an additional tool with different - deployability characteristics. A server supporting both TLS with - - - - -Newman Standards Track [Page 1] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - - simple passwords and a challenge/response SASL mechanism is likely to - interoperate with a wide variety of clients without resorting to - unencrypted clear-text passwords. - - The STARTTLS command rectifies a number of the problems with using a - separate port for a "secure" protocol variant. Some of these are - mentioned in section 7. - -1.1. Conventions Used in this Document - - The key words "REQUIRED", "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", - "MAY", and "OPTIONAL" in this document are to be interpreted as - described in "Key words for use in RFCs to Indicate Requirement - Levels" [KEYWORDS]. - - Terms related to authentication are defined in "On Internet - Authentication" [AUTH]. - - Formal syntax is defined using ABNF [ABNF]. - - In examples, "C:" and "S:" indicate lines sent by the client and - server respectively. - -2. 
Basic Interoperability and Security Requirements - - The following requirements apply to all implementations of the - STARTTLS extension for IMAP, POP3 and ACAP. - -2.1. Cipher Suite Requirements - - Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [TLS] cipher - suite is REQUIRED. This is important as it assures that any two - compliant implementations can be configured to interoperate. - - All other cipher suites are OPTIONAL. - -2.2. Privacy Operational Mode Security Requirements - - Both clients and servers SHOULD have a privacy operational mode which - refuses authentication unless successful activation of an encryption - layer (such as that provided by TLS) occurs prior to or at the time - of authentication and which will terminate the connection if that - encryption layer is deactivated. Implementations are encouraged to - have flexibility with respect to the minimal encryption strength or - cipher suites permitted. A minimalist approach to this - recommendation would be an operational mode where the - TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA cipher suite is mandatory prior to - permitting authentication. - - - -Newman Standards Track [Page 2] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - - Clients MAY have an operational mode which uses encryption only when - it is advertised by the server, but authentication continues - regardless. For backwards compatibility, servers SHOULD have an - operational mode where only the authentication mechanisms required by - the relevant base protocol specification are needed to successfully - authenticate. - -2.3. Clear-Text Password Requirements - - Clients and servers which implement STARTTLS MUST be configurable to - refuse all clear-text login commands or mechanisms (including both - standards-track and nonstandard mechanisms) unless an encryption - layer of adequate strength is active. 
Servers which allow - unencrypted clear-text logins SHOULD be configurable to refuse - clear-text logins both for the entire server, and on a per-user - basis. - -2.4. Server Identity Check - - During the TLS negotiation, the client MUST check its understanding - of the server hostname against the server's identity as presented in - the server Certificate message, in order to prevent man-in-the-middle - attacks. Matching is performed according to these rules: - - - The client MUST use the server hostname it used to open the - connection as the value to compare against the server name as - expressed in the server certificate. The client MUST NOT use any - form of the server hostname derived from an insecure remote source - (e.g., insecure DNS lookup). CNAME canonicalization is not done. - - - If a subjectAltName extension of type dNSName is present in the - certificate, it SHOULD be used as the source of the server's - identity. - - - Matching is case-insensitive. - - - A "*" wildcard character MAY be used as the left-most name - component in the certificate. For example, *.example.com would - match a.example.com, foo.example.com, etc. but would not match - example.com. - - - If the certificate contains multiple names (e.g. more than one - dNSName field), then a match with any one of the fields is - considered acceptable. - - If the match fails, the client SHOULD either ask for explicit user - confirmation, or terminate the connection and indicate the server's - identity is suspect. - - - -Newman Standards Track [Page 3] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - -2.5. TLS Security Policy Check - - Both the client and server MUST check the result of the STARTTLS - command and subsequent TLS negotiation to see whether acceptable - authentication or privacy was achieved. Ignoring this step - completely invalidates using TLS for security. 
The decision about - whether acceptable authentication or privacy was achieved is made - locally, is implementation-dependent, and is beyond the scope of this - document. - -3. IMAP STARTTLS extension - - When the TLS extension is present in IMAP, "STARTTLS" is listed as a - capability in response to the CAPABILITY command. This extension - adds a single command, "STARTTLS" to the IMAP protocol which is used - to begin a TLS negotiation. - -3.1. STARTTLS Command - - Arguments: none - - Responses: no specific responses for this command - - Result: OK - begin TLS negotiation - BAD - command unknown or arguments invalid - - A TLS negotiation begins immediately after the CRLF at the end of - the tagged OK response from the server. Once a client issues a - STARTTLS command, it MUST NOT issue further commands until a - server response is seen and the TLS negotiation is complete. - - The STARTTLS command is only valid in non-authenticated state. - The server remains in non-authenticated state, even if client - credentials are supplied during the TLS negotiation. The SASL - [SASL] EXTERNAL mechanism MAY be used to authenticate once TLS - client credentials are successfully exchanged, but servers - supporting the STARTTLS command are not required to support the - EXTERNAL mechanism. - - Once TLS has been started, the client MUST discard cached - information about server capabilities and SHOULD re-issue the - CAPABILITY command. This is necessary to protect against - man-in-the-middle attacks which alter the capabilities list prior - to STARTTLS. The server MAY advertise different capabilities - after STARTTLS. 
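The capability handling described for STARTTLS can be sketched as follows. This is only an illustration in Python and is not part of the RFC text or of rohrpost; the class `CapabilityCache` and its method names are hypothetical, and the sketch assumes capabilities arrive as untagged "* CAPABILITY ..." lines. It shows one way a client might track what it has learned and throw it away once TLS is active.

```python
class CapabilityCache:
    """Client-side record of IMAP capabilities (hypothetical helper)."""

    def __init__(self):
        self.capabilities = set()
        self.tls_active = False
        # True whenever the client has no trustworthy capability list.
        self.needs_refresh = True

    def update_from_response(self, line):
        """Record an untagged '* CAPABILITY ...' line; ignore anything else."""
        parts = line.strip().split()
        if len(parts) >= 2 and parts[0] == "*" and parts[1].upper() == "CAPABILITY":
            self.capabilities = {p.upper() for p in parts[2:]}
            self.needs_refresh = False

    def may_starttls(self):
        """STARTTLS is only worth issuing if advertised and TLS is not yet active."""
        return not self.tls_active and "STARTTLS" in self.capabilities

    def on_tls_started(self):
        """Discard everything learned in the clear; it may have been altered."""
        self.capabilities = set()
        self.tls_active = True
        self.needs_refresh = True


# Usage with sample server lines (illustrative values only).
cache = CapabilityCache()
cache.update_from_response("* CAPABILITY IMAP4rev1 STARTTLS")
if cache.may_starttls():
    cache.on_tls_started()          # after the tagged OK and TLS handshake
assert cache.needs_refresh          # client should now re-issue CAPABILITY
```

Clearing the set instead of merging new values into it reflects the MUST in the text: nothing advertised before the TLS negotiation is kept.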
- - The formal syntax for IMAP is amended as follows: - - - - -Newman Standards Track [Page 4] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - - command_any =/ "STARTTLS" - - Example: C: a001 CAPABILITY - S: * CAPABILITY IMAP4rev1 STARTTLS LOGINDISABLED - S: a001 OK CAPABILITY completed - C: a002 STARTTLS - S: a002 OK Begin TLS negotiation now - <TLS negotiation, further commands are under TLS layer> - C: a003 CAPABILITY - S: * CAPABILITY IMAP4rev1 AUTH=EXTERNAL - S: a003 OK CAPABILITY completed - C: a004 LOGIN joe password - S: a004 OK LOGIN completed - -3.2. IMAP LOGINDISABLED capability - - The current IMAP protocol specification (RFC 2060) requires the - implementation of the LOGIN command which uses clear-text passwords. - Many sites may choose to disable this command unless encryption is - active for security reasons. An IMAP server MAY advertise that the - LOGIN command is disabled by including the LOGINDISABLED capability - in the capability response. Such a server will respond with a tagged - "NO" response to any attempt to use the LOGIN command. - - An IMAP server which implements STARTTLS MUST implement support for - the LOGINDISABLED capability on unencrypted connections. - - An IMAP client which complies with this specification MUST NOT issue - the LOGIN command if this capability is present. - - This capability is useful to prevent clients compliant with this - specification from sending an unencrypted password in an environment - subject to passive attacks. It has no impact on an environment - subject to active attacks as a man-in-the-middle attacker can remove - this capability. Therefore this does not relieve clients of the need - to follow the privacy mode recommendation in section 2.2. - - Servers advertising this capability will fail to interoperate with - many existing compliant IMAP clients and will be unable to prevent - those clients from disclosing the user's password. - -4. 
POP3 STARTTLS extension - - The POP3 STARTTLS extension adds the STLS command to POP3 servers. - If this is implemented, the POP3 extension mechanism [POP3EXT] MUST - also be implemented to avoid the need for client probing of multiple - commands. The capability name "STLS" indicates this command is - present and permitted in the current state. - - - -Newman Standards Track [Page 5] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - - STLS - - Arguments: none - - Restrictions: - Only permitted in AUTHORIZATION state. - - Discussion: - A TLS negotiation begins immediately after the CRLF at the - end of the +OK response from the server. A -ERR response - MAY result if a security layer is already active. Once a - client issues a STLS command, it MUST NOT issue further - commands until a server response is seen and the TLS - negotiation is complete. - - The STLS command is only permitted in AUTHORIZATION state - and the server remains in AUTHORIZATION state, even if - client credentials are supplied during the TLS negotiation. - The AUTH command [POP-AUTH] with the EXTERNAL mechanism - [SASL] MAY be used to authenticate once TLS client - credentials are successfully exchanged, but servers - supporting the STLS command are not required to support the - EXTERNAL mechanism. - - Once TLS has been started, the client MUST discard cached - information about server capabilities and SHOULD re-issue - the CAPA command. This is necessary to protect against - man-in-the-middle attacks which alter the capabilities list - prior to STLS. The server MAY advertise different - capabilities after STLS. - - Possible Responses: - +OK -ERR - - Examples: - C: STLS - S: +OK Begin TLS negotiation - <TLS negotiation, further commands are under TLS layer> - ... - C: STLS - S: -ERR Command not permitted when TLS active - - - - - - - - - - -Newman Standards Track [Page 6] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - -5. 
ACAP STARTTLS extension - - When the TLS extension is present in ACAP, "STARTTLS" is listed as a - capability in the ACAP greeting. No arguments to this capability are - defined at this time. This extension adds a single command, - "STARTTLS" to the ACAP protocol which is used to begin a TLS - negotiation. - -5.1. STARTTLS Command - - Arguments: none - - Responses: no specific responses for this command - - Result: OK - begin TLS negotiation - BAD - command unknown or arguments invalid - - A TLS negotiation begins immediately after the CRLF at the end of - the tagged OK response from the server. Once a client issues a - STARTTLS command, it MUST NOT issue further commands until a - server response is seen and the TLS negotiation is complete. - - The STARTTLS command is only valid in non-authenticated state. - The server remains in non-authenticated state, even if client - credentials are supplied during the TLS negotiation. The SASL - [SASL] EXTERNAL mechanism MAY be used to authenticate once TLS - client credentials are successfully exchanged, but servers - supporting the STARTTLS command are not required to support the - EXTERNAL mechanism. - - After the TLS layer is established, the server MUST re-issue an - untagged ACAP greeting. This is necessary to protect against - man-in-the-middle attacks which alter the capabilities list prior - to STARTTLS. The client MUST discard cached capability - information and replace it with the information from the new ACAP - greeting. The server MAY advertise different capabilities after - STARTTLS. - - The formal syntax for ACAP is amended as follows: - - command_any =/ "STARTTLS" - - Example: S: * ACAP (SASL "CRAM-MD5") (STARTTLS) - C: a002 STARTTLS - S: a002 OK "Begin TLS negotiation now" - <TLS negotiation, further commands are under TLS layer> - S: * ACAP (SASL "CRAM-MD5" "PLAIN" "EXTERNAL") - - - - -Newman Standards Track [Page 7] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - -6. 
PLAIN SASL mechanism - - Clear-text passwords are simple, interoperate with almost all - existing operating system authentication databases, and are useful - for a smooth transition to a more secure password-based - authentication mechanism. The drawback is that they are unacceptable - for use over an unencrypted network connection. - - This defines the "PLAIN" SASL mechanism for use with ACAP and other - protocols with no clear-text login command. The PLAIN SASL mechanism - MUST NOT be advertised or used unless a strong encryption layer (such - as that provided by TLS) is active or backwards compatibility dictates - otherwise. - - The mechanism consists of a single message from the client to the - server. The client sends the authorization identity (identity to - login as), followed by a US-ASCII NUL character, followed by the - authentication identity (identity whose password will be used), - followed by a US-ASCII NUL character, followed by the clear-text - password. The client may leave the authorization identity empty to - indicate that it is the same as the authentication identity. - - The server will verify the authentication identity and password with - the system authentication database and verify that the authentication - credentials permit the client to login as the authorization identity. - If both steps succeed, the user is logged in. - - The server MAY also use the password to initialize any new - authentication database, such as one suitable for CRAM-MD5 - [CRAM-MD5]. - - Non-US-ASCII characters are permitted as long as they are represented - in UTF-8 [UTF-8]. Use of non-visible characters or characters which - a user may be unable to enter on some keyboards is discouraged. - - The formal grammar for the client message using Augmented BNF [ABNF] - follows. 
- - message = [authorize-id] NUL authenticate-id NUL password - authenticate-id = 1*UTF8-SAFE ; MUST accept up to 255 octets - authorize-id = 1*UTF8-SAFE ; MUST accept up to 255 octets - password = 1*UTF8-SAFE ; MUST accept up to 255 octets - NUL = %x00 - UTF8-SAFE = %x01-09 / %x0B-0C / %x0E-7F / UTF8-2 / - UTF8-3 / UTF8-4 / UTF8-5 / UTF8-6 - UTF8-1 = %x80-BF - UTF8-2 = %xC0-DF UTF8-1 - UTF8-3 = %xE0-EF 2UTF8-1 - - - -Newman Standards Track [Page 8] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - - UTF8-4 = %xF0-F7 3UTF8-1 - UTF8-5 = %xF8-FB 4UTF8-1 - UTF8-6 = %xFC-FD 5UTF8-1 - - Here is an example of how this might be used to initialize a CRAM-MD5 - authentication database for ACAP: - - Example: S: * ACAP (SASL "CRAM-MD5") (STARTTLS) - C: a001 AUTHENTICATE "CRAM-MD5" - S: + "<1896.697170952@postoffice.reston.mci.net>" - C: "tim b913a602c7eda7a495b4e6e7334d3890" - S: a001 NO (TRANSITION-NEEDED) - "Please change your password, or use TLS to login" - C: a002 STARTTLS - S: a002 OK "Begin TLS negotiation now" - <TLS negotiation, further commands are under TLS layer> - S: * ACAP (SASL "CRAM-MD5" "PLAIN" "EXTERNAL") - C: a003 AUTHENTICATE "PLAIN" {21+} - C: <NUL>tim<NUL>tanstaaftanstaaf - S: a003 OK CRAM-MD5 password initialized - - Note: In this example, <NUL> represents a single ASCII NUL octet. - -7. imaps and pop3s ports - - Separate "imaps" and "pop3s" ports were registered for use with SSL. - Use of these ports is discouraged in favor of the STARTTLS or STLS - commands. - - A number of problems have been observed with separate ports for - "secure" variants of protocols. This is an attempt to enumerate some - of those problems. - - - Separate ports lead to a separate URL scheme which intrudes into - the user interface in inappropriate ways. For example, many web - pages use language like "click here if your browser supports SSL." - This is a decision the browser is often more capable of making than - the user. 
- - - Separate ports imply a model of either "secure" or "not secure." - This can be misleading in a number of ways. First, the "secure" - port may not in fact be acceptably secure as an export-crippled - cipher suite might be in use. This can mislead users into a false - sense of security. Second, the normal port might in fact be - secured by using a SASL mechanism which includes a security layer. - Thus the separate port distinction makes the complex topic of - security policy even more confusing. One common result of this - confusion is that firewall administrators are often misled into - - - -Newman Standards Track [Page 9] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - - permitting the "secure" port and blocking the standard port. This - could be a poor choice given the common use of SSL with a 40-bit - key encryption layer and plain-text password authentication is less - secure than strong SASL mechanisms such as GSSAPI with Kerberos 5. - - - Use of separate ports for SSL has caused clients to implement only - two security policies: use SSL or don't use SSL. The desirable - security policy "use TLS when available" would be cumbersome with - the separate port model, but is simple with STARTTLS. - - - Port numbers are a limited resource. While they are not yet in - short supply, it is unwise to set a precedent that could double (or - worse) the speed of their consumption. - - -8. IANA Considerations - - This constitutes registration of the "STARTTLS" and "LOGINDISABLED" - IMAP capabilities as required by section 7.2.1 of RFC 2060 [IMAP]. - - The registration for the POP3 "STLS" capability follows: - - CAPA tag: STLS - Arguments: none - Added commands: STLS - Standard commands affected: May enable USER/PASS as a side-effect. - CAPA command SHOULD be re-issued after successful completion. - Announced states/Valid states: AUTHORIZATION state only. 
- Specification reference: this memo - - The registration for the ACAP "STARTTLS" capability follows: - - Capability name: STARTTLS - Capability keyword: STARTTLS - Capability arguments: none - Published Specification(s): this memo - Person and email address for further information: - see author's address section below - - The registration for the PLAIN SASL mechanism follows: - - SASL mechanism name: PLAIN - Security Considerations: See section 9 of this memo - Published specification: this memo - Person & email address to contact for further information: - see author's address section below - Intended usage: COMMON - Author/Change controller: see author's address section below - - - -Newman Standards Track [Page 10] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - -9. Security Considerations - - TLS only provides protection for data sent over a network connection. - Messages transferred over IMAP or POP3 are still available to server - administrators and usually subject to eavesdropping, tampering and - forgery when transmitted through SMTP or NNTP. TLS is no substitute - for an end-to-end message security mechanism using MIME security - multiparts [MIME-SEC]. - - A man-in-the-middle attacker can remove STARTTLS from the capability - list or generate a failure response to the STARTTLS command. In - order to detect such an attack, clients SHOULD warn the user when - session privacy is not active and/or be configurable to refuse to - proceed without an acceptable level of security. - - A man-in-the-middle attacker can always cause a down-negotiation to - the weakest authentication mechanism or cipher suite available. For - this reason, implementations SHOULD be configurable to refuse weak - mechanisms or cipher suites. - - Any protocol interactions prior to the TLS handshake are performed in - the clear and can be modified by a man-in-the-middle attacker. 
For - this reason, clients MUST discard cached information about server - capabilities advertised prior to the start of the TLS handshake. - - Clients are encouraged to clearly indicate when the level of - encryption active is known to be vulnerable to attack using modern - hardware (such as encryption keys with 56 bits of entropy or less). - - The LOGINDISABLED IMAP capability (discussed in section 3.2) only - reduces the potential for passive attacks, it provides no protection - against active attacks. The responsibility remains with the client - to avoid sending a password over a vulnerable channel. - - The PLAIN mechanism relies on the TLS encryption layer for security. - When used without TLS, it is vulnerable to a common network - eavesdropping attack. Therefore PLAIN MUST NOT be advertised or used - unless a suitable TLS encryption layer is active or backwards - compatibility dictates otherwise. - - When the PLAIN mechanism is used, the server gains the ability to - impersonate the user to all services with the same password - regardless of any encryption provided by TLS or other network privacy - mechanisms. While many other authentication mechanisms have similar - weaknesses, stronger SASL mechanisms such as Kerberos address this - issue. Clients are encouraged to have an operational mode where all - mechanisms which are likely to reveal the user's password to the - server are disabled. - - - -Newman Standards Track [Page 11] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - - The security considerations for TLS apply to STARTTLS and the - security considerations for SASL apply to the PLAIN mechanism. - Additional security requirements are discussed in section 2. - -10. References - - [ABNF] Crocker, D. and P. Overell, "Augmented BNF for Syntax - Specifications: ABNF", RFC 2234, November 1997. - - [ACAP] Newman, C. and J. Myers, "ACAP -- Application - Configuration Access Protocol", RFC 2244, November 1997. - - [AUTH] Haller, N. and R. 
Atkinson, "On Internet Authentication", - RFC 1704, October 1994. - - [CRAM-MD5] Klensin, J., Catoe, R. and P. Krumviede, "IMAP/POP - AUTHorize Extension for Simple Challenge/Response", RFC - 2195, September 1997. - - [IMAP] Crispin, M., "Internet Message Access Protocol - Version - 4rev1", RFC 2060, December 1996. - - [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [MIME-SEC] Galvin, J., Murphy, S., Crocker, S. and N. Freed, - "Security Multiparts for MIME: Multipart/Signed and - Multipart/Encrypted", RFC 1847, October 1995. - - [POP3] Myers, J. and M. Rose, "Post Office Protocol - Version 3", - STD 53, RFC 1939, May 1996. - - [POP3EXT] Gellens, R., Newman, C. and L. Lundblade, "POP3 Extension - Mechanism", RFC 2449, November 1998. - - [POP-AUTH] Myers, J., "POP3 AUTHentication command", RFC 1734, - December 1994. - - [SASL] Myers, J., "Simple Authentication and Security Layer - (SASL)", RFC 2222, October 1997. - - [SMTPTLS] Hoffman, P., "SMTP Service Extension for Secure SMTP over - TLS", RFC 2487, January 1999. - - [TLS] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", - RFC 2246, January 1999. - - - - - -Newman Standards Track [Page 12] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - - [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO - 10646", RFC 2279, January 1998. - - -11. Author's Address - - Chris Newman - Innosoft International, Inc. - 1050 Lakes Drive - West Covina, CA 91790 USA - - EMail: chris.newman@innosoft.com - - -A. Appendix -- Compliance Checklist - - An implementation is not compliant if it fails to satisfy one or more - of the MUST requirements for the protocols it implements. 
An - implementation that satisfies all the MUST and all the SHOULD - requirements for its protocols is said to be "unconditionally - compliant"; one that satisfies all the MUST requirements but not all - the SHOULD requirements for its protocols is said to be - "conditionally compliant". - - Rules Section - ----- ------- - Mandatory-to-implement Cipher Suite 2.1 - SHOULD have mode where encryption required 2.2 - server SHOULD have mode where TLS not required 2.2 - MUST be configurable to refuse all clear-text login - commands or mechanisms 2.3 - server SHOULD be configurable to refuse clear-text - login commands on entire server and on per-user basis 2.3 - client MUST check server identity 2.4 - client MUST use hostname used to open connection 2.4 - client MUST NOT use hostname from insecure remote lookup 2.4 - client SHOULD support subjectAltName of dNSName type 2.4 - client SHOULD ask for confirmation or terminate on fail 2.4 - MUST check result of STARTTLS for acceptable privacy 2.5 - client MUST NOT issue commands after STARTTLS - until server response and negotiation done 3.1,4,5.1 - client MUST discard cached information 3.1,4,5.1,9 - client SHOULD re-issue CAPABILITY/CAPA command 3.1,4 - IMAP server with STARTTLS MUST implement LOGINDISABLED 3.2 - IMAP client MUST NOT issue LOGIN if LOGINDISABLED 3.2 - POP server MUST implement POP3 extensions 4 - ACAP server MUST re-issue ACAP greeting 5.1 - - - - -Newman Standards Track [Page 13] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - - client SHOULD warn when session privacy not active and/or - refuse to proceed without acceptable security level 9 - SHOULD be configurable to refuse weak mechanisms or - cipher suites 9 - - The PLAIN mechanism is an optional part of this specification. 
- However if it is implemented the following rules apply: - - Rules Section - ----- ------- - MUST NOT use PLAIN unless strong encryption active - or backwards compatibility dictates otherwise 6,9 - MUST use UTF-8 encoding for characters in PLAIN 6 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Newman Standards Track [Page 14] - -RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 - - -Full Copyright Statement - - Copyright (C) The Internet Society (1999). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. 
- - - - - - - - - - - - - - - - - - - -Newman Standards Track [Page 15] - diff --git a/proto/rfc2646.txt b/proto/rfc2646.txt @@ -1,787 +0,0 @@ - - - - - - -Network Working Group R. Gellens, Editor -Request for Comments: 2646 Qualcomm -Updates: 2046 August 1999 -Category: Standards Track - - - The Text/Plain Format Parameter - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (1999). All Rights Reserved. - -Table of Contents - - 1. Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . 2 - 2. Conventions Used in this Document . . . . . . . . . . . . . 2 - 3. The Problem . . . . . . . . . . . . . . . . . . . . . . . . 2 - 3.1. Paragraph Text . . . . . . . . . . . . . . . . . . . . 3 - 3.2. Embarrassing Line Wrap . . . . . . . . . . . . . . . . . 3 - 3.3. New Media Types . . . . . . . . . . . . . . . . . . . . 4 - 4. The Format Parameter to the Text/Plain Media Type . . . . . 4 - 4.1. Generating Format=Flowed . . . . . . . . . . . . . . . 5 - 4.2. Interpreting Format=Flowed . . . . . . . . . . . . . . . 6 - 4.3. Usenet Signature Convention . . . . . . . . . . . . . . 7 - 4.4. Space-Stuffing . . . . . . . . . . . . . . . . . . . . . 7 - 4.5. Quoting . . . . . . . . . . . . . . . . . . . . . . . . 8 - 4.6. Digital Signatures and Encryption . . . . . . . . . . . 9 - 4.7. Line Analysis Table . . . . . . . . . . . . . . . . . . 10 - 4.8. Examples . . . . . . . . . . . . . . . . . . . . . . . . 10 - 5. ABNF . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 6. Failure Modes . . . . . . . . . . . . . . . . . . . . . . . 11 - 6.1. Trailing White Space Corruption . . . . . . . . . . . . 
11 - 7. Security Considerations . . . . . . . . . . . . . . . . . . 12 - 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . 12 - 9. Internationalization Considerations . . . . . . . . . . . . 12 - 10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . 12 - 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 12. Editor's Address . . . . . . . . . . . . . . . . . . . . . 13 - 13. Full Copyright Statement . . . . . . . . . . . . . . . . . . 14 - - - - -Gellens Standards Track [Page 1] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - -1. Abstract - - Interoperability problems have been observed with erroneous labelling - of paragraph text as Text/Plain, and with various forms of - "embarrassing line wrap." (See section 3.) - - Attempts to deploy new media types, such as Text/Enriched [RICH] and - Text/HTML [HTML] have suffered from a lack of backwards compatibility - and an often hostile user reaction at the receiving end. - - What is required is a format which is in all significant ways - Text/Plain, and therefore is quite suitable for display as - Text/Plain, and yet allows the sender to express to the receiver - which lines can be considered a logical paragraph, and thus flowed - (wrapped and joined) as appropriate. - - This memo proposes a new parameter to be used with Text/Plain, and, - in the presence of this parameter, the use of trailing whitespace to - indicate flowed lines. This results in an encoding which appears as - normal Text/Plain in older implementations, since it is in fact - normal Text/Plain. - -2. Conventions Used in this Document - - The key words "REQUIRED", "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", - and "MAY" in this document are to be interpreted as described in "Key - words for use in RFCs to Indicate Requirement Levels" [KEYWORDS]. - -3. 
The Problem - - The Text/Plain media type is the lowest common denominator of - Internet email, with lines of no more than 997 characters (by - convention usually no more than 80), and where the CRLF sequence - represents a line break [MIME-IMT]. - - Text/Plain is usually displayed as preformatted text, often in a - fixed font. That is, the characters start at the left margin of the - display window, and advance to the right until a CRLF sequence is - seen, at which point a new line is started, again at the left margin. - When a line length exceeds the display window, some clients will wrap - the line, while others invoke a horizontal scroll bar. - - Text which meets this description is defined by this memo as "fixed". - - Some interoperability problems have been observed with this media - type: - - - - - -Gellens Standards Track [Page 2] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - -3.1. Paragraph Text - - Many modern programs use a proportional-spaced font and CRLF to - represent paragraph breaks. Line breaks are "soft", occurring as - needed on display. That is, characters are grouped into a paragraph - until a CRLF sequence is seen, at which point a new paragraph is - started. Each paragraph is displayed, starting at the left margin - (or paragraph indent), and continuing to the right until a word is - encountered which does not fit in the remaining display width. This - word is displayed at the left margin of the next line. This - continues until the paragraph ends (a CRLF is seen). Extra vertical - space is left between paragraphs. - - Text which meets this description is defined by this memo as - "flowed". - - Numerous software products erroneously label this media type as - Text/Plain, resulting in much user discomfort. - -3.2. Embarrassing Line Wrap - - As Text/Plain messages get quoted in replies or forwarded messages, - the length of each line gradually increases, resulting in - "embarrassing line wrap." 
This results in text which is at best hard - to read, and often confuses attributions. - - Example: - - >>>>>>This is a comment from the first message to show a - >quoting example. - >>>>>This is a comment from the second message to show a - >quoting example. - >>>>This is a comment from the third message. - >>>This is a comment from the fourth message. - - It can be confusing to assign attribution to lines 2 and 4 above. - - In addition, as devices with display widths smaller than 80 - characters become more popular, embarrassing line wrap has become - even more prevalent, even with unquoted text. - - - - - - - - - - - -Gellens Standards Track [Page 3] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - - Example: - - This is paragraph text that is - meant to be flowed across - several lines. - However, the sending mailer is - converting it to fixed text at - a width of 72 - characters, which causes it to - look like this when shown on a - PDA with only - 30 character lines. - -3.3. New Media Types - - Attempts to deploy new media types, such as Text/Enriched [RICH] and - Text/HTML [HTML] have suffered from a lack of backwards compatibility - and an often hostile user reaction at the receiving end. - - In particular, Text/Enriched requires that open angle brackets ("<") - and hard line breaks be doubled, with resulting user unhappiness when - viewed as Text/Plain. Text/HTML requires even more alteration of - text, with a corresponding increase in user complaints. - - A proposal to define a new media type to explicitly represent the - paragraph form suffered from a lack of interoperability with - currently deployed software. Some programs treat unknown subtypes of - Text as an attachment. 
- - What is desired is a format which is in all significant ways - Text/Plain, and therefore is quite suitable for display as - Text/Plain, and yet allows the sender to express to the receiver - which lines can be considered a logical paragraph, and thus flowed - (wrapped and joined) as appropriate. - -4. The Format Parameter to the Text/Plain Media Type - - This document defines a new MIME parameter for use with Text/Plain: - - Name: Format - Value: Fixed, Flowed - - (Neither the parameter name nor its value are case sensitive.) - - If not specified, a value of Fixed is assumed. The semantics of the - Fixed value are the usual associated with Text/Plain [MIME-IMT]. - - - - - -Gellens Standards Track [Page 4] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - - A value of Flowed indicates that the definition of flowed text (as - specified in this memo) was used on generation, and MAY be used on - reception. - - This section discusses flowed text; section 5 provides a formal - definition. - - Because flowed lines are all-but-indistinguishable from fixed lines, - currently deployed software treats flowed lines as normal Text/Plain - (which is what they are). Thus, no interoperability problems are - expected. - - Note that this memo describes an on-the-wire format. It does not - address formats for local file storage. - -4.1. Generating Format=Flowed - - When generating Format=Flowed text, lines SHOULD be shorter than 80 - characters. As suggested values, any paragraph longer than 79 - characters in total length could be wrapped using lines of 72 or - fewer characters. While the specific line length used is a matter of - aesthetics and preference, longer lines are more likely to require - rewrapping and to encounter difficulties with older mailers. It has - been suggested that 66 character lines are the most readable. 
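The wrapping suggested above can be sketched as follows. This is only an illustration under stated assumptions, not part of the memo: the function name `wrap_flowed` is hypothetical, the default width of 72 is the suggested value from the text, runs of white space in the input are collapsed to a single space, and quoting and lines needing special protection are not handled. Each generated line except the last ends in one space, which is the trailing white space this memo uses to mark a flowed line; the CRLF itself is left to the caller.

```python
def wrap_flowed(paragraph, width=72):
    """Wrap one paragraph into Format=Flowed lines (CRLF not included).

    Hypothetical helper: every line but the last keeps one trailing
    space (flowed); the last line has none (fixed).
    """
    lines = []
    current = ""
    for word in paragraph.split():
        # Flush the current line if this word plus its trailing space
        # would push the line past the requested width.
        if current and len(current) + len(word) + 1 > width:
            lines.append(current)  # already ends in a space
            current = ""
        # A word longer than the width is sent as is, on its own line.
        current += word + " "
    if current:
        lines.append(current.rstrip(" "))  # final line is fixed
    return lines
```

A caller would append CRLF to each returned line before sending. Because words are never split, a single over-long word still produces a line longer than the width.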
- - (The reason for the restriction to 79 or fewer characters between - CRLFs on the wire is to ensure that all lines, even when displayed by - a non-flowed-aware program, will fit in a standard 80-column screen - without having to be wrapped. The limit is 79, not 80, because while - 80 fit on a line, the last column is often reserved for a line-wrap - indicator.) - - When creating flowed text, the generating agent wraps, that is, - inserts 'soft' line breaks as needed. Soft line breaks are added - between words. Because a soft line break is a SP CRLF sequence, the - generating agent creates one by inserting a CRLF after the occurance - of a space. - - A generating agent SHOULD NOT insert white space into a word (a - sequence of printable characters not containing spaces). If faced - with a word which exceeds 79 characters (but less than 998 - characters, the [SMTP] limit on line length), the agent SHOULD send - the word as is and exceed the 79-character limit on line length. - - - - - - - - -Gellens Standards Track [Page 5] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - - A generating agent SHOULD: - - 1. Ensure all lines (fixed and flowed) are 79 characters or - fewer in length, counting the trailing space but not - counting the CRLF, unless a word by itself exceeds 79 - characters. - 2. Trim spaces before user-inserted hard line breaks. - 3. Space-stuff lines which start with a space, "From ", or - ">". - - In order to create messages which do not require space-stuffing, and - are thus more aesthetically pleasing when viewed as Format=Fixed, a - generating agent MAY avoid wrapping immediately before ">", "From ", - or space. - - (See sections 4.4 and 4.5 for more information on space-stuffing and - quoting, respectively.) - - A Format=Flowed message consists of zero or more paragraphs, each - containing one or more flowed lines followed by one fixed line. 
The - usual case is a series of flowed text lines with blank (empty) fixed - lines between them. - - Any number of fixed lines can appear between paragraphs. - - [Quoted-Printable] encoding SHOULD NOT be used with Format=Flowed - unless absolutely necessary (for example, non-US-ASCII (8-bit) - characters over a strictly 7-bit transport such as unextended SMTP). - In particular, a message SHOULD NOT be encoded in Quoted-Printable - for the sole purpose of protecting the trailing space on flowed lines - unless the body part is cryptographically signed or encrypted (see - Section 4.6). - - The intent of Format=Flowed is to allow user agents to generate - flowed text which is non-obnoxious when viewed as pure, raw - Text/Plain (without any decoding); use of Quoted-Printable hinders - this and may cause Format=Flowed to be rejected by end users. - -4.2. Interpreting Format=Flowed - - If the first character of a line is a quote mark (">"), the line is - considered to be quoted (see section 4.5). Logically, all quote - marks are counted and deleted, resulting in a line with a non-zero - quote depth, and content. (The agent is of course free to display the - content with quote marks or excerpt bars or anything else.) - Logically, this test for quoted lines is done before any other tests - (that is, before checking for space-stuffed and flowed). - - - - -Gellens Standards Track [Page 6] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - - If the first character of a line is a space, the line has been - space-stuffed (see section 4.4). Logically, this leading space is - deleted before examining the line further (that is, before checking - for flowed). - - If the line ends in one or more spaces, the line is flowed. - Otherwise it is fixed. Trailing spaces are part of the line's - content, but the CRLF of a soft line break is not. 
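The three tests above are applied in a fixed order: quote marks first, then the stuffed space, then the trailing-space check. A minimal sketch of that order, assuming the line has already had its CRLF removed (the name `analyze_line` and the tuple it returns are hypothetical, and the signature separator special case of section 4.3 is not handled):

```python
def analyze_line(line):
    """Return (quote_depth, content, is_flowed) for one line without CRLF."""
    # 1. Count and delete the leading quote marks.
    depth = len(line) - len(line.lstrip(">"))
    content = line[depth:]
    # 2. Delete a single stuffed space, if present.
    if content.startswith(" "):
        content = content[1:]
    # 3. A line ending in one or more spaces is flowed, otherwise fixed.
    #    The trailing spaces stay in the content.
    is_flowed = content.endswith(" ")
    return depth, content, is_flowed
```

Only one leading space is removed in step 2, so a line that originally began with a space and was space-stuffed on generation gets its own leading space back.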
- - A series of one or more flowed lines followed by one fixed line is - considered a paragraph, and MAY be flowed (wrapped and unwrapped) as - appropriate on display and in the construction of new messages (see - section 4.5). - - A line consisting of one or more spaces (after deleting a stuffed - space) is considered a flowed line. - -4.3. Usenet Signature Convention - - There is a convention in Usenet news of using "-- " as the separator - line between the body and the signature of a message. When - generating a Format=Flowed message containing a Usenet-style - separator before the signature, the separator line is sent as-is. - This is a special case; an (optionally quoted) line consisting of - DASH DASH SP is not considered flowed. - -4.4. Space-Stuffing - - In order to allow for unquoted lines which start with ">", and to - protect against systems which "From-munge" in-transit messages - (modifying any line which starts with "From " to ">From "), - Format=Flowed provides for space-stuffing. - - Space-stuffing adds a single space to the start of any line which - needs protection when the message is generated. On reception, if the - first character of a line is a space, it is logically deleted. This - occurs after the test for a quoted line, and before the test for a - flowed line. - - On generation, any unquoted lines which start with ">", and any lines - which start with a space or "From " SHOULD be space-stuffed. Other - lines MAY be space-stuffed as desired. - - (Note that space-stuffing is similar to dot-stuffing as specified in - [SMTP].) - - - - - - -Gellens Standards Track [Page 7] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - - If a space-stuffed message is received by an agent which handles - Format=Flowed, the space-stuffing is reversed and thus the message - appears unchanged. 
An agent which is not aware of Format=Flowed will - of course not undo any space-stuffing, thus Format=Flowed messages - may appear with a leading space on some lines (those which start with - a space, ">" which is not a quote indicator, or "From "). Since - lines which require space-stuffing rarely occur, and the aesthetic - consequences of unreversed space-stuffing are minimal, this is not - expected to be a significant problem. - -4.5. Quoting - - In Format=Flowed, the canonical quote indicator (or quote mark) is - one or more close angle bracket (">") characters. Lines which start - with the quote indicator are considered quoted. The number of ">" - characters at the start of the line specifies the quote depth. - Flowed lines which are also quoted may require special handling on - display and when copied to new messages. - - When creating quoted flowed lines, each such line starts with the - quote indicator. - - Note that because of space-stuffing, the lines - >> Exit, Stage Left - and - >>Exit, Stage Left - are semantically identical; both have a quote-depth of two, and a - content of "Exit, Stage Left". - - However, the line - > > Exit, Stage Left - is different. It has a quote-depth of one, and a content of - "> Exit, Stage Left". - - When generating quoted flowed lines, an agent needs to pay attention - to changes in quote depth. A sequence of quoted lines of the same - quote depth SHOULD be encoded as a paragraph, with the last line - generated as fixed and prior lines generated as flowed. - - If a receiving agent wishes to reformat flowed quoted lines (joining - and/or wrapping them) on display or when generating new messages, the - lines SHOULD be de-quoted, reformatted, and then re-quoted. To - de-quote, the number of close angle brackets in the quote indicator - at the start of each line is counted. Consecutive lines with the - same quoting depth are considered one paragraph and are reformatted - together. 
To re-quote after reformatting, a quote indicator - containing the same number of close angle brackets originally present - is prefixed to each line. - - - -Gellens Standards Track [Page 8] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - - On reception, if a change in quoting depth occurs on a flowed line, - this is an improperly formatted message. The receiver SHOULD handle - this error by using the 'quote-depth-wins' rule, which is to ignore - the flowed indicator and treat the line as fixed. That is, the - change in quote depth ends the paragraph. - - For example, consider the following sequence of lines (using '*' to - indicate a soft line break, i.e., SP CRLF, and '#' to indicate a hard - line break, i.e., CRLF): - - > Thou villainous ill-breeding spongy dizzy-eyed* - > reeky elf-skinned pigeon-egg!* <--- problem ---< - >> Thou artless swag-bellied milk-livered* - >> dismal-dreaming idle-headed scut!# - >>> Thou errant folly-fallen spleeny reeling-ripe* - >>> unmuzzled ratsbane!# - >>>> Henceforth, the coding style is to be strictly* - >>>> enforced, including the use of only upper case.# - >>>>> I've noticed a lack of adherence to the coding* - >>>>> styles, of late.# - >>>>>> Any complaints?# - - The second line ends in a soft line break, even though it is the last - line of the one-deep quote block. The question then arises as to how - this line should be interpreted, considering that the next line is - the first line of the two-deep quote block. - - The example text above, when processed according to quote-depth wins, - results in the first two lines being considered as one quoted, flowed - section, with a quote depth of 1; the third and fourth lines become a - quoted, flowed section, with a quote depth of 2. - - A generating agent SHOULD NOT create this situation; a receiving - agent SHOULD handle it using quote-depth wins. - -4.6. 
Digital Signatures and Encryption - - If a message is digitally signed or encrypted it is important that - cryptographic processing use the on-the-wire Format=Flowed format. - That is, during generation the message SHOULD be prepared for - transmission, including addition of soft line breaks, space-stuffing, - and [Quoted-Printable] encoding (to protect soft line breaks) before - being digitally signed or encrypted; similarly, on receipt the - message SHOULD have the signature verified or be decrypted before - [Quoted-Printable] decoding and removal of stuffed spaces, soft line - breaks and quote marks, and reflowing. - - - - - -Gellens Standards Track [Page 9] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - -4.7. Line Analysis Table - - Lines contained in a Text/Plain body part with Format=Flowed can be - analyzed by examining the start and end of the line. If the line - starts with the quote indicator, it is quoted. If the line ends with - one or more space characters, it is flowed. This is summarized by - the following table: - - Starts Ends in - with One or Line - Quote More Spaces Type - ------ ----------- --------------- - no no unquoted, fixed - yes no quoted, fixed - no yes unquoted, flowed - yes yes quoted, flowed - -4.8. Examples - - The following example contains three paragraphs: - - `Take some more tea,' the March Hare said to Alice, very - earnestly. - - `I've had nothing yet,' Alice replied in an offended tone, `so I - can't take more.' - - `You mean you can't take LESS,' said the Hatter: `it's very easy - to take MORE than nothing.' 
- - This could be encoded as follows (using '*' to indicate a soft line - break, that is, SP CRLF sequence, and '#' to indicate a hard line - break, that is, CRLF): - - `Take some more tea,' the March Hare said to Alice, very* - earnestly.* - # - `I've had nothing yet,' Alice replied in an offended tone, `so* - I can't take more.'* - # - `You mean you can't take LESS,' said the Hatter: `it's very* - easy to take MORE than nothing.'# - - - - - - - - - -Gellens Standards Track [Page 10] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - - To show an example of quoting, here we have the same exchange, - presented as a series of direct quotes: - - >>>Take some more tea.# - >>I've had nothing yet, so I can't take more.# - >You mean you can't take LESS, it's very easy to take* - >MORE than nothing.# - -5. ABNF - - The constructs used in Text/Plain; Format=Flowed body parts are - described using [ABNF], including the Core Rules: - - paragraph = 1*flowed-line fixed-line - fixed-line = fixed / sig-sep - fixed = [quote] [stuffing] *text-char non-sp CRLF - flowed-line = flow-qt / flow-unqt - flow-qt = quote [stuffing] *text-char 1*SP CRLF - flow-unqt = [stuffing] *text-char 1*SP CRLF - non-sp = %x01-09 / %x0B / %x0C / %x0E-1F / %x21-7F - ; any 7-bit US-ASCII character, excluding - ; NUL, CR, LF, and SP - quote = 1*">" - sig-sep = [quote] "--" SP CRLF - stuffing = [SP] ; space-stuffed, added on generation if - ; needed, deleted on reception - text-char = non-sp / SP - -6. Failure Modes - -6.1. Trailing White Space Corruption - - There are systems in existence which alter trailing whitespace on - messages which pass through them. Such systems may strip, or in - rarer cases, add trailing whitespace, in violation of RFC 821 [SMTP] - section 4.5.2. - - Stripping trailing whitespace has the effect of converting flowed - lines to fixed lines, which results in a message no worse than if - Format=Flowed had not been used. 
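The effect of stripping can be illustrated with a small sketch. It assumes unquoted lines with no space-stuffing and with CRLF already removed; the helper name `reflow` is hypothetical and not taken from this memo.

```python
def reflow(lines):
    """Join each run of flowed lines with the fixed line that ends it."""
    paragraphs = []
    current = ""
    for line in lines:
        current += line
        if not line.endswith(" "):  # a fixed line closes the paragraph
            paragraphs.append(current)
            current = ""
    if current:  # the message ended on a flowed line
        paragraphs.append(current)
    return paragraphs


wire = [
    "You mean you can't take LESS, it's very ",
    "easy to take MORE than nothing.",
]
# What a gateway that strips trailing white space would deliver.
stripped = [line.rstrip(" ") for line in wire]
```

With the intact lines, `reflow(wire)` yields one paragraph; with the stripped lines, `reflow(stripped)` yields two separate fixed lines, which is how the same text would look if Format=Flowed had not been used.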
- - Adding trailing whitespace to a Format=Flowed message may result in a - malformed display or reply. - - Since most systems which add trailing white space do so to create a - line which fills an internal record format, the result is almost - always a line which contains an even number of characters (counting - the added trailing white space). - - - -Gellens Standards Track [Page 11] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - - One possible avoidance, therefore, would be to define Format=Flowed - lines to use either one or two trailing space characters to indicate - a flowed line, such that the total line length is odd. However, - considering the scarcity of such systems today, it is not worth the - added complexity. - -7. Security Considerations - - This parameter introduces no security considerations beyond those - which apply to Text/Plain. - - Section 4.6 discusses the interaction between Format=Flowed and - digital signatures or encryption. - -8. IANA Considerations - - IANA is requested to add a reference to this specification in the - Text/Plain Media Type registration. - -9. Internationalization Considerations - - The line wrap and quoting specifications of Format=Flowed may not be - suitable for certain charsets, such as for Arabic and Hebrew - characters that read from right to left. Care should be taken in - applying format=flowed in these cases, as format=fixed combined with - quoted-printable encoding may be more suitable. - -10. Acknowledgments - - This proposal evolved from a discussion of Chris Newman's - Text/Paragraph draft which took place on the IETF 822 mailing list. - Special thanks to Ian Bell, Steve Dorner, Brian Kelley, Dan Kohn, - Laurence Lundblade, and Dan Wing for their reviews, comments, - suggestions, and discussions. - -11. References - - [ABNF] Crocker, D. and P. Overell, "Augmented BNF for - Syntax Specifications: ABNF", RFC 2234, November - 1997. - - [KEYWORDS] S. 
Bradner, "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RICH] Resnick, P. and A. Walker, "The text/enriched MIME - Content-type", RFC 1896, February 1996. - - - - - -Gellens Standards Track [Page 12] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - - [MIME-IMT] Freed, N. and N. Borenstein, "Multipurpose - Internet Mail Extensions (MIME) Part Two: Media - Types", RFC 2046, November 1996. - - [Quoted-Printable] Freed, N. and N. Borenstein, "Multipurpose - Internet Mail Extensions (MIME) Part One: Format - of Internet Message Bodies", RFC 2045, November - 1996. - - [SMTP] Postel, J., "Simple Mail Transfer Protocol", STD - 10, RFC 821, August 1982. - - [HTML] Berners-Lee, T. and D. Connolly, "Hypertext Markup - Language -- 2.0", RFC 1866, November 1995. - - -12. Editor's Address - - Randall Gellens - QUALCOMM Incorporated - 5775 Morehouse Dr. - San Diego, CA 92121-2779 - USA - - Phone: +1 619 651 5115 - EMail: randy@qualcomm.com - - - - - - - - - - - - - - - - - - - - - - - - - -Gellens Standards Track [Page 13] - -RFC 2646 The Text/Plain Format Parameter August 1999 - - -13. Full Copyright Statement - - Copyright (C) The Internet Society (1999). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. 
However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. - - - - - - - - - - - - - - - - - - - -Gellens Standards Track [Page 14] - diff --git a/proto/rfc2822.txt b/proto/rfc2822.txt @@ -1,2859 +0,0 @@ - - - - - - -Network Working Group P. Resnick, Editor -Request for Comments: 2822 QUALCOMM Incorporated -Obsoletes: 822 April 2001 -Category: Standards Track - - - Internet Message Format - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (2001). All Rights Reserved. 
- -Abstract - - This standard specifies a syntax for text messages that are sent - between computer users, within the framework of "electronic mail" - messages. This standard supersedes the one specified in Request For - Comments (RFC) 822, "Standard for the Format of ARPA Internet Text - Messages", updating it to reflect current practice and incorporating - incremental changes that were specified in other RFCs. - -Table of Contents - - 1. Introduction ............................................... 3 - 1.1. Scope .................................................... 3 - 1.2. Notational conventions ................................... 4 - 1.2.1. Requirements notation .................................. 4 - 1.2.2. Syntactic notation ..................................... 4 - 1.3. Structure of this document ............................... 4 - 2. Lexical Analysis of Messages ............................... 5 - 2.1. General Description ...................................... 5 - 2.1.1. Line Length Limits ..................................... 6 - 2.2. Header Fields ............................................ 7 - 2.2.1. Unstructured Header Field Bodies ....................... 7 - 2.2.2. Structured Header Field Bodies ......................... 7 - 2.2.3. Long Header Fields ..................................... 7 - 2.3. Body ..................................................... 8 - 3. Syntax ..................................................... 9 - 3.1. Introduction ............................................. 9 - 3.2. Lexical Tokens ........................................... 9 - - - -Resnick Standards Track [Page 1] - -RFC 2822 Internet Message Format April 2001 - - - 3.2.1. Primitive Tokens ....................................... 9 - 3.2.2. Quoted characters ......................................10 - 3.2.3. Folding white space and comments .......................11 - 3.2.4. Atom ...................................................12 - 3.2.5. 
Quoted strings .........................................13 - 3.2.6. Miscellaneous tokens ...................................13 - 3.3. Date and Time Specification ..............................14 - 3.4. Address Specification ....................................15 - 3.4.1. Addr-spec specification ................................16 - 3.5 Overall message syntax ....................................17 - 3.6. Field definitions ........................................18 - 3.6.1. The origination date field .............................20 - 3.6.2. Originator fields ......................................21 - 3.6.3. Destination address fields .............................22 - 3.6.4. Identification fields ..................................23 - 3.6.5. Informational fields ...................................26 - 3.6.6. Resent fields ..........................................26 - 3.6.7. Trace fields ...........................................28 - 3.6.8. Optional fields ........................................29 - 4. Obsolete Syntax ............................................29 - 4.1. Miscellaneous obsolete tokens ............................30 - 4.2. Obsolete folding white space .............................31 - 4.3. Obsolete Date and Time ...................................31 - 4.4. Obsolete Addressing ......................................33 - 4.5. Obsolete header fields ...................................33 - 4.5.1. Obsolete origination date field ........................34 - 4.5.2. Obsolete originator fields .............................34 - 4.5.3. Obsolete destination address fields ....................34 - 4.5.4. Obsolete identification fields .........................35 - 4.5.5. Obsolete informational fields ..........................35 - 4.5.6. Obsolete resent fields .................................35 - 4.5.7. Obsolete trace fields ..................................36 - 4.5.8. Obsolete optional fields ...............................36 - 5. 
Security Considerations ....................................36 - 6. Bibliography ...............................................37 - 7. Editor's Address ...........................................38 - 8. Acknowledgements ...........................................39 - Appendix A. Example messages ..................................41 - A.1. Addressing examples ......................................41 - A.1.1. A message from one person to another with simple - addressing .............................................41 - A.1.2. Different types of mailboxes ...........................42 - A.1.3. Group addresses ........................................43 - A.2. Reply messages ...........................................43 - A.3. Resent messages ..........................................44 - A.4. Messages with trace fields ...............................46 - A.5. White space, comments, and other oddities ................47 - A.6. Obsoleted forms ..........................................47 - - - -Resnick Standards Track [Page 2] - -RFC 2822 Internet Message Format April 2001 - - - A.6.1. Obsolete addressing ....................................48 - A.6.2. Obsolete dates .........................................48 - A.6.3. Obsolete white space and comments ......................48 - Appendix B. Differences from earlier standards ................49 - Appendix C. Notices ...........................................50 - Full Copyright Statement ......................................51 - -1. Introduction - -1.1. Scope - - This standard specifies a syntax for text messages that are sent - between computer users, within the framework of "electronic mail" - messages. This standard supersedes the one specified in Request For - Comments (RFC) 822, "Standard for the Format of ARPA Internet Text - Messages" [RFC822], updating it to reflect current practice and - incorporating incremental changes that were specified in other RFCs - [STD3]. 
- - This standard specifies a syntax only for text messages. In - particular, it makes no provision for the transmission of images, - audio, or other sorts of structured data in electronic mail messages. - There are several extensions published, such as the MIME document - series [RFC2045, RFC2046, RFC2049], which describe mechanisms for the - transmission of such data through electronic mail, either by - extending the syntax provided here or by structuring such messages to - conform to this syntax. Those mechanisms are outside of the scope of - this standard. - - In the context of electronic mail, messages are viewed as having an - envelope and contents. The envelope contains whatever information is - needed to accomplish transmission and delivery. (See [RFC2821] for a - discussion of the envelope.) The contents comprise the object to be - delivered to the recipient. This standard applies only to the format - and some of the semantics of message contents. It contains no - specification of the information in the envelope. - - However, some message systems may use information from the contents - to create the envelope. It is intended that this standard facilitate - the acquisition of such information by programs. - - This specification is intended as a definition of what message - content format is to be passed between systems. Though some message - systems locally store messages in this format (which eliminates the - need for translation between formats) and others use formats that - differ from the one specified in this standard, local storage is - outside of the scope of this standard. - - - - -Resnick Standards Track [Page 3] - -RFC 2822 Internet Message Format April 2001 - - - Note: This standard is not intended to dictate the internal formats - used by sites, the specific message system features that they are - expected to support, or any of the characteristics of user interface - programs that create or read messages. 
In addition, this standard - does not specify an encoding of the characters for either transport - or storage; that is, it does not specify the number of bits used or - how those bits are specifically transferred over the wire or stored - on disk. - -1.2. Notational conventions - -1.2.1. Requirements notation - - This document occasionally uses terms that appear in capital letters. - When the terms "MUST", "SHOULD", "RECOMMENDED", "MUST NOT", "SHOULD - NOT", and "MAY" appear capitalized, they are being used to indicate - particular requirements of this specification. A discussion of the - meanings of these terms appears in [RFC2119]. - -1.2.2. Syntactic notation - - This standard uses the Augmented Backus-Naur Form (ABNF) notation - specified in [RFC2234] for the formal definitions of the syntax of - messages. Characters will be specified either by a decimal value - (e.g., the value %d65 for uppercase A and %d97 for lowercase A) or by - a case-insensitive literal value enclosed in quotation marks (e.g., - "A" for either uppercase or lowercase A). See [RFC2234] for the full - description of the notation. - -1.3. Structure of this document - - This document is divided into several sections. - - This section, section 1, is a short introduction to the document. - - Section 2 lays out the general description of a message and its - constituent parts. This is an overview to help the reader understand - some of the general principles used in the later portions of this - document. Any examples in this section MUST NOT be taken as - specification of the formal syntax of any part of a message. - - Section 3 specifies formal ABNF rules for the structure of each part - of a message (the syntax) and describes the relationship between - those parts and their meaning in the context of a message (the - semantics). 
That is, it describes the actual rules for the structure - of each part of a message (the syntax) as well as a description of - the parts and instructions on how they ought to be interpreted (the - semantics). This includes analysis of the syntax and semantics of - subparts of messages that have specific structure. The syntax - included in section 3 represents messages as they MUST be created. - There are also notes in section 3 to indicate if any of the options - specified in the syntax SHOULD be used over any of the others. - - Both sections 2 and 3 describe messages that are legal to generate - for purposes of this standard. - - Section 4 of this document specifies an "obsolete" syntax. There are - references in section 3 to these obsolete syntactic elements. The - rules of the obsolete syntax are elements that have appeared in - earlier revisions of this standard or have previously been widely - used in Internet messages. As such, these elements MUST be - interpreted by parsers of messages in order to be conformant to this - standard. However, since items in this syntax have been determined - to be non-interoperable or to cause significant problems for - recipients of messages, they MUST NOT be generated by creators of - conformant messages. - - Section 5 details security considerations to take into account when - implementing this standard. - - Section 6 is a bibliography of references in this document. - - Section 7 contains the editor's address. - - Section 8 contains acknowledgements. - - Appendix A lists examples of different sorts of messages. These - examples are not exhaustive of the types of messages that appear on - the Internet, but give a broad overview of certain syntactic forms. - - Appendix B lists the differences between this standard and earlier - standards for Internet messages. - - Appendix C has copyright and intellectual property notices. - -2. 
Lexical Analysis of Messages - -2.1. General Description - - At the most basic level, a message is a series of characters. A - message that is conformant with this standard is comprised of - characters with values in the range 1 through 127 and interpreted as - US-ASCII characters [ASCII]. For brevity, this document sometimes - refers to this range of characters as simply "US-ASCII characters". - - - - - -Resnick Standards Track [Page 5] - -RFC 2822 Internet Message Format April 2001 - - - Note: This standard specifies that messages are made up of characters - in the US-ASCII range of 1 through 127. There are other documents, - specifically the MIME document series [RFC2045, RFC2046, RFC2047, - RFC2048, RFC2049], that extend this standard to allow for values - outside of that range. Discussion of those mechanisms is not within - the scope of this standard. - - Messages are divided into lines of characters. A line is a series of - characters that is delimited with the two characters carriage-return - and line-feed; that is, the carriage return (CR) character (ASCII - value 13) followed immediately by the line feed (LF) character (ASCII - value 10). (The carriage-return/line-feed pair is usually written in - this document as "CRLF".) - - A message consists of header fields (collectively called "the header - of the message") followed, optionally, by a body. The header is a - sequence of lines of characters with special syntax as defined in - this standard. The body is simply a sequence of characters that - follows the header and is separated from the header by an empty line - (i.e., a line with nothing preceding the CRLF). - -2.1.1. Line Length Limits - - There are two limits that this standard places on the number of - characters in a line. Each line of characters MUST be no more than - 998 characters, and SHOULD be no more than 78 characters, excluding - the CRLF. 
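The two limits above can be checked mechanically. The following C sketch is one possible way to do so, not code taken from rohrpost or from the RFC: the names `line_status`, `len_status` and `msg_check` are hypothetical, and the message is assumed to be held in memory with CRLF line endings.

```c
#include <assert.h>
#include <stddef.h>

enum { LINE_SHOULD_MAX = 78, LINE_MUST_MAX = 998 };

/* Ordered by severity so that statuses can be compared. */
enum line_status { LINE_OK, LINE_LONG, LINE_INVALID };

/* Map a line length (excluding the CRLF) to a status. */
static enum line_status
len_status(size_t n)
{
	if (n > LINE_MUST_MAX)
		return LINE_INVALID;	/* violates the MUST limit */
	if (n > LINE_SHOULD_MAX)
		return LINE_LONG;	/* exceeds the SHOULD limit only */
	return LINE_OK;
}

/* Return the worst status over all CRLF-delimited lines in msg. */
static enum line_status
msg_check(const char *msg, size_t len)
{
	enum line_status worst = LINE_OK, st;
	size_t i, start = 0;

	assert(msg != NULL);

	for (i = 0; i + 1 < len; i++) {
		if (msg[i] == '\r' && msg[i + 1] == '\n') {
			st = len_status(i - start);
			if (st > worst)
				worst = st;
			start = i + 2;
			i++;	/* skip the LF of this CRLF */
		}
	}

	/* trailing characters that are not terminated by CRLF */
	if (start < len) {
		st = len_status(len - start);
		if (st > worst)
			worst = st;
	}

	return worst;
}
```

A generator would presumably refuse to emit a message for which `msg_check` returns `LINE_INVALID` and re-fold or re-encode one that returns `LINE_LONG`; a reader would accept both.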
- - The 998 character limit is due to limitations in many implementations - which send, receive, or store Internet Message Format messages that - simply cannot handle more than 998 characters on a line. Receiving - implementations would do well to handle an arbitrarily large number - of characters in a line for robustness' sake. However, there are so - many implementations which (in compliance with the transport - requirements of [RFC2821]) do not accept messages containing more - than 1000 characters including the CR and LF per line that it is important - for implementations not to create such messages. - - The more conservative 78 character recommendation is to accommodate - the many implementations of user interfaces that display these - messages which may truncate, or disastrously wrap, the display of - more than 78 characters per line, in spite of the fact that such - implementations are non-conformant to the intent of this - specification (and that of [RFC2821] if they actually cause - information to be lost). Again, even though this limitation is put on - messages, it is incumbent upon implementations which display messages - to handle an arbitrarily large number of characters in a line - (certainly at least up to the 998 character limit) for the sake of - robustness. - -2.2. Header Fields - - Header fields are lines composed of a field name, followed by a colon - (":"), followed by a field body, and terminated by CRLF. A field - name MUST be composed of printable US-ASCII characters (i.e., - characters that have values between 33 and 126, inclusive), except - colon. A field body may be composed of any US-ASCII characters, - except for CR and LF. However, a field body may contain CRLF when - used in header "folding" and "unfolding" as described in section - 2.2.3. All field bodies MUST conform to the syntax described in - sections 3 and 4 of this standard. - -2.2.1. 
Unstructured Header Field Bodies - - Some field bodies in this standard are defined simply as - "unstructured" (which is specified below as any US-ASCII characters, - except for CR and LF) with no further restrictions. These are - referred to as unstructured field bodies. Semantically, unstructured - field bodies are simply to be treated as a single line of characters - with no further processing (except for header "folding" and - "unfolding" as described in section 2.2.3). - -2.2.2. Structured Header Field Bodies - - Some field bodies in this standard have specific syntactical - structure more restrictive than the unstructured field bodies - described above. These are referred to as "structured" field bodies. - Structured field bodies are sequences of specific lexical tokens as - described in sections 3 and 4 of this standard. Many of these tokens - are allowed (according to their syntax) to be introduced or end with - comments (as described in section 3.2.3) as well as the space (SP, - ASCII value 32) and horizontal tab (HTAB, ASCII value 9) characters - (together known as the white space characters, WSP), and those WSP - characters are subject to header "folding" and "unfolding" as - described in section 2.2.3. Semantic analysis of structured field - bodies is given along with their syntax. - -2.2.3. Long Header Fields - - Each header field is logically a single line of characters comprising - the field name, the colon, and the field body. For convenience - however, and to deal with the 998/78 character limitations per line, - the field body portion of a header field can be split into a multiple - line representation; this is called "folding". The general rule is - that wherever this standard allows for folding white space (not - simply WSP characters), a CRLF may be inserted before any WSP. 
For - example, the header field: - - Subject: This is a test - - can be represented as: - - Subject: This - is a test - - Note: Though structured field bodies are defined in such a way that - folding can take place between many of the lexical tokens (and even - within some of the lexical tokens), folding SHOULD be limited to - placing the CRLF at higher-level syntactic breaks. For instance, if - a field body is defined as comma-separated values, it is recommended - that folding occur after the comma separating the structured items in - preference to other places where the field could be folded, even if - it is allowed elsewhere. - - The process of moving from this folded multiple-line representation - of a header field to its single line representation is called - "unfolding". Unfolding is accomplished by simply removing any CRLF - that is immediately followed by WSP. Each header field should be - treated in its unfolded form for further syntactic and semantic - evaluation. - -2.3. Body - - The body of a message is simply lines of US-ASCII characters. The - only two limitations on the body are as follows: - - - CR and LF MUST only occur together as CRLF; they MUST NOT appear - independently in the body. - - - Lines of characters in the body MUST be limited to 998 characters, - and SHOULD be limited to 78 characters, excluding the CRLF. - - Note: As was stated earlier, there are other standards documents, - specifically the MIME documents [RFC2045, RFC2046, RFC2048, RFC2049] - that extend this standard to allow for different sorts of message - bodies. Again, these mechanisms are beyond the scope of this - document. - - - - - - - - -Resnick Standards Track [Page 8] - -RFC 2822 Internet Message Format April 2001 - - -3. Syntax - -3.1. Introduction - - The syntax as given in this section defines the legal syntax of - Internet messages. Messages that are conformant to this standard - MUST conform to the syntax in this section. 
If there are options in - this section where one option SHOULD be generated, that is indicated - either in the prose or in a comment next to the syntax. - - For the defined expressions, a short description of the syntax and - use is given, followed by the syntax in ABNF, followed by a semantic - analysis. Primitive tokens that are used but otherwise unspecified - come from [RFC2234]. - - In some of the definitions, there will be nonterminals whose names - start with "obs-". These "obs-" elements refer to tokens defined in - the obsolete syntax in section 4. In all cases, these productions - are to be ignored for the purposes of generating legal Internet - messages and MUST NOT be used as part of such a message. However, - when interpreting messages, these tokens MUST be honored as part of - the legal syntax. In this sense, section 3 defines a grammar for - generation of messages, with "obs-" elements that are to be ignored, - while section 4 adds grammar for interpretation of messages. - -3.2. Lexical Tokens - - The following rules are used to define an underlying lexical - analyzer, which feeds tokens to the higher-level parsers. This - section defines the tokens used in structured header field bodies. - - Note: Readers of this standard need to pay special attention to how - these lexical tokens are used in both the lower-level and - higher-level syntax later in the document. Particularly, the white - space tokens and the comment tokens defined in section 3.2.3 get used - in the lower-level tokens defined here, and those lower-level tokens - are in turn used as parts of the higher-level tokens defined later. - Therefore, the white space and comments may be allowed in the - higher-level tokens even though they may not explicitly appear in a - particular definition. - -3.2.1. Primitive Tokens - - The following are primitive tokens referred to elsewhere in this - standard, but not otherwise defined in [RFC2234]. 
Some of them will - not appear anywhere else in the syntax, but they are convenient to - refer to in other parts of this document. - - - - -Resnick Standards Track [Page 9] - -RFC 2822 Internet Message Format April 2001 - - - Note: The "specials" below are just such an example. Though the - specials token does not appear anywhere else in this standard, it is - useful for implementers who use tools that lexically analyze - messages. Each of the characters in specials can be used to indicate - a tokenization point in lexical analysis. - -NO-WS-CTL = %d1-8 / ; US-ASCII control characters - %d11 / ; that do not include the - %d12 / ; carriage return, line feed, - %d14-31 / ; and white space characters - %d127 - -text = %d1-9 / ; Characters excluding CR and LF - %d11 / - %d12 / - %d14-127 / - obs-text - -specials = "(" / ")" / ; Special characters used in - "<" / ">" / ; other parts of the syntax - "[" / "]" / - ":" / ";" / - "@" / "\" / - "," / "." / - DQUOTE - - No special semantics are attached to these tokens. They are simply - single characters. - -3.2.2. Quoted characters - - Some characters are reserved for special interpretation, such as - delimiting lexical tokens. To permit use of these characters as - uninterpreted data, a quoting mechanism is provided. - -quoted-pair = ("\" text) / obs-qp - - Where any quoted-pair appears, it is to be interpreted as the text - character alone. That is to say, the "\" character that appears as - part of a quoted-pair is semantically "invisible". - - Note: The "\" character may appear in a message where it is not part - of a quoted-pair. A "\" character that does not appear in a - quoted-pair is not semantically invisible. The only places in this - standard where quoted-pair currently appears are ccontent, qcontent, - dcontent, no-fold-quote, and no-fold-literal. - - - - - -Resnick Standards Track [Page 10] - -RFC 2822 Internet Message Format April 2001 - - -3.2.3. 
Folding white space and comments - - White space characters, including white space used in folding - (described in section 2.2.3), may appear between many elements in - header field bodies. Also, strings of characters that are treated as - comments may be included in structured field bodies as characters - enclosed in parentheses. The following defines the folding white - space (FWS) and comment constructs. - - Strings of characters enclosed in parentheses are considered comments - so long as they do not appear within a "quoted-string", as defined in - section 3.2.5. Comments may nest. - - There are several places in this standard where comments and FWS may - be freely inserted. To accommodate that syntax, an additional token - for "CFWS" is defined for places where comments and/or FWS can occur. - However, where CFWS occurs in this standard, it MUST NOT be inserted - in such a way that any line of a folded header field is made up - entirely of WSP characters and nothing else. - -FWS = ([*WSP CRLF] 1*WSP) / ; Folding white space - obs-FWS - -ctext = NO-WS-CTL / ; Non white space controls - - %d33-39 / ; The rest of the US-ASCII - %d42-91 / ; characters not including "(", - %d93-126 ; ")", or "\" - -ccontent = ctext / quoted-pair / comment - -comment = "(" *([FWS] ccontent) [FWS] ")" - -CFWS = *([FWS] comment) (([FWS] comment) / FWS) - - Throughout this standard, where FWS (the folding white space token) - appears, it indicates a place where header folding, as discussed in - section 2.2.3, may take place. Wherever header folding appears in a - message (that is, a header field body containing a CRLF followed by - any WSP), header unfolding (removal of the CRLF) is performed before - any further lexical analysis is performed on that header field - according to this standard. That is to say, any CRLF that appears in - FWS is semantically "invisible." - - A comment is normally used in a structured field body to provide some - human readable informational text. 
Since a comment is allowed to - contain FWS, folding is permitted within the comment. Also note that - since quoted-pair is allowed in a comment, the parentheses and - backslash characters may appear in a comment so long as they appear - as a quoted-pair. Semantically, the enclosing parentheses are not - part of the comment; the comment is what is contained between the two - parentheses. As stated earlier, the "\" in any quoted-pair and the - CRLF in any FWS that appears within the comment are semantically - "invisible" and therefore not part of the comment either. - - Runs of FWS, comment or CFWS that occur between lexical tokens in a - structured field header are semantically interpreted as a single - space character. - -3.2.4. Atom - - Several productions in structured header field bodies are simply - strings of certain basic characters. Such productions are called - atoms. - - Some of the structured header field bodies also allow the period - character (".", ASCII value 46) within runs of atext. An additional - "dot-atom" token is defined for those purposes. - -atext = ALPHA / DIGIT / ; Any character except controls, - "!" / "#" / ; SP, and specials. - "$" / "%" / ; Used for atoms - "&" / "'" / - "*" / "+" / - "-" / "/" / - "=" / "?" / - "^" / "_" / - "`" / "{" / - "|" / "}" / - "~" - -atom = [CFWS] 1*atext [CFWS] - -dot-atom = [CFWS] dot-atom-text [CFWS] - -dot-atom-text = 1*atext *("." 1*atext) - - Both atom and dot-atom are interpreted as a single unit, comprised of - the string of characters that make it up. Semantically, the optional - comments and FWS surrounding the rest of the characters are not part - of the atom; the atom is only the run of atext characters in an atom, - or the atext and "." characters in a dot-atom. - -3.2.5. 
Quoted strings - - Strings of characters that include characters other than those - allowed in atoms may be represented in a quoted string format, where - the characters are surrounded by quote (DQUOTE, ASCII value 34) - characters. - -qtext = NO-WS-CTL / ; Non white space controls - - %d33 / ; The rest of the US-ASCII - %d35-91 / ; characters not including "\" - %d93-126 ; or the quote character - -qcontent = qtext / quoted-pair - -quoted-string = [CFWS] - DQUOTE *([FWS] qcontent) [FWS] DQUOTE - [CFWS] - - A quoted-string is treated as a unit. That is, quoted-string is - identical to atom, semantically. Since a quoted-string is allowed to - contain FWS, folding is permitted. Also note that since quoted-pair - is allowed in a quoted-string, the quote and backslash characters may - appear in a quoted-string so long as they appear as a quoted-pair. - - Semantically, neither the optional CFWS outside of the quote - characters nor the quote characters themselves are part of the - quoted-string; the quoted-string is what is contained between the two - quote characters. As stated earlier, the "\" in any quoted-pair and - the CRLF in any FWS/CFWS that appears within the quoted-string are - semantically "invisible" and therefore not part of the quoted-string - either. - -3.2.6. Miscellaneous tokens - - Three additional tokens are defined, word and phrase for combinations - of atoms and/or quoted-strings, and unstructured for use in - unstructured header fields and in some places within structured - header fields. - -word = atom / quoted-string - -phrase = 1*word / obs-phrase - - - - - - - - -Resnick Standards Track [Page 13] - -RFC 2822 Internet Message Format April 2001 - - -utext = NO-WS-CTL / ; Non white space controls - %d33-126 / ; The rest of US-ASCII - obs-utext - -unstructured = *([FWS] utext) [FWS] - -3.3. Date and Time Specification - - Date and time occur in several header fields. This section specifies - the syntax for a full date and time specification. 
Though folding - white space is permitted throughout the date-time specification, it - is RECOMMENDED that a single space be used in each place that FWS - appears (whether it is required or optional); some older - implementations may not interpret other occurrences of folding white - space correctly. - -date-time = [ day-of-week "," ] date FWS time [CFWS] - -day-of-week = ([FWS] day-name) / obs-day-of-week - -day-name = "Mon" / "Tue" / "Wed" / "Thu" / - "Fri" / "Sat" / "Sun" - -date = day month year - -year = 4*DIGIT / obs-year - -month = (FWS month-name FWS) / obs-month - -month-name = "Jan" / "Feb" / "Mar" / "Apr" / - "May" / "Jun" / "Jul" / "Aug" / - "Sep" / "Oct" / "Nov" / "Dec" - -day = ([FWS] 1*2DIGIT) / obs-day - -time = time-of-day FWS zone - -time-of-day = hour ":" minute [ ":" second ] - -hour = 2DIGIT / obs-hour - -minute = 2DIGIT / obs-minute - -second = 2DIGIT / obs-second - -zone = (( "+" / "-" ) 4DIGIT) / obs-zone - - - - - -Resnick Standards Track [Page 14] - -RFC 2822 Internet Message Format April 2001 - - - The day is the numeric day of the month. The year is any numeric - year 1900 or later. - - The time-of-day specifies the number of hours, minutes, and - optionally seconds since midnight of the date indicated. - - The date and time-of-day SHOULD express local time. - - The zone specifies the offset from Coordinated Universal Time (UTC, - formerly referred to as "Greenwich Mean Time") that the date and - time-of-day represent. The "+" or "-" indicates whether the - time-of-day is ahead of (i.e., east of) or behind (i.e., west of) - Universal Time. The first two digits indicate the number of hours - difference from Universal Time, and the last two digits indicate the - number of minutes difference from Universal Time. (Hence, +hhmm - means +(hh * 60 + mm) minutes, and -hhmm means -(hh * 60 + mm) - minutes). The form "+0000" SHOULD be used to indicate a time zone at - Universal Time. 
Though "-0000" also indicates Universal Time, it is - used to indicate that the time was generated on a system that may be - in a local time zone other than Universal Time and therefore - indicates that the date-time contains no information about the local - time zone. - - A date-time specification MUST be semantically valid. That is, the - day-of-the-week (if included) MUST be the day implied by the date, - the numeric day-of-month MUST be between 1 and the number of days - allowed for the specified month (in the specified year), the - time-of-day MUST be in the range 00:00:00 through 23:59:60 (the - number of seconds allowing for a leap second; see [STD12]), and the - zone MUST be within the range -9959 through +9959. - -3.4. Address Specification - - Addresses occur in several message header fields to indicate senders - and recipients of messages. An address may either be an individual - mailbox, or a group of mailboxes. - -address = mailbox / group - -mailbox = name-addr / addr-spec - -name-addr = [display-name] angle-addr - -angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr - -group = display-name ":" [mailbox-list / CFWS] ";" - [CFWS] - - - - -Resnick Standards Track [Page 15] - -RFC 2822 Internet Message Format April 2001 - - -display-name = phrase - -mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list - -address-list = (address *("," address)) / obs-addr-list - - A mailbox receives mail. It is a conceptual entity which does not - necessarily pertain to file storage. For example, some sites may - choose to print mail on a printer and deliver the output to the - addressee's desk. Normally, a mailbox is comprised of two parts: (1) - an optional display name that indicates the name of the recipient - (which could be a person or a system) that could be displayed to the - user of a mail application, and (2) an addr-spec address enclosed in - angle brackets ("<" and ">"). 
There is also an alternate simple form - of a mailbox where the addr-spec address appears alone, without the - recipient's name or the angle brackets. The Internet addr-spec - address is described in section 3.4.1. - - Note: Some legacy implementations used the simple form where the - addr-spec appears without the angle brackets, but included the name - of the recipient in parentheses as a comment following the addr-spec. - Since the meaning of the information in a comment is unspecified, - implementations SHOULD use the full name-addr form of the mailbox, - instead of the legacy form, to specify the display name associated - with a mailbox. Also, because some legacy implementations interpret - the comment, comments generally SHOULD NOT be used in address fields - to avoid confusing such implementations. - - When it is desirable to treat several mailboxes as a single unit - (i.e., in a distribution list), the group construct can be used. The - group construct allows the sender to indicate a named group of - recipients. This is done by giving a display name for the group, - followed by a colon, followed by a comma separated list of any number - of mailboxes (including zero and one), and ending with a semicolon. - Because the list of mailboxes can be empty, using the group construct - is also a simple way to communicate to recipients that the message - was sent to one or more named sets of recipients, without actually - providing the individual mailbox address for each of those - recipients. - -3.4.1. Addr-spec specification - - An addr-spec is a specific Internet identifier that contains a - locally interpreted string followed by the at-sign character ("@", - ASCII value 64) followed by an Internet domain. The locally - interpreted string is either a quoted-string or a dot-atom. If the - string can be represented as a dot-atom (that is, it contains no - characters other than atext characters or "." 
surrounded by atext - - - -Resnick Standards Track [Page 16] - -RFC 2822 Internet Message Format April 2001 - - - characters), then the dot-atom form SHOULD be used and the - quoted-string form SHOULD NOT be used. Comments and folding white - space SHOULD NOT be used around the "@" in the addr-spec. - -addr-spec = local-part "@" domain - -local-part = dot-atom / quoted-string / obs-local-part - -domain = dot-atom / domain-literal / obs-domain - -domain-literal = [CFWS] "[" *([FWS] dcontent) [FWS] "]" [CFWS] - -dcontent = dtext / quoted-pair - -dtext = NO-WS-CTL / ; Non white space controls - - %d33-90 / ; The rest of the US-ASCII - %d94-126 ; characters not including "[", - ; "]", or "\" - - The domain portion identifies the point to which the mail is - delivered. In the dot-atom form, this is interpreted as an Internet - domain name (either a host name or a mail exchanger name) as - described in [STD3, STD13, STD14]. In the domain-literal form, the - domain is interpreted as the literal Internet address of the - particular host. In both cases, how addressing is used and how - messages are transported to a particular host is covered in the mail - transport document [RFC2821]. These mechanisms are outside of the - scope of this document. - - The local-part portion is a domain dependent string. In addresses, - it is simply interpreted on the particular host as a name of a - particular mailbox. - -3.5 Overall message syntax - - A message consists of header fields, optionally followed by a message - body. Lines in a message MUST be a maximum of 998 characters - excluding the CRLF, but it is RECOMMENDED that lines be limited to 78 - characters excluding the CRLF. (See section 2.1.1 for explanation.) - In a message body, though all of the characters listed in the text - rule MAY be used, the use of US-ASCII control characters (values 1 - through 8, 11, 12, and 14 through 31) is discouraged since their - interpretation by receivers for display is not guaranteed. 
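The two line-length limits in section 3.5 (998 characters as the hard limit, 78 as the recommended one, both excluding the CRLF) can be checked mechanically. The following is a minimal sketch in C, not code taken from rohrpost: the names check_line_lengths, LINE_MUST_MAX and LINE_SHOULD_MAX are hypothetical, and the message is assumed to be a NUL-terminated string held in memory.

```c
#include <stddef.h>
#include <string.h>

/* Limits from section 3.5; both exclude the trailing CRLF. */
enum { LINE_MUST_MAX = 998, LINE_SHOULD_MAX = 78 };

enum line_status {
	LINE_OK = 0,               /* every line is at most 78 characters */
	LINE_OVER_RECOMMENDED = 1, /* some line exceeds 78 but none exceeds 998 */
	LINE_OVER_LIMIT = 2        /* some line exceeds 998 characters */
};

/*
 * Hypothetical helper: scan a NUL-terminated message and report the worst
 * line-length status found. Lines are split on CRLF; a final line without
 * a terminating CRLF is measured as well.
 */
enum line_status
check_line_lengths(const char *msg)
{
	enum line_status worst = LINE_OK;
	size_t len = strlen(msg);
	size_t start = 0, i, n;

	for (i = 0; i <= len; i++) {
		int at_end = (i == len);
		/* msg[i + 1] is at worst the terminating NUL, so the read is safe. */
		int at_crlf = (!at_end && msg[i] == '\r' && msg[i + 1] == '\n');

		if (!at_end && !at_crlf)
			continue;

		n = i - start;          /* line length without the CRLF */
		if (n > LINE_MUST_MAX)
			return LINE_OVER_LIMIT;
		if (n > LINE_SHOULD_MAX)
			worst = LINE_OVER_RECOMMENDED;

		if (at_crlf) {
			i++;                /* step onto the LF */
			start = i + 1;      /* next line begins after the LF */
		}
	}
	return worst;
}
```

A caller generating mail would probably treat LINE_OVER_LIMIT as an error and LINE_OVER_RECOMMENDED as a hint to fold or wrap. A reader of mail should still accept longer lines rather than truncate them.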
- - - - - - - -Resnick Standards Track [Page 17] - -RFC 2822 Internet Message Format April 2001 - - -message = (fields / obs-fields) - [CRLF body] - -body = *(*998text CRLF) *998text - - The header fields carry most of the semantic information and are - defined in section 3.6. The body is simply a series of lines of text - which are uninterpreted for the purposes of this standard. - -3.6. Field definitions - - The header fields of a message are defined here. All header fields - have the same general syntactic structure: A field name, followed by - a colon, followed by the field body. The specific syntax for each - header field is defined in the subsequent sections. - - Note: In the ABNF syntax for each field in subsequent sections, each - field name is followed by the required colon. However, for brevity - sometimes the colon is not referred to in the textual description of - the syntax. It is, nonetheless, required. - - It is important to note that the header fields are not guaranteed to - be in a particular order. They may appear in any order, and they - have been known to be reordered occasionally when transported over - the Internet. However, for the purposes of this standard, header - fields SHOULD NOT be reordered when a message is transported or - transformed. More importantly, the trace header fields and resent - header fields MUST NOT be reordered, and SHOULD be kept in blocks - prepended to the message. See sections 3.6.6 and 3.6.7 for more - information. - - The only required header fields are the origination date field and - the originator address field(s). All other header fields are - syntactically optional. More information is contained in the table - following this definition. 
-
-fields          =       *(trace
-                          *(resent-date /
-                           resent-from /
-                           resent-sender /
-                           resent-to /
-                           resent-cc /
-                           resent-bcc /
-                           resent-msg-id))
-                        *(orig-date /
-                        from /
-                        sender /
-                        reply-to /
-                        to /
-                        cc /
-                        bcc /
-                        message-id /
-                        in-reply-to /
-                        references /
-                        subject /
-                        comments /
-                        keywords /
-                        optional-field)
-
-   The following table indicates limits on the number of times each
-   field may occur in a message header as well as any special
-   limitations on the use of those fields.  An asterisk next to a value
-   in the minimum or maximum column indicates that a special restriction
-   appears in the Notes column.
-
-Field           Min number      Max number      Notes
-
-trace           0               unlimited       Block prepended - see
-                                                3.6.7
-
-resent-date     0*              unlimited*      One per block, required
-                                                if other resent fields
-                                                present - see 3.6.6
-
-resent-from     0               unlimited*      One per block - see
-                                                3.6.6
-
-resent-sender   0*              unlimited*      One per block, MUST
-                                                occur with multi-address
-                                                resent-from - see 3.6.6
-
-resent-to       0               unlimited*      One per block - see
-                                                3.6.6
-
-resent-cc       0               unlimited*      One per block - see
-                                                3.6.6
-
-resent-bcc      0               unlimited*      One per block - see
-                                                3.6.6
-
-resent-msg-id   0               unlimited*      One per block - see
-                                                3.6.6
-
-orig-date       1               1
-
-from            1               1               See sender and 3.6.2
-
-sender          0*              1               MUST occur with multi-
-                                                address from - see 3.6.2
-
-reply-to        0               1
-
-to              0               1
-
-cc              0               1
-
-bcc             0               1
-
-message-id      0*              1               SHOULD be present - see
-                                                3.6.4
-
-in-reply-to     0*              1               SHOULD occur in some
-                                                replies - see 3.6.4
-
-references      0*              1               SHOULD occur in some
-                                                replies - see 3.6.4
-
-subject         0               1
-
-comments        0               unlimited
-
-keywords        0               unlimited
-
-optional-field  0               unlimited
-
-   The exact interpretation of each field is described in subsequent
-   sections.
-
-3.6.1.
The origination date field - - The origination date field consists of the field name "Date" followed - by a date-time specification. - -orig-date = "Date:" date-time CRLF - - The origination date specifies the date and time at which the creator - of the message indicated that the message was complete and ready to - enter the mail delivery system. For instance, this might be the time - that a user pushes the "send" or "submit" button in an application - program. In any case, it is specifically not intended to convey the - time that the message is actually transported, but rather the time at - which the human or other creator of the message has put the message - into its final form, ready for transport. (For example, a portable - computer user who is not connected to a network might queue a message - - - - -Resnick Standards Track [Page 20] - -RFC 2822 Internet Message Format April 2001 - - - for delivery. The origination date is intended to contain the date - and time that the user queued the message, not the time when the user - connected to the network to send the message.) - -3.6.2. Originator fields - - The originator fields of a message consist of the from field, the - sender field (when applicable), and optionally the reply-to field. - The from field consists of the field name "From" and a - comma-separated list of one or more mailbox specifications. If the - from field contains more than one mailbox specification in the - mailbox-list, then the sender field, containing the field name - "Sender" and a single mailbox specification, MUST appear in the - message. In either case, an optional reply-to field MAY also be - included, which contains the field name "Reply-To" and a - comma-separated list of one or more addresses. - -from = "From:" mailbox-list CRLF - -sender = "Sender:" mailbox CRLF - -reply-to = "Reply-To:" address-list CRLF - - The originator fields indicate the mailbox(es) of the source of the - message. 
The "From:" field specifies the author(s) of the message, - that is, the mailbox(es) of the person(s) or system(s) responsible - for the writing of the message. The "Sender:" field specifies the - mailbox of the agent responsible for the actual transmission of the - message. For example, if a secretary were to send a message for - another person, the mailbox of the secretary would appear in the - "Sender:" field and the mailbox of the actual author would appear in - the "From:" field. If the originator of the message can be indicated - by a single mailbox and the author and transmitter are identical, the - "Sender:" field SHOULD NOT be used. Otherwise, both fields SHOULD - appear. - - The originator fields also provide the information required when - replying to a message. When the "Reply-To:" field is present, it - indicates the mailbox(es) to which the author of the message suggests - that replies be sent. In the absence of the "Reply-To:" field, - replies SHOULD by default be sent to the mailbox(es) specified in the - "From:" field unless otherwise specified by the person composing the - reply. - - In all cases, the "From:" field SHOULD NOT contain any mailbox that - does not belong to the author(s) of the message. See also section - 3.6.3 for more information on forming the destination addresses for a - reply. - - - -Resnick Standards Track [Page 21] - -RFC 2822 Internet Message Format April 2001 - - -3.6.3. Destination address fields - - The destination fields of a message consist of three possible fields, - each of the same form: The field name, which is either "To", "Cc", or - "Bcc", followed by a comma-separated list of one or more addresses - (either mailbox or group syntax). - -to = "To:" address-list CRLF - -cc = "Cc:" address-list CRLF - -bcc = "Bcc:" (address-list / [CFWS]) CRLF - - The destination fields specify the recipients of the message. 
Each - destination field may have one or more addresses, and each of the - addresses indicate the intended recipients of the message. The only - difference between the three fields is how each is used. - - The "To:" field contains the address(es) of the primary recipient(s) - of the message. - - The "Cc:" field (where the "Cc" means "Carbon Copy" in the sense of - making a copy on a typewriter using carbon paper) contains the - addresses of others who are to receive the message, though the - content of the message may not be directed at them. - - The "Bcc:" field (where the "Bcc" means "Blind Carbon Copy") contains - addresses of recipients of the message whose addresses are not to be - revealed to other recipients of the message. There are three ways in - which the "Bcc:" field is used. In the first case, when a message - containing a "Bcc:" field is prepared to be sent, the "Bcc:" line is - removed even though all of the recipients (including those specified - in the "Bcc:" field) are sent a copy of the message. In the second - case, recipients specified in the "To:" and "Cc:" lines each are sent - a copy of the message with the "Bcc:" line removed as above, but the - recipients on the "Bcc:" line get a separate copy of the message - containing a "Bcc:" line. (When there are multiple recipient - addresses in the "Bcc:" field, some implementations actually send a - separate copy of the message to each recipient with a "Bcc:" - containing only the address of that particular recipient.) Finally, - since a "Bcc:" field may contain no addresses, a "Bcc:" field can be - sent without any addresses indicating to the recipients that blind - copies were sent to someone. Which method to use with "Bcc:" fields - is implementation dependent, but refer to the "Security - Considerations" section of this document for a discussion of each. 
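The first of the three "Bcc:" methods described above (remove the "Bcc:" line before the message is sent) can be sketched in C as follows. This is an illustration under stated assumptions, not rohrpost's own code: strip_bcc is a hypothetical name, the message is assumed to be a NUL-terminated string with CRLF line endings, and only the current syntax is handled (the obsolete form with white space between the field name and the colon is not recognized). strncasecmp comes from POSIX <strings.h>.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

/*
 * Hypothetical helper: copy msg into out, dropping every "Bcc:" header
 * field together with its folded continuation lines (lines that start
 * with SP or HTAB). Only the header section is filtered; the body is
 * copied unchanged. out must have room for strlen(msg) + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t
strip_bcc(const char *msg, char *out)
{
	const char *p = msg, *eol;
	size_t n = 0, linelen;
	int in_header = 1, skipping = 0;

	while (*p != '\0') {
		eol = strstr(p, "\r\n");
		linelen = eol ? (size_t)(eol - p) + 2 : strlen(p);

		if (in_header) {
			if (eol == p) {
				/* The empty line separates header and body. */
				in_header = 0;
				skipping = 0;
			} else if (*p != ' ' && *p != '\t') {
				/* A new field starts; field names match case-insensitively. */
				skipping = (strncasecmp(p, "Bcc:", 4) == 0);
			}
			/* Otherwise: a continuation line, keep the previous decision. */
		}

		if (!(in_header && skipping)) {
			memcpy(out + n, p, linelen);
			n += linelen;
		}
		p += linelen;
	}
	out[n] = '\0';
	return n;
}
```

The blind-copy recipients still have to be passed to the transport separately, since the addresses no longer appear in the message after this step. The second and third methods would instead keep or rewrite the field per recipient.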
- - - - - - -Resnick Standards Track [Page 22] - -RFC 2822 Internet Message Format April 2001 - - - When a message is a reply to another message, the mailboxes of the - authors of the original message (the mailboxes in the "From:" field) - or mailboxes specified in the "Reply-To:" field (if it exists) MAY - appear in the "To:" field of the reply since these would normally be - the primary recipients of the reply. If a reply is sent to a message - that has destination fields, it is often desirable to send a copy of - the reply to all of the recipients of the message, in addition to the - author. When such a reply is formed, addresses in the "To:" and - "Cc:" fields of the original message MAY appear in the "Cc:" field of - the reply, since these are normally secondary recipients of the - reply. If a "Bcc:" field is present in the original message, - addresses in that field MAY appear in the "Bcc:" field of the reply, - but SHOULD NOT appear in the "To:" or "Cc:" fields. - - Note: Some mail applications have automatic reply commands that - include the destination addresses of the original message in the - destination addresses of the reply. How those reply commands behave - is implementation dependent and is beyond the scope of this document. - In particular, whether or not to include the original destination - addresses when the original message had a "Reply-To:" field is not - addressed here. - -3.6.4. Identification fields - - Though optional, every message SHOULD have a "Message-ID:" field. - Furthermore, reply messages SHOULD have "In-Reply-To:" and - "References:" fields as appropriate, as described below. - - The "Message-ID:" field contains a single unique message identifier. - The "References:" and "In-Reply-To:" field each contain one or more - unique message identifiers, optionally separated by CFWS. - - The message identifier (msg-id) is similar in syntax to an angle-addr - construct without the internal CFWS. 
- -message-id = "Message-ID:" msg-id CRLF - -in-reply-to = "In-Reply-To:" 1*msg-id CRLF - -references = "References:" 1*msg-id CRLF - -msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS] - -id-left = dot-atom-text / no-fold-quote / obs-id-left - -id-right = dot-atom-text / no-fold-literal / obs-id-right - -no-fold-quote = DQUOTE *(qtext / quoted-pair) DQUOTE - - - -Resnick Standards Track [Page 23] - -RFC 2822 Internet Message Format April 2001 - - -no-fold-literal = "[" *(dtext / quoted-pair) "]" - - The "Message-ID:" field provides a unique message identifier that - refers to a particular version of a particular message. The - uniqueness of the message identifier is guaranteed by the host that - generates it (see below). This message identifier is intended to be - machine readable and not necessarily meaningful to humans. A message - identifier pertains to exactly one instantiation of a particular - message; subsequent revisions to the message each receive new message - identifiers. - - Note: There are many instances when messages are "changed", but those - changes do not constitute a new instantiation of that message, and - therefore the message would not get a new message identifier. For - example, when messages are introduced into the transport system, they - are often prepended with additional header fields such as trace - fields (described in section 3.6.7) and resent fields (described in - section 3.6.6). The addition of such header fields does not change - the identity of the message and therefore the original "Message-ID:" - field is retained. In all cases, it is the meaning that the sender - of the message wishes to convey (i.e., whether this is the same - message or a different message) that determines whether or not the - "Message-ID:" field changes, not any particular syntactic difference - that appears (or does not appear) in the message. - - The "In-Reply-To:" and "References:" fields are used when creating a - reply to a message. 
They hold the message identifier of the original - message and the message identifiers of other messages (for example, - in the case of a reply to a message which was itself a reply). The - "In-Reply-To:" field may be used to identify the message (or - messages) to which the new message is a reply, while the - "References:" field may be used to identify a "thread" of - conversation. - - When creating a reply to a message, the "In-Reply-To:" and - "References:" fields of the resultant message are constructed as - follows: - - The "In-Reply-To:" field will contain the contents of the "Message- - ID:" field of the message to which this one is a reply (the "parent - message"). If there is more than one parent message, then the "In- - Reply-To:" field will contain the contents of all of the parents' - "Message-ID:" fields. If there is no "Message-ID:" field in any of - the parent messages, then the new message will have no "In-Reply-To:" - field. - - - - - - -Resnick Standards Track [Page 24] - -RFC 2822 Internet Message Format April 2001 - - - The "References:" field will contain the contents of the parent's - "References:" field (if any) followed by the contents of the parent's - "Message-ID:" field (if any). If the parent message does not contain - a "References:" field but does have an "In-Reply-To:" field - containing a single message identifier, then the "References:" field - will contain the contents of the parent's "In-Reply-To:" field - followed by the contents of the parent's "Message-ID:" field (if - any). If the parent has none of the "References:", "In-Reply-To:", - or "Message-ID:" fields, then the new message will have no - "References:" field. - - Note: Some implementations parse the "References:" field to display - the "thread of the discussion". These implementations assume that - each new message is a reply to a single parent and hence that they - can walk backwards through the "References:" field to find the parent - of each message listed there. 
Therefore, trying to form a - "References:" field for a reply that has multiple parents is - discouraged and how to do so is not defined in this document. - - The message identifier (msg-id) itself MUST be a globally unique - identifier for a message. The generator of the message identifier - MUST guarantee that the msg-id is unique. There are several - algorithms that can be used to accomplish this. Since the msg-id has - a similar syntax to angle-addr (identical except that comments and - folding white space are not allowed), a good method is to put the - domain name (or a domain literal IP address) of the host on which the - message identifier was created on the right hand side of the "@", and - put a combination of the current absolute date and time along with - some other currently unique (perhaps sequential) identifier available - on the system (for example, a process id number) on the left hand - side. Using a date on the left hand side and a domain name or domain - literal on the right hand side makes it possible to guarantee - uniqueness since no two hosts use the same domain name or IP address - at the same time. Though other algorithms will work, it is - RECOMMENDED that the right hand side contain some domain identifier - (either of the host itself or otherwise) such that the generator of - the message identifier can guarantee the uniqueness of the left hand - side within the scope of that domain. - - Semantically, the angle bracket characters are not part of the - msg-id; the msg-id is what is contained between the two angle bracket - characters. - - - - - - - - - -Resnick Standards Track [Page 25] - -RFC 2822 Internet Message Format April 2001 - - -3.6.5. Informational fields - - The informational fields are all optional. The "Keywords:" field - contains a comma-separated list of one or more words or - quoted-strings. 
The "Subject:" and "Comments:" fields are - unstructured fields as defined in section 2.2.1, and therefore may - contain text or folding white space. - -subject = "Subject:" unstructured CRLF - -comments = "Comments:" unstructured CRLF - -keywords = "Keywords:" phrase *("," phrase) CRLF - - These three fields are intended to have only human-readable content - with information about the message. The "Subject:" field is the most - common and contains a short string identifying the topic of the - message. When used in a reply, the field body MAY start with the - string "Re: " (from the Latin "res", in the matter of) followed by - the contents of the "Subject:" field body of the original message. - If this is done, only one instance of the literal string "Re: " ought - to be used since use of other strings or more than one instance can - lead to undesirable consequences. The "Comments:" field contains any - additional comments on the text of the body of the message. The - "Keywords:" field contains a comma-separated list of important words - and phrases that might be useful for the recipient. - -3.6.6. Resent fields - - Resent fields SHOULD be added to any message that is reintroduced by - a user into the transport system. A separate set of resent fields - SHOULD be added each time this is done. All of the resent fields - corresponding to a particular resending of the message SHOULD be - together. Each new set of resent fields is prepended to the message; - that is, the most recent set of resent fields appear earlier in the - message. No other fields in the message are changed when resent - fields are added. - - Each of the resent fields corresponds to a particular field elsewhere - in the syntax. For instance, the "Resent-Date:" field corresponds to - the "Date:" field and the "Resent-To:" field corresponds to the "To:" - field. In each case, the syntax for the field body is identical to - the syntax given previously for the corresponding field. 
- - When resent fields are used, the "Resent-From:" and "Resent-Date:" - fields MUST be sent. The "Resent-Message-ID:" field SHOULD be sent. - "Resent-Sender:" SHOULD NOT be used if "Resent-Sender:" would be - identical to "Resent-From:". - - - -Resnick Standards Track [Page 26] - -RFC 2822 Internet Message Format April 2001 - - -resent-date = "Resent-Date:" date-time CRLF - -resent-from = "Resent-From:" mailbox-list CRLF - -resent-sender = "Resent-Sender:" mailbox CRLF - -resent-to = "Resent-To:" address-list CRLF - -resent-cc = "Resent-Cc:" address-list CRLF - -resent-bcc = "Resent-Bcc:" (address-list / [CFWS]) CRLF - -resent-msg-id = "Resent-Message-ID:" msg-id CRLF - - Resent fields are used to identify a message as having been - reintroduced into the transport system by a user. The purpose of - using resent fields is to have the message appear to the final - recipient as if it were sent directly by the original sender, with - all of the original fields remaining the same. Each set of resent - fields correspond to a particular resending event. That is, if a - message is resent multiple times, each set of resent fields gives - identifying information for each individual time. Resent fields are - strictly informational. They MUST NOT be used in the normal - processing of replies or other such automatic actions on messages. - - Note: Reintroducing a message into the transport system and using - resent fields is a different operation from "forwarding". - "Forwarding" has two meanings: One sense of forwarding is that a mail - reading program can be told by a user to forward a copy of a message - to another person, making the forwarded message the body of the new - message. A forwarded message in this sense does not appear to have - come from the original sender, but is an entirely new message from - the forwarder of the message. 
On the other hand, forwarding is also - used to mean when a mail transport program gets a message and - forwards it on to a different destination for final delivery. Resent - header fields are not intended for use with either type of - forwarding. - - The resent originator fields indicate the mailbox of the person(s) or - system(s) that resent the message. As with the regular originator - fields, there are two forms: a simple "Resent-From:" form which - contains the mailbox of the individual doing the resending, and the - more complex form, when one individual (identified in the - "Resent-Sender:" field) resends a message on behalf of one or more - others (identified in the "Resent-From:" field). - - Note: When replying to a resent message, replies behave just as they - would with any other message, using the original "From:", - - - -Resnick Standards Track [Page 27] - -RFC 2822 Internet Message Format April 2001 - - - "Reply-To:", "Message-ID:", and other fields. The resent fields are - only informational and MUST NOT be used in the normal processing of - replies. - - The "Resent-Date:" indicates the date and time at which the resent - message is dispatched by the resender of the message. Like the - "Date:" field, it is not the date and time that the message was - actually transported. - - The "Resent-To:", "Resent-Cc:", and "Resent-Bcc:" fields function - identically to the "To:", "Cc:", and "Bcc:" fields respectively, - except that they indicate the recipients of the resent message, not - the recipients of the original message. - - The "Resent-Message-ID:" field provides a unique identifier for the - resent message. - -3.6.7. Trace fields - - The trace fields are a group of header fields consisting of an - optional "Return-Path:" field, and one or more "Received:" fields. - The "Return-Path:" header field contains a pair of angle brackets - that enclose an optional addr-spec. 
The "Received:" field contains a - (possibly empty) list of name/value pairs followed by a semicolon and - a date-time specification. The first item of the name/value pair is - defined by item-name, and the second item is either an addr-spec, an - atom, a domain, or a msg-id. Further restrictions may be applied to - the syntax of the trace fields by standards that provide for their - use, such as [RFC2821]. - -trace = [return] - 1*received - -return = "Return-Path:" path CRLF - -path = ([CFWS] "<" ([CFWS] / addr-spec) ">" [CFWS]) / - obs-path - -received = "Received:" name-val-list ";" date-time CRLF - -name-val-list = [CFWS] [name-val-pair *(CFWS name-val-pair)] - -name-val-pair = item-name CFWS item-value - -item-name = ALPHA *(["-"] (ALPHA / DIGIT)) - -item-value = 1*angle-addr / addr-spec / - atom / domain / msg-id - - - -Resnick Standards Track [Page 28] - -RFC 2822 Internet Message Format April 2001 - - - A full discussion of the Internet mail use of trace fields is - contained in [RFC2821]. For the purposes of this standard, the trace - fields are strictly informational, and any formal interpretation of - them is outside of the scope of this document. - -3.6.8. Optional fields - - Fields may appear in messages that are otherwise unspecified in this - standard. They MUST conform to the syntax of an optional-field. - This is a field name, made up of the printable US-ASCII characters - except SP and colon, followed by a colon, followed by any text which - conforms to unstructured. - - The field names of any optional-field MUST NOT be identical to any - field name specified elsewhere in this standard. - -optional-field = field-name ":" unstructured CRLF - -field-name = 1*ftext - -ftext = %d33-57 / ; Any character except - %d59-126 ; controls, SP, and - ; ":". - - For the purposes of this standard, any optional field is - uninterpreted. - -4. 
Obsolete Syntax - - Earlier versions of this standard allowed for different (usually more - liberal) syntax than is allowed in this version. Also, there have - been syntactic elements used in messages on the Internet whose - interpretations have never been documented. Though some of these - syntactic forms MUST NOT be generated according to the grammar in - section 3, they MUST be accepted and parsed by a conformant receiver. - This section documents many of these syntactic elements. Taking the - grammar in section 3 and adding the definitions presented in this - section will result in the grammar to use for interpretation of - messages. - - Note: This section identifies syntactic forms that any implementation - MUST reasonably interpret. However, there are certainly Internet - messages which do not conform to even the additional syntax given in - this section. The fact that a particular form does not appear in any - section of this document is not justification for computer programs - to crash or for malformed data to be irretrievably lost by any - implementation. To repeat an example, though this document requires - lines in messages to be no longer than 998 characters, silently - discarding the 999th and subsequent characters in a line without - warning would still be bad behavior for an implementation. It is up - to the implementation to deal with messages robustly. - - One important difference between the obsolete (interpreting) and the - current (generating) syntax is that in structured header field bodies - (i.e., between the colon and the CRLF of any structured header - field), white space characters, including folding white space, and - comments can be freely inserted between any syntactic tokens. This - allows many complex forms that have proven difficult for some - implementations to parse.
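To illustrate why freely inserted comments complicate parsing, here is a minimal Python sketch of removing comments from a structured header field body before tokenizing it. The function name `strip_comments` is hypothetical and not part of this standard. The sketch assumes the field has already been unfolded, and it handles only comment nesting, quoted-pairs, and quoted-strings.

```python
def strip_comments(body: str) -> str:
    """Remove parenthesized comments from an unfolded structured field body.

    Illustrative helper only (the name is an assumption, not from the RFC).
    Comments may nest; a backslash quotes the next character; parentheses
    inside a quoted-string are literal text, not comment delimiters.
    """
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string (only tracked outside comments)
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            # quoted-pair: keep it verbatim outside comments, drop it inside
            if depth == 0:
                out.append(body[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch == "(":
            depth += 1
        elif depth > 0 and ch == ")":
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)
```

Applied to an obsolete-style address such as `<jdoe@machine(comment). example>`, the sketch yields `<jdoe@machine. example>`; the white space left between tokens still has to be skipped by whatever tokenizer runs next.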
- - Another key difference between the obsolete and the current syntax is - that the rule in section 3.2.3 regarding lines composed entirely of - white space in comments and folding white space does not apply. See - the discussion of folding white space in section 4.2 below. - - Finally, certain characters that were formerly allowed in messages - appear in this section. The NUL character (ASCII value 0) was once - allowed, but is no longer for compatibility reasons. CR and LF were - allowed to appear in messages other than as CRLF; this use is also - shown here. - - Other differences in syntax and semantics are noted in the following - sections. - -4.1. Miscellaneous obsolete tokens - - These syntactic elements are used elsewhere in the obsolete syntax or - in the main syntax. The obs-char and obs-qp elements each add ASCII - value 0. Bare CR and bare LF are added to obs-text and obs-utext. - The period character is added to obs-phrase. The obs-phrase-list - provides for "empty" elements in a comma-separated list of phrases. - - Note: The "period" (or "full stop") character (".") in obs-phrase is - not a form that was allowed in earlier versions of this or any other - standard. Period (nor any other character from specials) was not - allowed in phrase because it introduced a parsing difficulty - distinguishing between phrases and portions of an addr-spec (see - section 4.4). It appears here because the period character is - currently used in many messages in the display-name portion of - addresses, especially for initials in names, and therefore must be - interpreted properly. In the future, period may appear in the - regular syntax of phrase. - -obs-qp = "\" (%d0-127) - -obs-text = *LF *CR *(obs-char *LF *CR) - - - -Resnick Standards Track [Page 30] - -RFC 2822 Internet Message Format April 2001 - - -obs-char = %d0-9 / %d11 / ; %d0-127 except CR and - %d12 / %d14-127 ; LF - -obs-utext = obs-text - -obs-phrase = word *(word / "." 
/ CFWS) - -obs-phrase-list = phrase / 1*([phrase] [CFWS] "," [CFWS]) [phrase] - - Bare CR and bare LF appear in messages with two different meanings. - In many cases, bare CR or bare LF are used improperly instead of CRLF - to indicate line separators. In other cases, bare CR and bare LF are - used simply as ASCII control characters with their traditional ASCII - meanings. - -4.2. Obsolete folding white space - - In the obsolete syntax, any amount of folding white space MAY be - inserted where the obs-FWS rule is allowed. This creates the - possibility of having two consecutive "folds" in a line, and - therefore the possibility that a line which makes up a folded header - field could be composed entirely of white space. - - obs-FWS = 1*WSP *(CRLF 1*WSP) - -4.3. Obsolete Date and Time - - The syntax for the obsolete date format allows a 2 digit year in the - date field and allows for a list of alphabetic time zone - specifications that were used in earlier versions of this standard. - It also permits comments and folding white space between many of the - tokens. 
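As a hedged illustration, the following Python sketch shows how a receiver might normalize an obsolete year and an obsolete alphabetic zone, using the interpretations this section specifies: two digit years 00 through 49 add 2000, two digit years 50 through 99 and all three digit years add 1900, and military or unknown zones are treated as "-0000". The names `OBSOLETE_ZONES`, `interpret_year`, and `interpret_zone` are assumptions made for this example.

```python
# Zone names whose meaning this section defines; everything else that is
# alphabetic (military letters, unknown names) maps to "-0000".
OBSOLETE_ZONES = {
    "UT": "+0000", "GMT": "+0000",
    "EDT": "-0400", "EST": "-0500",
    "CDT": "-0500", "CST": "-0600",
    "MDT": "-0600", "MST": "-0700",
    "PDT": "-0700", "PST": "-0800",
}


def interpret_year(digits: str) -> int:
    """Map a 2, 3, or 4+ digit year string to a full year."""
    year = int(digits)
    if len(digits) == 2:
        return year + 2000 if year <= 49 else year + 1900
    if len(digits) == 3:
        return year + 1900
    return year


def interpret_zone(zone: str) -> str:
    """Return a numeric zone string for a numeric or obsolete zone token."""
    if zone[:1] in ("+", "-"):
        return zone  # already in the current numeric form
    return OBSOLETE_ZONES.get(zone.upper(), "-0000")
```

With these helpers, a date such as `21 Nov 97 09:55:06 GMT` would be read as the year 1997 in zone `+0000`.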
- -obs-day-of-week = [CFWS] day-name [CFWS] - -obs-year = [CFWS] 2*DIGIT [CFWS] - -obs-month = CFWS month-name CFWS - -obs-day = [CFWS] 1*2DIGIT [CFWS] - -obs-hour = [CFWS] 2DIGIT [CFWS] - -obs-minute = [CFWS] 2DIGIT [CFWS] - -obs-second = [CFWS] 2DIGIT [CFWS] - -obs-zone = "UT" / "GMT" / ; Universal Time - - - -Resnick Standards Track [Page 31] - -RFC 2822 Internet Message Format April 2001 - - - ; North American UT - ; offsets - "EST" / "EDT" / ; Eastern: - 5/ - 4 - "CST" / "CDT" / ; Central: - 6/ - 5 - "MST" / "MDT" / ; Mountain: - 7/ - 6 - "PST" / "PDT" / ; Pacific: - 8/ - 7 - - %d65-73 / ; Military zones - "A" - %d75-90 / ; through "I" and "K" - %d97-105 / ; through "Z", both - %d107-122 ; upper and lower case - - Where a two or three digit year occurs in a date, the year is to be - interpreted as follows: If a two digit year is encountered whose - value is between 00 and 49, the year is interpreted by adding 2000, - ending up with a value between 2000 and 2049. If a two digit year is - encountered with a value between 50 and 99, or any three digit year - is encountered, the year is interpreted by adding 1900. - - In the obsolete time zone, "UT" and "GMT" are indications of - "Universal Time" and "Greenwich Mean Time" respectively and are both - semantically identical to "+0000". - - The remaining three character zones are the US time zones. The first - letter, "E", "C", "M", or "P" stands for "Eastern", "Central", - "Mountain" and "Pacific". The second letter is either "S" for - "Standard" time, or "D" for "Daylight" (or summer) time. 
Their - interpretations are as follows: - - EDT is semantically equivalent to -0400 - EST is semantically equivalent to -0500 - CDT is semantically equivalent to -0500 - CST is semantically equivalent to -0600 - MDT is semantically equivalent to -0600 - MST is semantically equivalent to -0700 - PDT is semantically equivalent to -0700 - PST is semantically equivalent to -0800 - - The 1 character military time zones were defined in a non-standard - way in [RFC822] and are therefore unpredictable in their meaning. - The original definitions of the military zones "A" through "I" are - equivalent to "+0100" through "+0900" respectively; "K", "L", and "M" - are equivalent to "+1000", "+1100", and "+1200" respectively; "N" - through "Y" are equivalent to "-0100" through "-1200" respectively; - and "Z" is equivalent to "+0000". However, because of the error in - [RFC822], they SHOULD all be considered equivalent to "-0000" unless - there is out-of-band information confirming their meaning. - - - - -Resnick Standards Track [Page 32] - -RFC 2822 Internet Message Format April 2001 - - - Other multi-character (usually between 3 and 5) alphabetic time zones - have been used in Internet messages. Any such time zone whose - meaning is not known SHOULD be considered equivalent to "-0000" - unless there is out-of-band information confirming their meaning. - -4.4. Obsolete Addressing - - There are three primary differences in addressing. First, mailbox - addresses were allowed to have a route portion before the addr-spec - when enclosed in "<" and ">". The route is simply a comma-separated - list of domain names, each preceded by "@", and the list terminated - by a colon. Second, CFWS were allowed between the period-separated - elements of local-part and domain (i.e., dot-atom was not used). In - addition, local-part is allowed to contain quoted-string in addition - to just atom. Finally, mailbox-list and address-list were allowed to - have "null" members. 
That is, there could be two or more commas in - such a list with nothing in between them. - -obs-angle-addr = [CFWS] "<" [obs-route] addr-spec ">" [CFWS] - -obs-route = [CFWS] obs-domain-list ":" [CFWS] - -obs-domain-list = "@" domain *(*(CFWS / "," ) [CFWS] "@" domain) - -obs-local-part = word *("." word) - -obs-domain = atom *("." atom) - -obs-mbox-list = 1*([mailbox] [CFWS] "," [CFWS]) [mailbox] - -obs-addr-list = 1*([address] [CFWS] "," [CFWS]) [address] - - When interpreting addresses, the route portion SHOULD be ignored. - -4.5. Obsolete header fields - - Syntactically, the primary difference in the obsolete field syntax is - that it allows multiple occurrences of any of the fields and they may - occur in any order. Also, any amount of white space is allowed - before the ":" at the end of the field name. - -obs-fields = *(obs-return / - obs-received / - obs-orig-date / - obs-from / - obs-sender / - obs-reply-to / - obs-to / - - - -Resnick Standards Track [Page 33] - -RFC 2822 Internet Message Format April 2001 - - - obs-cc / - obs-bcc / - obs-message-id / - obs-in-reply-to / - obs-references / - obs-subject / - obs-comments / - obs-keywords / - obs-resent-date / - obs-resent-from / - obs-resent-send / - obs-resent-rply / - obs-resent-to / - obs-resent-cc / - obs-resent-bcc / - obs-resent-mid / - obs-optional) - - Except for destination address fields (described in section 4.5.3), - the interpretation of multiple occurrences of fields is unspecified. - Also, the interpretation of trace fields and resent fields which do - not occur in blocks prepended to the message is unspecified as well. - Unless otherwise noted in the following sections, interpretation of - other fields is identical to the interpretation of their non-obsolete - counterparts in section 3. - -4.5.1. Obsolete origination date field - -obs-orig-date = "Date" *WSP ":" date-time CRLF - -4.5.2. 
Obsolete originator fields - -obs-from = "From" *WSP ":" mailbox-list CRLF - -obs-sender = "Sender" *WSP ":" mailbox CRLF - -obs-reply-to = "Reply-To" *WSP ":" mailbox-list CRLF - -4.5.3. Obsolete destination address fields - -obs-to = "To" *WSP ":" address-list CRLF - -obs-cc = "Cc" *WSP ":" address-list CRLF - -obs-bcc = "Bcc" *WSP ":" (address-list / [CFWS]) CRLF - - - - - - -Resnick Standards Track [Page 34] - -RFC 2822 Internet Message Format April 2001 - - - When multiple occurrences of destination address fields occur in a - message, they SHOULD be treated as if the address-list in the first - occurrence of the field is combined with the address lists of the - subsequent occurrences by adding a comma and concatenating. - -4.5.4. Obsolete identification fields - - The obsolete "In-Reply-To:" and "References:" fields differ from the - current syntax in that they allow phrase (words or quoted strings) to - appear. The obsolete forms of the left and right sides of msg-id - allow interspersed CFWS, making them syntactically identical to - local-part and domain respectively. - -obs-message-id = "Message-ID" *WSP ":" msg-id CRLF - -obs-in-reply-to = "In-Reply-To" *WSP ":" *(phrase / msg-id) CRLF - -obs-references = "References" *WSP ":" *(phrase / msg-id) CRLF - -obs-id-left = local-part - -obs-id-right = domain - - For purposes of interpretation, the phrases in the "In-Reply-To:" and - "References:" fields are ignored. - - Semantically, none of the optional CFWS surrounding the local-part - and the domain are part of the obs-id-left and obs-id-right - respectively. - -4.5.5. Obsolete informational fields - -obs-subject = "Subject" *WSP ":" unstructured CRLF - -obs-comments = "Comments" *WSP ":" unstructured CRLF - -obs-keywords = "Keywords" *WSP ":" obs-phrase-list CRLF - -4.5.6. 
Obsolete resent fields - - The obsolete syntax adds a "Resent-Reply-To:" field, which consists - of the field name, the optional comments and folding white space, the - colon, and a comma separated list of addresses. - -obs-resent-from = "Resent-From" *WSP ":" mailbox-list CRLF - -obs-resent-send = "Resent-Sender" *WSP ":" mailbox CRLF - -obs-resent-date = "Resent-Date" *WSP ":" date-time CRLF - -obs-resent-to = "Resent-To" *WSP ":" address-list CRLF - -obs-resent-cc = "Resent-Cc" *WSP ":" address-list CRLF - -obs-resent-bcc = "Resent-Bcc" *WSP ":" - (address-list / [CFWS]) CRLF - -obs-resent-mid = "Resent-Message-ID" *WSP ":" msg-id CRLF - -obs-resent-rply = "Resent-Reply-To" *WSP ":" address-list CRLF - - As with other resent fields, the "Resent-Reply-To:" field is to be - treated as trace information only. - -4.5.7. Obsolete trace fields - - The obs-return and obs-received are again given here as template - definitions, just as return and received are in section 3. Their - full syntax is given in [RFC2821]. - -obs-return = "Return-Path" *WSP ":" path CRLF - -obs-received = "Received" *WSP ":" name-val-list CRLF - -obs-path = obs-angle-addr - -4.5.8. Obsolete optional fields - -obs-optional = field-name *WSP ":" unstructured CRLF - -5. Security Considerations - - Care needs to be taken when displaying messages on a terminal or - terminal emulator. Powerful terminals may act on escape sequences - and other combinations of ASCII control characters with a variety of - consequences. They can remap the keyboard or permit other - modifications to the terminal which could lead to denial of service - or even damaged data. They can trigger (sometimes programmable) - answerback messages which can allow a message to cause commands to be - issued on the recipient's behalf. They can also affect the operation - of terminal attached devices such as printers. 
Message viewers may - wish to strip potentially dangerous terminal escape sequences from - the message prior to display. However, other escape sequences appear - in messages for useful purposes (cf. [RFC2045, RFC2046, RFC2047, - RFC2048, RFC2049, ISO2022]) and therefore should not be stripped - indiscriminately. - - - -Resnick Standards Track [Page 36] - -RFC 2822 Internet Message Format April 2001 - - - Transmission of non-text objects in messages raises additional - security issues. These issues are discussed in [RFC2045, RFC2046, - RFC2047, RFC2048, RFC2049]. - - Many implementations use the "Bcc:" (blind carbon copy) field - described in section 3.6.3 to facilitate sending messages to - recipients without revealing the addresses of one or more of the - addressees to the other recipients. Mishandling this use of "Bcc:" - has implications for confidential information that might be revealed, - which could eventually lead to security problems through knowledge of - even the existence of a particular mail address. For example, if - using the first method described in section 3.6.3, where the "Bcc:" - line is removed from the message, blind recipients have no explicit - indication that they have been sent a blind copy, except insofar as - their address does not appear in the message header. Because of - this, one of the blind addressees could potentially send a reply to - all of the shown recipients and accidentally reveal that the message - went to the blind recipient. When the second method from section - 3.6.3 is used, the blind recipient's address appears in the "Bcc:" - field of a separate copy of the message. If the "Bcc:" field sent - contains all of the blind addressees, all of the "Bcc:" recipients - will be seen by each "Bcc:" recipient. 
Even if a separate message is - sent to each "Bcc:" recipient with only the individual's address, - implementations still need to be careful to process replies to the - message as per section 3.6.3 so as not to accidentally reveal the - blind recipient to other recipients. - -6. Bibliography - - [ASCII] American National Standards Institute (ANSI), Coded - Character Set - 7-Bit American National Standard Code for - Information Interchange, ANSI X3.4, 1986. - - [ISO2022] International Organization for Standardization (ISO), - Information processing - ISO 7-bit and 8-bit coded - character sets - Code extension techniques, Third edition - - 1986-05-01, ISO 2022, 1986. - - [RFC822] Crocker, D., "Standard for the Format of ARPA Internet - Text Messages", RFC 822, August 1982. - - [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part One: Format of Internet Message - Bodies", RFC 2045, November 1996. - - [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Two: Media Types", RFC 2046, - November 1996. - - [RFC2047] Moore, K., "Multipurpose Internet Mail Extensions (MIME) - Part Three: Message Header Extensions for Non-ASCII Text", - RFC 2047, November 1996. - - [RFC2048] Freed, N., Klensin, J. and J. Postel, "Multipurpose - Internet Mail Extensions (MIME) Part Four: Registration - Procedures", RFC 2048, November 1996. - - [RFC2049] Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part Five: Conformance Criteria and - Examples", RFC 2049, November 1996. - - [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RFC2234] Crocker, D., Editor, and P. Overell, "Augmented BNF for - Syntax Specifications: ABNF", RFC 2234, November 1997.
- - [RFC2821] Klensin, J., Editor, "Simple Mail Transfer Protocol", RFC - 2821, March 2001. - - [STD3] Braden, R., "Host Requirements", STD 3, RFC 1122 and RFC - 1123, October 1989. - - [STD12] Mills, D., "Network Time Protocol", STD 12, RFC 1119, - September 1989. - - [STD13] Mockapetris, P., "Domain Name System", STD 13, RFC 1034 - and RFC 1035, November 1987. - - [STD14] Partridge, C., "Mail Routing and the Domain System", STD - 14, RFC 974, January 1986. - -7. Editor's Address - - Peter W. Resnick - QUALCOMM Incorporated - 5775 Morehouse Drive - San Diego, CA 92121-1714 - USA - - Phone: +1 858 651 4478 - Fax: +1 858 651 1102 - EMail: presnick@qualcomm.com - - - - - - - -Resnick Standards Track [Page 38] - -RFC 2822 Internet Message Format April 2001 - - -8. Acknowledgements - - Many people contributed to this document. They included folks who - participated in the Detailed Revision and Update of Messaging - Standards (DRUMS) Working Group of the Internet Engineering Task - Force (IETF), the chair of DRUMS, the Area Directors of the IETF, and - people who simply sent their comments in via e-mail. The editor is - deeply indebted to them all and thanks them sincerely. The below - list includes everyone who sent e-mail concerning this document. - Hopefully, everyone who contributed is named here: - - Matti Aarnio Barry Finkel Larry Masinter - Tanaka Akira Erik Forsberg Denis McKeon - Russ Allbery Chuck Foster William P McQuillan - Eric Allman Paul Fox Alexey Melnikov - Harald Tveit Alvestrand Klaus M. Frank Perry E. Metzger - Ran Atkinson Ned Freed Steven Miller - Jos Backus Jochen Friedrich Keith Moore - Bruce Balden Randall C. Gellens John Gardiner Myers - Dave Barr Sukvinder Singh Gill Chris Newman - Alan Barrett Tim Goodwin John W. Noerenberg - John Beck Philip Guenther Eric Norman - J. Robert von Behren Tony Hansen Mike O'Dell - Jos den Bekker John Hawkinson Larry Osterman - D. J. 
Bernstein Philip Hazel Paul Overell - James Berriman Kai Henningsen Jacob Palme - Norbert Bollow Robert Herriot Michael A. Patton - Raj Bose Paul Hethmon Uzi Paz - Antony Bowesman Jim Hill Michael A. Quinlan - Scott Bradner Paul E. Hoffman Eric S. Raymond - Randy Bush Steve Hole Sam Roberts - Tom Byrer Kari Hurtta Hugh Sasse - Bruce Campbell Marco S. Hyman Bart Schaefer - Larry Campbell Ofer Inbar Tom Scola - W. J. Carpenter Olle Jarnefors Wolfgang Segmuller - Michael Chapman Kevin Johnson Nick Shelness - Richard Clayton Sudish Joseph John Stanley - Maurizio Codogno Maynard Kang Einar Stefferud - Jim Conklin Prabhat Keni Jeff Stephenson - R. Kelley Cook John C. Klensin Bernard Stern - Steve Coya Graham Klyne Peter Sylvester - Mark Crispin Brad Knowles Mark Symons - Dave Crocker Shuhei Kobayashi Eric Thomas - Matt Curtin Peter Koch Lee Thompson - Michael D'Errico Dan Kohn Karel De Vriendt - Cyrus Daboo Christian Kuhtz Matthew Wall - Jutta Degener Anand Kumria Rolf Weber - Mark Delany Steen Larsen Brent B. Welch - Steve Dorner Eliot Lear Dan Wing - Harold A. Driscoll Barry Leiba Jack De Winter - Michael Elkins Jay Levitt Gregory J. Woodhouse - Robert Elz Lars-Johan Liman Greg A. Woods - Johnny Eriksson Charles Lindsey Kazu Yamamoto - Erik E. Fair Pete Loshin Alain Zahm - Roger Fajman Simon Lyall Jamie Zawinski - Patrik Faltstrom Bill Manning Timothy S. Zurcher - Claus Andre Farber John Martin - -Appendix A. Example messages - - This section presents a selection of messages. 
These are intended to - assist in the implementation of this standard, but should not be - taken as normative; that is to say, although the examples in this - section were carefully reviewed, if there happens to be a conflict - between these examples and the syntax described in sections 3 and 4 - of this document, the syntax in those sections is to be taken as - correct. - - Messages are delimited in this section between lines of "----". The - "----" lines are not part of the message itself. - -A.1. Addressing examples - - The following are examples of messages that might be sent between two - individuals. - -A.1.1. A message from one person to another with simple addressing - - This could be called a canonical message. It has a single author, - John Doe, a single recipient, Mary Smith, a subject, the date, a - message identifier, and a textual message in the body. - ----- -From: John Doe <jdoe@machine.example> -To: Mary Smith <mary@example.net> -Subject: Saying Hello -Date: Fri, 21 Nov 1997 09:55:06 -0600 -Message-ID: <1234@local.machine.example> - -This is a message just to say hello. -So, "Hello". ----- - - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 41] - -RFC 2822 Internet Message Format April 2001 - - - If John's secretary Michael actually sent the message, though John - was the author and replies to this message should go back to him, the - sender field would be used: - ----- -From: John Doe <jdoe@machine.example> -Sender: Michael Jones <mjones@machine.example> -To: Mary Smith <mary@example.net> -Subject: Saying Hello -Date: Fri, 21 Nov 1997 09:55:06 -0600 -Message-ID: <1234@local.machine.example> - -This is a message just to say hello. -So, "Hello". ----- - -A.1.2. Different types of mailboxes - - This message includes multiple addresses in the destination fields - and also uses several different forms of addresses. - ----- -From: "Joe Q. Public" <john.q.public@example.com> -To: Mary Smith <mary@x.test>, jdoe@example.org, Who? 
<one@y.test> -Cc: <boss@nil.test>, "Giant; \"Big\" Box" <sysservices@example.net> -Date: Tue, 1 Jul 2003 10:52:37 +0200 -Message-ID: <5678.21-Nov-1997@example.com> - -Hi everyone. ----- - - Note that the display names for Joe Q. Public and Giant; "Big" Box - needed to be enclosed in double-quotes because the former contains - the period and the latter contains both semicolon and double-quote - characters (the double-quote characters appearing as quoted-pair - construct). Conversely, the display name for Who? could appear - without them because the question mark is legal in an atom. Notice - also that jdoe@example.org and boss@nil.test have no display names - associated with them at all, and jdoe@example.org uses the simpler - address form without the angle brackets. - -A.1.3. Group addresses - ----- -From: Pete <pete@silly.example> -To: A Group:Chris Jones <c@a.test>,joe@where.test,John <jdoe@one.test>; -Cc: Undisclosed recipients:; -Date: Thu, 13 Feb 1969 23:32:54 -0330 -Message-ID: <testabcd.1234@silly.example> - -Testing. ----- - - In this message, the "To:" field has a single group recipient named A - Group which contains 3 addresses, and a "Cc:" field with an empty - group recipient named Undisclosed recipients. - -A.2. Reply messages - - The following is a series of three messages that make up a - conversation thread between John and Mary. John first sends a - message to Mary, Mary then replies to John's message, and then John - replies to Mary's reply message. - - Note especially the "Message-ID:", "References:", and "In-Reply-To:" - fields in each message. - ----- -From: John Doe <jdoe@machine.example> -To: Mary Smith <mary@example.net> -Subject: Saying Hello -Date: Fri, 21 Nov 1997 09:55:06 -0600 -Message-ID: <1234@local.machine.example> - -This is a message just to say hello. -So, "Hello". 
----- - - - - - - - - - - - - - - - -Resnick Standards Track [Page 43] - -RFC 2822 Internet Message Format April 2001 - - - When sending replies, the Subject field is often retained, though - prepended with "Re: " as described in section 3.6.5. - ----- -From: Mary Smith <mary@example.net> -To: John Doe <jdoe@machine.example> -Reply-To: "Mary Smith: Personal Account" <smith@home.example> -Subject: Re: Saying Hello -Date: Fri, 21 Nov 1997 10:01:10 -0600 -Message-ID: <3456@example.net> -In-Reply-To: <1234@local.machine.example> -References: <1234@local.machine.example> - -This is a reply to your hello. ----- - - Note the "Reply-To:" field in the above message. When John replies - to Mary's message above, the reply should go to the address in the - "Reply-To:" field instead of the address in the "From:" field. - ----- -To: "Mary Smith: Personal Account" <smith@home.example> -From: John Doe <jdoe@machine.example> -Subject: Re: Saying Hello -Date: Fri, 21 Nov 1997 11:00:00 -0600 -Message-ID: <abcd.1234@local.machine.tld> -In-Reply-To: <3456@example.net> -References: <1234@local.machine.example> <3456@example.net> - -This is a reply to your reply. ----- - -A.3. Resent messages - - Start with the message that has been used as an example several - times: - ----- -From: John Doe <jdoe@machine.example> -To: Mary Smith <mary@example.net> -Subject: Saying Hello -Date: Fri, 21 Nov 1997 09:55:06 -0600 -Message-ID: <1234@local.machine.example> - -This is a message just to say hello. -So, "Hello". 
----- - - - - -Resnick Standards Track [Page 44] - -RFC 2822 Internet Message Format April 2001 - - - Say that Mary, upon receiving this message, wishes to send a copy of - the message to Jane such that (a) the message would appear to have - come straight from John; (b) if Jane replies to the message, the - reply should go back to John; and (c) all of the original - information, like the date the message was originally sent to Mary, - the message identifier, and the original addressee, is preserved. In - this case, resent fields are prepended to the message: - ----- -Resent-From: Mary Smith <mary@example.net> -Resent-To: Jane Brown <j-brown@other.example> -Resent-Date: Mon, 24 Nov 1997 14:22:01 -0800 -Resent-Message-ID: <78910@example.net> -From: John Doe <jdoe@machine.example> -To: Mary Smith <mary@example.net> -Subject: Saying Hello -Date: Fri, 21 Nov 1997 09:55:06 -0600 -Message-ID: <1234@local.machine.example> - -This is a message just to say hello. -So, "Hello". ----- - - If Jane, in turn, wished to resend this message to another person, - she would prepend her own set of resent header fields to the above - and send that. - - - - - - - - - - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 45] - -RFC 2822 Internet Message Format April 2001 - - -A.4. Messages with trace fields - - As messages are sent through the transport system as described in - [RFC2821], trace fields are prepended to the message. The following - is an example of what those trace fields might look like. Note that - there is some folding white space in the first one since these lines - can be long. 
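Trace fields of this shape can be read back for display or debugging. The sketch below is a minimal Python example, not a normative procedure: it uses the standard library `email` package to collect the "Received:" fields, unfolds each one, and splits it at the last semicolon into the name/value part and the date-time. The function name `received_hops` and the `SAMPLE` text (a shortened copy of the example in this appendix) are assumptions for illustration, and this standard treats trace fields as strictly informational.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Shortened copy of the trace-field example, with the first field folded.
SAMPLE = (
    "Received: from x.y.test\n"
    "   by example.net\n"
    "   via TCP\n"
    "   with ESMTP\n"
    "   id ABC12345\n"
    "   for <mary@example.net>;  21 Nov 1997 10:05:43 -0600\n"
    "Received: from machine.example by x.y.test; 21 Nov 1997 10:01:22 -0600\n"
    "From: John Doe <jdoe@machine.example>\n"
    "\n"
    "This is a message just to say hello.\n"
)


def received_hops(raw_message: str):
    """Return (name-value text, datetime) pairs, newest hop first."""
    msg = message_from_string(raw_message)
    hops = []
    for value in msg.get_all("Received", []):
        unfolded = " ".join(value.split())      # undo folding white space
        info, sep, stamp = unfolded.rpartition(";")
        if not sep:
            continue                            # no date-time part; skip it
        hops.append((info.strip(), parsedate_to_datetime(stamp.strip())))
    return hops


if __name__ == "__main__":
    for info, when in received_hops(SAMPLE):
        print(when.isoformat(), info)
```

Because each hop prepends its own field, the first entry returned is the most recent one; subtracting the two timestamps in the sample gives the transit time between the two hosts.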
- ----- -Received: from x.y.test - by example.net - via TCP - with ESMTP - id ABC12345 - for <mary@example.net>; 21 Nov 1997 10:05:43 -0600 -Received: from machine.example by x.y.test; 21 Nov 1997 10:01:22 -0600 -From: John Doe <jdoe@machine.example> -To: Mary Smith <mary@example.net> -Subject: Saying Hello -Date: Fri, 21 Nov 1997 09:55:06 -0600 -Message-ID: <1234@local.machine.example> - -This is a message just to say hello. -So, "Hello". ----- - - - - - - - - - - - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 46] - -RFC 2822 Internet Message Format April 2001 - - -A.5. White space, comments, and other oddities - - White space, including folding white space, and comments can be - inserted between many of the tokens of fields. Taking the example - from A.1.3, white space and comments can be inserted into all of the - fields. - ----- -From: Pete(A wonderful \) chap) <pete(his account)@silly.test(his host)> -To:A Group(Some people) - :Chris Jones <c@(Chris's host.)public.example>, - joe@example.org, - John <jdoe@one.test> (my dear friend); (the end of the group) -Cc:(Empty list)(start)Undisclosed recipients :(nobody(that I know)) ; -Date: Thu, - 13 - Feb - 1969 - 23:32 - -0330 (Newfoundland Time) -Message-ID: <testabcd.1234@silly.test> - -Testing. ----- - - The above example is aesthetically displeasing, but perfectly legal. 
- Note particularly (1) the comments in the "From:" field (including - one that has a ")" character appearing as part of a quoted-pair); (2) - the white space absent after the ":" in the "To:" field as well as - the comment and folding white space after the group name, the special - character (".") in the comment in Chris Jones's address, and the - folding white space before and after "joe@example.org,"; (3) the - multiple and nested comments in the "Cc:" field as well as the - comment immediately following the ":" after "Cc"; (4) the folding - white space (but no comments except at the end) and the missing - seconds in the time of the date field; and (5) the white space before - (but not within) the identifier in the "Message-ID:" field. - -A.6. Obsoleted forms - - The following are examples of obsolete (that is, the "MUST NOT - generate") syntactic elements described in section 4 of this - document. - - - - - - - - -Resnick Standards Track [Page 47] - -RFC 2822 Internet Message Format April 2001 - - -A.6.1. Obsolete addressing - - Note in the below example the lack of quotes around Joe Q. Public, - the route that appears in the address for Mary Smith, the two commas - that appear in the "To:" field, and the spaces that appear around the - "." in the jdoe address. - ----- -From: Joe Q. Public <john.q.public@example.com> -To: Mary Smith <@machine.tld:mary@example.net>, , jdoe@test . example -Date: Tue, 1 Jul 2003 10:52:37 +0200 -Message-ID: <5678.21-Nov-1997@example.com> - -Hi everyone. ----- - -A.6.2. Obsolete dates - - The following message uses an obsolete date format, including a non- - numeric time zone and a two digit year. Note that although the - day-of-week is missing, that is not specific to the obsolete syntax; - it is optional in the current syntax as well. 
- ----- -From: John Doe <jdoe@machine.example> -To: Mary Smith <mary@example.net> -Subject: Saying Hello -Date: 21 Nov 97 09:55:06 GMT -Message-ID: <1234@local.machine.example> - -This is a message just to say hello. -So, "Hello". ----- - -A.6.3. Obsolete white space and comments - - White space and comments can appear between many more elements than - in the current syntax. Also, folding lines that are made up entirely - of white space are legal. - - - - - - - - - - - - -Resnick Standards Track [Page 48] - -RFC 2822 Internet Message Format April 2001 - - ----- -From : John Doe <jdoe@machine(comment). example> -To : Mary Smith -__ - <mary@example.net> -Subject : Saying Hello -Date : Fri, 21 Nov 1997 09(comment): 55 : 06 -0600 -Message-ID : <1234 @ local(blah) .machine .example> - -This is a message just to say hello. -So, "Hello". ----- - - Note especially the second line of the "To:" field. It starts with - two space characters. (Note that "__" represent blank spaces.) - Therefore, it is considered part of the folding as described in - section 4.2. Also, the comments and white space throughout - addresses, dates, and message identifiers are all part of the - obsolete syntax. - -Appendix B. Differences from earlier standards - - This appendix contains a list of changes that have been made in the - Internet Message Format from earlier standards, specifically [RFC822] - and [STD3]. Items marked with an asterisk (*) below are items which - appear in section 4 of this document and therefore can no longer be - generated. - - 1. Period allowed in obsolete form of phrase. - 2. ABNF moved out of document to [RFC2234]. - 3. Four or more digits allowed for year. - 4. Header field ordering (and lack thereof) made explicit. - 5. Encrypted header field removed. - 6. Received syntax loosened to allow any token/value pair. - 7. Specifically allow and give meaning to "-0000" time zone. - 8. Folding white space is not allowed between every token. - 9. 
Requirement for destinations removed. - 10. Forwarding and resending redefined. - 11. Extension header fields no longer specifically called out. - 12. ASCII 0 (null) removed.* - 13. Folding continuation lines cannot contain only white space.* - 14. Free insertion of comments not allowed in date.* - 15. Non-numeric time zones not allowed.* - 16. Two digit years not allowed.* - 17. Three digit years interpreted, but not allowed for generation. - 18. Routes in addresses not allowed.* - 19. CFWS within local-parts and domains not allowed.* - 20. Empty members of address lists not allowed.* - - - -Resnick Standards Track [Page 49] - -RFC 2822 Internet Message Format April 2001 - - - 21. Folding white space between field name and colon not allowed.* - 22. Comments between field name and colon not allowed. - 23. Tightened syntax of in-reply-to and references.* - 24. CFWS within msg-id not allowed.* - 25. Tightened semantics of resent fields as informational only. - 26. Resent-Reply-To not allowed.* - 27. No multiple occurrences of fields (except resent and received).* - 28. Free CR and LF not allowed.* - 29. Routes in return path not allowed.* - 30. Line length limits specified. - 31. Bcc more clearly specified. - -Appendix C. Notices - - Intellectual Property - - The IETF takes no position regarding the validity or scope of any - intellectual property or other rights that might be claimed to - pertain to the implementation or use of the technology described in - this document or the extent to which any license under such rights - might or might not be available; neither does it represent that it - has made any effort to identify any such rights. Information on the - IETF's procedures with respect to rights in standards-track and - standards-related documentation can be found in BCP-11. 
Copies of - claims of rights made available for publication and any assurances of - licenses to be made available, or the result of an attempt made to - obtain a general license or permission for the use of such - proprietary rights by implementors or users of this specification can - be obtained from the IETF Secretariat. - - - - - - - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 50] - -RFC 2822 Internet Message Format April 2001 - - -Full Copyright Statement - - Copyright (C) The Internet Society (2001). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
- -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. - - - - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 51] - diff --git a/proto/rfc3501.txt b/proto/rfc3501.txt @@ -1,6051 +0,0 @@ - - - - - - -Network Working Group M. Crispin -Request for Comments: 3501 University of Washington -Obsoletes: 2060 March 2003 -Category: Standards Track - - - INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4rev1 - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (2003). All Rights Reserved. - -Abstract - - The Internet Message Access Protocol, Version 4rev1 (IMAP4rev1) - allows a client to access and manipulate electronic mail messages on - a server. IMAP4rev1 permits manipulation of mailboxes (remote - message folders) in a way that is functionally equivalent to local - folders. IMAP4rev1 also provides the capability for an offline - client to resynchronize with the server. - - IMAP4rev1 includes operations for creating, deleting, and renaming - mailboxes, checking for new messages, permanently removing messages, - setting and clearing flags, RFC 2822 and RFC 2045 parsing, searching, - and selective fetching of message attributes, texts, and portions - thereof. Messages in IMAP4rev1 are accessed by the use of numbers. - These numbers are either message sequence numbers or unique - identifiers. - - IMAP4rev1 supports a single server. A mechanism for accessing - configuration information to support multiple IMAP4rev1 servers is - discussed in RFC 2244. 
-
-   IMAP4rev1 does not specify a means of posting mail; this function is
-   handled by a mail transfer protocol such as RFC 2821.
-
-
-
-
-
-
-
-
-Crispin                     Standards Track                     [Page 1]
-
-RFC 3501                         IMAPv4                       March 2003
-
-
-Table of Contents
-
-   IMAP4rev1 Protocol Specification ................................  4
-   1.      How to Read This Document ...............................  4
-   1.1.    Organization of This Document ...........................  4
-   1.2.    Conventions Used in This Document .......................  4
-   1.3.    Special Notes to Implementors ...........................  5
-   2.      Protocol Overview .......................................  6
-   2.1.    Link Level ..............................................  6
-   2.2.    Commands and Responses ..................................  6
-   2.2.1.  Client Protocol Sender and Server Protocol Receiver .....  6
-   2.2.2.  Server Protocol Sender and Client Protocol Receiver .....  7
-   2.3.    Message Attributes ......................................  8
-   2.3.1.  Message Numbers .........................................  8
-   2.3.1.1.        Unique Identifier (UID) Message Attribute .......  8
-   2.3.1.2.        Message Sequence Number Message Attribute ....... 10
-   2.3.2.  Flags Message Attribute ................................. 11
-   2.3.3.  Internal Date Message Attribute ......................... 12
-   2.3.4.  [RFC-2822] Size Message Attribute ....................... 12
-   2.3.5.  Envelope Structure Message Attribute .................... 12
-   2.3.6.  Body Structure Message Attribute ........................ 12
-   2.4.    Message Texts ........................................... 13
-   3.      State and Flow Diagram .................................. 13
-   3.1.    Not Authenticated State ................................. 13
-   3.2.    Authenticated State ..................................... 13
-   3.3.    Selected State .......................................... 13
-   3.4.    Logout State ............................................ 14
-   4.      Data Formats ............................................ 16
-   4.1.
Atom .................................................... 16
-   4.2.    Number .................................................. 16
-   4.3.    String .................................................. 16
-   4.3.1.  8-bit and Binary Strings ................................ 17
-   4.4.    Parenthesized List ...................................... 17
-   4.5.    NIL ..................................................... 17
-   5.      Operational Considerations .............................. 18
-   5.1.    Mailbox Naming .......................................... 18
-   5.1.1.  Mailbox Hierarchy Naming ................................ 19
-   5.1.2.  Mailbox Namespace Naming Convention ..................... 19
-   5.1.3.  Mailbox International Naming Convention ................. 19
-   5.2.    Mailbox Size and Message Status Updates ................. 21
-   5.3.    Response when no Command in Progress .................... 21
-   5.4.    Autologout Timer ........................................ 22
-   5.5.    Multiple Commands in Progress ........................... 22
-   6.      Client Commands ........................................ 23
-   6.1.    Client Commands - Any State ............................ 24
-   6.1.1.  CAPABILITY Command ..................................... 24
-   6.1.2.  NOOP Command ........................................... 25
-   6.1.3.  LOGOUT Command ......................................... 26
-
-
-
-Crispin                     Standards Track                     [Page 2]
-
-RFC 3501                         IMAPv4                       March 2003
-
-
-   6.2.    Client Commands - Not Authenticated State .............. 26
-   6.2.1.  STARTTLS Command ....................................... 27
-   6.2.2.  AUTHENTICATE Command ................................... 28
-   6.2.3.  LOGIN Command .......................................... 30
-   6.3.    Client Commands - Authenticated State .................. 31
-   6.3.1.  SELECT Command ......................................... 32
-   6.3.2.  EXAMINE Command ........................................ 34
-   6.3.3.  CREATE Command ......................................... 34
-   6.3.4.
DELETE Command ......................................... 35
-   6.3.5.  RENAME Command ......................................... 37
-   6.3.6.  SUBSCRIBE Command ...................................... 39
-   6.3.7.  UNSUBSCRIBE Command .................................... 39
-   6.3.8.  LIST Command ........................................... 40
-   6.3.9.  LSUB Command ........................................... 43
-   6.3.10. STATUS Command ......................................... 44
-   6.3.11. APPEND Command ......................................... 46
-   6.4.    Client Commands - Selected State ....................... 47
-   6.4.1.  CHECK Command .......................................... 47
-   6.4.2.  CLOSE Command .......................................... 48
-   6.4.3.  EXPUNGE Command ........................................ 49
-   6.4.4.  SEARCH Command ......................................... 49
-   6.4.5.  FETCH Command .......................................... 54
-   6.4.6.  STORE Command .......................................... 58
-   6.4.7.  COPY Command ........................................... 59
-   6.4.8.  UID Command ............................................ 60
-   6.5.    Client Commands - Experimental/Expansion ............... 62
-   6.5.1.  X<atom> Command ........................................ 62
-   7.      Server Responses ....................................... 62
-   7.1.    Server Responses - Status Responses .................... 63
-   7.1.1.  OK Response ............................................ 65
-   7.1.2.  NO Response ............................................ 66
-   7.1.3.  BAD Response ........................................... 66
-   7.1.4.  PREAUTH Response ....................................... 67
-   7.1.5.  BYE Response ........................................... 67
-   7.2.    Server Responses - Server and Mailbox Status ........... 68
-   7.2.1.  CAPABILITY Response .................................... 68
-   7.2.2.  LIST Response .......................................... 69
-   7.2.3.
LSUB Response .......................................... 70
-   7.2.4   STATUS Response ........................................ 70
-   7.2.5.  SEARCH Response ........................................ 71
-   7.2.6.  FLAGS Response ......................................... 71
-   7.3.    Server Responses - Mailbox Size ........................ 71
-   7.3.1.  EXISTS Response ........................................ 71
-   7.3.2.  RECENT Response ........................................ 72
-   7.4.    Server Responses - Message Status ...................... 72
-   7.4.1.  EXPUNGE Response ....................................... 72
-   7.4.2.  FETCH Response ......................................... 73
-   7.5.    Server Responses - Command Continuation Request ........ 79
-
-
-
-Crispin                     Standards Track                     [Page 3]
-
-RFC 3501                         IMAPv4                       March 2003
-
-
-   8.      Sample IMAP4rev1 connection ............................ 80
-   9.      Formal Syntax .......................................... 81
-   10.     Author's Note .......................................... 92
-   11.     Security Considerations ................................ 92
-   11.1.   STARTTLS Security Considerations ....................... 92
-   11.2.   Other Security Considerations .......................... 93
-   12.     IANA Considerations .................................... 94
-   Appendices ..................................................... 95
-   A.      References ............................................. 95
-   B.      Changes from RFC 2060 .................................. 97
-   C.      Key Word Index ......................................... 103
-   Author's Address ............................................... 107
-   Full Copyright Statement ....................................... 108
-
-IMAP4rev1 Protocol Specification
-
-1. How to Read This Document
-
-1.1. Organization of This Document
-
-   This document is written from the point of view of the implementor of
-   an IMAP4rev1 client or server.
Beyond the protocol overview in - section 2, it is not optimized for someone trying to understand the - operation of the protocol. The material in sections 3 through 5 - provides the general context and definitions with which IMAP4rev1 - operates. - - Sections 6, 7, and 9 describe the IMAP commands, responses, and - syntax, respectively. The relationships among these are such that it - is almost impossible to understand any of them separately. In - particular, do not attempt to deduce command syntax from the command - section alone; instead refer to the Formal Syntax section. - -1.2. Conventions Used in This Document - - "Conventions" are basic principles or procedures. Document - conventions are noted in this section. - - In examples, "C:" and "S:" indicate lines sent by the client and - server respectively. - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "MAY", and "OPTIONAL" in this document are to - be interpreted as described in [KEYWORDS]. - - The word "can" (not "may") is used to refer to a possible - circumstance or situation, as opposed to an optional facility of the - protocol. - - - -Crispin Standards Track [Page 4] - -RFC 3501 IMAPv4 March 2003 - - - "User" is used to refer to a human user, whereas "client" refers to - the software being run by the user. - - "Connection" refers to the entire sequence of client/server - interaction from the initial establishment of the network connection - until its termination. - - "Session" refers to the sequence of client/server interaction from - the time that a mailbox is selected (SELECT or EXAMINE command) until - the time that selection ends (SELECT or EXAMINE of another mailbox, - CLOSE command, or connection termination). - - Characters are 7-bit US-ASCII unless otherwise specified. Other - character sets are indicated using a "CHARSET", as described in - [MIME-IMT] and defined in [CHARSET]. 
CHARSETs have important - additional semantics in addition to defining character set; refer to - these documents for more detail. - - There are several protocol conventions in IMAP. These refer to - aspects of the specification which are not strictly part of the IMAP - protocol, but reflect generally-accepted practice. Implementations - need to be aware of these conventions, and avoid conflicts whether or - not they implement the convention. For example, "&" may not be used - as a hierarchy delimiter since it conflicts with the Mailbox - International Naming Convention, and other uses of "&" in mailbox - names are impacted as well. - -1.3. Special Notes to Implementors - - Implementors of the IMAP protocol are strongly encouraged to read the - IMAP implementation recommendations document [IMAP-IMPLEMENTATION] in - conjunction with this document, to help understand the intricacies of - this protocol and how best to build an interoperable product. - - IMAP4rev1 is designed to be upwards compatible from the [IMAP2] and - unpublished IMAP2bis protocols. IMAP4rev1 is largely compatible with - the IMAP4 protocol described in RFC 1730; the exception being in - certain facilities added in RFC 1730 that proved problematic and were - subsequently removed. In the course of the evolution of IMAP4rev1, - some aspects in the earlier protocols have become obsolete. Obsolete - commands, responses, and data formats which an IMAP4rev1 - implementation can encounter when used with an earlier implementation - are described in [IMAP-OBSOLETE]. - - Other compatibility issues with IMAP2bis, the most common variant of - the earlier protocol, are discussed in [IMAP-COMPAT]. A full - discussion of compatibility issues with rare (and presumed extinct) - - - - -Crispin Standards Track [Page 5] - -RFC 3501 IMAPv4 March 2003 - - - variants of [IMAP2] is in [IMAP-HISTORICAL]; this document is - primarily of historical interest. 
- - IMAP was originally developed for the older [RFC-822] standard, and - as a consequence several fetch items in IMAP incorporate "RFC822" in - their name. With the exception of RFC822.SIZE, there are more modern - replacements; for example, the modern version of RFC822.HEADER is - BODY.PEEK[HEADER]. In all cases, "RFC822" should be interpreted as a - reference to the updated [RFC-2822] standard. - -2. Protocol Overview - -2.1. Link Level - - The IMAP4rev1 protocol assumes a reliable data stream such as that - provided by TCP. When TCP is used, an IMAP4rev1 server listens on - port 143. - -2.2. Commands and Responses - - An IMAP4rev1 connection consists of the establishment of a - client/server network connection, an initial greeting from the - server, and client/server interactions. These client/server - interactions consist of a client command, server data, and a server - completion result response. - - All interactions transmitted by client and server are in the form of - lines, that is, strings that end with a CRLF. The protocol receiver - of an IMAP4rev1 client or server is either reading a line, or is - reading a sequence of octets with a known count followed by a line. - -2.2.1. Client Protocol Sender and Server Protocol Receiver - - The client command begins an operation. Each client command is - prefixed with an identifier (typically a short alphanumeric string, - e.g., A0001, A0002, etc.) called a "tag". A different tag is - generated by the client for each command. - - Clients MUST follow the syntax outlined in this specification - strictly. It is a syntax error to send a command with missing or - extraneous spaces or arguments. - - There are two cases in which a line from the client does not - represent a complete command. In one case, a command argument is - quoted with an octet count (see the description of literal in String - under Data Formats); in the other case, the command arguments require - server feedback (see the AUTHENTICATE command). 
In either case, the - - - - -Crispin Standards Track [Page 6] - -RFC 3501 IMAPv4 March 2003 - - - server sends a command continuation request response if it is ready - for the octets (if appropriate) and the remainder of the command. - This response is prefixed with the token "+". - - Note: If instead, the server detected an error in the - command, it sends a BAD completion response with a tag - matching the command (as described below) to reject the - command and prevent the client from sending any more of the - command. - - It is also possible for the server to send a completion - response for some other command (if multiple commands are - in progress), or untagged data. In either case, the - command continuation request is still pending; the client - takes the appropriate action for the response, and reads - another response from the server. In all cases, the client - MUST send a complete command (including receiving all - command continuation request responses and command - continuations for the command) before initiating a new - command. - - The protocol receiver of an IMAP4rev1 server reads a command line - from the client, parses the command and its arguments, and transmits - server data and a server command completion result response. - -2.2.2. Server Protocol Sender and Client Protocol Receiver - - Data transmitted by the server to the client and status responses - that do not indicate command completion are prefixed with the token - "*", and are called untagged responses. - - Server data MAY be sent as a result of a client command, or MAY be - sent unilaterally by the server. There is no syntactic difference - between server data that resulted from a specific command and server - data that were sent unilaterally. - - The server completion result response indicates the success or - failure of the operation. It is tagged with the same tag as the - client command which began the operation. 
Thus, if more than one - command is in progress, the tag in a server completion response - identifies the command to which the response applies. There are - three possible server completion responses: OK (indicating success), - NO (indicating failure), or BAD (indicating a protocol error such as - unrecognized command or command syntax error). - - Servers SHOULD enforce the syntax outlined in this specification - strictly. Any client command with a protocol syntax error, including - (but not limited to) missing or extraneous spaces or arguments, - - - -Crispin Standards Track [Page 7] - -RFC 3501 IMAPv4 March 2003 - - - SHOULD be rejected, and the client given a BAD server completion - response. - - The protocol receiver of an IMAP4rev1 client reads a response line - from the server. It then takes action on the response based upon the - first token of the response, which can be a tag, a "*", or a "+". - - A client MUST be prepared to accept any server response at all times. - This includes server data that was not requested. Server data SHOULD - be recorded, so that the client can reference its recorded copy - rather than sending a command to the server to request the data. In - the case of certain server data, the data MUST be recorded. - - This topic is discussed in greater detail in the Server Responses - section. - -2.3. Message Attributes - - In addition to message text, each message has several attributes - associated with it. These attributes can be retrieved individually - or in conjunction with other attributes or message texts. - -2.3.1. Message Numbers - - Messages in IMAP4rev1 are accessed by one of two numbers; the unique - identifier or the message sequence number. - - -2.3.1.1. 
Unique Identifier (UID) Message Attribute - - A 32-bit value assigned to each message, which when used with the - unique identifier validity value (see below) forms a 64-bit value - that MUST NOT refer to any other message in the mailbox or any - subsequent mailbox with the same name forever. Unique identifiers - are assigned in a strictly ascending fashion in the mailbox; as each - message is added to the mailbox it is assigned a higher UID than the - message(s) which were added previously. Unlike message sequence - numbers, unique identifiers are not necessarily contiguous. - - The unique identifier of a message MUST NOT change during the - session, and SHOULD NOT change between sessions. Any change of - unique identifiers between sessions MUST be detectable using the - UIDVALIDITY mechanism discussed below. Persistent unique identifiers - are required for a client to resynchronize its state from a previous - session with the server (e.g., disconnected or offline access - clients); this is discussed further in [IMAP-DISC]. - - - - - -Crispin Standards Track [Page 8] - -RFC 3501 IMAPv4 March 2003 - - - Associated with every mailbox are two values which aid in unique - identifier handling: the next unique identifier value and the unique - identifier validity value. - - The next unique identifier value is the predicted value that will be - assigned to a new message in the mailbox. Unless the unique - identifier validity also changes (see below), the next unique - identifier value MUST have the following two characteristics. First, - the next unique identifier value MUST NOT change unless new messages - are added to the mailbox; and second, the next unique identifier - value MUST change whenever new messages are added to the mailbox, - even if those new messages are subsequently expunged. 
- - Note: The next unique identifier value is intended to - provide a means for a client to determine whether any - messages have been delivered to the mailbox since the - previous time it checked this value. It is not intended to - provide any guarantee that any message will have this - unique identifier. A client can only assume, at the time - that it obtains the next unique identifier value, that - messages arriving after that time will have a UID greater - than or equal to that value. - - The unique identifier validity value is sent in a UIDVALIDITY - response code in an OK untagged response at mailbox selection time. - If unique identifiers from an earlier session fail to persist in this - session, the unique identifier validity value MUST be greater than - the one used in the earlier session. - - Note: Ideally, unique identifiers SHOULD persist at all - times. Although this specification recognizes that failure - to persist can be unavoidable in certain server - environments, it STRONGLY ENCOURAGES message store - implementation techniques that avoid this problem. For - example: - - 1) Unique identifiers MUST be strictly ascending in the - mailbox at all times. If the physical message store is - re-ordered by a non-IMAP agent, this requires that the - unique identifiers in the mailbox be regenerated, since - the former unique identifiers are no longer strictly - ascending as a result of the re-ordering. - - 2) If the message store has no mechanism to store unique - identifiers, it must regenerate unique identifiers at - each session, and each session must have a unique - UIDVALIDITY value. - - - - -Crispin Standards Track [Page 9] - -RFC 3501 IMAPv4 March 2003 - - - 3) If the mailbox is deleted and a new mailbox with the - same name is created at a later date, the server must - either keep track of unique identifiers from the - previous instance of the mailbox, or it must assign a - new UIDVALIDITY value to the new instance of the - mailbox. 
A good UIDVALIDITY value to use in this case - is a 32-bit representation of the creation date/time of - the mailbox. It is alright to use a constant such as - 1, but only if it guaranteed that unique identifiers - will never be reused, even in the case of a mailbox - being deleted (or renamed) and a new mailbox by the - same name created at some future time. - - 4) The combination of mailbox name, UIDVALIDITY, and UID - must refer to a single immutable message on that server - forever. In particular, the internal date, [RFC-2822] - size, envelope, body structure, and message texts - (RFC822, RFC822.HEADER, RFC822.TEXT, and all BODY[...] - fetch data items) must never change. This does not - include message numbers, nor does it include attributes - that can be set by a STORE command (e.g., FLAGS). - - -2.3.1.2. Message Sequence Number Message Attribute - - A relative position from 1 to the number of messages in the mailbox. - This position MUST be ordered by ascending unique identifier. As - each new message is added, it is assigned a message sequence number - that is 1 higher than the number of messages in the mailbox before - that new message was added. - - Message sequence numbers can be reassigned during the session. For - example, when a message is permanently removed (expunged) from the - mailbox, the message sequence number for all subsequent messages is - decremented. The number of messages in the mailbox is also - decremented. Similarly, a new message can be assigned a message - sequence number that was once held by some other message prior to an - expunge. - - In addition to accessing messages by relative position in the - mailbox, message sequence numbers can be used in mathematical - calculations. For example, if an untagged "11 EXISTS" is received, - and previously an untagged "8 EXISTS" was received, three new - messages have arrived with message sequence numbers of 9, 10, and 11. 
-   Another example, if message 287 in a 523 message mailbox has UID
-   12345, there are exactly 286 messages which have lesser UIDs and 236
-   messages which have greater UIDs.
-
-
-
-
-Crispin                     Standards Track                    [Page 10]
-
-RFC 3501                         IMAPv4                       March 2003
-
-
-2.3.2. Flags Message Attribute
-
-   A list of zero or more named tokens associated with the message. A
-   flag is set by its addition to this list, and is cleared by its
-   removal. There are two types of flags in IMAP4rev1. A flag of
-   either type can be permanent or session-only.
-
-   A system flag is a flag name that is pre-defined in this
-   specification. All system flags begin with "\". Certain system
-   flags (\Deleted and \Seen) have special semantics described
-   elsewhere. The currently-defined system flags are:
-
-        \Seen
-           Message has been read
-
-        \Answered
-           Message has been answered
-
-        \Flagged
-           Message is "flagged" for urgent/special attention
-
-        \Deleted
-           Message is "deleted" for removal by later EXPUNGE
-
-        \Draft
-           Message has not completed composition (marked as a draft).
-
-        \Recent
-           Message is "recently" arrived in this mailbox. This session
-           is the first session to have been notified about this
-           message; if the session is read-write, subsequent sessions
-           will not see \Recent set for this message. This flag can not
-           be altered by the client.
-
-           If it is not possible to determine whether or not this
-           session is the first session to be notified about a message,
-           then that message SHOULD be considered recent.
-
-           If multiple connections have the same mailbox selected
-           simultaneously, it is undefined which of these connections
-           will see newly-arrived messages with \Recent set and which
-           will see it without \Recent set.
-
-   A keyword is defined by the server implementation. Keywords do not
-   begin with "\". Servers MAY permit the client to define new keywords
-   in the mailbox (see the description of the PERMANENTFLAGS response
-   code for more information).
- - - - -Crispin Standards Track [Page 11] - -RFC 3501 IMAPv4 March 2003 - - - A flag can be permanent or session-only on a per-flag basis. - Permanent flags are those which the client can add or remove from the - message flags permanently; that is, concurrent and subsequent - sessions will see any change in permanent flags. Changes to session - flags are valid only in that session. - - Note: The \Recent system flag is a special case of a - session flag. \Recent can not be used as an argument in a - STORE or APPEND command, and thus can not be changed at - all. - -2.3.3. Internal Date Message Attribute - - The internal date and time of the message on the server. This - is not the date and time in the [RFC-2822] header, but rather a - date and time which reflects when the message was received. In - the case of messages delivered via [SMTP], this SHOULD be the - date and time of final delivery of the message as defined by - [SMTP]. In the case of messages delivered by the IMAP4rev1 COPY - command, this SHOULD be the internal date and time of the source - message. In the case of messages delivered by the IMAP4rev1 - APPEND command, this SHOULD be the date and time as specified in - the APPEND command description. All other cases are - implementation defined. - -2.3.4. [RFC-2822] Size Message Attribute - - The number of octets in the message, as expressed in [RFC-2822] - format. - -2.3.5. Envelope Structure Message Attribute - - A parsed representation of the [RFC-2822] header of the message. - Note that the IMAP Envelope structure is not the same as an - [SMTP] envelope. - -2.3.6. Body Structure Message Attribute - - A parsed representation of the [MIME-IMB] body structure - information of the message. - - - - - - - - - - - -Crispin Standards Track [Page 12] - -RFC 3501 IMAPv4 March 2003 - - -2.4. Message Texts - - In addition to being able to fetch the full [RFC-2822] text of a - message, IMAP4rev1 permits the fetching of portions of the full - message text. 
Specifically, it is possible to fetch the - [RFC-2822] message header, [RFC-2822] message body, a [MIME-IMB] - body part, or a [MIME-IMB] header. - -3. State and Flow Diagram - - Once the connection between client and server is established, an - IMAP4rev1 connection is in one of four states. The initial - state is identified in the server greeting. Most commands are - only valid in certain states. It is a protocol error for the - client to attempt a command while the connection is in an - inappropriate state, and the server will respond with a BAD or - NO (depending upon server implementation) command completion - result. - -3.1. Not Authenticated State - - In the not authenticated state, the client MUST supply - authentication credentials before most commands will be - permitted. This state is entered when a connection starts - unless the connection has been pre-authenticated. - -3.2. Authenticated State - - In the authenticated state, the client is authenticated and MUST - select a mailbox to access before commands that affect messages - will be permitted. This state is entered when a - pre-authenticated connection starts, when acceptable - authentication credentials have been provided, after an error in - selecting a mailbox, or after a successful CLOSE command. - -3.3. Selected State - - In a selected state, a mailbox has been selected to access. - This state is entered when a mailbox has been successfully - selected. - - - - - - - - - - - -Crispin Standards Track [Page 13] - -RFC 3501 IMAPv4 March 2003 - - -3.4. Logout State - - In the logout state, the connection is being terminated. This - state can be entered as a result of a client request (via the - LOGOUT command) or by unilateral action on the part of either - the client or server. 
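The four states of section 3 can be read as a small transition table. The Python sketch below is one possible model, not part of the specification: `State`, `GREETING_TO_STATE` and `next_state` are hypothetical names, and `result` is assumed to be the tagged completion result ("OK", "NO" or "BAD"). Unilateral connection loss is not modelled.

```python
from enum import Enum


class State(Enum):
    NOT_AUTHENTICATED = "not authenticated"
    AUTHENTICATED = "authenticated"
    SELECTED = "selected"
    LOGOUT = "logout"


# The server greeting identifies the initial state.
GREETING_TO_STATE = {
    "OK": State.NOT_AUTHENTICATED,   # connection without pre-authentication
    "PREAUTH": State.AUTHENTICATED,  # pre-authenticated connection
    "BYE": State.LOGOUT,             # rejected connection
}


def next_state(state, command, result):
    """Return the connection state after `command` completes with `result`."""
    cmd = command.upper()
    # A rejected command (BAD) never changes state; logout is final.
    if state is State.LOGOUT or result == "BAD":
        return state
    if cmd == "LOGOUT":
        return State.LOGOUT
    if state is State.NOT_AUTHENTICATED:
        if cmd in ("LOGIN", "AUTHENTICATE") and result == "OK":
            return State.AUTHENTICATED
        return state
    if cmd in ("SELECT", "EXAMINE"):
        # A failed selection leaves the connection in authenticated state.
        return State.SELECTED if result == "OK" else State.AUTHENTICATED
    if cmd == "CLOSE" and state is State.SELECTED and result == "OK":
        return State.AUTHENTICATED
    return state
```

For instance, `next_state(State.SELECTED, "SELECT", "NO")` yields `State.AUTHENTICATED`, which matches the rule that an error in selecting a mailbox returns the connection to the authenticated state.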
-
-   If the client requests the logout state, the server MUST send an
-   untagged BYE response and a tagged OK response to the LOGOUT
-   command before the server closes the connection; and the client
-   MUST read the tagged OK response to the LOGOUT command before
-   the client closes the connection.
-
-   A server MUST NOT unilaterally close the connection without
-   sending an untagged BYE response that contains the reason for
-   having done so. A client SHOULD NOT unilaterally close the
-   connection, and instead SHOULD issue a LOGOUT command. If the
-   server detects that the client has unilaterally closed the
-   connection, the server MAY omit the untagged BYE response and
-   simply close its connection.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-Crispin                     Standards Track                    [Page 14]
-
-RFC 3501                         IMAPv4                       March 2003
-
-
-                   +----------------------+
-                   |connection established|
-                   +----------------------+
-                              ||
-                              \/
-            +--------------------------------------+
-            |          server greeting             |
-            +--------------------------------------+
-                      || (1)       || (2)        || (3)
-                      \/           ||            ||
-            +-----------------+    ||            ||
-            |Not Authenticated|    ||            ||
-            +-----------------+    ||            ||
-             || (7)   || (4)       ||            ||
-             ||       \/           \/            ||
-             ||     +----------------+           ||
-             ||     | Authenticated  |<=++       ||
-             ||     +----------------+  ||       ||
-             ||       || (7)   || (5)   || (6)   ||
-             ||       ||       \/       ||       ||
-             ||       ||    +--------+  ||       ||
-             ||       ||    |Selected|==++       ||
-             ||       ||    +--------+           ||
-             ||       ||       || (7)            ||
-             \/       \/       \/                \/
-            +--------------------------------------+
-            |               Logout                 |
-            +--------------------------------------+
-                              ||
-                              \/
-                +-------------------------------+
-                |both sides close the connection|
-                +-------------------------------+
-
-            (1) connection without pre-authentication (OK greeting)
-            (2) pre-authenticated connection (PREAUTH greeting)
-            (3) rejected connection (BYE greeting)
-            (4) successful LOGIN or AUTHENTICATE command
-            (5) successful SELECT or EXAMINE command
-            (6) CLOSE command, or failed SELECT or EXAMINE command
-            (7) LOGOUT command, server shutdown, or connection closed
-
-
-
-
-
-
-
-
-
-
-Crispin                     Standards Track                    [Page 15]
-
-RFC 3501                         IMAPv4                       March 2003
-
-
-4. Data Formats
-
-   IMAP4rev1 uses textual commands and responses. Data in
-   IMAP4rev1 can be in one of several forms: atom, number, string,
-   parenthesized list, or NIL. Note that a particular data item
-   may take more than one form; for example, a data item defined as
-   using "astring" syntax may be either an atom or a string.
-
-4.1. Atom
-
-   An atom consists of one or more non-special characters.
-
-4.2. Number
-
-   A number consists of one or more digit characters, and
-   represents a numeric value.
-
-4.3. String
-
-   A string is in one of two forms: either literal or quoted
-   string. The literal form is the general form of string. The
-   quoted string form is an alternative that avoids the overhead of
-   processing a literal at the cost of limitations of characters
-   which may be used.
-
-   A literal is a sequence of zero or more octets (including CR and
-   LF), prefix-quoted with an octet count in the form of an open
-   brace ("{"), the number of octets, close brace ("}"), and CRLF.
-   In the case of literals transmitted from server to client, the
-   CRLF is immediately followed by the octet data. In the case of
-   literals transmitted from client to server, the client MUST wait
-   to receive a command continuation request (described later in
-   this document) before sending the octet data (and the remainder
-   of the command).
-
-   A quoted string is a sequence of zero or more 7-bit characters,
-   excluding CR and LF, with double quote (<">) characters at each
-   end.
-
-   The empty string is represented as either "" (a quoted string
-   with zero characters between double quotes) or as {0} followed
-   by CRLF (a literal with an octet count of 0).
-
-   Note: Even if the octet count is 0, a client transmitting a
-   literal MUST wait to receive a command continuation request.
-
-
-
-
-
-
-Crispin                     Standards Track                    [Page 16]
-
-RFC 3501                         IMAPv4                       March 2003
-
-
-4.3.1.
8-bit and Binary Strings - - 8-bit textual and binary mail is supported through the use of a - [MIME-IMB] content transfer encoding. IMAP4rev1 implementations MAY - transmit 8-bit or multi-octet characters in literals, but SHOULD do - so only when the [CHARSET] is identified. - - Although a BINARY body encoding is defined, unencoded binary strings - are not permitted. A "binary string" is any string with NUL - characters. Implementations MUST encode binary data into a textual - form, such as BASE64, before transmitting the data. A string with an - excessive amount of CTL characters MAY also be considered to be - binary. - -4.4. Parenthesized List - - Data structures are represented as a "parenthesized list"; a sequence - of data items, delimited by space, and bounded at each end by - parentheses. A parenthesized list can contain other parenthesized - lists, using multiple levels of parentheses to indicate nesting. - - The empty list is represented as () -- a parenthesized list with no - members. - -4.5. NIL - - The special form "NIL" represents the non-existence of a particular - data item that is represented as a string or parenthesized list, as - distinct from the empty string "" or the empty parenthesized list (). - - Note: NIL is never used for any data item which takes the - form of an atom. For example, a mailbox name of "NIL" is a - mailbox named NIL as opposed to a non-existent mailbox - name. This is because mailbox uses "astring" syntax which - is an atom or a string. Conversely, an addr-name of NIL is - a non-existent personal name, because addr-name uses - "nstring" syntax which is NIL or a string, but never an - atom. - - - - - - - - - - - - - -Crispin Standards Track [Page 17] - -RFC 3501 IMAPv4 March 2003 - - -5. Operational Considerations - - The following rules are listed here to ensure that all IMAP4rev1 - implementations interoperate properly. - -5.1. Mailbox Naming - - Mailbox names are 7-bit. 
Client implementations MUST NOT attempt to - create 8-bit mailbox names, and SHOULD interpret any 8-bit mailbox - names returned by LIST or LSUB as UTF-8. Server implementations - SHOULD prohibit the creation of 8-bit mailbox names, and SHOULD NOT - return 8-bit mailbox names in LIST or LSUB. See section 5.1.3 for - more information on how to represent non-ASCII mailbox names. - - Note: 8-bit mailbox names were undefined in earlier - versions of this protocol. Some sites used a local 8-bit - character set to represent non-ASCII mailbox names. Such - usage is not interoperable, and is now formally deprecated. - - The case-insensitive mailbox name INBOX is a special name reserved to - mean "the primary mailbox for this user on this server". The - interpretation of all other names is implementation-dependent. - - In particular, this specification takes no position on case - sensitivity in non-INBOX mailbox names. Some server implementations - are fully case-sensitive; others preserve case of a newly-created - name but otherwise are case-insensitive; and yet others coerce names - to a particular case. Client implementations MUST interact with any - of these. If a server implementation interprets non-INBOX mailbox - names as case-insensitive, it MUST treat names using the - international naming convention specially as described in section - 5.1.3. - - There are certain client considerations when creating a new mailbox - name: - - 1) Any character which is one of the atom-specials (see the Formal - Syntax) will require that the mailbox name be represented as a - quoted string or literal. - - 2) CTL and other non-graphic characters are difficult to represent - in a user interface and are best avoided. - - 3) Although the list-wildcard characters ("%" and "*") are valid - in a mailbox name, it is difficult to use such mailbox names - with the LIST and LSUB commands due to the conflict with - wildcard interpretation. 
- - - - -Crispin Standards Track [Page 18] - -RFC 3501 IMAPv4 March 2003 - - - 4) Usually, a character (determined by the server implementation) - is reserved to delimit levels of hierarchy. - - 5) Two characters, "#" and "&", have meanings by convention, and - should be avoided except when used in that convention. - -5.1.1. Mailbox Hierarchy Naming - - If it is desired to export hierarchical mailbox names, mailbox names - MUST be left-to-right hierarchical using a single character to - separate levels of hierarchy. The same hierarchy separator character - is used for all levels of hierarchy within a single name. - -5.1.2. Mailbox Namespace Naming Convention - - By convention, the first hierarchical element of any mailbox name - which begins with "#" identifies the "namespace" of the remainder of - the name. This makes it possible to disambiguate between different - types of mailbox stores, each of which have their own namespaces. - - For example, implementations which offer access to USENET - newsgroups MAY use the "#news" namespace to partition the - USENET newsgroup namespace from that of other mailboxes. - Thus, the comp.mail.misc newsgroup would have a mailbox - name of "#news.comp.mail.misc", and the name - "comp.mail.misc" can refer to a different object (e.g., a - user's private mailbox). - -5.1.3. Mailbox International Naming Convention - - By convention, international mailbox names in IMAP4rev1 are specified - using a modified version of the UTF-7 encoding described in [UTF-7]. - Modified UTF-7 may also be usable in servers that implement an - earlier version of this protocol. - - In modified UTF-7, printable US-ASCII characters, except for "&", - represent themselves; that is, characters with octet values 0x20-0x25 - and 0x27-0x7e. The character "&" (0x26) is represented by the - two-octet sequence "&-". 
- - All other characters (octet values 0x00-0x1f and 0x7f-0xff) are - represented in modified BASE64, with a further modification from - [UTF-7] that "," is used instead of "/". Modified BASE64 MUST NOT be - used to represent any printing US-ASCII character which can represent - itself. - - - - - - -Crispin Standards Track [Page 19] - -RFC 3501 IMAPv4 March 2003 - - - "&" is used to shift to modified BASE64 and "-" to shift back to - US-ASCII. There is no implicit shift from BASE64 to US-ASCII, and - null shifts ("-&" while in BASE64; note that "&-" while in US-ASCII - means "&") are not permitted. However, all names start in US-ASCII, - and MUST end in US-ASCII; that is, a name that ends with a non-ASCII - ISO-10646 character MUST end with a "-"). - - The purpose of these modifications is to correct the following - problems with UTF-7: - - 1) UTF-7 uses the "+" character for shifting; this conflicts with - the common use of "+" in mailbox names, in particular USENET - newsgroup names. - - 2) UTF-7's encoding is BASE64 which uses the "/" character; this - conflicts with the use of "/" as a popular hierarchy delimiter. - - 3) UTF-7 prohibits the unencoded usage of "\"; this conflicts with - the use of "\" as a popular hierarchy delimiter. - - 4) UTF-7 prohibits the unencoded usage of "~"; this conflicts with - the use of "~" in some servers as a home directory indicator. - - 5) UTF-7 permits multiple alternate forms to represent the same - string; in particular, printable US-ASCII characters can be - represented in encoded form. - - Although modified UTF-7 is a convention, it establishes certain - requirements on server handling of any mailbox name with an - embedded "&" character. In particular, server implementations - MUST preserve the exact form of the modified BASE64 portion of a - modified UTF-7 name and treat that text as case-sensitive, even if - names are otherwise case-insensitive or case-folded. 
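The modified UTF-7 rules above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation. The function names `encode_mailbox` and `decode_mailbox` are hypothetical, the base64 payload is taken to be UTF-16BE code units as in [UTF-7], and the decoder assumes its input is already valid modified UTF-7 (it does not reject superfluous shifts).

```python
import base64


def _shift_block(text):
    """Encode a run of non-printable-ASCII characters as &<modified BASE64>-."""
    b64 = base64.b64encode(text.encode("utf-16-be")).decode("ascii")
    # Modified BASE64: no "=" padding, and "," in place of "/".
    return "&" + b64.rstrip("=").replace("/", ",") + "-"


def encode_mailbox(name):
    out, pending = [], []
    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:          # printable US-ASCII
            if pending:                       # close the open BASE64 run
                out.append(_shift_block("".join(pending)))
                pending = []
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)                # collect into a single run
    if pending:                               # names MUST end in US-ASCII
        out.append(_shift_block("".join(pending)))
    return "".join(out)


def decode_mailbox(encoded):
    out, i = [], 0
    while i < len(encoded):
        if encoded[i] != "&":
            out.append(encoded[i])
            i += 1
            continue
        end = encoded.index("-", i)           # ValueError if never shifted back
        chunk = encoded[i + 1:end]
        if chunk == "":
            out.append("&")                   # "&-" means a literal "&"
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)      # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


print(encode_mailbox("~peter/mail/台北/日本語"))
# ~peter/mail/&U,BTFw-/&ZeVnLIqe-
```

Collecting adjacent non-ASCII characters into one run is what avoids superfluous shifts: "台北日本語" encodes to "&U,BTF2XlZyyKng-" rather than to two separate blocks.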
- - Server implementations SHOULD verify that any mailbox name with an - embedded "&" character, used as an argument to CREATE, is: in the - correctly modified UTF-7 syntax, has no superfluous shifts, and - has no encoding in modified BASE64 of any printing US-ASCII - character which can represent itself. However, client - implementations MUST NOT depend upon the server doing this, and - SHOULD NOT attempt to create a mailbox name with an embedded "&" - character unless it complies with the modified UTF-7 syntax. - - Server implementations which export a mail store that does not - follow the modified UTF-7 convention MUST convert to modified - UTF-7 any mailbox name that contains either non-ASCII characters - or the "&" character. - - - - -Crispin Standards Track [Page 20] - -RFC 3501 IMAPv4 March 2003 - - - For example, here is a mailbox name which mixes English, - Chinese, and Japanese text: - ~peter/mail/&U,BTFw-/&ZeVnLIqe- - - For example, the string "&Jjo!" is not a valid mailbox - name because it does not contain a shift to US-ASCII - before the "!". The correct form is "&Jjo-!". The - string "&U,BTFw-&ZeVnLIqe-" is not permitted because it - contains a superfluous shift. The correct form is - "&U,BTF2XlZyyKng-". - -5.2. Mailbox Size and Message Status Updates - - At any time, a server can send data that the client did not request. - Sometimes, such behavior is REQUIRED. For example, agents other than - the server MAY add messages to the mailbox (e.g., new message - delivery), change the flags of the messages in the mailbox (e.g., - simultaneous access to the same mailbox by multiple agents), or even - remove messages from the mailbox. A server MUST send mailbox size - updates automatically if a mailbox size change is observed during the - processing of a command. A server SHOULD send message flag updates - automatically, without requiring the client to request such updates - explicitly. 
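Because the server may send size updates during any command, a client needs a place to record them. The Python sketch below shows one way to do that, under stated assumptions: the class name `MailboxSize` and its `feed` method are hypothetical, lines are assumed to arrive already split on CRLF, and each untagged EXPUNGE is taken to reduce the message count by one, as RFC 3501 specifies for the EXPUNGE response.

```python
class MailboxSize:
    """Track the message count of the selected mailbox from untagged data."""

    def __init__(self):
        self.exists = None  # unknown until the first EXISTS response

    def feed(self, line):
        """Update the count from one server response line; ignore the rest."""
        parts = line.split()
        if len(parts) != 3 or parts[0] != "*" or not parts[1].isdigit():
            return
        kind = parts[2].upper()
        if kind == "EXISTS":
            self.exists = int(parts[1])
        elif kind == "EXPUNGE" and self.exists is not None:
            self.exists -= 1  # one message removed per EXPUNGE


size = MailboxSize()
for line in ["* 24 EXISTS", "* 22 EXPUNGE", "* 3 RECENT",
             "* 14 FETCH (FLAGS (\\Seen \\Deleted))", "a047 OK NOOP completed"]:
    size.feed(line)
print(size.exists)  # 23
```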
- - Special rules exist for server notification of a client about the - removal of messages to prevent synchronization errors; see the - description of the EXPUNGE response for more detail. In particular, - it is NOT permitted to send an EXISTS response that would reduce the - number of messages in the mailbox; only the EXPUNGE response can do - this. - - Regardless of what implementation decisions a client makes on - remembering data from the server, a client implementation MUST record - mailbox size updates. It MUST NOT assume that any command after the - initial mailbox selection will return the size of the mailbox. - -5.3. Response when no Command in Progress - - Server implementations are permitted to send an untagged response - (except for EXPUNGE) while there is no command in progress. Server - implementations that send such responses MUST deal with flow control - considerations. Specifically, they MUST either (1) verify that the - size of the data does not exceed the underlying transport's available - window size, or (2) use non-blocking writes. - - - - - - - -Crispin Standards Track [Page 21] - -RFC 3501 IMAPv4 March 2003 - - -5.4. Autologout Timer - - If a server has an inactivity autologout timer, the duration of that - timer MUST be at least 30 minutes. The receipt of ANY command from - the client during that interval SHOULD suffice to reset the - autologout timer. - -5.5. Multiple Commands in Progress - - The client MAY send another command without waiting for the - completion result response of a command, subject to ambiguity rules - (see below) and flow control constraints on the underlying data - stream. Similarly, a server MAY begin processing another command - before processing the current command to completion, subject to - ambiguity rules. However, any command continuation request responses - and command continuations MUST be negotiated before any subsequent - command is initiated. 
- - The exception is if an ambiguity would result because of a command - that would affect the results of other commands. Clients MUST NOT - send multiple commands without waiting if an ambiguity would result. - If the server detects a possible ambiguity, it MUST execute commands - to completion in the order given by the client. - - The most obvious example of ambiguity is when a command would affect - the results of another command, e.g., a FETCH of a message's flags - and a STORE of that same message's flags. - - A non-obvious ambiguity occurs with commands that permit an untagged - EXPUNGE response (commands other than FETCH, STORE, and SEARCH), - since an untagged EXPUNGE response can invalidate sequence numbers in - a subsequent command. This is not a problem for FETCH, STORE, or - SEARCH commands because servers are prohibited from sending EXPUNGE - responses while any of those commands are in progress. Therefore, if - the client sends any command other than FETCH, STORE, or SEARCH, it - MUST wait for the completion result response before sending a command - with message sequence numbers. - - Note: UID FETCH, UID STORE, and UID SEARCH are different - commands from FETCH, STORE, and SEARCH. If the client - sends a UID command, it must wait for a completion result - response before sending a command with message sequence - numbers. - - - - - - - - -Crispin Standards Track [Page 22] - -RFC 3501 IMAPv4 March 2003 - - - For example, the following non-waiting command sequences are invalid: - - FETCH + NOOP + STORE - STORE + COPY + FETCH - COPY + COPY - CHECK + FETCH - - The following are examples of valid non-waiting command sequences: - - FETCH + STORE + SEARCH + CHECK - STORE + COPY + EXPUNGE - - UID SEARCH + UID SEARCH may be valid or invalid as a non-waiting - command sequence, depending upon whether or not the second UID - SEARCH contains message sequence numbers. - -6. Client Commands - - IMAP4rev1 commands are described in this section. 
Commands are - organized by the state in which the command is permitted. Commands - which are permitted in multiple states are listed in the minimum - permitted state (for example, commands valid in authenticated and - selected state are listed in the authenticated state commands). - - Command arguments, identified by "Arguments:" in the command - descriptions below, are described by function, not by syntax. The - precise syntax of command arguments is described in the Formal Syntax - section. - - Some commands cause specific server responses to be returned; these - are identified by "Responses:" in the command descriptions below. - See the response descriptions in the Responses section for - information on these responses, and the Formal Syntax section for the - precise syntax of these responses. It is possible for server data to - be transmitted as a result of any command. Thus, commands that do - not specifically require server data specify "no specific responses - for this command" instead of "none". - - The "Result:" in the command description refers to the possible - tagged status responses to a command, and any special interpretation - of these status responses. - - The state of a connection is only changed by successful commands - which are documented as changing state. A rejected command (BAD - response) never changes the state of the connection or of the - selected mailbox. A failed command (NO response) generally does not - change the state of the connection or of the selected mailbox; the - exception being the SELECT and EXAMINE commands. - - - -Crispin Standards Track [Page 23] - -RFC 3501 IMAPv4 March 2003 - - -6.1. Client Commands - Any State - - The following commands are valid in any state: CAPABILITY, NOOP, and - LOGOUT. - -6.1.1. 
CAPABILITY Command - - Arguments: none - - Responses: REQUIRED untagged response: CAPABILITY - - Result: OK - capability completed - BAD - command unknown or arguments invalid - - The CAPABILITY command requests a listing of capabilities that the - server supports. The server MUST send a single untagged - CAPABILITY response with "IMAP4rev1" as one of the listed - capabilities before the (tagged) OK response. - - A capability name which begins with "AUTH=" indicates that the - server supports that particular authentication mechanism. All - such names are, by definition, part of this specification. For - example, the authorization capability for an experimental - "blurdybloop" authenticator would be "AUTH=XBLURDYBLOOP" and not - "XAUTH=BLURDYBLOOP" or "XAUTH=XBLURDYBLOOP". - - Other capability names refer to extensions, revisions, or - amendments to this specification. See the documentation of the - CAPABILITY response for additional information. No capabilities, - beyond the base IMAP4rev1 set defined in this specification, are - enabled without explicit client action to invoke the capability. - - Client and server implementations MUST implement the STARTTLS, - LOGINDISABLED, and AUTH=PLAIN (described in [IMAP-TLS]) - capabilities. See the Security Considerations section for - important information. - - See the section entitled "Client Commands - - Experimental/Expansion" for information about the form of site or - implementation-specific capabilities. - - - - - - - - - - - -Crispin Standards Track [Page 24] - -RFC 3501 IMAPv4 March 2003 - - - Example: C: abcd CAPABILITY - S: * CAPABILITY IMAP4rev1 STARTTLS AUTH=GSSAPI - LOGINDISABLED - S: abcd OK CAPABILITY completed - C: efgh STARTTLS - S: efgh OK STARTLS completed - <TLS negotiation, further commands are under [TLS] layer> - C: ijkl CAPABILITY - S: * CAPABILITY IMAP4rev1 AUTH=GSSAPI AUTH=PLAIN - S: ijkl OK CAPABILITY completed - - -6.1.2. 
NOOP Command - - Arguments: none - - Responses: no specific responses for this command (but see below) - - Result: OK - noop completed - BAD - command unknown or arguments invalid - - The NOOP command always succeeds. It does nothing. - - Since any command can return a status update as untagged data, the - NOOP command can be used as a periodic poll for new messages or - message status updates during a period of inactivity (this is the - preferred method to do this). The NOOP command can also be used - to reset any inactivity autologout timer on the server. - - Example: C: a002 NOOP - S: a002 OK NOOP completed - . . . - C: a047 NOOP - S: * 22 EXPUNGE - S: * 23 EXISTS - S: * 3 RECENT - S: * 14 FETCH (FLAGS (\Seen \Deleted)) - S: a047 OK NOOP completed - - - - - - - - - - - - - -Crispin Standards Track [Page 25] - -RFC 3501 IMAPv4 March 2003 - - -6.1.3. LOGOUT Command - - Arguments: none - - Responses: REQUIRED untagged response: BYE - - Result: OK - logout completed - BAD - command unknown or arguments invalid - - The LOGOUT command informs the server that the client is done with - the connection. The server MUST send a BYE untagged response - before the (tagged) OK response, and then close the network - connection. - - Example: C: A023 LOGOUT - S: * BYE IMAP4rev1 Server logging out - S: A023 OK LOGOUT completed - (Server and client then close the connection) - -6.2. Client Commands - Not Authenticated State - - In the not authenticated state, the AUTHENTICATE or LOGIN command - establishes authentication and enters the authenticated state. The - AUTHENTICATE command provides a general mechanism for a variety of - authentication techniques, privacy protection, and integrity - checking; whereas the LOGIN command uses a traditional user name and - plaintext password pair and has no means of establishing privacy - protection or integrity checking. 
- - The STARTTLS command is an alternate form of establishing session - privacy protection and integrity checking, but does not establish - authentication or enter the authenticated state. - - Server implementations MAY allow access to certain mailboxes without - establishing authentication. This can be done by means of the - ANONYMOUS [SASL] authenticator described in [ANONYMOUS]. An older - convention is a LOGIN command using the userid "anonymous"; in this - case, a password is required although the server may choose to accept - any password. The restrictions placed on anonymous users are - implementation-dependent. - - Once authenticated (including as anonymous), it is not possible to - re-enter not authenticated state. - - - - - - - - -Crispin Standards Track [Page 26] - -RFC 3501 IMAPv4 March 2003 - - - In addition to the universal commands (CAPABILITY, NOOP, and LOGOUT), - the following commands are valid in the not authenticated state: - STARTTLS, AUTHENTICATE and LOGIN. See the Security Considerations - section for important information about these commands. - -6.2.1. STARTTLS Command - - Arguments: none - - Responses: no specific response for this command - - Result: OK - starttls completed, begin TLS negotiation - BAD - command unknown or arguments invalid - - A [TLS] negotiation begins immediately after the CRLF at the end - of the tagged OK response from the server. Once a client issues a - STARTTLS command, it MUST NOT issue further commands until a - server response is seen and the [TLS] negotiation is complete. - - The server remains in the non-authenticated state, even if client - credentials are supplied during the [TLS] negotiation. This does - not preclude an authentication mechanism such as EXTERNAL (defined - in [SASL]) from using client identity determined by the [TLS] - negotiation. - - Once [TLS] has been started, the client MUST discard cached - information about server capabilities and SHOULD re-issue the - CAPABILITY command. 
This is necessary to protect against man-in- - the-middle attacks which alter the capabilities list prior to - STARTTLS. The server MAY advertise different capabilities after - STARTTLS. - - Example: C: a001 CAPABILITY - S: * CAPABILITY IMAP4rev1 STARTTLS LOGINDISABLED - S: a001 OK CAPABILITY completed - C: a002 STARTTLS - S: a002 OK Begin TLS negotiation now - <TLS negotiation, further commands are under [TLS] layer> - C: a003 CAPABILITY - S: * CAPABILITY IMAP4rev1 AUTH=PLAIN - S: a003 OK CAPABILITY completed - C: a004 LOGIN joe password - S: a004 OK LOGIN completed - - - - - - - - -Crispin Standards Track [Page 27] - -RFC 3501 IMAPv4 March 2003 - - -6.2.2. AUTHENTICATE Command - - Arguments: authentication mechanism name - - Responses: continuation data can be requested - - Result: OK - authenticate completed, now in authenticated state - NO - authenticate failure: unsupported authentication - mechanism, credentials rejected - BAD - command unknown or arguments invalid, - authentication exchange cancelled - - The AUTHENTICATE command indicates a [SASL] authentication - mechanism to the server. If the server supports the requested - authentication mechanism, it performs an authentication protocol - exchange to authenticate and identify the client. It MAY also - negotiate an OPTIONAL security layer for subsequent protocol - interactions. If the requested authentication mechanism is not - supported, the server SHOULD reject the AUTHENTICATE command by - sending a tagged NO response. - - The AUTHENTICATE command does not support the optional "initial - response" feature of [SASL]. Section 5.1 of [SASL] specifies how - to handle an authentication mechanism which uses an initial - response. - - The service name specified by this protocol's profile of [SASL] is - "imap". - - The authentication protocol exchange consists of a series of - server challenges and client responses that are specific to the - authentication mechanism. 
A server challenge consists of a - command continuation request response with the "+" token followed - by a BASE64 encoded string. The client response consists of a - single line consisting of a BASE64 encoded string. If the client - wishes to cancel an authentication exchange, it issues a line - consisting of a single "*". If the server receives such a - response, it MUST reject the AUTHENTICATE command by sending a - tagged BAD response. - - If a security layer is negotiated through the [SASL] - authentication exchange, it takes effect immediately following the - CRLF that concludes the authentication exchange for the client, - and the CRLF of the tagged OK response for the server. - - While client and server implementations MUST implement the - AUTHENTICATE command itself, it is not required to implement any - authentication mechanisms other than the PLAIN mechanism described - - - -Crispin Standards Track [Page 28] - -RFC 3501 IMAPv4 March 2003 - - - in [IMAP-TLS]. Also, an authentication mechanism is not required - to support any security layers. - - Note: a server implementation MUST implement a - configuration in which it does NOT permit any plaintext - password mechanisms, unless either the STARTTLS command - has been negotiated or some other mechanism that - protects the session from password snooping has been - provided. Server sites SHOULD NOT use any configuration - which permits a plaintext password mechanism without - such a protection mechanism against password snooping. - Client and server implementations SHOULD implement - additional [SASL] mechanisms that do not use plaintext - passwords, such the GSSAPI mechanism described in [SASL] - and/or the [DIGEST-MD5] mechanism. - - Servers and clients can support multiple authentication - mechanisms. The server SHOULD list its supported authentication - mechanisms in the response to the CAPABILITY command so that the - client knows which authentication mechanisms to use. 
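A client can pick an authentication mechanism by inspecting the untagged CAPABILITY response. The Python sketch below is an illustration only: `parse_capability` and the keys of the dictionary it returns are hypothetical names, and capability names are upper-cased on the assumption that they compare case-insensitively.

```python
def parse_capability(line):
    """Extract the capability list from an untagged CAPABILITY response."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "*" or parts[1].upper() != "CAPABILITY":
        raise ValueError("not an untagged CAPABILITY response: %r" % line)
    caps = [p.upper() for p in parts[2:]]
    return {
        "capabilities": set(caps),
        # Names beginning with "AUTH=" advertise authentication mechanisms.
        "auth_mechanisms": [c[len("AUTH="):] for c in caps
                            if c.startswith("AUTH=")],
        "starttls": "STARTTLS" in caps,
        # A client MUST NOT send LOGIN while LOGINDISABLED is advertised.
        "login_allowed": "LOGINDISABLED" not in caps,
    }


info = parse_capability(
    "* CAPABILITY IMAP4rev1 STARTTLS AUTH=GSSAPI LOGINDISABLED")
print(info["auth_mechanisms"])  # ['GSSAPI']
print(info["login_allowed"])    # False
```

Since capabilities may differ after STARTTLS, the result should be discarded and the response parsed again once the TLS negotiation has completed.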
- - A server MAY include a CAPABILITY response code in the tagged OK - response of a successful AUTHENTICATE command in order to send - capabilities automatically. It is unnecessary for a client to - send a separate CAPABILITY command if it recognizes these - automatic capabilities. This should only be done if a security - layer was not negotiated by the AUTHENTICATE command, because the - tagged OK response as part of an AUTHENTICATE command is not - protected by encryption/integrity checking. [SASL] requires the - client to re-issue a CAPABILITY command in this case. - - If an AUTHENTICATE command fails with a NO response, the client - MAY try another authentication mechanism by issuing another - AUTHENTICATE command. It MAY also attempt to authenticate by - using the LOGIN command (see section 6.2.3 for more detail). In - other words, the client MAY request authentication types in - decreasing order of preference, with the LOGIN command as a last - resort. - - The authorization identity passed from the client to the server - during the authentication exchange is interpreted by the server as - the user name whose privileges the client is requesting. 
- - - - - - - - - -Crispin Standards Track [Page 29] - -RFC 3501 IMAPv4 March 2003 - - - Example: S: * OK IMAP4rev1 Server - C: A001 AUTHENTICATE GSSAPI - S: + - C: YIIB+wYJKoZIhvcSAQICAQBuggHqMIIB5qADAgEFoQMCAQ6iBw - MFACAAAACjggEmYYIBIjCCAR6gAwIBBaESGxB1Lndhc2hpbmd0 - b24uZWR1oi0wK6ADAgEDoSQwIhsEaW1hcBsac2hpdmFtcy5jYW - Mud2FzaGluZ3Rvbi5lZHWjgdMwgdCgAwIBAaEDAgEDooHDBIHA - cS1GSa5b+fXnPZNmXB9SjL8Ollj2SKyb+3S0iXMljen/jNkpJX - AleKTz6BQPzj8duz8EtoOuNfKgweViyn/9B9bccy1uuAE2HI0y - C/PHXNNU9ZrBziJ8Lm0tTNc98kUpjXnHZhsMcz5Mx2GR6dGknb - I0iaGcRerMUsWOuBmKKKRmVMMdR9T3EZdpqsBd7jZCNMWotjhi - vd5zovQlFqQ2Wjc2+y46vKP/iXxWIuQJuDiisyXF0Y8+5GTpAL - pHDc1/pIGmMIGjoAMCAQGigZsEgZg2on5mSuxoDHEA1w9bcW9n - FdFxDKpdrQhVGVRDIzcCMCTzvUboqb5KjY1NJKJsfjRQiBYBdE - NKfzK+g5DlV8nrw81uOcP8NOQCLR5XkoMHC0Dr/80ziQzbNqhx - O6652Npft0LQwJvenwDI13YxpwOdMXzkWZN/XrEqOWp6GCgXTB - vCyLWLlWnbaUkZdEYbKHBPjd8t/1x5Yg== - S: + YGgGCSqGSIb3EgECAgIAb1kwV6ADAgEFoQMCAQ+iSzBJoAMC - AQGiQgRAtHTEuOP2BXb9sBYFR4SJlDZxmg39IxmRBOhXRKdDA0 - uHTCOT9Bq3OsUTXUlk0CsFLoa8j+gvGDlgHuqzWHPSQg== - C: - S: + YDMGCSqGSIb3EgECAgIBAAD/////6jcyG4GE3KkTzBeBiVHe - ceP2CWY0SR0fAQAgAAQEBAQ= - C: YDMGCSqGSIb3EgECAgIBAAD/////3LQBHXTpFfZgrejpLlLImP - wkhbfa2QteAQAgAG1yYwE= - S: A001 OK GSSAPI authentication successful - - Note: The line breaks within server challenges and client - responses are for editorial clarity and are not in real - authenticators. - - -6.2.3. LOGIN Command - - Arguments: user name - password - - Responses: no specific responses for this command - - Result: OK - login completed, now in authenticated state - NO - login failure: user name or password rejected - BAD - command unknown or arguments invalid - - The LOGIN command identifies the client to the server and carries - the plaintext password authenticating this user. 
- - - - - - -Crispin Standards Track [Page 30] - -RFC 3501 IMAPv4 March 2003 - - - A server MAY include a CAPABILITY response code in the tagged OK - response to a successful LOGIN command in order to send - capabilities automatically. It is unnecessary for a client to - send a separate CAPABILITY command if it recognizes these - automatic capabilities. - - Example: C: a001 LOGIN SMITH SESAME - S: a001 OK LOGIN completed - - Note: Use of the LOGIN command over an insecure network - (such as the Internet) is a security risk, because anyone - monitoring network traffic can obtain plaintext passwords. - The LOGIN command SHOULD NOT be used except as a last - resort, and it is recommended that client implementations - have a means to disable any automatic use of the LOGIN - command. - - Unless either the STARTTLS command has been negotiated or - some other mechanism that protects the session from - password snooping has been provided, a server - implementation MUST implement a configuration in which it - advertises the LOGINDISABLED capability and does NOT permit - the LOGIN command. Server sites SHOULD NOT use any - configuration which permits the LOGIN command without such - a protection mechanism against password snooping. A client - implementation MUST NOT send a LOGIN command if the - LOGINDISABLED capability is advertised. - -6.3. Client Commands - Authenticated State - - In the authenticated state, commands that manipulate mailboxes as - atomic entities are permitted. Of these commands, the SELECT and - EXAMINE commands will select a mailbox for access and enter the - selected state. - - In addition to the universal commands (CAPABILITY, NOOP, and LOGOUT), - the following commands are valid in the authenticated state: SELECT, - EXAMINE, CREATE, DELETE, RENAME, SUBSCRIBE, UNSUBSCRIBE, LIST, LSUB, - STATUS, and APPEND. - - - - - - - - - - - - -Crispin Standards Track [Page 31] - -RFC 3501 IMAPv4 March 2003 - - -6.3.1. 
SELECT Command - - Arguments: mailbox name - - Responses: REQUIRED untagged responses: FLAGS, EXISTS, RECENT - REQUIRED OK untagged responses: UNSEEN, PERMANENTFLAGS, - UIDNEXT, UIDVALIDITY - - Result: OK - select completed, now in selected state - NO - select failure, now in authenticated state: no - such mailbox, can't access mailbox - BAD - command unknown or arguments invalid - - The SELECT command selects a mailbox so that messages in the - mailbox can be accessed. Before returning an OK to the client, - the server MUST send the following untagged data to the client. - Note that earlier versions of this protocol only required the - FLAGS, EXISTS, and RECENT untagged data; consequently, client - implementations SHOULD implement default behavior for missing data - as discussed with the individual item. - - FLAGS Defined flags in the mailbox. See the description - of the FLAGS response for more detail. - - <n> EXISTS The number of messages in the mailbox. See the - description of the EXISTS response for more detail. - - <n> RECENT The number of messages with the \Recent flag set. - See the description of the RECENT response for more - detail. - - OK [UNSEEN <n>] - The message sequence number of the first unseen - message in the mailbox. If this is missing, the - client can not make any assumptions about the first - unseen message in the mailbox, and needs to issue a - SEARCH command if it wants to find it. - - OK [PERMANENTFLAGS (<list of flags>)] - A list of message flags that the client can change - permanently. If this is missing, the client should - assume that all flags can be changed permanently. - - OK [UIDNEXT <n>] - The next unique identifier value. Refer to section - 2.3.1.1 for more information. If this is missing, - the client can not make any assumptions about the - next unique identifier value. - - - -Crispin Standards Track [Page 32] - -RFC 3501 IMAPv4 March 2003 - - - OK [UIDVALIDITY <n>] - The unique identifier validity value. 
Refer to - section 2.3.1.1 for more information. If this is - missing, the server does not support unique - identifiers. - - Only one mailbox can be selected at a time in a connection; - simultaneous access to multiple mailboxes requires multiple - connections. The SELECT command automatically deselects any - currently selected mailbox before attempting the new selection. - Consequently, if a mailbox is selected and a SELECT command that - fails is attempted, no mailbox is selected. - - If the client is permitted to modify the mailbox, the server - SHOULD prefix the text of the tagged OK response with the - "[READ-WRITE]" response code. - - If the client is not permitted to modify the mailbox but is - permitted read access, the mailbox is selected as read-only, and - the server MUST prefix the text of the tagged OK response to - SELECT with the "[READ-ONLY]" response code. Read-only access - through SELECT differs from the EXAMINE command in that certain - read-only mailboxes MAY permit the change of permanent state on a - per-user (as opposed to global) basis. Netnews messages marked in - a server-based .newsrc file are an example of such per-user - permanent state that can be modified with read-only mailboxes. - - Example: C: A142 SELECT INBOX - S: * 172 EXISTS - S: * 1 RECENT - S: * OK [UNSEEN 12] Message 12 is first unseen - S: * OK [UIDVALIDITY 3857529045] UIDs valid - S: * OK [UIDNEXT 4392] Predicted next UID - S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft) - S: * OK [PERMANENTFLAGS (\Deleted \Seen \*)] Limited - S: A142 OK [READ-WRITE] SELECT completed - - - - - - - - - - - - - - - -Crispin Standards Track [Page 33] - -RFC 3501 IMAPv4 March 2003 - - -6.3.2. 
EXAMINE Command - - Arguments: mailbox name - - Responses: REQUIRED untagged responses: FLAGS, EXISTS, RECENT - REQUIRED OK untagged responses: UNSEEN, PERMANENTFLAGS, - UIDNEXT, UIDVALIDITY - - Result: OK - examine completed, now in selected state - NO - examine failure, now in authenticated state: no - such mailbox, can't access mailbox - BAD - command unknown or arguments invalid - - The EXAMINE command is identical to SELECT and returns the same - output; however, the selected mailbox is identified as read-only. - No changes to the permanent state of the mailbox, including - per-user state, are permitted; in particular, EXAMINE MUST NOT - cause messages to lose the \Recent flag. - - The text of the tagged OK response to the EXAMINE command MUST - begin with the "[READ-ONLY]" response code. - - Example: C: A932 EXAMINE blurdybloop - S: * 17 EXISTS - S: * 2 RECENT - S: * OK [UNSEEN 8] Message 8 is first unseen - S: * OK [UIDVALIDITY 3857529045] UIDs valid - S: * OK [UIDNEXT 4392] Predicted next UID - S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft) - S: * OK [PERMANENTFLAGS ()] No permanent flags permitted - S: A932 OK [READ-ONLY] EXAMINE completed - - -6.3.3. CREATE Command - - Arguments: mailbox name - - Responses: no specific responses for this command - - Result: OK - create completed - NO - create failure: can't create mailbox with that name - BAD - command unknown or arguments invalid - - The CREATE command creates a mailbox with the given name. An OK - response is returned only if a new mailbox with that name has been - created. It is an error to attempt to create INBOX or a mailbox - with a name that refers to an extant mailbox. Any error in - creation will return a tagged NO response. 
- - - -Crispin Standards Track [Page 34] - -RFC 3501 IMAPv4 March 2003 - - - If the mailbox name is suffixed with the server's hierarchy - separator character (as returned from the server by a LIST - command), this is a declaration that the client intends to create - mailbox names under this name in the hierarchy. Server - implementations that do not require this declaration MUST ignore - the declaration. In any case, the name created is without the - trailing hierarchy delimiter. - - If the server's hierarchy separator character appears elsewhere in - the name, the server SHOULD create any superior hierarchical names - that are needed for the CREATE command to be successfully - completed. In other words, an attempt to create "foo/bar/zap" on - a server in which "/" is the hierarchy separator character SHOULD - create foo/ and foo/bar/ if they do not already exist. - - If a new mailbox is created with the same name as a mailbox which - was deleted, its unique identifiers MUST be greater than any - unique identifiers used in the previous incarnation of the mailbox - UNLESS the new incarnation has a different unique identifier - validity value. See the description of the UID command for more - detail. - - Example: C: A003 CREATE owatagusiam/ - S: A003 OK CREATE completed - C: A004 CREATE owatagusiam/blurdybloop - S: A004 OK CREATE completed - - Note: The interpretation of this example depends on whether - "/" was returned as the hierarchy separator from LIST. If - "/" is the hierarchy separator, a new level of hierarchy - named "owatagusiam" with a member called "blurdybloop" is - created. Otherwise, two mailboxes at the same hierarchy - level are created. - - -6.3.4. 
DELETE Command - - Arguments: mailbox name - - Responses: no specific responses for this command - - Result: OK - delete completed - NO - delete failure: can't delete mailbox with that name - BAD - command unknown or arguments invalid - - - - - - - -Crispin Standards Track [Page 35] - -RFC 3501 IMAPv4 March 2003 - - - The DELETE command permanently removes the mailbox with the given - name. A tagged OK response is returned only if the mailbox has - been deleted. It is an error to attempt to delete INBOX or a - mailbox name that does not exist. - - The DELETE command MUST NOT remove inferior hierarchical names. - For example, if a mailbox "foo" has an inferior "foo.bar" - (assuming "." is the hierarchy delimiter character), removing - "foo" MUST NOT remove "foo.bar". It is an error to attempt to - delete a name that has inferior hierarchical names and also has - the \Noselect mailbox name attribute (see the description of the - LIST response for more details). - - It is permitted to delete a name that has inferior hierarchical - names and does not have the \Noselect mailbox name attribute. In - this case, all messages in that mailbox are removed, and the name - will acquire the \Noselect mailbox name attribute. - - The value of the highest-used unique identifier of the deleted - mailbox MUST be preserved so that a new mailbox created with the - same name will not reuse the identifiers of the former - incarnation, UNLESS the new incarnation has a different unique - identifier validity value. See the description of the UID command - for more detail. 
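The DELETE rules above can be summarized as a small decision procedure. The following Python sketch runs them against a toy in-memory model; the dictionary layout and the function name delete_mailbox are hypothetical, and the preservation of the highest-used unique identifier is deliberately not modeled.

```python
def delete_mailbox(mailboxes, name, delimiter):
    """Apply the DELETE rules to a toy in-memory model.

    mailboxes maps each name to a dict with a "noselect" flag and a
    "messages" list. Returns "OK" or "NO"; mutates mailboxes on OK.
    """
    if name.upper() == "INBOX" or name not in mailboxes:
        return "NO"  # deleting INBOX or a missing name is an error
    has_inferiors = any(
        other.startswith(name + delimiter) for other in mailboxes
    )
    if not has_inferiors:
        del mailboxes[name]
        return "OK"
    if mailboxes[name]["noselect"]:
        return "NO"  # inferiors plus \Noselect: refuse
    # Inferiors but selectable: drop the messages, keep the name as
    # a \Noselect placeholder, and leave the inferiors untouched.
    mailboxes[name]["messages"] = []
    mailboxes[name]["noselect"] = True
    return "OK"
```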
- - Examples: C: A682 LIST "" * - S: * LIST () "/" blurdybloop - S: * LIST (\Noselect) "/" foo - S: * LIST () "/" foo/bar - S: A682 OK LIST completed - C: A683 DELETE blurdybloop - S: A683 OK DELETE completed - C: A684 DELETE foo - S: A684 NO Name "foo" has inferior hierarchical names - C: A685 DELETE foo/bar - S: A685 OK DELETE Completed - C: A686 LIST "" * - S: * LIST (\Noselect) "/" foo - S: A686 OK LIST completed - C: A687 DELETE foo - S: A687 OK DELETE Completed - - - - - - - - - - -Crispin Standards Track [Page 36] - -RFC 3501 IMAPv4 March 2003 - - - C: A82 LIST "" * - S: * LIST () "." blurdybloop - S: * LIST () "." foo - S: * LIST () "." foo.bar - S: A82 OK LIST completed - C: A83 DELETE blurdybloop - S: A83 OK DELETE completed - C: A84 DELETE foo - S: A84 OK DELETE Completed - C: A85 LIST "" * - S: * LIST () "." foo.bar - S: A85 OK LIST completed - C: A86 LIST "" % - S: * LIST (\Noselect) "." foo - S: A86 OK LIST completed - - -6.3.5. RENAME Command - - Arguments: existing mailbox name - new mailbox name - - Responses: no specific responses for this command - - Result: OK - rename completed - NO - rename failure: can't rename mailbox with that name, - can't rename to mailbox with that name - BAD - command unknown or arguments invalid - - The RENAME command changes the name of a mailbox. A tagged OK - response is returned only if the mailbox has been renamed. It is - an error to attempt to rename from a mailbox name that does not - exist or to a mailbox name that already exists. Any error in - renaming will return a tagged NO response. - - If the name has inferior hierarchical names, then the inferior - hierarchical names MUST also be renamed. For example, a rename of - "foo" to "zap" will rename "foo/bar" (assuming "/" is the - hierarchy delimiter character) to "zap/bar". 
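The renaming of inferior hierarchical names can be illustrated with a short Python sketch. It works on a plain list of names; the function name rename_mailbox is hypothetical, and neither the special handling of INBOX nor the creation of missing superior names is modeled.

```python
def rename_mailbox(names, old, new, delimiter):
    """Return the name list after RENAME old new, or None on error."""
    if old not in names or new in names:
        return None  # tagged NO: source missing or target exists
    prefix = old + delimiter
    renamed = []
    for name in names:
        if name == old:
            renamed.append(new)
        elif name.startswith(prefix):
            # Inferior name: swap the leading component only.
            renamed.append(new + delimiter + name[len(prefix):])
        else:
            renamed.append(name)
    return renamed
```

Matching on old plus the delimiter, rather than on old alone, keeps an unrelated name such as "foobar" from being treated as an inferior of "foo".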
- - If the server's hierarchy separator character appears in the name, - the server SHOULD create any superior hierarchical names that are - needed for the RENAME command to complete successfully. In other - words, an attempt to rename "foo/bar/zap" to baz/rag/zowie on a - server in which "/" is the hierarchy separator character SHOULD - create baz/ and baz/rag/ if they do not already exist. - - - - - -Crispin Standards Track [Page 37] - -RFC 3501 IMAPv4 March 2003 - - - The value of the highest-used unique identifier of the old mailbox - name MUST be preserved so that a new mailbox created with the same - name will not reuse the identifiers of the former incarnation, - UNLESS the new incarnation has a different unique identifier - validity value. See the description of the UID command for more - detail. - - Renaming INBOX is permitted, and has special behavior. It moves - all messages in INBOX to a new mailbox with the given name, - leaving INBOX empty. If the server implementation supports - inferior hierarchical names of INBOX, these are unaffected by a - rename of INBOX. - - Examples: C: A682 LIST "" * - S: * LIST () "/" blurdybloop - S: * LIST (\Noselect) "/" foo - S: * LIST () "/" foo/bar - S: A682 OK LIST completed - C: A683 RENAME blurdybloop sarasoop - S: A683 OK RENAME completed - C: A684 RENAME foo zowie - S: A684 OK RENAME Completed - C: A685 LIST "" * - S: * LIST () "/" sarasoop - S: * LIST (\Noselect) "/" zowie - S: * LIST () "/" zowie/bar - S: A685 OK LIST completed - - C: Z432 LIST "" * - S: * LIST () "." INBOX - S: * LIST () "." INBOX.bar - S: Z432 OK LIST completed - C: Z433 RENAME INBOX old-mail - S: Z433 OK RENAME completed - C: Z434 LIST "" * - S: * LIST () "." INBOX - S: * LIST () "." INBOX.bar - S: * LIST () "." old-mail - S: Z434 OK LIST completed - - - - - - - - - - - - -Crispin Standards Track [Page 38] - -RFC 3501 IMAPv4 March 2003 - - -6.3.6. 
SUBSCRIBE Command - - Arguments: mailbox - - Responses: no specific responses for this command - - Result: OK - subscribe completed - NO - subscribe failure: can't subscribe to that name - BAD - command unknown or arguments invalid - - The SUBSCRIBE command adds the specified mailbox name to the - server's set of "active" or "subscribed" mailboxes as returned by - the LSUB command. This command returns a tagged OK response only - if the subscription is successful. - - A server MAY validate the mailbox argument to SUBSCRIBE to verify - that it exists. However, it MUST NOT unilaterally remove an - existing mailbox name from the subscription list even if a mailbox - by that name no longer exists. - - Note: This requirement is because a server site can - choose to routinely remove a mailbox with a well-known - name (e.g., "system-alerts") after its contents expire, - with the intention of recreating it when new contents - are appropriate. - - - Example: C: A002 SUBSCRIBE #news.comp.mail.mime - S: A002 OK SUBSCRIBE completed - - -6.3.7. UNSUBSCRIBE Command - - Arguments: mailbox name - - Responses: no specific responses for this command - - Result: OK - unsubscribe completed - NO - unsubscribe failure: can't unsubscribe that name - BAD - command unknown or arguments invalid - - The UNSUBSCRIBE command removes the specified mailbox name from - the server's set of "active" or "subscribed" mailboxes as returned - by the LSUB command. This command returns a tagged OK response - only if the unsubscription is successful. - - Example: C: A002 UNSUBSCRIBE #news.comp.mail.mime - S: A002 OK UNSUBSCRIBE completed - - - -Crispin Standards Track [Page 39] - -RFC 3501 IMAPv4 March 2003 - - -6.3.8. 
LIST Command - - Arguments: reference name - mailbox name with possible wildcards - - Responses: untagged responses: LIST - - Result: OK - list completed - NO - list failure: can't list that reference or name - BAD - command unknown or arguments invalid - - The LIST command returns a subset of names from the complete set - of all names available to the client. Zero or more untagged LIST - replies are returned, containing the name attributes, hierarchy - delimiter, and name; see the description of the LIST reply for - more detail. - - The LIST command SHOULD return its data quickly, without undue - delay. For example, it SHOULD NOT go to excess trouble to - calculate the \Marked or \Unmarked status or perform other - processing; if each name requires 1 second of processing, then a - list of 1200 names would take 20 minutes! - - An empty ("" string) reference name argument indicates that the - mailbox name is interpreted as by SELECT. The returned mailbox - names MUST match the supplied mailbox name pattern. A non-empty - reference name argument is the name of a mailbox or a level of - mailbox hierarchy, and indicates the context in which the mailbox - name is interpreted. - - An empty ("" string) mailbox name argument is a special request to - return the hierarchy delimiter and the root name of the name given - in the reference. The value returned as the root MAY be the empty - string if the reference is non-rooted or is an empty string. In - all cases, a hierarchy delimiter (or NIL if there is no hierarchy) - is returned. This permits a client to get the hierarchy delimiter - (or find out that the mailbox names are flat) even when no - mailboxes by that name currently exist. - - The reference and mailbox name arguments are interpreted into a - canonical form that represents an unambiguous left-to-right - hierarchy. The returned mailbox names will be in the interpreted - form. 
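A client has to split each untagged LIST reply into its name attributes, hierarchy delimiter, and name. The Python sketch below shows one way to do this for replies that fit on a single line; the names LIST_RE, unquote, and parse_list_line are hypothetical, and mailbox names sent as literals are not handled.

```python
import re

# One untagged LIST reply: attributes, delimiter (quoted or NIL), name.
LIST_RE = re.compile(
    r'^\* LIST \((?P<attrs>[^)]*)\) '
    r'(?P<delim>NIL|"(?:\\.|[^"\\])") '
    r'(?P<name>.+)$'
)


def unquote(token):
    """Strip surrounding quotes and backslash escapes, if any."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return re.sub(r'\\(.)', r'\1', token[1:-1])
    return token


def parse_list_line(line):
    """Return (attributes, delimiter, name), or None if no match.

    The delimiter is None when the server sent NIL (flat names).
    """
    match = LIST_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    delim = match.group("delim")
    return (
        match.group("attrs").split(),
        None if delim == "NIL" else unquote(delim),
        unquote(match.group("name")),
    )
```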
- - - - - - - - -Crispin Standards Track [Page 40] - -RFC 3501 IMAPv4 March 2003 - - - Note: The interpretation of the reference argument is - implementation-defined. It depends upon whether the - server implementation has a concept of the "current - working directory" and leading "break out characters", - which override the current working directory. - - For example, on a server which exports a UNIX or NT - filesystem, the reference argument contains the current - working directory, and the mailbox name argument would - contain the name as interpreted in the current working - directory. - - If a server implementation has no concept of break out - characters, the canonical form is normally the reference - name appended with the mailbox name. Note that if the - server implements the namespace convention (section - 5.1.2), "#" is a break out character and must be treated - as such. - - If the reference argument is not a level of mailbox - hierarchy (that is, it is a \NoInferiors name), and/or - the reference argument does not end with the hierarchy - delimiter, it is implementation-dependent how this is - interpreted. For example, a reference of "foo/bar" and - mailbox name of "rag/baz" could be interpreted as - "foo/bar/rag/baz", "foo/barrag/baz", or "foo/rag/baz". - A client SHOULD NOT use such a reference argument except - at the explicit request of the user. A hierarchical - browser MUST NOT make any assumptions about server - interpretation of the reference unless the reference is - a level of mailbox hierarchy AND ends with the hierarchy - delimiter. - - Any part of the reference argument that is included in the - interpreted form SHOULD prefix the interpreted form. It SHOULD - also be in the same form as the reference name argument. This - rule permits the client to determine if the returned mailbox name - is in the context of the reference argument, or if something about - the mailbox argument overrode the reference argument. 
Without - this rule, the client would have to have knowledge of the server's - naming semantics including what characters are "breakouts" that - override a naming context. - - - - - - - - - -Crispin Standards Track [Page 41] - -RFC 3501 IMAPv4 March 2003 - - - For example, here are some examples of how references - and mailbox names might be interpreted on a UNIX-based - server: - - Reference Mailbox Name Interpretation - ------------ ------------ -------------- - ~smith/Mail/ foo.* ~smith/Mail/foo.* - archive/ % archive/% - #news. comp.mail.* #news.comp.mail.* - ~smith/Mail/ /usr/doc/foo /usr/doc/foo - archive/ ~fred/Mail/* ~fred/Mail/* - - The first three examples demonstrate interpretations in - the context of the reference argument. Note that - "~smith/Mail" SHOULD NOT be transformed into something - like "/u2/users/smith/Mail", or it would be impossible - for the client to determine that the interpretation was - in the context of the reference. - - The character "*" is a wildcard, and matches zero or more - characters at this position. The character "%" is similar to "*", - but it does not match a hierarchy delimiter. If the "%" wildcard - is the last character of a mailbox name argument, matching levels - of hierarchy are also returned. If these levels of hierarchy are - not also selectable mailboxes, they are returned with the - \Noselect mailbox name attribute (see the description of the LIST - response for more details). - - Server implementations are permitted to "hide" otherwise - accessible mailboxes from the wildcard characters, by preventing - certain characters or names from matching a wildcard in certain - situations. For example, a UNIX-based server might restrict the - interpretation of "*" so that an initial "/" character does not - match. 
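The difference between the two wildcards can be shown by translating a mailbox name pattern into a regular expression. This Python sketch is a simplification with hypothetical function names: it covers only the basic matching rule, not server-side hiding of names and not the extra levels of hierarchy returned for a trailing "%".

```python
import re


def imap_pattern_to_regex(pattern, delimiter):
    """Compile an IMAP LIST pattern into a regular expression.

    delimiter is the hierarchy delimiter, or None for flat names.
    """
    parts = []
    for ch in pattern:
        if ch == "*" or (ch == "%" and delimiter is None):
            parts.append(".*")  # matches anything, delimiter included
        elif ch == "%":
            # Like "*", but stops at the hierarchy delimiter.
            parts.append("[^" + re.escape(delimiter) + "]*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def list_matches(names, pattern, delimiter):
    """Return the names that match the whole pattern."""
    regex = imap_pattern_to_regex(pattern, delimiter)
    return [name for name in names if regex.fullmatch(name)]
```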
-
-      The special name INBOX is included in the output from LIST, if
-      INBOX is supported by this server for this user and if the
-      uppercase string "INBOX" matches the interpreted reference and
-      mailbox name arguments with wildcards as described above. The
-      criterion for omitting INBOX is whether SELECT INBOX will return
-      failure; it is not relevant whether the user's real INBOX resides
-      on this or some other server.
-
-   Example:    C: A101 LIST "" ""
-               S: * LIST (\Noselect) "/" ""
-               S: A101 OK LIST Completed
-               C: A102 LIST #news.comp.mail.misc ""
-               S: * LIST (\Noselect) "." #news.
-               S: A102 OK LIST Completed
-               C: A103 LIST /usr/staff/jones ""
-               S: * LIST (\Noselect) "/" /
-               S: A103 OK LIST Completed
-               C: A202 LIST ~/Mail/ %
-               S: * LIST (\Noselect) "/" ~/Mail/foo
-               S: * LIST () "/" ~/Mail/meetings
-               S: A202 OK LIST completed
-
-
-6.3.9.  LSUB Command
-
-   Arguments:  reference name
-               mailbox name with possible wildcards
-
-   Responses:  untagged responses: LSUB
-
-   Result:     OK - lsub completed
-               NO - lsub failure: can't list that reference or name
-               BAD - command unknown or arguments invalid
-
-      The LSUB command returns a subset of names from the set of names
-      that the user has declared as being "active" or "subscribed".
-      Zero or more untagged LSUB replies are returned. The arguments to
-      LSUB are in the same form as those for LIST.
-
-      The returned untagged LSUB response MAY contain different mailbox
-      flags from a LIST untagged response. If this should happen, the
-      flags in the untagged LIST are considered more authoritative.
-
-      A special situation occurs when using LSUB with the % wildcard.
-      Consider what happens if "foo/bar" (with a hierarchy delimiter of
-      "/") is subscribed but "foo" is not. A "%" wildcard to LSUB must
-      return foo, not foo/bar, in the LSUB response, and it MUST be
-      flagged with the \Noselect attribute.
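The special "%" case can be made concrete with a small Python sketch. It answers only the simplest request, an empty reference with the pattern "%", over a list of subscribed names; the function name lsub_percent and the result layout are hypothetical.

```python
def lsub_percent(subscribed, delimiter):
    """Answer LSUB "" "%" for a list of subscribed names.

    Returns (name, attributes) pairs in first-seen order.
    """
    results = {}
    for name in subscribed:
        top, sep, _rest = name.partition(delimiter)
        if not sep:
            # The top-level name itself is subscribed.
            results[top] = []
        elif top not in results:
            # Known only through a subscribed inferior name.
            results[top] = ["\\Noselect"]
    return list(results.items())
```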
- - The server MUST NOT unilaterally remove an existing mailbox name - from the subscription list even if a mailbox by that name no - longer exists. - - - - - - - -Crispin Standards Track [Page 43] - -RFC 3501 IMAPv4 March 2003 - - - Example: C: A002 LSUB "#news." "comp.mail.*" - S: * LSUB () "." #news.comp.mail.mime - S: * LSUB () "." #news.comp.mail.misc - S: A002 OK LSUB completed - C: A003 LSUB "#news." "comp.%" - S: * LSUB (\NoSelect) "." #news.comp.mail - S: A003 OK LSUB completed - - -6.3.10. STATUS Command - - Arguments: mailbox name - status data item names - - Responses: untagged responses: STATUS - - Result: OK - status completed - NO - status failure: no status for that name - BAD - command unknown or arguments invalid - - The STATUS command requests the status of the indicated mailbox. - It does not change the currently selected mailbox, nor does it - affect the state of any messages in the queried mailbox (in - particular, STATUS MUST NOT cause messages to lose the \Recent - flag). - - The STATUS command provides an alternative to opening a second - IMAP4rev1 connection and doing an EXAMINE command on a mailbox to - query that mailbox's status without deselecting the current - mailbox in the first IMAP4rev1 connection. - - Unlike the LIST command, the STATUS command is not guaranteed to - be fast in its response. Under certain circumstances, it can be - quite slow. In some implementations, the server is obliged to - open the mailbox read-only internally to obtain certain status - information. Also unlike the LIST command, the STATUS command - does not accept wildcards. - - Note: The STATUS command is intended to access the - status of mailboxes other than the currently selected - mailbox. Because the STATUS command can cause the - mailbox to be opened internally, and because this - information is available by other means on the selected - mailbox, the STATUS command SHOULD NOT be used on the - currently selected mailbox. 
- - - - - - -Crispin Standards Track [Page 44] - -RFC 3501 IMAPv4 March 2003 - - - The STATUS command MUST NOT be used as a "check for new - messages in the selected mailbox" operation (refer to - sections 7, 7.3.1, and 7.3.2 for more information about - the proper method for new message checking). - - Because the STATUS command is not guaranteed to be fast - in its results, clients SHOULD NOT expect to be able to - issue many consecutive STATUS commands and obtain - reasonable performance. - - The currently defined status data items that can be requested are: - - MESSAGES - The number of messages in the mailbox. - - RECENT - The number of messages with the \Recent flag set. - - UIDNEXT - The next unique identifier value of the mailbox. Refer to - section 2.3.1.1 for more information. - - UIDVALIDITY - The unique identifier validity value of the mailbox. Refer to - section 2.3.1.1 for more information. - - UNSEEN - The number of messages which do not have the \Seen flag set. - - - Example: C: A042 STATUS blurdybloop (UIDNEXT MESSAGES) - S: * STATUS blurdybloop (MESSAGES 231 UIDNEXT 44292) - S: A042 OK STATUS completed - - - - - - - - - - - - - - - - - - -Crispin Standards Track [Page 45] - -RFC 3501 IMAPv4 March 2003 - - -6.3.11. APPEND Command - - Arguments: mailbox name - OPTIONAL flag parenthesized list - OPTIONAL date/time string - message literal - - Responses: no specific responses for this command - - Result: OK - append completed - NO - append error: can't append to that mailbox, error - in flags or date/time or message text - BAD - command unknown or arguments invalid - - The APPEND command appends the literal argument as a new message - to the end of the specified destination mailbox. This argument - SHOULD be in the format of an [RFC-2822] message. 8-bit - characters are permitted in the message. 
A server implementation
-      that is unable to preserve 8-bit data properly MUST be able to
-      reversibly convert 8-bit APPEND data to 7-bit using a [MIME-IMB]
-      content transfer encoding.
-
-           Note: There MAY be exceptions, e.g., draft messages, in
-           which required [RFC-2822] header lines are omitted in
-           the message literal argument to APPEND. The full
-           implications of doing so MUST be understood and
-           carefully weighed.
-
-      If a flag parenthesized list is specified, the flags SHOULD be set
-      in the resulting message; otherwise, the flag list of the
-      resulting message is set to empty by default. In either case, the
-      \Recent flag is also set.
-
-      If a date-time is specified, the internal date SHOULD be set in
-      the resulting message; otherwise, the internal date of the
-      resulting message is set to the current date and time by default.
-
-      If the append is unsuccessful for any reason, the mailbox MUST be
-      restored to its state before the APPEND attempt; no partial
-      appending is permitted.
-
-      If the destination mailbox does not exist, a server MUST return an
-      error, and MUST NOT automatically create the mailbox. Unless it
-      is certain that the destination mailbox can not be created, the
-      server MUST send the response code "[TRYCREATE]" as the prefix of
-      the text of the tagged NO response. This gives a hint to the
-      client that it can attempt a CREATE command and retry the APPEND
-      if the CREATE is successful.
-
-      If the mailbox is currently selected, the normal new message
-      actions SHOULD occur. Specifically, the server SHOULD notify the
-      client immediately via an untagged EXISTS response. If the server
-      does not do so, the client MAY issue a NOOP command (or failing
-      that, a CHECK command) after one or more APPEND commands.
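On the client side, an APPEND consists of a command line that announces the octet count of the message literal, followed by the literal itself, which is sent once the server has answered with a "+" continuation request. The Python sketch below builds both parts and recognizes the [TRYCREATE] hint. It is an illustration under stated assumptions: the function names are hypothetical, the mailbox name is assumed to be an atom that needs no quoting, the optional date/time argument is omitted, and bare LF line endings are normalized to CRLF before counting.

```python
def build_append(tag, mailbox, message, flags=()):
    """Return (command_line, literal) as bytes for one APPEND.

    message is the message text as bytes; line endings are
    normalized to CRLF before the octet count is taken.
    """
    literal = message.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    parts = [tag, "APPEND", mailbox]
    if flags:
        parts.append("(" + " ".join(flags) + ")")
    parts.append("{%d}" % len(literal))
    return " ".join(parts).encode("ascii") + b"\r\n", literal


def should_try_create(tagged_response):
    """True if a tagged NO carries the [TRYCREATE] response code."""
    fields = tagged_response.split(" ", 2)
    return (
        len(fields) == 3
        and fields[1].upper() == "NO"
        and fields[2].upper().startswith("[TRYCREATE]")
    )
```

When should_try_create returns True, the client can issue a CREATE for the destination mailbox and retry the APPEND if the CREATE succeeds.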
-
-   Example:    C: A003 APPEND saved-messages (\Seen) {310}
-               S: + Ready for literal data
-               C: Date: Mon, 7 Feb 1994 21:52:25 -0800 (PST)
-               C: From: Fred Foobar <foobar@Blurdybloop.COM>
-               C: Subject: afternoon meeting
-               C: To: mooch@owatagu.siam.edu
-               C: Message-Id: <B27397-0100000@Blurdybloop.COM>
-               C: MIME-Version: 1.0
-               C: Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
-               C:
-               C: Hello Joe, do you think we can meet at 3:30 tomorrow?
-               C:
-               S: A003 OK APPEND completed
-
-        Note: The APPEND command is not used for message delivery,
-        because it does not provide a mechanism to transfer [SMTP]
-        envelope information.
-
-6.4.    Client Commands - Selected State
-
-   In the selected state, commands that manipulate messages in a mailbox
-   are permitted.
-
-   In addition to the universal commands (CAPABILITY, NOOP, and LOGOUT),
-   and the authenticated state commands (SELECT, EXAMINE, CREATE,
-   DELETE, RENAME, SUBSCRIBE, UNSUBSCRIBE, LIST, LSUB, STATUS, and
-   APPEND), the following commands are valid in the selected state:
-   CHECK, CLOSE, EXPUNGE, SEARCH, FETCH, STORE, COPY, and UID.
-
-6.4.1.  CHECK Command
-
-   Arguments:  none
-
-   Responses:  no specific responses for this command
-
-   Result:     OK - check completed
-               BAD - command unknown or arguments invalid
-
-      The CHECK command requests a checkpoint of the currently selected
-      mailbox. A checkpoint refers to any implementation-dependent
-      housekeeping associated with the mailbox (e.g., resolving the
-      server's in-memory state of the mailbox with the state on its
-      disk) that is not normally executed as part of each command. A
-      checkpoint MAY take a non-instantaneous amount of real time to
-      complete. If a server implementation has no such housekeeping
-      considerations, CHECK is equivalent to NOOP.
-
-      There is no guarantee that an EXISTS untagged response will happen
-      as a result of CHECK. NOOP, not CHECK, SHOULD be used for new
-      message polling.
- - Example: C: FXXZ CHECK - S: FXXZ OK CHECK Completed - - -6.4.2. CLOSE Command - - Arguments: none - - Responses: no specific responses for this command - - Result: OK - close completed, now in authenticated state - BAD - command unknown or arguments invalid - - The CLOSE command permanently removes all messages that have the - \Deleted flag set from the currently selected mailbox, and returns - to the authenticated state from the selected state. No untagged - EXPUNGE responses are sent. - - No messages are removed, and no error is given, if the mailbox is - selected by an EXAMINE command or is otherwise selected read-only. - - Even if a mailbox is selected, a SELECT, EXAMINE, or LOGOUT - command MAY be issued without previously issuing a CLOSE command. - The SELECT, EXAMINE, and LOGOUT commands implicitly close the - currently selected mailbox without doing an expunge. However, - when many messages are deleted, a CLOSE-LOGOUT or CLOSE-SELECT - sequence is considerably faster than an EXPUNGE-LOGOUT or - EXPUNGE-SELECT because no untagged EXPUNGE responses (which the - client would probably ignore) are sent. - - Example: C: A341 CLOSE - S: A341 OK CLOSE completed - - - - - - - - - - -Crispin Standards Track [Page 48] - -RFC 3501 IMAPv4 March 2003 - - -6.4.3. EXPUNGE Command - - Arguments: none - - Responses: untagged responses: EXPUNGE - - Result: OK - expunge completed - NO - expunge failure: can't expunge (e.g., permission - denied) - BAD - command unknown or arguments invalid - - The EXPUNGE command permanently removes all messages that have the - \Deleted flag set from the currently selected mailbox. Before - returning an OK to the client, an untagged EXPUNGE response is - sent for each message that is removed. - - Example: C: A202 EXPUNGE - S: * 3 EXPUNGE - S: * 3 EXPUNGE - S: * 5 EXPUNGE - S: * 8 EXPUNGE - S: A202 OK EXPUNGE completed - - Note: In this example, messages 3, 4, 7, and 11 had the - \Deleted flag set. 
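The step from the reported numbers (3, 3, 5, 8) to the original ones (3, 4, 7, 11) can be checked with a short Python sketch. It assumes that each untagged EXPUNGE takes effect immediately, so every later number refers to the already renumbered mailbox; the function name is hypothetical.

```python
def original_sequence_numbers(expunge_responses):
    """Return the pre-EXPUNGE numbers of the removed messages, sorted.

    expunge_responses: the numbers from the untagged EXPUNGE
    responses, in the order the server sent them.
    """
    removed = []  # original numbers already expunged, kept sorted
    for reported in expunge_responses:
        original = reported
        # Every earlier removal at or below the target shifted the
        # message down by one, so undo those shifts in ascending order.
        for gone in removed:
            if gone <= original:
                original += 1
        removed.append(original)
        removed.sort()
    return removed
```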
See the description of the EXPUNGE - response for further explanation. - - -6.4.4. SEARCH Command - - Arguments: OPTIONAL [CHARSET] specification - searching criteria (one or more) - - Responses: REQUIRED untagged response: SEARCH - - Result: OK - search completed - NO - search error: can't search that [CHARSET] or - criteria - BAD - command unknown or arguments invalid - - The SEARCH command searches the mailbox for messages that match - the given searching criteria. Searching criteria consist of one - or more search keys. The untagged SEARCH response from the server - contains a listing of message sequence numbers corresponding to - those messages that match the searching criteria. - - - - - - -Crispin Standards Track [Page 49] - -RFC 3501 IMAPv4 March 2003 - - - When multiple keys are specified, the result is the intersection - (AND function) of all the messages that match those keys. For - example, the criteria DELETED FROM "SMITH" SINCE 1-Feb-1994 refers - to all deleted messages from Smith that were placed in the mailbox - since February 1, 1994. A search key can also be a parenthesized - list of one or more search keys (e.g., for use with the OR and NOT - keys). - - Server implementations MAY exclude [MIME-IMB] body parts with - terminal content media types other than TEXT and MESSAGE from - consideration in SEARCH matching. - - The OPTIONAL [CHARSET] specification consists of the word - "CHARSET" followed by a registered [CHARSET]. It indicates the - [CHARSET] of the strings that appear in the search criteria. - [MIME-IMB] content transfer encodings, and [MIME-HDRS] strings in - [RFC-2822]/[MIME-IMB] headers, MUST be decoded before comparing - text in a [CHARSET] other than US-ASCII. US-ASCII MUST be - supported; other [CHARSET]s MAY be supported. - - If the server does not support the specified [CHARSET], it MUST - return a tagged NO response (not a BAD). 
This response SHOULD - contain the BADCHARSET response code, which MAY list the - [CHARSET]s supported by the server. - - In all search keys that use strings, a message matches the key if - the string is a substring of the field. The matching is - case-insensitive. - - The defined search keys are as follows. Refer to the Formal - Syntax section for the precise syntactic definitions of the - arguments. - - <sequence set> - Messages with message sequence numbers corresponding to the - specified message sequence number set. - - ALL - All messages in the mailbox; the default initial key for - ANDing. - - ANSWERED - Messages with the \Answered flag set. - - - - - - - - -Crispin Standards Track [Page 50] - -RFC 3501 IMAPv4 March 2003 - - - BCC <string> - Messages that contain the specified string in the envelope - structure's BCC field. - - BEFORE <date> - Messages whose internal date (disregarding time and timezone) - is earlier than the specified date. - - BODY <string> - Messages that contain the specified string in the body of the - message. - - CC <string> - Messages that contain the specified string in the envelope - structure's CC field. - - DELETED - Messages with the \Deleted flag set. - - DRAFT - Messages with the \Draft flag set. - - FLAGGED - Messages with the \Flagged flag set. - - FROM <string> - Messages that contain the specified string in the envelope - structure's FROM field. - - HEADER <field-name> <string> - Messages that have a header with the specified field-name (as - defined in [RFC-2822]) and that contains the specified string - in the text of the header (what comes after the colon). If the - string to search is zero-length, this matches all messages that - have a header line with the specified field-name regardless of - the contents. - - KEYWORD <flag> - Messages with the specified keyword flag set. - - LARGER <n> - Messages with an [RFC-2822] size larger than the specified - number of octets. 
- - NEW - Messages that have the \Recent flag set but not the \Seen flag. - This is functionally equivalent to "(RECENT UNSEEN)". - - - - -Crispin Standards Track [Page 51] - -RFC 3501 IMAPv4 March 2003 - - - NOT <search-key> - Messages that do not match the specified search key. - - OLD - Messages that do not have the \Recent flag set. This is - functionally equivalent to "NOT RECENT" (as opposed to "NOT - NEW"). - - ON <date> - Messages whose internal date (disregarding time and timezone) - is within the specified date. - - OR <search-key1> <search-key2> - Messages that match either search key. - - RECENT - Messages that have the \Recent flag set. - - SEEN - Messages that have the \Seen flag set. - - SENTBEFORE <date> - Messages whose [RFC-2822] Date: header (disregarding time and - timezone) is earlier than the specified date. - - SENTON <date> - Messages whose [RFC-2822] Date: header (disregarding time and - timezone) is within the specified date. - - SENTSINCE <date> - Messages whose [RFC-2822] Date: header (disregarding time and - timezone) is within or later than the specified date. - - SINCE <date> - Messages whose internal date (disregarding time and timezone) - is within or later than the specified date. - - SMALLER <n> - Messages with an [RFC-2822] size smaller than the specified - number of octets. - - - - - - - - - - - -Crispin Standards Track [Page 52] - -RFC 3501 IMAPv4 March 2003 - - - SUBJECT <string> - Messages that contain the specified string in the envelope - structure's SUBJECT field. - - TEXT <string> - Messages that contain the specified string in the header or - body of the message. - - TO <string> - Messages that contain the specified string in the envelope - structure's TO field. - - UID <sequence set> - Messages with unique identifiers corresponding to the specified - unique identifier set. Sequence set ranges are permitted. - - UNANSWERED - Messages that do not have the \Answered flag set. 
- - UNDELETED - Messages that do not have the \Deleted flag set. - - UNDRAFT - Messages that do not have the \Draft flag set. - - UNFLAGGED - Messages that do not have the \Flagged flag set. - - UNKEYWORD <flag> - Messages that do not have the specified keyword flag set. - - UNSEEN - Messages that do not have the \Seen flag set. - - - - - - - - - - - - - - - - - - -Crispin Standards Track [Page 53] - -RFC 3501 IMAPv4 March 2003 - - - Example: C: A282 SEARCH FLAGGED SINCE 1-Feb-1994 NOT FROM "Smith" - S: * SEARCH 2 84 882 - S: A282 OK SEARCH completed - C: A283 SEARCH TEXT "string not in mailbox" - S: * SEARCH - S: A283 OK SEARCH completed - C: A284 SEARCH CHARSET UTF-8 TEXT {6} - C: XXXXXX - S: * SEARCH 43 - S: A284 OK SEARCH completed - - Note: Since this document is restricted to 7-bit ASCII - text, it is not possible to show actual UTF-8 data. The - "XXXXXX" is a placeholder for what would be 6 octets of - 8-bit data in an actual transaction. - - -6.4.5. FETCH Command - - Arguments: sequence set - message data item names or macro - - Responses: untagged responses: FETCH - - Result: OK - fetch completed - NO - fetch error: can't fetch that data - BAD - command unknown or arguments invalid - - The FETCH command retrieves data associated with a message in the - mailbox. The data items to be fetched can be either a single atom - or a parenthesized list. - - Most data items, identified in the formal syntax under the - msg-att-static rule, are static and MUST NOT change for any - particular message. Other data items, identified in the formal - syntax under the msg-att-dynamic rule, MAY change, either as a - result of a STORE command or due to external events. - - For example, if a client receives an ENVELOPE for a - message when it already knows the envelope, it can - safely ignore the newly transmitted envelope. - - There are three macros which specify commonly-used sets of data - items, and can be used instead of data items. 
A macro must be - used by itself, and not in conjunction with other macros or data - items. - - - - - -Crispin Standards Track [Page 54] - -RFC 3501 IMAPv4 March 2003 - - - ALL - Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE ENVELOPE) - - FAST - Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE) - - FULL - Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE ENVELOPE - BODY) - - The currently defined data items that can be fetched are: - - BODY - Non-extensible form of BODYSTRUCTURE. - - BODY[<section>]<<partial>> - The text of a particular body section. The section - specification is a set of zero or more part specifiers - delimited by periods. A part specifier is either a part number - or one of the following: HEADER, HEADER.FIELDS, - HEADER.FIELDS.NOT, MIME, and TEXT. An empty section - specification refers to the entire message, including the - header. - - Every message has at least one part number. Non-[MIME-IMB] - messages, and non-multipart [MIME-IMB] messages with no - encapsulated message, only have a part 1. - - Multipart messages are assigned consecutive part numbers, as - they occur in the message. If a particular part is of type - message or multipart, its parts MUST be indicated by a period - followed by the part number within that nested multipart part. - - A part of type MESSAGE/RFC822 also has nested part numbers, - referring to parts of the MESSAGE part's body. - - The HEADER, HEADER.FIELDS, HEADER.FIELDS.NOT, and TEXT part - specifiers can be the sole part specifier or can be prefixed by - one or more numeric part specifiers, provided that the numeric - part specifier refers to a part of type MESSAGE/RFC822. The - MIME part specifier MUST be prefixed by one or more numeric - part specifiers. - - The HEADER, HEADER.FIELDS, and HEADER.FIELDS.NOT part - specifiers refer to the [RFC-2822] header of the message or of - an encapsulated [MIME-IMT] MESSAGE/RFC822 message. 
- HEADER.FIELDS and HEADER.FIELDS.NOT are followed by a list of - field-name (as defined in [RFC-2822]) names, and return a - - - -Crispin Standards Track [Page 55] - -RFC 3501 IMAPv4 March 2003 - - - subset of the header. The subset returned by HEADER.FIELDS - contains only those header fields with a field-name that - matches one of the names in the list; similarly, the subset - returned by HEADER.FIELDS.NOT contains only the header fields - with a non-matching field-name. The field-matching is - case-insensitive but otherwise exact. Subsetting does not - exclude the [RFC-2822] delimiting blank line between the header - and the body; the blank line is included in all header fetches, - except in the case of a message which has no body and no blank - line. - - The MIME part specifier refers to the [MIME-IMB] header for - this part. - - The TEXT part specifier refers to the text body of the message, - omitting the [RFC-2822] header. - - Here is an example of a complex message with some of its - part specifiers: - - HEADER ([RFC-2822] header of the message) - TEXT ([RFC-2822] text body of the message) MULTIPART/MIXED - 1 TEXT/PLAIN - 2 APPLICATION/OCTET-STREAM - 3 MESSAGE/RFC822 - 3.HEADER ([RFC-2822] header of the message) - 3.TEXT ([RFC-2822] text body of the message) MULTIPART/MIXED - 3.1 TEXT/PLAIN - 3.2 APPLICATION/OCTET-STREAM - 4 MULTIPART/MIXED - 4.1 IMAGE/GIF - 4.1.MIME ([MIME-IMB] header for the IMAGE/GIF) - 4.2 MESSAGE/RFC822 - 4.2.HEADER ([RFC-2822] header of the message) - 4.2.TEXT ([RFC-2822] text body of the message) MULTIPART/MIXED - 4.2.1 TEXT/PLAIN - 4.2.2 MULTIPART/ALTERNATIVE - 4.2.2.1 TEXT/PLAIN - 4.2.2.2 TEXT/RICHTEXT - - - It is possible to fetch a substring of the designated text. - This is done by appending an open angle bracket ("<"), the - octet position of the first desired octet, a period, the - maximum number of octets desired, and a close angle bracket - (">") to the part specifier. 
If the starting octet is beyond - the end of the text, an empty string is returned. - - - - -Crispin Standards Track [Page 56] - -RFC 3501 IMAPv4 March 2003 - - - Any partial fetch that attempts to read beyond the end of the - text is truncated as appropriate. A partial fetch that starts - at octet 0 is returned as a partial fetch, even if this - truncation happened. - - Note: This means that BODY[]<0.2048> of a 1500-octet message - will return BODY[]<0> with a literal of size 1500, not - BODY[]. - - Note: A substring fetch of a HEADER.FIELDS or - HEADER.FIELDS.NOT part specifier is calculated after - subsetting the header. - - The \Seen flag is implicitly set; if this causes the flags to - change, they SHOULD be included as part of the FETCH responses. - - BODY.PEEK[<section>]<<partial>> - An alternate form of BODY[<section>] that does not implicitly - set the \Seen flag. - - BODYSTRUCTURE - The [MIME-IMB] body structure of the message. This is computed - by the server by parsing the [MIME-IMB] header fields in the - [RFC-2822] header and [MIME-IMB] headers. - - ENVELOPE - The envelope structure of the message. This is computed by the - server by parsing the [RFC-2822] header into the component - parts, defaulting various fields as necessary. - - FLAGS - The flags that are set for this message. - - INTERNALDATE - The internal date of the message. - - RFC822 - Functionally equivalent to BODY[], differing in the syntax of - the resulting untagged FETCH data (RFC822 is returned). - - RFC822.HEADER - Functionally equivalent to BODY.PEEK[HEADER], differing in the - syntax of the resulting untagged FETCH data (RFC822.HEADER is - returned). - - RFC822.SIZE - The [RFC-2822] size of the message. - - - - -Crispin Standards Track [Page 57] - -RFC 3501 IMAPv4 March 2003 - - - RFC822.TEXT - Functionally equivalent to BODY[TEXT], differing in the syntax - of the resulting untagged FETCH data (RFC822.TEXT is returned). - - UID - The unique identifier for the message. 
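The partial-fetch rules above (truncation at the end of the text, an empty string when the start lies beyond the end, and a response that names only the origin octet) can be modelled in a few lines of Python. This is an illustrative sketch, not part of the specification or of rohrpost; `partial_fetch` and its parameters are hypothetical names.

```python
def partial_fetch(text: bytes, section: str, start: int, count: int):
    """Model a BODY[<section>]<start.count> fetch on the server side.

    Hypothetical helper: returns the data item name the response
    would carry and the octets of the returned literal.
    """
    # Octets past the end of the text are simply not returned; a start
    # beyond the end yields an empty byte string.
    data = text[start:start + count]
    # The response names the origin octet only, never the count.
    item = f"BODY[{section}]<{start}>"
    return item, data


message = b"x" * 1500
item, data = partial_fetch(message, "", 0, 2048)
print(item, len(data))
```

For the 1500-octet message in the note, the sketch reports `BODY[]<0>` with 1500 octets, matching the behaviour the note describes.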
- - - Example: C: A654 FETCH 2:4 (FLAGS BODY[HEADER.FIELDS (DATE FROM)]) - S: * 2 FETCH .... - S: * 3 FETCH .... - S: * 4 FETCH .... - S: A654 OK FETCH completed - - -6.4.6. STORE Command - - Arguments: sequence set - message data item name - value for message data item - - Responses: untagged responses: FETCH - - Result: OK - store completed - NO - store error: can't store that data - BAD - command unknown or arguments invalid - - The STORE command alters data associated with a message in the - mailbox. Normally, STORE will return the updated value of the - data with an untagged FETCH response. A suffix of ".SILENT" in - the data item name prevents the untagged FETCH, and the server - SHOULD assume that the client has determined the updated value - itself or does not care about the updated value. - - Note: Regardless of whether or not the ".SILENT" suffix - was used, the server SHOULD send an untagged FETCH - response if a change to a message's flags from an - external source is observed. The intent is that the - status of the flags is determinate without a race - condition. - - - - - - - - - - - -Crispin Standards Track [Page 58] - -RFC 3501 IMAPv4 March 2003 - - - The currently defined data items that can be stored are: - - FLAGS <flag list> - Replace the flags for the message (other than \Recent) with the - argument. The new value of the flags is returned as if a FETCH - of those flags was done. - - FLAGS.SILENT <flag list> - Equivalent to FLAGS, but without returning a new value. - - +FLAGS <flag list> - Add the argument to the flags for the message. The new value - of the flags is returned as if a FETCH of those flags was done. - - +FLAGS.SILENT <flag list> - Equivalent to +FLAGS, but without returning a new value. - - -FLAGS <flag list> - Remove the argument from the flags for the message. The new - value of the flags is returned as if a FETCH of those flags was - done. - - -FLAGS.SILENT <flag list> - Equivalent to -FLAGS, but without returning a new value. 
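The six STORE data items listed above reduce to three set operations plus an optional suppression of the reply. The following Python sketch models that for a single message. `apply_store` is a hypothetical name, and leaving \Recent untouched for the +FLAGS and -FLAGS forms is an assumption carried over from the FLAGS rule, not something this section states.

```python
def apply_store(flags: set, item: str, args: set):
    """Apply one STORE data item to a message's flag set.

    Returns (new_flags, reported). reported is the flag set an untagged
    FETCH would carry, or None for the .SILENT forms.
    """
    silent = item.endswith(".SILENT")
    base = item[:-len(".SILENT")] if silent else item
    # Assumption: the client can neither add nor remove \Recent.
    recent = flags & {"\\Recent"}
    args = args - {"\\Recent"}
    if base == "FLAGS":
        new = args | recent      # replace everything except \Recent
    elif base == "+FLAGS":
        new = flags | args       # add the argument flags
    elif base == "-FLAGS":
        new = flags - args       # remove the argument flags
    else:
        raise ValueError(f"unknown STORE data item: {item}")
    return new, (None if silent else new)


print(apply_store({"\\Seen"}, "+FLAGS", {"\\Deleted"}))
```

A real server may still send an untagged FETCH after a .SILENT store when the flags change from an external source, as the note in this section says; the sketch does not model that case.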
- - - Example: C: A003 STORE 2:4 +FLAGS (\Deleted) - S: * 2 FETCH (FLAGS (\Deleted \Seen)) - S: * 3 FETCH (FLAGS (\Deleted)) - S: * 4 FETCH (FLAGS (\Deleted \Flagged \Seen)) - S: A003 OK STORE completed - - -6.4.7. COPY Command - - Arguments: sequence set - mailbox name - - Responses: no specific responses for this command - - Result: OK - copy completed - NO - copy error: can't copy those messages or to that - name - BAD - command unknown or arguments invalid - - The COPY command copies the specified message(s) to the end of the - specified destination mailbox. The flags and internal date of the - message(s) SHOULD be preserved, and the \Recent flag SHOULD be set, - in the copy. - - If the destination mailbox does not exist, a server SHOULD return - an error. It SHOULD NOT automatically create the mailbox. Unless - it is certain that the destination mailbox can not be created, the - server MUST send the response code "[TRYCREATE]" as the prefix of - the text of the tagged NO response. This gives a hint to the - client that it can attempt a CREATE command and retry the COPY if - the CREATE is successful. - - If the COPY command is unsuccessful for any reason, server - implementations MUST restore the destination mailbox to its state - before the COPY attempt. - - Example: C: A003 COPY 2:4 MEETING - S: A003 OK COPY completed - - -6.4.8. UID Command - - Arguments: command name - command arguments - - Responses: untagged responses: FETCH, SEARCH - - Result: OK - UID command completed - NO - UID command error - BAD - command unknown or arguments invalid - - The UID command has two forms. In the first form, it takes as its - arguments a COPY, FETCH, or STORE command with arguments - appropriate for the associated command. However, the numbers in - the sequence set argument are unique identifiers instead of - message sequence numbers. 
Sequence set ranges are permitted, but - there is no guarantee that unique identifiers will be contiguous. - - A non-existent unique identifier is ignored without any error - message generated. Thus, it is possible for a UID FETCH command - to return an OK without any data or a UID COPY or UID STORE to - return an OK without performing any operations. - - In the second form, the UID command takes a SEARCH command with - SEARCH command arguments. The interpretation of the arguments is - the same as with SEARCH; however, the numbers returned in a SEARCH - response for a UID SEARCH command are unique identifiers instead - of message sequence numbers. For example, the command UID SEARCH - 1:100 UID 443:557 returns the unique identifiers corresponding to - the intersection of two sequence sets, the message sequence number - range 1:100 and the UID range 443:557. - - Note: in the above example, the UID range 443:557 - appears. The same comment about a non-existent unique - identifier being ignored without any error message also - applies here. Hence, even if neither UID 443 nor 557 - exists, this range is valid and would include an existing - UID 495. - - Also note that a UID range of 559:* always includes the - UID of the last message in the mailbox, even if 559 is - higher than any assigned UID value. This is because the - contents of a range are independent of the order of the - range endpoints. Thus, any UID range with * as one of - the endpoints indicates at least one message (the - message with the highest numbered UID), unless the - mailbox is empty. - - The number after the "*" in an untagged FETCH response is always a - message sequence number, not a unique identifier, even for a UID - command response. 
However, server implementations MUST implicitly - include the UID message data item as part of any FETCH response - caused by a UID command, regardless of whether a UID was specified - as a message data item to the FETCH. - - - Note: The rule about including the UID message data item as part - of a FETCH response primarily applies to the UID FETCH and UID - STORE commands, including a UID FETCH command that does not - include UID as a message data item. Although it is unlikely that - the other UID commands will cause an untagged FETCH, this rule - applies to these commands as well. - - Example: C: A999 UID FETCH 4827313:4828442 FLAGS - S: * 23 FETCH (FLAGS (\Seen) UID 4827313) - S: * 24 FETCH (FLAGS (\Seen) UID 4827943) - S: * 25 FETCH (FLAGS (\Seen) UID 4828442) - S: A999 OK UID FETCH completed - - - - - - - - - - -Crispin Standards Track [Page 61] - -RFC 3501 IMAPv4 March 2003 - - -6.5. Client Commands - Experimental/Expansion - - -6.5.1. X<atom> Command - - Arguments: implementation defined - - Responses: implementation defined - - Result: OK - command completed - NO - failure - BAD - command unknown or arguments invalid - - Any command prefixed with an X is an experimental command. - Commands which are not part of this specification, a standard or - standards-track revision of this specification, or an - IESG-approved experimental protocol, MUST use the X prefix. - - Any added untagged responses issued by an experimental command - MUST also be prefixed with an X. Server implementations MUST NOT - send any such untagged responses, unless the client requested it - by issuing the associated experimental command. - - Example: C: a441 CAPABILITY - S: * CAPABILITY IMAP4rev1 XPIG-LATIN - S: a441 OK CAPABILITY completed - C: A442 XPIG-LATIN - S: * XPIG-LATIN ow-nay eaking-spay ig-pay atin-lay - S: A442 OK XPIG-LATIN ompleted-cay - -7. Server Responses - - Server responses are in three forms: status responses, server data, - and command continuation request. 
The information contained in a - server response, identified by "Contents:" in the response - descriptions below, is described by function, not by syntax. The - precise syntax of server responses is described in the Formal Syntax - section. - - The client MUST be prepared to accept any response at all times. - - Status responses can be tagged or untagged. Tagged status responses - indicate the completion result (OK, NO, or BAD status) of a client - command, and have a tag matching the command. - - Some status responses, and all server data, are untagged. An - untagged response is indicated by the token "*" instead of a tag. - Untagged status responses indicate server greeting, or server status - - - -Crispin Standards Track [Page 62] - -RFC 3501 IMAPv4 March 2003 - - - that does not indicate the completion of a command (for example, an - impending system shutdown alert). For historical reasons, untagged - server data responses are also called "unsolicited data", although - strictly speaking, only unilateral server data is truly - "unsolicited". - - Certain server data MUST be recorded by the client when it is - received; this is noted in the description of that data. Such data - conveys critical information which affects the interpretation of all - subsequent commands and responses (e.g., updates reflecting the - creation or destruction of messages). - - Other server data SHOULD be recorded for later reference; if the - client does not need to record the data, or if recording the data has - no obvious purpose (e.g., a SEARCH response when no SEARCH command is - in progress), the data SHOULD be ignored. - - An example of unilateral untagged server data occurs when the IMAP - connection is in the selected state. In the selected state, the - server checks the mailbox for new messages as part of command - execution. Normally, this is part of the execution of every command; - hence, a NOOP command suffices to check for new messages. 
If new - messages are found, the server sends untagged EXISTS and RECENT - responses reflecting the new size of the mailbox. Server - implementations that offer multiple simultaneous access to the same - mailbox SHOULD also send appropriate unilateral untagged FETCH and - EXPUNGE responses if another agent changes the state of any message - flags or expunges any messages. - - Command continuation request responses use the token "+" instead of a - tag. These responses are sent by the server to indicate acceptance - of an incomplete client command and readiness for the remainder of - the command. - -7.1. Server Responses - Status Responses - - Status responses are OK, NO, BAD, PREAUTH and BYE. OK, NO, and BAD - can be tagged or untagged. PREAUTH and BYE are always untagged. - - Status responses MAY include an OPTIONAL "response code". A response - code consists of data inside square brackets in the form of an atom, - possibly followed by a space and arguments. The response code - contains additional information or status codes for client software - beyond the OK/NO/BAD condition, and are defined when there is a - specific action that a client can take based upon the additional - information. - - - - - -Crispin Standards Track [Page 63] - -RFC 3501 IMAPv4 March 2003 - - - The currently defined response codes are: - - ALERT - - The human-readable text contains a special alert that MUST be - presented to the user in a fashion that calls the user's - attention to the message. - - BADCHARSET - - Optionally followed by a parenthesized list of charsets. A - SEARCH failed because the given charset is not supported by - this implementation. If the optional list of charsets is - given, this lists the charsets that are supported by this - implementation. - - CAPABILITY - - Followed by a list of capabilities. This can appear in the - initial OK or PREAUTH response to transmit an initial - capabilities list. 
This makes it unnecessary for a client to - send a separate CAPABILITY command if it recognizes this - response. - - PARSE - - The human-readable text represents an error in parsing the - [RFC-2822] header or [MIME-IMB] headers of a message in the - mailbox. - - PERMANENTFLAGS - - Followed by a parenthesized list of flags, indicates which of - the known flags the client can change permanently. Any flags - that are in the FLAGS untagged response, but not the - PERMANENTFLAGS list, can not be set permanently. If the client - attempts to STORE a flag that is not in the PERMANENTFLAGS - list, the server will either ignore the change or store the - state change for the remainder of the current session only. - The PERMANENTFLAGS list can also include the special flag \*, - which indicates that it is possible to create new keywords by - attempting to store those flags in the mailbox. - - - - - - - - - -Crispin Standards Track [Page 64] - -RFC 3501 IMAPv4 March 2003 - - - READ-ONLY - - The mailbox is selected read-only, or its access while selected - has changed from read-write to read-only. - - READ-WRITE - - The mailbox is selected read-write, or its access while - selected has changed from read-only to read-write. - - TRYCREATE - - An APPEND or COPY attempt is failing because the target mailbox - does not exist (as opposed to some other reason). This is a - hint to the client that the operation can succeed if the - mailbox is first created by the CREATE command. - - UIDNEXT - - Followed by a decimal number, indicates the next unique - identifier value. Refer to section 2.3.1.1 for more - information. - - UIDVALIDITY - - Followed by a decimal number, indicates the unique identifier - validity value. Refer to section 2.3.1.1 for more information. - - UNSEEN - - Followed by a decimal number, indicates the number of the first - message without the \Seen flag set. 
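A client that wants to act on the response codes listed above first has to separate the bracketed code from the human-readable text. The sketch below does that in Python for the text that follows the OK/NO/BAD/PREAUTH/BYE keyword. `split_response_code` is a hypothetical helper, the regular expression is a simplification of the formal syntax (it does not handle a "]" inside the arguments), and upper-casing the code is an assumption based on the protocol's case-insensitive keywords.

```python
import re

# "[" atom, optionally a space and arguments, "]" at the start of the text.
_RESP_CODE = re.compile(r"^\[([^\]\s]+)(?: ([^\]]*))?\]\s?(.*)$")


def split_response_code(text: str):
    """Split status response text into (code, arguments, human_text).

    code and arguments are None when no response code is present.
    """
    m = _RESP_CODE.match(text)
    if m is None:
        return None, None, text
    return m.group(1).upper(), m.group(2), m.group(3)


print(split_response_code("[ALERT] System shutdown in 10 minutes"))
print(split_response_code("[UIDNEXT 4392] Predicted next UID"))
print(split_response_code("LOGIN completed"))
```

Returning the arguments as an unparsed string keeps the helper small; each code (a flag list for PERMANENTFLAGS, a number for UIDNEXT, and so on) would need its own follow-up parsing.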
- - Additional response codes defined by particular client or server - implementations SHOULD be prefixed with an "X" until they are - added to a revision of this protocol. Client implementations - SHOULD ignore response codes that they do not recognize. - -7.1.1. OK Response - - Contents: OPTIONAL response code - human-readable text - - The OK response indicates an information message from the server. - When tagged, it indicates successful completion of the associated - command. The human-readable text MAY be presented to the user as - an information message. The untagged form indicates an - - - - -Crispin Standards Track [Page 65] - -RFC 3501 IMAPv4 March 2003 - - - information-only message; the nature of the information MAY be - indicated by a response code. - - The untagged form is also used as one of three possible greetings - at connection startup. It indicates that the connection is not - yet authenticated and that a LOGIN command is needed. - - Example: S: * OK IMAP4rev1 server ready - C: A001 LOGIN fred blurdybloop - S: * OK [ALERT] System shutdown in 10 minutes - S: A001 OK LOGIN Completed - - -7.1.2. NO Response - - Contents: OPTIONAL response code - human-readable text - - The NO response indicates an operational error message from the - server. When tagged, it indicates unsuccessful completion of the - associated command. The untagged form indicates a warning; the - command can still complete successfully. The human-readable text - describes the condition. - - Example: C: A222 COPY 1:2 owatagusiam - S: * NO Disk is 98% full, please delete unnecessary data - S: A222 OK COPY completed - C: A223 COPY 3:200 blurdybloop - S: * NO Disk is 98% full, please delete unnecessary data - S: * NO Disk is 99% full, please delete unnecessary data - S: A223 NO COPY failed: disk is full - - -7.1.3. BAD Response - - Contents: OPTIONAL response code - human-readable text - - The BAD response indicates an error message from the server. 
When - tagged, it reports a protocol-level error in the client's command; - the tag indicates the command that caused the error. The untagged - form indicates a protocol-level error for which the associated - command can not be determined; it can also indicate an internal - server failure. The human-readable text describes the condition. - - - - - - - -Crispin Standards Track [Page 66] - -RFC 3501 IMAPv4 March 2003 - - - Example: C: ...very long command line... - S: * BAD Command line too long - C: ...empty line... - S: * BAD Empty command line - C: A443 EXPUNGE - S: * BAD Disk crash, attempting salvage to a new disk! - S: * OK Salvage successful, no data lost - S: A443 OK Expunge completed - - -7.1.4. PREAUTH Response - - Contents: OPTIONAL response code - human-readable text - - The PREAUTH response is always untagged, and is one of three - possible greetings at connection startup. It indicates that the - connection has already been authenticated by external means; thus - no LOGIN command is needed. - - Example: S: * PREAUTH IMAP4rev1 server logged in as Smith - - -7.1.5. BYE Response - - Contents: OPTIONAL response code - human-readable text - - The BYE response is always untagged, and indicates that the server - is about to close the connection. The human-readable text MAY be - displayed to the user in a status report by the client. The BYE - response is sent under one of four conditions: - - 1) as part of a normal logout sequence. The server will close - the connection after sending the tagged OK response to the - LOGOUT command. - - 2) as a panic shutdown announcement. The server closes the - connection immediately. - - 3) as an announcement of an inactivity autologout. The server - closes the connection immediately. - - 4) as one of three possible greetings at connection startup, - indicating that the server is not willing to accept a - connection from this client. The server closes the - connection immediately. 
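The three possible greetings (untagged OK, PREAUTH, and BYE) each put the connection in a different state, so a client can decide its next step from the first line it reads. The Python sketch below shows one way to do that. `state_after_greeting` is a hypothetical name, and the returned state names follow the connection states the RFC describes in its state and flow section, which lies outside this part of the text.

```python
def state_after_greeting(line: str) -> str:
    """Map a server greeting line to the resulting connection state."""
    parts = line.split(" ", 2)
    if len(parts) < 2 or parts[0] != "*":
        raise ValueError("greeting must be an untagged response")
    status = parts[1].upper()
    if status == "OK":
        return "not authenticated"   # a LOGIN (or AUTHENTICATE) is needed
    if status == "PREAUTH":
        return "authenticated"       # already authenticated externally
    if status == "BYE":
        return "logout"              # server refuses the connection
    raise ValueError(f"unexpected greeting status: {status}")


print(state_after_greeting("* OK IMAP4rev1 server ready"))
print(state_after_greeting("* PREAUTH IMAP4rev1 server logged in as Smith"))
```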
- - - - -Crispin Standards Track [Page 67] - -RFC 3501 IMAPv4 March 2003 - - - The difference between a BYE that occurs as part of a normal - LOGOUT sequence (the first case) and a BYE that occurs because of - a failure (the other three cases) is that the connection closes - immediately in the failure case. In all cases the client SHOULD - continue to read response data from the server until the - connection is closed; this will ensure that any pending untagged - or completion responses are read and processed. - - Example: S: * BYE Autologout; idle for too long - -7.2. Server Responses - Server and Mailbox Status - - These responses are always untagged. This is how server and mailbox - status data are transmitted from the server to the client. Many of - these responses typically result from a command with the same name. - -7.2.1. CAPABILITY Response - - Contents: capability listing - - The CAPABILITY response occurs as a result of a CAPABILITY - command. The capability listing contains a space-separated - listing of capability names that the server supports. The - capability listing MUST include the atom "IMAP4rev1". - - In addition, client and server implementations MUST implement the - STARTTLS, LOGINDISABLED, and AUTH=PLAIN (described in [IMAP-TLS]) - capabilities. See the Security Considerations section for - important information. - - A capability name which begins with "AUTH=" indicates that the - server supports that particular authentication mechanism. - - The LOGINDISABLED capability indicates that the LOGIN command is - disabled, and that the server will respond with a tagged NO - response to any attempt to use the LOGIN command even if the user - name and password are valid. An IMAP client MUST NOT issue the - LOGIN command if the server advertises the LOGINDISABLED - capability. - - Other capability names indicate that the server supports an - extension, revision, or amendment to the IMAP4rev1 protocol. 
- Server responses MUST conform to this document until the client - issues a command that uses the associated capability. - - Capability names MUST either begin with "X" or be standard or - standards-track IMAP4rev1 extensions, revisions, or amendments - registered with IANA. A server MUST NOT offer unregistered or - - - -Crispin Standards Track [Page 68] - -RFC 3501 IMAPv4 March 2003 - - - non-standard capability names, unless such names are prefixed with - an "X". - - Client implementations SHOULD NOT require any capability name - other than "IMAP4rev1", and MUST ignore any unknown capability - names. - - A server MAY send capabilities automatically, by using the - CAPABILITY response code in the initial PREAUTH or OK responses, - and by sending an updated CAPABILITY response code in the tagged - OK response as part of a successful authentication. It is - unnecessary for a client to send a separate CAPABILITY command if - it recognizes these automatic capabilities. - - Example: S: * CAPABILITY IMAP4rev1 STARTTLS AUTH=GSSAPI XPIG-LATIN - - -7.2.2. LIST Response - - Contents: name attributes - hierarchy delimiter - name - - The LIST response occurs as a result of a LIST command. It - returns a single name that matches the LIST specification. There - can be multiple LIST responses for a single LIST command. - - Four name attributes are defined: - - \Noinferiors - It is not possible for any child levels of hierarchy to exist - under this name; no child levels exist now and none can be - created in the future. - - \Noselect - It is not possible to use this name as a selectable mailbox. - - \Marked - The mailbox has been marked "interesting" by the server; the - mailbox probably contains messages that have been added since - the last time the mailbox was selected. - - \Unmarked - The mailbox does not contain any additional messages since the - last time the mailbox was selected. 
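The three LIST response fields (name attributes, hierarchy delimiter, name) can be pulled apart with a small parser. This is a sketch under stated assumptions: it handles names sent as atoms or quoted strings but not as literals, and `ListEntry` and `parse_list_response` are hypothetical names chosen for illustration. The same data format is used by LSUB, so the sketch accepts both.

```python
import re
from typing import NamedTuple, Optional

# Attributes, then a one-character quoted delimiter or NIL, then the name.
_LIST_RE = re.compile(
    r'^\* (?:LIST|LSUB) \(([^)]*)\) (?:"(\\.|[^"\\])"|NIL) (.+)$',
    re.IGNORECASE,
)

class ListEntry(NamedTuple):
    attributes: frozenset
    delimiter: Optional[str]  # None means a NIL delimiter (flat name)
    name: str

def parse_list_response(line: str) -> ListEntry:
    match = _LIST_RE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError("not a LIST/LSUB response: %r" % line)
    # Attribute names are compared case-insensitively, so fold them.
    attributes = frozenset(a.lower() for a in match.group(1).split())
    delimiter = match.group(2)
    if delimiter is not None and len(delimiter) == 2:
        delimiter = delimiter[1]  # drop the quoting backslash
    name = match.group(3)
    if len(name) >= 2 and name[0] == '"' and name[-1] == '"':
        name = re.sub(r"\\(.)", r"\1", name[1:-1])
    return ListEntry(attributes, delimiter, name)
```

A client would typically check for `\noselect` in `attributes` before offering the name as a selectable mailbox.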
- - If it is not feasible for the server to determine whether or not - the mailbox is "interesting", or if the name is a \Noselect name, - the server SHOULD NOT send either \Marked or \Unmarked. - - The hierarchy delimiter is a character used to delimit levels of - hierarchy in a mailbox name. A client can use it to create child - mailboxes, and to search higher or lower levels of naming - hierarchy. All children of a top-level hierarchy node MUST use - the same separator character. A NIL hierarchy delimiter means - that no hierarchy exists; the name is a "flat" name. - - The name represents an unambiguous left-to-right hierarchy, and - MUST be valid for use as a reference in LIST and LSUB commands. - Unless \Noselect is indicated, the name MUST also be valid as an - argument for commands, such as SELECT, that accept mailbox names. - - Example: S: * LIST (\Noselect) "/" ~/Mail/foo - - -7.2.3. LSUB Response - - Contents: name attributes - hierarchy delimiter - name - - The LSUB response occurs as a result of an LSUB command. It - returns a single name that matches the LSUB specification. There - can be multiple LSUB responses for a single LSUB command. The - data is identical in format to the LIST response. - - Example: S: * LSUB () "." #news.comp.mail.misc - - -7.2.4. STATUS Response - - Contents: name - status parenthesized list - - The STATUS response occurs as a result of a STATUS command. It - returns the mailbox name that matches the STATUS specification and - the requested mailbox status information. - - Example: S: * STATUS blurdybloop (MESSAGES 231 UIDNEXT 44292) - - -7.2.5. SEARCH Response - - Contents: zero or more numbers - - The SEARCH response occurs as a result of a SEARCH or UID SEARCH - command. The number(s) refer to those messages that match the - search criteria. 
For SEARCH, these are message sequence numbers; - for UID SEARCH, these are unique identifiers. Each number is - delimited by a space. - - Example: S: * SEARCH 2 3 6 - - -7.2.6. FLAGS Response - - Contents: flag parenthesized list - - The FLAGS response occurs as a result of a SELECT or EXAMINE - command. The flag parenthesized list identifies the flags (at a - minimum, the system-defined flags) that are applicable for this - mailbox. Flags other than the system flags can also exist, - depending on server implementation. - - The update from the FLAGS response MUST be recorded by the client. - - Example: S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft) - - -7.3. Server Responses - Mailbox Size - - These responses are always untagged. This is how changes in the size - of the mailbox are transmitted from the server to the client. - Immediately following the "*" token is a number that represents a - message count. - -7.3.1. EXISTS Response - - Contents: none - - The EXISTS response reports the number of messages in the mailbox. - This response occurs as a result of a SELECT or EXAMINE command, - and if the size of the mailbox changes (e.g., new messages). - - The update from the EXISTS response MUST be recorded by the - client. - - Example: S: * 23 EXISTS - - - - -Crispin Standards Track [Page 71] - -RFC 3501 IMAPv4 March 2003 - - -7.3.2. RECENT Response - - Contents: none - - The RECENT response reports the number of messages with the - \Recent flag set. This response occurs as a result of a SELECT or - EXAMINE command, and if the size of the mailbox changes (e.g., new - messages). - - Note: It is not guaranteed that the message sequence - numbers of recent messages will be a contiguous range of - the highest n messages in the mailbox (where n is the - value reported by the RECENT response). 
Examples of - situations in which this is not the case are: multiple - clients having the same mailbox open (the first session - to be notified will see it as recent, others will - probably see it as non-recent), and when the mailbox is - re-ordered by a non-IMAP agent. - - The only reliable way to identify recent messages is to - look at message flags to see which have the \Recent flag - set, or to do a SEARCH RECENT. - - The update from the RECENT response MUST be recorded by the - client. - - Example: S: * 5 RECENT - - -7.4. Server Responses - Message Status - - These responses are always untagged. This is how message data are - transmitted from the server to the client, often as a result of a - command with the same name. Immediately following the "*" token is a - number that represents a message sequence number. - -7.4.1. EXPUNGE Response - - Contents: none - - The EXPUNGE response reports that the specified message sequence - number has been permanently removed from the mailbox. The message - sequence number for each successive message in the mailbox is - immediately decremented by 1, and this decrement is reflected in - message sequence numbers in subsequent responses (including other - untagged EXPUNGE responses). - - - - - -Crispin Standards Track [Page 72] - -RFC 3501 IMAPv4 March 2003 - - - The EXPUNGE response also decrements the number of messages in the - mailbox; it is not necessary to send an EXISTS response with the - new value. - - As a result of the immediate decrement rule, message sequence - numbers that appear in a set of successive EXPUNGE responses - depend upon whether the messages are removed starting from lower - numbers to higher numbers, or from higher numbers to lower - numbers. 
For example, if the last 5 messages in a 9-message - mailbox are expunged, a "lower to higher" server will send five - untagged EXPUNGE responses for message sequence number 5, whereas - a "higher to lower" server will send successive untagged EXPUNGE - responses for message sequence numbers 9, 8, 7, 6, and 5. - - An EXPUNGE response MUST NOT be sent when no command is in - progress, nor while responding to a FETCH, STORE, or SEARCH - command. This rule is necessary to prevent a loss of - synchronization of message sequence numbers between client and - server. A command is not "in progress" until the complete command - has been received; in particular, a command is not "in progress" - during the negotiation of command continuation. - - Note: UID FETCH, UID STORE, and UID SEARCH are different - commands from FETCH, STORE, and SEARCH. An EXPUNGE - response MAY be sent during a UID command. - - The update from the EXPUNGE response MUST be recorded by the - client. - - Example: S: * 44 EXPUNGE - - -7.4.2. FETCH Response - - Contents: message data - - The FETCH response returns data about a message to the client. - The data are pairs of data item names and their values in - parentheses. This response occurs as the result of a FETCH or - STORE command, as well as by unilateral server decision (e.g., - flag updates). - - The current data items are: - - BODY - A form of BODYSTRUCTURE without extension data. - - BODY[<section>]<<origin octet>> - A string expressing the body contents of the specified section. - The string SHOULD be interpreted by the client according to the - content transfer encoding, body type, and subtype. - - If the origin octet is specified, this string is a substring of - the entire body contents, starting at that origin octet. This - means that BODY[]<0> MAY be truncated, but BODY[] is NEVER - truncated. 
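The immediate decrement rule for EXPUNGE (section 7.4.1) can be checked with a small simulation. The sketch below models a client's view of the mailbox as a plain list and applies each untagged EXPUNGE in the order received. `apply_expunges` is a hypothetical helper, not part of any IMAP library.

```python
def apply_expunges(mailbox, expunged_seqnums):
    """Apply untagged EXPUNGE responses, in order, to a client-side view."""
    messages = list(mailbox)
    for seq in expunged_seqnums:
        if not 1 <= seq <= len(messages):
            raise ValueError("sequence number %d out of range" % seq)
        # Removing the message shifts every later message down by one,
        # which is exactly the immediate decrement described in 7.4.1.
        del messages[seq - 1]
    return messages
```

Applied to a 9-message mailbox, both the "lower to higher" sequence 5, 5, 5, 5, 5 and the "higher to lower" sequence 9, 8, 7, 6, 5 leave the same four messages.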
- - Note: The origin octet facility MUST NOT be used by a server - in a FETCH response unless the client specifically requested - it by means of a FETCH of a BODY[<section>]<<partial>> data - item. - - 8-bit textual data is permitted if a [CHARSET] identifier is - part of the body parameter parenthesized list for this section. - Note that headers (part specifiers HEADER or MIME, or the - header portion of a MESSAGE/RFC822 part), MUST be 7-bit; 8-bit - characters are not permitted in headers. Note also that the - [RFC-2822] delimiting blank line between the header and the - body is not affected by header line subsetting; the blank line - is always included as part of header data, except in the case - of a message which has no body and no blank line. - - Non-textual data such as binary data MUST be transfer encoded - into a textual form, such as BASE64, prior to being sent to the - client. To derive the original binary data, the client MUST - decode the transfer encoded string. - - BODYSTRUCTURE - A parenthesized list that describes the [MIME-IMB] body - structure of a message. This is computed by the server by - parsing the [MIME-IMB] header fields, defaulting various fields - as necessary. - - For example, a simple text message of 48 lines and 2279 octets - can have a body structure of: ("TEXT" "PLAIN" ("CHARSET" - "US-ASCII") NIL NIL "7BIT" 2279 48) - - Multiple parts are indicated by parenthesis nesting. Instead - of a body type as the first element of the parenthesized list, - there is a sequence of one or more nested body structures. The - second element of the parenthesized list is the multipart - subtype (mixed, digest, parallel, alternative, etc.). 
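A BODYSTRUCTURE value is a nested parenthesized list, so a short recursive parser is enough to turn it into nested Python lists. This is a sketch with stated limits: it handles quoted strings, atoms, numbers, NIL, and nesting, but not literals, and `parse_parenthesized` is a name invented for this example.

```python
def parse_parenthesized(text: str):
    """Parse one IMAP value: a list, quoted string, number, NIL, or atom."""
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos] == " ":
            pos += 1

    def parse_value():
        nonlocal pos
        skip_spaces()
        if pos >= len(text):
            raise ValueError("unexpected end of input")
        ch = text[pos]
        if ch == "(":
            pos += 1
            items = []
            while True:
                skip_spaces()
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                if text[pos] == ")":
                    pos += 1
                    return items
                items.append(parse_value())
        if ch == '"':
            pos += 1
            chars = []
            while pos < len(text) and text[pos] != '"':
                if text[pos] == "\\":      # backslash quotes the next char
                    pos += 1
                    if pos >= len(text):
                        raise ValueError("dangling backslash")
                chars.append(text[pos])
                pos += 1
            if pos >= len(text):
                raise ValueError("unterminated quoted string")
            pos += 1                        # consume the closing quote
            return "".join(chars)
        start = pos
        while pos < len(text) and text[pos] not in ' ()"':
            pos += 1
        if pos == start:
            raise ValueError("unexpected character %r" % ch)
        atom = text[start:pos]
        if atom.upper() == "NIL":
            return None
        return int(atom) if atom.isascii() and atom.isdigit() else atom

    value = parse_value()
    skip_spaces()
    if pos != len(text):
        raise ValueError("trailing data after value")
    return value
```

For a multipart structure the result is a list whose leading elements are themselves lists (the nested body parts), followed by the multipart subtype string.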
- - - - - - -Crispin Standards Track [Page 74] - -RFC 3501 IMAPv4 March 2003 - - - For example, a two part message consisting of a text and a - BASE64-encoded text attachment can have a body structure of: - (("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 - 23)("TEXT" "PLAIN" ("CHARSET" "US-ASCII" "NAME" "cc.diff") - "<960723163407.20117h@cac.washington.edu>" "Compiler diff" - "BASE64" 4554 73) "MIXED") - - Extension data follows the multipart subtype. Extension data - is never returned with the BODY fetch, but can be returned with - a BODYSTRUCTURE fetch. Extension data, if present, MUST be in - the defined order. The extension data of a multipart body part - are in the following order: - - body parameter parenthesized list - A parenthesized list of attribute/value pairs [e.g., ("foo" - "bar" "baz" "rag") where "bar" is the value of "foo", and - "rag" is the value of "baz"] as defined in [MIME-IMB]. - - body disposition - A parenthesized list, consisting of a disposition type - string, followed by a parenthesized list of disposition - attribute/value pairs as defined in [DISPOSITION]. - - body language - A string or parenthesized list giving the body language - value as defined in [LANGUAGE-TAGS]. - - body location - A string list giving the body content URI as defined in - [LOCATION]. - - Any following extension data are not yet defined in this - version of the protocol. Such extension data can consist of - zero or more NILs, strings, numbers, or potentially nested - parenthesized lists of such data. Client implementations that - do a BODYSTRUCTURE fetch MUST be prepared to accept such - extension data. Server implementations MUST NOT send such - extension data until it has been defined by a revision of this - protocol. - - The basic fields of a non-multipart body part are in the - following order: - - body type - A string giving the content media type name as defined in - [MIME-IMB]. 
- - - - - -Crispin Standards Track [Page 75] - -RFC 3501 IMAPv4 March 2003 - - - body subtype - A string giving the content subtype name as defined in - [MIME-IMB]. - - body parameter parenthesized list - A parenthesized list of attribute/value pairs [e.g., ("foo" - "bar" "baz" "rag") where "bar" is the value of "foo" and - "rag" is the value of "baz"] as defined in [MIME-IMB]. - - body id - A string giving the content id as defined in [MIME-IMB]. - - body description - A string giving the content description as defined in - [MIME-IMB]. - - body encoding - A string giving the content transfer encoding as defined in - [MIME-IMB]. - - body size - A number giving the size of the body in octets. Note that - this size is the size in its transfer encoding and not the - resulting size after any decoding. - - A body type of type MESSAGE and subtype RFC822 contains, - immediately after the basic fields, the envelope structure, - body structure, and size in text lines of the encapsulated - message. - - A body type of type TEXT contains, immediately after the basic - fields, the size of the body in text lines. Note that this - size is the size in its content transfer encoding and not the - resulting size after any decoding. - - Extension data follows the basic fields and the type-specific - fields listed above. Extension data is never returned with the - BODY fetch, but can be returned with a BODYSTRUCTURE fetch. - Extension data, if present, MUST be in the defined order. - - The extension data of a non-multipart body part are in the - following order: - - body MD5 - A string giving the body MD5 value as defined in [MD5]. - - - - - - -Crispin Standards Track [Page 76] - -RFC 3501 IMAPv4 March 2003 - - - body disposition - A parenthesized list with the same content and function as - the body disposition for a multipart body part. - - body language - A string or parenthesized list giving the body language - value as defined in [LANGUAGE-TAGS]. 
- - body location - A string list giving the body content URI as defined in - [LOCATION]. - - Any following extension data are not yet defined in this - version of the protocol, and would be as described above under - multipart extension data. - - ENVELOPE - A parenthesized list that describes the envelope structure of a - message. This is computed by the server by parsing the - [RFC-2822] header into the component parts, defaulting various - fields as necessary. - - The fields of the envelope structure are in the following - order: date, subject, from, sender, reply-to, to, cc, bcc, - in-reply-to, and message-id. The date, subject, in-reply-to, - and message-id fields are strings. The from, sender, reply-to, - to, cc, and bcc fields are parenthesized lists of address - structures. - - An address structure is a parenthesized list that describes an - electronic mail address. The fields of an address structure - are in the following order: personal name, [SMTP] - at-domain-list (source route), mailbox name, and host name. - - [RFC-2822] group syntax is indicated by a special form of - address structure in which the host name field is NIL. If the - mailbox name field is also NIL, this is an end of group marker - (semi-colon in RFC 822 syntax). If the mailbox name field is - non-NIL, this is a start of group marker, and the mailbox name - field holds the group name phrase. - - If the Date, Subject, In-Reply-To, and Message-ID header lines - are absent in the [RFC-2822] header, the corresponding member - of the envelope is NIL; if these header lines are present but - empty the corresponding member of the envelope is the empty - string. - - - - - -Crispin Standards Track [Page 77] - -RFC 3501 IMAPv4 March 2003 - - - Note: some servers may return a NIL envelope member in the - "present but empty" case. Clients SHOULD treat NIL and - empty string as identical. - - Note: [RFC-2822] requires that all messages have a valid - Date header. 
Therefore, the date member in the envelope can - not be NIL or the empty string. - - Note: [RFC-2822] requires that the In-Reply-To and - Message-ID headers, if present, have non-empty content. - Therefore, the in-reply-to and message-id members in the - envelope can not be the empty string. - - If the From, To, cc, and bcc header lines are absent in the - [RFC-2822] header, or are present but empty, the corresponding - member of the envelope is NIL. - - If the Sender or Reply-To lines are absent in the [RFC-2822] - header, or are present but empty, the server sets the - corresponding member of the envelope to be the same value as - the from member (the client is not expected to know to do - this). - - Note: [RFC-2822] requires that all messages have a valid - From header. Therefore, the from, sender, and reply-to - members in the envelope can not be NIL. - - FLAGS - A parenthesized list of flags that are set for this message. - - INTERNALDATE - A string representing the internal date of the message. - - RFC822 - Equivalent to BODY[]. - - RFC822.HEADER - Equivalent to BODY[HEADER]. Note that this did not result in - \Seen being set, because RFC822.HEADER response data occurs as - a result of a FETCH of RFC822.HEADER. BODY[HEADER] response - data occurs as a result of a FETCH of BODY[HEADER] (which sets - \Seen) or BODY.PEEK[HEADER] (which does not set \Seen). - - RFC822.SIZE - A number expressing the [RFC-2822] size of the message. - - - - - - -Crispin Standards Track [Page 78] - -RFC 3501 IMAPv4 March 2003 - - - RFC822.TEXT - Equivalent to BODY[TEXT]. - - UID - A number expressing the unique identifier of the message. - - - Example: S: * 23 FETCH (FLAGS (\Seen) RFC822.SIZE 44827) - - -7.5. Server Responses - Command Continuation Request - - The command continuation request response is indicated by a "+" token - instead of a tag. This form of response indicates that the server is - ready to accept the continuation of a command from the client. 
The - remainder of this response is a line of text. - - This response is used in the AUTHENTICATE command to transmit server - data to the client, and request additional client data. This - response is also used if an argument to any command is a literal. - - The client is not permitted to send the octets of the literal unless - the server indicates that it is expected. This permits the server to - process commands and reject errors on a line-by-line basis. The - remainder of the command, including the CRLF that terminates a - command, follows the octets of the literal. If there are any - additional command arguments, the literal octets are followed by a - space and those arguments. - - Example: C: A001 LOGIN {11} - S: + Ready for additional command text - C: FRED FOOBAR {7} - S: + Ready for additional command text - C: fat man - S: A001 OK LOGIN completed - C: A044 BLURDYBLOOP {102856} - S: A044 BAD No such command as "BLURDYBLOOP" - - - - - - - - - - - - - - -Crispin Standards Track [Page 79] - -RFC 3501 IMAPv4 March 2003 - - -8. Sample IMAP4rev1 connection - - The following is a transcript of an IMAP4rev1 connection. A long - line in this sample is broken for editorial clarity. 
- -S: * OK IMAP4rev1 Service Ready -C: a001 login mrc secret -S: a001 OK LOGIN completed -C: a002 select inbox -S: * 18 EXISTS -S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft) -S: * 2 RECENT -S: * OK [UNSEEN 17] Message 17 is the first unseen message -S: * OK [UIDVALIDITY 3857529045] UIDs valid -S: a002 OK [READ-WRITE] SELECT completed -C: a003 fetch 12 full -S: * 12 FETCH (FLAGS (\Seen) INTERNALDATE "17-Jul-1996 02:44:25 -0700" - RFC822.SIZE 4286 ENVELOPE ("Wed, 17 Jul 1996 02:23:25 -0700 (PDT)" - "IMAP4rev1 WG mtg summary and minutes" - (("Terry Gray" NIL "gray" "cac.washington.edu")) - (("Terry Gray" NIL "gray" "cac.washington.edu")) - (("Terry Gray" NIL "gray" "cac.washington.edu")) - ((NIL NIL "imap" "cac.washington.edu")) - ((NIL NIL "minutes" "CNRI.Reston.VA.US") - ("John Klensin" NIL "KLENSIN" "MIT.EDU")) NIL NIL - "<B27397-0100000@cac.washington.edu>") - BODY ("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 3028 - 92)) -S: a003 OK FETCH completed -C: a004 fetch 12 body[header] -S: * 12 FETCH (BODY[HEADER] {342} -S: Date: Wed, 17 Jul 1996 02:23:25 -0700 (PDT) -S: From: Terry Gray <gray@cac.washington.edu> -S: Subject: IMAP4rev1 WG mtg summary and minutes -S: To: imap@cac.washington.edu -S: cc: minutes@CNRI.Reston.VA.US, John Klensin <KLENSIN@MIT.EDU> -S: Message-Id: <B27397-0100000@cac.washington.edu> -S: MIME-Version: 1.0 -S: Content-Type: TEXT/PLAIN; CHARSET=US-ASCII -S: -S: ) -S: a004 OK FETCH completed -C: a005 store 12 +flags \deleted -S: * 12 FETCH (FLAGS (\Seen \Deleted)) -S: a005 OK +FLAGS completed -C: a006 logout -S: * BYE IMAP4rev1 server terminating connection -S: a006 OK LOGOUT completed - - - -Crispin Standards Track [Page 80] - -RFC 3501 IMAPv4 March 2003 - - -9. Formal Syntax - - The following syntax specification uses the Augmented Backus-Naur - Form (ABNF) notation as specified in [ABNF]. 
- - In the case of alternative or optional rules in which a later rule - overlaps an earlier rule, the rule which is listed earlier MUST take - priority. For example, "\Seen" when parsed as a flag is the \Seen - flag name and not a flag-extension, even though "\Seen" can be parsed - as a flag-extension. Some, but not all, instances of this rule are - noted below. - - Note: [ABNF] rules MUST be followed strictly; in - particular: - - (1) Except as noted otherwise, all alphabetic characters - are case-insensitive. The use of upper or lower case - characters to define token strings is for editorial clarity - only. Implementations MUST accept these strings in a - case-insensitive fashion. - - (2) In all cases, SP refers to exactly one space. It is - NOT permitted to substitute TAB, insert additional spaces, - or otherwise treat SP as being equivalent to LWSP. - - (3) The ASCII NUL character, %x00, MUST NOT be used at any - time. - -address = "(" addr-name SP addr-adl SP addr-mailbox SP - addr-host ")" - -addr-adl = nstring - ; Holds route from [RFC-2822] route-addr if - ; non-NIL - -addr-host = nstring - ; NIL indicates [RFC-2822] group syntax. - ; Otherwise, holds [RFC-2822] domain name - -addr-mailbox = nstring - ; NIL indicates end of [RFC-2822] group; if - ; non-NIL and addr-host is NIL, holds - ; [RFC-2822] group name. 
- ; Otherwise, holds [RFC-2822] local-part - ; after removing [RFC-2822] quoting - - - - - - -Crispin Standards Track [Page 81] - -RFC 3501 IMAPv4 March 2003 - - -addr-name = nstring - ; If non-NIL, holds phrase from [RFC-2822] - ; mailbox after removing [RFC-2822] quoting - -append = "APPEND" SP mailbox [SP flag-list] [SP date-time] SP - literal - -astring = 1*ASTRING-CHAR / string - -ASTRING-CHAR = ATOM-CHAR / resp-specials - -atom = 1*ATOM-CHAR - -ATOM-CHAR = <any CHAR except atom-specials> - -atom-specials = "(" / ")" / "{" / SP / CTL / list-wildcards / - quoted-specials / resp-specials - -authenticate = "AUTHENTICATE" SP auth-type *(CRLF base64) - -auth-type = atom - ; Defined by [SASL] - -base64 = *(4base64-char) [base64-terminal] - -base64-char = ALPHA / DIGIT / "+" / "/" - ; Case-sensitive - -base64-terminal = (2base64-char "==") / (3base64-char "=") - -body = "(" (body-type-1part / body-type-mpart) ")" - -body-extension = nstring / number / - "(" body-extension *(SP body-extension) ")" - ; Future expansion. Client implementations - ; MUST accept body-extension fields. Server - ; implementations MUST NOT generate - ; body-extension fields except as defined by - ; future standard or standards-track - ; revisions of this specification. 
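The atom and ASTRING-CHAR rules above reduce to a character-class test. The sketch below assumes CHAR means 7-bit US-ASCII excluding NUL and CTL means the control range %x00-1F plus %x7F, as in [ABNF]. The helper names are illustrative only.

```python
# atom-specials other than SP's neighbours CTL and resp-specials:
# "(" ")" "{" SP, list-wildcards "%" "*", quoted-specials DQUOTE "\"
_ATOM_SPECIALS = frozenset('(){ %*"\\')

def is_atom_char(ch: str, allow_resp_specials: bool = False) -> bool:
    code = ord(ch)
    if code <= 0x1F or code >= 0x7F:
        return False                 # CTL, DEL, or outside 7-bit CHAR
    if ch == "]":
        return allow_resp_specials   # resp-specials: only ASTRING-CHAR
    return ch not in _ATOM_SPECIALS

def is_atom(text: str) -> bool:
    """True if text matches atom = 1*ATOM-CHAR."""
    return len(text) > 0 and all(is_atom_char(c) for c in text)

def is_bare_astring(text: str) -> bool:
    """True if text matches the 1*ASTRING-CHAR branch of astring."""
    return len(text) > 0 and all(is_atom_char(c, True) for c in text)
```

A client can use `is_bare_astring` to decide whether an argument may be sent unquoted; anything else needs the quoted or literal form of string.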
- -body-ext-1part = body-fld-md5 [SP body-fld-dsp [SP body-fld-lang - [SP body-fld-loc *(SP body-extension)]]] - ; MUST NOT be returned on non-extensible - ; "BODY" fetch - - - - - - -Crispin Standards Track [Page 82] - -RFC 3501 IMAPv4 March 2003 - - -body-ext-mpart = body-fld-param [SP body-fld-dsp [SP body-fld-lang - [SP body-fld-loc *(SP body-extension)]]] - ; MUST NOT be returned on non-extensible - ; "BODY" fetch - -body-fields = body-fld-param SP body-fld-id SP body-fld-desc SP - body-fld-enc SP body-fld-octets - -body-fld-desc = nstring - -body-fld-dsp = "(" string SP body-fld-param ")" / nil - -body-fld-enc = (DQUOTE ("7BIT" / "8BIT" / "BINARY" / "BASE64"/ - "QUOTED-PRINTABLE") DQUOTE) / string - -body-fld-id = nstring - -body-fld-lang = nstring / "(" string *(SP string) ")" - -body-fld-loc = nstring - -body-fld-lines = number - -body-fld-md5 = nstring - -body-fld-octets = number - -body-fld-param = "(" string SP string *(SP string SP string) ")" / nil - -body-type-1part = (body-type-basic / body-type-msg / body-type-text) - [SP body-ext-1part] - -body-type-basic = media-basic SP body-fields - ; MESSAGE subtype MUST NOT be "RFC822" - -body-type-mpart = 1*body SP media-subtype - [SP body-ext-mpart] - -body-type-msg = media-message SP body-fields SP envelope - SP body SP body-fld-lines - -body-type-text = media-text SP body-fields SP body-fld-lines - -capability = ("AUTH=" auth-type) / atom - ; New capabilities MUST begin with "X" or be - ; registered with IANA as standard or - ; standards-track - - - - -Crispin Standards Track [Page 83] - -RFC 3501 IMAPv4 March 2003 - - -capability-data = "CAPABILITY" *(SP capability) SP "IMAP4rev1" - *(SP capability) - ; Servers MUST implement the STARTTLS, AUTH=PLAIN, - ; and LOGINDISABLED capabilities - ; Servers which offer RFC 1730 compatibility MUST - ; list "IMAP4" as the first capability. 
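The capability-data rule above, together with the LOGINDISABLED requirement in section 7.2.1, suggests a simple client-side check. The following sketch splits an untagged CAPABILITY response into capability names and AUTH= mechanisms. `parse_capability` and `login_allowed` are hypothetical names, and capability names are folded to upper case on the assumption that they are compared case-insensitively.

```python
def parse_capability(line: str):
    """Return (capabilities, auth_mechanisms) from a CAPABILITY response."""
    words = line.strip().split(" ")
    if len(words) < 3 or words[0] != "*" or words[1].upper() != "CAPABILITY":
        raise ValueError("not a CAPABILITY response: %r" % line)
    names = [w.upper() for w in words[2:]]
    # The listing MUST include IMAP4rev1; treat its absence as an error.
    if "IMAP4REV1" not in names:
        raise ValueError("server did not advertise IMAP4rev1")
    prefix = "AUTH="
    mechanisms = frozenset(n[len(prefix):] for n in names if n.startswith(prefix))
    return frozenset(names), mechanisms

def login_allowed(capabilities) -> bool:
    """A client MUST NOT issue LOGIN when LOGINDISABLED is advertised."""
    return "LOGINDISABLED" not in capabilities
```

Unknown names such as XPIG-LATIN are simply carried along in the set, which matches the requirement that clients ignore capabilities they do not recognize.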
- -CHAR8 = %x01-ff - ; any OCTET except NUL, %x00 - -command = tag SP (command-any / command-auth / command-nonauth / - command-select) CRLF - ; Modal based on state - -command-any = "CAPABILITY" / "LOGOUT" / "NOOP" / x-command - ; Valid in all states - -command-auth = append / create / delete / examine / list / lsub / - rename / select / status / subscribe / unsubscribe - ; Valid only in Authenticated or Selected state - -command-nonauth = login / authenticate / "STARTTLS" - ; Valid only when in Not Authenticated state - -command-select = "CHECK" / "CLOSE" / "EXPUNGE" / copy / fetch / store / - uid / search - ; Valid only when in Selected state - -continue-req = "+" SP (resp-text / base64) CRLF - -copy = "COPY" SP sequence-set SP mailbox - -create = "CREATE" SP mailbox - ; Use of INBOX gives a NO error - -date = date-text / DQUOTE date-text DQUOTE - -date-day = 1*2DIGIT - ; Day of month - -date-day-fixed = (SP DIGIT) / 2DIGIT - ; Fixed-format version of date-day - -date-month = "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun" / - "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec" - -date-text = date-day "-" date-month "-" date-year - - - - -Crispin Standards Track [Page 84] - -RFC 3501 IMAPv4 March 2003 - - -date-year = 4DIGIT - -date-time = DQUOTE date-day-fixed "-" date-month "-" date-year - SP time SP zone DQUOTE - -delete = "DELETE" SP mailbox - ; Use of INBOX gives a NO error - -digit-nz = %x31-39 - ; 1-9 - -envelope = "(" env-date SP env-subject SP env-from SP - env-sender SP env-reply-to SP env-to SP env-cc SP - env-bcc SP env-in-reply-to SP env-message-id ")" - -env-bcc = "(" 1*address ")" / nil - -env-cc = "(" 1*address ")" / nil - -env-date = nstring - -env-from = "(" 1*address ")" / nil - -env-in-reply-to = nstring - -env-message-id = nstring - -env-reply-to = "(" 1*address ")" / nil - -env-sender = "(" 1*address ")" / nil - -env-subject = nstring - -env-to = "(" 1*address ")" / nil - -examine = "EXAMINE" SP mailbox - -fetch = "FETCH" SP sequence-set SP ("ALL" 
/ "FULL" / "FAST" / - fetch-att / "(" fetch-att *(SP fetch-att) ")") - -fetch-att = "ENVELOPE" / "FLAGS" / "INTERNALDATE" / - "RFC822" [".HEADER" / ".SIZE" / ".TEXT"] / - "BODY" ["STRUCTURE"] / "UID" / - "BODY" section ["<" number "." nz-number ">"] / - "BODY.PEEK" section ["<" number "." nz-number ">"] - - - - - - -Crispin Standards Track [Page 85] - -RFC 3501 IMAPv4 March 2003 - - -flag = "\Answered" / "\Flagged" / "\Deleted" / - "\Seen" / "\Draft" / flag-keyword / flag-extension - ; Does not include "\Recent" - -flag-extension = "\" atom - ; Future expansion. Client implementations - ; MUST accept flag-extension flags. Server - ; implementations MUST NOT generate - ; flag-extension flags except as defined by - ; future standard or standards-track - ; revisions of this specification. - -flag-fetch = flag / "\Recent" - -flag-keyword = atom - -flag-list = "(" [flag *(SP flag)] ")" - -flag-perm = flag / "\*" - -greeting = "*" SP (resp-cond-auth / resp-cond-bye) CRLF - -header-fld-name = astring - -header-list = "(" header-fld-name *(SP header-fld-name) ")" - -list = "LIST" SP mailbox SP list-mailbox - -list-mailbox = 1*list-char / string - -list-char = ATOM-CHAR / list-wildcards / resp-specials - -list-wildcards = "%" / "*" - -literal = "{" number "}" CRLF *CHAR8 - ; Number represents the number of CHAR8s - -login = "LOGIN" SP userid SP password - -lsub = "LSUB" SP mailbox SP list-mailbox - - - - - - - - - - - -Crispin Standards Track [Page 86] - -RFC 3501 IMAPv4 March 2003 - - -mailbox = "INBOX" / astring - ; INBOX is case-insensitive. All case variants of - ; INBOX (e.g., "iNbOx") MUST be interpreted as INBOX - ; not as an astring. An astring which consists of - ; the case-insensitive sequence "I" "N" "B" "O" "X" - ; is considered to be INBOX and not an astring. - ; Refer to section 5.1 for further - ; semantic details of mailbox names. 
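The literal rule above, combined with the command continuation request of section 7.5, means a client sends a command containing literals in stages and waits for a "+" response before each literal's octets. The sketch below builds those stages. It assumes, for simplicity, that every argument is sent as a literal, and `literal_chunks` is an illustrative name rather than an established API.

```python
def literal_chunks(tag: str, name: str, args):
    """Split a command into the pieces a client sends between "+" replies.

    Every chunk except the last ends with a {count} CRLF announcement;
    the client must wait for a continuation request before sending the
    next chunk.
    """
    chunks = []
    current = ("%s %s" % (tag, name)).encode("ascii")
    for arg in args:
        # Announce the octet count of the literal that follows.
        current += b" {%d}\r\n" % len(arg)
        chunks.append(current)
        current = bytes(arg)
    # The CRLF that terminates the command follows the last literal.
    chunks.append(current + b"\r\n")
    return chunks
```

With the arguments `b"FRED FOOBAR"` and `b"fat man"` this reproduces the three client lines of the A001 LOGIN exchange shown in section 7.5.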
- -mailbox-data = "FLAGS" SP flag-list / "LIST" SP mailbox-list / - "LSUB" SP mailbox-list / "SEARCH" *(SP nz-number) / - "STATUS" SP mailbox SP "(" [status-att-list] ")" / - number SP "EXISTS" / number SP "RECENT" - -mailbox-list = "(" [mbx-list-flags] ")" SP - (DQUOTE QUOTED-CHAR DQUOTE / nil) SP mailbox - -mbx-list-flags = *(mbx-list-oflag SP) mbx-list-sflag - *(SP mbx-list-oflag) / - mbx-list-oflag *(SP mbx-list-oflag) - -mbx-list-oflag = "\Noinferiors" / flag-extension - ; Other flags; multiple possible per LIST response - -mbx-list-sflag = "\Noselect" / "\Marked" / "\Unmarked" - ; Selectability flags; only one per LIST response - -media-basic = ((DQUOTE ("APPLICATION" / "AUDIO" / "IMAGE" / - "MESSAGE" / "VIDEO") DQUOTE) / string) SP - media-subtype - ; Defined in [MIME-IMT] - -media-message = DQUOTE "MESSAGE" DQUOTE SP DQUOTE "RFC822" DQUOTE - ; Defined in [MIME-IMT] - -media-subtype = string - ; Defined in [MIME-IMT] - -media-text = DQUOTE "TEXT" DQUOTE SP media-subtype - ; Defined in [MIME-IMT] - -message-data = nz-number SP ("EXPUNGE" / ("FETCH" SP msg-att)) - -msg-att = "(" (msg-att-dynamic / msg-att-static) - *(SP (msg-att-dynamic / msg-att-static)) ")" - -msg-att-dynamic = "FLAGS" SP "(" [flag-fetch *(SP flag-fetch)] ")" - ; MAY change for a message - - - -Crispin Standards Track [Page 87] - -RFC 3501 IMAPv4 March 2003 - - -msg-att-static = "ENVELOPE" SP envelope / "INTERNALDATE" SP date-time / - "RFC822" [".HEADER" / ".TEXT"] SP nstring / - "RFC822.SIZE" SP number / - "BODY" ["STRUCTURE"] SP body / - "BODY" section ["<" number ">"] SP nstring / - "UID" SP uniqueid - ; MUST NOT change for a message - -nil = "NIL" - -nstring = string / nil - -number = 1*DIGIT - ; Unsigned 32-bit integer - ; (0 <= n < 4,294,967,296) - -nz-number = digit-nz *DIGIT - ; Non-zero unsigned 32-bit integer - ; (0 < n < 4,294,967,296) - -password = astring - -quoted = DQUOTE *QUOTED-CHAR DQUOTE - -QUOTED-CHAR = <any TEXT-CHAR except quoted-specials> / - "\" quoted-specials - 
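The quoted and QUOTED-CHAR rules allow only two escapes inside a quoted string: a backslash before DQUOTE and a backslash before another backslash. The sketch below encodes and decodes that form. It assumes TEXT-CHAR is any 7-bit character other than NUL, CR, and LF, and the function names are made up for illustration.

```python
def quote(text: str) -> str:
    """Encode text as an IMAP quoted string, or fail if a literal is needed."""
    for ch in text:
        if not 0x01 <= ord(ch) <= 0x7F or ch in "\r\n":
            raise ValueError("character %r requires a literal" % ch)
    # Escape backslashes first so the added escapes are not doubled.
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

def unquote(quoted: str) -> str:
    """Decode an IMAP quoted string, rejecting invalid escapes."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("not a quoted string")
    body = quoted[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            i += 1
            # Only quoted-specials may follow a backslash.
            if i >= len(body) or body[i] not in '"\\':
                raise ValueError("invalid escape in quoted string")
            ch = body[i]
        elif ch == '"':
            raise ValueError("unescaped double quote")
        out.append(ch)
        i += 1
    return "".join(out)
```

Strings that contain CR, LF, or 8-bit data cannot use this form and have to be sent as literals instead.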
-quoted-specials = DQUOTE / "\" - -rename = "RENAME" SP mailbox SP mailbox - ; Use of INBOX as a destination gives a NO error - -response = *(continue-req / response-data) response-done - -response-data = "*" SP (resp-cond-state / resp-cond-bye / - mailbox-data / message-data / capability-data) CRLF - -response-done = response-tagged / response-fatal - -response-fatal = "*" SP resp-cond-bye CRLF - ; Server closes connection immediately - -response-tagged = tag SP resp-cond-state CRLF - -resp-cond-auth = ("OK" / "PREAUTH") SP resp-text - ; Authentication condition - - - - - -Crispin Standards Track [Page 88] - -RFC 3501 IMAPv4 March 2003 - - -resp-cond-bye = "BYE" SP resp-text - -resp-cond-state = ("OK" / "NO" / "BAD") SP resp-text - ; Status condition - -resp-specials = "]" - -resp-text = ["[" resp-text-code "]" SP] text - -resp-text-code = "ALERT" / - "BADCHARSET" [SP "(" astring *(SP astring) ")" ] / - capability-data / "PARSE" / - "PERMANENTFLAGS" SP "(" - [flag-perm *(SP flag-perm)] ")" / - "READ-ONLY" / "READ-WRITE" / "TRYCREATE" / - "UIDNEXT" SP nz-number / "UIDVALIDITY" SP nz-number / - "UNSEEN" SP nz-number / - atom [SP 1*<any TEXT-CHAR except "]">] - -search = "SEARCH" [SP "CHARSET" SP astring] 1*(SP search-key) - ; CHARSET argument to MUST be registered with IANA - -search-key = "ALL" / "ANSWERED" / "BCC" SP astring / - "BEFORE" SP date / "BODY" SP astring / - "CC" SP astring / "DELETED" / "FLAGGED" / - "FROM" SP astring / "KEYWORD" SP flag-keyword / - "NEW" / "OLD" / "ON" SP date / "RECENT" / "SEEN" / - "SINCE" SP date / "SUBJECT" SP astring / - "TEXT" SP astring / "TO" SP astring / - "UNANSWERED" / "UNDELETED" / "UNFLAGGED" / - "UNKEYWORD" SP flag-keyword / "UNSEEN" / - ; Above this line were in [IMAP2] - "DRAFT" / "HEADER" SP header-fld-name SP astring / - "LARGER" SP number / "NOT" SP search-key / - "OR" SP search-key SP search-key / - "SENTBEFORE" SP date / "SENTON" SP date / - "SENTSINCE" SP date / "SMALLER" SP number / - "UID" SP sequence-set / 
"UNDRAFT" / sequence-set / - "(" search-key *(SP search-key) ")" - -section = "[" [section-spec] "]" - -section-msgtext = "HEADER" / "HEADER.FIELDS" [".NOT"] SP header-list / - "TEXT" - ; top-level or MESSAGE/RFC822 part - -section-part = nz-number *("." nz-number) - ; body part nesting - - - -Crispin Standards Track [Page 89] - -RFC 3501 IMAPv4 March 2003 - - -section-spec = section-msgtext / (section-part ["." section-text]) - -section-text = section-msgtext / "MIME" - ; text other than actual body part (headers, etc.) - -select = "SELECT" SP mailbox - -seq-number = nz-number / "*" - ; message sequence number (COPY, FETCH, STORE - ; commands) or unique identifier (UID COPY, - ; UID FETCH, UID STORE commands). - ; * represents the largest number in use. In - ; the case of message sequence numbers, it is - ; the number of messages in a non-empty mailbox. - ; In the case of unique identifiers, it is the - ; unique identifier of the last message in the - ; mailbox or, if the mailbox is empty, the - ; mailbox's current UIDNEXT value. - ; The server should respond with a tagged BAD - ; response to a command that uses a message - ; sequence number greater than the number of - ; messages in the selected mailbox. This - ; includes "*" if the selected mailbox is empty. - -seq-range = seq-number ":" seq-number - ; two seq-number values and all values between - ; these two regardless of order. - ; Example: 2:4 and 4:2 are equivalent and indicate - ; values 2, 3, and 4. - ; Example: a unique identifier sequence range of - ; 3291:* includes the UID of the last message in - ; the mailbox, even if that value is less than 3291. - -sequence-set = (seq-number / seq-range) *("," sequence-set) - ; set of seq-number values, regardless of order. - ; Servers MAY coalesce overlaps and/or execute the - ; sequence in any order. 
- ; Example: a message sequence number set of - ; 2,4:7,9,12:* for a mailbox with 15 messages is - ; equivalent to 2,4,5,6,7,9,12,13,14,15 - ; Example: a message sequence number set of *:4,5:7 - ; for a mailbox with 10 messages is equivalent to - ; 10,9,8,7,6,5,4,5,6,7 and MAY be reordered and - ; overlap coalesced to be 4,5,6,7,8,9,10. - -status = "STATUS" SP mailbox SP - "(" status-att *(SP status-att) ")" - - - - -Crispin Standards Track [Page 90] - -RFC 3501 IMAPv4 March 2003 - - -status-att = "MESSAGES" / "RECENT" / "UIDNEXT" / "UIDVALIDITY" / - "UNSEEN" - -status-att-list = status-att SP number *(SP status-att SP number) - -store = "STORE" SP sequence-set SP store-att-flags - -store-att-flags = (["+" / "-"] "FLAGS" [".SILENT"]) SP - (flag-list / (flag *(SP flag))) - -string = quoted / literal - -subscribe = "SUBSCRIBE" SP mailbox - -tag = 1*<any ASTRING-CHAR except "+"> - -text = 1*TEXT-CHAR - -TEXT-CHAR = <any CHAR except CR and LF> - -time = 2DIGIT ":" 2DIGIT ":" 2DIGIT - ; Hours minutes seconds - -uid = "UID" SP (copy / fetch / search / store) - ; Unique identifiers used instead of message - ; sequence numbers - -uniqueid = nz-number - ; Strictly ascending - -unsubscribe = "UNSUBSCRIBE" SP mailbox - -userid = astring - -x-command = "X" atom <experimental command arguments> - -zone = ("+" / "-") 4DIGIT - ; Signed four-digit value of hhmm representing - ; hours and minutes east of Greenwich (that is, - ; the amount that the given time differs from - ; Universal Time). Subtracting the timezone - ; from the given time will give the UT form. - ; The Universal Time zone is "+0000". - - - - - - - - -Crispin Standards Track [Page 91] - -RFC 3501 IMAPv4 March 2003 - - -10. Author's Note - - This document is a revision or rewrite of earlier documents, and - supercedes the protocol specification in those documents: RFC 2060, - RFC 1730, unpublished IMAP2bis.TXT document, RFC 1176, and RFC 1064. - -11. 
Security Considerations - - IMAP4rev1 protocol transactions, including electronic mail data, are - sent in the clear over the network unless protection from snooping is - negotiated. This can be accomplished either by the use of STARTTLS, - negotiated privacy protection in the AUTHENTICATE command, or some - other protection mechanism. - -11.1. STARTTLS Security Considerations - - The specification of the STARTTLS command and LOGINDISABLED - capability in this document replaces that in [IMAP-TLS]. [IMAP-TLS] - remains normative for the PLAIN [SASL] authenticator. - - IMAP client and server implementations MUST implement the - TLS_RSA_WITH_RC4_128_MD5 [TLS] cipher suite, and SHOULD implement the - TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [TLS] cipher suite. This is - important as it assures that any two compliant implementations can be - configured to interoperate. All other cipher suites are OPTIONAL. - Note that this is a change from section 2.1 of [IMAP-TLS]. - - During the [TLS] negotiation, the client MUST check its understanding - of the server hostname against the server's identity as presented in - the server Certificate message, in order to prevent man-in-the-middle - attacks. If the match fails, the client SHOULD either ask for - explicit user confirmation, or terminate the connection and indicate - that the server's identity is suspect. Matching is performed - according to these rules: - - The client MUST use the server hostname it used to open the - connection as the value to compare against the server name - as expressed in the server certificate. The client MUST - NOT use any form of the server hostname derived from an - insecure remote source (e.g., insecure DNS lookup). CNAME - canonicalization is not done. - - If a subjectAltName extension of type dNSName is present in - the certificate, it SHOULD be used as the source of the - server's identity. - - Matching is case-insensitive. 
- - - - -Crispin Standards Track [Page 92] - -RFC 3501 IMAPv4 March 2003 - - - A "*" wildcard character MAY be used as the left-most name - component in the certificate. For example, *.example.com - would match a.example.com, foo.example.com, etc. but would - not match example.com. - - If the certificate contains multiple names (e.g., more than - one dNSName field), then a match with any one of the fields - is considered acceptable. - - Both the client and server MUST check the result of the STARTTLS - command and subsequent [TLS] negotiation to see whether acceptable - authentication or privacy was achieved. - -11.2. Other Security Considerations - - A server error message for an AUTHENTICATE command which fails due to - invalid credentials SHOULD NOT detail why the credentials are - invalid. - - Use of the LOGIN command sends passwords in the clear. This can be - avoided by using the AUTHENTICATE command with a [SASL] mechanism - that does not use plaintext passwords, by first negotiating - encryption via STARTTLS or some other protection mechanism. - - A server implementation MUST implement a configuration that, at the - time of authentication, requires: - (1) The STARTTLS command has been negotiated. - OR - (2) Some other mechanism that protects the session from password - snooping has been provided. - OR - (3) The following measures are in place: - (a) The LOGINDISABLED capability is advertised, and [SASL] - mechanisms (such as PLAIN) using plaintext passwords are NOT - advertised in the CAPABILITY list. - AND - (b) The LOGIN command returns an error even if the password is - correct. - AND - (c) The AUTHENTICATE command returns an error with all [SASL] - mechanisms that use plaintext passwords, even if the password - is correct. - - A server error message for a failing LOGIN command SHOULD NOT specify - that the user name, as opposed to the password, is invalid. 
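The certificate matching rules of section 11.1 can be illustrated with a short sketch. It is an illustration only and no substitute for the verification done by a TLS library: both function names are hypothetical, the certificate names are assumed to have been extracted already (for example from the dNSName entries), and the "*" wildcard is assumed to stand for exactly one left-most name component.

```python
from typing import Iterable


def name_matches(hostname: str, cert_name: str) -> bool:
    """Compare one certificate name against the hostname the client used."""
    # Matching is case-insensitive.
    host_labels = hostname.lower().split(".")
    cert_labels = cert_name.lower().split(".")
    if len(host_labels) != len(cert_labels):
        # Covers the case that *.example.com must not match example.com.
        return False
    if cert_labels[0] == "*":
        # The wildcard is only honoured as the left-most component
        # (assumption: it replaces exactly one non-empty component).
        return bool(host_labels[0]) and host_labels[1:] == cert_labels[1:]
    return host_labels == cert_labels


def hostname_matches_certificate(hostname: str, cert_names: Iterable[str]) -> bool:
    """A match with any one of the certificate's names is acceptable."""
    return any(name_matches(hostname, name) for name in cert_names)
```

The hostname passed in should be the one used to open the connection, not a name derived from an insecure lookup.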
- - A server SHOULD have mechanisms in place to limit or delay failed - AUTHENTICATE/LOGIN attempts. - - - -Crispin Standards Track [Page 93] - -RFC 3501 IMAPv4 March 2003 - - - Additional security considerations are discussed in the section - discussing the AUTHENTICATE and LOGIN commands. - -12. IANA Considerations - - IMAP4 capabilities are registered by publishing a standards track or - IESG approved experimental RFC. The registry is currently located - at: - - http://www.iana.org/assignments/imap4-capabilities - - As this specification revises the STARTTLS and LOGINDISABLED - extensions previously defined in [IMAP-TLS], the registry will be - updated accordingly. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Crispin Standards Track [Page 94] - -RFC 3501 IMAPv4 March 2003 - - -Appendices - -A. Normative References - - The following documents contain definitions or specifications that - are necessary to understand this document properly: - [ABNF] Crocker, D. and P. Overell, "Augmented BNF for - Syntax Specifications: ABNF", RFC 2234, - November 1997. - - [ANONYMOUS] Newman, C., "Anonymous SASL Mechanism", RFC - 2245, November 1997. - - [CHARSET] Freed, N. and J. Postel, "IANA Character Set - Registration Procedures", RFC 2978, October - 2000. - - [DIGEST-MD5] Leach, P. and C. Newman, "Using Digest - Authentication as a SASL Mechanism", RFC 2831, - May 2000. - - [DISPOSITION] Troost, R., Dorner, S. and K. Moore, - "Communicating Presentation Information in - Internet Messages: The Content-Disposition - Header", RFC 2183, August 1997. - - [IMAP-TLS] Newman, C., "Using TLS with IMAP, POP3 and - ACAP", RFC 2595, June 1999. - - [KEYWORDS] Bradner, S., "Key words for use in RFCs to - Indicate Requirement Levels", BCP 14, RFC 2119, - March 1997. - - [LANGUAGE-TAGS] Alvestrand, H., "Tags for the Identification of - Languages", BCP 47, RFC 3066, January 2001. - - [LOCATION] Palme, J., Hopmann, A. and N. 
Shelness, "MIME - Encapsulation of Aggregate Documents, such as - HTML (MHTML)", RFC 2557, March 1999. - - [MD5] Myers, J. and M. Rose, "The Content-MD5 Header - Field", RFC 1864, October 1995. - - - - - - - - - -Crispin Standards Track [Page 95] - -RFC 3501 IMAPv4 March 2003 - - - [MIME-HDRS] Moore, K., "MIME (Multipurpose Internet Mail - Extensions) Part Three: Message Header - Extensions for Non-ASCII Text", RFC 2047, - November 1996. - - [MIME-IMB] Freed, N. and N. Borenstein, "MIME - (Multipurpose Internet Mail Extensions) Part - One: Format of Internet Message Bodies", RFC - 2045, November 1996. - - [MIME-IMT] Freed, N. and N. Borenstein, "MIME - (Multipurpose Internet Mail Extensions) Part - Two: Media Types", RFC 2046, November 1996. - - [RFC-2822] Resnick, P., "Internet Message Format", RFC - 2822, April 2001. - - [SASL] Myers, J., "Simple Authentication and Security - Layer (SASL)", RFC 2222, October 1997. - - [TLS] Dierks, T. and C. Allen, "The TLS Protocol - Version 1.0", RFC 2246, January 1999. - - [UTF-7] Goldsmith, D. and M. Davis, "UTF-7: A Mail-Safe - Transformation Format of Unicode", RFC 2152, - May 1997. - - The following documents describe quality-of-implementation issues - that should be carefully considered when implementing this protocol: - - [IMAP-IMPLEMENTATION] Leiba, B., "IMAP Implementation - Recommendations", RFC 2683, September 1999. - - [IMAP-MULTIACCESS] Gahrns, M., "IMAP4 Multi-Accessed Mailbox - Practice", RFC 2180, July 1997. - -A.1 Informative References - - The following documents describe related protocols: - - [IMAP-DISC] Austein, R., "Synchronization Operations for - Disconnected IMAP4 Clients", Work in Progress. - - [IMAP-MODEL] Crispin, M., "Distributed Electronic Mail - Models in IMAP4", RFC 1733, December 1994. - - - - - - -Crispin Standards Track [Page 96] - -RFC 3501 IMAPv4 March 2003 - - - [ACAP] Newman, C. and J. Myers, "ACAP -- Application - Configuration Access Protocol", RFC 2244, - November 1997. 
- - [SMTP] Klensin, J., "Simple Mail Transfer Protocol", - STD 10, RFC 2821, April 2001. - - The following documents are historical or describe historical aspects - of this protocol: - - [IMAP-COMPAT] Crispin, M., "IMAP4 Compatibility with - IMAP2bis", RFC 2061, December 1996. - - [IMAP-HISTORICAL] Crispin, M., "IMAP4 Compatibility with IMAP2 - and IMAP2bis", RFC 1732, December 1994. - - [IMAP-OBSOLETE] Crispin, M., "Internet Message Access Protocol - - Obsolete Syntax", RFC 2062, December 1996. - - [IMAP2] Crispin, M., "Interactive Mail Access Protocol - - Version 2", RFC 1176, August 1990. - - [RFC-822] Crocker, D., "Standard for the Format of ARPA - Internet Text Messages", STD 11, RFC 822, - August 1982. - - [RFC-821] Postel, J., "Simple Mail Transfer Protocol", - STD 10, RFC 821, August 1982. - -B. Changes from RFC 2060 - - 1) Clarify description of unique identifiers and their semantics. - - 2) Fix the SELECT description to clarify that UIDVALIDITY is required - in the SELECT and EXAMINE responses. - - 3) Added an example of a failing search. - - 4) Correct store-att-flags: "#flag" should be "1#flag". - - 5) Made search and section rules clearer. - - 6) Correct the STORE example. - - 7) Correct "BASE645" misspelling. - - 8) Remove extraneous close parenthesis in example of two-part message - with text and BASE64 attachment. - - - -Crispin Standards Track [Page 97] - -RFC 3501 IMAPv4 March 2003 - - - 9) Remove obsolete "MAILBOX" response from mailbox-data. - - 10) A spurious "<" in the rule for mailbox-data was removed. - - 11) Add CRLF to continue-req. - - 12) Specifically exclude "]" from the atom in resp-text-code. - - 13) Clarify that clients and servers should adhere strictly to the - protocol syntax. - - 14) Emphasize in 5.2 that EXISTS can not be used to shrink a mailbox. - - 15) Add NEWNAME to resp-text-code. - - 16) Clarify that the empty string, not NIL, is used as arguments to - LIST. 
- - 17) Clarify that NIL can be returned as a hierarchy delimiter for the - empty string mailbox name argument if the mailbox namespace is flat. - - 18) Clarify that addr-mailbox and addr-name have RFC-2822 quoting - removed. - - 19) Update UTF-7 reference. - - 20) Fix example in 6.3.11. - - 21) Clarify that non-existent UIDs are ignored. - - 22) Update DISPOSITION reference. - - 23) Expand state diagram. - - 24) Clarify that partial fetch responses are only returned in - response to a partial fetch command. - - 25) Add UIDNEXT response code. Correct UIDVALIDITY definition - reference. - - 26) Further clarification of "can" vs. "MAY". - - 27) Reference RFC-2119. - - 28) Clarify that superfluous shifts are not permitted in modified - UTF-7. - - 29) Clarify that there are no implicit shifts in modified UTF-7. - - - -Crispin Standards Track [Page 98] - -RFC 3501 IMAPv4 March 2003 - - - 30) Clarify that "INBOX" in a mailbox name is always INBOX, even if - it is given as a string. - - 31) Add missing open parenthesis in media-basic grammar rule. - - 32) Correct attribute syntax in mailbox-data. - - 33) Add UIDNEXT to EXAMINE responses. - - 34) Clarify UNSEEN, PERMANENTFLAGS, UIDVALIDITY, and UIDNEXT - responses in SELECT and EXAMINE. They are required now, but weren't - in older versions. - - 35) Update references with RFC numbers. - - 36) Flush text-mime2. - - 37) Clarify that modified UTF-7 names must be case-sensitive and that - violating the convention should be avoided. - - 38) Correct UID FETCH example. - - 39) Clarify UID FETCH, UID STORE, and UID SEARCH vs. untagged EXPUNGE - responses. - - 40) Clarify the use of the word "convention". - - 41) Clarify that a command is not "in progress" until it has been - fully received (specifically, that a command is not "in progress" - during command continuation negotiation). - - 42) Clarify envelope defaulting. - - 43) Clarify that SP means one and only one space character. - - 44) Forbid silly states in LIST response. 
- - 45) Clarify that the ENVELOPE, INTERNALDATE, RFC822*, BODY*, and UID - for a message is static. - - 46) Add BADCHARSET response code. - - 47) Update formal syntax to [ABNF] conventions. - - 48) Clarify trailing hierarchy delimiter in CREATE semantics. - - 49) Clarify that the "blank line" is the [RFC-2822] delimiting blank - line. - - - -Crispin Standards Track [Page 99] - -RFC 3501 IMAPv4 March 2003 - - - 50) Clarify that RENAME should also create hierarchy as needed for - the command to complete. - - 51) Fix body-ext-mpart to not require language if disposition - present. - - 52) Clarify the RFC822.HEADER response. - - 53) Correct missing space after charset astring in search. - - 54) Correct missing quote for BADCHARSET in resp-text-code. - - 55) Clarify that ALL, FAST, and FULL preclude any other data items - appearing. - - 56) Clarify semantics of reference argument in LIST. - - 57) Clarify that a null string for SEARCH HEADER X-FOO means any - message with a header line with a field-name of X-FOO regardless of - the text of the header. - - 58) Specifically reserve 8-bit mailbox names for future use as UTF-8. - - 59) It is not an error for the client to store a flag that is not in - the PERMANENTFLAGS list; however, the server will either ignore the - change or make the change in the session only. - - 60) Correct/clarify the text regarding superfluous shifts. - - 61) Correct typographic errors in the "Changes" section. - - 62) Clarify that STATUS must not be used to check for new messages in - the selected mailbox - - 63) Clarify LSUB behavior with "%" wildcard. - - 64) Change AUTHORIZATION to AUTHENTICATE in section 7.5. - - 65) Clarify description of multipart body type. - - 66) Clarify that STORE FLAGS does not affect \Recent. - - 67) Change "west" to "east" in description of timezone. - - 68) Clarify that commands which break command pipelining must wait - for a completion result response. - - 69) Clarify that EXAMINE does not affect \Recent. 
- - - -Crispin Standards Track [Page 100] - -RFC 3501 IMAPv4 March 2003 - - - 70) Make description of MIME structure consistent. - - 71) Clarify that date searches disregard the time and timezone of the - INTERNALDATE or Date: header. In other words, "ON 13-APR-2000" means - messages with an INTERNALDATE text which starts with "13-APR-2000", - even if timezone differential from the local timezone is sufficient - to move that INTERNALDATE into the previous or next day. - - 72) Clarify that the header fetches don't add a blank line if one - isn't in the [RFC-2822] message. - - 73) Clarify (in discussion of UIDs) that messages are immutable. - - 74) Add an example of CHARSET searching. - - 75) Clarify in SEARCH that keywords are a type of flag. - - 76) Clarify the mandatory nature of the SELECT data responses. - - 77) Add optional CAPABILITY response code in the initial OK or - PREAUTH. - - 78) Add note that server can send an untagged CAPABILITY command as - part of the responses to AUTHENTICATE and LOGIN. - - 79) Remove statement about it being unnecessary to issue a CAPABILITY - command more than once in a connection. That statement is no longer - true. - - 80) Clarify that untagged EXPUNGE decrements the number of messages - in the mailbox. - - 81) Fix definition of "body" (concatenation has tighter binding than - alternation). - - 82) Add a new "Special Notes to Implementors" section with reference - to [IMAP-IMPLEMENTATION]. - - 83) Clarify that an untagged CAPABILITY response to an AUTHENTICATE - command should only be done if a security layer was not negotiated. - - 84) Change the definition of atom to exclude "]". Update astring to - include "]" for compatibility with the past. Remove resp-text-atom. - - 85) Remove NEWNAME. It can't work because mailbox names can be - literals and can include "]". Functionality can be addressed via - referrals. 
- - - - -Crispin Standards Track [Page 101] - -RFC 3501 IMAPv4 March 2003 - - - 86) Move modified UTF-7 rationale in order to have more logical - paragraph flow. - - 87) Clarify UID uniqueness guarantees with the use of MUST. - - 88) Note that clients should read response data until the connection - is closed instead of immediately closing on a BYE. - - 89) Change RFC-822 references to RFC-2822. - - 90) Clarify that RFC-2822 should be followed instead of RFC-822. - - 91) Change recommendation of optional automatic capabilities in LOGIN - and AUTHENTICATE to use the CAPABILITY response code in the tagged - OK. This is more interoperable than an unsolicited untagged - CAPABILITY response. - - 92) STARTTLS and AUTH=PLAIN are mandatory to implement; add - recommendations for other [SASL] mechanisms. - - 93) Clarify that a "connection" (as opposed to "server" or "command") - is in one of the four states. - - 94) Clarify that a failed or rejected command does not change state. - - 95) Split references between normative and informative. - - 96) Discuss authentication failure issues in security section. - - 97) Clarify that a data item is not necessarily of only one data - type. - - 98) Clarify that sequence ranges are independent of order. - - 99) Change an example to clarify that superfluous shifts in - Modified-UTF7 can not be fixed just by omitting the shift. The - entire string must be recalculated. - - 100) Change Envelope Structure definition since [RFC-2822] uses - "envelope" to refer to the [SMTP] envelope and not the envelope data - that appears in the [RFC-2822] header. - - 101) Expand on RFC822.HEADER response data vs. BODY[HEADER]. - - 102) Clarify Logout state semantics, change ASCII art. - - 103) Security changes to comply with IESG requirements. - - - - -Crispin Standards Track [Page 102] - -RFC 3501 IMAPv4 March 2003 - - - 104) Add definition for body URI. - - 105) Break sequence range definition into three rules, with rewritten - descriptions for each. 
- - 106) Move STARTTLS and LOGINDISABLED here from [IMAP-TLS]. - - 107) Add IANA Considerations section. - - 108) Clarify valid client assumptions for new message UIDs vs. - UIDNEXT. - - 109) Clarify that changes to permanentflags affect concurrent - sessions as well as subsequent sessions. - - 110) Clarify that authenticated state can be entered by the CLOSE - command. - - 111) Emphasize that SELECT and EXAMINE are the exceptions to the rule - that a failing command does not change state. - - 112) Clarify that newly-appended messages have the Recent flag set. - - 113) Clarify that newly-copied messages SHOULD have the Recent flag - set. - - 114) Clarify that UID commands always return the UID in FETCH - responses. - -C. Key Word Index - - +FLAGS <flag list> (store command data item) ............... 59 - +FLAGS.SILENT <flag list> (store command data item) ........ 59 - -FLAGS <flag list> (store command data item) ............... 59 - -FLAGS.SILENT <flag list> (store command data item) ........ 59 - ALERT (response code) ...................................... 64 - ALL (fetch item) ........................................... 55 - ALL (search key) ........................................... 50 - ANSWERED (search key) ...................................... 50 - APPEND (command) ........................................... 45 - AUTHENTICATE (command) ..................................... 27 - BAD (response) ............................................. 66 - BADCHARSET (response code) ................................. 64 - BCC <string> (search key) .................................. 51 - BEFORE <date> (search key) ................................. 51 - BODY (fetch item) .......................................... 55 - BODY (fetch result) ........................................ 73 - BODY <string> (search key) ................................. 
51 - - - -Crispin Standards Track [Page 103] - -RFC 3501 IMAPv4 March 2003 - - - BODY.PEEK[<section>]<<partial>> (fetch item) ............... 57 - BODYSTRUCTURE (fetch item) ................................. 57 - BODYSTRUCTURE (fetch result) ............................... 74 - BODY[<section>]<<origin octet>> (fetch result) ............. 74 - BODY[<section>]<<partial>> (fetch item) .................... 55 - BYE (response) ............................................. 67 - Body Structure (message attribute) ......................... 12 - CAPABILITY (command) ....................................... 24 - CAPABILITY (response code) ................................. 64 - CAPABILITY (response) ...................................... 68 - CC <string> (search key) ................................... 51 - CHECK (command) ............................................ 47 - CLOSE (command) ............................................ 48 - COPY (command) ............................................. 59 - CREATE (command) ........................................... 34 - DELETE (command) ........................................... 35 - DELETED (search key) ....................................... 51 - DRAFT (search key) ......................................... 51 - ENVELOPE (fetch item) ...................................... 57 - ENVELOPE (fetch result) .................................... 77 - EXAMINE (command) .......................................... 33 - EXISTS (response) .......................................... 71 - EXPUNGE (command) .......................................... 48 - EXPUNGE (response) ......................................... 72 - Envelope Structure (message attribute) ..................... 12 - FAST (fetch item) .......................................... 55 - FETCH (command) ............................................ 54 - FETCH (response) ........................................... 73 - FLAGGED (search key) ....................................... 
51 - FLAGS (fetch item) ......................................... 57 - FLAGS (fetch result) ....................................... 78 - FLAGS (response) ........................................... 71 - FLAGS <flag list> (store command data item) ................ 59 - FLAGS.SILENT <flag list> (store command data item) ......... 59 - FROM <string> (search key) ................................. 51 - FULL (fetch item) .......................................... 55 - Flags (message attribute) .................................. 11 - HEADER (part specifier) .................................... 55 - HEADER <field-name> <string> (search key) .................. 51 - HEADER.FIELDS <header-list> (part specifier) ............... 55 - HEADER.FIELDS.NOT <header-list> (part specifier) ........... 55 - INTERNALDATE (fetch item) .................................. 57 - INTERNALDATE (fetch result) ................................ 78 - Internal Date (message attribute) .......................... 12 - KEYWORD <flag> (search key) ................................ 51 - Keyword (type of flag) ..................................... 11 - LARGER <n> (search key) .................................... 51 - LIST (command) ............................................. 40 - - - -Crispin Standards Track [Page 104] - -RFC 3501 IMAPv4 March 2003 - - - LIST (response) ............................................ 69 - LOGIN (command) ............................................ 30 - LOGOUT (command) ........................................... 25 - LSUB (command) ............................................. 43 - LSUB (response) ............................................ 70 - MAY (specification requirement term) ....................... 4 - MESSAGES (status item) ..................................... 45 - MIME (part specifier) ...................................... 56 - MUST (specification requirement term) ...................... 4 - MUST NOT (specification requirement term) .................. 
4 - Message Sequence Number (message attribute) ................ 10 - NEW (search key) ........................................... 51 - NO (response) .............................................. 66 - NOOP (command) ............................................. 25 - NOT <search-key> (search key) .............................. 52 - OK (response) .............................................. 65 - OLD (search key) ........................................... 52 - ON <date> (search key) ..................................... 52 - OPTIONAL (specification requirement term) .................. 4 - OR <search-key1> <search-key2> (search key) ................ 52 - PARSE (response code) ...................................... 64 - PERMANENTFLAGS (response code) ............................. 64 - PREAUTH (response) ......................................... 67 - Permanent Flag (class of flag) ............................. 12 - READ-ONLY (response code) .................................. 65 - READ-WRITE (response code) ................................. 65 - RECENT (response) .......................................... 72 - RECENT (search key) ........................................ 52 - RECENT (status item) ....................................... 45 - RENAME (command) ........................................... 37 - REQUIRED (specification requirement term) .................. 4 - RFC822 (fetch item) ........................................ 57 - RFC822 (fetch result) ...................................... 78 - RFC822.HEADER (fetch item) ................................. 57 - RFC822.HEADER (fetch result) ............................... 78 - RFC822.SIZE (fetch item) ................................... 57 - RFC822.SIZE (fetch result) ................................. 78 - RFC822.TEXT (fetch item) ................................... 58 - RFC822.TEXT (fetch result) ................................. 79 - SEARCH (command) ........................................... 
49 - SEARCH (response) .......................................... 71 - SEEN (search key) .......................................... 52 - SELECT (command) ........................................... 31 - SENTBEFORE <date> (search key) ............................. 52 - SENTON <date> (search key) ................................. 52 - SENTSINCE <date> (search key) .............................. 52 - SHOULD (specification requirement term) .................... 4 - SHOULD NOT (specification requirement term) ................ 4 - - - -Crispin Standards Track [Page 105] - -RFC 3501 IMAPv4 March 2003 - - - SINCE <date> (search key) .................................. 52 - SMALLER <n> (search key) ................................... 52 - STARTTLS (command) ......................................... 27 - STATUS (command) ........................................... 44 - STATUS (response) .......................................... 70 - STORE (command) ............................................ 58 - SUBJECT <string> (search key) .............................. 53 - SUBSCRIBE (command) ........................................ 38 - Session Flag (class of flag) ............................... 12 - System Flag (type of flag) ................................. 11 - TEXT (part specifier) ...................................... 56 - TEXT <string> (search key) ................................. 53 - TO <string> (search key) ................................... 53 - TRYCREATE (response code) .................................. 65 - UID (command) .............................................. 60 - UID (fetch item) ........................................... 58 - UID (fetch result) ......................................... 79 - UID <sequence set> (search key) ............................ 53 - UIDNEXT (response code) .................................... 65 - UIDNEXT (status item) ...................................... 45 - UIDVALIDITY (response code) ................................ 
65 - UIDVALIDITY (status item) .................................. 45 - UNANSWERED (search key) .................................... 53 - UNDELETED (search key) ..................................... 53 - UNDRAFT (search key) ....................................... 53 - UNFLAGGED (search key) ..................................... 53 - UNKEYWORD <flag> (search key) .............................. 53 - UNSEEN (response code) ..................................... 65 - UNSEEN (search key) ........................................ 53 - UNSEEN (status item) ....................................... 45 - UNSUBSCRIBE (command) ...................................... 39 - Unique Identifier (UID) (message attribute) ................ 8 - X<atom> (command) .......................................... 62 - [RFC-2822] Size (message attribute) ........................ 12 - \Answered (system flag) .................................... 11 - \Deleted (system flag) ..................................... 11 - \Draft (system flag) ....................................... 11 - \Flagged (system flag) ..................................... 11 - \Marked (mailbox name attribute) ........................... 69 - \Noinferiors (mailbox name attribute) ...................... 69 - \Noselect (mailbox name attribute) ......................... 69 - \Recent (system flag) ...................................... 11 - \Seen (system flag) ........................................ 11 - \Unmarked (mailbox name attribute) ......................... 69 - - - - - - - -Crispin Standards Track [Page 106] - -RFC 3501 IMAPv4 March 2003 - - -Author's Address - - Mark R. 
Crispin
-   Networks and Distributed Computing
-   University of Washington
-   4545 15th Avenue NE
-   Seattle, WA 98105-4527
-
-   Phone: (206) 543-5762
-
-   EMail: MRC@CAC.Washington.EDU
-
-Full Copyright Statement
-
-   Copyright (C) The Internet Society (2003). All Rights Reserved.
-
-   This document and translations of it may be copied and furnished to
-   others, and derivative works that comment on or otherwise explain it
-   or assist in its implementation may be prepared, copied, published
-   and distributed, in whole or in part, without restriction of any
-   kind, provided that the above copyright notice and this paragraph are
-   included on all such copies and derivative works. However, this
-   document itself may not be modified in any way, such as by removing
-   the copyright notice or references to the Internet Society or other
-   Internet organizations, except as needed for the purpose of
-   developing Internet standards in which case the procedures for
-   copyrights defined in the Internet Standards process must be
-   followed, or as required to translate it into languages other than
-   English.
-
-   The limited permissions granted above are perpetual and will not be
-   revoked by the Internet Society or its successors or assigns. This
-   document and the information contained herein is provided on an "AS
-   IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
-   FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
-   LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL
-   NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY
-   OR FITNESS FOR A PARTICULAR PURPOSE.
-
-Acknowledgement
-
-   Funding for the RFC Editor function is currently provided by the
-   Internet Society.
- - - - - - - - - - - - - - - - - - - - -Crispin Standards Track [Page 108] - diff --git a/proto/rfc4616.txt b/proto/rfc4616.txt @@ -1,619 +0,0 @@ - - - - - - -Network Working Group K. Zeilenga, Ed. -Request for Comments: 4616 OpenLDAP Foundation -Updates: 2595 August 2006 -Category: Standards Track - - - The PLAIN Simple Authentication and Security Layer (SASL) Mechanism - -Status of This Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (2006). - -Abstract - - This document defines a simple clear-text user/password Simple - Authentication and Security Layer (SASL) mechanism called the PLAIN - mechanism. The PLAIN mechanism is intended to be used, in - combination with data confidentiality services provided by a lower - layer, in protocols that lack a simple password authentication - command. - - - - - - - - - - - - - - - - - - - - - - - -Zeilenga Standards Track [Page 1] - -RFC 4616 The PLAIN SASL Mechanism August 2006 - - -1. Introduction - - Clear-text, multiple-use passwords are simple, interoperate with - almost all existing operating system authentication databases, and - are useful for a smooth transition to a more secure password-based - authentication mechanism. The drawback is that they are unacceptable - for use over network connections where data confidentiality is not - ensured. - - This document defines the PLAIN Simple Authentication and Security - Layer ([SASL]) mechanism for use in protocols with no clear-text - login command (e.g., [ACAP] or [SMTP-AUTH]). This document updates - RFC 2595, replacing Section 6. Changes since RFC 2595 are detailed - in Appendix A. 
-
-   The name associated with this mechanism is "PLAIN".
-
-   The PLAIN SASL mechanism does not provide a security layer.
-
-   The PLAIN mechanism should not be used without adequate data security
-   protection as this mechanism affords no integrity or confidentiality
-   protections itself. The mechanism is intended to be used with data
-   security protections provided by the application-layer protocol,
-   generally through its use of Transport Layer Security ([TLS])
-   services.
-
-   By default, implementations SHOULD advertise and make use of the
-   PLAIN mechanism only when adequate data security services are in
-   place. Specifications for IETF protocols that indicate that this
-   mechanism is an applicable authentication mechanism MUST mandate that
-   implementations support a strong data security service, such as TLS.
-
-   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
-   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
-   document are to be interpreted as described in [Keywords].
-
-2. PLAIN SASL Mechanism
-
-   The mechanism consists of a single message, a string of [UTF-8]
-   encoded [Unicode] characters, from the client to the server. The
-   client presents the authorization identity (identity to act as),
-   followed by a NUL (U+0000) character, followed by the authentication
-   identity (identity whose password will be used), followed by a NUL
-   (U+0000) character, followed by the clear-text password. As with
-   other SASL mechanisms, the client does not provide an authorization
-   identity when it wishes the server to derive an identity from the
-   credentials and use that as the authorization identity.
-
-   The formal grammar for the client message using Augmented BNF [ABNF]
-   follows.
-
-      message   = [authzid] UTF8NUL authcid UTF8NUL passwd
-      authcid   = 1*SAFE ; MUST accept up to 255 octets
-      authzid   = 1*SAFE ; MUST accept up to 255 octets
-      passwd    = 1*SAFE ; MUST accept up to 255 octets
-      UTF8NUL   = %x00 ; UTF-8 encoded NUL character
-
-      SAFE      = UTF1 / UTF2 / UTF3 / UTF4
-                  ;; any UTF-8 encoded Unicode character except NUL
-
-      UTF1      = %x01-7F ;; except NUL
-      UTF2      = %xC2-DF UTF0
-      UTF3      = %xE0 %xA0-BF UTF0 / %xE1-EC 2(UTF0) /
-                  %xED %x80-9F UTF0 / %xEE-EF 2(UTF0)
-      UTF4      = %xF0 %x90-BF 2(UTF0) / %xF1-F3 3(UTF0) /
-                  %xF4 %x80-8F 2(UTF0)
-      UTF0      = %x80-BF
-
-   The authorization identity (authzid), authentication identity
-   (authcid), password (passwd), and NUL character delimiters SHALL be
-   transferred as [UTF-8] encoded strings of [Unicode] characters. As
-   the NUL (U+0000) character is used as a delimiter, the NUL (U+0000)
-   character MUST NOT appear in authzid, authcid, or passwd productions.
-
-   The form of the authzid production is specific to the
-   application-level protocol's SASL profile [SASL]. The authcid and
-   passwd productions are form-free. Use of non-visible characters or
-   characters that a user may be unable to enter on some keyboards is
-   discouraged.
-
-   Servers MUST be capable of accepting authzid, authcid, and passwd
-   productions up to and including 255 octets. It is noted that the
-   UTF-8 encoding of a Unicode character may be as long as 4 octets.
-
-   Upon receipt of the message, the server will verify the presented (in
-   the message) authentication identity (authcid) and password (passwd)
-   with the system authentication database, and it will verify that the
-   authentication credentials permit the client to act as the (presented
-   or derived) authorization identity (authzid). If both steps succeed,
-   the user is authenticated.
-
-   The presented authentication identity and password strings, as well
-   as the database authentication identity and password strings, are to
-   be prepared before being used in the verification process.
The - [SASLPrep] profile of the [StringPrep] algorithm is the RECOMMENDED - preparation algorithm. The SASLprep preparation algorithm is - - - -Zeilenga Standards Track [Page 3] - -RFC 4616 The PLAIN SASL Mechanism August 2006 - - - recommended to improve the likelihood that comparisons behave in an - expected manner. The SASLprep preparation algorithm is not mandatory - so as to allow the server to employ other preparation algorithms - (including none) when appropriate. For instance, use of a different - preparation algorithm may be necessary for the server to interoperate - with an external system. - - When preparing the presented strings using [SASLPrep], the presented - strings are to be treated as "query" strings (Section 7 of - [StringPrep]) and hence unassigned code points are allowed to appear - in their prepared output. When preparing the database strings using - [SASLPrep], the database strings are to be treated as "stored" - strings (Section 7 of [StringPrep]) and hence unassigned code points - are prohibited from appearing in their prepared output. - - Regardless of the preparation algorithm used, if the output of a - non-invertible function (e.g., hash) of the expected string is - stored, the string MUST be prepared before input to that function. - - Regardless of the preparation algorithm used, if preparation fails or - results in an empty string, verification SHALL fail. - - When no authorization identity is provided, the server derives an - authorization identity from the prepared representation of the - provided authentication identity string. This ensures that the - derivation of different representations of the authentication - identity produces the same authorization identity. - - The server MAY use the credentials to initialize any new - authentication database, such as one suitable for [CRAM-MD5] or - [DIGEST-MD5]. - -3. 
Pseudo-Code
-
-   This section provides pseudo-code illustrating the verification
-   process (using hashed passwords and the SASLprep preparation
-   function) discussed above. This section is not definitive.
-
-   boolean Verify(string authzid, string authcid, string passwd) {
-     string pAuthcid = SASLprep(authcid, true); # prepare authcid
-     string pPasswd = SASLprep(passwd, true);   # prepare passwd
-     if (pAuthcid == NULL || pPasswd == NULL) {
-       return false;     # preparation failed
-     }
-     if (pAuthcid == "" || pPasswd == "") {
-       return false;     # empty prepared string
-     }
-
-     storedHash = FetchPasswordHash(pAuthcid);
-     if (storedHash == NULL || storedHash == "") {
-       return false;     # error or unknown authcid
-     }
-
-     if (!Compare(storedHash, Hash(pPasswd))) {
-       return false;     # incorrect password
-     }
-
-     if (authzid == NULL ) {
-       authzid = DeriveAuthzid(pAuthcid);
-       if (authzid == NULL || authzid == "") {
-         return false;   # could not derive authzid
-       }
-     }
-
-     if (!Authorize(pAuthcid, authzid)) {
-       return false;     # not authorized
-     }
-
-     return true;
-   }
-
-   The second parameter of the SASLprep function, when true, indicates
-   that unassigned code points are allowed in the input. When the
-   SASLprep function is called to prepare the password prior to
-   computing the stored hash, the second parameter would be false.
-
-   The second parameter provided to the Authorize function is not
-   prepared by this code. The application-level SASL profile should be
-   consulted to determine what, if any, preparation is necessary.
-
-   Note that the DeriveAuthzid and Authorize functions (whether
-   implemented as one function or two, whether designed in a manner in
-   which these functions or whether the mechanism implementation can be
-   reused elsewhere) require knowledge and understanding of mechanism
-   and the application-level protocol specification and/or
-   implementation details to implement.
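The client side of the exchange, i.e. the message layout from Section 2 ([authzid] NUL authcid NUL passwd), can be sketched in C as follows. This is a minimal illustration and not code from rohrpost or from the RFC: the name plain_build is hypothetical, the inputs are assumed to be NUL-terminated C strings that are already valid UTF-8, and neither SASLprep nor the 255-octet considerations are handled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write a PLAIN client message into buf.
 * Layout: [authzid] NUL authcid NUL passwd
 * authzid may be NULL or "" so that the server derives the identity.
 * Returns the message length in octets, or 0 when an argument is
 * missing or empty, or when buf is too small.  The message itself is
 * not NUL-terminated, because NUL is the field delimiter.
 */
size_t
plain_build(char *buf, size_t buflen, const char *authzid,
    const char *authcid, const char *passwd)
{
	size_t zlen, clen, plen, total, off = 0;

	if (buf == NULL || authcid == NULL || passwd == NULL)
		return 0;
	zlen = authzid != NULL ? strlen(authzid) : 0;
	clen = strlen(authcid);
	plen = strlen(passwd);
	/* authcid and passwd are 1*SAFE in the grammar: never empty. */
	if (clen == 0 || plen == 0)
		return 0;
	total = zlen + 1 + clen + 1 + plen;
	if (total > buflen)
		return 0;

	if (zlen > 0) {
		memcpy(buf, authzid, zlen);
		off = zlen;
	}
	buf[off++] = '\0';			/* first delimiter */
	memcpy(buf + off, authcid, clen);
	off += clen;
	buf[off++] = '\0';			/* second delimiter */
	memcpy(buf + off, passwd, plen);
	return total;
}
```

Called with no authzid, authcid "tim" and password "tanstaaftanstaaf", the sketch yields 21 octets, which matches the {21} literal in the first example of Section 4. In protocols that carry the message in text form (for example IMAP AUTHENTICATE) the octets would still need to be base64-encoded by the caller.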
-
-   Note that the Authorize function outcome is clearly dependent on
-   details of the local authorization model and policy. Both functions
-   may be dependent on other factors as well.
-
-4. Examples
-
-   This section provides examples of PLAIN authentication exchanges.
-   The examples are intended to help the readers understand the above
-   text. The examples are not definitive.
-
-   "C:" and "S:" indicate lines sent by the client and server,
-   respectively. "<NUL>" represents a single NUL (U+0000) character.
-   The Application Configuration Access Protocol ([ACAP]) is used in the
-   examples.
-
-   The first example shows how the PLAIN mechanism might be used for
-   user authentication.
-
-   S: * ACAP (SASL "CRAM-MD5") (STARTTLS)
-   C: a001 STARTTLS
-   S: a001 OK "Begin TLS negotiation now"
-   <TLS negotiation, further commands are under TLS layer>
-   S: * ACAP (SASL "CRAM-MD5" "PLAIN")
-   C: a002 AUTHENTICATE "PLAIN"
-   S: + ""
-   C: {21}
-   C: <NUL>tim<NUL>tanstaaftanstaaf
-   S: a002 OK "Authenticated"
-
-   The second example shows how the PLAIN mechanism might be used to
-   attempt to assume the identity of another user. In this example, the
-   server rejects the request. Also, this example makes use of the
-   protocol's optional initial response capability to eliminate a
-   round-trip.
-
-   S: * ACAP (SASL "CRAM-MD5") (STARTTLS)
-   C: a001 STARTTLS
-   S: a001 OK "Begin TLS negotiation now"
-   <TLS negotiation, further commands are under TLS layer>
-   S: * ACAP (SASL "CRAM-MD5" "PLAIN")
-   C: a002 AUTHENTICATE "PLAIN" {20+}
-   C: Ursel<NUL>Kurt<NUL>xipj3plmq
-   S: a002 NO "Not authorized to requested authorization identity"
-
-5. Security Considerations
-
-   As the PLAIN mechanism itself provides no integrity or
-   confidentiality protections, it should not be used without adequate
-   external data security protection, such as TLS services provided by
-   many application-layer protocols.
By default, implementations SHOULD - NOT advertise and SHOULD NOT make use of the PLAIN mechanism unless - adequate data security services are in place. - - - -Zeilenga Standards Track [Page 6] - -RFC 4616 The PLAIN SASL Mechanism August 2006 - - - When the PLAIN mechanism is used, the server gains the ability to - impersonate the user to all services with the same password - regardless of any encryption provided by TLS or other confidentiality - protection mechanisms. Whereas many other authentication mechanisms - have similar weaknesses, stronger SASL mechanisms address this issue. - Clients are encouraged to have an operational mode where all - mechanisms that are likely to reveal the user's password to the - server are disabled. - - General [SASL] security considerations apply to this mechanism. - - Unicode, [UTF-8], and [StringPrep] security considerations also - apply. - -6. IANA Considerations - - The SASL Mechanism registry [IANA-SASL] entry for the PLAIN mechanism - has been updated by the IANA to reflect that this document now - provides its technical specification. - - To: iana@iana.org - Subject: Updated Registration of SASL mechanism PLAIN - - SASL mechanism name: PLAIN - Security considerations: See RFC 4616. - Published specification (optional, recommended): RFC 4616 - Person & email address to contact for further information: - Kurt Zeilenga <kurt@openldap.org> - IETF SASL WG <ietf-sasl@imc.org> - Intended usage: COMMON - Author/Change controller: IESG <iesg@ietf.org> - Note: Updates existing entry for PLAIN - -7. Acknowledgements - - This document is a revision of RFC 2595 by Chris Newman. Portions of - the grammar defined in Section 2 were borrowed from [UTF-8] by - Francois Yergeau. - - This document is a product of the IETF Simple Authentication and - Security Layer (SASL) Working Group. - - - - - - - - - - -Zeilenga Standards Track [Page 7] - -RFC 4616 The PLAIN SASL Mechanism August 2006 - - -8. Normative References - - [ABNF] Crocker, D., Ed. 
and P. Overell, "Augmented BNF for - Syntax Specifications: ABNF", RFC 4234, October 2005. - - [Keywords] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [SASL] Melnikov, A., Ed., and K. Zeilenga, Ed., "Simple - Authentication and Security Layer (SASL)", RFC 4422, - June 2006. - - [SASLPrep] Zeilenga, K., "SASLprep: Stringprep Profile for User - Names and Passwords", RFC 4013, February 2005. - - [StringPrep] Hoffman, P. and M. Blanchet, "Preparation of - Internationalized Strings ("stringprep")", RFC 3454, - December 2002. - - [Unicode] The Unicode Consortium, "The Unicode Standard, Version - 3.2.0" is defined by "The Unicode Standard, Version - 3.0" (Reading, MA, Addison-Wesley, 2000. ISBN 0-201- - 61633-5), as amended by the "Unicode Standard Annex - #27: Unicode 3.1" - (http://www.unicode.org/reports/tr27/) and by the - "Unicode Standard Annex #28: Unicode 3.2" - (http://www.unicode.org/reports/tr28/). - - [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO - 10646", STD 63, RFC 3629, November 2003. - - [TLS] Dierks, T. and E. Rescorla, "The Transport Layer - Security (TLS) Protocol Version 1.1", RFC 4346, April - 2006. - -9. Informative References - - [ACAP] Newman, C. and J. Myers, "ACAP -- Application - Configuration Access Protocol", RFC 2244, November - 1997. - - [CRAM-MD5] Nerenberg, L., Ed., "The CRAM-MD5 SASL Mechanism", Work - in Progress, June 2006. - - [DIGEST-MD5] Melnikov, A., Ed., "Using Digest Authentication as a - SASL Mechanism", Work in Progress, June 2006. - - - - - -Zeilenga Standards Track [Page 8] - -RFC 4616 The PLAIN SASL Mechanism August 2006 - - - [IANA-SASL] IANA, "SIMPLE AUTHENTICATION AND SECURITY LAYER (SASL) - MECHANISMS", - <http://www.iana.org/assignments/sasl-mechanisms>. - - [SMTP-AUTH] Myers, J., "SMTP Service Extension for Authentication", - RFC 2554, March 1999. 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Zeilenga Standards Track [Page 9] - -RFC 4616 The PLAIN SASL Mechanism August 2006 - - -Appendix A. Changes since RFC 2595 - - This appendix is non-normative. - - This document replaces Section 6 of RFC 2595. - - The specification details how the server is to compare client- - provided character strings with stored character strings. - - The ABNF grammar was updated. In particular, the grammar now allows - LINE FEED (U+000A) and CARRIAGE RETURN (U+000D) characters in the - authzid, authcid, passwd productions. However, whether these control - characters may be used depends on the string preparation rules - applicable to the production. For passwd and authcid productions, - control characters are prohibited. For authzid, one must consult the - application-level SASL profile. This change allows PLAIN to carry - all possible authorization identity strings allowed in SASL. - - Pseudo-code was added. - - The example section was expanded to illustrate more features of the - PLAIN mechanism. - -Editor's Address - - Kurt D. Zeilenga - OpenLDAP Foundation - - EMail: Kurt@OpenLDAP.org - - - - - - - - - - - - - - - - - - - - - - -Zeilenga Standards Track [Page 10] - -RFC 4616 The PLAIN SASL Mechanism August 2006 - - -Full Copyright Statement - - Copyright (C) The Internet Society (2006). - - This document is subject to the rights, licenses and restrictions - contained in BCP 78, and except as set forth therein, the authors - retain all their rights. 
- - This document and the information contained herein are provided on an - "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS - OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET - ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, - INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE - INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED - WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Intellectual Property - - The IETF takes no position regarding the validity or scope of any - Intellectual Property Rights or other rights that might be claimed to - pertain to the implementation or use of the technology described in - this document or the extent to which any license under such rights - might or might not be available; nor does it represent that it has - made any independent effort to identify any such rights. Information - on the procedures with respect to rights in RFC documents can be - found in BCP 78 and BCP 79. - - Copies of IPR disclosures made to the IETF Secretariat and any - assurances of licenses to be made available, or the result of an - attempt made to obtain a general license or permission for the use of - such proprietary rights by implementers or users of this - specification can be obtained from the IETF on-line IPR repository at - http://www.ietf.org/ipr. - - The IETF invites any interested party to bring to its attention any - copyrights, patents or patent applications, or other proprietary - rights that may cover technology that may be required to implement - this standard. Please address the information to the IETF at - ietf-ipr@ietf.org. - -Acknowledgement - - Funding for the RFC Editor function is provided by the IETF - Administrative Support Activity (IASA). - - - - - - - -Zeilenga Standards Track [Page 11] - diff --git a/proto/rfc5256.txt b/proto/rfc5256.txt @@ -1,1067 +0,0 @@ - - - - - - -Network Working Group M. 
Crispin -Request for Comments: 5256 Panda Programming -Category: Standards Track K. Murchison - Carnegie Mellon University - June 2008 - - - Internet Message Access Protocol - SORT and THREAD Extensions - -Status of This Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Abstract - - This document describes the base-level server-based sorting and - threading extensions to the IMAP protocol. These extensions provide - substantial performance improvements for IMAP clients that offer - sorted and threaded views. - -1. Introduction - - The SORT and THREAD extensions to the [IMAP] protocol provide a means - of server-based sorting and threading of messages, without requiring - that the client download the necessary data to do so itself. This is - particularly useful for online clients as described in [IMAP-MODELS]. - - A server that supports the base-level SORT extension indicates this - with a capability name which starts with "SORT". Future, upwards- - compatible extensions to the SORT extension will all start with - "SORT", indicating support for this base level. - - A server that supports the THREAD extension indicates this with one - or more capability names consisting of "THREAD=" followed by a - supported threading algorithm name as described in this document. - This provides for future upwards-compatible extensions. - - A server that implements the SORT and/or THREAD extensions MUST - collate strings in accordance with the requirements of I18NLEVEL=1, - as described in [IMAP-I18N], and SHOULD implement and advertise the - I18NLEVEL=1 extension. Alternatively, a server MAY implement - I18NLEVEL=2 (or higher) and comply with the rules of that level. 
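The capability rules above (any name starting with "SORT", and "THREAD=" followed by an algorithm name) could be checked on the client roughly as follows. This is a sketch under stated assumptions: caps_match is a hypothetical helper, the capability list is assumed to be the space-separated text of a CAPABILITY response with the "* CAPABILITY" prefix already removed, and names are compared case-insensitively.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: scan a space-separated capability list.
 * With exact == 0, succeed if any capability name starts with `name`
 * (the SORT rule).  With exact != 0, the whole capability name must
 * equal `name` (used for "THREAD=<algorithm>").
 * Returns 1 on a match, 0 otherwise.
 */
int
caps_match(const char *caps, const char *name, int exact)
{
	size_t nlen = strlen(name);

	for (;;) {
		size_t len, i;

		caps += strspn(caps, " ");	/* skip separators */
		len = strcspn(caps, " ");	/* length of this token */
		if (len == 0)
			return 0;		/* end of list */
		if (len >= nlen && (!exact || len == nlen)) {
			for (i = 0; i < nlen; i++) {
				if (toupper((unsigned char)caps[i]) !=
				    toupper((unsigned char)name[i]))
					break;
			}
			if (i == nlen)
				return 1;
		}
		caps += len;
	}
}
```

A client would call caps_match(caps, "SORT", 0) before issuing SORT, and caps_match(caps, "THREAD=REFERENCES", 1) before requesting that threading algorithm.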
- - - - - -Crispin & Murchison Standards Track [Page 1] - -RFC 5256 IMAP Sort June 2008 - - - Discussion: The SORT and THREAD extensions predate [IMAP-I18N] by - several years. At the time of this writing, all known server - implementations of SORT and THREAD comply with the rules of - I18NLEVEL=1, but do not necessarily advertise it. As discussed in - [IMAP-I18N] section 4.5, all server implementations should - eventually be updated to comply with the I18NLEVEL=2 extension. - - Historical note: The REFERENCES threading algorithm is based on the - [THREADING] algorithm written and used in "Netscape Mail and News" - versions 2.0 through 3.0. - -2. Terminology - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this - document are to be interpreted as described in [KEYWORDS]. - - The word "can" (not "may") is used to refer to a possible - circumstance or situation, as opposed to an optional facility of the - protocol. - - "User" is used to refer to a human user, whereas "client" refers to - the software being run by the user. - - In examples, "C:" and "S:" indicate lines sent by the client and - server, respectively. - -2.1. Base Subject - - Subject sorting and threading use the "base subject", which has - specific subject artifacts removed. Due to the complexity of these - artifacts, the formal syntax for the subject extraction rules is - ambiguous. The following procedure is followed to determine the - "base subject", using the [ABNF] formal syntax rules described in - section 5: - - (1) Convert any RFC 2047 encoded-words in the subject to [UTF-8] - as described in "Internationalization Considerations". - Convert all tabs and continuations to space. Convert all - multiple spaces to a single space. - - (2) Remove all trailing text of the subject that matches the - subj-trailer ABNF; repeat until no more matches are possible. 
- - (3) Remove all prefix text of the subject that matches the subj- - leader ABNF. - - - - - -Crispin & Murchison Standards Track [Page 2] - -RFC 5256 IMAP Sort June 2008 - - - (4) If there is prefix text of the subject that matches the subj- - blob ABNF, and removing that prefix leaves a non-empty subj- - base, then remove the prefix text. - - (5) Repeat (3) and (4) until no matches remain. - - Note: It is possible to defer step (2) until step (6), but this - requires checking for subj-trailer in step (4). - - (6) If the resulting text begins with the subj-fwd-hdr ABNF and - ends with the subj-fwd-trl ABNF, remove the subj-fwd-hdr and - subj-fwd-trl and repeat from step (2). - - (7) The resulting text is the "base subject" used in the SORT. - - All servers and disconnected (as described in [IMAP-MODELS]) clients - MUST use exactly this algorithm to determine the "base subject". - Otherwise, there is potential for a user to get inconsistent results - based on whether they are running in connected or disconnected mode. - -2.2. Sent Date - - As used in this document, the term "sent date" refers to the date and - time from the Date: header, adjusted by time zone to normalize to - UTC. For example, "31 Dec 2000 16:01:33 -0800" is equivalent to the - UTC date and time of "1 Jan 2001 00:01:33 +0000". - - If the time zone is invalid, the date and time SHOULD be treated as - UTC. If the time is also invalid, the time SHOULD be treated as - 00:00:00. If there is no valid date or time, the date and time - SHOULD be treated as 00:00:00 on the earliest possible date. - - This differs from the date-related criteria in the SEARCH command - (described in [IMAP] section 6.4.4), which use just the date and not - the time, and are not adjusted by time zone. - - If the sent date cannot be determined (a Date: header is missing or - cannot be parsed), the INTERNALDATE for that message is used as the - sent date. 
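The time zone normalization described for the sent date can be illustrated with a small C sketch. It assumes the Date: header has already been parsed into numeric fields and that the zone offset is given in minutes east of UTC (so "-0800" is -480); the function names are hypothetical, and the fallback rules for invalid zones, times or dates are not covered. The day count uses a well-known civil-calendar formula instead of the non-standard timegm().

```c
#include <stdint.h>

/* Days between 1970-01-01 and the given proleptic Gregorian date. */
static int64_t
days_from_civil(int y, int m, int d)
{
	int64_t era, yoe, doy, doe;

	y -= m <= 2;			/* count Jan and Feb as months 11, 12 */
	era = (y >= 0 ? y : y - 399) / 400;
	yoe = y - era * 400;		/* year of era, 0..399 */
	doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
	doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
	return era * 146097 + doe - 719468;
}

/*
 * Hypothetical helper: sent date as seconds since 1970-01-01 00:00:00
 * UTC.  tz_min is the zone offset in minutes east of UTC.
 */
int64_t
sent_date_utc(int y, int m, int d, int hh, int mm, int ss, int tz_min)
{
	int64_t t = days_from_civil(y, m, d) * 86400;

	t += (int64_t)hh * 3600 + (int64_t)mm * 60 + ss;
	return t - (int64_t)tz_min * 60;	/* local time -> UTC */
}
```

With these definitions, sent_date_utc(2000, 12, 31, 16, 1, 33, -480) and sent_date_utc(2001, 1, 1, 0, 1, 33, 0) return the same value, which is the equivalence stated in the text. Two messages whose values compare equal would then be ordered by sequence number.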
- - When comparing two sent dates that match exactly, the order in which - the two messages appear in the mailbox (that is, by sequence number) - is used as a tie-breaker to determine the order. - - - - - - - - -Crispin & Murchison Standards Track [Page 3] - -RFC 5256 IMAP Sort June 2008 - - -3. Additional Commands - - These commands are extensions to the [IMAP] base protocol. - - The section headings are intended to correspond with where they would - be located in the main document if they were part of the base - specification. - -BASE.6.4.SORT. SORT Command - - Arguments: sort program - charset specification - searching criteria (one or more) - - Data: untagged responses: SORT - - Result: OK - sort completed - NO - sort error: can't sort that charset or - criteria - BAD - command unknown or arguments invalid - - The SORT command is a variant of SEARCH with sorting semantics for - the results. There are two arguments before the searching - criteria argument: a parenthesized list of sort criteria, and the - searching charset. - - The charset argument is mandatory (unlike SEARCH) and indicates - the [CHARSET] of the strings that appear in the searching - criteria. The US-ASCII and [UTF-8] charsets MUST be implemented. - All other charsets are optional. - - There is also a UID SORT command that returns unique identifiers - instead of message sequence numbers. Note that there are separate - searching criteria for message sequence numbers and UIDs; thus, - the arguments to UID SORT are interpreted the same as in SORT. - This is analogous to the behavior of UID SEARCH, as opposed to UID - COPY, UID FETCH, or UID STORE. - - The SORT command first searches the mailbox for messages that - match the given searching criteria using the charset argument for - the interpretation of strings in the searching criteria. It then - returns the matching messages in an untagged SORT response, sorted - according to one or more sort criteria. - - Sorting is in ascending order. 
Earlier dates sort before later - dates; smaller sizes sort before larger sizes; and strings are - sorted according to ascending values established by their - collation algorithm (see "Internationalization Considerations"). - - - -Crispin & Murchison Standards Track [Page 4] - -RFC 5256 IMAP Sort June 2008 - - - If two or more messages exactly match according to the sorting - criteria, these messages are sorted according to the order in - which they appear in the mailbox. In other words, there is an - implicit sort criterion of "sequence number". - - When multiple sort criteria are specified, the result is sorted in - the priority order that the criteria appear. For example, - (SUBJECT DATE) will sort messages in order by their base subject - text; and for messages with the same base subject text, it will - sort by their sent date. - - Untagged EXPUNGE responses are not permitted while the server is - responding to a SORT command, but are permitted during a UID SORT - command. - - The defined sort criteria are as follows. Refer to the Formal - Syntax section for the precise syntactic definitions of the - arguments. If the associated RFC-822 header for a particular - criterion is absent, it is treated as the empty string. The empty - string always collates before non-empty strings. - - ARRIVAL - Internal date and time of the message. This differs from the - ON criteria in SEARCH, which uses just the internal date. - - CC - [IMAP] addr-mailbox of the first "cc" address. - - DATE - Sent date and time, as described in section 2.2. - - FROM - [IMAP] addr-mailbox of the first "From" address. - - REVERSE - Followed by another sort criterion, has the effect of that - criterion but in reverse (descending) order. - Note: REVERSE only reverses a single criterion, and does not - affect the implicit "sequence number" sort criterion if all - other criteria are identical. Consequently, a sort of - REVERSE SUBJECT is not the same as a reverse ordering of a - SUBJECT sort. 
This can be avoided by use of additional - criteria, e.g., SUBJECT DATE vs. REVERSE SUBJECT REVERSE - DATE. In general, however, it's better (and faster, if the - client has a "reverse current ordering" command) to reverse - the results in the client instead of issuing a new SORT. - - - - - -Crispin & Murchison Standards Track [Page 5] - -RFC 5256 IMAP Sort June 2008 - - - SIZE - Size of the message in octets. - - SUBJECT - Base subject text. - - TO - [IMAP] addr-mailbox of the first "To" address. - - Example: C: A282 SORT (SUBJECT) UTF-8 SINCE 1-Feb-1994 - S: * SORT 2 84 882 - S: A282 OK SORT completed - C: A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL - S: * SORT 5 3 4 1 2 - S: A283 OK SORT completed - C: A284 SORT (SUBJECT) US-ASCII TEXT "not in mailbox" - S: * SORT - S: A284 OK SORT completed - -BASE.6.4.THREAD. THREAD Command - -Arguments: threading algorithm - charset specification - searching criteria (one or more) - -Data: untagged responses: THREAD - -Result: OK - thread completed - NO - thread error: can't thread that charset or - criteria - BAD - command unknown or arguments invalid - - The THREAD command is a variant of SEARCH with threading semantics - for the results. Thread has two arguments before the searching - criteria argument: a threading algorithm and the searching - charset. - - The charset argument is mandatory (unlike SEARCH) and indicates - the [CHARSET] of the strings that appear in the searching - criteria. The US-ASCII and [UTF-8] charsets MUST be implemented. - All other charsets are optional. - - There is also a UID THREAD command that returns unique identifiers - instead of message sequence numbers. Note that there are separate - searching criteria for message sequence numbers and UIDs; thus the - arguments to UID THREAD are interpreted the same as in THREAD. - This is analogous to the behavior of UID SEARCH, as opposed to UID - COPY, UID FETCH, or UID STORE. 
- - - -Crispin & Murchison Standards Track [Page 6] - -RFC 5256 IMAP Sort June 2008 - - - The THREAD command first searches the mailbox for messages that - match the given searching criteria using the charset argument for - the interpretation of strings in the searching criteria. It then - returns the matching messages in an untagged THREAD response, - threaded according to the specified threading algorithm. - - All collation is in ascending order. Earlier dates collate before - later dates and strings are collated according to ascending values - established by their collation algorithm (see - "Internationalization Considerations"). - - Untagged EXPUNGE responses are not permitted while the server is - responding to a THREAD command, but are permitted during a UID - THREAD command. - - The defined threading algorithms are as follows: - - ORDEREDSUBJECT - - The ORDEREDSUBJECT threading algorithm is also referred to as - "poor man's threading". The searched messages are sorted by - base subject and then by the sent date. The messages are then - split into separate threads, with each thread containing - messages with the same base subject text. Finally, the threads - are sorted by the sent date of the first message in the thread. - - The top level or "root" in ORDEREDSUBJECT threading contains - the first message of every thread. All messages in the root - are siblings of each other. The second message of a thread is - the child of the first message, and subsequent messages of the - thread are siblings of the second message and hence children of - the message at the root. Hence, there are no grandchildren in - ORDEREDSUBJECT threading. - - Children in ORDEREDSUBJECT threading do not have descendents. - Client implementations SHOULD treat descendents of a child in a - server response as being siblings of that child. 
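The ORDEREDSUBJECT description above can be sketched in a few lines. This is a minimal illustration, not the server's actual implementation: the `Msg` record and both function names are hypothetical, the base subject is assumed to have been extracted already, and `str.casefold()` stands in for the i;unicode-casemap collation that the specification requires.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical record: seq is the message sequence number, base_subject is
# assumed to be extracted already, sent is any comparable sent date.
Msg = namedtuple("Msg", "seq base_subject sent")


def ordered_subject(msgs):
    """Return threads as lists of sequence numbers, first message first."""
    def subject_key(m):
        # Assumption: casefold() approximates the required collation.
        return m.base_subject.casefold()

    # Sort by base subject, then sent date; the sequence number breaks ties.
    ordered = sorted(msgs, key=lambda m: (subject_key(m), m.sent, m.seq))
    # Split into threads of messages sharing the same base subject.
    threads = [list(group) for _, group in groupby(ordered, key=subject_key)]
    # Sort the threads by the sent date of their first message.
    threads.sort(key=lambda t: (t[0].sent, t[0].seq))
    return [[m.seq for m in t] for t in threads]


def format_thread(seqs):
    """Render one thread: later messages are all children of the first."""
    head, rest = seqs[0], seqs[1:]
    if not rest:
        return "(%d)" % head
    if len(rest) == 1:
        return "(%d %d)" % (head, rest[0])
    return "(%d %s)" % (head, "".join("(%d)" % s for s in rest))
```

Joining `format_thread(t)` over the result of `ordered_subject(...)` gives the data part of an untagged THREAD response. Because every message after the first becomes a direct child of the first, no grandchildren are ever produced.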
- - REFERENCES - - The REFERENCES threading algorithm threads the searched - messages by grouping them together in parent/child - relationships based on which messages are replies to others. - The parent/child relationships are built using two methods: - reconstructing a message's ancestry using the references - contained within it; and checking the original (not base) - subject of a message to see if it is a reply to (or forward of) - another message. - - - -Crispin & Murchison Standards Track [Page 7] - -RFC 5256 IMAP Sort June 2008 - - - Note: "Message ID" in the following description refers to a - normalized form of the msg-id in [RFC2822]. The actual text - in RFC 2822 may use quoting, resulting in multiple ways of - expressing the same Message ID. Implementations of the - REFERENCES threading algorithm MUST normalize any msg-id in - order to avoid false non-matches due to differences in - quoting. - - For example, the msg-id - <"01KF8JCEOCBS0045PS"@xxx.yyy.com> - and the msg-id - <01KF8JCEOCBS0045PS@xxx.yyy.com> - MUST be interpreted as being the same Message ID. - - The references used for reconstructing a message's ancestry are - found using the following rules: - - If a message contains a References header line, then use the - Message IDs in the References header line as the references. - - If a message does not contain a References header line, or - the References header line does not contain any valid - Message IDs, then use the first (if any) valid Message ID - found in the In-Reply-To header line as the only reference - (parent) for this message. - - Note: Although [RFC2822] permits multiple Message IDs in - the In-Reply-To header, in actual practice this - discipline has not been followed. For example, - In-Reply-To headers have been observed with message - addresses after the Message ID, and there are no good - heuristics for software to determine the difference. - This is not a problem with the References header, - however. 
- - If a message does not contain an In-Reply-To header line, or - the In-Reply-To header line does not contain a valid Message - ID, then the message does not have any references (NIL). - - A message is considered to be a reply or forward if the base - subject extraction rules, applied to the original subject, - remove any of the following: a subj-refwd, a "(fwd)" subj- - trailer, or a subj-fwd-hdr and subj-fwd-trl. - - The REFERENCES algorithm is significantly more complex than - ORDEREDSUBJECT and consists of six main steps. These steps are - outlined in detail below. - - - - -Crispin & Murchison Standards Track [Page 8] - -RFC 5256 IMAP Sort June 2008 - - - (1) For each searched message: - - (A) Using the Message IDs in the message's references, link - the corresponding messages (those whose Message-ID - header line contains the given reference Message ID) - together as parent/child. Make the first reference the - parent of the second (and the second a child of the - first), the second the parent of the third (and the - third a child of the second), etc. The following rules - govern the creation of these links: - - If a message does not contain a Message-ID header - line, or the Message-ID header line does not - contain a valid Message ID, then assign a unique - Message ID to this message. - - If two or more messages have the same Message ID, - then only use that Message ID in the first (lowest - sequence number) message, and assign a unique - Message ID to each of the subsequent messages with - a duplicate of that Message ID. - - If no message can be found with a given Message ID, - create a dummy message with this ID. Use this - dummy message for all subsequent references to this - ID. - - If a message already has a parent, don't change the - existing link. This is done because the References - header line may have been truncated by a Mail User - Agent (MUA). 
As a result, there is no guarantee - that the messages corresponding to adjacent Message - IDs in the References header line are parent and - child. - - Do not create a parent/child link if creating that - link would introduce a loop. For example, before - making message A the parent of B, make sure that A - is not a descendent of B. - - Note: Message ID comparisons are case-sensitive. - - (B) Create a parent/child link between the last reference - (or NIL if there are no references) and the current - message. If the current message already has a parent, - it is probably the result of a truncated References - header line, so break the current parent/child link - before creating the new correct one. As in step 1.A, - - - -Crispin & Murchison Standards Track [Page 9] - -RFC 5256 IMAP Sort June 2008 - - - do not create the parent/child link if creating that - link would introduce a loop. Note that if this message - has no references, it will now have no parent. - - Note: The parent/child links created in steps 1.A - and 1.B MUST be kept consistent with one another at - ALL times. - - (2) Gather together all of the messages that have no parents - and make them all children (siblings of one another) of a - dummy parent (the "root"). These messages constitute the - first (head) message of the threads created thus far. - - (3) Prune dummy messages from the thread tree. Traverse each - thread under the root, and for each message: - - If it is a dummy message with NO children, delete it. - - If it is a dummy message with children, delete it, but - promote its children to the current level. In other - words, splice them in with the dummy's siblings. - - Do not promote the children if doing so would make them - children of the root, unless there is only one child. - - (4) Sort the messages under the root (top-level siblings only) - by sent date as described in section 2.2. 
In the case of a - dummy message, sort its children by sent date and then use - the first child for the top-level sort. - - (5) Gather together messages under the root that have the same - base subject text. - - (A) Create a table for associating base subjects with - messages, called the subject table. - - (B) Populate the subject table with one message per each - base subject. For each child of the root: - - (i) Find the subject of this thread, by using the - base subject from either the current message or - its first child if the current message is a - dummy. This is the thread subject. - - (ii) If the thread subject is empty, skip this - message. - - - - - -Crispin & Murchison Standards Track [Page 10] - -RFC 5256 IMAP Sort June 2008 - - - (iii) Look up the message associated with the thread - subject in the subject table. - - (iv) If there is no message in the subject table with - the thread subject, add the current message and - the thread subject to the subject table. - - Otherwise, if the message in the subject table is - not a dummy, AND either of the following criteria - are true: - - The current message is a dummy, OR - - The message in the subject table is a reply - or forward and the current message is not. - - then replace the message in the subject table - with the current message. - - (C) Merge threads with the same thread subject. For each - child of the root: - - (i) Find the message's thread subject as in step - 5.B.i above. - - (ii) If the thread subject is empty, skip this - message. - - (iii) Lookup the message associated with this thread - subject in the subject table. - - (iv) If the message in the subject table is the - current message, skip this message. 
- - Otherwise, merge the current message with the one in - the subject table using the following rules: - - If both messages are dummies, append the current - message's children to the children of the message - in the subject table (the children of both messages - become siblings), and then delete the current - message. - - If the message in the subject table is a dummy and - the current message is not, make the current - message a child of the message in the subject table - (a sibling of its children). - - - - -Crispin & Murchison Standards Track [Page 11] - -RFC 5256 IMAP Sort June 2008 - - - If the current message is a reply or forward and - the message in the subject table is not, make the - current message a child of the message in the - subject table (a sibling of its children). - - Otherwise, create a new dummy message and make both - the current message and the message in the subject - table children of the dummy. Then replace the - message in the subject table with the dummy - message. - - Note: Subject comparisons are case-insensitive, - as described under "Internationalization - Considerations". - - (6) Traverse the messages under the root and sort each set of - siblings by sent date as described in section 2.2. - Traverse the messages in such a way that the "youngest" set - of siblings are sorted first, and the "oldest" set of - siblings are sorted last (grandchildren are sorted before - children, etc). In the case of a dummy message (which can - only occur with top-level siblings), use its first child - for sorting. 
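Steps (1) and (2) of the REFERENCES algorithm above can be sketched as follows. This is a simplified illustration under stated assumptions: the `Node` class and all function names are hypothetical, Message IDs are assumed to be normalized and unique already (so the duplicate-ID and missing-ID rules of step 1.A are not shown), and steps (3) to (6) (pruning, sorting, and subject merging) are left out.

```python
class Node:
    """One entry in the thread tree; seq is None for a dummy message."""

    def __init__(self, msg_id, seq=None):
        self.msg_id = msg_id
        self.seq = seq
        self.parent = None
        self.children = []


def is_descendant(node, ancestor):
    """True if node is ancestor itself or lies somewhere below it."""
    while node is not None:
        if node is ancestor:
            return True
        node = node.parent
    return False


def unlink(child):
    """Break the link between child and its current parent, if any."""
    if child.parent is not None:
        child.parent.children.remove(child)
        child.parent = None


def link(parent, child):
    """Make parent the parent of child unless that would create a loop."""
    if is_descendant(parent, child):
        return
    unlink(child)
    child.parent = parent
    parent.children.append(child)


def build_links(messages):
    """messages: (seq, msg_id, references) tuples in sequence-number order.

    Returns the parentless nodes, i.e. the children of the root (step 2).
    """
    table = {msg_id: Node(msg_id, seq) for seq, msg_id, _ in messages}
    for _, msg_id, refs in messages:
        prev = None
        for ref in refs:
            node = table.get(ref)
            if node is None:
                # No searched message has this ID: create a dummy for it.
                node = table[ref] = Node(ref)
            # Step 1.A: chain adjacent references, keeping existing parents.
            if prev is not None and node.parent is None:
                link(prev, node)
            prev = node
        # Step 1.B: the last reference (if any) becomes the parent of the
        # current message, replacing whatever parent it had before.
        current = table[msg_id]
        unlink(current)
        if prev is not None:
            link(prev, current)
    return [n for n in table.values() if n.parent is None]
```

Filling the table with all real messages before any linking is a design choice of this sketch: it ensures a dummy is created only for a Message ID that none of the searched messages carries.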
- - Example: C: A283 THREAD ORDEREDSUBJECT UTF-8 SINCE 5-MAR-2000 - S: * THREAD (166)(167)(168)(169)(172)(170)(171) - (173)(174 (175)(176)(178)(181)(180))(179)(177 - (183)(182)(188)(184)(185)(186)(187)(189))(190) - (191)(192)(193)(194 195)(196 (197)(198))(199) - (200 202)(201)(203)(204)(205)(206 207)(208) - S: A283 OK THREAD completed - C: A284 THREAD ORDEREDSUBJECT US-ASCII TEXT "gewp" - S: * THREAD - S: A284 OK THREAD completed - C: A285 THREAD REFERENCES UTF-8 SINCE 5-MAR-2000 - S: * THREAD (166)(167)(168)(169)(172)((170)(179)) - (171)(173)((174)(175)(176)(178)(181)(180)) - ((177)(183)(182)(188 (184)(189))(185 186)(187)) - (190)(191)(192)(193)((194)(195 196))(197 198) - (199)(200 202)(201)(203)(204)(205 206 207)(208) - S: A285 OK THREAD completed - - Note: The line breaks in the first and third server - responses are for editorial clarity and do not appear in - real THREAD responses. - - - - - - -Crispin & Murchison Standards Track [Page 12] - -RFC 5256 IMAP Sort June 2008 - - -4. Additional Responses - - These responses are extensions to the [IMAP] base protocol. - - The section headings of these responses are intended to correspond - with where they would be located in the main document. - -BASE.7.2.SORT. SORT Response - - Data: zero or more numbers - - The SORT response occurs as a result of a SORT or UID SORT - command. The number(s) refer to those messages that match the - search criteria. For SORT, these are message sequence numbers; - for UID SORT, these are unique identifiers. Each number is - delimited by a space. - - Example: S: * SORT 2 3 6 - -BASE.7.2.THREAD. THREAD Response - - Data: zero or more threads - - The THREAD response occurs as a result of a THREAD or UID THREAD - command. It contains zero or more threads. A thread consists of - a parenthesized list of thread members. - - Thread members consist of zero or more message numbers, delimited - by spaces, indicating successive parent and child. 
This continues - until the thread splits into multiple sub-threads, at which point, - the thread nests into multiple sub-threads with the first member - of each sub-thread being siblings at this level. There is no - limit to the nesting of threads. - - The messages numbers refer to those messages that match the search - criteria. For THREAD, these are message sequence numbers; for UID - THREAD, these are unique identifiers. - - Example: S: * THREAD (2)(3 6 (4 23)(44 7 96)) - - The first thread consists only of message 2. The second thread - consists of the messages 3 (parent) and 6 (child), after which it - splits into two sub-threads; the first of which contains messages - 4 (child of 6, sibling of 44) and 23 (child of 4), and the second - of which contains messages 44 (child of 6, sibling of 4), 7 (child - of 44), and 96 (child of 7). Since some later messages are - parents of earlier messages, the messages were probably moved from - some other mailbox at different times. - - - -Crispin & Murchison Standards Track [Page 13] - -RFC 5256 IMAP Sort June 2008 - - - -- 2 - - -- 3 - \-- 6 - |-- 4 - | \-- 23 - | - \-- 44 - \-- 7 - \-- 96 - - Example: S: * THREAD ((3)(5)) - - In this example, 3 and 5 are siblings of a parent that does not - match the search criteria (and/or does not exist in the mailbox); - however they are members of the same thread. - -5. Formal Syntax of SORT and THREAD Commands and Responses - - The following syntax specification uses the Augmented Backus-Naur - Form (ABNF) notation as specified in [ABNF]. It also uses [ABNF] - rules defined in [IMAP]. 
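The THREAD response structure described in section 4, together with the thread-list, thread-members, and thread-nested rules of this section, lends itself to a small recursive-descent parser. The sketch below is an illustration only: the function name and the `(number, children)` tuple representation are assumptions, and malformed input is not handled gracefully.

```python
import re


def parse_thread_data(text):
    """Parse the data part of a THREAD response into nested tuples.

    Each node is (number, children). A thread-list that starts directly
    with nested lists, such as ((3)(5)), has no parent among the matched
    messages and is represented with the number None.
    """
    tokens = re.findall(r"\(|\)|\d+", text)
    pos = 0

    def parse_list():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        numbers = []
        while tokens[pos].isdigit():
            numbers.append(int(tokens[pos]))
            pos += 1
        children = []
        while tokens[pos] == "(":
            children.append(parse_list())
        if tokens[pos] != ")":
            raise ValueError("expected ')' at token %d" % pos)
        pos += 1
        if not numbers:
            return (None, children)
        # Successive numbers are parent and child; the sub-threads hang
        # off the last number in the run.
        node = (numbers[-1], children)
        for number in reversed(numbers[:-1]):
            node = (number, [node])
        return node

    threads = []
    while pos < len(tokens):
        threads.append(parse_list())
    return threads
```

For the example `(2)(3 6 (4 23)(44 7 96))`, this yields two top-level nodes: message 2 with no children, and message 3 whose only child 6 has the two children 4 and 44. An empty THREAD response parses to an empty list.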
- -sort = ["UID" SP] "SORT" SP sort-criteria SP search-criteria - -sort-criteria = "(" sort-criterion *(SP sort-criterion) ")" - -sort-criterion = ["REVERSE" SP] sort-key - -sort-key = "ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" / - "SUBJECT" / "TO" - -thread = ["UID" SP] "THREAD" SP thread-alg SP search-criteria - -thread-alg = "ORDEREDSUBJECT" / "REFERENCES" / thread-alg-ext - -thread-alg-ext = atom - ; New algorithms MUST be registered with IANA - -search-criteria = charset 1*(SP search-key) - -charset = atom / quoted - ; CHARSET values MUST be registered with IANA - -sort-data = "SORT" *(SP nz-number) - -thread-data = "THREAD" [SP 1*thread-list] - - - - -Crispin & Murchison Standards Track [Page 14] - -RFC 5256 IMAP Sort June 2008 - - -thread-list = "(" (thread-members / thread-nested) ")" - -thread-members = nz-number *(SP nz-number) [SP thread-nested] - -thread-nested = 2*thread-list - - The following syntax describes base subject extraction rules (2)-(6): - -subject = *subj-leader [subj-middle] *subj-trailer - -subj-refwd = ("re" / ("fw" ["d"])) *WSP [subj-blob] ":" - -subj-blob = "[" *BLOBCHAR "]" *WSP - -subj-fwd = subj-fwd-hdr subject subj-fwd-trl - -subj-fwd-hdr = "[fwd:" - -subj-fwd-trl = "]" - -subj-leader = (*subj-blob subj-refwd) / WSP - -subj-middle = *subj-blob (subj-base / subj-fwd) - ; last subj-blob is subj-base if subj-base would - ; otherwise be empty - -subj-trailer = "(fwd)" / WSP - -subj-base = NONWSP *(*WSP NONWSP) - ; can be a subj-blob - -BLOBCHAR = %x01-5a / %x5c / %x5e-ff - ; any CHAR8 except '[' and ']'. - ; SHOULD comply with [UTF-8] - -NONWSP = %x01-08 / %x0a-1f / %x21-ff - ; any CHAR8 other than WSP. - ; SHOULD comply with [UTF-8] - -6. Security Considerations - - The SORT and THREAD extensions do not raise any security - considerations that are not present in the base [IMAP] protocol, and - these issues are discussed in [IMAP]. 
Nevertheless, it is important - to remember that [IMAP] protocol transactions, including message - data, are sent in the clear over the network unless protection from - snooping is negotiated, either by the use of STARTTLS, privacy - protection in AUTHENTICATE, or some other protection mechanism. - - - -Crispin & Murchison Standards Track [Page 15] - -RFC 5256 IMAP Sort June 2008 - - - Although not a security consideration, it is important to recognize - that sorting by REFERENCES can lead to misleading threading trees. - For example, a message with false References: header data will cause - a thread to be incorporated into another thread. - - The process of extracting the base subject may lead to incorrect - collation if the extracted data was significant text as opposed to a - subject artifact. - -7. Internationalization Considerations - - As stated in the introduction, the rules of I18NLEVEL=1 as described - in [IMAP-I18N] MUST be followed; that is, the SORT and THREAD - extensions MUST collate strings according to the i;unicode-casemap - collation described in [UNICASEMAP]. Servers SHOULD also advertise - the I18NLEVEL=1 extension. Alternatively, a server MAY implement - I18NLEVEL=2 (or higher) and comply with the rules of that level. - - As discussed in [IMAP-I18N] section 4.5, all server implementations - should eventually be updated to support the [IMAP-I18N] I18NLEVEL=2 - extension. - - Translations of the "re" or "fw"/"fwd" tokens are not specified for - removal in the base subject extraction process. An attempt to add - such translated tokens would result in a geometrically complex, and - ultimately unimplementable, task. - - Instead, note that [RFC2822] section 3.6.5 recommends that "re:" - (from the Latin "res", meaning "in the matter of") be used to - identify a reply. 
Although it is evident that, from the multiple - forms of token to identify a forwarded message, there is considerable - variation found in the wild, the variations are (still) manageable. - Consequently, it is suggested that "re:" and one of the variations of - the tokens for a forward supported by the base subject extraction - rules be adopted for Internet mail messages, since doing so makes it - a simple display-time task to localize the token language for the - user. - -8. IANA Considerations - - [IMAP] capabilities are registered by publishing a standards track or - IESG-approved experimental RFC. This document constitutes - registration of the SORT and THREAD capabilities in the [IMAP] - capabilities registry. - - - - - - - -Crispin & Murchison Standards Track [Page 16] - -RFC 5256 IMAP Sort June 2008 - - - This document creates a new [IMAP] threading algorithms registry, - which registers threading algorithms by publishing a standards track - or IESG-approved experimental RFC. This document constitutes - registration of the ORDEREDSUBJECT and REFERENCES algorithms in that - registry. - -9. Normative References - - [ABNF] Crocker, D., Ed., and P. Overell, "Augmented BNF for - Syntax Specifications: ABNF", STD 68, RFC 5234, January - 2008. - - [CHARSET] Freed, N. and J. Postel, "IANA Charset Registration - Procedures", BCP 19, RFC 2978, October 2000. - - [IMAP] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - - VERSION 4rev1", RFC 3501, March 2003. - - [IMAP-I18N] Newman, C., Gulbrandsen, A., and A. Melnikov, "Internet - Message Access Protocol Internationalization", RFC - 5255, June 2008. - - [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RFC2822] Resnick, P., Ed., "Internet Message Format", RFC 2822, - April 2001. - - [UNICASEMAP] Crispin, M., "i;unicode-casemap - Simple Unicode - Collation Algorithm", RFC 5051, October 2007. 
- - [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO - 10646", STD 63, RFC 3629, November 2003. - -10. Informative References - - [IMAP-MODELS] Crispin, M., "Distributed Electronic Mail Models in - IMAP4", RFC 1733, December 1994. - - [THREADING] Zawinski, J. "Message Threading", - http://www.jwz.org/doc/threading.html, 1997-2002. - - - - - - - - - - -Crispin & Murchison Standards Track [Page 17] - -RFC 5256 IMAP Sort June 2008 - - -Authors' Addresses - - Mark R. Crispin - Panda Programming - 6158 NE Lariat Loop - Bainbridge Island, WA 98110-2098 - - Phone: +1 (206) 842-2385 - EMail: IMAP+SORT+THREAD@Lingling.Panda.COM - - - Kenneth Murchison - Carnegie Mellon University - 5000 Forbes Avenue - Cyert Hall 285 - Pittsburgh, PA 15213 - - Phone: +1 (412) 268-2638 - EMail: murch@andrew.cmu.edu - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Crispin & Murchison Standards Track [Page 18] - -RFC 5256 IMAP Sort June 2008 - - -Full Copyright Statement - - Copyright (C) The IETF Trust (2008). - - This document is subject to the rights, licenses and restrictions - contained in BCP 78, and except as set forth therein, the authors - retain all their rights. - - This document and the information contained herein are provided on an - "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS - OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND - THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS - OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF - THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED - WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
- -Intellectual Property - - The IETF takes no position regarding the validity or scope of any - Intellectual Property Rights or other rights that might be claimed to - pertain to the implementation or use of the technology described in - this document or the extent to which any license under such rights - might or might not be available; nor does it represent that it has - made any independent effort to identify any such rights. Information - on the procedures with respect to rights in RFC documents can be - found in BCP 78 and BCP 79. - - Copies of IPR disclosures made to the IETF Secretariat and any - assurances of licenses to be made available, or the result of an - attempt made to obtain a general license or permission for the use of - such proprietary rights by implementers or users of this - specification can be obtained from the IETF on-line IPR repository at - http://www.ietf.org/ipr. - - The IETF invites any interested party to bring to its attention any - copyrights, patents or patent applications, or other proprietary - rights that may cover technology that may be required to implement - this standard. Please address the information to the IETF at - ietf-ipr@ietf.org. - - - - - - - - - - - - -Crispin & Murchison Standards Track [Page 19] - diff --git a/proto/rfc5322.txt b/proto/rfc5322.txt @@ -1,3195 +0,0 @@ - - - - - - -Network Working Group P. Resnick, Ed. -Request for Comments: 5322 Qualcomm Incorporated -Obsoletes: 2822 October 2008 -Updates: 4021 -Category: Standards Track - - - Internet Message Format - -Status of This Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. 
- -Abstract - - This document specifies the Internet Message Format (IMF), a syntax - for text messages that are sent between computer users, within the - framework of "electronic mail" messages. This specification is a - revision of Request For Comments (RFC) 2822, which itself superseded - Request For Comments (RFC) 822, "Standard for the Format of ARPA - Internet Text Messages", updating it to reflect current practice and - incorporating incremental changes that were specified in other RFCs. - - - - - - - - - - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 1] - -RFC 5322 Internet Message Format October 2008 - - -Table of Contents - - 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 1.2. Notational Conventions . . . . . . . . . . . . . . . . . . 5 - 1.2.1. Requirements Notation . . . . . . . . . . . . . . . . 5 - 1.2.2. Syntactic Notation . . . . . . . . . . . . . . . . . . 5 - 1.2.3. Structure of This Document . . . . . . . . . . . . . . 5 - 2. Lexical Analysis of Messages . . . . . . . . . . . . . . . . . 6 - 2.1. General Description . . . . . . . . . . . . . . . . . . . 6 - 2.1.1. Line Length Limits . . . . . . . . . . . . . . . . . . 7 - 2.2. Header Fields . . . . . . . . . . . . . . . . . . . . . . 8 - 2.2.1. Unstructured Header Field Bodies . . . . . . . . . . . 8 - 2.2.2. Structured Header Field Bodies . . . . . . . . . . . . 8 - 2.2.3. Long Header Fields . . . . . . . . . . . . . . . . . . 8 - 2.3. Body . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 - 3. Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 3.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 10 - 3.2. Lexical Tokens . . . . . . . . . . . . . . . . . . . . . . 10 - 3.2.1. Quoted characters . . . . . . . . . . . . . . . . . . 10 - 3.2.2. Folding White Space and Comments . . . . . . . . . . . 11 - 3.2.3. Atom . . . . . . . . . . . . . . . . . . . . . . . . 
. 12 - 3.2.4. Quoted Strings . . . . . . . . . . . . . . . . . . . . 13 - 3.2.5. Miscellaneous Tokens . . . . . . . . . . . . . . . . . 14 - 3.3. Date and Time Specification . . . . . . . . . . . . . . . 14 - 3.4. Address Specification . . . . . . . . . . . . . . . . . . 16 - 3.4.1. Addr-Spec Specification . . . . . . . . . . . . . . . 17 - 3.5. Overall Message Syntax . . . . . . . . . . . . . . . . . . 18 - 3.6. Field Definitions . . . . . . . . . . . . . . . . . . . . 19 - 3.6.1. The Origination Date Field . . . . . . . . . . . . . . 22 - 3.6.2. Originator Fields . . . . . . . . . . . . . . . . . . 22 - 3.6.3. Destination Address Fields . . . . . . . . . . . . . . 23 - 3.6.4. Identification Fields . . . . . . . . . . . . . . . . 25 - 3.6.5. Informational Fields . . . . . . . . . . . . . . . . . 27 - 3.6.6. Resent Fields . . . . . . . . . . . . . . . . . . . . 28 - 3.6.7. Trace Fields . . . . . . . . . . . . . . . . . . . . . 30 - 3.6.8. Optional Fields . . . . . . . . . . . . . . . . . . . 30 - 4. Obsolete Syntax . . . . . . . . . . . . . . . . . . . . . . . 31 - 4.1. Miscellaneous Obsolete Tokens . . . . . . . . . . . . . . 32 - 4.2. Obsolete Folding White Space . . . . . . . . . . . . . . . 33 - 4.3. Obsolete Date and Time . . . . . . . . . . . . . . . . . . 33 - 4.4. Obsolete Addressing . . . . . . . . . . . . . . . . . . . 35 - 4.5. Obsolete Header Fields . . . . . . . . . . . . . . . . . . 35 - 4.5.1. Obsolete Origination Date Field . . . . . . . . . . . 36 - 4.5.2. Obsolete Originator Fields . . . . . . . . . . . . . . 36 - 4.5.3. Obsolete Destination Address Fields . . . . . . . . . 37 - 4.5.4. Obsolete Identification Fields . . . . . . . . . . . . 37 - 4.5.5. Obsolete Informational Fields . . . . . . . . . . . . 37 - - - -Resnick Standards Track [Page 2] - -RFC 5322 Internet Message Format October 2008 - - - 4.5.6. Obsolete Resent Fields . . . . . . . . . . . . . . . . 38 - 4.5.7. Obsolete Trace Fields . . . . . . . . . . . . . . . . 38 - 4.5.8. 
Obsolete optional fields . . . . . . . . . . . . . . . 38 - 5. Security Considerations . . . . . . . . . . . . . . . . . . . 38 - 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39 - Appendix A. Example Messages . . . . . . . . . . . . . . . . . 43 - Appendix A.1. Addressing Examples . . . . . . . . . . . . . . . 44 - Appendix A.1.1. A Message from One Person to Another with - Simple Addressing . . . . . . . . . . . . . . . . 44 - Appendix A.1.2. Different Types of Mailboxes . . . . . . . . . . . 45 - Appendix A.1.3. Group Addresses . . . . . . . . . . . . . . . . . 45 - Appendix A.2. Reply Messages . . . . . . . . . . . . . . . . . . 46 - Appendix A.3. Resent Messages . . . . . . . . . . . . . . . . . 47 - Appendix A.4. Messages with Trace Fields . . . . . . . . . . . . 48 - Appendix A.5. White Space, Comments, and Other Oddities . . . . 49 - Appendix A.6. Obsoleted Forms . . . . . . . . . . . . . . . . . 50 - Appendix A.6.1. Obsolete Addressing . . . . . . . . . . . . . . . 50 - Appendix A.6.2. Obsolete Dates . . . . . . . . . . . . . . . . . . 50 - Appendix A.6.3. Obsolete White Space and Comments . . . . . . . . 51 - Appendix B. Differences from Earlier Specifications . . . . . 52 - Appendix C. Acknowledgements . . . . . . . . . . . . . . . . . 53 - 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 55 - 7.1. Normative References . . . . . . . . . . . . . . . . . . . 55 - 7.2. Informative References . . . . . . . . . . . . . . . . . . 55 - - - - - - - - - - - - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 3] - -RFC 5322 Internet Message Format October 2008 - - -1. Introduction - -1.1. Scope - - This document specifies the Internet Message Format (IMF), a syntax - for text messages that are sent between computer users, within the - framework of "electronic mail" messages. 
This specification is an - update to [RFC2822], which itself superseded [RFC0822], updating it - to reflect current practice and incorporating incremental changes - that were specified in other RFCs such as [RFC1123]. - - This document specifies a syntax only for text messages. In - particular, it makes no provision for the transmission of images, - audio, or other sorts of structured data in electronic mail messages. - There are several extensions published, such as the MIME document - series ([RFC2045], [RFC2046], [RFC2049]), which describe mechanisms - for the transmission of such data through electronic mail, either by - extending the syntax provided here or by structuring such messages to - conform to this syntax. Those mechanisms are outside of the scope of - this specification. - - In the context of electronic mail, messages are viewed as having an - envelope and contents. The envelope contains whatever information is - needed to accomplish transmission and delivery. (See [RFC5321] for a - discussion of the envelope.) The contents comprise the object to be - delivered to the recipient. This specification applies only to the - format and some of the semantics of message contents. It contains no - specification of the information in the envelope. - - However, some message systems may use information from the contents - to create the envelope. It is intended that this specification - facilitate the acquisition of such information by programs. - - This specification is intended as a definition of what message - content format is to be passed between systems. Though some message - systems locally store messages in this format (which eliminates the - need for translation between formats) and others use formats that - differ from the one specified in this specification, local storage is - outside of the scope of this specification. 
- - Note: This specification is not intended to dictate the internal - formats used by sites, the specific message system features that - they are expected to support, or any of the characteristics of - user interface programs that create or read messages. In - addition, this document does not specify an encoding of the - characters for either transport or storage; that is, it does not - specify the number of bits used or how those bits are specifically - transferred over the wire or stored on disk. - - - -Resnick Standards Track [Page 4] - -RFC 5322 Internet Message Format October 2008 - - -1.2. Notational Conventions - -1.2.1. Requirements Notation - - This document occasionally uses terms that appear in capital letters. - When the terms "MUST", "SHOULD", "RECOMMENDED", "MUST NOT", "SHOULD - NOT", and "MAY" appear capitalized, they are being used to indicate - particular requirements of this specification. A discussion of the - meanings of these terms appears in [RFC2119]. - -1.2.2. Syntactic Notation - - This specification uses the Augmented Backus-Naur Form (ABNF) - [RFC5234] notation for the formal definitions of the syntax of - messages. Characters will be specified either by a decimal value - (e.g., the value %d65 for uppercase A and %d97 for lowercase A) or by - a case-insensitive literal value enclosed in quotation marks (e.g., - "A" for either uppercase or lowercase A). - -1.2.3. Structure of This Document - - This document is divided into several sections. - - This section, section 1, is a short introduction to the document. - - Section 2 lays out the general description of a message and its - constituent parts. This is an overview to help the reader understand - some of the general principles used in the later portions of this - document. Any examples in this section MUST NOT be taken as - specification of the formal syntax of any part of a message. 
- - Section 3 specifies formal ABNF rules for the structure of each part - of a message (the syntax) and describes the relationship between - those parts and their meaning in the context of a message (the - semantics). That is, it lays out the actual rules for the structure - of each part of a message (the syntax) as well as a description of - the parts and instructions for their interpretation (the semantics). - This includes analysis of the syntax and semantics of subparts of - messages that have specific structure. The syntax included in - section 3 represents messages as they MUST be created. There are - also notes in section 3 to indicate if any of the options specified - in the syntax SHOULD be used over any of the others. - - Both sections 2 and 3 describe messages that are legal to generate - for purposes of this specification. - - - - - - -Resnick Standards Track [Page 5] - -RFC 5322 Internet Message Format October 2008 - - - Section 4 of this document specifies an "obsolete" syntax. There are - references in section 3 to these obsolete syntactic elements. The - rules of the obsolete syntax are elements that have appeared in - earlier versions of this specification or have previously been widely - used in Internet messages. As such, these elements MUST be - interpreted by parsers of messages in order to be conformant to this - specification. However, since items in this syntax have been - determined to be non-interoperable or to cause significant problems - for recipients of messages, they MUST NOT be generated by creators of - conformant messages. - - Section 5 details security considerations to take into account when - implementing this specification. - - Appendix A lists examples of different sorts of messages. These - examples are not exhaustive of the types of messages that appear on - the Internet, but give a broad overview of certain syntactic forms. 
- - Appendix B lists the differences between this specification and - earlier specifications for Internet messages. - - Appendix C contains acknowledgements. - -2. Lexical Analysis of Messages - -2.1. General Description - - At the most basic level, a message is a series of characters. A - message that is conformant with this specification is composed of - characters with values in the range of 1 through 127 and interpreted - as US-ASCII [ANSI.X3-4.1986] characters. For brevity, this document - sometimes refers to this range of characters as simply "US-ASCII - characters". - - Note: This document specifies that messages are made up of - characters in the US-ASCII range of 1 through 127. There are - other documents, specifically the MIME document series ([RFC2045], - [RFC2046], [RFC2047], [RFC2049], [RFC4288], [RFC4289]), that - extend this specification to allow for values outside of that - range. Discussion of those mechanisms is not within the scope of - this specification. - - Messages are divided into lines of characters. A line is a series of - characters that is delimited with the two characters carriage-return - and line-feed; that is, the carriage return (CR) character (ASCII - value 13) followed immediately by the line feed (LF) character (ASCII - value 10). (The carriage return/line feed pair is usually written in - this document as "CRLF".) - - - -Resnick Standards Track [Page 6] - -RFC 5322 Internet Message Format October 2008 - - - A message consists of header fields (collectively called "the header - section of the message") followed, optionally, by a body. The header - section is a sequence of lines of characters with special syntax as - defined in this specification. The body is simply a sequence of - characters that follows the header section and is separated from the - header section by an empty line (i.e., a line with nothing preceding - the CRLF). 
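The header/body layout described above can be illustrated with a minimal sketch. It assumes the whole message is already in memory as bytes with CRLF line endings; the name `split_message` is hypothetical and is not taken from rohrpost or from the specification.

```python
from typing import Optional, Tuple

CRLF = b"\r\n"


def split_message(raw: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Split a raw message into (header_section, body).

    body is None when no empty line follows the header section.
    """
    if raw.startswith(CRLF):
        # Degenerate case: the empty line comes first, so there are no fields.
        return b"", raw[len(CRLF):]
    # The first empty line shows up as two consecutive CRLF pairs.
    head, separator, body = raw.partition(CRLF + CRLF)
    if not separator:
        # No empty line at all: everything is header section.
        return raw, None
    # Give the last header field its terminating CRLF back.
    return head + CRLF, body
```

Only the first empty line is treated as the separator; any later empty lines stay inside the body.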
- - Note: Common parlance and earlier versions of this specification - use the term "header" to either refer to the entire header section - or to refer to an individual header field. To avoid ambiguity, - this document does not use the terms "header" or "headers" in - isolation, but instead always uses "header field" to refer to the - individual field and "header section" to refer to the entire - collection. - -2.1.1. Line Length Limits - - There are two limits that this specification places on the number of - characters in a line. Each line of characters MUST be no more than - 998 characters, and SHOULD be no more than 78 characters, excluding - the CRLF. - - The 998 character limit is due to limitations in many implementations - that send, receive, or store IMF messages which simply cannot handle - more than 998 characters on a line. Receiving implementations would - do well to handle an arbitrarily large number of characters in a line - for robustness sake. However, there are so many implementations that - (in compliance with the transport requirements of [RFC5321]) do not - accept messages containing more than 1000 characters including the CR - and LF per line, it is important for implementations not to create - such messages. - - The more conservative 78 character recommendation is to accommodate - the many implementations of user interfaces that display these - messages which may truncate, or disastrously wrap, the display of - more than 78 characters per line, in spite of the fact that such - implementations are non-conformant to the intent of this - specification (and that of [RFC5321] if they actually cause - information to be lost). Again, even though this limitation is put - on messages, it is incumbent upon implementations that display - messages to handle an arbitrarily large number of characters in a - line (certainly at least up to the 998 character limit) for the sake - of robustness. 
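As a rough illustration of the two limits, the sketch below reports which lines break the 998-character MUST limit and which only exceed the 78-character SHOULD limit, both counted without the CRLF. The function and constant names are assumptions made for this example.

```python
from typing import List, Tuple

MAX_LINE = 998          # MUST limit, excluding the CRLF
RECOMMENDED_LINE = 78   # SHOULD limit, excluding the CRLF


def check_line_lengths(raw: bytes) -> Tuple[List[int], List[int]]:
    """Return (too_long, over_recommended) as lists of 1-based line numbers."""
    too_long: List[int] = []
    over_recommended: List[int] = []
    for number, line in enumerate(raw.split(b"\r\n"), start=1):
        if len(line) > MAX_LINE:
            too_long.append(number)
        elif len(line) > RECOMMENDED_LINE:
            over_recommended.append(number)
    return too_long, over_recommended
```

A generator of messages would treat any entry in the first list as an error, while a reader would do better to accept such lines anyway, as the text above recommends.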
- - - - - - - -Resnick Standards Track [Page 7] - -RFC 5322 Internet Message Format October 2008 - - -2.2. Header Fields - - Header fields are lines beginning with a field name, followed by a - colon (":"), followed by a field body, and terminated by CRLF. A - field name MUST be composed of printable US-ASCII characters (i.e., - characters that have values between 33 and 126, inclusive), except - colon. A field body may be composed of printable US-ASCII characters - as well as the space (SP, ASCII value 32) and horizontal tab (HTAB, - ASCII value 9) characters (together known as the white space - characters, WSP). A field body MUST NOT include CR and LF except - when used in "folding" and "unfolding", as described in section - 2.2.3. All field bodies MUST conform to the syntax described in - sections 3 and 4 of this specification. - -2.2.1. Unstructured Header Field Bodies - - Some field bodies in this specification are defined simply as - "unstructured" (which is specified in section 3.2.5 as any printable - US-ASCII characters plus white space characters) with no further - restrictions. These are referred to as unstructured field bodies. - Semantically, unstructured field bodies are simply to be treated as a - single line of characters with no further processing (except for - "folding" and "unfolding" as described in section 2.2.3). - -2.2.2. Structured Header Field Bodies - - Some field bodies in this specification have a syntax that is more - restrictive than the unstructured field bodies described above. - These are referred to as "structured" field bodies. Structured field - bodies are sequences of specific lexical tokens as described in - sections 3 and 4 of this specification. Many of these tokens are - allowed (according to their syntax) to be introduced or end with - comments (as described in section 3.2.2) as well as the white space - characters, and those white space characters are subject to "folding" - and "unfolding" as described in section 2.2.3. 
Semantic analysis of - structured field bodies is given along with their syntax. - -2.2.3. Long Header Fields - - Each header field is logically a single line of characters comprising - the field name, the colon, and the field body. For convenience - however, and to deal with the 998/78 character limitations per line, - the field body portion of a header field can be split into a - multiple-line representation; this is called "folding". The general - rule is that wherever this specification allows for folding white - space (not simply WSP characters), a CRLF may be inserted before any - WSP. - - - - -Resnick Standards Track [Page 8] - -RFC 5322 Internet Message Format October 2008 - - - For example, the header field: - - Subject: This is a test - - can be represented as: - - Subject: This - is a test - - Note: Though structured field bodies are defined in such a way - that folding can take place between many of the lexical tokens - (and even within some of the lexical tokens), folding SHOULD be - limited to placing the CRLF at higher-level syntactic breaks. For - instance, if a field body is defined as comma-separated values, it - is recommended that folding occur after the comma separating the - structured items in preference to other places where the field - could be folded, even if it is allowed elsewhere. - - The process of moving from this folded multiple-line representation - of a header field to its single line representation is called - "unfolding". Unfolding is accomplished by simply removing any CRLF - that is immediately followed by WSP. Each header field should be - treated in its unfolded form for further syntactic and semantic - evaluation. An unfolded header field has no length restriction and - therefore may be indeterminately long. - -2.3. Body - - The body of a message is simply lines of US-ASCII characters. 
The - only two limitations on the body are as follows: - - o CR and LF MUST only occur together as CRLF; they MUST NOT appear - independently in the body. - o Lines of characters in the body MUST be limited to 998 characters, - and SHOULD be limited to 78 characters, excluding the CRLF. - - Note: As was stated earlier, there are other documents, - specifically the MIME documents ([RFC2045], [RFC2046], [RFC2049], - [RFC4288], [RFC4289]), that extend (and limit) this specification - to allow for different sorts of message bodies. Again, these - mechanisms are beyond the scope of this document. - - - - - - - - - - -Resnick Standards Track [Page 9] - -RFC 5322 Internet Message Format October 2008 - - -3. Syntax - -3.1. Introduction - - The syntax as given in this section defines the legal syntax of - Internet messages. Messages that are conformant to this - specification MUST conform to the syntax in this section. If there - are options in this section where one option SHOULD be generated, - that is indicated either in the prose or in a comment next to the - syntax. - - For the defined expressions, a short description of the syntax and - use is given, followed by the syntax in ABNF, followed by a semantic - analysis. The following primitive tokens that are used but otherwise - unspecified are taken from the "Core Rules" of [RFC5234], Appendix - B.1: CR, LF, CRLF, HTAB, SP, WSP, DQUOTE, DIGIT, ALPHA, and VCHAR. - - In some of the definitions, there will be non-terminals whose names - start with "obs-". These "obs-" elements refer to tokens defined in - the obsolete syntax in section 4. In all cases, these productions - are to be ignored for the purposes of generating legal Internet - messages and MUST NOT be used as part of such a message. However, - when interpreting messages, these tokens MUST be honored as part of - the legal syntax. 
In this sense, section 3 defines a grammar for the - generation of messages, with "obs-" elements that are to be ignored, - while section 4 adds grammar for the interpretation of messages. - -3.2. Lexical Tokens - - The following rules are used to define an underlying lexical - analyzer, which feeds tokens to the higher-level parsers. This - section defines the tokens used in structured header field bodies. - - Note: Readers of this specification need to pay special attention - to how these lexical tokens are used in both the lower-level and - higher-level syntax later in the document. Particularly, the - white space tokens and the comment tokens defined in section 3.2.2 - get used in the lower-level tokens defined here, and those lower- - level tokens are in turn used as parts of the higher-level tokens - defined later. Therefore, white space and comments may be allowed - in the higher-level tokens even though they may not explicitly - appear in a particular definition. - -3.2.1. Quoted characters - - Some characters are reserved for special interpretation, such as - delimiting lexical tokens. To permit use of these characters as - uninterpreted data, a quoting mechanism is provided. - - - -Resnick Standards Track [Page 10] - -RFC 5322 Internet Message Format October 2008 - - - quoted-pair = ("\" (VCHAR / WSP)) / obs-qp - - Where any quoted-pair appears, it is to be interpreted as the - character alone. That is to say, the "\" character that appears as - part of a quoted-pair is semantically "invisible". - - Note: The "\" character may appear in a message where it is not - part of a quoted-pair. A "\" character that does not appear in a - quoted-pair is not semantically invisible. The only places in - this specification where quoted-pair currently appears are - ccontent, qcontent, and in obs-dtext in section 4. - -3.2.2. 
Folding White Space and Comments - - White space characters, including white space used in folding - (described in section 2.2.3), may appear between many elements in - header field bodies. Also, strings of characters that are treated as - comments may be included in structured field bodies as characters - enclosed in parentheses. The following defines the folding white - space (FWS) and comment constructs. - - Strings of characters enclosed in parentheses are considered comments - so long as they do not appear within a "quoted-string", as defined in - section 3.2.4. Comments may nest. - - There are several places in this specification where comments and FWS - may be freely inserted. To accommodate that syntax, an additional - token for "CFWS" is defined for places where comments and/or FWS can - occur. However, where CFWS occurs in this specification, it MUST NOT - be inserted in such a way that any line of a folded header field is - made up entirely of WSP characters and nothing else. - - FWS = ([*WSP CRLF] 1*WSP) / obs-FWS - ; Folding white space - - ctext = %d33-39 / ; Printable US-ASCII - %d42-91 / ; characters not including - %d93-126 / ; "(", ")", or "\" - obs-ctext - - ccontent = ctext / quoted-pair / comment - - comment = "(" *([FWS] ccontent) [FWS] ")" - - CFWS = (1*([FWS] comment) [FWS]) / FWS - - - - - - -Resnick Standards Track [Page 11] - -RFC 5322 Internet Message Format October 2008 - - - Throughout this specification, where FWS (the folding white space - token) appears, it indicates a place where folding, as discussed in - section 2.2.3, may take place. Wherever folding appears in a message - (that is, a header field body containing a CRLF followed by any WSP), - unfolding (removal of the CRLF) is performed before any further - semantic analysis is performed on that header field according to this - specification. That is to say, any CRLF that appears in FWS is - semantically "invisible". 
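Unfolding as described here (removal of every CRLF that is immediately followed by WSP) is small enough to sketch directly. The example assumes one header field held as a string with its folds still in place; `unfold` and `split_field` are hypothetical names, and obs-FWS is not handled.

```python
import re
from typing import Tuple

# A CRLF counts as a fold only when a space or horizontal tab follows it.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(field: str) -> str:
    """Remove the CRLF of every fold, keeping the white space after it."""
    return _FOLD.sub("", field)


def split_field(field: str) -> Tuple[str, str]:
    """Return (field_name, field_body) of one unfolded header field."""
    name, colon, body = unfold(field).partition(":")
    if not colon:
        raise ValueError("header field has no colon")
    # Drop the CRLF that terminates the field; it is not part of the body.
    return name, body.rstrip("\r\n")
```

Applied to the folded Subject example from section 2.2.3, `unfold` gives back the single-line form, because only the CRLF is removed and the WSP that follows it is kept.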
- - A comment is normally used in a structured field body to provide some - human-readable informational text. Since a comment is allowed to - contain FWS, folding is permitted within the comment. Also note that - since quoted-pair is allowed in a comment, the parentheses and - backslash characters may appear in a comment, so long as they appear - as a quoted-pair. Semantically, the enclosing parentheses are not - part of the comment; the comment is what is contained between the two - parentheses. As stated earlier, the "\" in any quoted-pair and the - CRLF in any FWS that appears within the comment are semantically - "invisible" and therefore not part of the comment either. - - Runs of FWS, comment, or CFWS that occur between lexical tokens in a - structured header field are semantically interpreted as a single - space character. - -3.2.3. Atom - - Several productions in structured header field bodies are simply - strings of certain basic characters. Such productions are called - atoms. - - Some of the structured header field bodies also allow the period - character (".", ASCII value 46) within runs of atext. An additional - "dot-atom" token is defined for those purposes. - - Note: The "specials" token does not appear anywhere else in this - specification. It is simply the visible (i.e., non-control, non- - white space) characters that do not appear in atext. It is - provided only because it is useful for implementers who use tools - that lexically analyze messages. Each of the characters in - specials can be used to indicate a tokenization point in lexical - analysis. - - - - - - - - - - -Resnick Standards Track [Page 12] - -RFC 5322 Internet Message Format October 2008 - - - atext = ALPHA / DIGIT / ; Printable US-ASCII - "!" / "#" / ; characters not including - "$" / "%" / ; specials. Used for atoms. - "&" / "'" / - "*" / "+" / - "-" / "/" / - "=" / "?" 
/ - "^" / "_" / - "`" / "{" / - "|" / "}" / - "~" - - atom = [CFWS] 1*atext [CFWS] - - dot-atom-text = 1*atext *("." 1*atext) - - dot-atom = [CFWS] dot-atom-text [CFWS] - - specials = "(" / ")" / ; Special characters that do - "<" / ">" / ; not appear in atext - "[" / "]" / - ":" / ";" / - "@" / "\" / - "," / "." / - DQUOTE - - Both atom and dot-atom are interpreted as a single unit, comprising - the string of characters that make it up. Semantically, the optional - comments and FWS surrounding the rest of the characters are not part - of the atom; the atom is only the run of atext characters in an atom, - or the atext and "." characters in a dot-atom. - -3.2.4. Quoted Strings - - Strings of characters that include characters other than those - allowed in atoms can be represented in a quoted string format, where - the characters are surrounded by quote (DQUOTE, ASCII value 34) - characters. - - - - - - - - - - - - - -Resnick Standards Track [Page 13] - -RFC 5322 Internet Message Format October 2008 - - - qtext = %d33 / ; Printable US-ASCII - %d35-91 / ; characters not including - %d93-126 / ; "\" or the quote character - obs-qtext - - qcontent = qtext / quoted-pair - - quoted-string = [CFWS] - DQUOTE *([FWS] qcontent) [FWS] DQUOTE - [CFWS] - - A quoted-string is treated as a unit. That is, quoted-string is - identical to atom, semantically. Since a quoted-string is allowed to - contain FWS, folding is permitted. Also note that since quoted-pair - is allowed in a quoted-string, the quote and backslash characters may - appear in a quoted-string so long as they appear as a quoted-pair. - - Semantically, neither the optional CFWS outside of the quote - characters nor the quote characters themselves are part of the - quoted-string; the quoted-string is what is contained between the two - quote characters. 
As stated earlier, the "\" in any quoted-pair and - the CRLF in any FWS/CFWS that appears within the quoted-string are - semantically "invisible" and therefore not part of the quoted-string - either. - -3.2.5. Miscellaneous Tokens - - Three additional tokens are defined: word and phrase for combinations - of atoms and/or quoted-strings, and unstructured for use in - unstructured header fields and in some places within structured - header fields. - - word = atom / quoted-string - - phrase = 1*word / obs-phrase - - unstructured = (*([FWS] VCHAR) *WSP) / obs-unstruct - -3.3. Date and Time Specification - - Date and time values occur in several header fields. This section - specifies the syntax for a full date and time specification. Though - folding white space is permitted throughout the date-time - specification, it is RECOMMENDED that a single space be used in each - place that FWS appears (whether it is required or optional); some - older implementations will not interpret longer sequences of folding - white space correctly. - - - - -Resnick Standards Track [Page 14] - -RFC 5322 Internet Message Format October 2008 - - - date-time = [ day-of-week "," ] date time [CFWS] - - day-of-week = ([FWS] day-name) / obs-day-of-week - - day-name = "Mon" / "Tue" / "Wed" / "Thu" / - "Fri" / "Sat" / "Sun" - - date = day month year - - day = ([FWS] 1*2DIGIT FWS) / obs-day - - month = "Jan" / "Feb" / "Mar" / "Apr" / - "May" / "Jun" / "Jul" / "Aug" / - "Sep" / "Oct" / "Nov" / "Dec" - - year = (FWS 4*DIGIT FWS) / obs-year - - time = time-of-day zone - - time-of-day = hour ":" minute [ ":" second ] - - hour = 2DIGIT / obs-hour - - minute = 2DIGIT / obs-minute - - second = 2DIGIT / obs-second - - zone = (FWS ( "+" / "-" ) 4DIGIT) / obs-zone - - The day is the numeric day of the month. The year is any numeric - year 1900 or later. - - The time-of-day specifies the number of hours, minutes, and - optionally seconds since midnight of the date indicated. 
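A minimal sketch of a parser for the non-obsolete date-time form follows. It assumes the recommended layout with a single space wherever FWS appears, no comments, and day and month names capitalized exactly as written in the grammar (ABNF literals are case-insensitive, so a complete parser would accept other casings). The zone is read as hours and minutes east ("+") or west ("-") of UTC. All names are hypothetical, and a seconds value of 60 is rejected only because Python's `datetime` cannot represent it.

```python
import re
from datetime import datetime, timedelta, timezone

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

_DATE_TIME = re.compile(
    r"(?:(?P<dow>Mon|Tue|Wed|Thu|Fri|Sat|Sun), )?"
    r"(?P<day>\d{1,2}) "
    r"(?P<month>Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"(?P<year>\d{4,}) "
    r"(?P<hour>\d{2}):(?P<minute>\d{2})(?::(?P<second>\d{2}))? "
    r"(?P<sign>[+-])(?P<zh>\d{2})(?P<zm>\d{2})"
)


def parse_date_time(value: str) -> datetime:
    """Parse a simple date-time value into a timezone-aware datetime."""
    m = _DATE_TIME.fullmatch(value)
    if m is None:
        raise ValueError("not in the simple date-time form: %r" % value)
    if int(m["zm"]) > 59:
        raise ValueError("zone minutes out of range")
    offset = timedelta(hours=int(m["zh"]), minutes=int(m["zm"]))
    if m["sign"] == "-":
        offset = -offset
    # datetime() itself rejects impossible dates such as 31 Feb.
    parsed = datetime(
        int(m["year"]), MONTHS.index(m["month"]) + 1, int(m["day"]),
        int(m["hour"]), int(m["minute"]), int(m["second"] or 0),
        tzinfo=timezone(offset),
    )
    # If a day-of-week is given, it has to agree with the date.
    if m["dow"] and DAY_NAMES[parsed.weekday()] != m["dow"]:
        raise ValueError("day-of-week does not match the date")
    return parsed
```

For real mail, Python's standard `email.utils.parsedate_to_datetime` is usually the better choice, since it also copes with obsolete forms; the sketch only mirrors the grammar shown above.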
- - The date and time-of-day SHOULD express local time. - - The zone specifies the offset from Coordinated Universal Time (UTC, - formerly referred to as "Greenwich Mean Time") that the date and - time-of-day represent. The "+" or "-" indicates whether the time-of- - day is ahead of (i.e., east of) or behind (i.e., west of) Universal - Time. The first two digits indicate the number of hours difference - from Universal Time, and the last two digits indicate the number of - additional minutes difference from Universal Time. (Hence, +hhmm - means +(hh * 60 + mm) minutes, and -hhmm means -(hh * 60 + mm) - minutes). The form "+0000" SHOULD be used to indicate a time zone at - Universal Time. Though "-0000" also indicates Universal Time, it is - - - - -Resnick Standards Track [Page 15] - -RFC 5322 Internet Message Format October 2008 - - - used to indicate that the time was generated on a system that may be - in a local time zone other than Universal Time and that the date-time - contains no information about the local time zone. - - A date-time specification MUST be semantically valid. That is, the - day-of-week (if included) MUST be the day implied by the date, the - numeric day-of-month MUST be between 1 and the number of days allowed - for the specified month (in the specified year), the time-of-day MUST - be in the range 00:00:00 through 23:59:60 (the number of seconds - allowing for a leap second; see [RFC1305]), and the last two digits - of the zone MUST be within the range 00 through 59. - -3.4. Address Specification - - Addresses occur in several message header fields to indicate senders - and recipients of messages. An address may either be an individual - mailbox, or a group of mailboxes. 
- - address = mailbox / group - - mailbox = name-addr / addr-spec - - name-addr = [display-name] angle-addr - - angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / - obs-angle-addr - - group = display-name ":" [group-list] ";" [CFWS] - - display-name = phrase - - mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list - - address-list = (address *("," address)) / obs-addr-list - - group-list = mailbox-list / CFWS / obs-group-list - - A mailbox receives mail. It is a conceptual entity that does not - necessarily pertain to file storage. For example, some sites may - choose to print mail on a printer and deliver the output to the - addressee's desk. - - Normally, a mailbox is composed of two parts: (1) an optional display - name that indicates the name of the recipient (which can be a person - or a system) that could be displayed to the user of a mail - application, and (2) an addr-spec address enclosed in angle brackets - - - - - -Resnick Standards Track [Page 16] - -RFC 5322 Internet Message Format October 2008 - - - ("<" and ">"). There is an alternate simple form of a mailbox where - the addr-spec address appears alone, without the recipient's name or - the angle brackets. The Internet addr-spec address is described in - section 3.4.1. - - Note: Some legacy implementations used the simple form where the - addr-spec appears without the angle brackets, but included the - name of the recipient in parentheses as a comment following the - addr-spec. Since the meaning of the information in a comment is - unspecified, implementations SHOULD use the full name-addr form of - the mailbox, instead of the legacy form, to specify the display - name associated with a mailbox. Also, because some legacy - implementations interpret the comment, comments generally SHOULD - NOT be used in address fields to avoid confusing such - implementations. - - When it is desirable to treat several mailboxes as a single unit - (i.e., in a distribution list), the group construct can be used. 
The - group construct allows the sender to indicate a named group of - recipients. This is done by giving a display name for the group, - followed by a colon, followed by a comma-separated list of any number - of mailboxes (including zero and one), and ending with a semicolon. - Because the list of mailboxes can be empty, using the group construct - is also a simple way to communicate to recipients that the message - was sent to one or more named sets of recipients, without actually - providing the individual mailbox address for any of those recipients. - -3.4.1. Addr-Spec Specification - - An addr-spec is a specific Internet identifier that contains a - locally interpreted string followed by the at-sign character ("@", - ASCII value 64) followed by an Internet domain. The locally - interpreted string is either a quoted-string or a dot-atom. If the - string can be represented as a dot-atom (that is, it contains no - characters other than atext characters or "." surrounded by atext - characters), then the dot-atom form SHOULD be used and the quoted- - string form SHOULD NOT be used. Comments and folding white space - SHOULD NOT be used around the "@" in the addr-spec. - - Note: A liberal syntax for the domain portion of addr-spec is - given here. However, the domain portion contains addressing - information specified by and used in other protocols (e.g., - [RFC1034], [RFC1035], [RFC1123], [RFC5321]). It is therefore - incumbent upon implementations to conform to the syntax of - addresses for the context in which they are used. 
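The preference for the dot-atom form of the local part can be sketched as follows. The example assumes the local part contains only printable US-ASCII characters and spaces; control characters, folding, and the obsolete forms are not handled, and the function names are hypothetical.

```python
import re

# atext as listed in section 3.2.3: ALPHA, DIGIT, and the printable
# characters that are not specials.
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
_DOT_ATOM_TEXT = re.compile(rf"{_ATEXT}+(?:\.{_ATEXT}+)*")


def format_local_part(local: str) -> str:
    """Return the local part as a dot-atom if possible, else quoted."""
    if _DOT_ATOM_TEXT.fullmatch(local):
        return local
    # Backslash and DQUOTE may only appear in a quoted-string as quoted-pairs.
    escaped = local.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def format_addr_spec(local: str, domain: str) -> str:
    """Join local part and domain with no white space around the "@"."""
    return format_local_part(local) + "@" + domain
```

With this, `john.doe` stays a dot-atom, while `john doe` or `a..b` fall back to the quoted-string form because they are not valid dot-atom-text. The domain is passed through unchecked here; its syntax depends on the context in which the address is used.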
- - - - - - -Resnick Standards Track [Page 17] - -RFC 5322 Internet Message Format October 2008 - - - addr-spec = local-part "@" domain - - local-part = dot-atom / quoted-string / obs-local-part - - domain = dot-atom / domain-literal / obs-domain - - domain-literal = [CFWS] "[" *([FWS] dtext) [FWS] "]" [CFWS] - - dtext = %d33-90 / ; Printable US-ASCII - %d94-126 / ; characters not including - obs-dtext ; "[", "]", or "\" - - The domain portion identifies the point to which the mail is - delivered. In the dot-atom form, this is interpreted as an Internet - domain name (either a host name or a mail exchanger name) as - described in [RFC1034], [RFC1035], and [RFC1123]. In the domain- - literal form, the domain is interpreted as the literal Internet - address of the particular host. In both cases, how addressing is - used and how messages are transported to a particular host is covered - in separate documents, such as [RFC5321]. These mechanisms are - outside of the scope of this document. - - The local-part portion is a domain-dependent string. In addresses, - it is simply interpreted on the particular host as a name of a - particular mailbox. - -3.5. Overall Message Syntax - - A message consists of header fields, optionally followed by a message - body. Lines in a message MUST be a maximum of 998 characters - excluding the CRLF, but it is RECOMMENDED that lines be limited to 78 - characters excluding the CRLF. (See section 2.1.1 for explanation.) - In a message body, though all of the characters listed in the text - rule MAY be used, the use of US-ASCII control characters (values 1 - through 8, 11, 12, and 14 through 31) is discouraged since their - interpretation by receivers for display is not guaranteed. 
- - message = (fields / obs-fields) - [CRLF body] - - body = (*(*998text CRLF) *998text) / obs-body - - text = %d1-9 / ; Characters excluding CR - %d11 / ; and LF - %d12 / - %d14-127 - - - - - -Resnick Standards Track [Page 18] - -RFC 5322 Internet Message Format October 2008 - - - The header fields carry most of the semantic information and are - defined in section 3.6. The body is simply a series of lines of text - that are uninterpreted for the purposes of this specification. - -3.6. Field Definitions - - The header fields of a message are defined here. All header fields - have the same general syntactic structure: a field name, followed by - a colon, followed by the field body. The specific syntax for each - header field is defined in the subsequent sections. - - Note: In the ABNF syntax for each field in subsequent sections, - each field name is followed by the required colon. However, for - brevity, sometimes the colon is not referred to in the textual - description of the syntax. It is, nonetheless, required. - - It is important to note that the header fields are not guaranteed to - be in a particular order. They may appear in any order, and they - have been known to be reordered occasionally when transported over - the Internet. However, for the purposes of this specification, - header fields SHOULD NOT be reordered when a message is transported - or transformed. More importantly, the trace header fields and resent - header fields MUST NOT be reordered, and SHOULD be kept in blocks - prepended to the message. See sections 3.6.6 and 3.6.7 for more - information. - - The only required header fields are the origination date field and - the originator address field(s). All other header fields are - syntactically optional. More information is contained in the table - following this definition. 
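A minimal sketch of a check for the two required fields: it counts field names in a header section and reports "date" or "from" when the field does not occur exactly once (the occurrence table in this section gives a minimum and maximum of 1 for both). Field names are compared case-insensitively, folded continuation lines are skipped, and the names used are assumptions for this example.

```python
from collections import Counter
from typing import List

REQUIRED_ONCE = ("date", "from")


def required_field_problems(header_section: str) -> List[str]:
    """Return the required field names that do not occur exactly once."""
    counts: Counter = Counter()
    for line in header_section.split("\r\n"):
        if not line or line[0] in " \t":
            # Empty string after the final CRLF, or a folded continuation.
            continue
        name, colon, _body = line.partition(":")
        if colon:
            counts[name.strip().lower()] += 1
    return [name for name in REQUIRED_ONCE if counts[name] != 1]
```

An empty result means both fields are present exactly once; it says nothing about whether their bodies are syntactically valid.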
-
-   fields          =   *(trace
-                         *optional-field /
-                         *(resent-date /
-                          resent-from /
-                          resent-sender /
-                          resent-to /
-                          resent-cc /
-                          resent-bcc /
-                          resent-msg-id))
-                       *(orig-date /
-                       from /
-                       sender /
-                       reply-to /
-                       to /
-                       cc /
-                       bcc /
-                       message-id /
-                       in-reply-to /
-                       references /
-                       subject /
-                       comments /
-                       keywords /
-                       optional-field)
-
-   The following table indicates limits on the number of times each
-   field may occur in the header section of a message as well as any
-   special limitations on the use of those fields. An asterisk ("*")
-   next to a value in the minimum or maximum column indicates that a
-   special restriction appears in the Notes column.
-
-   +----------------+--------+------------+----------------------------+
-   | Field          | Min    | Max number | Notes                      |
-   |                | number |            |                            |
-   +----------------+--------+------------+----------------------------+
-   | trace          | 0      | unlimited  | Block prepended - see      |
-   |                |        |            | 3.6.7                      |
-   | resent-date    | 0*     | unlimited* | One per block, required if |
-   |                |        |            | other resent fields are    |
-   |                |        |            | present - see 3.6.6        |
-   | resent-from    | 0      | unlimited* | One per block - see 3.6.6  |
-   | resent-sender  | 0*     | unlimited* | One per block, MUST occur  |
-   |                |        |            | with multi-address         |
-   |                |        |            | resent-from - see 3.6.6    |
-   | resent-to      | 0      | unlimited* | One per block - see 3.6.6  |
-   | resent-cc      | 0      | unlimited* | One per block - see 3.6.6  |
-   | resent-bcc     | 0      | unlimited* | One per block - see 3.6.6  |
-   | resent-msg-id  | 0      | unlimited* | One per block - see 3.6.6  |
-   | orig-date      | 1      | 1          |                            |
-   | from           | 1      | 1          | See sender and 3.6.2       |
-   | sender         | 0*     | 1          | MUST occur with            |
-   |                |        |            | multi-address from - see   |
-   |                |        |            | 3.6.2                      |
-   | reply-to       | 0      | 1          |                            |
-   | to             | 0      | 1          |                            |
-   | cc             | 0      | 1          |                            |
-   | bcc            | 0      | 1          |                            |
-   | message-id     | 0*     | 1          | SHOULD be present - see    |
-   |                |        |            | 3.6.4                      |
-   | in-reply-to    | 0*     | 1          | SHOULD occur in some       |
-   |                |        |            | replies - see 3.6.4        |
-   | references     | 0*     | 1          | SHOULD occur in some       |
-   |                |        |            | replies - see 3.6.4        |
-   | subject        | 0      | 1          |                            |
-   | comments       | 0      | unlimited  |                            |
-   | keywords       | 0      | unlimited  |                            |
-   | optional-field | 0      | unlimited  |                            |
-   +----------------+--------+------------+----------------------------+
-
-   The exact interpretation of each field is described in subsequent
-   sections.
-
-3.6.1. The Origination Date Field
-
-   The origination date field consists of the field name "Date" followed
-   by a date-time specification.
-
-   orig-date       =   "Date:" date-time CRLF
-
-   The origination date specifies the date and time at which the creator
-   of the message indicated that the message was complete and ready to
-   enter the mail delivery system. For instance, this might be the time
-   that a user pushes the "send" or "submit" button in an application
-   program. In any case, it is specifically not intended to convey the
-   time that the message is actually transported, but rather the time at
-   which the human or other creator of the message has put the message
-   into its final form, ready for transport. (For example, a portable
-   computer user who is not connected to a network might queue a message
-   for delivery. The origination date is intended to contain the date
-   and time that the user queued the message, not the time when the user
-   connected to the network to send the message.)
-
-3.6.2. Originator Fields
-
-   The originator fields of a message consist of the from field, the
-   sender field (when applicable), and optionally the reply-to field.
-   The from field consists of the field name "From" and a
-   comma-separated list of one or more mailbox specifications.
If the from - field contains more than one mailbox specification in the mailbox- - list, then the sender field, containing the field name "Sender" and a - single mailbox specification, MUST appear in the message. In either - case, an optional reply-to field MAY also be included, which contains - the field name "Reply-To" and a comma-separated list of one or more - addresses. - - from = "From:" mailbox-list CRLF - - sender = "Sender:" mailbox CRLF - - reply-to = "Reply-To:" address-list CRLF - - The originator fields indicate the mailbox(es) of the source of the - message. The "From:" field specifies the author(s) of the message, - that is, the mailbox(es) of the person(s) or system(s) responsible - for the writing of the message. The "Sender:" field specifies the - mailbox of the agent responsible for the actual transmission of the - message. For example, if a secretary were to send a message for - another person, the mailbox of the secretary would appear in the - "Sender:" field and the mailbox of the actual author would appear in - the "From:" field. If the originator of the message can be indicated - - - -Resnick Standards Track [Page 22] - -RFC 5322 Internet Message Format October 2008 - - - by a single mailbox and the author and transmitter are identical, the - "Sender:" field SHOULD NOT be used. Otherwise, both fields SHOULD - appear. - - Note: The transmitter information is always present. The absence - of the "Sender:" field is sometimes mistakenly taken to mean that - the agent responsible for transmission of the message has not been - specified. This absence merely means that the transmitter is - identical to the author and is therefore not redundantly placed - into the "Sender:" field. - - The originator fields also provide the information required when - replying to a message. When the "Reply-To:" field is present, it - indicates the address(es) to which the author of the message suggests - that replies be sent. 
In the absence of the "Reply-To:" field, - replies SHOULD by default be sent to the mailbox(es) specified in the - "From:" field unless otherwise specified by the person composing the - reply. - - In all cases, the "From:" field SHOULD NOT contain any mailbox that - does not belong to the author(s) of the message. See also section - 3.6.3 for more information on forming the destination addresses for a - reply. - -3.6.3. Destination Address Fields - - The destination fields of a message consist of three possible fields, - each of the same form: the field name, which is either "To", "Cc", or - "Bcc", followed by a comma-separated list of one or more addresses - (either mailbox or group syntax). - - to = "To:" address-list CRLF - - cc = "Cc:" address-list CRLF - - bcc = "Bcc:" [address-list / CFWS] CRLF - - The destination fields specify the recipients of the message. Each - destination field may have one or more addresses, and the addresses - indicate the intended recipients of the message. The only difference - between the three fields is how each is used. - - The "To:" field contains the address(es) of the primary recipient(s) - of the message. - - - - - - - -Resnick Standards Track [Page 23] - -RFC 5322 Internet Message Format October 2008 - - - The "Cc:" field (where the "Cc" means "Carbon Copy" in the sense of - making a copy on a typewriter using carbon paper) contains the - addresses of others who are to receive the message, though the - content of the message may not be directed at them. - - The "Bcc:" field (where the "Bcc" means "Blind Carbon Copy") contains - addresses of recipients of the message whose addresses are not to be - revealed to other recipients of the message. There are three ways in - which the "Bcc:" field is used. In the first case, when a message - containing a "Bcc:" field is prepared to be sent, the "Bcc:" line is - removed even though all of the recipients (including those specified - in the "Bcc:" field) are sent a copy of the message. 
In the second - case, recipients specified in the "To:" and "Cc:" lines each are sent - a copy of the message with the "Bcc:" line removed as above, but the - recipients on the "Bcc:" line get a separate copy of the message - containing a "Bcc:" line. (When there are multiple recipient - addresses in the "Bcc:" field, some implementations actually send a - separate copy of the message to each recipient with a "Bcc:" - containing only the address of that particular recipient.) Finally, - since a "Bcc:" field may contain no addresses, a "Bcc:" field can be - sent without any addresses indicating to the recipients that blind - copies were sent to someone. Which method to use with "Bcc:" fields - is implementation dependent, but refer to the "Security - Considerations" section of this document for a discussion of each. - - When a message is a reply to another message, the mailboxes of the - authors of the original message (the mailboxes in the "From:" field) - or mailboxes specified in the "Reply-To:" field (if it exists) MAY - appear in the "To:" field of the reply since these would normally be - the primary recipients of the reply. If a reply is sent to a message - that has destination fields, it is often desirable to send a copy of - the reply to all of the recipients of the message, in addition to the - author. When such a reply is formed, addresses in the "To:" and - "Cc:" fields of the original message MAY appear in the "Cc:" field of - the reply, since these are normally secondary recipients of the - reply. If a "Bcc:" field is present in the original message, - addresses in that field MAY appear in the "Bcc:" field of the reply, - but they SHOULD NOT appear in the "To:" or "Cc:" fields. - - Note: Some mail applications have automatic reply commands that - include the destination addresses of the original message in the - destination addresses of the reply. How those reply commands - behave is implementation dependent and is beyond the scope of this - document. 
In particular, whether or not to include the original - destination addresses when the original message had a "Reply-To:" - field is not addressed here. - - - - - -Resnick Standards Track [Page 24] - -RFC 5322 Internet Message Format October 2008 - - -3.6.4. Identification Fields - - Though listed as optional in the table in section 3.6, every message - SHOULD have a "Message-ID:" field. Furthermore, reply messages - SHOULD have "In-Reply-To:" and "References:" fields as appropriate - and as described below. - - The "Message-ID:" field contains a single unique message identifier. - The "References:" and "In-Reply-To:" fields each contain one or more - unique message identifiers, optionally separated by CFWS. - - The message identifier (msg-id) syntax is a limited version of the - addr-spec construct enclosed in the angle bracket characters, "<" and - ">". Unlike addr-spec, this syntax only permits the dot-atom-text - form on the left-hand side of the "@" and does not have internal CFWS - anywhere in the message identifier. - - Note: As with addr-spec, a liberal syntax is given for the right- - hand side of the "@" in a msg-id. However, later in this section, - the use of a domain for the right-hand side of the "@" is - RECOMMENDED. Again, the syntax of domain constructs is specified - by and used in other protocols (e.g., [RFC1034], [RFC1035], - [RFC1123], [RFC5321]). It is therefore incumbent upon - implementations to conform to the syntax of addresses for the - context in which they are used. - - message-id = "Message-ID:" msg-id CRLF - - in-reply-to = "In-Reply-To:" 1*msg-id CRLF - - references = "References:" 1*msg-id CRLF - - msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS] - - id-left = dot-atom-text / obs-id-left - - id-right = dot-atom-text / no-fold-literal / obs-id-right - - no-fold-literal = "[" *dtext "]" - - The "Message-ID:" field provides a unique message identifier that - refers to a particular version of a particular message. 
The - uniqueness of the message identifier is guaranteed by the host that - generates it (see below). This message identifier is intended to be - machine readable and not necessarily meaningful to humans. A message - identifier pertains to exactly one version of a particular message; - subsequent revisions to the message each receive new message - identifiers. - - - -Resnick Standards Track [Page 25] - -RFC 5322 Internet Message Format October 2008 - - - Note: There are many instances when messages are "changed", but - those changes do not constitute a new instantiation of that - message, and therefore the message would not get a new message - identifier. For example, when messages are introduced into the - transport system, they are often prepended with additional header - fields such as trace fields (described in section 3.6.7) and - resent fields (described in section 3.6.6). The addition of such - header fields does not change the identity of the message and - therefore the original "Message-ID:" field is retained. In all - cases, it is the meaning that the sender of the message wishes to - convey (i.e., whether this is the same message or a different - message) that determines whether or not the "Message-ID:" field - changes, not any particular syntactic difference that appears (or - does not appear) in the message. - - The "In-Reply-To:" and "References:" fields are used when creating a - reply to a message. They hold the message identifier of the original - message and the message identifiers of other messages (for example, - in the case of a reply to a message that was itself a reply). The - "In-Reply-To:" field may be used to identify the message (or - messages) to which the new message is a reply, while the - "References:" field may be used to identify a "thread" of - conversation. 
- - When creating a reply to a message, the "In-Reply-To:" and - "References:" fields of the resultant message are constructed as - follows: - - The "In-Reply-To:" field will contain the contents of the - "Message-ID:" field of the message to which this one is a reply (the - "parent message"). If there is more than one parent message, then - the "In-Reply-To:" field will contain the contents of all of the - parents' "Message-ID:" fields. If there is no "Message-ID:" field in - any of the parent messages, then the new message will have no "In- - Reply-To:" field. - - The "References:" field will contain the contents of the parent's - "References:" field (if any) followed by the contents of the parent's - "Message-ID:" field (if any). If the parent message does not contain - a "References:" field but does have an "In-Reply-To:" field - containing a single message identifier, then the "References:" field - will contain the contents of the parent's "In-Reply-To:" field - followed by the contents of the parent's "Message-ID:" field (if - any). If the parent has none of the "References:", "In-Reply-To:", - or "Message-ID:" fields, then the new message will have no - "References:" field. - - - - - -Resnick Standards Track [Page 26] - -RFC 5322 Internet Message Format October 2008 - - - Note: Some implementations parse the "References:" field to - display the "thread of the discussion". These implementations - assume that each new message is a reply to a single parent and - hence that they can walk backwards through the "References:" field - to find the parent of each message listed there. Therefore, - trying to form a "References:" field for a reply that has multiple - parents is discouraged; how to do so is not defined in this - document. - - The message identifier (msg-id) itself MUST be a globally unique - identifier for a message. The generator of the message identifier - MUST guarantee that the msg-id is unique. 
There are several - algorithms that can be used to accomplish this. Since the msg-id has - a similar syntax to addr-spec (identical except that quoted strings, - comments, and folding white space are not allowed), a good method is - to put the domain name (or a domain literal IP address) of the host - on which the message identifier was created on the right-hand side of - the "@" (since domain names and IP addresses are normally unique), - and put a combination of the current absolute date and time along - with some other currently unique (perhaps sequential) identifier - available on the system (for example, a process id number) on the - left-hand side. Though other algorithms will work, it is RECOMMENDED - that the right-hand side contain some domain identifier (either of - the host itself or otherwise) such that the generator of the message - identifier can guarantee the uniqueness of the left-hand side within - the scope of that domain. - - Semantically, the angle bracket characters are not part of the - msg-id; the msg-id is what is contained between the two angle bracket - characters. - -3.6.5. Informational Fields - - The informational fields are all optional. The "Subject:" and - "Comments:" fields are unstructured fields as defined in section - 2.2.1, and therefore may contain text or folding white space. The - "Keywords:" field contains a comma-separated list of one or more - words or quoted-strings. - - subject = "Subject:" unstructured CRLF - - comments = "Comments:" unstructured CRLF - - keywords = "Keywords:" phrase *("," phrase) CRLF - - These three fields are intended to have only human-readable content - with information about the message. The "Subject:" field is the most - common and contains a short string identifying the topic of the - - - -Resnick Standards Track [Page 27] - -RFC 5322 Internet Message Format October 2008 - - - message. 
When used in a reply, the field body MAY start with the - string "Re: " (an abbreviation of the Latin "in re", meaning "in the - matter of") followed by the contents of the "Subject:" field body of - the original message. If this is done, only one instance of the - literal string "Re: " ought to be used since use of other strings or - more than one instance can lead to undesirable consequences. The - "Comments:" field contains any additional comments on the text of the - body of the message. The "Keywords:" field contains a comma- - separated list of important words and phrases that might be useful - for the recipient. - -3.6.6. Resent Fields - - Resent fields SHOULD be added to any message that is reintroduced by - a user into the transport system. A separate set of resent fields - SHOULD be added each time this is done. All of the resent fields - corresponding to a particular resending of the message SHOULD be - grouped together. Each new set of resent fields is prepended to the - message; that is, the most recent set of resent fields appears - earlier in the message. No other fields in the message are changed - when resent fields are added. - - Each of the resent fields corresponds to a particular field elsewhere - in the syntax. For instance, the "Resent-Date:" field corresponds to - the "Date:" field and the "Resent-To:" field corresponds to the "To:" - field. In each case, the syntax for the field body is identical to - the syntax given previously for the corresponding field. - - When resent fields are used, the "Resent-From:" and "Resent-Date:" - fields MUST be sent. The "Resent-Message-ID:" field SHOULD be sent. - "Resent-Sender:" SHOULD NOT be used if "Resent-Sender:" would be - identical to "Resent-From:". 
- - resent-date = "Resent-Date:" date-time CRLF - - resent-from = "Resent-From:" mailbox-list CRLF - - resent-sender = "Resent-Sender:" mailbox CRLF - - resent-to = "Resent-To:" address-list CRLF - - resent-cc = "Resent-Cc:" address-list CRLF - - resent-bcc = "Resent-Bcc:" [address-list / CFWS] CRLF - - resent-msg-id = "Resent-Message-ID:" msg-id CRLF - - - - - -Resnick Standards Track [Page 28] - -RFC 5322 Internet Message Format October 2008 - - - Resent fields are used to identify a message as having been - reintroduced into the transport system by a user. The purpose of - using resent fields is to have the message appear to the final - recipient as if it were sent directly by the original sender, with - all of the original fields remaining the same. Each set of resent - fields correspond to a particular resending event. That is, if a - message is resent multiple times, each set of resent fields gives - identifying information for each individual time. Resent fields are - strictly informational. They MUST NOT be used in the normal - processing of replies or other such automatic actions on messages. - - Note: Reintroducing a message into the transport system and using - resent fields is a different operation from "forwarding". - "Forwarding" has two meanings: One sense of forwarding is that a - mail reading program can be told by a user to forward a copy of a - message to another person, making the forwarded message the body - of the new message. A forwarded message in this sense does not - appear to have come from the original sender, but is an entirely - new message from the forwarder of the message. Forwarding may - also mean that a mail transport program gets a message and - forwards it on to a different destination for final delivery. - Resent header fields are not intended for use with either type of - forwarding. - - The resent originator fields indicate the mailbox of the person(s) or - system(s) that resent the message. 
As with the regular originator - fields, there are two forms: a simple "Resent-From:" form, which - contains the mailbox of the individual doing the resending, and the - more complex form, when one individual (identified in the "Resent- - Sender:" field) resends a message on behalf of one or more others - (identified in the "Resent-From:" field). - - Note: When replying to a resent message, replies behave just as - they would with any other message, using the original "From:", - "Reply-To:", "Message-ID:", and other fields. The resent fields - are only informational and MUST NOT be used in the normal - processing of replies. - - The "Resent-Date:" indicates the date and time at which the resent - message is dispatched by the resender of the message. Like the - "Date:" field, it is not the date and time that the message was - actually transported. - - The "Resent-To:", "Resent-Cc:", and "Resent-Bcc:" fields function - identically to the "To:", "Cc:", and "Bcc:" fields, respectively, - except that they indicate the recipients of the resent message, not - the recipients of the original message. - - - - -Resnick Standards Track [Page 29] - -RFC 5322 Internet Message Format October 2008 - - - The "Resent-Message-ID:" field provides a unique identifier for the - resent message. - -3.6.7. Trace Fields - - The trace fields are a group of header fields consisting of an - optional "Return-Path:" field, and one or more "Received:" fields. - The "Return-Path:" header field contains a pair of angle brackets - that enclose an optional addr-spec. The "Received:" field contains a - (possibly empty) list of tokens followed by a semicolon and a date- - time specification. Each token must be a word, angle-addr, addr- - spec, or a domain. Further restrictions are applied to the syntax of - the trace fields by specifications that provide for their use, such - as [RFC5321]. 
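As a rough illustration of the "Received:" shape described above (a list of tokens, a semicolon, then a date-time), the following Python sketch splits an already unfolded field body at its last semicolon. The function name `split_received` is hypothetical, and the sketch assumes that no comment or quoted string in the body contains a ";" after the real separator. It does not validate the tokens or the date-time against the grammar.

```python
from typing import List, Tuple


def split_received(body: str) -> Tuple[List[str], str]:
    """Split an unfolded Received: field body into (tokens, date-time text).

    Assumption: the last ";" in the body is the separator in front of the
    date-time. Comments and quoted strings are not parsed here.
    """
    tokens_part, separator, date_part = body.rpartition(";")
    if not separator:
        raise ValueError("Received body has no ';' before the date-time")
    # The token list may be empty, so an empty result list is acceptable.
    return tokens_part.split(), date_part.strip()


if __name__ == "__main__":
    example = ("from mail.example.org by mx.example.net with ESMTP id abc123; "
               "Fri, 21 Dec 2012 18:01:59 +0100")
    print(split_received(example))
```

Splitting at the last semicolon rather than the first keeps the sketch tolerant of tokens that happen to contain a semicolon, at the cost of mishandling a trailing comment that contains one.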
- - trace = [return] - 1*received - - return = "Return-Path:" path CRLF - - path = angle-addr / ([CFWS] "<" [CFWS] ">" [CFWS]) - - received = "Received:" *received-token ";" date-time CRLF - - received-token = word / angle-addr / addr-spec / domain - - A full discussion of the Internet mail use of trace fields is - contained in [RFC5321]. For the purposes of this specification, the - trace fields are strictly informational, and any formal - interpretation of them is outside of the scope of this document. - -3.6.8. Optional Fields - - Fields may appear in messages that are otherwise unspecified in this - document. They MUST conform to the syntax of an optional-field. - This is a field name, made up of the printable US-ASCII characters - except SP and colon, followed by a colon, followed by any text that - conforms to the unstructured syntax. - - The field names of any optional field MUST NOT be identical to any - field name specified elsewhere in this document. - - - - - - - - - - -Resnick Standards Track [Page 30] - -RFC 5322 Internet Message Format October 2008 - - - optional-field = field-name ":" unstructured CRLF - - field-name = 1*ftext - - ftext = %d33-57 / ; Printable US-ASCII - %d59-126 ; characters not including - ; ":". - - For the purposes of this specification, any optional field is - uninterpreted. - -4. Obsolete Syntax - - Earlier versions of this specification allowed for different (usually - more liberal) syntax than is allowed in this version. Also, there - have been syntactic elements used in messages on the Internet whose - interpretations have never been documented. Though these syntactic - forms MUST NOT be generated according to the grammar in section 3, - they MUST be accepted and parsed by a conformant receiver. This - section documents many of these syntactic elements. Taking the - grammar in section 3 and adding the definitions presented in this - section will result in the grammar to use for the interpretation of - messages. 
- - Note: This section identifies syntactic forms that any - implementation MUST reasonably interpret. However, there are - certainly Internet messages that do not conform to even the - additional syntax given in this section. The fact that a - particular form does not appear in any section of this document is - not justification for computer programs to crash or for malformed - data to be irretrievably lost by any implementation. It is up to - the implementation to deal with messages robustly. - - One important difference between the obsolete (interpreting) and the - current (generating) syntax is that in structured header field bodies - (i.e., between the colon and the CRLF of any structured header - field), white space characters, including folding white space, and - comments could be freely inserted between any syntactic tokens. This - allowed many complex forms that have proven difficult for some - implementations to parse. - - Another key difference between the obsolete and the current syntax is - that the rule in section 3.2.2 regarding lines composed entirely of - white space in comments and folding white space does not apply. See - the discussion of folding white space in section 4.2 below. - - Finally, certain characters that were formerly allowed in messages - appear in this section. The NUL character (ASCII value 0) was once - - - -Resnick Standards Track [Page 31] - -RFC 5322 Internet Message Format October 2008 - - - allowed, but is no longer for compatibility reasons. Similarly, US- - ASCII control characters other than CR, LF, SP, and HTAB (ASCII - values 1 through 8, 11, 12, 14 through 31, and 127) were allowed to - appear in header field bodies. CR and LF were allowed to appear in - messages other than as CRLF; this use is also shown here. - - Other differences in syntax and semantics are noted in the following - sections. - -4.1. 
Miscellaneous Obsolete Tokens - - These syntactic elements are used elsewhere in the obsolete syntax or - in the main syntax. Bare CR, bare LF, and NUL are added to obs-qp, - obs-body, and obs-unstruct. US-ASCII control characters are added to - obs-qp, obs-unstruct, obs-ctext, and obs-qtext. The period character - is added to obs-phrase. The obs-phrase-list provides for a - (potentially empty) comma-separated list of phrases that may include - "null" elements. That is, there could be two or more commas in such - a list with nothing in between them, or commas at the beginning or - end of the list. - - Note: The "period" (or "full stop") character (".") in obs-phrase - is not a form that was allowed in earlier versions of this or any - other specification. Period (nor any other character from - specials) was not allowed in phrase because it introduced a - parsing difficulty distinguishing between phrases and portions of - an addr-spec (see section 4.4). It appears here because the - period character is currently used in many messages in the - display-name portion of addresses, especially for initials in - names, and therefore must be interpreted properly. - - obs-NO-WS-CTL = %d1-8 / ; US-ASCII control - %d11 / ; characters that do not - %d12 / ; include the carriage - %d14-31 / ; return, line feed, and - %d127 ; white space characters - - obs-ctext = obs-NO-WS-CTL - - obs-qtext = obs-NO-WS-CTL - - obs-utext = %d0 / obs-NO-WS-CTL / VCHAR - - obs-qp = "\" (%d0 / obs-NO-WS-CTL / LF / CR) - - obs-body = *((*LF *CR *((%d0 / text) *LF *CR)) / CRLF) - - obs-unstruct = *((*LF *CR *(obs-utext *LF *CR)) / FWS) - - - -Resnick Standards Track [Page 32] - -RFC 5322 Internet Message Format October 2008 - - - obs-phrase = word *(word / "." / CFWS) - - obs-phrase-list = [phrase / CFWS] *("," [phrase / CFWS]) - - Bare CR and bare LF appear in messages with two different meanings. - In many cases, bare CR or bare LF are used improperly instead of CRLF - to indicate line separators. 
In other cases, bare CR and bare LF are - used simply as US-ASCII control characters with their traditional - ASCII meanings. - -4.2. Obsolete Folding White Space - - In the obsolete syntax, any amount of folding white space MAY be - inserted where the obs-FWS rule is allowed. This creates the - possibility of having two consecutive "folds" in a line, and - therefore the possibility that a line which makes up a folded header - field could be composed entirely of white space. - - obs-FWS = 1*WSP *(CRLF 1*WSP) - -4.3. Obsolete Date and Time - - The syntax for the obsolete date format allows a 2 digit year in the - date field and allows for a list of alphabetic time zone specifiers - that were used in earlier versions of this specification. It also - permits comments and folding white space between many of the tokens. - - obs-day-of-week = [CFWS] day-name [CFWS] - - obs-day = [CFWS] 1*2DIGIT [CFWS] - - obs-year = [CFWS] 2*DIGIT [CFWS] - - obs-hour = [CFWS] 2DIGIT [CFWS] - - obs-minute = [CFWS] 2DIGIT [CFWS] - - obs-second = [CFWS] 2DIGIT [CFWS] - - obs-zone = "UT" / "GMT" / ; Universal Time - ; North American UT - ; offsets - "EST" / "EDT" / ; Eastern: - 5/ - 4 - "CST" / "CDT" / ; Central: - 6/ - 5 - "MST" / "MDT" / ; Mountain: - 7/ - 6 - "PST" / "PDT" / ; Pacific: - 8/ - 7 - ; - - - - -Resnick Standards Track [Page 33] - -RFC 5322 Internet Message Format October 2008 - - - %d65-73 / ; Military zones - "A" - %d75-90 / ; through "I" and "K" - %d97-105 / ; through "Z", both - %d107-122 ; upper and lower case - - Where a two or three digit year occurs in a date, the year is to be - interpreted as follows: If a two digit year is encountered whose - value is between 00 and 49, the year is interpreted by adding 2000, - ending up with a value between 2000 and 2049. If a two digit year is - encountered with a value between 50 and 99, or any three digit year - is encountered, the year is interpreted by adding 1900. 
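The year rule above can be sketched in Python as follows. The function name `interpret_obs_year` is an assumption made for illustration; the input is taken to be the bare digit string of the year with any surrounding CFWS already removed. Years of four or more digits are returned unchanged, since the rule only covers two and three digit years.

```python
def interpret_obs_year(digits: str) -> int:
    """Map an obs-year digit string to a full year.

    Two digits 00-49 -> add 2000; two digits 50-99 -> add 1900;
    any three digit year -> add 1900; longer years are taken literally.
    """
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("obs-year needs at least two ASCII digits")
    value = int(digits)
    if len(digits) == 2:
        return value + 2000 if value <= 49 else value + 1900
    if len(digits) == 3:
        return value + 1900
    return value


if __name__ == "__main__":
    for sample in ("12", "49", "50", "99", "100", "2008"):
        print(sample, "->", interpret_obs_year(sample))
```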
- - In the obsolete time zone, "UT" and "GMT" are indications of - "Universal Time" and "Greenwich Mean Time", respectively, and are - both semantically identical to "+0000". - - The remaining three character zones are the US time zones. The first - letter, "E", "C", "M", or "P" stands for "Eastern", "Central", - "Mountain", and "Pacific". The second letter is either "S" for - "Standard" time, or "D" for "Daylight Savings" (or summer) time. - Their interpretations are as follows: - - EDT is semantically equivalent to -0400 - EST is semantically equivalent to -0500 - CDT is semantically equivalent to -0500 - CST is semantically equivalent to -0600 - MDT is semantically equivalent to -0600 - MST is semantically equivalent to -0700 - PDT is semantically equivalent to -0700 - PST is semantically equivalent to -0800 - - The 1 character military time zones were defined in a non-standard - way in [RFC0822] and are therefore unpredictable in their meaning. - The original definitions of the military zones "A" through "I" are - equivalent to "+0100" through "+0900", respectively; "K", "L", and - "M" are equivalent to "+1000", "+1100", and "+1200", respectively; - "N" through "Y" are equivalent to "-0100" through "-1200". - respectively; and "Z" is equivalent to "+0000". However, because of - the error in [RFC0822], they SHOULD all be considered equivalent to - "-0000" unless there is out-of-band information confirming their - meaning. - - Other multi-character (usually between 3 and 5) alphabetic time zones - have been used in Internet messages. Any such time zone whose - meaning is not known SHOULD be considered equivalent to "-0000" - unless there is out-of-band information confirming their meaning. - - - - - -Resnick Standards Track [Page 34] - -RFC 5322 Internet Message Format October 2008 - - -4.4. Obsolete Addressing - - There are four primary differences in addressing. 
First, mailbox - addresses were allowed to have a route portion before the addr-spec - when enclosed in "<" and ">". The route is simply a comma-separated - list of domain names, each preceded by "@", and the list terminated - by a colon. Second, CFWS were allowed between the period-separated - elements of local-part and domain (i.e., dot-atom was not used). In - addition, local-part is allowed to contain quoted-string in addition - to just atom. Third, mailbox-list and address-list were allowed to - have "null" members. That is, there could be two or more commas in - such a list with nothing in between them, or commas at the beginning - or end of the list. Finally, US-ASCII control characters and quoted- - pairs were allowed in domain literals and are added here. - - obs-angle-addr = [CFWS] "<" obs-route addr-spec ">" [CFWS] - - obs-route = obs-domain-list ":" - - obs-domain-list = *(CFWS / ",") "@" domain - *("," [CFWS] ["@" domain]) - - obs-mbox-list = *([CFWS] ",") mailbox *("," [mailbox / CFWS]) - - obs-addr-list = *([CFWS] ",") address *("," [address / CFWS]) - - obs-group-list = 1*([CFWS] ",") [CFWS] - - obs-local-part = word *("." word) - - obs-domain = atom *("." atom) - - obs-dtext = obs-NO-WS-CTL / quoted-pair - - When interpreting addresses, the route portion SHOULD be ignored. - -4.5. Obsolete Header Fields - - Syntactically, the primary difference in the obsolete field syntax is - that it allows multiple occurrences of any of the fields and they may - occur in any order. Also, any amount of white space is allowed - before the ":" at the end of the field name. 
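A receiver that wants to accept the white space allowed before the ":" might split a header line as in the Python sketch below. The names `split_obs_field` and `_OBS_FIELD` are hypothetical, the field-name character class follows the ftext rule from section 3.6.8 (printable US-ASCII except ":"), and the line is assumed to be already unfolded. The field body is returned as raw text with surrounding white space and the trailing CRLF stripped; no further parsing is attempted.

```python
import re

# field-name = 1*ftext, where ftext is %d33-57 / %d59-126.
# The obsolete syntax additionally allows *WSP between the name and ":".
_OBS_FIELD = re.compile(r"^([\x21-\x39\x3b-\x7e]+)[ \t]*:(.*)$", re.DOTALL)


def split_obs_field(line: str):
    """Return (field_name, body) for one unfolded header line."""
    match = _OBS_FIELD.match(line)
    if match is None:
        raise ValueError("line does not look like a header field")
    name, body = match.groups()
    return name, body.strip()


if __name__ == "__main__":
    print(split_obs_field("Subject \t: Moving the RFCs\r\n"))
    print(split_obs_field("From:a@example.org\r\n"))
```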
- - - - - - - - - -Resnick Standards Track [Page 35] - -RFC 5322 Internet Message Format October 2008 - - - obs-fields = *(obs-return / - obs-received / - obs-orig-date / - obs-from / - obs-sender / - obs-reply-to / - obs-to / - obs-cc / - obs-bcc / - obs-message-id / - obs-in-reply-to / - obs-references / - obs-subject / - obs-comments / - obs-keywords / - obs-resent-date / - obs-resent-from / - obs-resent-send / - obs-resent-rply / - obs-resent-to / - obs-resent-cc / - obs-resent-bcc / - obs-resent-mid / - obs-optional) - - Except for destination address fields (described in section 4.5.3), - the interpretation of multiple occurrences of fields is unspecified. - Also, the interpretation of trace fields and resent fields that do - not occur in blocks prepended to the message is unspecified as well. - Unless otherwise noted in the following sections, interpretation of - other fields is identical to the interpretation of their non-obsolete - counterparts in section 3. - -4.5.1. Obsolete Origination Date Field - - obs-orig-date = "Date" *WSP ":" date-time CRLF - -4.5.2. Obsolete Originator Fields - - obs-from = "From" *WSP ":" mailbox-list CRLF - - obs-sender = "Sender" *WSP ":" mailbox CRLF - - obs-reply-to = "Reply-To" *WSP ":" address-list CRLF - - - - - - - -Resnick Standards Track [Page 36] - -RFC 5322 Internet Message Format October 2008 - - -4.5.3. Obsolete Destination Address Fields - - obs-to = "To" *WSP ":" address-list CRLF - - obs-cc = "Cc" *WSP ":" address-list CRLF - - obs-bcc = "Bcc" *WSP ":" - (address-list / (*([CFWS] ",") [CFWS])) CRLF - - When multiple occurrences of destination address fields occur in a - message, they SHOULD be treated as if the address list in the first - occurrence of the field is combined with the address lists of the - subsequent occurrences by adding a comma and concatenating. - -4.5.4. 
Obsolete Identification Fields - - The obsolete "In-Reply-To:" and "References:" fields differ from the - current syntax in that they allow phrase (words or quoted strings) to - appear. The obsolete forms of the left and right sides of msg-id - allow interspersed CFWS, making them syntactically identical to - local-part and domain, respectively. - - obs-message-id = "Message-ID" *WSP ":" msg-id CRLF - - obs-in-reply-to = "In-Reply-To" *WSP ":" *(phrase / msg-id) CRLF - - obs-references = "References" *WSP ":" *(phrase / msg-id) CRLF - - obs-id-left = local-part - - obs-id-right = domain - - For purposes of interpretation, the phrases in the "In-Reply-To:" and - "References:" fields are ignored. - - Semantically, none of the optional CFWS in the local-part and the - domain is part of the obs-id-left and obs-id-right, respectively. - -4.5.5. Obsolete Informational Fields - - obs-subject = "Subject" *WSP ":" unstructured CRLF - - obs-comments = "Comments" *WSP ":" unstructured CRLF - - obs-keywords = "Keywords" *WSP ":" obs-phrase-list CRLF - - - - - - -Resnick Standards Track [Page 37] - -RFC 5322 Internet Message Format October 2008 - - -4.5.6. Obsolete Resent Fields - - The obsolete syntax adds a "Resent-Reply-To:" field, which consists - of the field name, the optional comments and folding white space, the - colon, and a comma separated list of addresses. 
- - obs-resent-from = "Resent-From" *WSP ":" mailbox-list CRLF - - obs-resent-send = "Resent-Sender" *WSP ":" mailbox CRLF - - obs-resent-date = "Resent-Date" *WSP ":" date-time CRLF - - obs-resent-to = "Resent-To" *WSP ":" address-list CRLF - - obs-resent-cc = "Resent-Cc" *WSP ":" address-list CRLF - - obs-resent-bcc = "Resent-Bcc" *WSP ":" - (address-list / (*([CFWS] ",") [CFWS])) CRLF - - obs-resent-mid = "Resent-Message-ID" *WSP ":" msg-id CRLF - - obs-resent-rply = "Resent-Reply-To" *WSP ":" address-list CRLF - - As with other resent fields, the "Resent-Reply-To:" field is to be - treated as trace information only. - -4.5.7. Obsolete Trace Fields - - The obs-return and obs-received are again given here as template - definitions, just as return and received are in section 3. Their - full syntax is given in [RFC5321]. - - obs-return = "Return-Path" *WSP ":" path CRLF - - obs-received = "Received" *WSP ":" *received-token CRLF - -4.5.8. Obsolete optional fields - - obs-optional = field-name *WSP ":" unstructured CRLF - -5. Security Considerations - - Care needs to be taken when displaying messages on a terminal or - terminal emulator. Powerful terminals may act on escape sequences - and other combinations of US-ASCII control characters with a variety - of consequences. They can remap the keyboard or permit other - modifications to the terminal that could lead to denial of service or - even damaged data. They can trigger (sometimes programmable) - - - -Resnick Standards Track [Page 38] - -RFC 5322 Internet Message Format October 2008 - - - answerback messages that can allow a message to cause commands to be - issued on the recipient's behalf. They can also affect the operation - of terminal attached devices such as printers. Message viewers may - wish to strip potentially dangerous terminal escape sequences from - the message prior to display. However, other escape sequences appear - in messages for useful purposes (cf. 
[ISO.2022.1994], [RFC2045], - [RFC2046], [RFC2047], [RFC2049], [RFC4288], [RFC4289]) and therefore - should not be stripped indiscriminately. - - Transmission of non-text objects in messages raises additional - security issues. These issues are discussed in [RFC2045], [RFC2046], - [RFC2047], [RFC2049], [RFC4288], and [RFC4289]. - - Many implementations use the "Bcc:" (blind carbon copy) field, - described in section 3.6.3, to facilitate sending messages to - recipients without revealing the addresses of one or more of the - addressees to the other recipients. Mishandling this use of "Bcc:" - may disclose confidential information that could eventually lead to - security problems through knowledge of even the existence of a - particular mail address. For example, if using the first method - described in section 3.6.3, where the "Bcc:" line is removed from the - message, blind recipients have no explicit indication that they have - been sent a blind copy, except insofar as their address does not - appear in the header section of a message. Because of this, one of - the blind addressees could potentially send a reply to all of the - shown recipients and accidentally reveal that the message went to the - blind recipient. When the second method from section 3.6.3 is used, - the blind recipient's address appears in the "Bcc:" field of a - separate copy of the message. If the "Bcc:" field sent contains all - of the blind addressees, all of the "Bcc:" recipients will be seen by - each "Bcc:" recipient. Even if a separate message is sent to each - "Bcc:" recipient with only the individual's address, implementations - still need to be careful to process replies to the message as per - section 3.6.3 so as not to accidentally reveal the blind recipient to - other recipients. - -6. IANA Considerations - - This document updates the registrations that appeared in [RFC4021] - that referred to the definitions in [RFC2822]. 
IANA has updated the - Permanent Message Header Field Repository with the following header - fields, in accordance with the procedures set out in [RFC3864]. - - Header field name: Date - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.1) - - - -Resnick Standards Track [Page 39] - -RFC 5322 Internet Message Format October 2008 - - - Header field name: From - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.2) - - Header field name: Sender - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.2) - - Header field name: Reply-To - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.2) - - Header field name: To - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.3) - - Header field name: Cc - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.3) - - Header field name: Bcc - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.3) - - Header field name: Message-ID - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.4) - - Header field name: In-Reply-To - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.4) - - - - -Resnick Standards Track [Page 40] - -RFC 5322 Internet Message Format October 2008 - - - Header field name: References - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification 
document(s): This document (section 3.6.4) - - Header field name: Subject - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.5) - - Header field name: Comments - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.5) - - Header field name: Keywords - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.5) - - Header field name: Resent-Date - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.6) - - Header field name: Resent-From - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.6) - - Header field name: Resent-Sender - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.6) - - Header field name: Resent-To - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.6) - - - - -Resnick Standards Track [Page 41] - -RFC 5322 Internet Message Format October 2008 - - - Header field name: Resent-Cc - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.6) - - Header field name: Resent-Bcc - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.6) - - Header field name: Resent-Reply-To - Applicable protocol: Mail - Status: obsolete - Author/Change controller: IETF - Specification document(s): This document (section 4.5.6) - - Header field name: Resent-Message-ID - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - 
Specification document(s): This document (section 3.6.6) - - Header field name: Return-Path - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.7) - - Header field name: Received - Applicable protocol: Mail - Status: standard - Author/Change controller: IETF - Specification document(s): This document (section 3.6.7) - Related information: [RFC5321] - - - - - - - - - - - - - - - -Resnick Standards Track [Page 42] - -RFC 5322 Internet Message Format October 2008 - - -Appendix A. Example Messages - - This section presents a selection of messages. These are intended to - assist in the implementation of this specification, but should not be - taken as normative; that is to say, although the examples in this - section were carefully reviewed, if there happens to be a conflict - between these examples and the syntax described in sections 3 and 4 - of this document, the syntax in those sections is to be taken as - correct. - - In the text version of this document, messages in this section are - delimited between lines of "----". The "----" lines are not part of - the message itself. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 43] - -RFC 5322 Internet Message Format October 2008 - - -Appendix A.1. Addressing Examples - - The following are examples of messages that might be sent between two - individuals. - -Appendix A.1.1. A Message from One Person to Another with Simple - Addressing - - This could be called a canonical message. It has a single author, - John Doe, a single recipient, Mary Smith, a subject, the date, a - message identifier, and a textual message in the body. - - ---- - From: John Doe <jdoe@machine.example> - To: Mary Smith <mary@example.net> - Subject: Saying Hello - Date: Fri, 21 Nov 1997 09:55:06 -0600 - Message-ID: <1234@local.machine.example> - - This is a message just to say hello. - So, "Hello". 
- ---- - - If John's secretary Michael actually sent the message, even though - John was the author and replies to this message should go back to - him, the sender field would be used: - - ---- - From: John Doe <jdoe@machine.example> - Sender: Michael Jones <mjones@machine.example> - To: Mary Smith <mary@example.net> - Subject: Saying Hello - Date: Fri, 21 Nov 1997 09:55:06 -0600 - Message-ID: <1234@local.machine.example> - - This is a message just to say hello. - So, "Hello". - ---- - - - - - - - - - - - - - -Resnick Standards Track [Page 44] - -RFC 5322 Internet Message Format October 2008 - - -Appendix A.1.2. Different Types of Mailboxes - - This message includes multiple addresses in the destination fields - and also uses several different forms of addresses. - - ---- - From: "Joe Q. Public" <john.q.public@example.com> - To: Mary Smith <mary@x.test>, jdoe@example.org, Who? <one@y.test> - Cc: <boss@nil.test>, "Giant; \"Big\" Box" <sysservices@example.net> - Date: Tue, 1 Jul 2003 10:52:37 +0200 - Message-ID: <5678.21-Nov-1997@example.com> - - Hi everyone. - ---- - - Note that the display names for Joe Q. Public and Giant; "Big" Box - needed to be enclosed in double-quotes because the former contains - the period and the latter contains both semicolon and double-quote - characters (the double-quote characters appearing as quoted-pair - constructs). Conversely, the display name for Who? could appear - without them because the question mark is legal in an atom. Notice - also that jdoe@example.org and boss@nil.test have no display names - associated with them at all, and jdoe@example.org uses the simpler - address form without the angle brackets. - -Appendix A.1.3. Group Addresses - - ---- - From: Pete <pete@silly.example> - To: A Group:Ed Jones <c@a.test>,joe@where.test,John <jdoe@one.test>; - Cc: Undisclosed recipients:; - Date: Thu, 13 Feb 1969 23:32:54 -0330 - Message-ID: <testabcd.1234@silly.example> - - Testing. 
- ---- - - In this message, the "To:" field has a single group recipient named - "A Group", which contains 3 addresses, and a "Cc:" field with an - empty group recipient named Undisclosed recipients. - - - - - - - - - - - -Resnick Standards Track [Page 45] - -RFC 5322 Internet Message Format October 2008 - - -Appendix A.2. Reply Messages - - The following is a series of three messages that make up a - conversation thread between John and Mary. John first sends a - message to Mary, Mary then replies to John's message, and then John - replies to Mary's reply message. - - Note especially the "Message-ID:", "References:", and "In-Reply-To:" - fields in each message. - - ---- - From: John Doe <jdoe@machine.example> - To: Mary Smith <mary@example.net> - Subject: Saying Hello - Date: Fri, 21 Nov 1997 09:55:06 -0600 - Message-ID: <1234@local.machine.example> - - This is a message just to say hello. - So, "Hello". - ---- - - When sending replies, the Subject field is often retained, though - prepended with "Re: " as described in section 3.6.5. - - ---- - From: Mary Smith <mary@example.net> - To: John Doe <jdoe@machine.example> - Reply-To: "Mary Smith: Personal Account" <smith@home.example> - Subject: Re: Saying Hello - Date: Fri, 21 Nov 1997 10:01:10 -0600 - Message-ID: <3456@example.net> - In-Reply-To: <1234@local.machine.example> - References: <1234@local.machine.example> - - This is a reply to your hello. - ---- - - Note the "Reply-To:" field in the above message. When John replies - to Mary's message above, the reply should go to the address in the - "Reply-To:" field instead of the address in the "From:" field. 
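The reply-address choice described above can be sketched in a few lines. This is a minimal Python illustration using the standard library `email` package, not something taken from the RFC; the function name `reply_targets` is hypothetical. It assumes the simple policy of preferring "Reply-To:" and falling back to "From:" when no "Reply-To:" field is present.

```python
from email import message_from_string
from email.utils import getaddresses


def reply_targets(raw_message):
    """Return the addresses a reply should go to.

    Hypothetical helper: prefer the "Reply-To:" field, and fall back
    to "From:" only when "Reply-To:" is absent.
    """
    msg = message_from_string(raw_message)
    fields = msg.get_all("Reply-To") or msg.get_all("From") or []
    # getaddresses() yields (display-name, addr-spec) pairs.
    return [addr for _name, addr in getaddresses(fields)]


if __name__ == "__main__":
    mary = (
        "From: Mary Smith <mary@example.net>\n"
        "To: John Doe <jdoe@machine.example>\n"
        'Reply-To: "Mary Smith: Personal Account" <smith@home.example>\n'
        "Subject: Re: Saying Hello\n"
        "\n"
        "This is a reply to your hello.\n"
    )
    print(reply_targets(mary))
```

Applied to Mary's message, the sketch selects smith@home.example, which matches the "To:" field of John's reply in the next example.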
- - - - - - - - - - - -Resnick Standards Track [Page 46] - -RFC 5322 Internet Message Format October 2008 - - - ---- - To: "Mary Smith: Personal Account" <smith@home.example> - From: John Doe <jdoe@machine.example> - Subject: Re: Saying Hello - Date: Fri, 21 Nov 1997 11:00:00 -0600 - Message-ID: <abcd.1234@local.machine.test> - In-Reply-To: <3456@example.net> - References: <1234@local.machine.example> <3456@example.net> - - This is a reply to your reply. - ---- - -Appendix A.3. Resent Messages - - Start with the message that has been used as an example several - times: - - ---- - From: John Doe <jdoe@machine.example> - To: Mary Smith <mary@example.net> - Subject: Saying Hello - Date: Fri, 21 Nov 1997 09:55:06 -0600 - Message-ID: <1234@local.machine.example> - - This is a message just to say hello. - So, "Hello". - ---- - - Say that Mary, upon receiving this message, wishes to send a copy of - the message to Jane such that (a) the message would appear to have - come straight from John; (b) if Jane replies to the message, the - reply should go back to John; and (c) all of the original - information, like the date the message was originally sent to Mary, - the message identifier, and the original addressee, is preserved. In - this case, resent fields are prepended to the message: - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 47] - -RFC 5322 Internet Message Format October 2008 - - - ---- - Resent-From: Mary Smith <mary@example.net> - Resent-To: Jane Brown <j-brown@other.example> - Resent-Date: Mon, 24 Nov 1997 14:22:01 -0800 - Resent-Message-ID: <78910@example.net> - From: John Doe <jdoe@machine.example> - To: Mary Smith <mary@example.net> - Subject: Saying Hello - Date: Fri, 21 Nov 1997 09:55:06 -0600 - Message-ID: <1234@local.machine.example> - - This is a message just to say hello. - So, "Hello". 
- ---- - - If Jane, in turn, wished to resend this message to another person, - she would prepend her own set of resent header fields to the above - and send that. (Note that for brevity, trace fields are not shown.) - -Appendix A.4. Messages with Trace Fields - - As messages are sent through the transport system as described in - [RFC5321], trace fields are prepended to the message. The following - is an example of what those trace fields might look like. Note that - there is some folding white space in the first one since these lines - can be long. - - ---- - Received: from x.y.test - by example.net - via TCP - with ESMTP - id ABC12345 - for <mary@example.net>; 21 Nov 1997 10:05:43 -0600 - Received: from node.example by x.y.test; 21 Nov 1997 10:01:22 -0600 - From: John Doe <jdoe@node.example> - To: Mary Smith <mary@example.net> - Subject: Saying Hello - Date: Fri, 21 Nov 1997 09:55:06 -0600 - Message-ID: <1234@local.node.example> - - This is a message just to say hello. - So, "Hello". - ---- - - - - - - - -Resnick Standards Track [Page 48] - -RFC 5322 Internet Message Format October 2008 - - -Appendix A.5. White Space, Comments, and Other Oddities - - White space, including folding white space, and comments can be - inserted between many of the tokens of fields. Taking the example - from A.1.3, white space and comments can be inserted into all of the - fields. - - ---- - From: Pete(A nice \) chap) <pete(his account)@silly.test(his host)> - To:A Group(Some people) - :Chris Jones <c@(Chris's host.)public.example>, - joe@example.org, - John <jdoe@one.test> (my dear friend); (the end of the group) - Cc:(Empty list)(start)Hidden recipients :(nobody(that I know)) ; - Date: Thu, - 13 - Feb - 1969 - 23:32 - -0330 (Newfoundland Time) - Message-ID: <testabcd.1234@silly.test> - - Testing. - ---- - - The above example is aesthetically displeasing, but perfectly legal. 
- Note particularly (1) the comments in the "From:" field (including - one that has a ")" character appearing as part of a quoted-pair); (2) - the white space absent after the ":" in the "To:" field as well as - the comment and folding white space after the group name, the special - character (".") in the comment in Chris Jones's address, and the - folding white space before and after "joe@example.org,"; (3) the - multiple and nested comments in the "Cc:" field as well as the - comment immediately following the ":" after "Cc"; (4) the folding - white space (but no comments except at the end) and the missing - seconds in the time of the date field; and (5) the white space before - (but not within) the identifier in the "Message-ID:" field. - - - - - - - - - - - - - - -Resnick Standards Track [Page 49] - -RFC 5322 Internet Message Format October 2008 - - -Appendix A.6. Obsoleted Forms - - The following are examples of obsolete (that is, the "MUST NOT - generate") syntactic elements described in section 4 of this - document. - -Appendix A.6.1. Obsolete Addressing - - Note in the example below the lack of quotes around Joe Q. Public, - the route that appears in the address for Mary Smith, the two commas - that appear in the "To:" field, and the spaces that appear around the - "." in the jdoe address. - - ---- - From: Joe Q. Public <john.q.public@example.com> - To: Mary Smith <@node.test:mary@example.net>, , jdoe@test . example - Date: Tue, 1 Jul 2003 10:52:37 +0200 - Message-ID: <5678.21-Nov-1997@example.com> - - Hi everyone. - ---- - -Appendix A.6.2. Obsolete Dates - - The following message uses an obsolete date format, including a non- - numeric time zone and a two digit year. Note that although the day- - of-week is missing, that is not specific to the obsolete syntax; it - is optional in the current syntax as well. 
- - ---- - From: John Doe <jdoe@machine.example> - To: Mary Smith <mary@example.net> - Subject: Saying Hello - Date: 21 Nov 97 09:55:06 GMT - Message-ID: <1234@local.machine.example> - - This is a message just to say hello. - So, "Hello". - ---- - - - - - - - - - - - - -Resnick Standards Track [Page 50] - -RFC 5322 Internet Message Format October 2008 - - -Appendix A.6.3. Obsolete White Space and Comments - - White space and comments can appear between many more elements than - in the current syntax. Also, folding lines that are made up entirely - of white space are legal. - - ---- - From : John Doe <jdoe@machine(comment). example> - To : Mary Smith - __ - <mary@example.net> - Subject : Saying Hello - Date : Fri, 21 Nov 1997 09(comment): 55 : 06 -0600 - Message-ID : <1234 @ local(blah) .machine .example> - - This is a message just to say hello. - So, "Hello". - ---- - - Note especially the second line of the "To:" field. It starts with - two space characters. (Note that "__" represent blank spaces.) - Therefore, it is considered part of the folding, as described in - section 4.2. Also, the comments and white space throughout - addresses, dates, and message identifiers are all part of the - obsolete syntax. - - - - - - - - - - - - - - - - - - - - - - - - - - -Resnick Standards Track [Page 51] - -RFC 5322 Internet Message Format October 2008 - - -Appendix B. Differences from Earlier Specifications - - This appendix contains a list of changes that have been made in the - Internet Message Format from earlier specifications, specifically - [RFC0822], [RFC1123], and [RFC2822]. Items marked with an asterisk - (*) below are items which appear in section 4 of this document and - therefore can no longer be generated. - - The following are the changes made from [RFC0822] and [RFC1123] to - [RFC2822] that remain in this document: - - 1. Period allowed in obsolete form of phrase. - 2. ABNF moved out of document, now in [RFC5234]. - 3. Four or more digits allowed for year. - 4. 
Header field ordering (and lack thereof) made explicit. - 5. Encrypted header field removed. - 6. Specifically allow and give meaning to "-0000" time zone. - 7. Folding white space is not allowed between every token. - 8. Requirement for destinations removed. - 9. Forwarding and resending redefined. - 10. Extension header fields no longer specifically called out. - 11. ASCII 0 (null) removed.* - 12. Folding continuation lines cannot contain only white space.* - 13. Free insertion of comments not allowed in date.* - 14. Non-numeric time zones not allowed.* - 15. Two digit years not allowed.* - 16. Three digit years interpreted, but not allowed for generation.* - 17. Routes in addresses not allowed.* - 18. CFWS within local-parts and domains not allowed.* - 19. Empty members of address lists not allowed.* - 20. Folding white space between field name and colon not allowed.* - 21. Comments between field name and colon not allowed. - 22. Tightened syntax of in-reply-to and references.* - 23. CFWS within msg-id not allowed.* - 24. Tightened semantics of resent fields as informational only. - 25. Resent-Reply-To not allowed.* - 26. No multiple occurrences of fields (except resent and received).* - 27. Free CR and LF not allowed.* - 28. Line length limits specified. - 29. Bcc more clearly specified. - - - - - - - - - - - -Resnick Standards Track [Page 52] - -RFC 5322 Internet Message Format October 2008 - - - The following are changes from [RFC2822]. - 1. Assorted typographical/grammatical errors fixed and - clarifications made. - 2. Changed "standard" to "document" or "specification" throughout. - 3. Made distinction between "header field" and "header section". - 4. Removed NO-WS-CTL from ctext, qtext, dtext, and unstructured.* - 5. Moved discussion of specials to the "Atom" section. Moved text - to "Overall message syntax" section. - 6. Simplified CFWS syntax. - 7. Fixed unstructured syntax. - 8. 
Changed date and time syntax to deal with white space in - obsolete date syntax. - 9. Removed quoted-pair from domain literals and message - identifiers.* - 10. Clarified that other specifications limit domain syntax. - 11. Simplified "Bcc:" and "Resent-Bcc:" syntax. - 12. Allowed optional-field to appear within trace information. - 13. Removed no-fold-quote from msg-id. Clarified syntax - limitations. - 14. Generalized "Received:" syntax to fix bugs and move definition - out of this document. - 15. Simplified obs-qp. Fixed and simplified obs-utext (which now - only appears in the obsolete syntax). Removed obs-text and obs- - char, adding obs-body. - 16. Fixed obsolete date syntax to allow for more (or less) comments - and white space. - 17. Fixed all obsolete list syntax (obs-domain-list, obs-mbox-list, - obs-addr-list, obs-phrase-list, and the newly added obs-group- - list). - 18. Fixed obs-reply-to syntax. - 19. Fixed obs-bcc and obs-resent-bcc to allow empty lists. - 20. Removed obs-path. - -Appendix C. Acknowledgements - - Many people contributed to this document. They included folks who - participated in the Detailed Revision and Update of Messaging - Standards (DRUMS) Working Group of the Internet Engineering Task - Force (IETF), the chair of DRUMS, the Area Directors of the IETF, and - people who simply sent their comments in via email. The editor is - deeply indebted to them all and thanks them sincerely. The below - list includes everyone who sent email concerning both this document - and [RFC2822]. 
Hopefully, everyone who contributed is named here: - - +--------------------+----------------------+---------------------+ - | Matti Aarnio | Tanaka Akira | Russ Allbery | - | Eric Allman | Harald Alvestrand | Ran Atkinson | - | Jos Backus | Bruce Balden | Dave Barr | - - - -Resnick Standards Track [Page 53] - -RFC 5322 Internet Message Format October 2008 - - - | Alan Barrett | John Beck | J Robert von Behren | - | Jos den Bekker | D J Bernstein | James Berriman | - | Oliver Block | Norbert Bollow | Raj Bose | - | Antony Bowesman | Scott Bradner | Randy Bush | - | Tom Byrer | Bruce Campbell | Larry Campbell | - | W J Carpenter | Michael Chapman | Richard Clayton | - | Maurizio Codogno | Jim Conklin | R Kelley Cook | - | Nathan Coulter | Steve Coya | Mark Crispin | - | Dave Crocker | Matt Curtin | Michael D'Errico | - | Cyrus Daboo | Michael D Dean | Jutta Degener | - | Mark Delany | Steve Dorner | Harold A Driscoll | - | Michael Elkins | Frank Ellerman | Robert Elz | - | Johnny Eriksson | Erik E Fair | Roger Fajman | - | Patrik Faltstrom | Claus Andre Faerber | Barry Finkel | - | Erik Forsberg | Chuck Foster | Paul Fox | - | Klaus M Frank | Ned Freed | Jochen Friedrich | - | Randall C Gellens | Sukvinder Singh Gill | Tim Goodwin | - | Philip Guenther | Arnt Gulbrandsen | Eric A Hall | - | Tony Hansen | John Hawkinson | Philip Hazel | - | Kai Henningsen | Robert Herriot | Paul Hethmon | - | Jim Hill | Alfred Hoenes | Paul E Hoffman | - | Steve Hole | Kari Hurtta | Marco S Hyman | - | Ofer Inbar | Olle Jarnefors | Kevin Johnson | - | Sudish Joseph | Maynard Kang | Prabhat Keni | - | John C Klensin | Graham Klyne | Brad Knowles | - | Shuhei Kobayashi | Peter Koch | Dan Kohn | - | Christian Kuhtz | Anand Kumria | Steen Larsen | - | Eliot Lear | Barry Leiba | Jay Levitt | - | Bruce Lilly | Lars-Johan Liman | Charles Lindsey | - | Pete Loshin | Simon Lyall | Bill Manning | - | John Martin | Mark Martinec | Larry Masinter | - | Denis McKeon | William P McQuillan | Alexey 
Melnikov | - | Perry E Metzger | Steven Miller | S Moonesamy | - | Keith Moore | John Gardiner Myers | Chris Newman | - | John W Noerenberg | Eric Norman | Mike O'Dell | - | Larry Osterman | Paul Overell | Jacob Palme | - | Michael A Patton | Uzi Paz | Michael A Quinlan | - | Robert Rapplean | Eric S Raymond | Sam Roberts | - | Hugh Sasse | Bart Schaefer | Tom Scola | - | Wolfgang Segmuller | Nick Shelness | John Stanley | - | Einar Stefferud | Jeff Stephenson | Bernard Stern | - | Peter Sylvester | Mark Symons | Eric Thomas | - | Lee Thompson | Karel De Vriendt | Matthew Wall | - | Rolf Weber | Brent B Welch | Dan Wing | - | Jack De Winter | Gregory J Woodhouse | Greg A Woods | - | Kazu Yamamoto | Alain Zahm | Jamie Zawinski | - | Timothy S Zurcher | | | - +--------------------+----------------------+---------------------+ - - - -Resnick Standards Track [Page 54] - -RFC 5322 Internet Message Format October 2008 - - -7. References - -7.1. Normative References - - [ANSI.X3-4.1986] American National Standards Institute, "Coded - Character Set - 7-bit American Standard Code for - Information Interchange", ANSI X3.4, 1986. - - [RFC1034] Mockapetris, P., "Domain names - concepts and - facilities", STD 13, RFC 1034, November 1987. - - [RFC1035] Mockapetris, P., "Domain names - implementation and - specification", STD 13, RFC 1035, November 1987. - - [RFC1123] Braden, R., "Requirements for Internet Hosts - - Application and Support", STD 3, RFC 1123, - October 1989. - - [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for - Syntax Specifications: ABNF", STD 68, RFC 5234, - January 2008. - -7.2. Informative References - - [RFC0822] Crocker, D., "Standard for the format of ARPA - Internet text messages", STD 11, RFC 822, - August 1982. - - [RFC1305] Mills, D., "Network Time Protocol (Version 3) - Specification, Implementation", RFC 1305, - March 1992. 
- - [ISO.2022.1994] International Organization for Standardization, - "Information technology - Character code structure - and extension techniques", ISO Standard 2022, 1994. - - [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet - Mail Extensions (MIME) Part One: Format of Internet - Message Bodies", RFC 2045, November 1996. - - [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet - Mail Extensions (MIME) Part Two: Media Types", - RFC 2046, November 1996. - - - - - -Resnick Standards Track [Page 55] - -RFC 5322 Internet Message Format October 2008 - - - [RFC2047] Moore, K., "MIME (Multipurpose Internet Mail - Extensions) Part Three: Message Header Extensions - for Non-ASCII Text", RFC 2047, November 1996. - - [RFC2049] Freed, N. and N. Borenstein, "Multipurpose Internet - Mail Extensions (MIME) Part Five: Conformance - Criteria and Examples", RFC 2049, November 1996. - - [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, - April 2001. - - [RFC3864] Klyne, G., Nottingham, M., and J. Mogul, - "Registration Procedures for Message Header - Fields", BCP 90, RFC 3864, September 2004. - - [RFC4021] Klyne, G. and J. Palme, "Registration of Mail and - MIME Header Fields", RFC 4021, March 2005. - - [RFC4288] Freed, N. and J. Klensin, "Media Type - Specifications and Registration Procedures", - BCP 13, RFC 4288, December 2005. - - [RFC4289] Freed, N. and J. Klensin, "Multipurpose Internet - Mail Extensions (MIME) Part Four: Registration - Procedures", BCP 13, RFC 4289, December 2005. - - [RFC5321] Klensin, J., "Simple Mail Transfer Protocol", - RFC 5321, October 2008. - -Author's Address - - Peter W. 
Resnick (editor) - Qualcomm Incorporated - 5775 Morehouse Drive - San Diego, CA 92121-1714 - US - - Phone: +1 858 651 4478 - EMail: presnick@qualcomm.com - URI: http://www.qualcomm.com/~presnick/ - - - - - - - - - - - -Resnick Standards Track [Page 56] - -RFC 5322 Internet Message Format October 2008 - - -Full Copyright Statement - - Copyright (C) The IETF Trust (2008). - - This document is subject to the rights, licenses and restrictions - contained in BCP 78, and except as set forth therein, the authors - retain all their rights. - - This document and the information contained herein are provided on an - "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS - OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND - THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS - OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF - THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED - WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Intellectual Property - - The IETF takes no position regarding the validity or scope of any - Intellectual Property Rights or other rights that might be claimed to - pertain to the implementation or use of the technology described in - this document or the extent to which any license under such rights - might or might not be available; nor does it represent that it has - made any independent effort to identify any such rights. Information - on the procedures with respect to rights in RFC documents can be - found in BCP 78 and BCP 79. - - Copies of IPR disclosures made to the IETF Secretariat and any - assurances of licenses to be made available, or the result of an - attempt made to obtain a general license or permission for the use of - such proprietary rights by implementers or users of this - specification can be obtained from the IETF on-line IPR repository at - http://www.ietf.org/ipr. 
- - The IETF invites any interested party to bring to its attention any - copyrights, patents or patent applications, or other proprietary - rights that may cover technology that may be required to implement - this standard. Please address the information to the IETF at - ietf-ipr@ietf.org. - - - - - - - - - - - - -Resnick Standards Track [Page 57] - diff --git a/proto/rfc5804.txt b/proto/rfc5804.txt @@ -1,2747 +0,0 @@ - - - - - - -Internet Engineering Task Force (IETF) A. Melnikov, Ed. -Request for Comments: 5804 Isode Limited -Category: Standards Track T. Martin -ISSN: 2070-1721 BeThereBeSquare, Inc. - July 2010 - - - A Protocol for Remotely Managing Sieve Scripts - -Abstract - - Sieve scripts allow users to filter incoming email. Message stores - are commonly sealed servers so users cannot log into them, yet users - must be able to update their scripts on them. This document - describes a protocol "ManageSieve" for securely managing Sieve - scripts on a remote server. This protocol allows a user to have - multiple scripts, and also alerts a user to syntactically flawed - scripts. - -Status of This Memo - - This is an Internet Standards Track document. - - This document is a product of the Internet Engineering Task Force - (IETF). It represents the consensus of the IETF community. It has - received public review and has been approved for publication by the - Internet Engineering Steering Group (IESG). Further information on - Internet Standards is available in Section 2 of RFC 5741. - - Information about the current status of this document, any errata, - and how to provide feedback on it may be obtained at - http://www.rfc-editor.org/info/rfc5804. - -Copyright Notice - - Copyright (c) 2010 IETF Trust and the persons identified as the - document authors. All rights reserved. 
- - This document is subject to BCP 78 and the IETF Trust's Legal - Provisions Relating to IETF Documents - (http://trustee.ietf.org/license-info) in effect on the date of - publication of this document. Please review these documents - carefully, as they describe your rights and restrictions with respect - to this document. Code Components extracted from this document must - include Simplified BSD License text as described in Section 4.e of - the Trust Legal Provisions and are provided without warranty as - described in the Simplified BSD License. - - - - -Melnikov & Martin Standards Track [Page 1] - -RFC 5804 ManageSieve July 2010 - - -Table of Contents - - 1. Introduction ....................................................3 - 1.1. Commands and Responses .....................................3 - 1.2. Syntax .....................................................3 - 1.3. Response Codes .............................................3 - 1.4. Active Script ..............................................6 - 1.5. Quotas .....................................................6 - 1.6. Script Names ...............................................6 - 1.7. Capabilities ...............................................7 - 1.8. Transport ..................................................9 - 1.9. Conventions Used in This Document .........................10 - 2. Commands .......................................................10 - 2.1. AUTHENTICATE Command ......................................11 - 2.1.1. Use of SASL PLAIN Mechanism over TLS ...............16 - 2.2. STARTTLS Command ..........................................16 - 2.2.1. Server Identity Check ..............................17 - 2.3. LOGOUT Command ............................................20 - 2.4. CAPABILITY Command ........................................20 - 2.5. HAVESPACE Command .........................................20 - 2.6. PUTSCRIPT Command .........................................21 - 2.7. 
LISTSCRIPTS Command .......................................23 - 2.8. SETACTIVE Command .........................................24 - 2.9. GETSCRIPT Command .........................................25 - 2.10. DELETESCRIPT Command .....................................25 - 2.11. RENAMESCRIPT Command .....................................26 - 2.12. CHECKSCRIPT Command ......................................27 - 2.13. NOOP Command .............................................28 - 2.14. Recommended Extensions ...................................28 - 2.14.1. UNAUTHENTICATE Command ............................28 - 3. Sieve URL Scheme ...............................................29 - 4. Formal Syntax ..................................................31 - 5. Security Considerations ........................................37 - 6. IANA Considerations ............................................38 - 6.1. ManageSieve Capability Registration Template ..............39 - 6.2. Registration of Initial ManageSieve Capabilities ..........39 - 6.3. ManageSieve Response Code Registration Template ...........41 - 6.4. Registration of Initial ManageSieve Response Codes ........41 - 7. Internationalization Considerations ............................46 - 8. Acknowledgements ...............................................46 - 9. References .....................................................47 - 9.1. Normative References ......................................47 - 9.2. Informative References ....................................48 - - - - - - - - -Melnikov & Martin Standards Track [Page 2] - -RFC 5804 ManageSieve July 2010 - - -1. Introduction - -1.1. Commands and Responses - - A ManageSieve connection consists of the establishment of a client/ - server network connection, an initial greeting from the server, and - client/server interactions. These client/server interactions consist - of a client command, server data, and a server completion result - response. 
- - All interactions transmitted by client and server are in the form of - lines, that is, strings that end with a CRLF. The protocol receiver - of a ManageSieve client or server is either reading a line or reading - a sequence of octets with a known count followed by a line. - -1.2. Syntax - - ManageSieve is a line-oriented protocol much like [IMAP] or [ACAP], - which runs over TCP. There are three data types: atoms, numbers and - strings. Strings may be quoted or literal. See [ACAP] for detailed - descriptions of these types. - - Each command consists of an atom (the command name) followed by zero - or more strings and numbers terminated by CRLF. - - All client queries are replied to with either an OK, NO, or BYE - response. Each response may be followed by a response code (see - Section 1.3) and by a string consisting of human-readable text in the - local language (as returned by the LANGUAGE capability; see - Section 1.7), encoded in UTF-8 [UTF-8]. The contents of the string - SHOULD be shown to the user, and implementations MUST NOT attempt to - parse the message for meaning. - - The BYE response SHOULD be used if the server wishes to close the - connection. A server may wish to do this because the client was idle - for too long or there were too many failed authentication attempts. - This response can be issued at any time and should be immediately - followed by a server hang-up of the connection. If a server has an - inactivity timeout resulting in client autologout, it MUST be no less - than 30 minutes after successful authentication. The inactivity - timeout MAY be less before authentication. - -1.3. Response Codes - - An OK, NO, or BYE response from the server MAY contain a response - code to describe the event in a more detailed machine-parsable - fashion. A response code consists of data inside parentheses in the - form of an atom, possibly followed by a space and arguments. 
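As a rough illustration of the response shape described above (a status, an optional parenthesized response code, and optional human-readable text), the following Python sketch splits one completion line into its parts. The function name `parse_response` and the returned dictionary layout are hypothetical, chosen only for this example. It assumes the text is sent as a quoted string and does not handle literals; the text is returned with any backslash escapes left in place.

```python
import re

# Status atom, optional "(CODE args)", optional quoted human-readable text.
# Assumption: the text is a quoted string; literal strings ({n}) are not handled.
_RESPONSE = re.compile(
    r'^(?P<status>OK|NO|BYE)'
    r'(?:\s+\((?P<code>[^\s()]+)(?:\s+(?P<args>.*?))?\))?'
    r'(?:\s+"(?P<text>(?:[^"\\]|\\.)*)")?\s*$',
    re.IGNORECASE,
)


def parse_response(line):
    """Split a completion response line into status, code, args and text."""
    m = _RESPONSE.match(line)
    if m is None:
        raise ValueError("not a completion response: %r" % (line,))
    return {
        "status": m.group("status").upper(),  # "ok" and "OK" are treated alike
        "code": m.group("code"),              # e.g. "QUOTA/MAXSCRIPTS", or None
        "args": m.group("args"),              # raw argument text, or None
        "text": m.group("text"),              # shown to the user, never parsed
    }


if __name__ == "__main__":
    print(parse_response('NO (QUOTA/MAXSCRIPTS) "Too many scripts"'))
```

The text field is kept opaque on purpose, since the message is meant for display only.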
- - - -Melnikov & Martin Standards Track [Page 3] - -RFC 5804 ManageSieve July 2010 - - - Response codes are defined when there is a specific action that a - client can take based upon the additional information. In order to - support future extension, the response code is represented as a - slash-separated (Solidus, %x2F) hierarchy with each level of - hierarchy representing increasing detail about the error. Response - codes MUST NOT start with the Solidus character. Clients MUST - tolerate additional hierarchical response code detail that they don't - understand. For example, if the client supports the "QUOTA" response - code, but doesn't understand the "QUOTA/MAXSCRIPTS" response code, it - should treat "QUOTA/MAXSCRIPTS" as "QUOTA". - - Client implementations MUST tolerate (ignore) response codes that - they do not recognize. - - The currently defined response codes are the following: - - AUTH-TOO-WEAK - - This response code is returned in the NO or BYE response from an - AUTHENTICATE command. It indicates that site security policy forbids - the use of the requested mechanism for the specified authentication - identity. - - ENCRYPT-NEEDED - - This response code is returned in the NO or BYE response from an - AUTHENTICATE command. It indicates that site security policy - requires the use of a strong encryption mechanism for the specified - authentication identity and mechanism. - - QUOTA - - If this response code is returned in the NO/BYE response, it means - that the command would have placed the user above the site-defined - quota constraints. If this response code is returned in the OK - response, it can mean that the user's storage is near its quota, or - it can mean that the account exceeded its quota but that the - condition is being allowed by the server (the server supports - so-called soft quotas). 
The QUOTA response code has two more - detailed variants: "QUOTA/MAXSCRIPTS" (the maximum number of per-user - scripts) and "QUOTA/MAXSIZE" (the maximum script size). - - REFERRAL - - This response code may be returned with a BYE result from any - command, and includes a mandatory parameter that indicates what - server to access to manage this user's Sieve scripts. The server - will be specified by a Sieve URL (see Section 3). The scriptname - - - -Melnikov & Martin Standards Track [Page 4] - -RFC 5804 ManageSieve July 2010 - - - portion of the URL MUST NOT be specified. The client should - authenticate to the specified server and use it for all further - commands in the current session. - - SASL - - This response code can occur in the OK response to a successful - AUTHENTICATE command and includes the optional final server response - data from the server as specified by [SASL]. - - TRANSITION-NEEDED - - This response code occurs in a NO response of an AUTHENTICATE - command. It indicates that the user name is valid, but the entry in - the authentication database needs to be updated in order to permit - authentication with the specified mechanism. This is typically done - by establishing a secure channel using TLS, verifying server identity - as specified in Section 2.2.1, and finally authenticating once using - the [PLAIN] authentication mechanism. The selected mechanism SHOULD - then work for authentications in subsequent sessions. - - This condition can happen if a user has an entry in a system - authentication database such as Unix /etc/passwd, but does not have - credentials suitable for use by the specified mechanism. - - TRYLATER - - A command failed due to a temporary server failure. The client MAY - continue using local information and try the command later. This - response code only makes sense when returned in a NO/BYE response. 
- - ACTIVE - - A command failed because it is not allowed on the active script, for - example, DELETESCRIPT on the active script. This response code only - makes sense when returned in a NO/BYE response. - - NONEXISTENT - - A command failed because the referenced script name doesn't exist. - This response code only makes sense when returned in a NO/BYE - response. - - ALREADYEXISTS - - A command failed because the referenced script name already exists. - This response code only makes sense when returned in a NO/BYE - response. - - - -Melnikov & Martin Standards Track [Page 5] - -RFC 5804 ManageSieve July 2010 - - - TAG - - This response code name is followed by a string specified in the - command. See Section 2.13 for a possible use case. - - WARNINGS - - This response code MAY be returned by the server in the OK response - (but it might be returned with the NO/BYE response as well) and - signals the client that even though the script is syntactically - valid, it might contain errors not intended by the script writer. - This response code is typically returned in response to PUTSCRIPT - and/or CHECKSCRIPT commands. A client seeing such response code - SHOULD present the returned warning text to the user. - -1.4. Active Script - - A user may have multiple Sieve scripts on the server, yet only one - script may be used for filtering of incoming messages. This is the - active script. Users may have zero or one active script and MUST use - the SETACTIVE command described below for changing the active script - or disabling Sieve processing. For example, users may have an - everyday script they normally use and a special script they use when - they go on vacation. Users can change which script is being used - without having to download and upload a script stored somewhere else. - -1.5. Quotas - - Servers SHOULD impose quotas to prevent malicious users from - overflowing available storage. 
If a command would place a user over - a quota setting, servers that impose such quotas MUST reply with a NO - response containing the QUOTA response code. Client implementations - MUST be able to handle commands failing because of quota - restrictions. - -1.6. Script Names - - A Sieve script name is a sequence of Unicode characters encoded in - UTF-8 [UTF-8]. A script name MUST comply with Net-Unicode Definition - (Section 2 of [NET-UNICODE]), with the additional restriction of - prohibiting the following Unicode characters: - - o 0000-001F; [CONTROL CHARACTERS] - - o 007F; DELETE - - o 0080-009F; [CONTROL CHARACTERS] - - - - -Melnikov & Martin Standards Track [Page 6] - -RFC 5804 ManageSieve July 2010 - - - o 2028; LINE SEPARATOR - - o 2029; PARAGRAPH SEPARATOR - - Sieve script names MUST be at least one octet (and hence Unicode - character) long. A zero-octet script name has a special meaning (see - Section 2.8). Servers MUST allow names of up to 128 Unicode - characters in length (which can take up to 512 bytes when encoded in - UTF-8, not counting the terminating NUL), and MAY allow longer names. - A server that receives a script name longer than its internal limit - MUST reject the corresponding operation; in particular, it MUST NOT - truncate the script name. - -1.7. Capabilities - - Server capabilities are sent automatically by the server upon a - client connection, or after successful STARTTLS and AUTHENTICATE - (which establishes a Simple Authentication and Security Layer (SASL)) - commands. Capabilities may change immediately after a successfully - completed STARTTLS command, and/or immediately after a successfully - completed AUTHENTICATE command, and/or after a successfully completed - UNAUTHENTICATE command (see Section 2.14.1). Capabilities MUST - remain static at all other times. - - Clients MAY request the capabilities at a later time by issuing the - CAPABILITY command described later. 
The capabilities consist of a - series of lines each with one or two strings. The first string is - the name of the capability, which is case-insensitive. The second - optional string is the value associated with that capability. Order - of capabilities is arbitrary, but each capability name can appear at - most once. - - The following capabilities are defined in this document: - - IMPLEMENTATION - Name of implementation and version. This capability - MUST always be returned by the server. - - SASL - List of SASL mechanisms supported by the server, each - separated by a space. This list can be empty if and only if STARTTLS - is also advertised. This means that the client must negotiate TLS - encryption with STARTTLS first, at which point the SASL capability - will list a non-empty list of SASL mechanisms. - - SIEVE - List of space-separated Sieve extensions (as listed in Sieve - "require" action [SIEVE]) supported by the Sieve engine. This - capability MUST always be returned by the server. - - - - - -Melnikov & Martin Standards Track [Page 7] - -RFC 5804 ManageSieve July 2010 - - - STARTTLS - If TLS [TLS] is supported by this implementation. Before - advertising this capability a server MUST verify to the best of its - ability that TLS can be successfully negotiated by a client with - common cipher suites. Specifically, a server should verify that a - server certificate has been installed and that the TLS subsystem has - successfully initialized. This capability SHOULD NOT be advertised - once STARTTLS or AUTHENTICATE command completes successfully. Client - and server implementations MUST implement the STARTTLS extension. - - MAXREDIRECTS - Specifies the limit on the number of Sieve "redirect" - actions a script can perform during a single evaluation. Note that - this is different from the total number of "redirect" actions a - script can contain. The value is a non-negative number represented - as a ManageSieve string. 
- - NOTIFY - A space-separated list of URI schema parts for supported - notification methods. This capability MUST be specified if the Sieve - implementation supports the "enotify" extension [NOTIFY]. - - LANGUAGE - The language (<Language-Tag> from [RFC5646]) currently - used for human-readable error messages. If this capability is not - returned, the "i-default" [RFC2277] language is assumed. Note that - the current language MAY be per-user configurable (i.e., it MAY - change after authentication). - - OWNER - The canonical name of the logged-in user (SASL "authorization - identity") encoded in UTF-8. This capability MUST NOT be returned in - unauthenticated state and SHOULD be returned once the AUTHENTICATE - command succeeds. - - VERSION - This capability MUST be returned by servers compliant with - this document or its successor. For servers compliant with this - document, the capability value is the string "1.0". Lack of this - capability means that the server predates this specification and thus - doesn't support the following commands: RENAMESCRIPT, CHECKSCRIPT, - and NOOP. - - Section 2.14 defines some additional ManageSieve extensions and their - respective capabilities. - - A server implementation MUST return SIEVE, IMPLEMENTATION, and - VERSION capabilities. - - A client implementation MUST ignore any listed capabilities that it - does not understand. 
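The capability listing described above (one or two strings per line, case-insensitive names, unknown entries tolerated) could be read along the following lines. This is a minimal Python sketch, not part of the specification: the function name `parse_capabilities` is hypothetical, the input is assumed to be the server lines with any "S: " example prefix already removed, and only quoted strings are handled (no literals).

```python
import re

# One quoted string, allowing backslash-escaped characters inside it.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_capabilities(lines):
    """Collect capability lines into a dict keyed by upper-cased name.

    A capability with no value (for example STARTTLS) maps to None.
    Reading stops at the first line that is not a quoted string, which
    is assumed to be the completion response (OK, NO or BYE).
    """
    caps = {}
    for line in lines:
        line = line.strip()
        if not line.startswith('"'):
            break
        # Undo backslash escapes inside each quoted field.
        fields = [re.sub(r'\\(.)', r'\1', f) for f in _QUOTED.findall(line)]
        if not fields:
            continue
        name = fields[0].upper()  # capability names are case-insensitive
        caps[name] = fields[1] if len(fields) > 1 else None
    return caps


if __name__ == "__main__":
    listing = [
        '"IMPlemENTATION" "Example1 ManageSieved v001"',
        '"SASl" "DIGEST-MD5 GSSAPI"',
        '"StaRTTLS"',
        '"VERSION" "1.0"',
        "OK",
    ]
    print(parse_capabilities(listing))
```

Unknown capability names simply end up as extra dictionary keys that the caller never looks up, which matches the requirement to ignore what the client does not understand.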
- - - - - - -Melnikov & Martin Standards Track [Page 8] - -RFC 5804 ManageSieve July 2010 - - - Example: - - S: "IMPlemENTATION" "Example1 ManageSieved v001" - S: "SASl" "DIGEST-MD5 GSSAPI" - S: "SIeVE" "fileinto vacation" - S: "StaRTTLS" - S: "NOTIFY" "xmpp mailto" - S: "MAXREdIRECTS" "5" - S: "VERSION" "1.0" - S: OK - - After successful authentication, this might look like this: - - Example: - - S: "IMPlemENTATION" "Example1 ManageSieved v001" - S: "SASl" "DIGEST-MD5 GSSAPI" - S: "SIeVE" "fileinto vacation" - S: "NOTIFY" "xmpp mailto" - S: "OWNER" "alexey@example.com" - S: "MAXREdIRECTS" "5" - S: "VERSION" "1.0" - S: OK - -1.8. Transport - - The ManageSieve protocol assumes a reliable data stream such as that - provided by TCP. When TCP is used, a ManageSieve server typically - listens on port 4190. - - Before opening the TCP connection, the ManageSieve client first MUST - resolve the Domain Name System (DNS) hostname associated with the - receiving entity and determine the appropriate TCP port for - communication with the receiving entity. The process is as follows: - - 1. Attempt to resolve the hostname using a [DNS-SRV] Service of - "sieve" and a Proto of "tcp" for the target domain (e.g., - "example.net"), resulting in resource records such as - "_sieve._tcp.example.net.". The result of the SRV lookup, if - successful, will be one or more combinations of a port and - hostname; the ManageSieve client MUST resolve the returned - hostnames to IPv4/IPv6 addresses according to returned SRV record - weight. IP addresses from the first successfully resolved - hostname (with the corresponding port number returned by SRV - lookup) are used to connect to the server. If connection using - one of the IP addresses fails, the next resolved IP address is - - - - - -Melnikov & Martin Standards Track [Page 9] - -RFC 5804 ManageSieve July 2010 - - - used to connect. 
If connection to all resolved IP addresses - fails, then the resolution/connect is repeated for the next - hostname returned by SRV lookup. - - 2. If the SRV lookup fails, the fallback SHOULD be a normal IPv4 or - IPv6 address record resolution to determine the IP address, where - the port used is the default ManageSieve port of 4190. - -1.9. Conventions Used in This Document - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this - document are to be interpreted as described in [KEYWORDS]. - - In examples, "C:" and "S:" indicate lines sent by the client and - server respectively. Line breaks that do not start a new "C:" or - "S:" exist for editorial reasons. - - Examples of authentication in this document are using DIGEST-MD5 - [DIGEST-MD5] and GSSAPI [GSSAPI] SASL mechanisms. - -2. Commands - - This section and its subsections describe valid ManageSieve commands. - Upon initial connection to the server, the client's session is in - non-authenticated state. Prior to successful authentication, only - the AUTHENTICATE, CAPABILITY, STARTTLS, LOGOUT, and NOOP (see Section - 2.13) commands are valid. ManageSieve extensions MAY define other - commands that are valid in non-authenticated state. Servers MUST - reject all other commands with a NO response. Clients may pipeline - commands (send more than one command at a time without waiting for - completion of the first command). However, a group of commands sent - together MUST NOT have an AUTHENTICATE (*), a STARTTLS, or a - HAVESPACE command anywhere but the last command in the list. - - (*) - The only exception to this rule is when the AUTHENTICATE - command contains an initial response for a SASL mechanism that allows - clients to send data first, the mechanism is known to complete in one - round trip, and the mechanism doesn't negotiate a SASL security - layer. 
Two examples of such SASL mechanisms are PLAIN [PLAIN] and - EXTERNAL [SASL]. - - - - - - - - - - -Melnikov & Martin Standards Track [Page 10] - -RFC 5804 ManageSieve July 2010 - - -2.1. AUTHENTICATE Command - - Arguments: String - mechanism - String - initial data (optional) - - The AUTHENTICATE command indicates a SASL [SASL] authentication - mechanism to the server. If the server supports the requested - authentication mechanism, it performs an authentication protocol - exchange to identify and authenticate the user. Optionally, it also - negotiates a security layer for subsequent protocol interactions. If - the requested authentication mechanism is not supported, the server - rejects the AUTHENTICATE command by sending the NO response. - - The authentication protocol exchange consists of a series of server - challenges and client responses that are specific to the selected - authentication mechanism. A server challenge consists of a string - (quoted or literal) followed by a CRLF. The contents of the string - is a base-64 encoding [BASE64] of the SASL data. A client response - consists of a string (quoted or literal) with the base-64 encoding of - the SASL data followed by a CRLF. If the client wishes to cancel the - authentication exchange, it issues a string containing a single "*". - If the server receives such a response, it MUST reject the - AUTHENTICATE command by sending a NO reply. - - Note that an empty challenge/response is sent as an empty string. If - the mechanism dictates that the final response is sent by the server, - this data MAY be placed within the data portion of the SASL response - code to save a round trip. - - The optional initial-response argument to the AUTHENTICATE command is - used to save a round trip when using authentication mechanisms that - are defined to send no data in the initial challenge. 
When the - initial-response argument is used with such a mechanism, the initial - empty challenge is not sent to the client and the server uses the - data in the initial-response argument as if it were sent in response - to the empty challenge. If the initial-response argument to the - AUTHENTICATE command is used with a mechanism that sends data in the - initial challenge, the server MUST reject the AUTHENTICATE command by - sending the NO response. - - The service name specified by this protocol's profile of SASL is - "sieve". - - Reauthentication is not supported by ManageSieve protocol's profile - of SASL. That is, after a successfully completed AUTHENTICATE - command, no more AUTHENTICATE commands may be issued in the same - session. After a successful AUTHENTICATE command completes, a server - MUST reject any further AUTHENTICATE commands with a NO reply. - - - -Melnikov & Martin Standards Track [Page 11] - -RFC 5804 ManageSieve July 2010 - - - However, note that a server may implement the UNAUTHENTICATE - extension described in Section 2.14.1. - - If a security layer is negotiated through the SASL authentication - exchange, it takes effect immediately following the CRLF that - concludes the successful authentication exchange for the client, and - the CRLF of the OK response for the server. - - When a security layer takes effect, the ManageSieve protocol is reset - to the initial state (the state in ManageSieve after a client has - connected to the server). The server MUST discard any knowledge - obtained from the client that was not obtained from the SASL (or TLS) - negotiation itself. Likewise, the client MUST discard any knowledge - obtained from the server, such as the list of ManageSieve extensions, - that was not obtained from the SASL (and/or TLS) negotiation itself. - (Note that a client MAY compare the advertised SASL mechanisms before - and after authentication in order to detect an active down- - negotiation attack. See below.) 
- - Once a SASL security layer is established, the server MUST re-issue - the capability results, followed by an OK response. This is - necessary to protect against man-in-the-middle attacks that alter the - capabilities list prior to SASL negotiation. The capability results - MUST include all SASL mechanisms the server was capable of - negotiating with that client. This is done in order to allow the - client to detect an active down-negotiation attack. If a user- - oriented client detects such a down-negotiation attack, it SHOULD - either notify the user (it MAY give the user the opportunity to - continue with the ManageSieve session in this case) or close the - transport connection and indicate that a down-negotiation attack - might be in progress. If an automated client detects a down- - negotiation attack, it SHOULD return or log an error indicating that - a possible attack might be in progress and/or SHOULD close the - transport connection. - - When both [TLS] and SASL security layers are in effect, the TLS - encoding MUST be applied (when sending data) after the SASL encoding. - - Server implementations SHOULD support SASL proxy authentication so - that an administrator can administer a user's scripts. Proxy - authentication is when a user authenticates as herself/himself but - requests the server to act (authorize) as another user. - - The authorization identity generated by this [SASL] exchange is a - "simple username" (in the sense defined in [SASLprep]), and both - client and server MUST use the [SASLprep] profile of the [StringPrep] - algorithm to prepare these names for transmission or comparison. If - preparation of the authorization identity fails or results in an - - - -Melnikov & Martin Standards Track [Page 12] - -RFC 5804 ManageSieve July 2010 - - - empty string (unless it was transmitted as the empty string), the - server MUST fail the authentication. 
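The comparison of advertised SASL mechanisms before and after a security layer takes effect, mentioned earlier in this section, could be sketched in Python as follows. The function name `new_mechanisms` is hypothetical, and the inputs are assumed to be the raw values of the SASL capability (space-separated mechanism names). The result is only a hint: what to do with a non-empty result (warn the user, log, or close the connection) is a policy decision left to the caller, and a server may have legitimate reasons to list more mechanisms once the channel is protected.

```python
def new_mechanisms(before, after):
    """Return mechanisms advertised after protection but not before it.

    `before` and `after` are SASL capability values such as
    "DIGEST-MD5 GSSAPI"; None or "" means no mechanisms were listed.
    """
    seen_before = {m.upper() for m in (before or "").split()}
    seen_after = {m.upper() for m in (after or "").split()}
    # Anything the server can offer but that was missing from the
    # unprotected listing may have been stripped in transit.
    return seen_after - seen_before


if __name__ == "__main__":
    print(new_mechanisms("DIGEST-MD5 GSSAPI", "PLAIN DIGEST-MD5 GSSAPI"))
```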
- - If an AUTHENTICATE command fails with a NO response, the client MAY - try another authentication mechanism by issuing another AUTHENTICATE - command. In other words, the client may request authentication types - in decreasing order of preference. - - Note that a failed (NO) response to the AUTHENTICATE command may - contain one of the following response codes: AUTH-TOO-WEAK, ENCRYPT- - NEEDED, or TRANSITION-NEEDED. See Section 1.3 for detailed - description of the relevant conditions. - - To ensure interoperability, both client and server implementations of - the ManageSieve protocol MUST implement the SCRAM-SHA-1 [SCRAM] SASL - mechanism, as well as [PLAIN] over [TLS]. - - Note: use of PLAIN over TLS reflects current use of PLAIN over TLS in - other email-related protocols; however, a longer-term goal is to - migrate email-related protocols from using PLAIN over TLS to SCRAM- - SHA-1 mechanism. - - Examples (Note that long lines are folded for readability and are not - part of protocol exchange): - - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "SASL" "DIGEST-MD5 GSSAPI" - S: "SIEVE" "fileinto vacation" - S: "STARTTLS" - S: "VERSION" "1.0" - S: OK - C: Authenticate "DIGEST-MD5" - S: "cmVhbG09ImVsd29vZC5pbm5vc29mdC5leGFtcGxlLmNvbSIsbm9uY2U9Ik - 9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgiLGFsZ29yaXRobT1tZDUtc2Vz - cyxjaGFyc2V0PXV0Zi04" - C: "Y2hhcnNldD11dGYtOCx1c2VybmFtZT0iY2hyaXMiLHJlYWxtPSJlbHdvb2 - QuaW5ub3NvZnQuZXhhbXBsZS5jb20iLG5vbmNlPSJPQTZNRzl0RVFHbTJo - aCIsbmM9MDAwMDAwMDEsY25vbmNlPSJPQTZNSFhoNlZxVHJSayIsZGlnZX - N0LXVyaT0ic2lldmUvZWx3b29kLmlubm9zb2Z0LmV4YW1wbGUuY29tIixy - ZXNwb25zZT1kMzg4ZGFkOTBkNGJiZDc2MGExNTIzMjFmMjE0M2FmNyxxb3 - A9YXV0aA==" - S: OK (SASL "cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZ - mZmZA==") - - - - - - - - -Melnikov & Martin Standards Track [Page 13] - -RFC 5804 ManageSieve July 2010 - - - A slightly different variant of the same authentication exchange is: - - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "SASL" 
"DIGEST-MD5 GSSAPI" - S: "SIEVE" "fileinto vacation" - S: "VERSION" "1.0" - S: "STARTTLS" - S: OK - C: Authenticate "DIGEST-MD5" - S: {136} - S: cmVhbG09ImVsd29vZC5pbm5vc29mdC5leGFtcGxlLmNvbSIsbm9uY2U9Ik - 9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgiLGFsZ29yaXRobT1tZDUtc2Vz - cyxjaGFyc2V0PXV0Zi04 - C: {300+} - C: Y2hhcnNldD11dGYtOCx1c2VybmFtZT0iY2hyaXMiLHJlYWxtPSJlbHdvb2 - QuaW5ub3NvZnQuZXhhbXBsZS5jb20iLG5vbmNlPSJPQTZNRzl0RVFHbTJo - aCIsbmM9MDAwMDAwMDEsY25vbmNlPSJPQTZNSFhoNlZxVHJSayIsZGlnZX - N0LXVyaT0ic2lldmUvZWx3b29kLmlubm9zb2Z0LmV4YW1wbGUuY29tIixy - ZXNwb25zZT1kMzg4ZGFkOTBkNGJiZDc2MGExNTIzMjFmMjE0M2FmNyxxb3 - A9YXV0aA== - S: {56} - S: cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA== - C: "" - S: OK - - - - - - - - - - - - - - - - - - - - - - - - - - - -Melnikov & Martin Standards Track [Page 14] - -RFC 5804 ManageSieve July 2010 - - - Another example demonstrating use of SASL PLAIN mechanism under TLS - follows. This example also demonstrate use of SASL "initial - response" (the second parameter to the Authenticate command): - - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "VERSION" "1.0" - S: "SASL" "" - S: "SIEVE" "fileinto vacation" - S: "STARTTLS" - S: OK - C: STARTTLS - S: OK - <TLS negotiation, further commands are under TLS layer> - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "VERSION" "1.0" - S: "SASL" "PLAIN" - S: "SIEVE" "fileinto vacation" - S: OK - C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xu" - S: NO - C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xz" - S: NO - C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xy" - S: BYE "Too many failed authentication attempts" - <Server closes connection> - - - - - - - - - - - - - - - - - - - - - - - - - - -Melnikov & Martin Standards Track [Page 15] - -RFC 5804 ManageSieve July 2010 - - - The following example demonstrates use of SASL "initial response". 
- It also demonstrates that an empty response can be sent as a literal - and that negotiating a SASL security layer results in the server - re-issuing server capabilities: - - C: AUTHENTICATE "GSSAPI" {1488+} - C: YIIE[...1480 octets here ...]dA== - S: {208} - S: YIGZBgkqhkiG9xIBAgICAG+BiTCBhqADAgEFoQMCAQ+iejB4oAMCARKic - [...114 octets here ...] - /yzpAy9p+Y0LanLskOTvMc0MnjgAa4YEr3eJ6 - C: {0+} - C: - S: {44} - S: BQQF/wAMAAwAAAAAYRGFAo6W0vIHti8i1UXODgEAEAA= - C: {44+} - C: BQQE/wAMAAwAAAAAIsT1iv9UkZApw471iXt6cwEAAAE= - S: OK - <Further commands/responses are under SASL security layer> - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "VERSION" "1.0" - S: "SASL" "PLAIN DIGEST-MD5 GSSAPI" - S: "SIEVE" "fileinto vacation" - S: "LANGUAGE" "ru" - S: "MAXREDIRECTS" "3" - S: ok - -2.1.1. Use of SASL PLAIN Mechanism over TLS - - This section is normative for ManageSieve client implementations that - support SASL [PLAIN] over [TLS]. - - If a ManageSieve client is willing to use SASL PLAIN over TLS to - authenticate to the ManageSieve server, the client MUST verify the - server identity (see Section 2.2.1). If the server identity can't be - verified (e.g., the server has not provided any certificate, or if - the certificate verification fails), the client MUST NOT attempt to - authenticate using the SASL PLAIN mechanism. - -2.2. STARTTLS Command - - Support for STARTTLS command in servers is optional. Its - availability is advertised with "STARTTLS" capability as described in - Section 1.7. - - The STARTTLS command requests commencement of a TLS [TLS] - negotiation. The negotiation begins immediately after the CRLF in - the OK response. After a client issues a STARTTLS command, it MUST - - - -Melnikov & Martin Standards Track [Page 16] - -RFC 5804 ManageSieve July 2010 - - - NOT issue further commands until a server response is seen and the - TLS negotiation is complete. - - The STARTTLS command is only valid in non-authenticated state. 
The - server remains in non-authenticated state, even if client credentials - are supplied during the TLS negotiation. The SASL [SASL] EXTERNAL - mechanism MAY be used to authenticate once TLS client credentials are - successfully exchanged, but servers supporting the STARTTLS command - are not required to support the EXTERNAL mechanism. - - After the TLS layer is established, the server MUST re-issue the - capability results, followed by an OK response. This is necessary to - protect against man-in-the-middle attacks that alter the capabilities - list prior to STARTTLS. This capability result MUST NOT include the - STARTTLS capability. - - The client MUST discard cached capability information and replace it - with the new information. The server MAY advertise different - capabilities after STARTTLS. - - Example: - - C: StartTls - S: oK - <TLS negotiation, further commands are under TLS layer> - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "SASL" "PLAIN DIGEST-MD5 GSSAPI" - S: "SIEVE" "fileinto vacation" - S: "VERSION" "1.0" - S: "LANGUAGE" "fr" - S: ok - -2.2.1. Server Identity Check - - During the TLS negotiation, the ManageSieve client MUST check its - understanding of the server hostname/IP address against the server's - identity as presented in the server Certificate message, in order to - prevent man-in-the-middle attacks. In this section, the client's - understanding of the server's identity is called the "reference - identity". - - Checking is performed according to the following rules: - - o If the reference identity is a hostname: - - 1. If a subjectAltName extension of the SRVName [X509-SRV], - dNSName [X509] (in that order of preference) type is present - in the server's certificate, then it SHOULD be used as the - - - -Melnikov & Martin Standards Track [Page 17] - -RFC 5804 ManageSieve July 2010 - - - source of the server's identity. 
Matching is performed as - described in Section 2.2.1.1, with the exception that no - wildcard matching is allowed for SRVName type. If the - certificate contains multiple names (e.g., more than one - dNSName field), then a match with any one of the fields is - considered acceptable. - - 2. The client MAY use other types of subjectAltName for - performing comparison. - - 3. The server's identity MAY also be verified by comparing the - reference identity to the Common Name (CN) [RFC4519] value in - the leaf Relative Distinguished Name (RDN) of the subjectName - field of the server's certificate. This comparison is - performed using the rules for comparison of DNS names in - Section 2.2.1.1, below. Although the use of the Common Name - value is existing practice, it is deprecated, and - Certification Authorities are encouraged to provide - subjectAltName values instead. Note that the TLS - implementation may represent DNs in certificates according to - X.500 or other conventions. For example, some X.500 - implementations order the RDNs in a DN using a left-to-right - (most significant to least significant) convention instead of - LDAP's right-to-left convention. - - o When the reference identity is an IP address, the iPAddress - subjectAltName SHOULD be used by the client for comparison. The - comparison is performed as described in Section 2.2.1.2. - - If the server identity check fails, user-oriented clients SHOULD - either notify the user (clients MAY give the user the opportunity to - continue with the ManageSieve session in this case) or close the - transport connection and indicate that the server's identity is - suspect. Automated clients SHOULD return or log an error indicating - that the server's identity is suspect and/or SHOULD close the - transport connection. Automated clients MAY provide a configuration - setting that disables this check, but MUST provide a setting that - enables it. 
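As a rough illustration of the hostname rule above, the following Python sketch compares a reference identity with dNSName values using the wildcard semantics of Section 2.2.1.1. It assumes the reference identity has already been converted to its ASCII (ACE) form, and the function names are hypothetical. In practice a client would normally rely on the hostname verification of its TLS library instead of hand-written matching.

```python
def dns_name_matches(reference, presented):
    """Compare one dNSName value against the reference identity.

    The comparison is case-insensitive. A "*" is honoured only when it
    is the entire left-most label, where it stands for exactly one label.
    """
    ref_labels = reference.lower().split(".")
    cert_labels = presented.lower().split(".")
    if len(ref_labels) != len(cert_labels):
        return False
    if cert_labels[0] == "*" and len(cert_labels) > 1:
        # Wildcard: the left-most label may differ, the rest must match.
        return ref_labels[0] != "" and ref_labels[1:] == cert_labels[1:]
    return ref_labels == cert_labels


def reference_matches_certificate(reference, dns_names):
    """A match with any one of the certificate's dNSName fields suffices."""
    return any(dns_name_matches(reference, name) for name in dns_names)


if __name__ == "__main__":
    names = ["*.example.com", "mail.example.net"]
    print(reference_matches_certificate("a.example.com", names))
    print(reference_matches_certificate("example.com", names))
```

Comparing label counts first is what keeps *.example.com from matching either example.com or a.b.example.com.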
- - Beyond the server identity check described in this section, clients - should be prepared to do further checking to ensure that the server - is authorized to provide the service it is requested to provide. The - client may need to make use of local policy information in making - this determination. - - - - - - - -Melnikov & Martin Standards Track [Page 18] - -RFC 5804 ManageSieve July 2010 - - -2.2.1.1. Comparison of DNS Names - - If the reference identity is an internationalized domain name, - conforming implementations MUST convert it to the ASCII Compatible - Encoding (ACE) format as specified in Section 4 of RFC 3490 [RFC3490] - before comparison with subjectAltName values of type dNSName. - Specifically, conforming implementations MUST perform the conversion - operation specified in Section 4 of [RFC3490] as follows: - - o in step 1, the domain name SHALL be considered a "stored string"; - - o in step 3, set the flag called "UseSTD3ASCIIRules"; - - o in step 4, process each label with the "ToASCII" operation; and - - o in step 5, change all label separators to U+002E (full stop). - - After performing the "to-ASCII" conversion, the DNS labels and names - MUST be compared for equality according to the rules specified in - Section 3 of [RFC3490]; i.e., once all label separators are replaced - with U+002E (dot) they are compared in the case-insensitive manner. - - The '*' (ASCII 42) wildcard character is allowed in subjectAltName - values of type dNSName, and then only as the left-most (least - significant) DNS label in that value. This wildcard matches any - left-most DNS label in the server name. That is, the subject - *.example.com matches the server names a.example.com and - b.example.com, but does not match example.com or a.b.example.com. - -2.2.1.2. Comparison of IP Addresses - - When the reference identity is an IP address, the identity MUST be - converted to the "network byte order" octet string representation - [RFC791][RFC2460]. 
For IP Version 4, as specified in RFC 791, the - octet string will contain exactly four octets. For IP Version 6, as - specified in RFC 2460, the octet string will contain exactly sixteen - octets. This octet string is then compared against subjectAltName - values of type iPAddress. A match occurs if the reference identity - octet string and value octet strings are identical. - -2.2.1.3. Comparison of Other subjectName Types - - Client implementations MAY support matching against subjectAltName - values of other types as described in other documents. - - - - - - - -Melnikov & Martin Standards Track [Page 19] - -RFC 5804 ManageSieve July 2010 - - -2.3. LOGOUT Command - - The client sends the LOGOUT command when it is finished with a - connection and wishes to terminate it. The server MUST reply with an - OK response. The server MUST ignore commands issued by the client - after the LOGOUT command. - - The client SHOULD wait for the OK response before closing the - connection. This avoids the TCP connection going into the TIME_WAIT - state on the server. In order to avoid going into the TIME_WAIT TCP - state, the server MAY wait for a short while for the client to close - the TCP connection first. Whether or not the server waits for the - client to close the connection, it MUST then close the connection - itself. - - Example: - - C: Logout - S: Ok - <connection is terminated> - -2.4. CAPABILITY Command - - The CAPABILITY command requests the server capabilities as described - earlier in this document. It has no parameters. - - Example: - - C: CAPABILITY - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "VERSION" "1.0" - S: "SASL" "PLAIN SCRAM-SHA-1 GSSAPI" - S: "SIEVE" "fileinto vacation" - S: "STARTTLS" - S: OK - -2.5. HAVESPACE Command - - Arguments: String - name - Number - script size - - The HAVESPACE command is used to query the server for available - space. Clients specify the name they wish to save the script as and - its size in octets. 
Both parameters can be used by the server to see - if the script with the specified name and size is within a user's - quota(s). For example, the server MAY use the script name to check - if a script would be replaced or a new one would be created. Servers - respond with a NO if storing a script with that name and size would - - - -Melnikov & Martin Standards Track [Page 20] - -RFC 5804 ManageSieve July 2010 - - - fail or OK otherwise. Clients SHOULD issue this command before - attempting to place a script on the server. - - Note that the OK response from the HAVESPACE command does not - constitute a guarantee of success as server disk space conditions - could change between the client issuing the HAVESPACE and the client - issuing the PUTSCRIPT commands. A QUOTA response code (see - Section 1.3) remains a possible (albeit unlikely) response to a - subsequent PUTSCRIPT with the same name and size. - - Example: - - C: HAVESPACE "myscript" 999999 - S: NO (QUOTA/MAXSIZE) "Quota exceeded" - - C: HAVESPACE "foobar" 435 - S: OK - -2.6. PUTSCRIPT Command - - Arguments: String - Script name - String - Script content - - The PUTSCRIPT command is used by the client to submit a Sieve script - to the server. - - If the script already exists, upon success the old script will be - overwritten. The old script MUST NOT be overwritten if PUTSCRIPT - fails in any way. A script of zero length SHOULD be disallowed. - - This command places the script on the server. It does not affect - whether the script is processed on incoming mail, unless it replaces - the script that is already active. The SETACTIVE command is used to - mark a script as active. - - When submitting large scripts, clients SHOULD use the HAVESPACE - command beforehand to query if the server is willing to accept a - script of that size. 
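A client-side sketch of the HAVESPACE and PUTSCRIPT commands described above is given below in Python. It only builds the octets to send. The helper names are hypothetical, the script is assumed to be UTF-8 text, and script names that cannot be sent as a quoted string are simply rejected here, although a real client could send them as a literal instead.

```python
def quoted(text):
    """Encode text as a quoted string, escaping backslash and DQUOTE."""
    data = text.encode("utf-8")
    if len(data) > 1024 or any(ch in text for ch in "\r\n\0"):
        raise ValueError("not representable as a quoted string")
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return b'"' + escaped.encode("utf-8") + b'"'


def literal_c2s(data):
    """Encode octets as a client-to-server literal: {<size>+} CRLF data."""
    return b"{" + str(len(data)).encode("ascii") + b"+}\r\n" + data


def havespace_command(name, script):
    """Build HAVESPACE; the size is counted in octets, not characters."""
    size = len(script.encode("utf-8"))
    return b"HAVESPACE " + quoted(name) + b" " + str(size).encode("ascii") + b"\r\n"


def putscript_command(name, script):
    """Build PUTSCRIPT with the script content sent as a literal."""
    body = literal_c2s(script.encode("utf-8"))
    return b"PUTSCRIPT " + quoted(name) + b" " + body + b"\r\n"


if __name__ == "__main__":
    script = 'require ["fileinto"];\r\nfileinto "INBOX.sent";\r\n'
    print(havespace_command("mysievescript", script))
    print(putscript_command("mysievescript", script))
```

Because an OK reply to HAVESPACE is not a guarantee, the caller still has to handle a NO response carrying the QUOTA response code after PUTSCRIPT.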
- - The server MUST check the submitted script for validity, which - includes checking that the script complies with the Sieve grammar - [SIEVE] and that all Sieve extensions mentioned in the script's - "require" statement(s) are supported by the Sieve interpreter. (Note - that if the Sieve interpreter supports the Sieve "ihave" extension - [I-HAVE], any unrecognized/unsupported extension mentioned in the - "ihave" test MUST NOT cause the validation failure.) Other checks - such as validating the supplied command arguments for each command - MAY be performed. Essentially, the performed validation SHOULD be - - - -Melnikov & Martin Standards Track [Page 21] - -RFC 5804 ManageSieve July 2010 - - - the same as performed when compiling the script for execution. - Implementations that use a binary representation to store compiled - scripts can extend the validation to a full compilation, in order to - avoid validating uploaded scripts multiple times. - - If the script fails the validation, the server MUST reply with a NO - response. Any script that fails the validity test MUST NOT be stored - on the server. The message given with a NO response MUST be human - readable and SHOULD contain a specific error message giving the line - number of the first error. Implementors should strive to produce - helpful error messages similar to those given by programming language - compilers. Client implementations should note that this may be a - multiline literal string with more than one error message separated - by CRLFs. The human-readable message is in the language returned in - the latest LANGUAGE capability (or in "i-default"; see Section 1.7), - encoded in UTF-8 [UTF-8]. - - An OK response MAY contain the WARNINGS response code. In such a - case the human-readable message that follows the OK response SHOULD - contain a specific warning message (or messages) giving the line - number(s) in the script that might contain errors not intended by the - script writer. 
The human-readable message is in the language - returned in the latest LANGUAGE capability (or in "i-default"; see - Section 1.7), encoded in UTF-8 [UTF-8]. A client seeing such a - response code SHOULD present the message to the user. - - - - - - - - - - - - - - - - - - - - - - - - - - -Melnikov & Martin Standards Track [Page 22] - -RFC 5804 ManageSieve July 2010 - - - Examples: - - C: Putscript "foo" {31+} - C: #comment - C: InvalidSieveCommand - C: - S: NO "line 2: Syntax error" - - C: Putscript "mysievescript" {110+} - C: require ["fileinto"]; - C: - C: if envelope :contains "to" "tmartin+sent" { - C: fileinto "INBOX.sent"; - C: } - S: OK - - C: Putscript "myforwards" {190+} - C: redirect "111@example.net"; - C: - C: if size :under 10k { - C: redirect "mobile@cell.example.com"; - C: } - C: - C: if envelope :contains "to" "tmartin+lists" { - C: redirect "lists@groups.example.com"; - C: } - S: OK (WARNINGS) "line 8: server redirect action - limit is 2, this redirect might be ignored" - -2.7. LISTSCRIPTS Command - - This command lists the scripts the user has on the server. Upon - success, a list of CRLF-separated script names (each represented as a - quoted or literal string) is returned followed by an OK response. If - there exists an active script, the atom ACTIVE is appended to the - corresponding script name. The atom ACTIVE MUST NOT appear on more - than one response line. - - - - - - - - - - - - - - -Melnikov & Martin Standards Track [Page 23] - -RFC 5804 ManageSieve July 2010 - - - Example: - - C: Listscripts - S: "summer_script" - S: "vacation_script" - S: {13} - S: clever"script - S: "main_script" ACTIVE - S: OK - - C: listscripts - S: "summer_script" - S: "main_script" active - S: OK - -2.8. SETACTIVE Command - - Arguments: String - script name - - This command sets a script active. If the script name is the empty - string (i.e., ""), then any active script is disabled. 
Disabling an - active script when there is no script active is not an error and MUST - result in an OK reply. - - If the script does not exist on the server, then the server MUST - reply with a NO response. Such a reply SHOULD contain the - NONEXISTENT response code. - - Examples: - - C: Setactive "vacationscript" - S: Ok - - C: Setactive "" - S: Ok - - C: Setactive "baz" - S: No (NONEXISTENT) "There is no script by that name" - - C: Setactive "baz" - S: No (NONEXISTENT) {31} - S: There is no script by that name - - - - - - - - - -Melnikov & Martin Standards Track [Page 24] - -RFC 5804 ManageSieve July 2010 - - -2.9. GETSCRIPT Command - - Arguments: String - script name - - This command gets the contents of the specified script. If the - script does not exist, the server MUST reply with a NO response. - Such a reply SHOULD contain the NONEXISTENT response code. - - Upon success, a string with the contents of the script is returned - followed by an OK response. - - Example: - - C: Getscript "myscript" - S: {54} - S: #this is my wonderful script - S: reject "I reject all"; - S: - S: OK - -2.10. DELETESCRIPT Command - - Arguments: String - script name - - This command is used to delete a user's Sieve script. Servers MUST - reply with a NO response if the script does not exist. Such - responses SHOULD include the NONEXISTENT response code. - - The server MUST NOT allow the client to delete an active script, so - the server MUST reply with a NO response if attempted. Such a - response SHOULD contain the ACTIVE response code. If a client wishes - to delete an active script, it should use the SETACTIVE command to - disable the script first. - - Example: - - C: Deletescript "foo" - S: Ok - - C: Deletescript "baz" - S: No (ACTIVE) "You may not delete an active script" - - - - - - - - - - -Melnikov & Martin Standards Track [Page 25] - -RFC 5804 ManageSieve July 2010 - - -2.11. 
RENAMESCRIPT Command - - Arguments: String - Old Script name - String - New Script name - - This command is used to rename a user's Sieve script. Servers MUST - reply with a NO response if the old script does not exist (in which - case the NONEXISTENT response code SHOULD be included), or a script - with the new name already exists (in which case the ALREADYEXISTS - response code SHOULD be included). Renaming the active script is - allowed; the renamed script remains active. - - Example: - - C: Renamescript "foo" "bar" - S: Ok - - C: Renamescript "baz" "bar" - S: No "bar already exists" - - If the server doesn't support the RENAMESCRIPT command, the client - can emulate it by performing the following steps: - - 1. List available scripts with LISTSCRIPTS. If the script with the - new script name exists, then the client should ask the user - whether to abort the operation, to replace the script (by issuing - the DELETESCRIPT <newname> after that), or to choose a different - name. - - 2. Download the old script with GETSCRIPT <oldname>. - - 3. Upload the old script with the new name: PUTSCRIPT <newname>. - - 4. If the old script was active (as reported by LISTSCRIPTS in step - 1), then make the new script active: SETACTIVE <newname>. - - 5. Delete the old script: DELETESCRIPT <oldname>. - - Note that these steps don't describe how to handle various other - error conditions (for example, NO response containing QUOTA response - code in step 3). Error handling is left as an exercise for the - reader. - - - - - - - - - -Melnikov & Martin Standards Track [Page 26] - -RFC 5804 ManageSieve July 2010 - - -2.12. CHECKSCRIPT Command - - Arguments: String - Script content - - The CHECKSCRIPT command is used by the client to verify Sieve script - validity without storing the script on the server. 
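The RENAMESCRIPT emulation steps listed in Section 2.11 could be sketched roughly as follows in Python. The InMemorySession class is a hypothetical stand-in for a real ManageSieve connection and exists only to make the example runnable. Its method names are assumptions, and error handling (for example a QUOTA failure in step 3) is deliberately left out, as in the text.

```python
class InMemorySession:
    """Hypothetical stand-in for a ManageSieve session (illustration only)."""

    def __init__(self, scripts, active=None):
        self.scripts = dict(scripts)
        self.active = active

    def listscripts(self):
        return [(name, name == self.active) for name in self.scripts]

    def getscript(self, name):
        return self.scripts[name]

    def putscript(self, name, content):
        self.scripts[name] = content

    def setactive(self, name):
        self.active = name or None

    def deletescript(self, name):
        if name == self.active:
            raise RuntimeError("ACTIVE: cannot delete the active script")
        del self.scripts[name]


def emulate_rename(session, old_name, new_name):
    """Rename a script using only LISTSCRIPTS/GETSCRIPT/PUTSCRIPT/etc."""
    listing = dict(session.listscripts())            # step 1
    if old_name not in listing:
        raise LookupError("NONEXISTENT: " + old_name)
    if new_name in listing:
        # The text asks the client to consult the user here; this
        # sketch just reports the conflict to the caller.
        raise FileExistsError("ALREADYEXISTS: " + new_name)
    content = session.getscript(old_name)            # step 2
    session.putscript(new_name, content)             # step 3
    if listing[old_name]:                            # step 4
        session.setactive(new_name)
    session.deletescript(old_name)                   # step 5


if __name__ == "__main__":
    session = InMemorySession({"foo": "keep;"}, active="foo")
    emulate_rename(session, "foo", "bar")
    print(session.listscripts())
```

Activating the new script before deleting the old one matters, because the server refuses to delete a script that is still active.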
- - The server MUST check the submitted script for syntactic validity, - which includes checking that all Sieve extensions mentioned in Sieve - script "require" statement(s) are supported by the Sieve interpreter. - (Note that if the Sieve interpreter supports the Sieve "ihave" - extension [I-HAVE], any unrecognized/unsupported extension mentioned - in the "ihave" test MUST NOT cause the syntactic validation failure.) - If the script fails this test, the server MUST reply with a NO - response. The message given with a NO response MUST be human - readable and SHOULD contain a specific error message giving the line - number of the first error. Implementors should strive to produce - helpful error messages similar to those given by programming language - compilers. Client implementations should note that this may be a - multiline literal string with more than one error message separated - by CRLFs. The human-readable message is in the language returned in - the latest LANGUAGE capability (or in "i-default"; see Section 1.7), - encoded in UTF-8 [UTF-8]. - - Examples: - - C: CheckScript {31+} - C: #comment - C: InvalidSieveCommand - C: - S: NO "line 2: Syntax error" - - A ManageSieve server supporting this command MUST NOT check if the - script will put the current user over its quota limit. - - An OK response MAY contain the WARNINGS response code. In such a - case, the human-readable message that follows the OK response SHOULD - contain a specific warning message (or messages) giving the line - number(s) in the script that might contain errors not intended by the - script writer. The human-readable message is in the language - returned in the latest LANGUAGE capability (or in "i-default"; see - Section 1.7), encoded in UTF-8 [UTF-8]. A client seeing such a - response code SHOULD present the message to the user. - - - - - - - - -Melnikov & Martin Standards Track [Page 27] - -RFC 5804 ManageSieve July 2010 - - -2.13. 
NOOP Command - - Arguments: String - tag to echo back (optional) - - The NOOP command does nothing, beyond returning a response to the - client. It may be used by clients for protocol re-synchronization or - to reset any inactivity auto-logout timer on the server. - - The response to the NOOP command is always OK, followed by the TAG - response code together with the supplied string. If no string was - supplied in the NOOP command, the TAG response code MUST NOT be - included. - - Examples: - - C: NOOP - S: OK "NOOP completed" - - C: NOOP "STARTTLS-SYNC-42" - S: OK (TAG {16} - S: STARTTLS-SYNC-42) "Done" - -2.14. Recommended Extensions - - The UNAUTHENTICATE extension (advertised as the "UNAUTHENTICATE" - capability with no parameters) defines a new UNAUTHENTICATE command, - which allows a client to return the server to non-authenticated - state. Support for this extension is RECOMMENDED. - -2.14.1. UNAUTHENTICATE Command - - The UNAUTHENTICATE command returns the server to the - non-authenticated state. It doesn't affect any previously - established TLS [TLS] or SASL (Section 2.1) security layer. - - The UNAUTHENTICATE command is only valid in authenticated state. If - issued in a wrong state, the server MUST reject it with a NO - response. - - The UNAUTHENTICATE command has no parameters. - - When issued in the authenticated state, the UNAUTHENTICATE command - MUST NOT fail (i.e., it must never return anything other than OK or - BYE). - - - - - - - -Melnikov & Martin Standards Track [Page 28] - -RFC 5804 ManageSieve July 2010 - - -3. Sieve URL Scheme - - URI scheme name: sieve - - Status: permanent - - URI scheme syntax: Described using ABNF [ABNF]. Some ABNF - productions not defined below are from [URI-GEN]. 
- - sieveurl = sieveurl-server / sieveurl-list-scripts / - sieveurl-script - - sieveurl-server = "sieve://" authority - - sieveurl-list-scripts = "sieve://" authority ["/"] - - sieveurl-script = "sieve://" authority "/" - [owner "/"] scriptname - - authority = <defined in [URI-GEN]> - - owner = *ochar - ;; %-encoded version of [SASL] authorization - ;; identity (script owner) or "userid". - ;; - ;; Empty owner is used to reference - ;; global scripts. - ;; - ;; Note that ASCII characters such as " ", ";", - ;; "&", "=", "/" and "?" must be %-encoded - ;; as per rule specified in [URI-GEN]. - - scriptname = 1*ochar - ;; %-encoded version of UTF-8 representation - ;; of the script name. - ;; Note that ASCII characters such as " ", ";", - ;; "&", "=", "/" and "?" must be %-encoded - ;; as per rule specified in [URI-GEN]. - - ochar = unreserved / pct-encoded / sub-delims-sh / - ":" / "@" - ;; Same as [URI-GEN] 'pchar', - ;; but without ";", "&" and "=". - - unreserved = <defined in [URI-GEN]> - - pct-encoded = <defined in [URI-GEN]> - - - - -Melnikov & Martin Standards Track [Page 29] - -RFC 5804 ManageSieve July 2010 - - - sub-delims-sh = "!" / "$" / "'" / "(" / ")" / - "*" / "+" / "," - ;; Same as [URI-GEN] sub-delims, - ;; but without ";", "&" and "=". - - URI scheme semantics: - - A Sieve URL identifies a Sieve server or a Sieve script on a Sieve - server. The latter form is associated with the application/sieve - MIME type defined in [SIEVE]. There is no MIME type associated - with the former form of Sieve URI. - - The server form is used in the REFERRAL response code (see Section - 1.3) in order to designate another server where the client should - perform its operations. - - The script form allows to retrieve (GETSCRIPT), update - (PUTSCRIPT), delete (DELETESCRIPT), or activate (SETACTIVE) the - named script; however, the most typical action would be to - retrieve the script. 
If the script name is empty (omitted), the - URI requests that the client lists available scripts using the - LISTSCRIPTS command. - - Encoding considerations: - - The script name and/or the owner, if present, is in UTF-8. Non-- - US-ASCII UTF-8 octets MUST be percent-encoded as described in - [URI-GEN]. US-ASCII characters such as " " (space), ";", "&", - "=", "/" and "?" MUST be %-encoded as described in [URI-GEN]. - Note that "&" and "?" are in this list in order to allow for - future extensions. - - Note that the empty owner (e.g., sieve://example.com//script) is - different from the missing owner (e.g., - sieve://example.com/script) and is reserved for referencing global - scripts. - - The user name (in the "authority" part), if present, is in UTF-8. - Non-US-ASCII UTF-8 octets MUST be percent-encoded as described in - [URI-GEN]. - - Applications/protocols that use this URI scheme name: - ManageSieve [RFC5804] clients and servers. Clients that can store - user preferences in protocols such as [LDAP] or [ACAP]. - - Interoperability considerations: None. - - - - - -Melnikov & Martin Standards Track [Page 30] - -RFC 5804 ManageSieve July 2010 - - - Security considerations: - The <scriptname> part of a ManageSieve URL might potentially disclose - some confidential information about the author of the script or, - depending on a ManageSieve implementation, about configuration of the - mail system. The latter might be used to prepare for a more complex - attack on the mail system. - - Clients resolving ManageSieve URLs that wish to achieve data - confidentiality and/or integrity SHOULD use the STARTTLS command (if - supported by the server) before starting authentication, or use a - SASL mechanism, such as GSSAPI, that provides a confidentiality - security layer. - - Contact: Alexey Melnikov <alexey.melnikov@isode.com> - - Author/Change controller: IESG. - - References: This document and RFC 5228 [SIEVE]. - -4. 
Formal Syntax - - The following syntax specification uses the Augmented Backus-Naur - Form (ABNF) notation as specified in [ABNF]. This uses the ABNF core - rules as specified in Appendix A of the ABNF specification [ABNF]. - "UTF8-2", "UTF8-3", and "UTF8-4" non-terminals are defined in [UTF-8]. - - Except as noted otherwise, all alphabetic characters are case- - insensitive. The use of upper- or lowercase characters to define - token strings is for editorial clarity only. Implementations MUST - accept these strings in a case-insensitive fashion. - - SAFE-CHAR = %x01-09 / %x0B-0C / %x0E-21 / %x23-5B / - %x5D-7F - ;; any TEXT-CHAR except QUOTED-SPECIALS - - QUOTED-CHAR = SAFE-UTF8-CHAR / "\" QUOTED-SPECIALS - - QUOTED-SPECIALS = DQUOTE / "\" - - SAFE-UTF8-CHAR = SAFE-CHAR / UTF8-2 / UTF8-3 / UTF8-4 - ;; <UTF8-2>, <UTF8-3>, and <UTF8-4> - ;; are defined in [UTF-8]. - - ATOM-CHAR = "!" / %x23-27 / %x2A-5B / %x5D-7A / %x7C-7E - ;; Any CHAR except ATOM-SPECIALS - - ATOM-SPECIALS = "(" / ")" / "{" / SP / CTL / QUOTED-SPECIALS - - - - -Melnikov & Martin Standards Track [Page 31] - -RFC 5804 ManageSieve July 2010 - - - NZDIGIT = %x31-39 - ;; 1-9 - - atom = 1*1024ATOM-CHAR - - iana-token = atom - ;; MUST be registered with IANA - - auth-type = DQUOTE auth-type-name DQUOTE - - auth-type-name = iana-token - ;; as defined in SASL [SASL] - - command = (command-any / command-auth / - command-nonauth) CRLF - ;; Modal based on state - - command-any = command-capability / command-logout / - command-noop - ;; Valid in all states - - command-auth = command-getscript / command-setactive / - command-listscripts / command-deletescript / - command-putscript / command-checkscript / - command-havespace / - command-renamescript / - command-unauthenticate - ;; Valid only in Authenticated state - - command-nonauth = command-authenticate / command-starttls - ;; Valid only when in Non-Authenticated - ;; state - - command-authenticate = "AUTHENTICATE" SP auth-type [SP string] - *(CRLF string) - - 
command-capability = "CAPABILITY" - - command-deletescript = "DELETESCRIPT" SP sieve-name - - command-getscript = "GETSCRIPT" SP sieve-name - - command-havespace = "HAVESPACE" SP sieve-name SP number - - command-listscripts = "LISTSCRIPTS" - - command-noop = "NOOP" [SP string] - - - - -Melnikov & Martin Standards Track [Page 32] - -RFC 5804 ManageSieve July 2010 - - - command-logout = "LOGOUT" - - command-putscript = "PUTSCRIPT" SP sieve-name SP sieve-script - - command-checkscript = "CHECKSCRIPT" SP sieve-script - - sieve-script = string - - command-renamescript = "RENAMESCRIPT" SP old-sieve-name SP - new-sieve-name - - old-sieve-name = sieve-name - - new-sieve-name = sieve-name - - command-setactive = "SETACTIVE" SP active-sieve-name - - command-starttls = "STARTTLS" - - command-unauthenticate= "UNAUTHENTICATE" - - extend-token = atom - ;; MUST be defined by a Standards Track or - ;; IESG-approved experimental protocol - ;; extension - - extension-data = extension-item *(SP extension-item) - - extension-item = extend-token / string / number / - "(" [extension-data] ")" - - literal-c2s = "{" number "+}" CRLF *OCTET - ;; The number represents the number of - ;; octets. - ;; This type of literal can only be sent - ;; from the client to the server. - - literal-s2c = "{" number "}" CRLF *OCTET - ;; Almost identical to literal-c2s, - ;; but with no '+' character. - ;; The number represents the number of - ;; octets. - ;; This type of literal can only be sent - ;; from the server to the client. - - - - - - - -Melnikov & Martin Standards Track [Page 33] - -RFC 5804 ManageSieve July 2010 - - - number = (NZDIGIT *DIGIT) / "0" - ;; A 32-bit unsigned number - ;; with no extra leading zeros. - ;; (0 <= n < 4,294,967,296) - - number-str = string - ;; <number> encoded as a <string>. 
- - quoted = DQUOTE *1024QUOTED-CHAR DQUOTE - ;; limited to 1024 octets between the <">s - - resp-code = "AUTH-TOO-WEAK" / "ENCRYPT-NEEDED" / "QUOTA" - ["/" ("MAXSCRIPTS" / "MAXSIZE")] / - resp-code-sasl / - resp-code-referral / - "TRANSITION-NEEDED" / "TRYLATER" / - "ACTIVE" / "NONEXISTENT" / - "ALREADYEXISTS" / "WARNINGS" / - "TAG" SP string / - resp-code-ext - - resp-code-referral = "REFERRAL" SP sieveurl - - resp-code-sasl = "SASL" SP string - - resp-code-name = iana-token - ;; The response code name is hierarchical, - ;; separated by '/'. - ;; The response code name MUST NOT start - ;; with '/'. - - resp-code-ext = resp-code-name [SP extension-data] - ;; unknown response codes MUST be tolerated - ;; by the client. - - response = response-authenticate / - response-logout / - response-getscript / - response-setactive / - response-listscripts / - response-deletescript / - response-putscript / - response-checkscript / - response-capability / - response-havespace / - response-starttls / - response-renamescript / - response-noop / - - - -Melnikov & Martin Standards Track [Page 34] - -RFC 5804 ManageSieve July 2010 - - - response-unauthenticate - - response-authenticate = *(string CRLF) - ((response-ok [response-capability]) / - response-nobye) - ;; <response-capability> is REQUIRED if a - ;; SASL security layer was negotiated and - ;; MUST be omitted otherwise. - - response-capability = *(single-capability) response-oknobye - - single-capability = capability-name [SP string] CRLF - - capability-name = string - - ;; Note that literal-s2c is allowed. 
- - initial-capabilities = DQUOTE "IMPLEMENTATION" DQUOTE SP string / - DQUOTE "SASL" DQUOTE SP sasl-mechs / - DQUOTE "SIEVE" DQUOTE SP sieve-extensions / - DQUOTE "MAXREDIRECTS" DQUOTE SP number-str / - DQUOTE "NOTIFY" DQUOTE SP notify-mechs / - DQUOTE "STARTTLS" DQUOTE / - DQUOTE "LANGUAGE" DQUOTE SP language / - DQUOTE "VERSION" DQUOTE SP version / - DQUOTE "OWNER" DQUOTE SP string - ;; Each capability conforms to - ;; the syntax for single-capability. - ;; Also, note that the capability name - ;; can be returned as either literal-s2c - ;; or quoted, even though only "quoted" - ;; string is shown above. - - version = ( DQUOTE "1.0" DQUOTE ) / version-ext - - version-ext = DQUOTE ver-major "." ver-minor DQUOTE - ; Future versions specified in updates - ; to this document. An increment to - ; the ver-major means a backward-incompatible - ; change to the protocol, e.g., "3.5" (ver-major "3") - ; is not backward-compatible with any "2.X" version. - ; Any version "Z.W" MUST be backward compatible - ; with any version "Z.Q", where Q < W. - ; For example, version "2.4" is backward compatible - ; with version "2.0", "2.1", "2.2", and "2.3". - - ver-major = number - - - - -Melnikov & Martin Standards Track [Page 35] - -RFC 5804 ManageSieve July 2010 - - - ver-minor = number - - sasl-mechs = string - ; Space-separated list of SASL mechanisms, - ; each SASL mechanism name complies with rules - ; specified in [SASL]. - ; Can be empty. - - sieve-extensions = string - ; Space-separated list of supported SIEVE extensions. - ; Can be empty. - - language = string - ; Contains <Language-Tag> from [RFC5646]. - - - notify-mechs = string - ; Space-separated list of URI schema parts - ; for supported notification [NOTIFY] methods. - ; MUST NOT be empty. 
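The compatibility rule stated in the `version-ext` comments (same `ver-major`, and a `ver-minor` that is not lower than the one required) can be sketched as follows. This is an illustrative reading of those comments, not code from the RFC or from rohrpost; `parse_version` and `is_backward_compatible` are hypothetical names, and the input is assumed to be the VERSION capability value with the surrounding double quotes already removed.

```python
import re
from typing import Tuple

# ver-major and ver-minor are both <number>: "0" or digits without leading zeros.
_NUMBER_RE = re.compile(r"0|[1-9][0-9]*")


def parse_version(text: str) -> Tuple[int, int]:
    """Split a version value such as "2.4" into (ver-major, ver-minor)."""
    major, sep, minor = text.partition(".")
    if sep != "." or not _NUMBER_RE.fullmatch(major) or not _NUMBER_RE.fullmatch(minor):
        raise ValueError(f"not a valid version: {text!r}")
    return int(major), int(minor)


def is_backward_compatible(server: str, required: str) -> bool:
    """True if a server at version "Z.W" can serve a client written for "Z.Q", Q <= W."""
    server_major, server_minor = parse_version(server)
    required_major, required_minor = parse_version(required)
    # A different ver-major signals a backward-incompatible protocol change.
    return server_major == required_major and server_minor >= required_minor
```

Under this reading, `is_backward_compatible("2.4", "2.0")` holds, while `is_backward_compatible("3.5", "2.1")` does not, matching the examples in the comments above.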
- - response-deletescript = response-oknobye - - response-getscript = (sieve-script CRLF response-ok) / - response-nobye - - response-havespace = response-oknobye - - response-listscripts = *(sieve-name [SP "ACTIVE"] CRLF) - response-oknobye - ;; ACTIVE may only occur with one sieve-name - - response-logout = response-oknobye - - response-unauthenticate= response-oknobye - ;; "NO" response can only be returned when - ;; the command is issued in a wrong state - ;; or has a wrong number of parameters - - response-ok = "OK" [SP "(" resp-code ")"] - [SP string] CRLF - ;; The string contains human-readable text - ;; encoded as UTF-8. - - response-nobye = ("NO" / "BYE") [SP "(" resp-code ")"] - [SP string] CRLF - ;; The string contains human-readable text - ;; encoded as UTF-8. - - - -Melnikov & Martin Standards Track [Page 36] - -RFC 5804 ManageSieve July 2010 - - - response-oknobye = response-ok / response-nobye - - response-noop = response-ok - - response-putscript = response-oknobye - - response-checkscript = response-oknobye - - response-renamescript = response-oknobye - - response-setactive = response-oknobye - - response-starttls = (response-ok response-capability) / - response-nobye - - sieve-name = string - ;; See Section 1.6 for the full list of - ;; prohibited characters. - ;; Empty string is not allowed. - - active-sieve-name = string - ;; See Section 1.6 for the full list of - ;; prohibited characters. - ;; This is similar to <sieve-name>, but - ;; empty string is allowed and has a special - ;; meaning. - - string = quoted / literal-c2s / literal-s2c - ;; literal-c2s is only allowed when sent - ;; from the client to the server. - ;; literal-s2c is only allowed when sent - ;; from the server to the client. - ;; quoted is allowed in either direction. - -5. Security Considerations - - The AUTHENTICATE command uses SASL [SASL] to provide authentication - and authorization services. Integrity and privacy services can be - provided by [SASL] and/or [TLS]. 
When a SASL mechanism is used, the - security considerations for that mechanism apply. - - This protocol's transactions are susceptible to passive observers or - man-in-the-middle attacks that alter the data, unless the optional - encryption and integrity services of the SASL (via the AUTHENTICATE - command) and/or [TLS] (via the STARTTLS command) are enabled, or an - external security mechanism is used for protection. It may be useful - to allow configuration of both clients and servers to refuse to - transfer sensitive information in the absence of strong encryption. - - - -Melnikov & Martin Standards Track [Page 37] - -RFC 5804 ManageSieve July 2010 - - - If an implementation supports SASL mechanisms that are vulnerable to - passive eavesdropping attacks (such as [PLAIN]), then the - implementation MUST support at least one configuration where these - SASL mechanisms are not advertised or used without the presence of an - external security layer such as [TLS]. - - Some response codes returned on failed AUTHENTICATE command may - disclose whether or not the username is valid (e.g., TRANSITION- - NEEDED), so server implementations SHOULD provide the ability to - disable these features (or make them not conditional on a per-user - basis) for sites concerned about such disclosure. In the case of - ENCRYPT-NEEDED, if it is applied to all identities then no extra - information is disclosed, but if it is applied on a per-user basis it - can disclose information. - - A compromised or malicious server can use the TRANSITION-NEEDED - response code to force the client that is configured to use a - mechanism that does not disclose the user's password to the server - (e.g., Kerberos), to send the bare password to the server. Clients - SHOULD have the ability to disable the password transition feature, - or disclose that risk to the user and offer the user an option of how - to proceed. - -6. 
IANA Considerations - - IANA has reserved TCP port number 4190 for use with the ManageSieve - protocol described in this document. - - IANA has registered the "sieve" URI scheme defined in Section 3 of - this document. - - IANA has registered "sieve" in the "GSSAPI/Kerberos/SASL Service - Names" registry. - - IANA has created a new registry for ManageSieve capabilities. The - registration template for ManageSieve capabilities is specified in - Section 6.1. ManageSieve protocol capabilities MUST be specified in - a Standards-Track or IESG-approved Experimental RFC. - - IANA has created a new registry for ManageSieve response codes. The - registration template for ManageSieve response codes is specified in - Section 6.3. ManageSieve protocol response codes MUST be specified - in a Standards-Track or IESG-approved Experimental RFC. - - - - - - - - -Melnikov & Martin Standards Track [Page 38] - -RFC 5804 ManageSieve July 2010 - - -6.1. ManageSieve Capability Registration Template - - To: iana@iana.org - Subject: ManageSieve Capability Registration - - Please register the following ManageSieve capability: - - Capability name: - Description: - Relevant publications: - Person & email address to contact for further information: - Author/Change controller: - -6.2. Registration of Initial ManageSieve Capabilities - - To: iana@iana.org - Subject: ManageSieve Capability Registration - - Please register the following ManageSieve capabilities: - - Capability name: IMPLEMENTATION - Description: Its value contains the name of the server - implementation and its version. - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: SASL - Description: Its value contains a space-separated list of SASL - mechanisms supported by the server. - Relevant publications: this RFC, Sections 1.7 and 2.1. 
- Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: SIEVE - Description: Its value contains a space-separated list of supported - SIEVE extensions. - Relevant publications: this RFC, Section 1.7. Also [SIEVE]. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - -Melnikov & Martin Standards Track [Page 39] - -RFC 5804 ManageSieve July 2010 - - - Capability name: STARTTLS - Description: This capability is returned if the server supports TLS - (STARTTLS command). - Relevant publications: this RFC, Sections 1.7 and 2.2. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: NOTIFY - Description: This capability is returned if the server supports the - 'enotify' [NOTIFY] Sieve extension. - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: MAXREDIRECTS - Description: This capability returns the limit on the number of - Sieve "redirect" actions a script can perform during a - single evaluation. The value is a non-negative number - represented as a ManageSieve string. - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: LANGUAGE - Description: The language (<Language-Tag> from [RFC5646]) currently - used for human-readable error messages. - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. 
- - Capability name: OWNER - Description: Its value contains the UTF-8-encoded name of the - currently logged-in user ("authorization identity" - according to RFC 4422). - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - - -Melnikov & Martin Standards Track [Page 40] - -RFC 5804 ManageSieve July 2010 - - - Capability name: VERSION - Description: This capability is returned if the server is compliant - with RFC 5804; i.e., that it supports RENAMESCRIPT, - CHECKSCRIPT, and NOOP commands. - Relevant publications: this RFC, Sections 2.11, 2.12, and 2.13. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - -6.3. ManageSieve Response Code Registration Template - - To: iana@iana.org - Subject: ManageSieve Response Code Registration - - Please register the following ManageSieve response code: - - Response Code: - Arguments (use ABNF to specify syntax, or the word NONE if none - can be specified): - Purpose: - Published Specification(s): - Person & email address to contact for further information: - Author/Change controller: - -6.4. Registration of Initial ManageSieve Response Codes - - To: iana@iana.org - Subject: ManageSieve Response Code Registration - - Please register the following ManageSieve response codes: - - Response Code: AUTH-TOO-WEAK - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: This response code is returned in the NO response from - an AUTHENTICATE command. It indicates that site - security policy forbids the use of the requested - mechanism for the specified authentication identity. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. 
- - - - - - - - - -Melnikov & Martin Standards Track [Page 41] - -RFC 5804 ManageSieve July 2010 - - - Response Code: ENCRYPT-NEEDED - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: This response code is returned in the NO response from - an AUTHENTICATE command. It indicates that site - security policy requires the use of a strong - encryption mechanism for the specified authentication - identity and mechanism. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: QUOTA - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: If this response code is returned in the NO/BYE - response, it means that the command would have placed - the user above the site-defined quota constraints. If - this response code is returned in the OK response, it - can mean that the user is near its quota or that the - user exceeded its quota, but the server supports soft - quotas. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: QUOTA/MAXSCRIPTS - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: If this response code is returned in the NO/BYE - response, it means that the command would have placed - the user above the site-defined limit on the number of - Sieve scripts. If this response code is returned in - the OK response, it can mean that the user is near its - quota or that the user exceeded its quota, but the - server supports soft quotas. This response code is a - more specific version of the QUOTA response code. 
- Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - -Melnikov & Martin Standards Track [Page 42] - -RFC 5804 ManageSieve July 2010 - - - Response Code: QUOTA/MAXSIZE - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: If this response code is returned in the NO/BYE - response, it means that the command would have placed - the user above the site-defined maximum script size. - If this response code is returned in the OK response, - it can mean that the user is near its quota or that - the user exceeded its quota, but the server supports - soft quotas. This response code is a more specific - version of the QUOTA response code. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: REFERRAL - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): <sieveurl> - Purpose: This response code may be returned with a BYE result - from any command, and includes a mandatory parameter - that indicates what server to access to manage this - user's Sieve scripts. The server will be specified by - a Sieve URL (see Section 3). The scriptname portion - of the URL MUST NOT be specified. The client should - authenticate to the specified server and use it for - all further commands in the current session. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. 
- - Response Code: SASL - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): <string> - Purpose: This response code can occur in the OK response to a - successful AUTHENTICATE command and includes the - optional final server response data from the server as - specified by [SASL]. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - -Melnikov & Martin Standards Track [Page 43] - -RFC 5804 ManageSieve July 2010 - - - Response Code: TRANSITION-NEEDED - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: This response code occurs in a NO response of an - AUTHENTICATE command. It indicates that the user name - is valid, but the entry in the authentication database - needs to be updated in order to permit authentication - with the specified mechanism. This is typically done - by establishing a secure channel using TLS, followed - by authenticating once using the [PLAIN] - authentication mechanism. The selected mechanism - SHOULD then work for authentications in subsequent - sessions. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: TRYLATER - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: A command failed due to a temporary server failure. - The client MAY continue using local information and - try the command later. This response code only make - sense when returned in a NO/BYE response. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. 
- - Response Code: ACTIVE - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: A command failed because it is not allowed on the - active script, for example, DELETESCRIPT on the active - script. This response code only makes sense when - returned in a NO/BYE response. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - - - -Melnikov & Martin Standards Track [Page 44] - -RFC 5804 ManageSieve July 2010 - - - Response Code: NONEXISTENT - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: A command failed because the referenced script name - doesn't exist. This response code only makes sense - when returned in a NO/BYE response. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: ALREADYEXISTS - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: A command failed because the referenced script name - already exists. This response code only makes sense - when returned in a NO/BYE response. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: WARNINGS - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: This response code MAY be returned by the server in - the OK response (but it might be returned with the NO/ - BYE response as well) and signals the client that even - though the script is syntactically valid, it might - contain errors not intended by the script writer. 
- Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: TAG - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): string - Purpose: This response code name is followed by a string - specified in the command that caused this response. - It is typically used for client state synchronization. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - -Melnikov & Martin Standards Track [Page 45] - -RFC 5804 ManageSieve July 2010 - - -7. Internationalization Considerations - - The LANGUAGE capability (see Section 1.7) allows a client to discover - the current language used in all human-readable responses that might - be returned at the end of any OK/NO/BYE response. Human-readable - text in OK responses typically doesn't need to be shown to the user, - unless it is returned in response to a PUTSCRIPT or CHECKSCRIPT - command that also contains the WARNINGS response code (Section 1.3). - Human-readable text from NO/BYE responses is intended be shown to the - user, unless the client can automatically handle failure of the - command that caused such a response. Clients SHOULD use response - codes (Section 1.3) for automatic error handling. Response codes MAY - also be used by the client to present error messages in a language - understood by the user, for example, if the LANGUAGE capability - doesn't return a language understood by the user. - - Note that the human-readable text from OK (WARNINGS) or NO/BYE - responses for PUTSCRIPT/CHECKSCRIPT commands is intended for advanced - users that understand Sieve language. 
Such advanced users are often - sophisticated enough to be able to handle whatever language the - server is using, even if it is not their preferred language, and will - want to see error/warning text no matter what language the server - puts it in. - - A client that generates Sieve script automatically, for example, if - the script is generated without user intervention or from a UI that - presents an abstract list of conditions and corresponding actions, - SHOULD NOT present warning/error messages to the user, because the - user might not even be aware that the client is using Sieve - underneath. However, if the client has a debugging mode, such - warnings/errors SHOULD be available in the debugging mode. - - Note that this document doesn't provide a way to modify the currently - used language. It is expected that a future extension will address - that. - -8. Acknowledgements - - Thanks to Simon Josefsson, Larry Greenfield, Allen Johnson, Chris - Newman, Lyndon Nerenberg, Tim Showalter, Sarah Robeson, Walter Wong, - Barry Leiba, Arnt Gulbrandsen, Stephan Bosch, Ken Murchison, Phil - Pennock, Ned Freed, Jeffrey Hutzelman, Mark E. Mallett, Dilyan - Palauzov, Dave Cridland, Aaron Stone, Robert Burrell Donkin, Patrick - Ben Koetter, Bjoern Hoehrmann, Martin Duerst, Pasi Eronen, Magnus - Westerlund, Tim Polk, and Julien Coloos for help with this document. - Special thank you to Phil Pennock for providing text for the NOOP - command, as well as finding various bugs in the document. - - - - -Melnikov & Martin Standards Track [Page 46] - -RFC 5804 ManageSieve July 2010 - - -9. References - -9.1. Normative References - - [ABNF] Crocker, D. and P. Overell, "Augmented BNF for Syntax - Specifications: ABNF", STD 68, RFC 5234, January 2008. - - [ACAP] Newman, C. and J. Myers, "ACAP -- Application - Configuration Access Protocol", RFC 2244, November - 1997. - - [BASE64] Josefsson, S., "The Base16, Base32, and Base64 Data - Encodings", RFC 4648, October 2006. 
- - [DNS-SRV] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR - for specifying the location of services (DNS SRV)", - RFC 2782, February 2000. - - [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [NET-UNICODE] Klensin, J. and M. Padlipsky, "Unicode Format for - Network Interchange", RFC 5198, March 2008. - - [NOTIFY] Melnikov, A., Leiba, B., Segmuller, W., and T. Martin, - "Sieve Email Filtering: Extension for Notifications", - RFC 5435, January 2009. - - [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and - Languages", BCP 18, RFC 2277, January 1998. - - [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version - 6 (IPv6) Specification", RFC 2460, December 1998. - - [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, - "Internationalizing Domain Names in Applications - (IDNA)", RFC 3490, March 2003. - - [RFC4519] Sciberras, A., "Lightweight Directory Access Protocol - (LDAP): Schema for User Applications", RFC 4519, June - 2006. - - [RFC5646] Phillips, A. and M. Davis, "Tags for Identifying - Languages", BCP 47, RFC 5646, September 2009. - - [RFC791] Postel, J., "Internet Protocol", STD 5, RFC 791, - September 1981. - - - - -Melnikov & Martin Standards Track [Page 47] - -RFC 5804 ManageSieve July 2010 - - - [SASL] Melnikov, A. and K. Zeilenga, "Simple Authentication - and Security Layer (SASL)", RFC 4422, June 2006. - - [SASLprep] Zeilenga, K., "SASLprep: Stringprep Profile for User - Names and Passwords", RFC 4013, February 2005. - - [SCRAM] Menon-Sen, A., Melnikov, A., Newman, C., and N. - Williams, "Salted Challenge Response Authentication - Mechanism (SCRAM) SASL and GSS-API Mechanisms", RFC - 5802, July 2010. - - [SIEVE] Guenther, P. and T. Showalter, "Sieve: An Email - Filtering Language", RFC 5228, January 2008. - - [StringPrep] Hoffman, P. and M. Blanchet, "Preparation of - Internationalized Strings ("stringprep")", RFC 3454, - December 2002. - - [TLS] Dierks, T. 
and E. Rescorla, "The Transport Layer - Security (TLS) Protocol Version 1.2", RFC 5246, August - 2008. - - [URI-GEN] Berners-Lee, T., Fielding, R., and L. Masinter, - "Uniform Resource Identifier (URI): Generic Syntax", - STD 66, RFC 3986, January 2005. - - [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO - 10646", STD 63, RFC 3629, November 2003. - - [X509] Cooper, D., Santesson, S., Farrell, S., Boeyen, S., - Housley, R., and W. Polk, "Internet X.509 Public Key - Infrastructure Certificate and Certificate Revocation - List (CRL) Profile", RFC 5280, May 2008. - - [X509-SRV] Santesson, S., "Internet X.509 Public Key - Infrastructure Subject Alternative Name for Expression - of Service Name", RFC 4985, August 2007. - -9.2. Informative References - - [DIGEST-MD5] Leach, P. and C. Newman, "Using Digest Authentication - as a SASL Mechanism", RFC 2831, May 2000. - - [GSSAPI] Melnikov, A., "The Kerberos V5 ("GSSAPI") Simple - Authentication and Security Layer (SASL) Mechanism", - RFC 4752, November 2006. - - - - - -Melnikov & Martin Standards Track [Page 48] - -RFC 5804 ManageSieve July 2010 - - - [I-HAVE] Freed, N., "Sieve Email Filtering: Ihave Extension", - RFC 5463, March 2009. - - [IMAP] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - - VERSION 4rev1", RFC 3501, March 2003. - - [LDAP] Zeilenga, K., "Lightweight Directory Access Protocol - (LDAP): Technical Specification Road Map", RFC 4510, - June 2006. - - [PLAIN] Zeilenga, K., "The PLAIN Simple Authentication and - Security Layer (SASL) Mechanism", RFC 4616, August - 2006. - -Authors' Addresses - - Alexey Melnikov (editor) - Isode Limited - 5 Castle Business Village - 36 Station Road - Hampton, Middlesex TW12 2BX - UK - - EMail: Alexey.Melnikov@isode.com - - - Tim Martin - BeThereBeSquare, Inc. - 672 Haight st. 
- San Francisco, CA 94117 - USA - - Phone: +1 510 260-4175 - EMail: timmartin@alumni.cmu.edu - - - - - - - - - - - - - - - - - -Melnikov & Martin Standards Track [Page 49] - diff --git a/proto/rfc822.txt b/proto/rfc822.txt @@ -1,2901 +0,0 @@ - - - - - - - RFC # 822 - - Obsoletes: RFC #733 (NIC #41952) - - - - - - - - - - - - - STANDARD FOR THE FORMAT OF - - ARPA INTERNET TEXT MESSAGES - - - - - - - August 13, 1982 - - - - - - - Revised by - - David H. Crocker - - - Dept. of Electrical Engineering - University of Delaware, Newark, DE 19711 - Network: DCrocker @ UDel-Relay - - - - - - - - - - - - - - - - Standard for ARPA Internet Text Messages - - - TABLE OF CONTENTS - - - PREFACE .................................................... ii - - 1. INTRODUCTION ........................................... 1 - - 1.1. Scope ............................................ 1 - 1.2. Communication Framework .......................... 2 - - 2. NOTATIONAL CONVENTIONS ................................. 3 - - 3. LEXICAL ANALYSIS OF MESSAGES ........................... 5 - - 3.1. General Description .............................. 5 - 3.2. Header Field Definitions ......................... 9 - 3.3. Lexical Tokens ................................... 10 - 3.4. Clarifications ................................... 11 - - 4. MESSAGE SPECIFICATION .................................. 17 - - 4.1. Syntax ........................................... 17 - 4.2. Forwarding ....................................... 19 - 4.3. Trace Fields ..................................... 20 - 4.4. Originator Fields ................................ 21 - 4.5. Receiver Fields .................................. 23 - 4.6. Reference Fields ................................. 23 - 4.7. Other Fields ..................................... 24 - - 5. DATE AND TIME SPECIFICATION ............................ 26 - - 5.1. Syntax ........................................... 26 - 5.2. Semantics ........................................ 
26 - - 6. ADDRESS SPECIFICATION .................................. 27 - - 6.1. Syntax ........................................... 27 - 6.2. Semantics ........................................ 27 - 6.3. Reserved Address ................................. 33 - - 7. BIBLIOGRAPHY ........................................... 34 - - - APPENDIX - - A. EXAMPLES ............................................... 36 - B. SIMPLE FIELD PARSING ................................... 40 - C. DIFFERENCES FROM RFC #733 .............................. 41 - D. ALPHABETICAL LISTING OF SYNTAX RULES ................... 44 - - - August 13, 1982 - i - RFC #822 - - - - - Standard for ARPA Internet Text Messages - - - PREFACE - - - By 1977, the Arpanet employed several informal standards for - the text messages (mail) sent among its host computers. It was - felt necessary to codify these practices and provide for those - features that seemed imminent. The result of that effort was - Request for Comments (RFC) #733, "Standard for the Format of ARPA - Network Text Message", by Crocker, Vittal, Pogran, and Henderson. - The specification attempted to avoid major changes in existing - software, while permitting several new features. - - This document revises the specifications in RFC #733, in - order to serve the needs of the larger and more complex ARPA - Internet. Some of RFC #733's features failed to gain adequate - acceptance. In order to simplify the standard and the software - that follows it, these features have been removed. A different - addressing scheme is used, to handle the case of inter-network - mail; and the concept of re-transmission has been introduced. - - This specification is intended for use in the ARPA Internet. - However, an attempt has been made to free it of any dependence on - that environment, so that it can be applied to other network text - message systems. 
- - The specification of RFC #733 took place over the course of - one year, using the ARPANET mail environment, itself, to provide - an on-going forum for discussing the capabilities to be included. - More than twenty individuals, from across the country, partici- - pated in the original discussion. The development of this - revised specification has, similarly, utilized network mail-based - group discussion. Both specification efforts greatly benefited - from the comments and ideas of the participants. - - The syntax of the standard, in RFC #733, was originally - specified in the Backus-Naur Form (BNF) meta-language. Ken L. - Harrenstien, of SRI International, was responsible for re-coding - the BNF into an augmented BNF that makes the representation - smaller and easier to understand. - - - - - - - - - - - - - August 13, 1982 - ii - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - 1. INTRODUCTION - - 1.1. SCOPE - - This standard specifies a syntax for text messages that are - sent among computer users, within the framework of "electronic - mail". The standard supersedes the one specified in ARPANET - Request for Comments #733, "Standard for the Format of ARPA Net- - work Text Messages". - - In this context, messages are viewed as having an envelope - and contents. The envelope contains whatever information is - needed to accomplish transmission and delivery. The contents - compose the object to be delivered to the recipient. This stan- - dard applies only to the format and some of the semantics of mes- - sage contents. It contains no specification of the information - in the envelope. - - However, some message systems may use information from the - contents to create the envelope. It is intended that this stan- - dard facilitate the acquisition of such information by programs. - - Some message systems may store messages in formats that - differ from the one specified in this standard. 
This specifica- - tion is intended strictly as a definition of what message content - format is to be passed BETWEEN hosts. - - Note: This standard is NOT intended to dictate the internal for- - mats used by sites, the specific message system features - that they are expected to support, or any of the charac- - teristics of user interface programs that create or read - messages. - - A distinction should be made between what the specification - REQUIRES and what it ALLOWS. Messages can be made complex and - rich with formally-structured components of information or can be - kept small and simple, with a minimum of such information. Also, - the standard simplifies the interpretation of differing visual - formats in messages; only the visual aspect of a message is - affected and not the interpretation of information within it. - Implementors may choose to retain such visual distinctions. - - The formal definition is divided into four levels. The bot- - tom level describes the meta-notation used in this document. The - second level describes basic lexical analyzers that feed tokens - to higher-level parsers. Next is an overall specification for - messages; it permits distinguishing individual fields. Finally, - there is definition of the contents of several structured fields. - - - - August 13, 1982 - 1 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - 1.2. COMMUNICATION FRAMEWORK - - Messages consist of lines of text. No special provisions - are made for encoding drawings, facsimile, speech, or structured - text. No significant consideration has been given to questions - of data compression or to transmission and storage efficiency, - and the standard tends to be free with the number of bits con- - sumed. For example, field names are specified as free text, - rather than special terse codes. - - A general "memo" framework is used. 
That is, a message con- - sists of some information in a rigid format, followed by the main - part of the message, with a format that is not specified in this - document. The syntax of several fields of the rigidly-formated - ("headers") section is defined in this specification; some of - these fields must be included in all messages. - - The syntax that distinguishes between header fields is - specified separately from the internal syntax for particular - fields. This separation is intended to allow simple parsers to - operate on the general structure of messages, without concern for - the detailed structure of individual header fields. Appendix B - is provided to facilitate construction of these parsers. - - In addition to the fields specified in this document, it is - expected that other fields will gain common use. As necessary, - the specifications for these "extension-fields" will be published - through the same mechanism used to publish this document. Users - may also wish to extend the set of fields that they use - privately. Such "user-defined fields" are permitted. - - The framework severely constrains document tone and appear- - ance and is primarily useful for most intra-organization communi- - cations and well-structured inter-organization communication. - It also can be used for some types of inter-process communica- - tion, such as simple file transfer and remote job entry. A more - robust framework might allow for multi-font, multi-color, multi- - dimension encoding of information. A less robust one, as is - present in most single-machine message systems, would more - severely constrain the ability to add fields and the decision to - include specific fields. In contrast with paper-based communica- - tion, it is interesting to note that the RECEIVER of a message - can exercise an extraordinary amount of control over the - message's appearance. 
The amount of actual control available to - message receivers is contingent upon the capabilities of their - individual message systems. - - - - - - August 13, 1982 - 2 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - 2. NOTATIONAL CONVENTIONS - - This specification uses an augmented Backus-Naur Form (BNF) - notation. The differences from standard BNF involve naming rules - and indicating repetition and "local" alternatives. - - 2.1. RULE NAMING - - Angle brackets ("<", ">") are not used, in general. The - name of a rule is simply the name itself, rather than "<name>". - Quotation-marks enclose literal text (which may be upper and/or - lower case). Certain basic rules are in uppercase, such as - SPACE, TAB, CRLF, DIGIT, ALPHA, etc. Angle brackets are used in - rule definitions, and in the rest of this document, whenever - their presence will facilitate discerning the use of rule names. - - 2.2. RULE1 / RULE2: ALTERNATIVES - - Elements separated by slash ("/") are alternatives. There- - fore "foo / bar" will accept foo or bar. - - 2.3. (RULE1 RULE2): LOCAL ALTERNATIVES - - Elements enclosed in parentheses are treated as a single - element. Thus, "(elem (foo / bar) elem)" allows the token - sequences "elem foo elem" and "elem bar elem". - - 2.4. *RULE: REPETITION - - The character "*" preceding an element indicates repetition. - The full form is: - - <l>*<m>element - - indicating at least <l> and at most <m> occurrences of element. - Default values are 0 and infinity so that "*(element)" allows any - number, including zero; "1*element" requires at least one; and - "1*2element" allows one or two. - - 2.5. [RULE]: OPTIONAL - - Square brackets enclose optional elements; "[foo bar]" is - equivalent to "*1(foo bar)". - - 2.6. NRULE: SPECIFIC REPETITION - - "<n>(element)" is equivalent to "<n>*<n>(element)"; that is, - exactly <n> occurrences of (element). Thus 2DIGIT is a 2-digit - number, and 3ALPHA is a string of three alphabetic characters. 
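The repetition prefixes of sections 2.4 to 2.6 can be read mechanically. The following is a minimal sketch, not part of RFC 822: the function name parse_repeat and its tuple result are assumptions made for illustration. It turns a prefixed item into its (minimum, maximum, element) triple, with None standing for "infinity".

```python
import re

# Hypothetical helper (not defined by RFC 822): split an augmented-BNF item
# such as "1*2element" into its repetition bounds and the element itself.
_REPEAT = re.compile(r"^(\d*)(\*?)(\d*)(.+)$")


def parse_repeat(item):
    """Return (minimum, maximum, element); maximum is None for 'infinity'."""
    item = item.strip()
    # Section 2.5: "[foo bar]" is equivalent to "*1(foo bar)".
    if item.startswith("[") and item.endswith("]"):
        return (0, 1, "(" + item[1:-1] + ")")
    match = _REPEAT.match(item)
    if match is None:
        raise ValueError("empty item")
    low, star, high, element = match.groups()
    if star:
        # Section 2.4: <l>*<m>element, defaults 0 and infinity.
        minimum = int(low) if low else 0
        maximum = int(high) if high else None
    else:
        # Section 2.6: <n>element means exactly n; a bare element means one.
        minimum = maximum = int(low) if low else 1
    return (minimum, maximum, element)


print(parse_repeat("1*2element"))
print(parse_repeat("2DIGIT"))
```

The sketch only decodes the prefix; it does not parse the element or nested groups.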
- - - August 13, 1982 - 3 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - 2.7. #RULE: LISTS - - A construct "#" is defined, similar to "*", as follows: - - <l>#<m>element - - indicating at least <l> and at most <m> elements, each separated - by one or more commas (","). This makes the usual form of lists - very easy; a rule such as '(element *("," element))' can be shown - as "1#element". Wherever this construct is used, null elements - are allowed, but do not contribute to the count of elements - present. That is, "(element),,(element)" is permitted, but - counts as only two elements. Therefore, where at least one ele- - ment is required, at least one non-null element must be present. - Default values are 0 and infinity so that "#(element)" allows any - number, including zero; "1#element" requires at least one; and - "1#2element" allows one or two. - - 2.8. ; COMMENTS - - A semi-colon, set off some distance to the right of rule - text, starts a comment that continues to the end of line. This - is a simple way of including useful notes in parallel with the - specifications. - - - - - - - - - - - - - - - - - - - - - - - - - - - - August 13, 1982 - 4 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - 3. LEXICAL ANALYSIS OF MESSAGES - - 3.1. GENERAL DESCRIPTION - - A message consists of header fields and, optionally, a body. - The body is simply a sequence of lines containing ASCII charac- - ters. It is separated from the headers by a null line (i.e., a - line with nothing preceding the CRLF). - - 3.1.1. LONG HEADER FIELDS - - Each header field can be viewed as a single, logical line of - ASCII characters, comprising a field-name and a field-body. - For convenience, the field-body portion of this conceptual - entity can be split into a multiple-line representation; this - is called "folding". 
The general rule is that wherever there - may be linear-white-space (NOT simply LWSP-chars), a CRLF - immediately followed by AT LEAST one LWSP-char may instead be - inserted. Thus, the single line - - To: "Joe & J. Harvey" <ddd @Org>, JJV @ BBN - - can be represented as: - - To: "Joe & J. Harvey" <ddd @ Org>, - JJV@BBN - - and - - To: "Joe & J. Harvey" - <ddd@ Org>, JJV - @BBN - - and - - To: "Joe & - J. Harvey" <ddd @ Org>, JJV @ BBN - - The process of moving from this folded multiple-line - representation of a header field to its single line represen- - tation is called "unfolding". Unfolding is accomplished by - regarding CRLF immediately followed by a LWSP-char as - equivalent to the LWSP-char. - - Note: While the standard permits folding wherever linear- - white-space is permitted, it is recommended that struc- - tured fields, such as those containing addresses, limit - folding to higher-level syntactic breaks. For address - fields, it is recommended that such folding occur - - - August 13, 1982 - 5 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - between addresses, after the separating comma. - - 3.1.2. STRUCTURE OF HEADER FIELDS - - Once a field has been unfolded, it may be viewed as being com- - posed of a field-name followed by a colon (":"), followed by a - field-body, and terminated by a carriage-return/line-feed. - The field-name must be composed of printable ASCII characters - (i.e., characters that have values between 33. and 126., - decimal, except colon). The field-body may be composed of any - ASCII characters, except CR or LF. (While CR and/or LF may be - present in the actual text, they are removed by the action of - unfolding the field.) - - Certain field-bodies of headers may be interpreted according - to an internal syntax that some systems may wish to parse. - These fields are called "structured fields". Examples - include fields containing dates and addresses. 
Other fields, - such as "Subject" and "Comments", are regarded simply as - strings of text. - - Note: Any field which has a field-body that is defined as - other than simply <text> is to be treated as a struc- - tured field. - - Field-names, unstructured field bodies and structured - field bodies each are scanned by their own, independent - "lexical" analyzers. - - 3.1.3. UNSTRUCTURED FIELD BODIES - - For some fields, such as "Subject" and "Comments", no struc- - turing is assumed, and they are treated simply as <text>s, as - in the message body. Rules of folding apply to these fields, - so that such field bodies which occupy several lines must - therefore have the second and successive lines indented by at - least one LWSP-char. - - 3.1.4. STRUCTURED FIELD BODIES - - To aid in the creation and reading of structured fields, the - free insertion of linear-white-space (which permits folding - by inclusion of CRLFs) is allowed between lexical tokens. - Rather than obscuring the syntax specifications for these - structured fields with explicit syntax for this linear-white- - space, the existence of another "lexical" analyzer is assumed. - This analyzer does not apply for unstructured field bodies - that are simply strings of text, as described above. The - analyzer provides an interpretation of the unfolded text - - - August 13, 1982 - 6 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - composing the body of the field as a sequence of lexical sym- - bols. - - These symbols are: - - - individual special characters - - quoted-strings - - domain-literals - - comments - - atoms - - The first four of these symbols are self-delimiting. Atoms - are not; they are delimited by the self-delimiting symbols and - by linear-white-space. For the purposes of regenerating - sequences of atoms and quoted-strings, exactly one SPACE is - assumed to exist, and should be used, between them. 
(Also, in - the "Clarifications" section on "White Space", below, note the - rules about treatment of multiple contiguous LWSP-chars.) - - So, for example, the folded body of an address field - - ":sysmail"@ Some-Group. Some-Org, - Muhammed.(I am the greatest) Ali @(the)Vegas.WBA - - - - - - - - - - - - - - - - - - - - - - - - - - - - - August 13, 1982 - 7 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - is analyzed into the following lexical symbols and types: - - :sysmail quoted string - @ special - Some-Group atom - . special - Some-Org atom - , special - Muhammed atom - . special - (I am the greatest) comment - Ali atom - @ atom - (the) comment - Vegas atom - . special - WBA atom - - The canonical representations for the data in these addresses - are the following strings: - - ":sysmail"@Some-Group.Some-Org - - and - - Muhammed.Ali@Vegas.WBA - - Note: For purposes of display, and when passing such struc- - tured information to other systems, such as mail proto- - col services, there must be NO linear-white-space - between <word>s that are separated by period (".") or - at-sign ("@") and exactly one SPACE between all other - <word>s. Also, headers should be in a folded form. - - - - - - - - - - - - - - - - - - - August 13, 1982 - 8 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - 3.2. HEADER FIELD DEFINITIONS - - These rules show a field meta-syntax, without regard for the - particular type or internal syntax. Their purpose is to permit - detection of fields; also, they present to higher-level parsers - an image of each field as fitting on one line. 
-
-     field       =  field-name ":" [ field-body ] CRLF
-
-     field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">
-
-     field-body  =  field-body-contents
-                    [CRLF LWSP-char field-body]
-
-     field-body-contents =
-                   <the ASCII characters making up the field-body, as
-                    defined in the following sections, and consisting
-                    of combinations of atom, quoted-string, and
-                    specials tokens, or else consisting of texts>
-
-
-     3.3.  LEXICAL TOKENS
-
-          The following rules are used to define an underlying lexical
-     analyzer, which feeds tokens to higher level parsers.  See the
-     ANSI references, in the Bibliography.
-
-                                                 ; (  Octal, Decimal.)
-     CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)
-     ALPHA       =  <any ASCII alphabetic character>
-                                                 ; (101-132, 65.- 90.)
-                                                 ; (141-172, 97.-122.)
-     DIGIT       =  <any ASCII decimal digit>    ; ( 60- 71, 48.- 57.)
-     CTL         =  <any ASCII control           ; (  0- 37,  0.- 31.)
-                     character and DEL>          ; (    177,     127.)
-     CR          =  <ASCII CR, carriage return>  ; (     15,      13.)
-     LF          =  <ASCII LF, linefeed>         ; (     12,      10.)
-     SPACE       =  <ASCII SP, space>            ; (     40,      32.)
-     HTAB        =  <ASCII HT, horizontal-tab>   ; (     11,       9.)
-     <">         =  <ASCII quote mark>           ; (     42,      34.)
-     CRLF        =  CR LF
-
-     LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE
-
-     linear-white-space =  1*([CRLF] LWSP-char)  ; semantics = SPACE
-                                                 ; CRLF => folding
-
-     specials    =  "(" / ")" / "<" / ">" / "@"  ; Must be in quoted-
-                 /  "," / ";" / ":" / "\" / <">  ;  string, to use
-                 /  "." / "[" / "]"              ;  within a word.
-
-     delimiters  =  specials / linear-white-space / comment
-
-     text        =  <any CHAR, including bare    ; => atoms, specials,
-                     CR & bare LF, but NOT       ;  comments and
-                     including CRLF>             ;  quoted-strings are
-                                                 ;  NOT recognized.
-
-     atom        =  1*<any CHAR except specials, SPACE and CTLs>
-
-     quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
-                                                 ;   quoted chars.
- - qtext = <any CHAR excepting <">, ; => may be folded - "\" & CR, and including - linear-white-space> - - domain-literal = "[" *(dtext / quoted-pair) "]" - - - - - August 13, 1982 - 10 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - dtext = <any CHAR excluding "[", ; => may be folded - "]", "\" & CR, & including - linear-white-space> - - comment = "(" *(ctext / quoted-pair / comment) ")" - - ctext = <any CHAR excluding "(", ; => may be folded - ")", "\" & CR, & including - linear-white-space> - - quoted-pair = "\" CHAR ; may quote any char - - phrase = 1*word ; Sequence of words - - word = atom / quoted-string - - - 3.4. CLARIFICATIONS - - 3.4.1. QUOTING - - Some characters are reserved for special interpretation, such - as delimiting lexical tokens. To permit use of these charac- - ters as uninterpreted data, a quoting mechanism is provided. - To quote a character, precede it with a backslash ("\"). - - This mechanism is not fully general. Characters may be quoted - only within a subset of the lexical constructs. In particu- - lar, quoting is limited to use within: - - - quoted-string - - domain-literal - - comment - - Within these constructs, quoting is REQUIRED for CR and "\" - and for the character(s) that delimit the token (e.g., "(" and - ")" for a comment). However, quoting is PERMITTED for any - character. - - Note: In particular, quoting is NOT permitted within atoms. - For example when the local-part of an addr-spec must - contain a special character, a quoted string must be - used. Therefore, a specification such as: - - Full\ Name@Domain - - is not legal and must be specified as: - - "Full Name"@Domain - - - August 13, 1982 - 11 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - 3.4.2. WHITE SPACE - - Note: In structured field bodies, multiple linear space ASCII - characters (namely HTABs and SPACEs) are treated as - single spaces and may freely surround any symbol. 
In - all header fields, the only place in which at least one - LWSP-char is REQUIRED is at the beginning of continua- - tion lines in a folded field. - - When passing text to processes that do not interpret text - according to this standard (e.g., mail protocol servers), then - NO linear-white-space characters should occur between a period - (".") or at-sign ("@") and a <word>. Exactly ONE SPACE should - be used in place of arbitrary linear-white-space and comment - sequences. - - Note: Within systems conforming to this standard, wherever a - member of the list of delimiters is allowed, LWSP-chars - may also occur before and/or after it. - - Writers of mail-sending (i.e., header-generating) programs - should realize that there is no network-wide definition of the - effect of ASCII HT (horizontal-tab) characters on the appear- - ance of text at another network host; therefore, the use of - tabs in message headers, though permitted, is discouraged. - - 3.4.3. COMMENTS - - A comment is a set of ASCII characters, which is enclosed in - matching parentheses and which is not within a quoted-string - The comment construct permits message originators to add text - which will be useful for human readers, but which will be - ignored by the formal semantics. Comments should be retained - while the message is subject to interpretation according to - this standard. However, comments must NOT be included in - other cases, such as during protocol exchanges with mail - servers. - - Comments nest, so that if an unquoted left parenthesis occurs - in a comment string, there must also be a matching right - parenthesis. When a comment acts as the delimiter between a - sequence of two lexical symbols, such as two atoms, it is lex- - ically equivalent with a single SPACE, for the purposes of - regenerating the sequence, such as when passing the sequence - onto a mail protocol server. Comments are detected as such - only within field-bodies of structured fields. 
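The nesting rule and the "lexically equivalent with a single SPACE" rule for comments can be illustrated with a small scanner. This is a sketch under assumptions, not code from rohrpost or from the RFC: the name strip_comments is hypothetical, the input is assumed to be an already unfolded structured field body, and collapsing the resulting runs of spaces is left to a later step.

```python
def strip_comments(body):
    """Replace each (possibly nested) comment with one SPACE.

    Quoted-strings are copied verbatim, so parentheses inside them are
    not treated as comment delimiters.
    """
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string (only relevant at depth 0)
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and (depth or in_quotes) and i + 1 < len(body):
            # quoted-pair: the following character is data, not a delimiter
            if not depth:
                out.append(body[i:i + 2])
            i += 2
            continue
        if depth:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    out.append(" ")  # the whole comment becomes one SPACE
        elif in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif ch == '"':
            in_quotes = True
            out.append(ch)
        elif ch == "(":
            depth = 1
        else:
            out.append(ch)
        i += 1
    if depth:
        raise ValueError("unbalanced parenthesis in comment")
    return "".join(out)


print(strip_comments("Muhammed.(I am the greatest) Ali @(the)Vegas.WBA"))
```

Tracking a depth counter rather than matching with a regular expression is what lets the scanner honour nested comments and quoted parentheses.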
- - If a comment is to be "folded" onto multiple lines, then the - syntax for folding must be adhered to. (See the "Lexical - - - August 13, 1982 - 12 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - Analysis of Messages" section on "Folding Long Header Fields" - above, and the section on "Case Independence" below.) Note - that the official semantics therefore do not "see" any - unquoted CRLFs that are in comments, although particular pars- - ing programs may wish to note their presence. For these pro- - grams, it would be reasonable to interpret a "CRLF LWSP-char" - as being a CRLF that is part of the comment; i.e., the CRLF is - kept and the LWSP-char is discarded. Quoted CRLFs (i.e., a - backslash followed by a CR followed by a LF) still must be - followed by at least one LWSP-char. - - 3.4.4. DELIMITING AND QUOTING CHARACTERS - - The quote character (backslash) and characters that delimit - syntactic units are not, generally, to be taken as data that - are part of the delimited or quoted unit(s). In particular, - the quotation-marks that define a quoted-string, the - parentheses that define a comment and the backslash that - quotes a following character are NOT part of the quoted- - string, comment or quoted character. A quotation-mark that is - to be part of a quoted-string, a parenthesis that is to be - part of a comment and a backslash that is to be part of either - must each be preceded by the quote-character backslash ("\"). - Note that the syntax allows any character to be quoted within - a quoted-string or comment; however only certain characters - MUST be quoted to be included as data. These characters are - the ones that are not part of the alternate text group (i.e., - ctext or qtext). - - The one exception to this rule is that a single SPACE is - assumed to exist between contiguous words in a phrase, and - this interpretation is independent of the actual number of - LWSP-chars that the creator places between the words. 
To - include more than one SPACE, the creator must make the LWSP- - chars be part of a quoted-string. - - Quotation marks that delimit a quoted string and backslashes - that quote the following character should NOT accompany the - quoted-string when the string is passed to processes that do - not interpret data according to this specification (e.g., mail - protocol servers). - - 3.4.5. QUOTED-STRINGS - - Where permitted (i.e., in words in structured fields) quoted- - strings are treated as a single symbol. That is, a quoted- - string is equivalent to an atom, syntactically. If a quoted- - string is to be "folded" onto multiple lines, then the syntax - for folding must be adhered to. (See the "Lexical Analysis of - - - August 13, 1982 - 13 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - Messages" section on "Folding Long Header Fields" above, and - the section on "Case Independence" below.) Therefore, the - official semantics do not "see" any bare CRLFs that are in - quoted-strings; however particular parsing programs may wish - to note their presence. For such programs, it would be rea- - sonable to interpret a "CRLF LWSP-char" as being a CRLF which - is part of the quoted-string; i.e., the CRLF is kept and the - LWSP-char is discarded. Quoted CRLFs (i.e., a backslash fol- - lowed by a CR followed by a LF) are also subject to rules of - folding, but the presence of the quoting character (backslash) - explicitly indicates that the CRLF is data to the quoted - string. Stripping off the first following LWSP-char is also - appropriate when parsing quoted CRLFs. - - 3.4.6. BRACKETING CHARACTERS - - There is one type of bracket which must occur in matched pairs - and may have pairs nested within each other: - - o Parentheses ("(" and ")") are used to indicate com- - ments. 
- - There are three types of brackets which must occur in matched - pairs, and which may NOT be nested: - - o Colon/semi-colon (":" and ";") are used in address - specifications to indicate that the included list of - addresses are to be treated as a group. - - o Angle brackets ("<" and ">") are generally used to - indicate the presence of a one machine-usable refer- - ence (e.g., delimiting mailboxes), possibly including - source-routing to the machine. - - o Square brackets ("[" and "]") are used to indicate the - presence of a domain-literal, which the appropriate - name-domain is to use directly, bypassing normal - name-resolution mechanisms. - - 3.4.7. CASE INDEPENDENCE - - Except as noted, alphabetic strings may be represented in any - combination of upper and lower case. The only syntactic units - - - - - - - - - August 13, 1982 - 14 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - which requires preservation of case information are: - - - text - - qtext - - dtext - - ctext - - quoted-pair - - local-part, except "Postmaster" - - When matching any other syntactic unit, case is to be ignored. - For example, the field-names "From", "FROM", "from", and even - "FroM" are semantically equal and should all be treated ident- - ically. - - When generating these units, any mix of upper and lower case - alphabetic characters may be used. The case shown in this - specification is suggested for message-creating processes. - - Note: The reserved local-part address unit, "Postmaster", is - an exception. When the value "Postmaster" is being - interpreted, it must be accepted in any mixture of - case, including "POSTMASTER", and "postmaster". - - 3.4.8. FOLDING LONG HEADER FIELDS - - Each header field may be represented on exactly one line con- - sisting of the name of the field and its body, and terminated - by a CRLF; this is what the parser sees. 
For readability, the - field-body portion of long header fields may be "folded" onto - multiple lines of the actual field. "Long" is commonly inter- - preted to mean greater than 65 or 72 characters. The former - length serves as a limit, when the message is to be viewed on - most simple terminals which use simple display software; how- - ever, the limit is not imposed by this standard. - - Note: Some display software often can selectively fold lines, - to suit the display terminal. In such cases, sender- - provided folding can interfere with the display - software. - - 3.4.9. BACKSPACE CHARACTERS - - ASCII BS characters (Backspace, decimal 8) may be included in - texts and quoted-strings to effect overstriking. However, any - use of backspaces which effects an overstrike to the left of - the beginning of the text or quoted-string is prohibited. - - - - - - August 13, 1982 - 15 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - 3.4.10. NETWORK-SPECIFIC TRANSFORMATIONS - - During transmission through heterogeneous networks, it may be - necessary to force data to conform to a network's local con- - ventions. For example, it may be required that a CR be fol- - lowed either by LF, making a CRLF, or by <null>, if the CR is - to stand alone). Such transformations are reversed, when the - message exits that network. - - When crossing network boundaries, the message should be - treated as passing through two modules. It will enter the - first module containing whatever network-specific transforma- - tions that were necessary to permit migration through the - "current" network. It then passes through the modules: - - o Transformation Reversal - - The "current" network's idiosyncracies are removed and - the message is returned to the canonical form speci- - fied in this standard. - - o Transformation - - The "next" network's local idiosyncracies are imposed - on the message. 
-
-                   ------------------
-       From   ==>  | Remove Net-A   |
-       Net-A       | idiosyncracies |
-                   ------------------
-                          ||
-                          \/
-                     Conformance
-                     with standard
-                          ||
-                          \/
-                   ------------------
-                   | Impose Net-B   |  ==>  To
-                   | idiosyncracies |       Net-B
-                   ------------------
-
-
-     4.  MESSAGE SPECIFICATION
-
-     4.1.  SYNTAX
-
-     Note:  Due to an artifact of the notational conventions, the syn-
-            tax indicates that, when present, some fields, must be in
-            a particular order.  Header fields are NOT required to
-            occur in any particular order, except that the message
-            body must occur AFTER the headers.  It is recommended
-            that, if present, headers be sent in the order "Return-
-            Path", "Received", "Date", "From", "Subject", "Sender",
-            "To", "cc", etc.
-
-            This specification permits multiple occurrences of most
-            fields.  Except as noted, their interpretation is not
-            specified here, and their use is discouraged.
-
-          The following syntax for the bodies of various fields should
-     be thought of as describing each field body as a single long
-     string (or line).  The "Lexical Analysis of Message" section on
-     "Long Header Fields", above, indicates how such long strings can
-     be represented on more than one line in the actual transmitted
-     message.
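To make the "single long string" view concrete, here is a minimal sketch of unfolding as section 3.1.1 describes it: a CRLF immediately followed by a LWSP-char is treated as just the LWSP-char. The function names unfold and logical_lines are assumptions for illustration, and the input is assumed to use CRLF line endings throughout.

```python
import re


def unfold(field):
    """Drop every CRLF that is immediately followed by SPACE or HTAB."""
    return re.sub(r"\r\n(?=[ \t])", "", field)


def logical_lines(header_block):
    """Split a header block into unfolded fields, one string per field.

    A CRLF that is NOT followed by a LWSP-char ends a field; any other
    CRLF is a fold and is removed by unfold().
    """
    parts = re.split(r"\r\n(?![ \t])", header_block)
    return [unfold(part) for part in parts if part]


folded = 'To: "Joe & J. Harvey" <ddd @ Org>,\r\n    JJV@BBN\r\nSubject: hi\r\n'
for line in logical_lines(folded):
    print(line)
```

The grammar that follows can then be applied to each returned string without any knowledge of how the sender chose to fold it.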
- - message = fields *( CRLF *text ) ; Everything after - ; first null line - ; is message body - - fields = dates ; Creation time, - source ; author id & one - 1*destination ; address required - *optional-field ; others optional - - source = [ trace ] ; net traversals - originator ; original mail - [ resent ] ; forwarded - - trace = return ; path to sender - 1*received ; receipt tags - - return = "Return-path" ":" route-addr ; return address - - received = "Received" ":" ; one per relay - ["from" domain] ; sending host - ["by" domain] ; receiving host - ["via" atom] ; physical path - *("with" atom) ; link/mail protocol - ["id" msg-id] ; receiver msg id - ["for" addr-spec] ; initial form - - - August 13, 1982 - 17 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - ";" date-time ; time received - - originator = authentic ; authenticated addr - [ "Reply-To" ":" 1#address] ) - - authentic = "From" ":" mailbox ; Single author - / ( "Sender" ":" mailbox ; Actual submittor - "From" ":" 1#mailbox) ; Multiple authors - ; or not sender - - resent = resent-authentic - [ "Resent-Reply-To" ":" 1#address] ) - - resent-authentic = - = "Resent-From" ":" mailbox - / ( "Resent-Sender" ":" mailbox - "Resent-From" ":" 1#mailbox ) - - dates = orig-date ; Original - [ resent-date ] ; Forwarded - - orig-date = "Date" ":" date-time - - resent-date = "Resent-Date" ":" date-time - - destination = "To" ":" 1#address ; Primary - / "Resent-To" ":" 1#address - / "cc" ":" 1#address ; Secondary - / "Resent-cc" ":" 1#address - / "bcc" ":" #address ; Blind carbon - / "Resent-bcc" ":" #address - - optional-field = - / "Message-ID" ":" msg-id - / "Resent-Message-ID" ":" msg-id - / "In-Reply-To" ":" *(phrase / msg-id) - / "References" ":" *(phrase / msg-id) - / "Keywords" ":" #phrase - / "Subject" ":" *text - / "Comments" ":" *text - / "Encrypted" ":" 1#2word - / extension-field ; To be defined - / user-defined-field ; May be pre-empted - - msg-id = "<" addr-spec ">" ; Unique message 
id - - - - - - - August 13, 1982 - 18 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - extension-field = - <Any field which is defined in a document - published as a formal extension to this - specification; none will have names beginning - with the string "X-"> - - user-defined-field = - <Any field which has not been defined - in this specification or published as an - extension to this specification; names for - such fields must be unique and may be - pre-empted by published extensions> - - 4.2. FORWARDING - - Some systems permit mail recipients to forward a message, - retaining the original headers, by adding some new fields. This - standard supports such a service, through the "Resent-" prefix to - field names. - - Whenever the string "Resent-" begins a field name, the field - has the same semantics as a field whose name does not have the - prefix. However, the message is assumed to have been forwarded - by an original recipient who attached the "Resent-" field. This - new field is treated as being more recent than the equivalent, - original field. For example, the "Resent-From", indicates the - person that forwarded the message, whereas the "From" field indi- - cates the original author. - - Use of such precedence information depends upon partici- - pants' communication needs. For example, this standard does not - dictate when a "Resent-From:" address should receive replies, in - lieu of sending them to the "From:" address. - - Note: In general, the "Resent-" fields should be treated as con- - taining a set of information that is independent of the - set of original fields. Information for one set should - not automatically be taken from the other. The interpre- - tation of multiple "Resent-" fields, of the same type, is - undefined. 
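The note above says the "Resent-" fields form a set that is independent of the original fields. One way to respect that in a program is to keep the two sets apart from the start. The sketch below is an assumption-laden illustration, not an interface defined by the RFC: partition_resent is a hypothetical name, and fields are assumed to arrive as (name, body) pairs that have already been unfolded.

```python
def partition_resent(fields):
    """Split (name, body) pairs into (original, resent) lists.

    Names in the resent list have the "Resent-" prefix removed, so both
    lists can be handled by the same code without mixing their contents.
    Field-name matching ignores case, as section 3.4.7 requires.
    """
    prefix = "resent-"
    original, resent = [], []
    for name, body in fields:
        if name.lower().startswith(prefix):
            resent.append((name[len(prefix):], body))
        else:
            original.append((name, body))
    return original, resent


# Example addresses are made up for illustration.
fields = [
    ("From", "author@A.ARPA"),
    ("To", "reader@B.ARPA"),
    ("RESENT-From", "reader@B.ARPA"),
    ("Resent-To", "third@C.ARPA"),
]
original, resent = partition_resent(fields)
print(original)
print(resent)
```

Whether a reply should go to the original or the resent originator is deliberately not decided here, since the standard leaves that to the participants.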
- - In the remainder of this specification, occurrence of legal - "Resent-" fields are treated identically with the occurrence of - - - - - - - - - August 13, 1982 - 19 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - fields whose names do not contain this prefix. - - 4.3. TRACE FIELDS - - Trace information is used to provide an audit trail of mes- - sage handling. In addition, it indicates a route back to the - sender of the message. - - The list of known "via" and "with" values are registered - with the Network Information Center, SRI International, Menlo - Park, California. - - 4.3.1. RETURN-PATH - - This field is added by the final transport system that - delivers the message to its recipient. The field is intended - to contain definitive information about the address and route - back to the message's originator. - - Note: The "Reply-To" field is added by the originator and - serves to direct replies, whereas the "Return-Path" - field is used to identify a path back to the origina- - tor. - - While the syntax indicates that a route specification is - optional, every attempt should be made to provide that infor- - mation in this field. - - 4.3.2. RECEIVED - - A copy of this field is added by each transport service that - relays the message. The information in the field can be quite - useful for tracing transport problems. - - The names of the sending and receiving hosts and time-of- - receipt may be specified. The "via" parameter may be used, to - indicate what physical mechanism the message was sent over, - such as Arpanet or Phonenet, and the "with" parameter may be - used to indicate the mail-, or connection-, level protocol - that was used, such as the SMTP mail protocol, or X.25 tran- - sport protocol. - - Note: Several "with" parameters may be included, to fully - specify the set of protocols that were used. 
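The parameters of the "received" rule can be picked apart mechanically. The following Python sketch is an illustration only, not part of the specification: the function name `parse_received` and the sample host names in the usage note are made up, comments in parentheses are simply dropped, and every parameter value is assumed to be a single token without embedded spaces.

```python
import re

# Keywords taken from the "received" rule in section 4.1; "with" may repeat.
_KEYWORDS = ("from", "by", "via", "with", "id", "for")


def parse_received(body):
    """Split a Received field body into its parameters and date-time.

    Simplifying assumptions (not from the standard): non-nested comments
    are discarded and each value is one whitespace-free token.
    """
    body = re.sub(r"\([^()]*\)", " ", body)        # drop non-nested comments
    params_part, _, date_time = body.rpartition(";")
    tokens = params_part.split()
    result = {"with": [], "date-time": " ".join(date_time.split())}
    i = 0
    while i + 1 < len(tokens):
        key = tokens[i].lower()
        if key in _KEYWORDS:
            if key == "with":
                result["with"].append(tokens[i + 1])   # may occur many times
            else:
                result[key] = tokens[i + 1]
            i += 2
        else:
            i += 1                                  # skip unknown tokens
    return result
```

For a hypothetical body such as `from BBN-UNIX by SRI-KL via Arpanet with SMTP ; 13 Aug 82 10:00:00 PDT`, the result maps `from`, `by` and `via` to one token each, collects `SMTP` in the `with` list, and keeps the text after the final semicolon as the date-time string.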
- - Some transport services queue mail; the internal message iden- - tifier that is assigned to the message may be noted, using the - "id" parameter. When the sending host uses a destination - address specification that the receiving host reinterprets, by - - - August 13, 1982 - 20 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - expansion or transformation, the receiving host may wish to - record the original specification, using the "for" parameter. - For example, when a copy of mail is sent to the member of a - distribution list, this parameter may be used to record the - original address that was used to specify the list. - - 4.4. ORIGINATOR FIELDS - - The standard allows only a subset of the combinations possi- - ble with the From, Sender, Reply-To, Resent-From, Resent-Sender, - and Resent-Reply-To fields. The limitation is intentional. - - 4.4.1. FROM / RESENT-FROM - - This field contains the identity of the person(s) who wished - this message to be sent. The message-creation process should - default this field to be a single, authenticated machine - address, indicating the AGENT (person, system or process) - entering the message. If this is not done, the "Sender" field - MUST be present. If the "From" field IS defaulted this way, - the "Sender" field is optional and is redundant with the - "From" field. In all cases, addresses in the "From" field - must be machine-usable (addr-specs) and may not contain named - lists (groups). - - 4.4.2. SENDER / RESENT-SENDER - - This field contains the authenticated identity of the AGENT - (person, system or process) that sends the message. It is - intended for use when the sender is not the author of the mes- - sage, or to indicate who among a group of authors actually - sent the message. If the contents of the "Sender" field would - be completely redundant with the "From" field, then the - "Sender" field need not be present and its use is discouraged - (though still legal). 
In particular, the "Sender" field MUST - be present if it is NOT the same as the "From" Field. - - The Sender mailbox specification includes a word sequence - which must correspond to a specific agent (i.e., a human user - or a computer program) rather than a standard address. This - indicates the expectation that the field will identify the - single AGENT (person, system, or process) responsible for - sending the mail and not simply include the name of a mailbox - from which the mail was sent. For example in the case of a - shared login name, the name, by itself, would not be adequate. - The local-part address unit, which refers to this agent, is - expected to be a computer system term, and not (for example) a - generalized person reference which can be used outside the - network text message context. - - - August 13, 1982 - 21 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - Since the critical function served by the "Sender" field is - identification of the agent responsible for sending mail and - since computer programs cannot be held accountable for their - behavior, it is strongly recommended that when a computer pro- - gram generates a message, the HUMAN who is responsible for - that program be referenced as part of the "Sender" field mail- - box specification. - - 4.4.3. REPLY-TO / RESENT-REPLY-TO - - This field provides a general mechanism for indicating any - mailbox(es) to which responses are to be sent. Three typical - uses for this feature can be distinguished. In the first - case, the author(s) may not have regular machine-based mail- - boxes and therefore wish(es) to indicate an alternate machine - address. In the second case, an author may wish additional - persons to be made aware of, or responsible for, replies. 
A - somewhat different use may be of some help to "text message - teleconferencing" groups equipped with automatic distribution - services: include the address of that service in the "Reply- - To" field of all messages submitted to the teleconference; - then participants can "reply" to conference submissions to - guarantee the correct distribution of any submission of their - own. - - Note: The "Return-Path" field is added by the mail transport - service, at the time of final deliver. It is intended - to identify a path back to the orginator of the mes- - sage. The "Reply-To" field is added by the message - originator and is intended to direct replies. - - 4.4.4. AUTOMATIC USE OF FROM / SENDER / REPLY-TO - - For systems which automatically generate address lists for - replies to messages, the following recommendations are made: - - o The "Sender" field mailbox should be sent notices of - any problems in transport or delivery of the original - messages. If there is no "Sender" field, then the - "From" field mailbox should be used. - - o The "Sender" field mailbox should NEVER be used - automatically, in a recipient's reply message. - - o If the "Reply-To" field exists, then the reply should - go to the addresses indicated in that field and not to - the address(es) indicated in the "From" field. - - - - - August 13, 1982 - 22 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - o If there is a "From" field, but no "Reply-To" field, - the reply should be sent to the address(es) indicated - in the "From" field. - - Sometimes, a recipient may actually wish to communicate with - the person that initiated the message transfer. In such - cases, it is reasonable to use the "Sender" address. - - This recommendation is intended only for automated use of - originator-fields and is not intended to suggest that replies - may not also be sent to other recipients of messages. 
It is - up to the respective mail-handling programs to decide what - additional facilities will be provided. - - Examples are provided in Appendix A. - - 4.5. RECEIVER FIELDS - - 4.5.1. TO / RESENT-TO - - This field contains the identity of the primary recipients of - the message. - - 4.5.2. CC / RESENT-CC - - This field contains the identity of the secondary (informa- - tional) recipients of the message. - - 4.5.3. BCC / RESENT-BCC - - This field contains the identity of additional recipients of - the message. The contents of this field are not included in - copies of the message sent to the primary and secondary reci- - pients. Some systems may choose to include the text of the - "Bcc" field only in the author(s)'s copy, while others may - also include it in the text sent to all those indicated in the - "Bcc" list. - - 4.6. REFERENCE FIELDS - - 4.6.1. MESSAGE-ID / RESENT-MESSAGE-ID - - This field contains a unique identifier (the local-part - address unit) which refers to THIS version of THIS message. - The uniqueness of the message identifier is guaranteed by the - host which generates it. This identifier is intended to be - machine readable and not necessarily meaningful to humans. A - message identifier pertains to exactly one instantiation of a - particular message; subsequent revisions to the message should - - - August 13, 1982 - 23 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - each receive new message identifiers. - - 4.6.2. IN-REPLY-TO - - The contents of this field identify previous correspon- - dence which this message answers. Note that if message iden- - tifiers are used in this field, they must use the msg-id - specification format. - - 4.6.3. REFERENCES - - The contents of this field identify other correspondence - which this message references. Note that if message identif- - iers are used, they must use the msg-id specification format. - - 4.6.4. KEYWORDS - - This field contains keywords or phrases, separated by - commas. 
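Because "In-Reply-To" and "References" may mix free-form phrases with msg-ids, a reader has to pick the msg-ids out of the field body. The Python sketch below shows one rough way to do that. The name `extract_msg_ids` is hypothetical, and the pattern is a deliberate approximation of `"<" addr-spec ">"`: it does not handle quoted local-parts or domain-literals.

```python
import re

# Loose approximation of msg-id = "<" addr-spec ">": any run of characters
# without whitespace, angle brackets or "@" on each side of a single "@".
_MSG_ID = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")


def extract_msg_ids(field_body):
    """Return the msg-id tokens found in a field body, in order."""
    return _MSG_ID.findall(field_body)
```

Applied to the body `<some.string@DBM.Group>, George's message`, the function returns the single msg-id and ignores the trailing phrase.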
- - 4.7. OTHER FIELDS - - 4.7.1. SUBJECT - - This is intended to provide a summary, or indicate the - nature, of the message. - - 4.7.2. COMMENTS - - Permits adding text comments onto the message without - disturbing the contents of the message's body. - - 4.7.3. ENCRYPTED - - Sometimes, data encryption is used to increase the - privacy of message contents. If the body of a message has - been encrypted, to keep its contents private, the "Encrypted" - field can be used to note the fact and to indicate the nature - of the encryption. The first <word> parameter indicates the - software used to encrypt the body, and the second, optional - <word> is intended to aid the recipient in selecting the - proper decryption key. This code word may be viewed as an - index to a table of keys held by the recipient. - - Note: Unfortunately, headers must contain envelope, as well - as contents, information. Consequently, it is neces- - sary that they remain unencrypted, so that mail tran- - sport services may access them. Since names, - addresses, and "Subject" field contents may contain - - - August 13, 1982 - 24 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - sensitive information, this requirement limits total - message privacy. - - Names of encryption software are registered with the Net- - work Information Center, SRI International, Menlo Park, Cali- - fornia. - - 4.7.4. EXTENSION-FIELD - - A limited number of common fields have been defined in - this document. As network mail requirements dictate, addi- - tional fields may be standardized. To provide user-defined - fields with a measure of safety, in name selection, such - extension-fields will never have names that begin with the - string "X-". - - Names of Extension-fields are registered with the Network - Information Center, SRI International, Menlo Park, California. - - 4.7.5. USER-DEFINED-FIELD - - Individual users of network mail are free to define and - use additional header fields. 
Such fields must have names
-     which are not already used in the current specification or in
-     any definitions of extension-fields, and the overall syntax of
-     these user-defined-fields must conform to this specification's
-     rules for delimiting and folding fields.  Due to the
-     extension-field publishing process, the name of a
-     user-defined-field may be pre-empted.
-
-     Note:  The prefatory string "X-" will never be used in the
-            names of Extension-fields.  This provides user-defined
-            fields with a protected set of names.
-
-     5.  DATE AND TIME SPECIFICATION
-
-     5.1.  SYNTAX
-
-     date-time   =  [ day "," ] date time        ; dd mm yy
-                                                 ;  hh:mm:ss zzz
-
-     day         =  "Mon"  / "Tue" /  "Wed"  / "Thu"
-                 /  "Fri"  / "Sat" /  "Sun"
-
-     date        =  1*2DIGIT month 2DIGIT        ; day month year
-                                                 ;  e.g. 20 Jun 82
-
-     month       =  "Jan"  /  "Feb" /  "Mar"  /  "Apr"
-                 /  "May"  /  "Jun" /  "Jul"  /  "Aug"
-                 /  "Sep"  /  "Oct" /  "Nov"  /  "Dec"
-
-     time        =  hour zone                    ; ANSI and Military
-
-     hour        =  2DIGIT ":" 2DIGIT [":" 2DIGIT]
-                                                 ; 00:00:00 - 23:59:59
-
-     zone        =  "UT"  / "GMT"                ; Universal Time
-                                                 ; North American : UT
-                 /  "EST" / "EDT"                ;  Eastern:  - 5/ - 4
-                 /  "CST" / "CDT"                ;  Central:  - 6/ - 5
-                 /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
-                 /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
-                 /  1ALPHA                       ; Military: Z = UT;
-                                                 ;  A:-1; (J not used)
-                                                 ;  M:-12; N:+1; Y:+12
-                 / ( ("+" / "-") 4DIGIT )        ; Local differential
-                                                 ;  hours+min. (HHMM)
-
-     5.2.  SEMANTICS
-
-          If included, day-of-week must be the day implied by the date
-     specification.
-
-          Time zone may be indicated in several ways.  "UT" is Universal
-     Time (formerly called "Greenwich Mean Time"); "GMT" is permitted
-     as a reference to Universal Time.  The military standard
-     uses a single character for each zone.  "Z" is Universal Time.
-     "A" indicates one hour earlier, and "M" indicates 12 hours
-     earlier; "N" is one hour later, and "Y" is 12 hours later.  The
-     letter "J" is not used.
The remaining two forms are taken
-     from ANSI standard X3.51-1975.  One allows explicit indication of
-     the amount of offset from UT; the other uses common 3-character
-     strings for indicating time zones in North America.
-
-     6.  ADDRESS SPECIFICATION
-
-     6.1.  SYNTAX
-
-     address     =  mailbox                      ; one addressee
-                 /  group                        ; named list
-
-     group       =  phrase ":" [#mailbox] ";"
-
-     mailbox     =  addr-spec                    ; simple address
-                 /  phrase route-addr            ; name & addr-spec
-
-     route-addr  =  "<" [route] addr-spec ">"
-
-     route       =  1#("@" domain) ":"           ; path-relative
-
-     addr-spec   =  local-part "@" domain        ; global address
-
-     local-part  =  word *("." word)             ; uninterpreted
-                                                 ; case-preserved
-
-     domain      =  sub-domain *("." sub-domain)
-
-     sub-domain  =  domain-ref / domain-literal
-
-     domain-ref  =  atom                         ; symbolic reference
-
-     6.2.  SEMANTICS
-
-          A mailbox receives mail.  It is a conceptual entity which
-     does not necessarily pertain to file storage.  For example, some
-     sites may choose to print mail on their line printer and deliver
-     the output to the addressee's desk.
-
-          A mailbox specification comprises a person, system or process
-     name reference, a domain-dependent string, and a name-domain
-     reference.  The name reference is optional and is usually used to
-     indicate the human name of a recipient.  The name-domain reference
-     specifies a sequence of sub-domains.  The domain-dependent
-     string is uninterpreted, except by the final sub-domain; the rest
-     of the mail service merely transmits it as a literal string.
-
-     6.2.1.  DOMAINS
-
-        A name-domain is a set of registered (mail) names.  A
-        name-domain specification resolves to a subordinate name-domain
-        specification or to a terminal domain-dependent string.
-        Hence, domain specification is extensible, permitting any
-        number of registration levels.
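The mailbox and addr-spec rules of section 6.1 can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not a conforming parser: the names `parse_mailbox` and `split_addr_spec` are hypothetical, and quoted-strings, comments and domain-literals are not handled.

```python
def parse_mailbox(mailbox):
    """Return (phrase, addr_spec) for 'phrase <[route] addr-spec>' or a
    bare addr-spec.  The phrase is None when the mailbox has none."""
    mailbox = mailbox.strip()
    if mailbox.endswith(">") and "<" in mailbox:
        phrase, _, rest = mailbox.rpartition("<")
        route_addr = rest[:-1]                      # drop the closing ">"
        # An optional route ("@a,@b:") precedes the addr-spec.
        _, _, addr_spec = route_addr.rpartition(":")
        return phrase.strip() or None, addr_spec.strip()
    return None, mailbox


def split_addr_spec(addr_spec):
    """Return (local_part, [sub_domain, ...]) for 'local-part@domain'."""
    local_part, sep, domain = addr_spec.rpartition("@")
    if not sep or not local_part or not domain:
        raise ValueError("not of the form local-part@domain: %r" % addr_spec)
    # The local-part is kept as one uninterpreted unit; only the domain
    # is divided into its sub-domains.
    return local_part, domain.split(".")
```

With these helpers, `Alfred Neuman <Neuman@BBN-TENEXA>` and `Neuman@BBN-TENEXA` both yield the addr-spec `Neuman@BBN-TENEXA`, and `First.Last@Registry.Org` splits into the local-part `First.Last` and the sub-domains `Registry` and `Org`.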
- - - August 13, 1982 - 27 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - Name-domains model a global, logical, hierarchical addressing - scheme. The model is logical, in that an address specifica- - tion is related to name registration and is not necessarily - tied to transmission path. The model's hierarchy is a - directed graph, called an in-tree, such that there is a single - path from the root of the tree to any node in the hierarchy. - If more than one path actually exists, they are considered to - be different addresses. - - The root node is common to all addresses; consequently, it is - not referenced. Its children constitute "top-level" name- - domains. Usually, a service has access to its own full domain - specification and to the names of all top-level name-domains. - - The "top" of the domain addressing hierarchy -- a child of the - root -- is indicated by the right-most field, in a domain - specification. Its child is specified to the left, its child - to the left, and so on. - - Some groups provide formal registration services; these con- - stitute name-domains that are independent logically of - specific machines. In addition, networks and machines impli- - citly compose name-domains, since their membership usually is - registered in name tables. - - In the case of formal registration, an organization implements - a (distributed) data base which provides an address-to-route - mapping service for addresses of the form: - - person@registry.organization - - Note that "organization" is a logical entity, separate from - any particular communication network. - - A mechanism for accessing "organization" is universally avail- - able. That mechanism, in turn, seeks an instantiation of the - registry; its location is not indicated in the address specif- - ication. It is assumed that the system which operates under - the name "organization" knows how to find a subordinate regis- - try. 
The registry will then use the "person" string to deter- - mine where to send the mail specification. - - The latter, network-oriented case permits simple, direct, - attachment-related address specification, such as: - - user@host.network - - Once the network is accessed, it is expected that a message - will go directly to the host and that the host will resolve - - - August 13, 1982 - 28 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - the user name, placing the message in the user's mailbox. - - 6.2.2. ABBREVIATED DOMAIN SPECIFICATION - - Since any number of levels is possible within the domain - hierarchy, specification of a fully qualified address can - become inconvenient. This standard permits abbreviated domain - specification, in a special case: - - For the address of the sender, call the left-most - sub-domain Level N. In a header address, if all of - the sub-domains above (i.e., to the right of) Level N - are the same as those of the sender, then they do not - have to appear in the specification. Otherwise, the - address must be fully qualified. - - This feature is subject to approval by local sub- - domains. Individual sub-domains may require their - member systems, which originate mail, to provide full - domain specification only. When permitted, abbrevia- - tions may be present only while the message stays - within the sub-domain of the sender. - - Use of this mechanism requires the sender's sub-domain - to reserve the names of all top-level domains, so that - full specifications can be distinguished from abbrevi- - ated specifications. - - For example, if a sender's address is: - - sender@registry-A.registry-1.organization-X - - and one recipient's address is: - - recipient@registry-B.registry-1.organization-X - - and another's is: - - recipient@registry-C.registry-2.organization-X - - then ".registry-1.organization-X" need not be specified in the - the message, but "registry-C.registry-2" DOES have to be - specified. 
That is, the first two addresses may be abbrevi- - ated, but the third address must be fully specified. - - When a message crosses a domain boundary, all addresses must - be specified in the full format, ending with the top-level - name-domain in the right-most field. It is the responsibility - of mail forwarding services to ensure that addresses conform - - - August 13, 1982 - 29 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - with this requirement. In the case of abbreviated addresses, - the relaying service must make the necessary expansions. It - should be noted that it often is difficult for such a service - to locate all occurrences of address abbreviations. For exam- - ple, it will not be possible to find such abbreviations within - the body of the message. The "Return-Path" field can aid - recipients in recovering from these errors. - - Note: When passing any portion of an addr-spec onto a process - which does not interpret data according to this stan- - dard (e.g., mail protocol servers). There must be NO - LWSP-chars preceding or following the at-sign or any - delimiting period ("."), such as shown in the above - examples, and only ONE SPACE between contiguous - <word>s. - - 6.2.3. DOMAIN TERMS - - A domain-ref must be THE official name of a registry, network, - or host. It is a symbolic reference, within a name sub- - domain. At times, it is necessary to bypass standard mechan- - isms for resolving such references, using more primitive - information, such as a network host address rather than its - associated host name. - - To permit such references, this standard provides the domain- - literal construct. Its contents must conform with the needs - of the sub-domain in which it is interpreted. - - Domain-literals which refer to domains within the ARPA Inter- - net specify 32-bit Internet addresses, in four 8-bit fields - noted in decimal, as described in Request for Comments #820, - "Assigned Numbers." 
For example: - - [10.0.3.19] - - Note: THE USE OF DOMAIN-LITERALS IS STRONGLY DISCOURAGED. It - is permitted only as a means of bypassing temporary - system limitations, such as name tables which are not - complete. - - The names of "top-level" domains, and the names of domains - under in the ARPA Internet, are registered with the Network - Information Center, SRI International, Menlo Park, California. - - 6.2.4. DOMAIN-DEPENDENT LOCAL STRING - - The local-part of an addr-spec in a mailbox specification - (i.e., the host's name for the mailbox) is understood to be - - - August 13, 1982 - 30 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - whatever the receiving mail protocol server allows. For exam- - ple, some systems do not understand mailbox references of the - form "P. D. Q. Bach", but others do. - - This specification treats periods (".") as lexical separators. - Hence, their presence in local-parts which are not quoted- - strings, is detected. However, such occurrences carry NO - semantics. That is, if a local-part has periods within it, an - address parser will divide the local-part into several tokens, - but the sequence of tokens will be treated as one uninter- - preted unit. The sequence will be re-assembled, when the - address is passed outside of the system such as to a mail pro- - tocol service. - - For example, the address: - - First.Last@Registry.Org - - is legal and does not require the local-part to be surrounded - with quotation-marks. (However, "First Last" DOES require - quoting.) The local-part of the address, when passed outside - of the mail system, within the Registry.Org domain, is - "First.Last", again without quotation marks. - - 6.2.5. BALANCING LOCAL-PART AND DOMAIN - - In some cases, the boundary between local-part and domain can - be flexible. The local-part may be a simple string, which is - used for the final determination of the recipient's mailbox. 
- All other levels of reference are, therefore, part of the - domain. - - For some systems, in the case of abbreviated reference to the - local and subordinate sub-domains, it may be possible to - specify only one reference within the domain part and place - the other, subordinate name-domain references within the - local-part. This would appear as: - - mailbox.sub1.sub2@this-domain - - Such a specification would be acceptable to address parsers - which conform to RFC #733, but do not support this newer - Internet standard. While contrary to the intent of this stan- - dard, the form is legal. - - Also, some sub-domains have a specification syntax which does - not conform to this standard. For example: - - sub-net.mailbox@sub-domain.domain - - - August 13, 1982 - 31 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - uses a different parsing sequence for local-part than for - domain. - - Note: As a rule, the domain specification should contain - fields which are encoded according to the syntax of - this standard and which contain generally-standardized - information. The local-part specification should con- - tain only that portion of the address which deviates - from the form or intention of the domain field. - - 6.2.6. MULTIPLE MAILBOXES - - An individual may have several mailboxes and wish to receive - mail at whatever mailbox is convenient for the sender to - access. This standard does not provide a means of specifying - "any member of" a list of mailboxes. - - A set of individuals may wish to receive mail as a single unit - (i.e., a distribution list). The <group> construct permits - specification of such a list. Recipient mailboxes are speci- - fied within the bracketed part (":" - ";"). A copy of the - transmitted message is to be sent to each mailbox listed. - This standard does not permit recursive specification of - groups within groups. - - While a list must be named, it is not required that the con- - tents of the list be included. 
In this case, the <address> - serves only as an indication of group distribution and would - appear in the form: - - name:; - - Some mail services may provide a group-list distribution - facility, accepting a single mailbox reference, expanding it - to the full distribution list, and relaying the mail to the - list's members. This standard provides no additional syntax - for indicating such a service. Using the <group> address - alternative, while listing one mailbox in it, can mean either - that the mailbox reference will be expanded to a list or that - there is a group with one member. - - 6.2.7. EXPLICIT PATH SPECIFICATION - - At times, a message originator may wish to indicate the - transmission path that a message should follow. This is - called source routing. The normal addressing scheme, used in - an addr-spec, is carefully separated from such information; - the <route> portion of a route-addr is provided for such occa- - sions. It specifies the sequence of hosts and/or transmission - - - August 13, 1982 - 32 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - services that are to be traversed. Both domain-refs and - domain-literals may be used. - - Note: The use of source routing is discouraged. Unless the - sender has special need of path restriction, the choice - of transmission route should be left to the mail tran- - sport service. - - 6.3. RESERVED ADDRESS - - It often is necessary to send mail to a site, without know- - ing any of its valid addresses. For example, there may be mail - system dysfunctions, or a user may wish to find out a person's - correct address, at that site. - - This standard specifies a single, reserved mailbox address - (local-part) which is to be valid at each site. Mail sent to - that address is to be routed to a person responsible for the - site's mail system or to a person with responsibility for general - site operation. 
The name of the reserved local-part address is: - - Postmaster - - so that "Postmaster@domain" is required to be valid. - - Note: This reserved local-part must be matched without sensi- - tivity to alphabetic case, so that "POSTMASTER", "postmas- - ter", and even "poStmASteR" is to be accepted. - - - - - - - - - - - - - - - - - - - - - - - - August 13, 1982 - 33 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - 7. BIBLIOGRAPHY - - - ANSI. "USA Standard Code for Information Interchange," X3.4. - American National Standards Institute: New York (1968). Also - in: Feinler, E. and J. Postel, eds., "ARPANET Protocol Hand- - book", NIC 7104. - - ANSI. "Representations of Universal Time, Local Time Differen- - tials, and United States Time Zone References for Information - Interchange," X3.51-1975. American National Standards Insti- - tute: New York (1975). - - Bemer, R.W., "Time and the Computer." In: Interface Age (Feb. - 1979). - - Bennett, C.J. "JNT Mail Protocol". Joint Network Team, Ruther- - ford and Appleton Laboratory: Didcot, England. - - Bhushan, A.K., Pogran, K.T., Tomlinson, R.S., and White, J.E. - "Standardizing Network Mail Headers," ARPANET Request for - Comments No. 561, Network Information Center No. 18516; SRI - International: Menlo Park (September 1973). - - Birrell, A.D., Levin, R., Needham, R.M., and Schroeder, M.D. - "Grapevine: An Exercise in Distributed Computing," Communica- - tions of the ACM 25, 4 (April 1982), 260-274. - - Crocker, D.H., Vittal, J.J., Pogran, K.T., Henderson, D.A. - "Standard for the Format of ARPA Network Text Message," - ARPANET Request for Comments No. 733, Network Information - Center No. 41952. SRI International: Menlo Park (November - 1977). - - Feinler, E.J. and Postel, J.B. ARPANET Protocol Handbook, Net- - work Information Center No. 7104 (NTIS AD A003890). SRI - International: Menlo Park (April 1976). - - Harary, F. "Graph Theory". Addison-Wesley: Reading, Mass. - (1969). - - Levin, R. 
and Schroeder, M. "Transport of Electronic Messages - through a Network," TeleInformatics 79, pp. 29-33. North - Holland (1979). Also as Xerox Palo Alto Research Center - Technical Report CSL-79-4. - - Myer, T.H. and Henderson, D.A. "Message Transmission Protocol," - ARPANET Request for Comments, No. 680, Network Information - Center No. 32116. SRI International: Menlo Park (1975). - - - August 13, 1982 - 34 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - NBS. "Specification of Message Format for Computer Based Message - Systems, Recommended Federal Information Processing Standard." - National Bureau of Standards: Gaithersburg, Maryland - (October 1981). - - NIC. Internet Protocol Transition Workbook. Network Information - Center, SRI-International, Menlo Park, California (March - 1982). - - Oppen, D.C. and Dalal, Y.K. "The Clearinghouse: A Decentralized - Agent for Locating Named Objects in a Distributed Environ- - ment," OPD-T8103. Xerox Office Products Division: Palo Alto, - CA. (October 1981). - - Postel, J.B. "Assigned Numbers," ARPANET Request for Comments, - No. 820. SRI International: Menlo Park (August 1982). - - Postel, J.B. "Simple Mail Transfer Protocol," ARPANET Request - for Comments, No. 821. SRI International: Menlo Park (August - 1982). - - Shoch, J.F. "Internetwork naming, addressing and routing," in - Proc. 17th IEEE Computer Society International Conference, pp. - 72-79, Sept. 1978, IEEE Cat. No. 78 CH 1388-8C. - - Su, Z. and Postel, J. "The Domain Naming Convention for Internet - User Applications," ARPANET Request for Comments, No. 819. - SRI International: Menlo Park (August 1982). - - - - - - - - - - - - - - - - - - - - - - - - August 13, 1982 - 35 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - APPENDIX - - - A. EXAMPLES - - A.1. ADDRESSES - - A.1.1. Alfred Neuman <Neuman@BBN-TENEXA> - - A.1.2. 
Neuman@BBN-TENEXA - - These two "Alfred Neuman" examples have identical seman- - tics, as far as the operation of the local host's mail sending - (distribution) program (also sometimes called its "mailer") - and the remote host's mail protocol server are concerned. In - the first example, the "Alfred Neuman" is ignored by the - mailer, as "Neuman@BBN-TENEXA" completely specifies the reci- - pient. The second example contains no superfluous informa- - tion, and, again, "Neuman@BBN-TENEXA" is the intended reci- - pient. - - Note: When the message crosses name-domain boundaries, then - these specifications must be changed, so as to indicate - the remainder of the hierarchy, starting with the top - level. - - A.1.3. "George, Ted" <Shared@Group.Arpanet> - - This form might be used to indicate that a single mailbox - is shared by several users. The quoted string is ignored by - the originating host's mailer, because "Shared@Group.Arpanet" - completely specifies the destination mailbox. - - A.1.4. Wilt . (the Stilt) Chamberlain@NBA.US - - The "(the Stilt)" is a comment, which is NOT included in - the destination mailbox address handed to the originating - system's mailer. The local-part of the address is the string - "Wilt.Chamberlain", with NO space between the first and second - words. - - A.1.5. Address Lists - - Gourmets: Pompous Person <WhoZiWhatZit@Cordon-Bleu>, - Childs@WGBH.Boston, Galloping Gourmet@ - ANT.Down-Under (Australian National Television), - Cheapie@Discount-Liquors;, - Cruisers: Port@Portugal, Jones@SEA;, - Another@Somewhere.SomeOrg - - - August 13, 1982 - 36 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - This group list example points out the use of comments and the - mixing of addresses and groups. - - A.2. ORIGINATOR ITEMS - - A.2.1. Author-sent - - George Jones logs into his host as "Jones". He sends - mail himself. - - From: Jones@Group.Org - - or - - From: George Jones <Jones@Group.Org> - - A.2.2. 
Secretary-sent - - George Jones logs in as Jones on his host. His secre- - tary, who logs in as Secy, sends mail for him. Replies to the - mail should go to George. - - From: George Jones <Jones@Group> - Sender: Secy@Other-Group - - A.2.3. Secretary-sent, for user of shared directory - - George Jones' secretary sends mail for George. Replies - should go to George. - - From: George Jones<Shared@Group.Org> - Sender: Secy@Other-Group - - Note that there need not be a space between "Jones" and the - "<", but adding a space enhances readability (as is the case - in other examples). - - A.2.4. Committee activity, with one author - - George is a member of a committee. He wishes to have any - replies to his message go to all committee members. - - From: George Jones <Jones@Host.Net> - Sender: Jones@Host - Reply-To: The Committee: Jones@Host.Net, - Smith@Other.Org, - Doe@Somewhere-Else; - - Note that if George had not included himself in the - - - August 13, 1982 - 37 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - enumeration of The Committee, he would not have gotten an - implicit reply; the presence of the "Reply-to" field SUPER- - SEDES the sending of a reply to the person named in the "From" - field. - - A.2.5. Secretary acting as full agent of author - - George Jones asks his secretary (Secy@Host) to send a - message for him in his capacity as Group. He wants his secre- - tary to handle all replies. - - From: George Jones <Group@Host> - Sender: Secy@Host - Reply-To: Secy@Host - - A.2.6. Agent for user without online mailbox - - A friend of George's, Sarah, is visiting. George's - secretary sends some mail to a friend of Sarah in computer- - land. Replies should go to George, whose mailbox is Jones at - Registry. - - From: Sarah Friendly <Secy@Registry> - Sender: Secy-Name <Secy@Registry> - Reply-To: Jones@Registry. - - A.2.7. 
Agent for member of a committee - - George's secretary sends out a message which was authored - jointly by all the members of a committee. Note that the name - of the committee cannot be specified, since <group> names are - not permitted in the From field. - - From: Jones@Host, - Smith@Other-Host, - Doe@Somewhere-Else - Sender: Secy@SHost - - - - - - - - - - - - - - - August 13, 1982 - 38 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - A.3. COMPLETE HEADERS - - A.3.1. Minimum required - - Date: 26 Aug 76 1429 EDT Date: 26 Aug 76 1429 EDT - From: Jones@Registry.Org or From: Jones@Registry.Org - Bcc: To: Smith@Registry.Org - - Note that the "Bcc" field may be empty, while the "To" field - is required to have at least one address. - - A.3.2. Using some of the additional fields - - Date: 26 Aug 76 1430 EDT - From: George Jones<Group@Host> - Sender: Secy@SHOST - To: "Al Neuman"@Mad-Host, - Sam.Irving@Other-Host - Message-ID: <some.string@SHOST> - - A.3.3. About as complex as you're going to get - - Date : 27 Aug 76 0932 PDT - From : Ken Davis <KDavis@This-Host.This-net> - Subject : Re: The Syntax in the RFC - Sender : KSecy@Other-Host - Reply-To : Sam.Irving@Reg.Organization - To : George Jones <Group@Some-Reg.An-Org>, - Al.Neuman@MAD.Publisher - cc : Important folk: - Tom Softwood <Balsa@Tree.Root>, - "Sam Irving"@Other-Host;, - Standard Distribution: - /main/davis/people/standard@Other-Host, - "<Jones>standard.dist.3"@Tops-20-Host>; - Comment : Sam is away on business. He asked me to handle - his mail for him. He'll be able to provide a - more accurate explanation when he returns - next week. - In-Reply-To: <some.string@DBM.Group>, George's message - X-Special-action: This is a sample of user-defined field- - names. 
There could also be a field-name - "Special-action", but its name might later be - preempted - Message-ID: <4231.629.XYzi-What@Other-Host> - - - - - - - August 13, 1982 - 39 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - B. SIMPLE FIELD PARSING - - Some mail-reading software systems may wish to perform only - minimal processing, ignoring the internal syntax of structured - field-bodies and treating them the same as unstructured-field- - bodies. Such software will need only to distinguish: - - o Header fields from the message body, - - o Beginnings of fields from lines which continue fields, - - o Field-names from field-contents. - - The abbreviated set of syntactic rules which follows will - suffice for this purpose. It describes a limited view of mes- - sages and is a subset of the syntactic rules provided in the main - part of this specification. One small exception is that the con- - tents of field-bodies consist only of text: - - B.1. SYNTAX - - - message = *field *(CRLF *text) - - field = field-name ":" [field-body] CRLF - - field-name = 1*<any CHAR, excluding CTLs, SPACE, and ":"> - - field-body = *text [CRLF LWSP-char field-body] - - - B.2. SEMANTICS - - Headers occur before the message body and are terminated by - a null line (i.e., two contiguous CRLFs). - - A line which continues a header field begins with a SPACE or - HTAB character, while a line beginning a field starts with a - printable character which is not a colon. - - A field-name consists of one or more printable characters - (excluding colon, space, and control-characters). A field-name - MUST be contained on one line. Upper and lower case are not dis- - tinguished when comparing field-names. - - - - - - - - August 13, 1982 - 40 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - C. 
DIFFERENCES FROM RFC #733 - - The following summarizes the differences between this stan- - dard and the one specified in Arpanet Request for Comments #733, - "Standard for the Format of ARPA Network Text Messages". The - differences are listed in the order of their occurrence in the - current specification. - - C.1. FIELD DEFINITIONS - - C.1.1. FIELD NAMES - - These now must be a sequence of printable characters. They - may not contain any LWSP-chars. - - C.2. LEXICAL TOKENS - - C.2.1. SPECIALS - - The characters period ("."), left-square bracket ("["), and - right-square bracket ("]") have been added. For presentation - purposes, and when passing a specification to a system that - does not conform to this standard, periods are to be contigu- - ous with their surrounding lexical tokens. No linear-white- - space is permitted between them. The presence of one LWSP- - char between other tokens is still directed. - - C.2.2. ATOM - - Atoms may not contain SPACE. - - C.2.3. SPECIAL TEXT - - ctext and qtext have had backslash ("\") added to the list of - prohibited characters. - - C.2.4. DOMAINS - - The lexical tokens <domain-literal> and <dtext> have been - added. - - C.3. MESSAGE SPECIFICATION - - C.3.1. TRACE - - The "Return-path:" and "Received:" fields have been specified. - - - - - - August 13, 1982 - 41 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - C.3.2. FROM - - The "From" field must contain machine-usable addresses (addr- - spec). Multiple addresses may be specified, but named-lists - (groups) may not. - - C.3.3. RESENT - - The meta-construct of prefacing field names with the string - "Resent-" has been added, to indicate that a message has been - forwarded by an intermediate recipient. - - C.3.4. DESTINATION - - A message must contain at least one destination address field. - "To" and "CC" are required to contain at least one address. - - C.3.5. 
IN-REPLY-TO - - The field-body is no longer a comma-separated list, although a - sequence is still permitted. - - C.3.6. REFERENCE - - The field-body is no longer a comma-separated list, although a - sequence is still permitted. - - C.3.7. ENCRYPTED - - A field has been specified that permits senders to indicate - that the body of a message has been encrypted. - - C.3.8. EXTENSION-FIELD - - Extension fields are prohibited from beginning with the char- - acters "X-". - - C.4. DATE AND TIME SPECIFICATION - - C.4.1. SIMPLIFICATION - - Fewer optional forms are permitted and the list of three- - letter time zones has been shortened. - - C.5. ADDRESS SPECIFICATION - - - - - - - August 13, 1982 - 42 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - C.5.1. ADDRESS - - The use of quoted-string, and the ":"-atom-":" construct, have - been removed. An address now is either a single mailbox - reference or is a named list of addresses. The latter indi- - cates a group distribution. - - C.5.2. GROUPS - - Group lists are now required to have a name. Group lists - may not be nested. - - C.5.3. MAILBOX - - A mailbox specification may indicate a person's name, as - before. Such a named list no longer may specify multiple - mailboxes and may not be nested. - - C.5.4. ROUTE ADDRESSING - - Addresses now are taken to be absolute, global specifications, - independent of transmission paths. The <route> construct has - been provided, to permit explicit specification of transmis- - sion path. RFC #733's use of multiple at-signs ("@") was - intended as a general syntax for indicating routing and/or - hierarchical addressing. The current standard separates these - specifications and only one at-sign is permitted. - - C.5.5. AT-SIGN - - The string " at " no longer is used as an address delimiter. - Only at-sign ("@") serves the function. - - C.5.6. DOMAINS - - Hierarchical, logical name-domains have been added. - - C.6. 
RESERVED ADDRESS - - The local-part "Postmaster" has been reserved, so that users can - be guaranteed at least one valid address at a site. - - - - - - - - - - - August 13, 1982 - 43 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - D. ALPHABETICAL LISTING OF SYNTAX RULES - - address = mailbox ; one addressee - / group ; named list - addr-spec = local-part "@" domain ; global address - ALPHA = <any ASCII alphabetic character> - ; (101-132, 65.- 90.) - ; (141-172, 97.-122.) - atom = 1*<any CHAR except specials, SPACE and CTLs> - authentic = "From" ":" mailbox ; Single author - / ( "Sender" ":" mailbox ; Actual submittor - "From" ":" 1#mailbox) ; Multiple authors - ; or not sender - CHAR = <any ASCII character> ; ( 0-177, 0.-127.) - comment = "(" *(ctext / quoted-pair / comment) ")" - CR = <ASCII CR, carriage return> ; ( 15, 13.) - CRLF = CR LF - ctext = <any CHAR excluding "(", ; => may be folded - ")", "\" & CR, & including - linear-white-space> - CTL = <any ASCII control ; ( 0- 37, 0.- 31.) - character and DEL> ; ( 177, 127.) - date = 1*2DIGIT month 2DIGIT ; day month year - ; e.g. 20 Jun 82 - dates = orig-date ; Original - [ resent-date ] ; Forwarded - date-time = [ day "," ] date time ; dd mm yy - ; hh:mm:ss zzz - day = "Mon" / "Tue" / "Wed" / "Thu" - / "Fri" / "Sat" / "Sun" - delimiters = specials / linear-white-space / comment - destination = "To" ":" 1#address ; Primary - / "Resent-To" ":" 1#address - / "cc" ":" 1#address ; Secondary - / "Resent-cc" ":" 1#address - / "bcc" ":" #address ; Blind carbon - / "Resent-bcc" ":" #address - DIGIT = <any ASCII decimal digit> ; ( 60- 71, 48.- 57.) - domain = sub-domain *("." 
sub-domain) - domain-literal = "[" *(dtext / quoted-pair) "]" - domain-ref = atom ; symbolic reference - dtext = <any CHAR excluding "[", ; => may be folded - "]", "\" & CR, & including - linear-white-space> - extension-field = - <Any field which is defined in a document - published as a formal extension to this - specification; none will have names beginning - with the string "X-"> - - - August 13, 1982 - 44 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - field = field-name ":" [ field-body ] CRLF - fields = dates ; Creation time, - source ; author id & one - 1*destination ; address required - *optional-field ; others optional - field-body = field-body-contents - [CRLF LWSP-char field-body] - field-body-contents = - <the ASCII characters making up the field-body, as - defined in the following sections, and consisting - of combinations of atom, quoted-string, and - specials tokens, or else consisting of texts> - field-name = 1*<any CHAR, excluding CTLs, SPACE, and ":"> - group = phrase ":" [#mailbox] ";" - hour = 2DIGIT ":" 2DIGIT [":" 2DIGIT] - ; 00:00:00 - 23:59:59 - HTAB = <ASCII HT, horizontal-tab> ; ( 11, 9.) - LF = <ASCII LF, linefeed> ; ( 12, 10.) - linear-white-space = 1*([CRLF] LWSP-char) ; semantics = SPACE - ; CRLF => folding - local-part = word *("." 
word) ; uninterpreted - ; case-preserved - LWSP-char = SPACE / HTAB ; semantics = SPACE - mailbox = addr-spec ; simple address - / phrase route-addr ; name & addr-spec - message = fields *( CRLF *text ) ; Everything after - ; first null line - ; is message body - month = "Jan" / "Feb" / "Mar" / "Apr" - / "May" / "Jun" / "Jul" / "Aug" - / "Sep" / "Oct" / "Nov" / "Dec" - msg-id = "<" addr-spec ">" ; Unique message id - optional-field = - / "Message-ID" ":" msg-id - / "Resent-Message-ID" ":" msg-id - / "In-Reply-To" ":" *(phrase / msg-id) - / "References" ":" *(phrase / msg-id) - / "Keywords" ":" #phrase - / "Subject" ":" *text - / "Comments" ":" *text - / "Encrypted" ":" 1#2word - / extension-field ; To be defined - / user-defined-field ; May be pre-empted - orig-date = "Date" ":" date-time - originator = authentic ; authenticated addr - [ "Reply-To" ":" 1#address] ) - phrase = 1*word ; Sequence of words - - - - - August 13, 1982 - 45 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - qtext = <any CHAR excepting <">, ; => may be folded - "\" & CR, and including - linear-white-space> - quoted-pair = "\" CHAR ; may quote any char - quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or - ; quoted chars. 
- received = "Received" ":" ; one per relay - ["from" domain] ; sending host - ["by" domain] ; receiving host - ["via" atom] ; physical path - *("with" atom) ; link/mail protocol - ["id" msg-id] ; receiver msg id - ["for" addr-spec] ; initial form - ";" date-time ; time received - - resent = resent-authentic - [ "Resent-Reply-To" ":" 1#address] ) - resent-authentic = - = "Resent-From" ":" mailbox - / ( "Resent-Sender" ":" mailbox - "Resent-From" ":" 1#mailbox ) - resent-date = "Resent-Date" ":" date-time - return = "Return-path" ":" route-addr ; return address - route = 1#("@" domain) ":" ; path-relative - route-addr = "<" [route] addr-spec ">" - source = [ trace ] ; net traversals - originator ; original mail - [ resent ] ; forwarded - SPACE = <ASCII SP, space> ; ( 40, 32.) - specials = "(" / ")" / "<" / ">" / "@" ; Must be in quoted- - / "," / ";" / ":" / "\" / <"> ; string, to use - / "." / "[" / "]" ; within a word. - sub-domain = domain-ref / domain-literal - text = <any CHAR, including bare ; => atoms, specials, - CR & bare LF, but NOT ; comments and - including CRLF> ; quoted-strings are - ; NOT recognized. - time = hour zone ; ANSI and Military - trace = return ; path to sender - 1*received ; receipt tags - user-defined-field = - <Any field which has not been defined - in this specification or published as an - extension to this specification; names for - such fields must be unique and may be - pre-empted by published extensions> - word = atom / quoted-string - - - - - August 13, 1982 - 46 - RFC #822 - - - - Standard for ARPA Internet Text Messages - - - zone = "UT" / "GMT" ; Universal Time - ; North American : UT - / "EST" / "EDT" ; Eastern: - 5/ - 4 - / "CST" / "CDT" ; Central: - 6/ - 5 - / "MST" / "MDT" ; Mountain: - 7/ - 6 - / "PST" / "PDT" ; Pacific: - 8/ - 7 - / 1ALPHA ; Military: Z = UT; - <"> = <ASCII quote mark> ; ( 42, 34.) 
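A minimal Python sketch of the reduced field view from Appendix B, built on the field, field-name, and field-body rules listed above. It is an illustration only and not part of RFC 822. The names parse_message and get_field are hypothetical, and the sketch assumes the whole message is already in memory as a string with CRLF line endings and at least one header field.

```python
def parse_message(raw):
    """Split a CRLF-delimited message into (fields, body).

    Follows the reduced Appendix B view: field-bodies are treated as
    plain text; addresses and dates are not parsed any further.
    """
    # Headers end at the first null line, i.e. two contiguous CRLFs.
    head, _, body = raw.partition("\r\n\r\n")
    fields = []
    for line in head.split("\r\n"):
        if not line:
            continue
        if line[0] in " \t":
            # A line starting with SPACE or HTAB continues the previous field.
            if not fields:
                raise ValueError("continuation line before any field")
            fields[-1][1] += line  # unfold: drop the CRLF, keep the LWSP-char
        else:
            name, colon, value = line.partition(":")
            if not colon or not name.strip():
                raise ValueError("not a header field: %r" % line)
            # Assumption: tolerate spaces before the colon ("Date : ...").
            fields.append([name.rstrip(), value])
    return [(n, v.strip()) for n, v in fields], body


def get_field(fields, name):
    """Return all bodies for a field-name, compared case-insensitively."""
    return [v for n, v in fields if n.lower() == name.lower()]
```

Keeping the fields as an ordered list of pairs, rather than a dictionary, preserves repeated fields such as "Received" in the order they appeared.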
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - August 13, 1982 - 47 - RFC #822 - diff --git a/proto/sieve/rfc3028.txt b/proto/sieve/rfc3028.txt @@ -1,2019 +0,0 @@ - - - - - - -Network Working Group T. Showalter -Request for Comments: 3028 Mirapoint, Inc. -Category: Standards Track January 2001 - - - Sieve: A Mail Filtering Language - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (2001). All Rights Reserved. - -Abstract - - This document describes a language for filtering e-mail messages at - time of final delivery. It is designed to be implementable on either - a mail client or mail server. It is meant to be extensible, simple, - and independent of access protocol, mail architecture, and operating - system. It is suitable for running on a mail server where users may - not be allowed to execute arbitrary programs, such as on black box - Internet Message Access Protocol (IMAP) servers, as it has no - variables, loops, or ability to shell out to external programs. - -Table of Contents - - 1. Introduction ........................................... 3 - 1.1. Conventions Used in This Document ..................... 4 - 1.2. Example mail messages ................................. 4 - 2. Design ................................................. 5 - 2.1. Form of the Language .................................. 5 - 2.2. Whitespace ............................................ 5 - 2.3. Comments .............................................. 6 - 2.4. Literal Data .......................................... 6 - 2.4.1. 
Numbers ............................................... 6 - 2.4.2. Strings ............................................... 7 - 2.4.2.1. String Lists .......................................... 7 - 2.4.2.2. Headers ............................................... 8 - 2.4.2.3. Addresses ............................................. 8 - 2.4.2.4. MIME Parts ............................................ 9 - 2.5. Tests ................................................. 9 - 2.5.1. Test Lists ............................................ 9 - - - -Showalter Standards Track [Page 1] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - 2.6. Arguments ............................................. 9 - 2.6.1. Positional Arguments .................................. 9 - 2.6.2. Tagged Arguments ...................................... 10 - 2.6.3. Optional Arguments .................................... 10 - 2.6.4. Types of Arguments .................................... 10 - 2.7. String Comparison ..................................... 11 - 2.7.1. Match Type ............................................ 11 - 2.7.2. Comparisons Across Character Sets ..................... 12 - 2.7.3. Comparators ........................................... 12 - 2.7.4. Comparisons Against Addresses ......................... 13 - 2.8. Blocks ................................................ 14 - 2.9. Commands .............................................. 14 - 2.10. Evaluation ............................................ 15 - 2.10.1. Action Interaction .................................... 15 - 2.10.2. Implicit Keep ......................................... 15 - 2.10.3. Message Uniqueness in a Mailbox ....................... 15 - 2.10.4. Limits on Numbers of Actions .......................... 16 - 2.10.5. Extensions and Optional Features ...................... 16 - 2.10.6. Errors ................................................ 17 - 2.10.7. Limits on Execution ................................... 17 - 3. 
Control Commands ....................................... 17 - 3.1. Control Structure If .................................. 18 - 3.2. Control Structure Require ............................. 19 - 3.3. Control Structure Stop ................................ 19 - 4. Action Commands ........................................ 19 - 4.1. Action reject ......................................... 20 - 4.2. Action fileinto ....................................... 20 - 4.3. Action redirect ....................................... 21 - 4.4. Action keep ........................................... 21 - 4.5. Action discard ........................................ 22 - 5. Test Commands .......................................... 22 - 5.1. Test address .......................................... 23 - 5.2. Test allof ............................................ 23 - 5.3. Test anyof ............................................ 24 - 5.4. Test envelope ......................................... 24 - 5.5. Test exists ........................................... 25 - 5.6. Test false ............................................ 25 - 5.7. Test header ........................................... 25 - 5.8. Test not .............................................. 26 - 5.9. Test size ............................................. 26 - 5.10. Test true ............................................. 26 - 6. Extensibility .......................................... 26 - 6.1. Capability String ..................................... 27 - 6.2. IANA Considerations ................................... 28 - 6.2.1. Template for Capability Registrations ................. 28 - 6.2.2. Initial Capability Registrations ...................... 28 - 6.3. Capability Transport .................................. 29 - 7. Transmission ........................................... 29 - - - -Showalter Standards Track [Page 2] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - 8. 
Parsing ................................................ 30 - 8.1. Lexical Tokens ........................................ 30 - 8.2. Grammar ............................................... 31 - 9. Extended Example ....................................... 32 - 10. Security Considerations ................................ 34 - 11. Acknowledgments ........................................ 34 - 12. Author's Address ....................................... 34 - 13. References ............................................. 34 - 14. Full Copyright Statement ............................... 36 - -1. Introduction - - This memo documents a language that can be used to create filters for - electronic mail. It is not tied to any particular operating system or - mail architecture. It requires the use of [IMAIL]-compliant - messages, but should otherwise generalize to many systems. - - The language is powerful enough to be useful but limited in order to - allow for a safe server-side filtering system. The intention is to - make it impossible for users to do anything more complex (and - dangerous) than write simple mail filters, along with facilitating - the use of GUIs for filter creation and manipulation. The language is - not Turing-complete: it provides no way to write a loop or a function - and variables are not provided. - - Scripts written in Sieve are executed during final delivery, when the - message is moved to the user-accessible mailbox. In systems where - the MTA does final delivery, such as traditional Unix mail, it is - reasonable to sort when the MTA deposits mail into the user's - mailbox. - - There are a number of reasons to use a filtering system. Mail - traffic for most users has been increasing due to increased usage of - e-mail, the emergence of unsolicited email as a form of advertising, - and increased usage of mailing lists. 
- - Experience at Carnegie Mellon has shown that if a filtering system is - made available to users, many will make use of it in order to file - messages from specific users or mailing lists. However, many others - did not make use of the Andrew system's FLAMES filtering language - [FLAMES] due to difficulty in setting it up. - - Because of the expectation that users will make use of filtering if - it is offered and easy to use, this language has been made simple - enough to allow many users to make use of it, but rich enough that it - can be used productively. However, it is expected that GUI-based - editors will be the preferred way of editing filters for a large - number of users. - - - -Showalter Standards Track [Page 3] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -1.1. Conventions Used in This Document - - In the sections of this document that discuss the requirements of - various keywords and operators, the following conventions have been - adopted. - - The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and - "MAY" in this document are to be interpreted as defined in - [KEYWORDS]. - - Each section on a command (test, action, or control structure) has a - line labeled "Syntax:". This line describes the syntax of the - command, including its name and its arguments. Required arguments - are listed inside angle brackets ("<" and ">"). Optional arguments - are listed inside square brackets ("[" and "]"). Each argument is - followed by its type, so "<key: string>" represents an argument - called "key" that is a string. Literal strings are represented with - double-quoted strings. Alternatives are separated with slashes, and - parentheses are used for grouping, similar to [ABNF]. - - In the "Syntax" line, there are three special pieces of syntax that - are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART. - These are discussed in sections 2.7.1, 2.7.3, and 2.7.4, - respectively. 
- - The formal grammar for these commands is defined in section 8 and is the - authoritative reference on how to construct commands, but the formal - grammar does not specify the order, semantics, number or types of - arguments to commands, nor the legal command names. The intent is to - allow for extension without changing the grammar. - -1.2. Example mail messages - - The following mail messages will be used throughout this document in - examples. - - Message A - ----------------------------------------------------------- - Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST) - From: coyote@desert.example.org - To: roadrunner@acme.example.com - Subject: I have a present for you - - Look, I'm sorry about the whole anvil thing, and I really - didn't mean to try and drop it on you from the top of the - cliff. I want to try to make it up to you. I've got some - great birdseed over here at my place--top of the line - - - - -Showalter Standards Track [Page 4] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - stuff--and if you come by, I'll have it all wrapped up - for you. I'm really sorry for all the problems I've caused - for you over the years, but I know we can work this out. - -- - Wile E. Coyote "Super Genius" coyote@desert.example.org - ----------------------------------------------------------- - - Message B - ----------------------------------------------------------- - From: youcouldberich!@reply-by-postal-mail.invalid - Sender: b1ff@de.res.example.com - To: rube@landru.example.edu - Date: Mon, 31 Mar 1997 18:26:10 -0800 - Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$ - - YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT - IT! SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS! IT WILL - GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY! - MONEY! MONEY! COLD HARD CASH! YOU WILL RECEIVE OVER - $20,000 IN LESS THAN TWO MONTHS! AND IT'S LEGAL!!!!!!!!! 
- !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1 JUST - SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW! - ----------------------------------------------------------- - -2. Design - -2.1. Form of the Language - - The language consists of a set of commands. Each command consists of - a set of tokens delimited by whitespace. The command identifier is - the first token and it is followed by zero or more argument tokens. - Arguments may be literal data, tags, blocks of commands, or test - commands. - - The language is represented in UTF-8, as specified in [UTF-8]. - - Tokens in the ASCII range are considered case-insensitive. - -2.2. Whitespace - - Whitespace is used to separate tokens. Whitespace is made up of - tabs, newlines (CRLF, never just CR or LF), and the space character. - The amount of whitespace used is not significant. - - - - - - - - -Showalter Standards Track [Page 5] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -2.3. Comments - - Two types of comments are offered. Comments are semantically - equivalent to whitespace and can be used anyplace that whitespace is - (with one exception in multi-line strings, as described in the - grammar). - - Hash comments begin with a "#" character that is not contained within - a string and continue until the next CRLF. - - Example: if size :over 100K { # this is a comment - discard; - } - - Bracketed comments begin with the token "/*" and end with "*/" outside - of a string. Bracketed comments may span multiple lines. Bracketed - comments do not nest. - - Example: if size :over 100K { /* this is a comment - this is still a comment */ discard /* this is a comment - */ ; - } - -2.4. Literal Data - - Literal data means data that is not executed, merely evaluated "as - is", to be used as arguments to commands. Literal data is limited to - numbers and strings. - -2.4.1. Numbers - - Numbers are given as ordinary decimal numbers. 
However, those - numbers that have a tendency to be fairly large, such as message - sizes, MAY have a "K", "M", or "G" appended to indicate a multiple of - a power of two. To be comparable with the power-of-two-based - versions of SI units that computers frequently use, K specifies - kibi-, or 1,024 (2^10) times the value of the number; M specifies - mebi-, or 1,048,576 (2^20) times the value of the number; and G - specifies gibi-, or 1,073,741,824 (2^30) times the value of the - number [BINARY-SI]. - - Implementations MUST provide 31 bits of magnitude in numbers, but MAY - provide more. - - Only positive integers are permitted by this specification. - - - - - - -Showalter Standards Track [Page 6] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -2.4.2. Strings - - Scripts involve large numbers of strings as they are used for pattern - matching, addresses, textual bodies, etc. Typically, short quoted - strings suffice for most uses, but a more convenient form is provided - for longer strings such as bodies of messages. - - A quoted string starts and ends with a single double quote (the <"> - character, ASCII 34). A backslash ("\", ASCII 92) inside of a quoted - string is followed by either another backslash or a double quote. - This two-character sequence represents a single backslash or double- - quote within the string, respectively. - - No other characters should be escaped with a single backslash. - - An undefined escape sequence (such as "\a" in a context where "a" has - no special meaning) is interpreted as if there were no backslash (in - this case, "\a" is just "a"). - - Non-printing characters such as tabs, CR and LF, and control - characters are permitted in quoted strings. Quoted strings MAY span - multiple lines. NUL (ASCII 0) is not allowed in strings. - - For entering larger amounts of text, such as an email message, a - multi-line form is allowed. 
It starts with the keyword "text:", - followed by a CRLF, and ends with the sequence of a CRLF, a single - period, and another CRLF. In order to allow the message to contain - lines with a single-dot, lines are dot-stuffed. That is, when - composing a message body, an extra `.' is added before each line - which begins with a `.'. When the server interprets the script, - these extra dots are removed. Note that a line that begins with a - dot followed by a non-dot character is not interpreted dot-stuffed; - that is, ".foo" is interpreted as ".foo". However, because this is - potentially ambiguous, scripts SHOULD be properly dot-stuffed so such - lines do not appear. - - Note that a hashed comment or whitespace may occur in between the - "text:" and the CRLF, but not within the string itself. Bracketed - comments are not allowed here. - -2.4.2.1. String Lists - - When matching patterns, it is frequently convenient to match against - groups of strings instead of single strings. For this reason, a list - of strings is allowed in many tests, implying that if the test is - true using any one of the strings, then the test is true. - Implementations are encouraged to use short-circuit evaluation in - these cases. - - - -Showalter Standards Track [Page 7] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - For instance, the test `header :contains ["To", "Cc"] - ["me@example.com", "me00@landru.example.edu"]' is true if either the - To header or Cc header of the input message contains either of the - e-mail addresses "me@example.com" or "me00@landru.example.edu". - - Conversely, in any case where a list of strings is appropriate, a - single string is allowed without being a member of a list: it is - equivalent to a list with a single member. This means that the test - `exists "To"' is equivalent to the test `exists ["To"]'. - -2.4.2.2. Headers - - Headers are a subset of strings. 
In the Internet Message - Specification [IMAIL] [RFC1123], each header line is allowed to have - whitespace nearly anywhere in the line, including after the field - name and before the subsequent colon. Extra spaces between the - header name and the ":" in a header field are ignored. - - A header name never contains a colon. The "From" header refers to a - line beginning "From:" (or "From :", etc.). No header will match - the string "From:" due to the trailing colon. - - Folding of long header lines (as described in [IMAIL] 3.4.8) is - removed prior to interpretation of the data. The folding syntax (the - CRLF that ends a line plus any leading whitespace at the beginning of - the next line that indicates folding) are interpreted as if they were - a single space. - -2.4.2.3. Addresses - - A number of commands call for email addresses, which are also a - subset of strings. When these addresses are used in outbound - contexts, addresses must be compliant with [IMAIL], but are further - constrained. Using the symbols defined in [IMAIL], section 6.1, the - syntax of an address is: - - sieve-address = addr-spec ; simple address - / phrase "<" addr-spec ">" ; name & addr-spec - - That is, routes and group syntax are not permitted. If multiple - addresses are required, use a string list. Named groups are not used - here. - - Implementations MUST ensure that the addresses are syntactically - valid, but need not ensure that they actually identify an email - recipient. - - - - - -Showalter Standards Track [Page 8] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -2.4.2.4. MIME Parts - - In a few places, [MIME] body parts are represented as strings. These - parts include MIME headers and the body. This provides a way of - embedding typed data within a Sieve script so that, among other - things, character sets other than UTF-8 can be used for output - messages. - -2.5. Tests - - Tests are given as arguments to commands in order to control their - actions. 
In this document, tests are given to if/elsif/else to - decide which block of code is run. - - Tests MUST NOT have side effects. That is, a test cannot affect the - state of the filter or message. No tests in this specification have - side effects, and side effects are forbidden in extension tests as - well. - - The rationale for this is that tests with side effects impair - readability and maintainability and are difficult to represent in a - graphic interface for generating scripts. Side effects are confined - to actions where they are clearer. - -2.5.1. Test Lists - - Some tests ("allof" and "anyof", which implement logical "and" and - logical "or", respectively) may require more than a single test as an - argument. The test-list syntax element provides a way of grouping - tests. - - Example: if anyof (not exists ["From", "Date"], - header :contains "from" "fool@example.edu") { - discard; - } - -2.6. Arguments - - In order to specify what to do, most commands take arguments. There - are three types of arguments: positional, tagged, and optional. - -2.6.1. Positional Arguments - - Positional arguments are given to a command which discerns their - meaning based on their order. When a command takes positional - arguments, all positional arguments must be supplied and must be in - the order prescribed. - - - - -Showalter Standards Track [Page 9] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -2.6.2. Tagged Arguments - - This document provides for tagged arguments in the style of - CommonLISP. These are also similar to flags given to commands in - most command-line systems. - - A tagged argument is an argument for a command that begins with ":" - followed by a tag naming the argument, such as ":contains". This - argument means that zero or more of the next tokens have some - particular meaning depending on the argument. These next tokens may - be numbers or strings but they are never blocks. 
- - Tagged arguments are similar to positional arguments, except that - instead of the meaning being derived from the command, it is derived - from the tag. - - Tagged arguments must appear before positional arguments, but they - may appear in any order with other tagged arguments. For simplicity - of the specification, this is not expressed in the syntax definitions - with commands, but they still may be reordered arbitrarily provided - they appear before positional arguments. Tagged arguments may be - mixed with optional arguments. - - To simplify this specification, tagged arguments SHOULD NOT take - tagged arguments as arguments. - -2.6.3. Optional Arguments - - Optional arguments are exactly like tagged arguments except that they - may be left out, in which case a default value is implied. Because - optional arguments tend to result in shorter scripts, they have been - used far more than tagged arguments. - - One particularly noteworthy case is the ":comparator" argument, which - allows the user to specify which [ACAP] comparator will be used to - compare two strings, since different languages may impose different - orderings on UTF-8 [UTF-8] characters. - -2.6.4. Types of Arguments - - Abstractly, arguments may be literal data, tests, or blocks of - commands. In this way, an "if" control structure is merely a command - that happens to take a test and a block as arguments and may execute - the block of code. - - However, this abstraction is ambiguous from a parsing standpoint. - The grammar in section 9.2 presents a parsable version of this: - Arguments are string-lists, numbers, and tags, which may be followed - - - -Showalter Standards Track [Page 10] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - by a test or a test-list, which may be followed by a block of - commands. No more than one test or test list, nor more than one - block of commands, may be used, and commands that end with blocks of - commands do not end with semicolons. - -2.7. 
String Comparison - - When matching one string against another, there are a number of ways - of performing the match operation. These are accomplished with three - types of matches: an exact match, a substring match, and a wildcard - glob-style match. These are described below. - - In order to provide for matches between character sets and case - insensitivity, Sieve borrows ACAP's comparator registry. - - However, when a string represents the name of a header, the - comparator is never user-specified. Header comparisons are always - done with the "i;ascii-casemap" operator, i.e., case-insensitive - comparisons, because this is the way things are defined in the - message specification [IMAIL]. - -2.7.1. Match Type - - There are three match types describing the matching used in this - specification: ":is", ":contains", and ":matches". Match type - arguments are supplied to those commands which allow them to specify - what kind of match is to be performed. - - These are used as tagged arguments to tests that perform string - comparison. - - The ":contains" match type describes a substring match. If the value - argument contains the key argument as a substring, the match is true. - For instance, the string "frobnitzm" contains "frob" and "nit", but - not "fbm". The null key ("") is contained in all values. - - The ":is" match type describes an absolute match; if the contents of - the first string are absolutely the same as the contents of the - second string, they match. Only the string "frobnitzm" is the string - "frobnitzm". The null key ":is" and only ":is" the null value. - - The ":matches" version specifies a wildcard match using the - characters "*" and "?". "*" matches zero or more characters, and "?" - matches a single character. "?" and "*" may be escaped as "\\?" and - "\\*" in strings to match against themselves. The first backslash - escapes the second backslash; together, they escape the "*". 
This is - awkward, but it is commonplace in several programming languages that - use globs and regular expressions. - - - -Showalter Standards Track [Page 11] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - In order to specify what type of match is supposed to happen, - commands that support matching take optional tagged arguments - ":matches", ":is", and ":contains". Commands default to using ":is" - matching if no match type argument is supplied. Note that these - modifiers may interact with comparators; in particular, some - comparators are not suitable for matching with ":contains" or - ":matches". It is an error to use a comparator with ":contains" or - ":matches" that is not compatible with it. - - It is an error to give more than one of these arguments to a given - command. - - For convenience, the "MATCH-TYPE" syntax element is defined here as - follows: - - Syntax: ":is" / ":contains" / ":matches" - -2.7.2. Comparisons Across Character Sets - - All Sieve scripts are represented in UTF-8, but messages may involve - a number of character sets. In order for comparisons to work across - character sets, implementations SHOULD implement the following - behavior: - - Implementations decode header charsets to UTF-8. Two strings are - considered equal if their UTF-8 representations are identical. - Implementations should decode charsets represented in the forms - specified by [MIME] for both message headers and bodies. - Implementations must be capable of decoding US-ASCII, ISO-8859-1, - the ASCII subset of ISO-8859-* character sets, and UTF-8. - - If implementations fail to support the above behavior, they MUST - conform to the following: - - No two strings can be considered equal if one contains octets - greater than 127. - -2.7.3. Comparators - - In order to allow for language-independent, case-independent matches, - the match type may be coupled with a comparator name. 
Comparators - are described for [ACAP]; a registry is defined for ACAP, and this - specification uses that registry. - - ACAP defines multiple comparator types. Only equality types are used - in this specification. - - - - - -Showalter Standards Track [Page 12] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - All implementations MUST support the "i;octet" comparator (simply - compares octets) and the "i;ascii-casemap" comparator (which treats - uppercase and lowercase characters in the ASCII subset of UTF-8 as - the same). If left unspecified, the default is "i;ascii-casemap". - - Some comparators may not be usable with substring matches; that is, - they may only work with ":is". It is an error to try and use a - comparator with ":matches" or ":contains" that is not compatible with - it. - - A comparator is specified by the ":comparator" option with commands - that support matching. This option is followed by a string providing - the name of the comparator to be used. For convenience, the syntax - of a comparator is abbreviated to "COMPARATOR", and (repeated in - several tests) is as follows: - - Syntax: ":comparator" <comparator-name: string> - - So in this example, - - Example: if header :contains :comparator "i;octet" "Subject" - "MAKE MONEY FAST" { - discard; - } - - would discard any message with subjects like "You can MAKE MONEY - FAST", but not "You can Make Money Fast", since the comparator used - is case-sensitive. - - Comparators other than i;octet and i;ascii-casemap must be declared - with require, as they are extensions. If a comparator declared with - require is not known, it is an error, and execution fails. If the - comparator is not declared with require, it is also an error, even if - the comparator is supported. (See 2.10.5.) - - Both ":matches" and ":contains" match types are compatible with the - "i;octet" and "i;ascii-casemap" comparators and may be used with - them. 
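The match types and the two mandatory comparators described above can be sketched roughly as follows. This is a minimal illustration in Python, not code from rohrpost or from the RFC; the function names (`normalize`, `glob_to_regex`, `string_match`) are hypothetical. It assumes the strings have already been unquoted by a Sieve parser, so a key written as "\\*" in a script arrives here as a backslash followed by "*".

```python
import re


def normalize(s: str, comparator: str) -> str:
    """Apply a comparator: i;octet leaves the string alone,
    i;ascii-casemap folds the ASCII letters a-z to upper case."""
    if comparator == "i;octet":
        return s
    if comparator == "i;ascii-casemap":
        # Only ASCII letters are folded; other characters are untouched.
        return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)
    raise ValueError(f"unsupported comparator: {comparator}")


def glob_to_regex(key: str) -> "re.Pattern[str]":
    """Translate a ':matches' key into a regular expression.
    '*' matches zero or more characters, '?' exactly one;
    a backslash makes the following character literal."""
    out = []
    i = 0
    while i < len(key):
        c = key[i]
        if c == "\\" and i + 1 < len(key):
            out.append(re.escape(key[i + 1]))
            i += 2
            continue
        if c == "*":
            out.append(".*")
        elif c == "?":
            out.append(".")
        else:
            out.append(re.escape(c))
        i += 1
    return re.compile("".join(out), re.DOTALL)


def string_match(value: str, key: str, match_type: str = ":is",
                 comparator: str = "i;ascii-casemap") -> bool:
    """Compare value against key using the given match type and comparator."""
    v, k = normalize(value, comparator), normalize(key, comparator)
    if match_type == ":is":
        return v == k
    if match_type == ":contains":
        return k in v  # the empty key is contained in every value
    if match_type == ":matches":
        return glob_to_regex(k).fullmatch(v) is not None
    raise ValueError(f"unknown match type: {match_type}")
```

With these assumptions, the "MAKE MONEY FAST" example behaves as the text describes: `string_match("You can Make Money Fast", "MAKE MONEY FAST", ":contains", "i;octet")` is false, while the same call with the default comparator is true.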
- - It is an error to give more than one of these arguments to a given - command. - -2.7.4. Comparisons Against Addresses - - Addresses are one of the most frequent things represented as strings. - These are structured, and being able to compare against the local- - part or the domain of an address is useful, so some tests that act - - - - -Showalter Standards Track [Page 13] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - exclusively on addresses take an additional optional argument that - specifies what the test acts on. - - These optional arguments are ":localpart", ":domain", and ":all", - which act on the local-part (left-side), the domain part (right- - side), and the whole address. - - The kind of comparison done, such as whether or not the test done is - case-insensitive, is specified as a comparator argument to the test. - - If an optional address-part is omitted, the default is ":all". - - It is an error to give more than one of these arguments to a given - command. - - For convenience, the "ADDRESS-PART" syntax element is defined here as - follows: - - Syntax: ":localpart" / ":domain" / ":all" - -2.8. Blocks - - Blocks are sets of commands enclosed within curly braces. Blocks are - supplied to commands so that the commands can implement control - commands. - - A control structure is a command that happens to take a test and a - block as one of its arguments; depending on the result of the test - supplied as another argument, it runs the code in the block some - number of times. - - With the commands supplied in this memo, there are no loops. The - control structures supplied--if, elsif, and else--run a block either - once or not at all. So there are two arguments, the test and the - block. - -2.9. Commands - - Sieve scripts are sequences of commands. Commands can take any of - the tokens above as arguments, and arguments may be either tagged or - positional arguments. Not all commands take all arguments. 
- - There are three kinds of commands: test commands, action commands, - and control commands. - - The simplest is an action command. An action command is an - identifier followed by zero or more arguments, terminated by a - semicolon. Action commands do not take tests or blocks as arguments. - - - -Showalter Standards Track [Page 14] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - A control command is similar, but it takes a test as an argument, and - ends with a block instead of a semicolon. - - A test command is used as part of a control command. It is used to - specify whether or not the block of code given to the control command - is executed. - -2.10. Evaluation - -2.10.1. Action Interaction - - Some actions cannot be used with other actions because the result - would be absurd. These restrictions are noted throughout this memo. - - Extension actions MUST state how they interact with actions defined - in this specification. - -2.10.2. Implicit Keep - - Previous experience with filtering systems suggests that cases tend - to be missed in scripts. To prevent errors, Sieve has an "implicit - keep". - - An implicit keep is a keep action (see 4.4) performed in absence of - any action that cancels the implicit keep. - - An implicit keep is performed if a message is not written to a - mailbox, redirected to a new address, or explicitly thrown out. That - is, if a fileinto, a keep, a redirect, or a discard is performed, an - implicit keep is not. - - Some actions may be defined to not cancel the implicit keep. These - actions may not directly affect the delivery of a message, and are - used for their side effects. None of the actions specified in this - document meet that criteria, but extension actions will. - - For instance, with any of the short messages offered above, the - following script produces no actions. - - Example: if size :over 500K { discard; } - - As a result, the implicit keep is taken. - -2.10.3. 
Message Uniqueness in a Mailbox - - Implementations SHOULD NOT deliver a message to the same folder more - than once, even if a script explicitly asks for a message to be - written to a mailbox twice. - - - -Showalter Standards Track [Page 15] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - The test for equality of two messages is implementation-defined. - - If a script asks for a message to be written to a mailbox twice, it - MUST NOT be treated as an error. - -2.10.4. Limits on Numbers of Actions - - Site policy MAY limit numbers of actions taken and MAY impose - restrictions on which actions can be used together. In the event - that a script hits a policy limit on the number of actions taken for - a particular message, an error occurs. - - Implementations MUST prohibit more than one reject. - - Implementations MUST allow at least one keep or one fileinto. If - fileinto is not implemented, implementations MUST allow at least one - keep. - - Implementations SHOULD prohibit reject when used with other actions. - -2.10.5. Extensions and Optional Features - - Because of the differing capabilities of many mail systems, several - features of this specification are optional. Before any of these - extensions can be executed, they must be declared with the "require" - action. - - If an extension is not enabled with "require", implementations MUST - treat it as if they did not support it at all. - - If a script does not understand an extension declared with require, - the script must not be used at all. Implementations MUST NOT execute - scripts which require unknown capability names. - - Note: The reason for this restriction is that prior experiences with - languages such as LISP and Tcl suggest that this is a workable - way of noting that a given script uses an extension. - - Experience with PostScript suggests that mechanisms that allow - a script to work around missing extensions are not used in - practice. 
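The "require" rules in this section could be checked before execution along the following lines. This is only a sketch in Python under stated assumptions: `SUPPORTED` is a made-up capability set for an imaginary server, and `check_require` and `SieveError` are hypothetical names, not part of any real implementation.

```python
class SieveError(Exception):
    """Raised when a script must not be executed."""


# Hypothetical set of extensions that this imaginary server implements.
SUPPORTED = frozenset({"fileinto", "reject", "envelope"})


def check_require(required, used, supported=SUPPORTED):
    """Validate a script's capabilities.

    required: capability names declared with "require"
    used:     capability names of the extensions the script actually uses
    """
    # A script that requires an unknown capability must not run at all.
    unknown = sorted(set(required) - set(supported))
    if unknown:
        raise SieveError("unknown capabilities: " + ", ".join(unknown))
    # An extension that was not declared is treated as unsupported,
    # even if the server implements it.
    undeclared = sorted(set(used) - set(required))
    if undeclared:
        raise SieveError("used without require: " + ", ".join(undeclared))
```

A parser would call `check_require` once, after collecting the leading require commands, and refuse the whole script on failure.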
- - Extensions which define actions MUST state how they interact with - actions discussed in the base specification. - - - - - - - -Showalter Standards Track [Page 16] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -2.10.6. Errors - - In any programming language, there are compile-time and run-time - errors. - - Compile-time errors are ones in syntax that are detectable if a - syntax check is done. - - Run-time errors are not detectable until the script is run. This - includes transient failures like disk full conditions, but also - includes issues like invalid combinations of actions. - - When an error occurs in a Sieve script, all processing stops. - - Implementations MAY choose to do a full parse, then evaluate the - script, then do all actions. Implementations might even go so far as - to ensure that execution is atomic (either all actions are executed - or none are executed). - - Other implementations may choose to parse and run at the same time. - Such implementations are simpler, but have issues with partial - failure (some actions happen, others don't). - - Implementations might even go so far as to ensure that scripts can - never execute an invalid set of actions (e.g., reject + fileinto) - before execution, although this could involve solving the Halting - Problem. - - This specification allows any of these approaches. Solving the - Halting Problem is considered extra credit. - - When an error happens, implementations MUST notify the user that an - error occurred, which actions (if any) were taken, and do an implicit - keep. - -2.10.7. Limits on Execution - - Implementations may limit certain constructs. However, this - specification places a lower bound on some of these limits. - - Implementations MUST support fifteen levels of nested blocks. - - Implementations MUST support fifteen levels of nested test lists. - -3. Control Commands - - Control structures are needed to allow for multiple and conditional - actions. 
- - - -Showalter Standards Track [Page 17] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -3.1. Control Structure If - - There are three pieces to if: "if", "elsif", and "else". Each is - actually a separate command in terms of the grammar. However, an - elsif MUST only follow an if, and an else MUST follow only either an - if or an elsif. An error occurs if these conditions are not met. - - Syntax: if <test1: test> <block1: block> - - Syntax: elsif <test2: test> <block2: block> - - Syntax: else <block> - - The semantics are similar to those of any of the many other - programming languages these control commands appear in. When the - interpreter sees an "if", it evaluates the test associated with it. - If the test is true, it executes the block associated with it. - - If the test of the "if" is false, it evaluates the test of the first - "elsif" (if any). If the test of "elsif" is true, it runs the - elsif's block. An elsif may be followed by an elsif, in which case, - the interpreter repeats this process until it runs out of elsifs. - - When the interpreter runs out of elsifs, there may be an "else" case. - If there is, and none of the if or elsif tests were true, the - interpreter runs the else case. - - This provides a way of performing exactly one of the blocks in the - chain. - - In the following example, both Message A and B are dropped. - - Example: require "fileinto"; - if header :contains "from" "coyote" { - discard; - } elsif header :contains ["subject"] ["$$$"] { - discard; - } else { - fileinto "INBOX"; - } - - - When the script below is run over message A, it redirects the message - to acm@example.edu; message B, to postmaster@example.edu; any other - message is redirected to field@example.edu. 
- - - - - - -Showalter Standards Track [Page 18] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - Example: if header :contains ["From"] ["coyote"] { - redirect "acm@example.edu"; - } elsif header :contains "Subject" "$$$" { - redirect "postmaster@example.edu"; - } else { - redirect "field@example.edu"; - } - - Note that this definition prohibits the "... else if ..." sequence - used by C. This is intentional, because this construct produces a - shift-reduce conflict. - -3.2. Control Structure Require - - Syntax: require <capabilities: string-list> - - The require action notes that a script makes use of a certain - extension. Such a declaration is required to use the extension, as - discussed in section 2.10.5. Multiple capabilities can be declared - with a single require. - - The require command, if present, MUST be used before anything other - than a require can be used. An error occurs if a require appears - after a command other than require. - - Example: require ["fileinto", "reject"]; - - Example: require "fileinto"; - require "vacation"; - -3.3. Control Structure Stop - - Syntax: stop - - The "stop" action ends all processing. If no actions have been - executed, then the keep action is taken. - -4. Action Commands - - This document supplies five actions that may be taken on a message: - keep, fileinto, redirect, reject, and discard. - - Implementations MUST support the "keep", "discard", and "redirect" - actions. - - Implementations SHOULD support "reject" and "fileinto". - - - - - -Showalter Standards Track [Page 19] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - Implementations MAY limit the number of certain actions taken (see - section 2.10.4). - -4.1. Action reject - - Syntax: reject <reason: string> - - The optional "reject" action refuses delivery of a message by sending - back an [MDN] to the sender. It resends the message to the sender, - wrapping it in a "reject" form, noting that it was rejected by the - recipient. 
In the following script, message A is rejected and - returned to the sender. - - Example: if header :contains "from" "coyote@desert.example.org" { - reject "I am not taking mail from you, and I don't want - your birdseed, either!"; - } - - A reject message MUST take the form of a failure MDN as specified by - [MDN]. The human-readable portion of the message, the first - component of the MDN, contains the human readable message describing - the error, and it SHOULD contain additional text alerting the - original sender that mail was refused by a filter. This part of the - MDN might appear as follows: - - ------------------------------------------------------------ - Message was refused by recipient's mail filtering program. Reason - given was as follows: - - I am not taking mail from you, and I don't want your birdseed, - either! - ------------------------------------------------------------ - - The MDN action-value field as defined in the MDN specification MUST - be "deleted" and MUST have the MDN-sent-automatically and automatic- - action modes set. - - Because some implementations can not or will not implement the reject - command, it is optional. The capability string to be used with the - require command is "reject". - -4.2. Action fileinto - - Syntax: fileinto <folder: string> - - The "fileinto" action delivers the message into the specified folder. - Implementations SHOULD support fileinto, but in some environments - this may be impossible. - - - -Showalter Standards Track [Page 20] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - The capability string for use with the require command is "fileinto". - - In the following script, message A is filed into folder - "INBOX.harassment". - - Example: require "fileinto"; - if header :contains ["from"] "coyote" { - fileinto "INBOX.harassment"; - } - -4.3. 
Action redirect - - Syntax: redirect <address: string> - - The "redirect" action is used to send the message to another user at - a supplied address, as a mail forwarding feature does. The - "redirect" action makes no changes to the message body or existing - headers, but it may add new headers. The "redirect" modifies the - envelope recipient. - - The redirect command performs an MTA-style "forward"--that is, what - you get from a .forward file using sendmail under UNIX. The address - on the SMTP envelope is replaced with the one on the redirect command - and the message is sent back out. (This is not an MUA-style forward, - which creates a new message with a different sender and message ID, - wrapping the old message in a new one.) - - A simple script can be used for redirecting all mail: - - Example: redirect "bart@example.edu"; - - Implementations SHOULD take measures to implement loop control, - possibly including adding headers to the message or counting received - headers. If an implementation detects a loop, it causes an error. - -4.4. Action keep - - Syntax: keep - - The "keep" action is whatever action is taken in lieu of all other - actions, if no filtering happens at all; generally, this simply means - to file the message into the user's main mailbox. This command - provides a way to execute this action without needing to know the - name of the user's main mailbox, providing a way to call it without - needing to understand the user's setup, or the underlying mail - system. - - - - - -Showalter Standards Track [Page 21] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - For instance, in an implementation where the IMAP server is running - scripts on behalf of the user at time of delivery, a keep command is - equivalent to a fileinto "INBOX". - - Example: if size :under 1M { keep; } else { discard; } - - Note that the above script is identical to the one below. - - Example: if not size :under 1M { discard; } - -4.5. 
Action discard - - Syntax: discard - - Discard is used to silently throw away the message. It does so by - simply canceling the implicit keep. If discard is used with other - actions, the other actions still happen. Discard is compatible with - all other actions. (For instance fileinto+discard is equivalent to - fileinto.) - - Discard MUST be silent; that is, it MUST NOT return a non-delivery - notification of any kind ([DSN], [MDN], or otherwise). - - In the following script, any mail from "idiot@example.edu" is thrown - out. - - Example: if header :contains ["from"] ["idiot@example.edu"] { - discard; - } - - While an important part of this language, "discard" has the potential - to create serious problems for users: Students who leave themselves - logged in to an unattended machine in a public computer lab may find - their script changed to just "discard". In order to protect users in - this situation (along with similar situations), implementations MAY - keep messages destroyed by a script for an indefinite period, and MAY - disallow scripts that throw out all mail. - -5. Test Commands - - Tests are used in conditionals to decide which part(s) of the - conditional to execute. - - Implementations MUST support these tests: "address", "allof", - "anyof", "exists", "false", "header", "not", "size", and "true". - - Implementations SHOULD support the "envelope" test. - - - - -Showalter Standards Track [Page 22] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -5.1. Test address - - Syntax: address [ADDRESS-PART] [COMPARATOR] [MATCH-TYPE] - <header-list: string-list> <key-list: string-list> - - The address test matches Internet addresses in structured headers - that contain addresses. It returns true if any header contains any - key in the specified part of the address, as modified by the - comparator and the match keyword. - - Like envelope and header, this test returns true if any combination - of the header-list and key-list arguments match. 
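As a rough illustration of the "any combination" rule together with the ADDRESS-PART arguments (":localpart", ":domain", ":all"), consider the Python sketch below. It assumes the addr-specs have already been parsed out of the headers (no phrases, comments, or groups), uses an ":is" match with ASCII case folding only, and all names (`address_part`, `address_test`, the `addresses` mapping) are hypothetical.

```python
def ascii_fold(s: str) -> str:
    """Lower-case ASCII letters only, in the spirit of i;ascii-casemap."""
    return "".join(c.lower() if c.isascii() else c for c in s)


def address_part(addr_spec: str, part: str = ":all") -> str:
    """Return the part of an addr-spec that a test acts on."""
    # rpartition splits at the last "@", since a quoted local-part
    # may itself contain an at-sign.
    local, sep, domain = addr_spec.rpartition("@")
    if part == ":all":
        return addr_spec
    if part == ":localpart":
        return local if sep else addr_spec
    if part == ":domain":
        return domain if sep else ""
    raise ValueError(f"unknown address part: {part}")


def address_test(addresses, header_list, key_list, part=":all"):
    """True if any listed header holds an address whose selected part
    equals any key.

    addresses maps a lower-cased header name to the list of addr-specs
    found in that header (an assumed, pre-parsed representation).
    """
    for name in header_list:
        for addr in addresses.get(name.lower(), []):
            value = ascii_fold(address_part(addr, part))
            if any(value == ascii_fold(key) for key in key_list):
                return True
    return False
```

For example, with `addresses = {"from": ["tim@example.com"]}`, the call `address_test(addresses, ["From"], ["tim@example.com"])` corresponds to `address :is :all "from" "tim@example.com"`.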
- - Internet email addresses [IMAIL] have the somewhat awkward - characteristic that the local-part to the left of the at-sign is - considered case sensitive, and the domain-part to the right of the - at-sign is case insensitive. The "address" command does not deal - with this itself, but provides the ADDRESS-PART argument for allowing - users to deal with it. - - The address primitive never acts on the phrase part of an email - address, nor on comments within that address. It also never acts on - group names, although it does act on the addresses within the group - construct. - - Implementations MUST restrict the address test to headers that - contain addresses, but MUST include at least From, To, Cc, Bcc, - Sender, Resent-From, Resent-To, and SHOULD include any other header - that utilizes an "address-list" structured header body. - - Example: if address :is :all "from" "tim@example.com" { - discard; - } - -5.2. Test allof - - Syntax: allof <tests: test-list> - - The allof test performs a logical AND on the tests supplied to it. - - Example: allof (false, false) => false - allof (false, true) => false - allof (true, true) => true - - The allof test takes as its argument a test-list. - - - - - - - -Showalter Standards Track [Page 23] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -5.3. Test anyof - - Syntax: anyof <tests: test-list> - - The anyof test performs a logical OR on the tests supplied to it. - - Example: anyof (false, false) => false - anyof (false, true) => true - anyof (true, true) => true - -5.4. Test envelope - - Syntax: envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] - <envelope-part: string-list> <key-list: string-list> - - The "envelope" test is true if the specified part of the SMTP (or - equivalent) envelope matches the specified key. - - If one of the envelope-part strings is (case insensitive) "from", - then matching occurs against the FROM address used in the SMTP MAIL - command.
- - If one of the envelope-part strings is (case insensitive) "to", then - matching occurs against the TO address used in the SMTP RCPT command - that resulted in this message getting delivered to this user. Note - that only the most recent TO is available, and only the one relevant - to this user. - - The envelope-part is a string list and may contain more than one - parameter, in which case all of the strings specified in the key-list - are matched against all parts given in the envelope-part list. - - Like address and header, this test returns true if any combination of - the envelope-part and key-list arguments is true. - - All tests against envelopes MUST drop source routes. - - If the SMTP transaction involved several RCPT commands, only the data - from the RCPT command that caused delivery to this user is available - in the "to" part of the envelope. - - If a protocol other than SMTP is used for message transport, - implementations are expected to adapt this command appropriately. - - The envelope command is optional. Implementations SHOULD support it, - but the necessary information may not be available in all cases. - - - - - -Showalter Standards Track [Page 24] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - Example: require "envelope"; - if envelope :all :is "from" "tim@example.com" { - discard; - } - -5.5. Test exists - - Syntax: exists <header-names: string-list> - - The "exists" test is true if the headers listed in the header-names - argument exist within the message. All of the headers must exist or - the test is false. - - The following example throws out mail that doesn't have a From header - and a Date header. - - Example: if not exists ["From","Date"] { - discard; - } - -5.6. Test false - - Syntax: false - - The "false" test always evaluates to false. - -5.7. 
Test header - - Syntax: header [COMPARATOR] [MATCH-TYPE] - <header-names: string-list> <key-list: string-list> - - The "header" test evaluates to true if any header name matches any - key. The type of match is specified by the optional match argument, - which defaults to ":is" if not specified, as specified in section - 2.6. - - Like address and envelope, this test returns true if any combination - of the string-list and key-list arguments match. - - If a header listed in the header-names argument exists, it contains - the null key (""). However, if the named header is not present, it - does not contain the null key. So if a message contained the header - - X-Caffeine: C8H10N4O2 - - - - - - - -Showalter Standards Track [Page 25] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - these tests on that header evaluate as follows: - - header :is ["X-Caffeine"] [""] => false - header :contains ["X-Caffeine"] [""] => true - -5.8. Test not - - Syntax: not <test> - - The "not" test takes some other test as an argument, and yields the - opposite result. "not false" evaluates to "true" and "not true" - evaluates to "false". - -5.9. Test size - - Syntax: size <":over" / ":under"> <limit: number> - - The "size" test deals with the size of a message. It takes either a - tagged argument of ":over" or ":under", followed by a number - representing the size of the message. - - If the argument is ":over", and the size of the message is greater - than the number provided, the test is true; otherwise, it is false. - - If the argument is ":under", and the size of the message is less than - the number provided, the test is true; otherwise, it is false. - - Exactly one of ":over" or ":under" must be specified, and anything - else is an error. - - The size of a message is defined to be the number of octets from the - initial header until the last character in the message body. 
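The ":over" and ":under" rules above can be sketched in a few lines. This is a minimal illustration, not code from rohrpost or from the RFC: the names `parse_number` and `size_test` are hypothetical, and the K/M/G multipliers (2^10, 2^20, 2^30) are taken from the number syntax in section 2.4.1 of RFC 3028, which lies outside this excerpt.

```python
# Hypothetical helpers illustrating the Sieve "size" test; not part of any real library.
QUANTIFIERS = {"K": 2**10, "M": 2**20, "G": 2**30}


def parse_number(token: str) -> int:
    """Turn a Sieve number token such as "4000" or "1M" into an octet count."""
    if token and token[-1].upper() in QUANTIFIERS:
        return int(token[:-1]) * QUANTIFIERS[token[-1].upper()]
    return int(token)


def size_test(message: bytes, tag: str, limit: int) -> bool:
    """Compare the message size in octets against the limit, strictly."""
    size = len(message)  # first header octet through last body octet
    if tag == ":over":
        return size > limit
    if tag == ":under":
        return size < limit
    # Anything other than exactly one of the two tags is an error.
    raise ValueError("size requires exactly one of ':over' or ':under'")


message = b"x" * 4000
print(size_test(message, ":over", parse_number("4000")))   # False
print(size_test(message, ":under", parse_number("4000")))  # False
print(size_test(message, ":under", parse_number("1M")))    # True
```

Both comparisons are strict, so a message whose size equals the limit satisfies neither tag.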
- - Note that for a message that is exactly 4,000 octets, the message is - neither ":over" 4000 octets or ":under" 4000 octets. - -5.10. Test true - - Syntax: true - - The "true" test always evaluates to true. - -6. Extensibility - - New control structures, actions, and tests can be added to the - language. Sites must make these features known to their users; this - document does not define a way to discover the list of extensions - supported by the server. - - - -Showalter Standards Track [Page 26] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - Any extensions to this language MUST define a capability string that - uniquely identifies that extension. If a new version of an extension - changes the functionality of a previously defined extension, it MUST - use a different name. - - In a situation where there is a submission protocol and an extension - advertisement mechanism aware of the details of this language, - scripts submitted can be checked against the mail server to prevent - use of an extension that the server does not support. - - Extensions MUST state how they interact with constraints defined in - section 2.10, e.g., whether they cancel the implicit keep, and which - actions they are compatible and incompatible with. - -6.1. Capability String - - Capability strings are typically short strings describing what - capabilities are supported by the server. - - Capability strings beginning with "vnd." represent vendor-defined - extensions. Such extensions are not defined by Internet standards or - RFCs, but are still registered with IANA in order to prevent - conflicts. Extensions starting with "vnd." SHOULD be followed by the - name of the vendor and product, such as "vnd.acme.rocket-sled". - - The following capability strings are defined by this document: - - envelope The string "envelope" indicates that the implementation - supports the "envelope" command. 
- - fileinto The string "fileinto" indicates that the implementation - supports the "fileinto" command. - - reject The string "reject" indicates that the implementation - supports the "reject" command. - - comparator- The string "comparator-elbonia" is provided if the - implementation supports the "elbonia" comparator. - Therefore, all implementations have at least the - "comparator-i;octet" and "comparator-i;ascii-casemap" - capabilities. However, these comparators may be used - without being declared with require. - - - - - - - - - -Showalter Standards Track [Page 27] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -6.2. IANA Considerations - - In order to provide a standard set of extensions, a registry is - provided by IANA. Capability names may be registered on a first- - come, first-served basis. Extensions designed for interoperable use - SHOULD be defined as standards track or IESG approved experimental - RFCs. - -6.2.1. Template for Capability Registrations - - The following template is to be used for registering new Sieve - extensions with IANA. - - To: iana@iana.org - Subject: Registration of new Sieve extension - - Capability name: - Capability keyword: - Capability arguments: - Standards Track/IESG-approved experimental RFC number: - Person and email address to contact for further information: - -6.2.2. Initial Capability Registrations - - The following are to be added to the IANA registry for Sieve - extensions as the initial contents of the capability registry. 
- - Capability name: fileinto - Capability keyword: fileinto - Capability arguments: fileinto <folder: string> - Standards Track/IESG-approved experimental RFC number: - RFC 3028 (Sieve base spec) - Person and email address to contact for further information: - Tim Showalter - tjs@mirapoint.com - - Capability name: reject - Capability keyword: reject - Capability arguments: reject <reason: string> - Standards Track/IESG-approved experimental RFC number: - RFC 3028 (Sieve base spec) - Person and email address to contact for further information: - Tim Showalter - tjs@mirapoint.com - - - - - - - -Showalter Standards Track [Page 28] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - Capability name: envelope - Capability keyword: envelope - Capability arguments: - envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] - <envelope-part: string-list> <key-list: string-list> - Standards Track/IESG-approved experimental RFC number: - RFC 3028 (Sieve base spec) - Person and email address to contact for further information: - Tim Showalter - tjs@mirapoint.com - - Capability name: comparator-* - Capability keyword: - comparator-* (anything starting with "comparator-") - Capability arguments: (none) - Standards Track/IESG-approved experimental RFC number: - RFC 3028, Sieve, by reference of - RFC 2244, Application Configuration Access Protocol - Person and email address to contact for further information: - Tim Showalter - tjs@mirapoint.com - -6.3. Capability Transport - - As the range of mail systems that this document is intended to apply - to is quite varied, a method of advertising which capabilities an - implementation supports is difficult due to the wide range of - possible implementations. Such a mechanism, however, should have - property that the implementation can advertise the complete set of - extensions that it supports. - -7. Transmission - - The MIME type for a Sieve script is "application/sieve". 
- - The registration of this type for RFC 2048 requirements is as - follows: - - Subject: Registration of MIME media type application/sieve - - MIME media type name: application - MIME subtype name: sieve - Required parameters: none - Optional parameters: none - Encoding considerations: Most sieve scripts will be textual, - written in UTF-8. When non-7bit characters are used, - quoted-printable is appropriate for transport systems - that require 7bit encoding. - - - -Showalter Standards Track [Page 29] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - Security considerations: Discussed in section 10 of RFC 3028. - Interoperability considerations: Discussed in section 2.10.5 - of RFC 3028. - Published specification: RFC 3028. - Applications which use this media type: sieve-enabled mail servers - Additional information: - Magic number(s): - File extension(s): .siv - Macintosh File Type Code(s): - Person & email address to contact for further information: - See the discussion list at ietf-mta-filters@imc.org. - Intended usage: - COMMON - Author/Change controller: - See Author information in RFC 3028. - -8. Parsing - - The Sieve grammar is separated into tokens and a separate grammar as - most programming languages are. - -8.1. Lexical Tokens - - Sieve scripts are encoded in UTF-8. The following assumes a valid - UTF-8 encoding; special characters in Sieve scripts are all ASCII. - - The following are tokens in Sieve: - - - identifiers - - tags - - numbers - - quoted strings - - multi-line strings - - other separators - - Blanks, horizontal tabs, CRLFs, and comments ("white space") are - ignored except as they separate tokens. Some white space is required - to separate otherwise adjacent tokens and in specific places in the - multi-line strings. - - The other separators are single individual characters, and are - mentioned explicitly in the grammar. 
-
-   The lexical structure of sieve is defined in the following BNF (as
-   described in [ABNF]):
-
-
-
-
-
-
-Showalter Standards Track [Page 30]
-
-RFC 3028 Sieve: A Mail Filtering Language January 2001
-
-
-   bracket-comment = "/*" *(CHAR-NOT-STAR / ("*" CHAR-NOT-SLASH)) "*/"
-      ;; No */ allowed inside a comment.
-      ;; (No * is allowed unless it is the last character,
-      ;; or unless it is followed by a character that isn't a
-      ;; slash.)
-
-   CHAR-NOT-DOT = (%x01-09 / %x0b-0c / %x0e-2d / %x2f-ff)
-      ;; no dots, no CRLFs
-
-   CHAR-NOT-CRLF = (%x01-09 / %x0b-0c / %x0e-ff)
-
-   CHAR-NOT-SLASH = (%x00-57 / %x58-ff)
-
-   CHAR-NOT-STAR = (%x00-51 / %x53-ff)
-
-   comment = bracket-comment / hash-comment
-
-   hash-comment = ( "#" *CHAR-NOT-CRLF CRLF )
-
-   identifier = (ALPHA / "_") *(ALPHA DIGIT "_")
-
-   tag = ":" identifier
-
-   number = 1*DIGIT [QUANTIFIER]
-
-   QUANTIFIER = "K" / "M" / "G"
-
-   quoted-string = DQUOTE *CHAR DQUOTE
-      ;; in general, \ CHAR inside a string maps to CHAR
-      ;; so \" maps to " and \\ maps to \
-      ;; note that newlines and other characters are all allowed
-      ;; strings
-
-   multi-line = "text:" *(SP / HTAB) (hash-comment / CRLF)
-      *(multi-line-literal / multi-line-dotstuff)
-      "." CRLF
-   multi-line-literal = [CHAR-NOT-DOT *CHAR-NOT-CRLF] CRLF
-   multi-line-dotstuff = "." 1*CHAR-NOT-CRLF CRLF
-      ;; A line containing only "." ends the multi-line.
-      ;; Remove a leading '.' if followed by another '.'.
-
-   white-space = 1*(SP / CRLF / HTAB) / comment
-
-8.2. Grammar
-
-   The following is the grammar of Sieve after it has been lexically
-   interpreted. No white space or comments appear below. The start
-   symbol is "start".
- - - -Showalter Standards Track [Page 31] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - argument = string-list / number / tag - - arguments = *argument [test / test-list] - - block = "{" commands "}" - - command = identifier arguments ( ";" / block ) - - commands = *command - - start = commands - - string = quoted-string / multi-line - - string-list = "[" string *("," string) "]" / string ;; if - there is only a single string, the brackets are optional - - test = identifier arguments - - test-list = "(" test *("," test) ")" - -9. Extended Example - - The following is an extended example of a Sieve script. Note that it - does not make use of the implicit keep. - - # - # Example Sieve Filter - # Declare any optional features or extension used by the script - # - require ["fileinto", "reject"]; - - # - # Reject any large messages (note that the four leading dots get - # "stuffed" to three) - # - if size :over 1M - { - reject text: - Please do not send me large attachments. - Put your file on a server and send me the URL. - Thank you. - .... Fred - . - ; - stop; - } - # - - - -Showalter Standards Track [Page 32] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - # Handle messages from known mailing lists - # Move messages from IETF filter discussion list to filter folder - # - if header :is "Sender" "owner-ietf-mta-filters@imc.org" - { - fileinto "filter"; # move to "filter" folder - } - # - # Keep all messages to or from people in my company - # - elsif address :domain :is ["From", "To"] "example.com" - { - keep; # keep in "In" folder - } - - # - # Try and catch unsolicited email. If a message is not to me, - # or it contains a subject known to be spam, file it away. - # - elsif anyof (not address :all :contains - ["To", "Cc", "Bcc"] "me@example.com", - header :matches "subject" - ["*make*money*fast*", "*university*dipl*mas*"]) - { - # If message header does not contain my address, - # it's from a list. 
- fileinto "spam"; # move to "spam" folder - } - else - { - # Move all other (non-company) mail to "personal" - # folder. - fileinto "personal"; - } - - - - - - - - - - - - - - - - - -Showalter Standards Track [Page 33] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -10. Security Considerations - - Users must get their mail. It is imperative that whatever method - implementations use to store the user-defined filtering scripts be - secure. - - It is equally important that implementations sanity-check the user's - scripts, and not allow users to create on-demand mailbombs. For - instance, an implementation that allows a user to reject or redirect - multiple times to a single message might also allow a user to create - a mailbomb triggered by mail from a specific user. Site- or - implementation-defined limits on actions are useful for this. - - Several commands, such as "discard", "redirect", and "fileinto" allow - for actions to be taken that are potentially very dangerous. - - Implementations SHOULD take measures to prevent languages from - looping. - -11. Acknowledgments - - I am very thankful to Chris Newman for his support and his ABNF - syntax checker, to John Myers and Steve Hole for outlining the - requirements for the original drafts, to Larry Greenfield for nagging - me about the grammar and finally fixing it, to Greg Sereda for - repeatedly fixing and providing examples, to Ned Freed for fixing - everything else, to Rob Earhart for an early implementation and a - great deal of help, and to Randall Gellens for endless amounts of - proofreading. I am grateful to Carnegie Mellon University where most - of the work on this document was done. I am also indebted to all of - the readers of the ietf-mta-filters@imc.org mailing list. - -12. Author's Address - - Tim Showalter - Mirapoint, Inc. - 909 Hermosa Court - Sunnyvale, CA 94085 - - EMail: tjs@mirapoint.com - -13. References - - [ABNF] Crocker, D. and P. 
Overell, "Augmented BNF for Syntax - Specifications: ABNF", RFC 2234, November 1997. - - - - - - -Showalter Standards Track [Page 34] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - - [ACAP] Newman, C. and J. G. Myers, "ACAP -- Application - Configuration Access Protocol", RFC 2244, November 1997. - - [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in - electrical technology - Part 2: Telecommunications and - electronics", January 1999. - - [DSN] Moore, K. and G. Vaudreuil, "An Extensible Message Format - for Delivery Status Notifications", RFC 1894, January - 1996. - - [FLAMES] Borenstein, N, and C. Thyberg, "Power, Ease of Use, and - Cooperative Work in a Practical Multimedia Message - System", Int. J. of Man-Machine Studies, April, 1991. - Reprinted in Computer-Supported Cooperative Work and - Groupware, Saul Greenberg, editor, Harcourt Brace - Jovanovich, 1991. Reprinted in Readings in Groupware and - Computer-Supported Cooperative Work, Ronald Baecker, - editor, Morgan Kaufmann, 1993. - - [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [IMAP] Crispin, M., "Internet Message Access Protocol - version - 4rev1", RFC 2060, December 1996. - - [IMAIL] Crocker, D., "Standard for the Format of ARPA Internet - Text Messages", STD 11, RFC 822, August 1982. - - [MIME] Freed, N. and N. Borenstein, "Multipurpose Internet Mail - Extensions (MIME) Part One: Format of Internet Message - Bodies", RFC 2045, November 1996. - - [MDN] Fajman, R., "An Extensible Message Format for Message - Disposition Notifications", RFC 2298, March 1998. - - [RFC1123] Braden, R., "Requirements for Internet Hosts -- - Application and Support", STD 3, RFC 1123, November 1989. - - [SMTP] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC - 821, August 1982. - - [UTF-8] Yergeau, F., "UTF-8, a transformation format of Unicode - and ISO 10646", RFC 2044, October 1996. 
- - - - - - - -Showalter Standards Track [Page 35] - -RFC 3028 Sieve: A Mail Filtering Language January 2001 - - -14. Full Copyright Statement - - Copyright (C) The Internet Society (2001). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. - - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. - - - - - - - - - - - - - - - - - - - -Showalter Standards Track [Page 36] - diff --git a/proto/sieve/rfc3431.txt b/proto/sieve/rfc3431.txt @@ -1,451 +0,0 @@ - - - - - - -Network Working Group W. Segmuller -Request for Comment: 3431 IBM T.J. 
Watson Research Center -Category: Standards Track December 2002 - - - Sieve Extension: Relational Tests - -Status of this Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (C) The Internet Society (2002). All Rights Reserved. - -Abstract - - This document describes the RELATIONAL extension to the Sieve mail - filtering language defined in RFC 3028. This extension extends - existing conditional tests in Sieve to allow relational operators. - In addition to testing their content, it also allows for testing of - the number of entities in header and envelope fields. - -1 Introduction - - Sieve [SIEVE] is a language for filtering e-mail messages at the time - of final delivery. It is designed to be implementable on either a - mail client or mail server. It is meant to be extensible, simple, - and independent of access protocol, mail architecture, and operating - system. It is suitable for running on a mail server where users may - not be allowed to execute arbitrary programs, such as on black box - Internet Messages Access Protocol (IMAP) servers, as it has no - variables, loops, nor the ability to shell out to external programs. - - The RELATIONAL extension provides relational operators on the - address, envelope, and header tests. This extension also provides a - way of counting the entities in a message header or address field. - - With this extension, the sieve script may now determine if a field is - greater than or less than a value instead of just equivalent. One - use is for the x-priority field: move messages with a priority - greater than 3 to the "work on later" folder. Mail could also be - sorted by the from address. 
Those userids that start with 'a'-'m' go - to one folder, and the rest go to another folder. - - - -Segmuller Standards Track [Page 1] - -RFC 3431 Sieve Extension: Relational Tests December 2002 - - - The sieve script can also determine the number of fields in the - header, or the number of addresses in a recipient field. For - example: are there more than 5 addresses in the to and cc fields. - -2 Conventions used in this document - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this - document are to be interpreted as described in BCP 14, RFC 2119. - - Conventions for notations are as in [SIEVE] section 1.1, including - the use of [KEYWORDS] and "Syntax:" label for the definition of - action and tagged arguments syntax, and the use of [ABNF]. - - The capability string associated with extension defined in this - document is "relational". - -3 Comparators - - This document does not define any comparators or exempt any - comparators from the require clause. Any comparator used, other than - "i;octet" and "i;ascii-casemap", MUST be declared a require clause as - defined in [SIEVE]. - - The "i;ascii-numeric" comparator, as defined in [ACAP], MUST be - supported for any implementation of this extension. The comparator - "i;ascii-numeric" MUST support at least 32 bit unsigned integers. - - Larger integers MAY be supported. Note: the "i;ascii-numeric" - comparator does not support negative numbers. - -4 Match Type - - This document defines two new match types. They are the VALUE match - type and the COUNT match type. 
- - The syntax is: - - MATCH-TYPE =/ COUNT / VALUE - - COUNT = ":count" relational-match - - VALUE = ":value" relational-match - - relational-match = DQUOTE ( "gt" / "ge" / "lt" - / "le" / "eq" / "ne" ) DQUOTE - - - - - -Segmuller Standards Track [Page 2] - -RFC 3431 Sieve Extension: Relational Tests December 2002 - - -4.1 Match Type Value - - The VALUE match type does a relational comparison between strings. - - The VALUE match type may be used with any comparator which returns - sort information. - - Leading and trailing white space MUST be removed from the value of - the message for the comparison. White space is defined as - - SP / HTAB / CRLF - - A value from the message is considered the left side of the relation. - A value from the test expression, the key-list for address, envelope, - and header tests, is the right side of the relation. - - If there are multiple values on either side or both sides, the test - is considered true, if any pair is true. - -4.2 Match Type Count - - The COUNT match type first determines the number of the specified - entities in the message and does a relational comparison of the - number of entities to the values specified in the test expression. - - The COUNT match type SHOULD only be used with numeric comparators. - - The Address Test counts the number of recipients in the specified - fields. Group names are ignored. - - The Envelope Test counts the number of recipients in the specified - envelope parts. The envelope "to" will always have only one entry, - which is the address of the user for whom the sieve script is - running. There is no way a sieve script can determine if the message - was actually sent to someone else using this test. The envelope - "from" will be 0 if the MAIL FROM is blank, or 1 if MAIL FROM is not - blank. - - The Header Test counts the total number of instances of the specified - fields. This does not count individual addresses in the "to", "cc", - and other recipient fields. 
- - In all cases, if more than one field name is specified, the counts - for all specified fields are added together to obtain the number for - comparison. Thus, specifying ["to", "cc"] in an address COUNT test, - comparing the total number of "to" and "cc" addresses; if separate - counts are desired, they must be done in two comparisons, perhaps - joined by "allof" or "anyof". - - - -Segmuller Standards Track [Page 3] - -RFC 3431 Sieve Extension: Relational Tests December 2002 - - -5 Security Considerations - - Security considerations are discussed in [SIEVE]. - - An implementation MUST ensure that the test for envelope "to" only - reflects the delivery to the current user. It MUST not be possible - for a user to determine if this message was delivered to someone else - using this test. - -6 Example - - Using the message: - - received: ... - received: ... - subject: example - to: foo@example.com.invalid, baz@example.com.invalid - cc: qux@example.com.invalid - - The test: - - address :count "ge" :comparator "i;ascii-numeric" ["to", "cc"] - ["3"] - - would be true and the test - - anyof ( address :count "ge" :comparator "i;ascii-numeric" - ["to"] ["3"], - address :count "ge" :comparator "i;ascii-numeric" - ["cc"] ["3"] ) - - would be false. - - To check the number of received fields in the header, the - following test may be used: - - header :count "ge" :comparator "i;ascii-numeric" - ["received"] ["3"] - - This would return false. But - - header :count "ge" :comparator "i;ascii-numeric" - ["received", "subject"] ["3"] - - would return true. - - - - - - -Segmuller Standards Track [Page 4] - -RFC 3431 Sieve Extension: Relational Tests December 2002 - - - The test: - - header :count "ge" :comparator "i;ascii-numeric" - ["to", "cc"] ["3"] - - will always return false on an RFC 2822 compliant message [RFC2822], - since a message can have at most one "to" field and at most one "cc" - field. This test counts the number of fields, not the number of - addresses. 
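The difference between counting fields and counting addresses in the examples above can be reproduced with a short sketch. It is an illustration only, built on Python's standard `email` package rather than on any Sieve implementation; the function names `header_count` and `address_count` are hypothetical, and the "ge" relation is written as a plain `>=`.

```python
from email import message_from_string
from email.utils import getaddresses

# The sample message from the example section, followed by an empty body.
SAMPLE = "\n".join([
    "received: ...",
    "received: ...",
    "subject: example",
    "to: foo@example.com.invalid, baz@example.com.invalid",
    "cc: qux@example.com.invalid",
    "",
    "",
])


def header_count(msg, names):
    """Header test with :count: number of field instances, summed over names."""
    return sum(len(msg.get_all(name, [])) for name in names)


def address_count(msg, names):
    """Address test with :count: number of addresses, summed over names."""
    return sum(len(getaddresses(msg.get_all(name, []))) for name in names)


msg = message_from_string(SAMPLE)

print(address_count(msg, ["to", "cc"]) >= 3)             # True: 2 + 1 addresses
print(address_count(msg, ["to"]) >= 3
      or address_count(msg, ["cc"]) >= 3)                # False: 2 and 1 separately
print(header_count(msg, ["received"]) >= 3)              # False: 2 fields
print(header_count(msg, ["received", "subject"]) >= 3)   # True: 2 + 1 fields
print(header_count(msg, ["to", "cc"]) >= 3)              # False: 1 + 1 fields
```

Summing across the listed names before comparing is what makes the combined ["to", "cc"] test differ from two separate tests joined by "anyof".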
-
-7 Extended Example
-
-   require ["relational", "comparator-i;ascii-numeric"];
-
-   if header :value "lt" :comparator "i;ascii-numeric"
-             ["x-priority"] ["3"]
-   {
-      fileinto "Priority";
-   }
-
-   elseif address :count "gt" :comparator "i;ascii-numeric"
-             ["to"] ["5"]
-   {
-      # everything with more than 5 recipients in the "to" field
-      # is considered SPAM
-      fileinto "SPAM";
-   }
-
-   elseif address :value "gt" :all :comparator "i;ascii-casemap"
-             ["from"] ["M"]
-   {
-      fileinto "From N-Z";
-   } else {
-      fileinto "From A-M";
-   }
-
-   if allof ( address :count "eq" :comparator "i;ascii-numeric"
-                      ["to", "cc"] ["1"] ,
-              address :all :comparator "i;ascii-casemap"
-                      ["to", "cc"] ["me@foo.example.com.invalid"]
-   {
-      fileinto "Only me";
-   }
-
-
-
-
-
-
-
-
-Segmuller Standards Track [Page 5]
-
-RFC 3431 Sieve Extension: Relational Tests December 2002
-
-
-8 IANA Considerations
-
-   The following template specifies the IANA registration of the Sieve
-   extension specified in this document:
-
-   To: iana@iana.org
-   Subject: Registration of new Sieve extension
-
-   Capability name: RELATIONAL
-   Capability keyword: relational
-   Capability arguments: N/A
-   Standards Track/IESG-approved experimental RFC number: this RFC
-   Person and email address to contact for further information:
-      Wolfgang Segmuller
-      IBM T.J. Watson Research Center
-      30 Saw Mill River Rd
-      Hawthorne, NY 10532
-
-      Email: whs@watson.ibm.com
-
-   This information should be added to the list of sieve extensions
-   given on http://www.iana.org/assignments/sieve-extensions.
-
-9 References
-
-9.1 Normative References
-
-   [SIEVE] Showalter, T., "Sieve: A Mail Filtering Language", RFC
-      3028, January 2001.
-
-   [Keywords] Bradner, S., "Key words for use in RFCs to Indicate
-      Requirement Levels", BCP 14, RFC 2119, March 1997.
-
-   [ABNF] Crocker, D., "Augmented BNF for Syntax Specifications:
-      ABNF", RFC 2234, November 1997.
-
-   [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, April
-      2001.
- -9.2 Non-Normative References - - [ACAP] Newman, C. and J. G. Myers, "ACAP -- Application - Configuration Access Protocol", RFC 2244, November 1997. - - - - - - - - -Segmuller Standards Track [Page 6] - -RFC 3431 Sieve Extension: Relational Tests December 2002 - - -10 Author's Address - - Wolfgang Segmuller - IBM T.J. Watson Research Center - 30 Saw Mill River Rd - Hawthorne, NY 10532 - - EMail: whs@watson.ibm.com - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Segmuller Standards Track [Page 7] - -RFC 3431 Sieve Extension: Relational Tests December 2002 - - -11 Full Copyright Statement - - Copyright (C) The Internet Society (2002). All Rights Reserved. - - This document and translations of it may be copied and furnished to - others, and derivative works that comment on or otherwise explain it - or assist in its implementation may be prepared, copied, published - and distributed, in whole or in part, without restriction of any - kind, provided that the above copyright notice and this paragraph are - included on all such copies and derivative works. However, this - document itself may not be modified in any way, such as by removing - the copyright notice or references to the Internet Society or other - Internet organizations, except as needed for the purpose of - developing Internet standards in which case the procedures for - copyrights defined in the Internet Standards process must be - followed, or as required to translate it into languages other than - English. - - The limited permissions granted above are perpetual and will not be - revoked by the Internet Society or its successors or assigns. 
- - This document and the information contained herein is provided on an - "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING - TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING - BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION - HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF - MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Acknowledgement - - Funding for the RFC Editor function is currently provided by the - Internet Society. - - - - - - - - - - - - - - - - - - - -Segmuller Standards Track [Page 8] - diff --git a/proto/sieve/rfc5231.txt b/proto/sieve/rfc5231.txt @@ -1,507 +0,0 @@ - - - - - - -Network Working Group W. Segmuller -Request for Comments: 5231 B. Leiba -Obsoletes: 3431 IBM T.J. Watson Research Center -Category: Standards Track January 2008 - - - Sieve Email Filtering: Relational Extension - -Status of This Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Abstract - - This document describes the RELATIONAL extension to the Sieve mail - filtering language defined in RFC 3028. This extension extends - existing conditional tests in Sieve to allow relational operators. - In addition to testing their content, it also allows for testing of - the number of entities in header and envelope fields. - - This document obsoletes RFC 3431. - -Table of Contents - - 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 2 - 2. Conventions Used in This Document . . . . . . . . . . . . . . . 2 - 3. Comparators . . . . . . . . . . . . . . . . . . . . . . . . . . 2 - 4. Match Types . . . . . . . . . . . . . . . . . . . . . . . . . . 3 - 4.1. Match Type VALUE . . . . . . . . . . . . . . 
. . . . . . . 3 - 4.2. Match Type COUNT . . . . . . . . . . . . . . . . . . . . . 3 - 5. Interaction with Other Sieve Actions . . . . . . . . . . . . . 4 - 6. Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 7. Extended Example . . . . . . . . . . . . . . . . . . . . . . . 6 - 8. Changes since RFC 3431 . . . . . . . . . . . . . . . . . . . . 6 - 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 7 - 10. Security Considerations . . . . . . . . . . . . . . . . . . . . 7 - 11. Normative References . . . . . . . . . . . . . . . . . . . . . 7 - - - - - - - - - - -Segmuller & Leiba Standards Track [Page 1] - -RFC 5231 Sieve: Relational Extension January 2008 - - -1. Introduction - - The RELATIONAL extension to the Sieve mail filtering language [Sieve] - provides relational operators on the address, envelope, and header - tests. This extension also provides a way of counting the entities - in a message header or address field. - - With this extension, the Sieve script may now determine if a field is - greater than or less than a value instead of just equivalent. One - use is for the x-priority field: move messages with a priority - greater than 3 to the "work on later" folder. Mail could also be - sorted by the from address. Those userids that start with 'a'-'m' go - to one folder, and the rest go to another folder. - - The Sieve script can also determine the number of fields in the - header, or the number of addresses in a recipient field, for example, - whether there are more than 5 addresses in the to and cc fields. - - The capability string associated with the extension defined in this - document is "relational". - -2. Conventions Used in This Document - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this - document are to be interpreted as described in BCP 14, RFC 2119. 
- - Conventions for notations are as in [Sieve] section 1.1, including - the use of [Kwds] and the use of [ABNF]. - -3. Comparators - - This document does not define any comparators or exempt any - comparators from the require clause. Any comparator used must be - treated as defined in [Sieve]. - - The "i;ascii-numeric" comparator, as defined in [RFC4790], MUST be - supported for any implementation of this extension. The comparator - "i;ascii-numeric" MUST support at least 32-bit unsigned integers. - - Larger integers MAY be supported. Note: the "i;ascii-numeric" - comparator does not support negative numbers. - - - - - - - - - -Segmuller & Leiba Standards Track [Page 2] - -RFC 5231 Sieve: Relational Extension January 2008 - - -4. Match Types - - This document defines two new match types. They are the VALUE match - type and the COUNT match type. - - The syntax is: - - MATCH-TYPE =/ COUNT / VALUE - - COUNT = ":count" relational-match - - VALUE = ":value" relational-match - - relational-match = DQUOTE - ("gt" / "ge" / "lt" / "le" / "eq" / "ne") DQUOTE - ; "gt" means "greater than", the C operator ">". - ; "ge" means "greater than or equal", the C operator ">=". - ; "lt" means "less than", the C operator "<". - ; "le" means "less than or equal", the C operator "<=". - ; "eq" means "equal to", the C operator "==". - ; "ne" means "not equal to", the C operator "!=". - -4.1. Match Type VALUE - - The VALUE match type does a relational comparison between strings. - - The VALUE match type may be used with any comparator that returns - sort information. - - A value from the message is considered the left side of the relation. - A value from the test expression, the key-list for address, envelope, - and header tests, is the right side of the relation. - - If there are multiple values on either side or both sides, the test - is considered true if any pair is true. - -4.2. 
Match Type COUNT - - The COUNT match type first determines the number of the specified - entities in the message and does a relational comparison of the - number of entities, as defined below, to the values specified in the - test expression. - - The COUNT match type SHOULD only be used with numeric comparators. - - The Address Test counts the number of addresses (the number of - "mailbox" elements, as defined in [RFC2822]) in the specified fields. - Group names are ignored, but the contained mailboxes are counted. - - - -Segmuller & Leiba Standards Track [Page 3] - -RFC 5231 Sieve: Relational Extension January 2008 - - - The Envelope Test counts the number of addresses in the specified - envelope parts. The envelope "to" will always have only one entry, - which is the address of the user for whom the Sieve script is - running. Using this test, there is no way a Sieve script can - determine if the message was actually sent to someone else. The - envelope "from" will be 0 if the MAIL FROM is empty, or 1 if MAIL - FROM is not empty. - - The Header Test counts the total number of instances of the specified - fields. This does not count individual addresses in the "to", "cc", - and other recipient fields. - - In all cases, if more than one field name is specified, the counts - for all specified fields are added together to obtain the number for - comparison. Thus, specifying ["to", "cc"] in an address COUNT test - compares the total number of "to" and "cc" addresses; if separate - counts are desired, they must be done in two comparisons, perhaps - joined by "allof" or "anyof". - -5. Interaction with Other Sieve Actions - - This specification adds two match types. The VALUE match type only - works with comparators that return sort information. The COUNT match - type only makes sense with numeric comparators. - - There is no interaction with any other Sieve operations, nor with any - known extensions. 
In particular, this specification has no effect on - implicit KEEP, nor on any explicit message actions. - -6. Example - - Using the message: - - received: ... - received: ... - subject: example - to: foo@example.com, baz@example.com - cc: qux@example.com - - The test: - - address :count "ge" :comparator "i;ascii-numeric" - ["to", "cc"] ["3"] - - would evaluate to true, and the test - - - - - - -Segmuller & Leiba Standards Track [Page 4] - -RFC 5231 Sieve: Relational Extension January 2008 - - - anyof ( address :count "ge" :comparator "i;ascii-numeric" - ["to"] ["3"], - address :count "ge" :comparator "i;ascii-numeric" - ["cc"] ["3"] ) - - would evaluate to false. - - To check the number of received fields in the header, the following - test may be used: - - header :count "ge" :comparator "i;ascii-numeric" - ["received"] ["3"] - - This would evaluate to false. But - - header :count "ge" :comparator "i;ascii-numeric" - ["received", "subject"] ["3"] - - would evaluate to true. - - The test: - - header :count "ge" :comparator "i;ascii-numeric" - ["to", "cc"] ["3"] - - will always evaluate to false on an RFC 2822 compliant message - [RFC2822], since a message can have at most one "to" field and at - most one "cc" field. This test counts the number of fields, not the - number of addresses. - - - - - - - - - - - - - - - - - - - - - - -Segmuller & Leiba Standards Track [Page 5] - -RFC 5231 Sieve: Relational Extension January 2008 - - -7. 
Extended Example - - require ["relational", "comparator-i;ascii-numeric", "fileinto"]; - - if header :value "lt" :comparator "i;ascii-numeric" - ["x-priority"] ["3"] - { - fileinto "Priority"; - } - - elsif address :count "gt" :comparator "i;ascii-numeric" - ["to"] ["5"] - { - # everything with more than 5 recipients in the "to" field - # is considered SPAM - fileinto "SPAM"; - } - - elsif address :value "gt" :all :comparator "i;ascii-casemap" - ["from"] ["M"] - { - fileinto "From N-Z"; - } else { - fileinto "From A-M"; - } - - if allof ( address :count "eq" :comparator "i;ascii-numeric" - ["to", "cc"] ["1"] , - address :all :comparator "i;ascii-casemap" - ["to", "cc"] ["me@foo.example.com"] ) - { - fileinto "Only me"; - } - -8. Changes since RFC 3431 - - Apart from several minor editorial/wording changes, the following - list describes the notable changes to this specification since RFC - 3431. - - o Updated references, including changing the comparator reference - from the Application Configuration Access Protocol (ACAP) to the - "Internet Application Protocol Collation Registry" document - [RFC4790]. - - o Updated and corrected the examples. - - - - - -Segmuller & Leiba Standards Track [Page 6] - -RFC 5231 Sieve: Relational Extension January 2008 - - - o Added definition comments to ABNF for "gt", "lt", etc. - - o Clarified what RFC 2822 elements are counted in the COUNT test. - - o Removed the requirement to strip white space from header fields - before comparing; a more general version of this requirement has - been added to the Sieve base spec. - -9. 
IANA Considerations - - The following template specifies the IANA registration of the - relational Sieve extension specified in this document: - - To: iana@iana.org - Subject: Registration of new Sieve extension - - Capability name: relational - Description: Extends existing conditional tests in Sieve language - to allow relational operators - RFC number: RFC 5231 - Contact address: The Sieve discussion list <ietf-mta-filters@imc.org> - -10. Security Considerations - - An implementation MUST ensure that the test for envelope "to" only - reflects the delivery to the current user. Using this test, it MUST - not be possible for a user to determine if this message was delivered - to someone else. - - Additional security considerations are discussed in [Sieve]. - -11. Normative References - - [ABNF] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax - Specifications: ABNF", RFC 4234, October 2005. - - [Kwds] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", RFC 2119, March 1997. - - [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, - April 2001. - - [RFC4790] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet - Application Protocol Collation Registry", RFC 4790, - March 2007. - - [Sieve] Guenther, P., Ed. and T. Showalter, Ed., "Sieve: An Email - Filtering Language", RFC 5228, January 2008. - - - -Segmuller & Leiba Standards Track [Page 7] - -RFC 5231 Sieve: Relational Extension January 2008 - - -Authors' Addresses - - Wolfgang Segmuller - IBM T.J. Watson Research Center - 19 Skyline Drive - Hawthorne, NY 10532 - US - - Phone: +1 914 784 7408 - EMail: werewolf@us.ibm.com - - - Barry Leiba - IBM T.J. 
Watson Research Center - 19 Skyline Drive - Hawthorne, NY 10532 - US - - Phone: +1 914 784 7941 - EMail: leiba@watson.ibm.com - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Segmuller & Leiba Standards Track [Page 8] - -RFC 5231 Sieve: Relational Extension January 2008 - - -Full Copyright Statement - - Copyright (C) The IETF Trust (2008). - - This document is subject to the rights, licenses and restrictions - contained in BCP 78, and except as set forth therein, the authors - retain all their rights. - - This document and the information contained herein are provided on an - "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS - OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND - THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS - OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF - THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED - WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Intellectual Property - - The IETF takes no position regarding the validity or scope of any - Intellectual Property Rights or other rights that might be claimed to - pertain to the implementation or use of the technology described in - this document or the extent to which any license under such rights - might or might not be available; nor does it represent that it has - made any independent effort to identify any such rights. Information - on the procedures with respect to rights in RFC documents can be - found in BCP 78 and BCP 79. - - Copies of IPR disclosures made to the IETF Secretariat and any - assurances of licenses to be made available, or the result of an - attempt made to obtain a general license or permission for the use of - such proprietary rights by implementers or users of this - specification can be obtained from the IETF on-line IPR repository at - http://www.ietf.org/ipr. 
- - The IETF invites any interested party to bring to its attention any - copyrights, patents or patent applications, or other proprietary - rights that may cover technology that may be required to implement - this standard. Please address the information to the IETF at - ietf-ipr@ietf.org. - - - - - - - - - - - - -Segmuller & Leiba Standards Track [Page 9] - diff --git a/proto/sieve/rfc5260.txt b/proto/sieve/rfc5260.txt @@ -1,731 +0,0 @@ - - - - - - -Network Working Group N. Freed -Request for Comments: 5260 Sun Microsystems -Category: Standards Track July 2008 - - - Sieve Email Filtering: Date and Index Extensions - -Status of This Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Abstract - - This document describes the "date" and "index" extensions to the - Sieve email filtering language. The "date" extension gives Sieve the - ability to test date and time values in various ways. The "index" - extension provides a means to limit header and address tests to - specific instances of header fields when header fields are repeated. - -Table of Contents - - 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 2 - 2. Conventions Used in This Document . . . . . . . . . . . . . . 2 - 3. Capability Identifiers . . . . . . . . . . . . . . . . . . . . 3 - 4. Date Test . . . . . . . . . . . . . . . . . . . . . . . . . . 3 - 4.1. Zone and Originalzone Arguments . . . . . . . . . . . . . 4 - 4.2. Date-part Argument . . . . . . . . . . . . . . . . . . . . 4 - 4.3. Comparator Interactions with Date-part Arguments . . . . . 5 - 4.4. Examples . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 5. Currentdate Test . . . . . . . . . . . . . . . . . . . . . . . 
6 - 5.1. Examples . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 6. Index Extension . . . . . . . . . . . . . . . . . . . . . . . 7 - 6.1. Example . . . . . . . . . . . . . . . . . . . . . . . . . 8 - 7. Security Considerations . . . . . . . . . . . . . . . . . . . 8 - 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 - 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 9 - 9.1. Normative References . . . . . . . . . . . . . . . . . . . 9 - 9.2. Informative References . . . . . . . . . . . . . . . . . . 10 - Appendix A. Julian Date Conversions . . . . . . . . . . . . . . . 11 - Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 12 - - - - - - - -Freed Standards Track [Page 1] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - -1. Introduction - - Sieve [RFC5228] is a language for filtering email messages at or - around the time of final delivery. It is designed to be - implementable on either a mail client or mail server. It is meant to - be extensible, simple, and independent of access protocol, mail - architecture, and operating system. It is suitable for running on a - mail server where users may not be allowed to execute arbitrary - programs, such as on black box Internet Message Access Protocol - [RFC3501] servers, as it does not have user-controlled loops or the - ability to run external programs. - - The "date" extension provides a new date test to extract and match - date/time information from structured header fields. The date test - is similar in concept to the address test specified in [RFC5228], - which performs similar operations on addresses in header fields. - - The "date" extension also provides a currentdate test that operates - on the date and time when the Sieve script is executed. - - Some header fields containing date/time information, e.g., Received:, - naturally occur more than once in a single header. 
In such cases it - is useful to be able to restrict the date test to some subset of the - fields that are present. For example, it may be useful to apply a - date test to the last (earliest) Received: field. Additionally, it - may also be useful to apply similar restrictions to either the header - or address tests specified in [RFC5228]. - - For this reason, this specification also defines an "index" - extension. This extension adds two additional tagged arguments - :index and :last to the header, address, and date tests. If present, - these arguments specify which occurrence of the named header field is - to be tested. - -2. Conventions Used in This Document - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this - document are to be interpreted as described in RFC 2119 [RFC2119]. - - The terms used to describe the various components of the Sieve - language are taken from Section 1.1 of [RFC5228]. Section 2 of the - same document describes basic Sieve language syntax and semantics. - The date-time syntactic element defined using ABNF notation [RFC5234] - in [RFC3339] is also used here. - - - - - - -Freed Standards Track [Page 2] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - -3. Capability Identifiers - - The capability strings associated with the two extensions defined in - this document are "date" and "index". - -4. Date Test - - Usage: date [<":zone" <time-zone: string>> / ":originalzone"] - [COMPARATOR] [MATCH-TYPE] <header-name: string> - <date-part: string> <key-list: string-list> - - The date test matches date/time information derived from headers - containing [RFC2822] date-time values. The date/time information is - extracted from the header, shifted to the specified time zone, and - the value of the given date-part is determined. 
The test returns - true if the resulting string matches any of the strings specified in - the key-list, as controlled by the comparator and match keywords. - The date test returns false unconditionally if the specified header - field does not exist, the field exists but does not contain a - syntactically valid date-time specification, the date-time isn't - valid according to the rules of the calendar system (e.g., January - 32nd, February 29 in a non-leap year), or the resulting string fails - to match any key-list value. - - The type of match defaults to ":is" and the default comparator is - "i;ascii-casemap". - - Unlike the header and address tests, the date test can only be - applied to a single header field at a time. If multiple header - fields with the same name are present, only the first field that is - found is used. (Note, however, that this behavior can be modified - with the "index" extension defined below.) These restrictions - simplify the test and keep the meaning clear. - - The "relational" extension [RFC5231] adds a match type called - ":count". The count of a date test is 1 if the specified field - exists and contains a valid date; 0, otherwise. - - Implementations MUST support extraction of RFC 2822 date-time - information that either makes up the entire header field (e.g., as it - does in a standard Date: header field) or appears at the end of a - header field following a semicolon (e.g., as it does in a standard - Received: header field). Implementations MAY support extraction of - date and time information in RFC2822 or other formats that appears in - other positions in header field content. In the case of a field - containing more than one date or time value, the last one that - appears SHOULD be used. - - - - -Freed Standards Track [Page 3] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - -4.1. 
Zone and Originalzone Arguments - - The :originalzone argument specifies that the time zone offset - originally in the extracted date-time value should be retained. The - :zone argument specifies a specific time zone offset that the date- - time value is to be shifted to prior to testing. It is an error to - specify both :zone and :originalzone. - - The value of time-zone MUST be an offset relative to UTC with the - following syntax: - - time-zone = ( "+" / "-" ) 4DIGIT - - The "+" or "-" indicates whether the time-of-day is ahead of (i.e., - east of) or behind (i.e., west of) UTC. The first two digits - indicate the number of hours difference from Universal Time, and the - last two digits indicate the number of minutes difference from - Universal Time. Note that this agrees with the RFC 2822 format for - time zone offsets, not the ISO 8601 format. - - If both the :zone and :originalzone arguments are omitted, the local - time zone MUST be used. - -4.2. Date-part Argument - - The date-part argument specifies a particular part of the resulting - date/time value to match against the key-list. Possible case- - insensitive values are: - - "year" => the year, "0000" .. "9999". - "month" => the month, "01" .. "12". - "day" => the day, "01" .. "31". - "date" => the date in "yyyy-mm-dd" format. - "julian" => the Modified Julian Day, that is, the date - expressed as an integer number of days since - 00:00 UTC on November 17, 1858 (using the Gregorian - calendar). This corresponds to the regular - Julian Day minus 2400000.5. Sample routines to - convert to and from modified Julian dates are - given in Appendix A. - "hour" => the hour, "00" .. "23". - "minute" => the minute, "00" .. "59". - "second" => the second, "00" .. "60". - "time" => the time in "hh:mm:ss" format. - "iso8601" => the date and time in restricted ISO 8601 format. - "std11" => the date and time in a format appropriate - for use in a Date: header field [RFC2822]. 
- - - - -Freed Standards Track [Page 4] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - - "zone" => the time zone in use. If the user specified a - time zone with ":zone", "zone" will - contain that value. If :originalzone is specified - this value will be the original zone specified - in the date-time value. If neither argument is - specified the value will be the server's default - time zone in offset format "+hhmm" or "-hhmm". An - offset of 0 (Zulu) always has a positive sign. - "weekday" => the day of the week expressed as an integer between - "0" and "6". "0" is Sunday, "1" is Monday, etc. - - The restricted ISO 8601 format is specified by the date-time ABNF - production given in [RFC3339], Section 5.6, with the added - restrictions that the letters "T" and "Z" MUST be in upper case, and - a time zone offset of zero MUST be represented by "Z" and not - "+00:00". - -4.3. Comparator Interactions with Date-part Arguments - - Not all comparators are suitable with all date-part arguments. In - general, the date-parts can be compared and tested for equality with - either "i;ascii-casemap" (the default) or "i;octet", but there are - two exceptions: - - julian This is an integer, and may or may not have leading zeros. - As such, "i;ascii-numeric" is almost certainly the best - comparator to use with it. - - std11 This is provided as a means to obtain date/time values in a - format appropriate for inclusion in email header fields. The - wide range of possible syntaxes for a std11 date/time -- - which implementations of this extension are free to use when - composing a std11 string -- makes this format a poor choice - for comparisons. Nevertheless, if a comparison must be - performed, this is case-insensitive, and therefore "i;ascii- - casemap" needs to be used. 
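The comparator notes above can be illustrated with a small sketch in C. It is not taken from the RFC: the helpers `digit_run`, `ascii_numeric_cmp` and `octet_cmp` are hypothetical names. The numeric ordering is a simplified reading of "i;ascii-numeric", in which only the leading digits count and a string without a leading digit sorts after every number. `strcmp()` stands in for "i;octet".

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Length of the leading run of ASCII digits in s. */
static size_t digit_run(const char *s)
{
    size_t n = 0;

    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/*
 * Hypothetical helper, not part of any RFC: orders two strings in the
 * spirit of "i;ascii-numeric". Only the leading digits are considered;
 * a string with no leading digit sorts after every number.
 * Returns <0, 0 or >0, like strcmp().
 */
static int ascii_numeric_cmp(const char *a, const char *b)
{
    size_t la, lb;

    assert(a != NULL && b != NULL);
    la = digit_run(a);
    lb = digit_run(b);

    /* At least one side is "infinity": equal if both are, else it is larger. */
    if (la == 0 || lb == 0)
        return (la == 0) - (lb == 0);

    /* Leading zeros carry no numeric weight. */
    while (la > 1 && *a == '0') { a++; la--; }
    while (lb > 1 && *b == '0') { b++; lb--; }

    /* More significant digits means a larger number. */
    if (la != lb)
        return la < lb ? -1 : 1;
    return strncmp(a, b, la);
}

/* Octet-wise ordering, standing in for "i;octet". */
static int octet_cmp(const char *a, const char *b)
{
    return strcmp(a, b);
}
```

Under these assumptions, a fixed-width pair such as "09" and "17" orders the same way with both helpers. A julian pair with differing widths, such as "54321" and "054400", orders correctly only with `ascii_numeric_cmp`. Two "date" strings such as "2007-06-30" and "2007-07-07" compare as equal numerically, because only the leading "2007" is seen.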
- - "year", "month", "day", "hour", "minute", "second" and "weekday" all - use fixed-width string representations of integers, and can therefore - be compared with "i;octet", "i;ascii-casemap", and "i;ascii-numeric" - with equivalent results. - - "date" and "time" also use fixed-width string representations of - integers, and can therefore be compared with "i;octet" and "i;ascii- - casemap"; however, "i;ascii-numeric" can't be used with it, as - "i;ascii-numeric" doesn't allow for non-digit characters. - - - - - -Freed Standards Track [Page 5] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - -4.4. Examples - - The Date: field can be checked to test when the sender claims to have - created the message and act accordingly: - - require ["date", "relational", "fileinto"]; - if allof(header :is "from" "boss@example.com", - date :value "ge" :originalzone "date" "hour" "09", - date :value "lt" :originalzone "date" "hour" "17") - { fileinto "urgent"; } - - Testing the initial Received: field can provide an indication of when - a message was actually received by the local system: - - require ["date", "relational", "fileinto"]; - if anyof(date :is "received" "weekday" "0", - date :is "received" "weekday" "6") - { fileinto "weekend"; } - -5. Currentdate Test - - Usage: currentdate [":zone" <time-zone: string>] - [COMPARATOR] [MATCH-TYPE] - <date-part: string> - <key-list: string-list> - - The currentdate test is similar to the date test, except that it - operates on the current date/time rather than a value extracted from - the message header. In particular, the ":zone" and date-part - arguments are the same as those in the date test. - - All currentdate tests in a single Sieve script MUST refer to the same - point in time during execution of the script. - - The :count value of a currentdate test is always 1. - -5.1. Examples - - The simplest use of currentdate is to have an action that only - operates at certain times. 
For example, a user might want to have - messages redirected to their pager after business hours and on - weekends: - - - - - - - - - -Freed Standards Track [Page 6] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - - require ["date", "relational"]; - if anyof(currentdate :is "weekday" "0", - currentdate :is "weekday" "6", - currentdate :value "lt" "hour" "09", - currentdate :value "ge" "hour" "17") - { redirect "pager@example.com"; } - - Currentdate can be used to set up vacation [RFC5230] responses in - advance and to stop response generation automatically: - - require ["date", "relational", "vacation"]; - if allof(currentdate :value "ge" "date" "2007-06-30", - currentdate :value "le" "date" "2007-07-07") - { vacation :days 7 "I'm away during the first week in July."; } - - Currentdate may also be used in conjunction with the variables - extension to pass time-dependent arguments to other tests and - actions. The following Sieve places messages in a folder named - according to the current month and year: - - require ["date", "variables", "fileinto"]; - if currentdate :matches "month" "*" { set "month" "${1}"; } - if currentdate :matches "year" "*" { set "year" "${1}"; } - fileinto "${month}-${year}"; - - Finally, currentdate can be used in conjunction with the editheader - extension to insert a header-field containing date/time information: - - require ["variables", "date", "editheader"]; - if currentdate :matches "std11" "*" - {addheader "Processing-date" "${0}";} - -6. 
Index Extension - - The "index" extension, if specified, adds optional :index and :last - arguments to the header, address, and date tests as follows: - - Syntax: date [":index" <fieldno: number> [":last"]] - [<":zone" <time-zone: string>> / ":originalzone"] - [COMPARATOR] [MATCH-TYPE] <header-name: string> - <date-part: string> <key-list: string-list> - - - Syntax: header [":index" <fieldno: number> [":last"]] - [COMPARATOR] [MATCH-TYPE] - <header-names: string-list> <key-list: string-list> - - - - - -Freed Standards Track [Page 7] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - - Syntax: address [":index" <fieldno: number> [":last"]] - [ADDRESS-PART] [COMPARATOR] [MATCH-TYPE] - <header-list: string-list> <key-list: string-list> - - If :index <fieldno> is specified, the attempts to match a value are - limited to the header field fieldno (beginning at 1, the first named - header field). If :last is also specified, the count is backwards; 1 - denotes the last named header field, 2 the second to last, and so on. - Specifying :last without :index is an error. - - :index only counts separate header fields, not multiple occurrences - within a single field. In particular, :index cannot be used to test - a specific address in an address list contained within a single - header field. - - Both header and address allow the specification of more than one - header field name. If more than one header field name is specified, - all the named header fields are counted in the order specified by the - header-list. - -6.1. Example - - Mail delivery may involve multiple hops, resulting in the Received: - field containing information about when a message first entered the - local administrative domain being the second or subsequent field in - the message. 
As long as the field offset is consistent, it can be - tested: - - # Implement the Internet-Draft cutoff date check assuming the - # second Received: field specifies when the message first - # entered the local email infrastructure. - require ["date", "relational", "index"]; - if date :value "gt" :index 2 :zone "-0500" "received" - "iso8601" "2007-02-26T09:00:00-05:00", - { redirect "aftercutoff@example.org"; } - -7. Security Considerations - - The facilities defined here, like the facilities in the base Sieve - specification, operate on message header information that can easily - be forged. Note, however, that some fields are inherently more - reliable than others. For example, the Date: field is typically - inserted by the message sender and can be altered at any point. By - contrast, the uppermost Received: field is typically inserted by the - local mail system and is therefore difficult for the sender or an - intermediary to falsify. - - - - - -Freed Standards Track [Page 8] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - - Use of the currentdate test makes script behavior inherently less - predictable and harder to analyze. This may have consequences for - systems that use script analysis to try and spot problematic scripts. - - All of the security considerations given in the base Sieve - specification also apply to these extensions. - -8. IANA Considerations - - The following templates specify the IANA registrations of the two - Sieve extensions specified in this document: - - To: iana@iana.org - Subject: Registration of new Sieve extensions - - Capability name: date - Description: The "date" extension gives Sieve the ability - to test date and time values. - RFC number: RFC 5260 - Contact address: Sieve discussion list <ietf-mta-filters@imc.org> - - Capability name: index - Description: The "index" extension provides a means to - limit header and address tests to specific - instances when more than one field of a - given type is present. 
- RFC number: RFC 5260 - Contact address: Sieve discussion list <ietf-mta-filters@imc.org> - -9. References - -9.1. Normative References - - [CALGO199] Tantzen, R., "Algorithm 199: Conversions Between Calendar - Date and Julian Day Number", Collected Algorithms from - CACM 199. - - [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, - April 2001. - - [RFC3339] Klyne, G., Ed. and C. Newman, "Date and Time on the - Internet: Timestamps", RFC 3339, July 2002. - - [RFC5228] Guenther, P. and T. Showalter, "Sieve: An Email Filtering - Language", RFC 5228, January 2008. - - - -Freed Standards Track [Page 9] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - - [RFC5231] Segmuller, W. and B. Leiba, "Sieve Email Filtering: - Relational Extension", RFC 5231, January 2008. - - [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax - Specifications: ABNF", STD 68, RFC 5234, January 2008. - -9.2. Informative References - - [RFC3501] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION - 4rev1", RFC 3501, March 2003. - - [RFC5230] Showalter, T. and N. Freed, "Sieve Email Filtering: - Vacation Extension", RFC 5230, January 2008. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Freed Standards Track [Page 10] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - -Appendix A. Julian Date Conversions - - The following C routines show how to translate day/month/year - information to and from modified Julian dates. These routines are - straightforward translations of the Algol routines specified in CACM - Algorithm 199 [CALGO199]. - - Given the day, month, and year, jday returns the modified Julian - date. 
- - int jday(int year, int month, int day) - { - int j, c, ya; - - if (month > 2) - month -= 3; - else - { - month += 9; - year--; - } - c = year / 100; - ya = year - c * 100; - return (c * 146097 / 4 + ya * 1461 / 4 + (month * 153 + 2) / 5 + - day + 1721119); - } - - - - - - - - - - - - - - - - - - - - - - - - - -Freed Standards Track [Page 11] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - - Given j, the modified Julian date, jdate returns the day, month, and - year. - - void jdate(int j, int *year, int *month, int *day) - { - int y, m, d; - - j -= 1721119; - y = (j * 4 - 1) / 146097; - j = j * 4 - y * 146097 - 1; - d = j / 4; - j = (d * 4 + 3) / 1461; - d = d * 4 - j * 1461 + 3; - d = (d + 4) / 4; - m = (d * 5 - 3) / 153; - d = d * 5 - m * 153 - 3; - *day = (d + 5) / 5; - *year = y * 100 + j; - if (m < 10) - *month = m + 3; - else - { - *month = m - 9; - *year += 1; - } - } - -Appendix B. Acknowledgements - - Dave Cridland contributed the text describing the proper comparators - to use with different date-parts. Cyrus Daboo, Frank Ellerman, - Alexey Melnikov, Chris Newman, Dilyan Palauzov, and Aaron Stone - provided helpful suggestions and corrections. - -Author's Address - - Ned Freed - Sun Microsystems - 800 Royal Oaks - Monrovia, CA 91016-6347 - USA - - Phone: +1 909 457 4293 - EMail: ned.freed@mrochek.com - - - - - - - -Freed Standards Track [Page 12] - -RFC 5260 Sieve Date and Index Extensions July 2008 - - -Full Copyright Statement - - Copyright (C) The IETF Trust (2008). - - This document is subject to the rights, licenses and restrictions - contained in BCP 78, and except as set forth therein, the authors - retain all their rights. 
- - This document and the information contained herein are provided on an - "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS - OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND - THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS - OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF - THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED - WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - -Intellectual Property - - The IETF takes no position regarding the validity or scope of any - Intellectual Property Rights or other rights that might be claimed to - pertain to the implementation or use of the technology described in - this document or the extent to which any license under such rights - might or might not be available; nor does it represent that it has - made any independent effort to identify any such rights. Information - on the procedures with respect to rights in RFC documents can be - found in BCP 78 and BCP 79. - - Copies of IPR disclosures made to the IETF Secretariat and any - assurances of licenses to be made available, or the result of an - attempt made to obtain a general license or permission for the use of - such proprietary rights by implementers or users of this - specification can be obtained from the IETF on-line IPR repository at - http://www.ietf.org/ipr. - - The IETF invites any interested party to bring to its attention any - copyrights, patents or patent applications, or other proprietary - rights that may cover technology that may be required to implement - this standard. Please address the information to the IETF at - ietf-ipr@ietf.org. - - - - - - - - - - - - -Freed Standards Track [Page 13] - diff --git a/proto/sieve/rfc5437.txt b/proto/sieve/rfc5437.txt @@ -1,787 +0,0 @@ - - - - - - -Network Working Group P. Saint-Andre -Request for Comments: 5437 Cisco -Category: Standards Track A. 
Melnikov - Isode Limited - January 2009 - - - Sieve Notification Mechanism: - Extensible Messaging and Presence Protocol (XMPP) - -Status of This Memo - - This document specifies an Internet standards track protocol for the - Internet community, and requests discussion and suggestions for - improvements. Please refer to the current edition of the "Internet - Official Protocol Standards" (STD 1) for the standardization state - and status of this protocol. Distribution of this memo is unlimited. - -Copyright Notice - - Copyright (c) 2009 IETF Trust and the persons identified as the - document authors. All rights reserved. - - This document is subject to BCP 78 and the IETF Trust's Legal - Provisions Relating to IETF Documents (http://trustee.ietf.org/ - license-info) in effect on the date of publication of this document. - Please review these documents carefully, as they describe your rights - and restrictions with respect to this document. - -Abstract - - This document describes a profile of the Sieve extension for - notifications, to allow notifications to be sent over the Extensible - Messaging and Presence Protocol (XMPP), also known as Jabber. - - - - - - - - - - - - - - - - - -Saint-Andre & Melnikov Standards Track [Page 1] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - -Table of Contents - - 1. Introduction ....................................................3 - 1.1. Overview ...................................................3 - 1.2. Terminology ................................................3 - 2. Definition ......................................................3 - 2.1. Notify Parameter "method" ..................................3 - 2.2. Test notify_method_capability ..............................3 - 2.3. Notify Tag ":from" .........................................4 - 2.4. Notify Tag ":importance" ...................................4 - 2.5. Notify Tag ":message" ......................................4 - 2.6. 
Notify Tag ":options" ......................................4 - 2.7. XMPP Syntax ................................................4 - 3. Examples ........................................................6 - 3.1. Basic Action ...............................................6 - 3.2. Action with "body" .........................................7 - 3.3. Action with "body", ":importance", ":message", and - "subject" ..................................................7 - 3.4. Action with ":from", ":message", ":importance", - "body", and "subject" ......................................8 - 4. Requirements Conformance ........................................9 - 5. Internationalization Considerations ............................10 - 6. Security Considerations ........................................11 - 7. IANA Considerations ............................................12 - 8. References .....................................................12 - 8.1. Normative References ......................................12 - 8.2. Informative References ....................................13 - - - - - - - - - - - - - - - - - - - - - - - - -Saint-Andre & Melnikov Standards Track [Page 2] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - -1. Introduction - -1.1. Overview - - The [NOTIFY] extension to the [SIEVE] mail filtering language is a - framework for providing notifications by employing URIs to specify - the notification mechanism. This document defines how xmpp URIs (see - [XMPP-URI]) are used to generate notifications via the Extensible - Messaging and Presence Protocol [XMPP], which is widely implemented - in Jabber instant messaging technologies. - -1.2. Terminology - - This document inherits terminology from [NOTIFY], [SIEVE], and - [XMPP]. In particular, the terms "parameter" and "tag" are used as - described in [NOTIFY] to refer to aspects of Sieve scripts, and the - term "key" is used as described in [XMPP-URI] to refer to aspects of - an XMPP URI. 
- - The capitalized key words "MUST", "MUST NOT", "REQUIRED", "SHALL", - "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT - RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be - interpreted as described in [TERMS]. - -2. Definition - -2.1. Notify Parameter "method" - - The "method" parameter MUST be a URI that conforms to the xmpp URI - scheme (as specified in [XMPP-URI]) and that identifies an XMPP - account associated with the email inbox. The URI MAY include the - resource identifier of an XMPP address and/or the query component - portion of an XMPP URI, but SHOULD NOT include an authority component - or fragment identifier component. The processing application MUST - extract an XMPP address from the URI in accordance with the - processing rules specified in [XMPP-URI]. The resulting XMPP address - MUST be encapsulated in XMPP syntax as the value of the XMPP 'to' - attribute. - -2.2. Test notify_method_capability - - In response to a notify_method_capability test for the "online" - notification-capability, an implementation SHOULD return a value of - "yes" if it has knowledge of an active presence session (see - [XMPP-IM]) for the specified XMPP notification-uri; otherwise, it - SHOULD return a value of "maybe" (since typical XMPP systems may not - allow a Sieve engine to gain knowledge about the presence of XMPP - entities). - - - -Saint-Andre & Melnikov Standards Track [Page 3] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - -2.3. Notify Tag ":from" - - If included, the ":from" tag MUST be an electronic address that - conforms to the "Mailbox" rule defined in [RFC5321]. The value of - the ":from" tag MAY be included in the human-readable XML character - data of the XMPP notification; alternatively or in addition, it MAY - be transformed into formal XMPP syntax, in which case it MUST be - encapsulated as the value of an XMPP SHIM (Stanza Headers and - Internet Metadata) [SHIM] header named "Resent-From". - -2.4. 
Notify Tag ":importance" - - The ":importance" tag has no special meaning for this notification - mechanism, and this specification puts no restriction on its use. - The value of the ":importance" tag MAY be transformed into XMPP - syntax (in addition to or instead of including appropriate text in - the XML character data of the XMPP <body/> element); if so, it SHOULD - be encapsulated as the value of an XMPP SHIM (Stanza Headers and - Internet Metadata) [SHIM] header named "Urgency", where the XML - character of that header is "high" if the value of the ":importance" - tag is "1", "medium" if the value of the ":importance" tag is "2", - and "low" if the value of the ":importance" tag is "3". - -2.5. Notify Tag ":message" - - If the ":message" tag is included, that string MUST be transformed - into the XML character data of an XMPP <body/> element (where the - string is generated according to the guidelines specified in Section - 3.6 of [NOTIFY]). - -2.6. Notify Tag ":options" - - The ":options" tag has no special meaning for this notification - mechanism. Any handling of this tag is the responsibility of an - implementation. - -2.7. XMPP Syntax - - The xmpp mechanism results in the sending of an XMPP message to - notify a recipient about an email message. The general XMPP syntax - is as follows: - - o The notification MUST be an XMPP <message/> stanza. - - - - - - - - -Saint-Andre & Melnikov Standards Track [Page 4] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - - o The value of the XMPP 'from' attribute SHOULD be the XMPP address - of the notification service associated with the Sieve engine or - the XMPP address of the entity to be notified. The value of the - XMPP 'from' attribute MUST NOT be generated from the Sieve ":from" - tag. - - o The value of the XMPP 'to' attribute MUST be the XMPP address - specified in the XMPP URI contained in the "method" notify - parameter. - - o The value of the XMPP 'type' attribute MUST be 'headline' or - 'normal'. 
- - o The XMPP <message/> stanza MUST include a <body/> child element. - If the ":message" tag is included in the Sieve script, that string - MUST be used as the XML character data of the <body/> element. If - not and if the XMPP URI contained in the "method" notify parameter - specified a "body" key in the query component, that value SHOULD - be used. Otherwise, the XML character data SHOULD be some - configurable text indicating that the message is a Sieve - notification. - - o The XMPP <message/> stanza MAY include a <subject/> child element. - If the XMPP URI contained in the "method" notify parameter - specified a "subject" key in the query component, that value - SHOULD be used as the XML character data of the <subject/> - element. Otherwise, the XML character data SHOULD be some - configurable text indicating that the message is a Sieve - notification. - - o The XMPP <message/> stanza SHOULD include a URI, for the recipient - to use as a hint in locating the message, encapsulated as the XML - character data of a <url/> child element of an <x/> element - qualified by the 'jabber:x:oob' namespace, as specified in [OOB]. - If included, the URI SHOULD be an Internet Message Access Protocol - [IMAP] URL that specifies the location of the message, as defined - in [IMAP-URL], but MAY be another URI type that can specify or - hint at the location of an email message, such as a URI for an - HTTP resource [HTTP] or a Post Office Protocol Version 3 (POP3) - mailbox [POP-URL] at which the message can be accessed. It is not - expected that an XMPP user agent shall directly handle such a URI, - but instead that it shall invoke an appropriate helper application - to handle the URI. - - o The XMPP <message/> stanza MAY include an XMPP SHIM (Stanza - Headers and Internet Metadata) [SHIM] header named "Resent-From". 
- If the Sieve script included a ":from" tag, the "Resent-From" - - - - -Saint-Andre & Melnikov Standards Track [Page 5] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - - value MUST be the value of the ":from" tag; otherwise, the - "Resent-From" value SHOULD be the envelope recipient address of - the original email message that triggered the notification. - -3. Examples - - In the following examples, the sender of the email has an address of - <mailto:juliet@example.org>, the entity to be notified has an email - address of <mailto:romeo@example.com> and an XMPP address of - romeo@im.example.com (resulting in an XMPP URI of - <xmpp:romeo@im.example.com>), and the notification service associated - with the Sieve engine has an XMPP address of notify.example.com. - - Note: In the following examples, line breaks are included in XMPP - URIs solely for the purpose of readability. - -3.1. Basic Action - - The following is a basic Sieve notify action with only a method. The - XML character data of the XMPP <body/> and <subject/> elements are - therefore generated by the Sieve engine based on configuration. In - addition, the Sieve engine includes a URI pointing to the message. - - Basic action (Sieve syntax) - - notify "xmpp:romeo@im.example.com" - - The resulting XMPP <message/> stanza might be as follows: - - Basic action (XMPP syntax) - - <message from='notify.example.com' - to='romeo@im.example.com' - type='headline' - xml:lang='en'> - <subject>SIEVE</subject> - <body>&lt;juliet@example.com&gt; You got mail.</body> - <x xmlns='jabber:x:oob'> - <url> - imap://romeo@example.com/INBOX;UIDVALIDITY=385759043/;UID=18 - </url> - </x> - </message> - - - - - - - - -Saint-Andre & Melnikov Standards Track [Page 6] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - -3.2. Action with "body" - - The following action contains a "body" key in the query component of - the XMPP URI but no ":message" tag in the Sieve script. 
As a result, - the XML character data of the XMPP <body/> element in the XMPP - notification is taken from the XMPP URI. In addition, the Sieve - engine includes a URI pointing to the message. - - Action with "body" (Sieve syntax) - - notify "xmpp:romeo@im.example.com?message - ;body=Wherefore%20art%20thou%3F" - - The resulting XMPP <message/> stanza might be as follows. - - Action with "body" (XMPP syntax) - - <message from='notify.example.com' - to='romeo@im.example.com' - type='headline' - xml:lang='en'> - <subject>SIEVE</subject> - <body>Wherefore art thou?</body> - <x xmlns='jabber:x:oob'> - <url> - imap://romeo@example.com/INBOX;UIDVALIDITY=385759044/;UID=19 - </url> - </x> - </message> - -3.3. Action with "body", ":importance", ":message", and "subject" - - The following action specifies an ":importance" tag and a ":message" - tag in the Sieve script, as well as a "body" key and a "subject" key - in the query component of the XMPP URI. As a result, the ":message" - tag from the Sieve script overrides the "body" key from the XMPP URI - when generating the XML character data of the XMPP <body/> element. - In addition, the Sieve engine includes a URI pointing to the message. - - Action with "body", ":importance", ":message", and "subject" (Sieve - syntax) - - notify :importance "1" - :message "Contact Juliet immediately!" - "xmpp:romeo@im.example.com?message - ;body=You%27re%20in%20trouble - ;subject=ALERT%21" - - - - -Saint-Andre & Melnikov Standards Track [Page 7] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - - The resulting XMPP <message/> stanza might be as follows. 
- - Action with "body", ":importance", ":message", and "subject" (XMPP - syntax) - - <message from='notify.example.com' - to='romeo@im.example.com' - type='headline' - xml:lang='en'> - <subject>ALERT!</subject> - <body>Contact Juliet immediately!</body> - <headers xmlns='http://jabber.org/protocol/shim'> - <header name='Urgency'>high</header> - </headers> - <x xmlns='jabber:x:oob'> - <url> - imap://romeo@example.com/INBOX;UIDVALIDITY=385759045/;UID=20 - </url> - </x> - </message> - -3.4. Action with ":from", ":message", ":importance", "body", and - "subject" - - The following action specifies a ":from" tag, an ":importance" tag, - and a ":message" tag in the Sieve script, as well as a "body" key and - a "subject" key in the query component of the XMPP URI. As a result, - the ":message" tag from the Sieve script overrides the "body" key - from the XMPP URI when generating the XML character data of the XMPP - <body/> element. In addition, the Sieve engine includes a URI - pointing to the message, as well as an XMPP SHIM (Stanza Headers and - Internet Metadata) [SHIM] header named "Resent-From" (which - encapsulates the value of the ":from" tag). - - Action with ":from", ":importance", ":message", "body", and "subject" - (Sieve syntax) - - notify :from "romeo.my.romeo@example.com" - :importance "1" - :message "Contact Juliet immediately!" - "xmpp:romeo@im.example.com?message - ;body=You%27re%20in%20trouble - ;subject=ALERT%21" - - The resulting XMPP <message/> stanza might be as follows. 
- - - - - - -Saint-Andre & Melnikov Standards Track [Page 8] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - - Action with ":from", ":importance", ":message", "body", and "subject" - (XMPP syntax) - - <message from='notify.example.com' - to='romeo@im.example.com' - type='headline' - xml:lang='en'> - <subject>ALERT!</subject> - <body>Contact Juliet immediately!</body> - <headers xmlns='http://jabber.org/protocol/shim'> - <header name='Resent-From'>romeo.my.romeo@example.com</header> - <header name='Urgency'>high</header> - </headers> - <x xmlns='jabber:x:oob'> - <url> - imap://romeo@example.com/INBOX;UIDVALIDITY=385759045/;UID=21 - </url> - </x> - </message> - -4. Requirements Conformance - - Section 3.8 of [NOTIFY] specifies a set of requirements for Sieve - notification methods. The conformance of the xmpp notification - mechanism is provided here. - - 1. An implementation of the xmpp notification method SHOULD NOT - modify the final notification text (e.g., to limit the length); - however, a given deployment MAY do so (e.g., if recipients pay - per character or byte for XMPP messages). Modification of - characters themselves should not be necessary, since XMPP - character data is encoded in [UTF-8]. - - 2. An implementation MAY ignore parameters specified in the - ":from", ":importance", and ":options" tags. - - 3. There is no recommended default message for an implementation to - include if the ":message" tag is not specified. - - 4. A notification sent via the xmpp notification method MAY include - a timestamp in the textual message. - - 5. The value of the XMPP 'from' attribute MUST be the XMPP address - of the notification service associated with the Sieve engine. - The value of the Sieve ":from" tag MAY be transformed into the - value of an XMPP SHIM (Stanza Headers and Internet Metadata) - [SHIM] header named "Resent-From". - - - - -Saint-Andre & Melnikov Standards Track [Page 9] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - - 6. 
The value of the XMPP 'to' attribute MUST be the XMPP address - specified in the XMPP URI contained in the "method" parameter. - - 7. In accordance with [XMPP-URI], an implementation MUST ignore any - URI action or key it does not understand (i.e., the URI MUST be - processed as if the action or key were not present). It is - RECOMMENDED to support the XMPP "message" query type (see - [QUERIES]) and the associated "body" and "subject" keys, which - SHOULD be mapped to the XMPP <body/> and <subject/> child - elements of the XMPP <message/> stanza, respectively. However, - if included, then the Sieve notify ":message" tag MUST be mapped - to the XMPP <body/> element, overriding the "body" key (if any) - included in the XMPP URI. - - 8. An implementation MUST NOT include any other extraneous - information not specified in parameters to the notify action. - - 9. In response to a notify_method_capability test for the "online" - notification-capability, an implementation SHOULD return a value - of "yes" if it has knowledge of an active presence session (see - [XMPP-IM]) for the specified XMPP notification-uri, but only if - the entity that requested the test is authorized to know the - presence of the associated XMPP entity (e.g., via explicit - presence subscription as specified in [XMPP-IM]); otherwise, it - SHOULD return a value of "maybe" (since typical XMPP systems may - not allow a Sieve engine to gain knowledge about the presence of - XMPP entities). - - 10. An implementation SHOULD NOT attempt to retry delivery of a - notification if it receives an XMPP error of type "auth" or - "cancel", MAY attempt to retry delivery if it receives an XMPP - error of type "wait", and MAY attempt to retry delivery if it - receives an XMPP error of "modify", but only if it makes - appropriate modifications to the notification (see [XMPP]); in - any case, the number of retries SHOULD be limited to a - configurable number no less than 3 and no more than 10. 
An - implementation MAY throttle notifications if the number of - notifications within a given time period becomes excessive - according to local service policy. Duplicate suppression (if - any) is a matter of implementation and is not specified herein. - -5. Internationalization Considerations - - Although an XMPP address may contain nearly any [UNICODE] character, - the value of the "method" parameter MUST be a Uniform Resource - Identifier (see [URI]) rather than an Internationalized Resource - Identifier (see [IRI]). The rules specified in [XMPP-URI] MUST be - followed when generating XMPP URIs. - - - -Saint-Andre & Melnikov Standards Track [Page 10] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - - In accordance with Section 13 of RFC 3920, all data sent over XMPP - MUST be encoded in [UTF-8]. - -6. Security Considerations - - Depending on the information included, sending a notification can be - comparable to forwarding mail to the notification recipient. Care - must be taken when forwarding mail automatically, to ensure that - confidential information is not sent into an insecure environment. - In particular, implementations MUST conform to the security - considerations given in [NOTIFY], [SIEVE], and [XMPP]. - - [NOTIFY] specifies that a notification method MUST provide mechanisms - for avoiding notification loops. One type of notification loop can - be caused by message forwarding; however, such loops are prevented - because XMPP does not support the forwarding of messages from one - XMPP address to another. Another type of notification loop can be - caused by auto-replies to XMPP messages received by the XMPP - notification service associated with the Sieve engine; therefore, - such a service MUST NOT auto-reply to XMPP messages it receives. 
- - A common use case might be for a user to create a script that enables - the Sieve engine to act differently if the user is currently - available at a particular type of service (e.g., send notifications - to the user's XMPP address if the user has an active session at an - XMPP service). Whether the user is currently available can be - determined by means of a notify_method_capability test for the - "online" notification-capability. In XMPP, information about current - network availability is called "presence" (see also [MODEL]). Since - [XMPP-IM] requires that a user must approve a presence subscription - before an entity can gain access to the user's presence information, - a limited but reasonably safe implementation might be for the Sieve - engine to request a subscription to the user's presence. The user - would then need to approve that subscription request so that the - Sieve engine can act appropriately depending on whether the user is - online or offline. However, the Sieve engine MUST NOT use the user's - presence information when processing scripts on behalf of a script - owner other than the user, unless the Sieve engine has explicit - knowledge (e.g., via integration with an XMPP server's presence - authorization rules) that the script owner is authorized to know the - user's presence. While it would be possible to design a more - advanced approach to the delegation of presence authorization, any - such approach is left to future standards work. - - - - - - - - -Saint-Andre & Melnikov Standards Track [Page 11] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - -7. 
IANA Considerations - - The following template provides the IANA registration of the Sieve - notification mechanism specified in this document: - - To: iana@iana.org - Subject: Registration of new Sieve notification mechanism - Mechanism name: xmpp - Mechanism URI: RFC 5122 [XMPP-URI] - Mechanism-specific options: none - Permanent and readily available reference: RFC 5437 - Person and email address to contact for further information: - Peter Saint-Andre <registrar@xmpp.org> - - This information has been added to the list of Sieve notification - mechanisms maintained at <http://www.iana.org>. - -8. References - -8.1. Normative References - - [NOTIFY] Melnikov, A., Ed., Leiba, B., Ed., Segmuller, W., and T. - Martin, "Sieve Email Filtering: Extension for - Notifications", RFC 5435, January 2009. - - [OOB] Saint-Andre, P., "Out of Band Data", XSF XEP 0066, - August 2006. - - [QUERIES] Saint-Andre, P., "XMPP URI Scheme Query Components", XSF - XEP 0147, September 2006. - - [RFC5321] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321, - October 2008. - - [SHIM] Saint-Andre, P. and J. Hildebrand, "Stanza Headers and - Internet Metadata", XSF XEP 0131, July 2006. - - [SIEVE] Guenther, P., Ed. and T. Showalter, Ed., "Sieve: An Email - Filtering Language", RFC 5228, January 2008. - - [TERMS] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [XMPP-URI] Saint-Andre, P., "Internationalized Resource Identifiers - (IRIs) and Uniform Resource Identifiers (URIs) for the - Extensible Messaging and Presence Protocol (XMPP)", - RFC 5122, February 2008. - - - - -Saint-Andre & Melnikov Standards Track [Page 12] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - -8.2. Informative References - - [HTTP] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., - Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext - Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999. 
- - [IMAP] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION - 4rev1", RFC 3501, March 2003. - - [IMAP-URL] Melnikov, A. and C. Newman, "IMAP URL Scheme", RFC 5092, - November 2007. - - [IRI] Duerst, M. and M. Suignard, "Internationalized Resource - Identifiers (IRIs)", RFC 3987, January 2005. - - [MODEL] Day, M., Rosenberg, J., and H. Sugano, "A Model for - Presence and Instant Messaging", RFC 2778, February 2000. - - [POP-URL] Gellens, R., "POP URL Scheme", RFC 2384, August 1998. - - [UNICODE] The Unicode Consortium, "The Unicode Standard, Version - 3.2.0", 2000. - - The Unicode Standard, Version 3.2.0 is defined by The - Unicode Standard, Version 3.0 (Reading, MA, Addison- - Wesley, 2000. ISBN 0-201-61633-5), as amended by the - Unicode Standard Annex #27: Unicode 3.1 - (http://www.unicode.org/reports/tr27/) and by the Unicode - Standard Annex #28: Unicode 3.2 - (http://www.unicode.org/reports/tr28/). - - [URI] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform - Resource Identifier (URI): Generic Syntax", STD 66, - RFC 3986, January 2005. - - [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO - 10646", STD 63, RFC 3629, November 2003. - - [XMPP] Saint-Andre, P., "Extensible Messaging and Presence - Protocol (XMPP): Core", RFC 3920, October 2004. - - [XMPP-IM] Saint-Andre, P., "Extensible Messaging and Presence - Protocol (XMPP): Instant Messaging and Presence", - RFC 3921, October 2004. - - - - - - - -Saint-Andre & Melnikov Standards Track [Page 13] - -RFC 5437 Sieve Notify Method: XMPP January 2009 - - -Authors' Addresses - - Peter Saint-Andre - Cisco - - EMail: psaintan@cisco.com - - - Alexey Melnikov - Isode Limited - - EMail: Alexey.Melnikov@isode.com - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Saint-Andre & Melnikov Standards Track [Page 14] - diff --git a/proto/sieve/rfc5804.txt b/proto/sieve/rfc5804.txt @@ -1,2747 +0,0 @@ - - - - - - -Internet Engineering Task Force (IETF) A. Melnikov, Ed. 
-Request for Comments: 5804 Isode Limited -Category: Standards Track T. Martin -ISSN: 2070-1721 BeThereBeSquare, Inc. - July 2010 - - - A Protocol for Remotely Managing Sieve Scripts - -Abstract - - Sieve scripts allow users to filter incoming email. Message stores - are commonly sealed servers so users cannot log into them, yet users - must be able to update their scripts on them. This document - describes a protocol "ManageSieve" for securely managing Sieve - scripts on a remote server. This protocol allows a user to have - multiple scripts, and also alerts a user to syntactically flawed - scripts. - -Status of This Memo - - This is an Internet Standards Track document. - - This document is a product of the Internet Engineering Task Force - (IETF). It represents the consensus of the IETF community. It has - received public review and has been approved for publication by the - Internet Engineering Steering Group (IESG). Further information on - Internet Standards is available in Section 2 of RFC 5741. - - Information about the current status of this document, any errata, - and how to provide feedback on it may be obtained at - http://www.rfc-editor.org/info/rfc5804. - -Copyright Notice - - Copyright (c) 2010 IETF Trust and the persons identified as the - document authors. All rights reserved. - - This document is subject to BCP 78 and the IETF Trust's Legal - Provisions Relating to IETF Documents - (http://trustee.ietf.org/license-info) in effect on the date of - publication of this document. Please review these documents - carefully, as they describe your rights and restrictions with respect - to this document. Code Components extracted from this document must - include Simplified BSD License text as described in Section 4.e of - the Trust Legal Provisions and are provided without warranty as - described in the Simplified BSD License. - - - - -Melnikov & Martin Standards Track [Page 1] - -RFC 5804 ManageSieve July 2010 - - -Table of Contents - - 1. 
Introduction ....................................................3 - 1.1. Commands and Responses .....................................3 - 1.2. Syntax .....................................................3 - 1.3. Response Codes .............................................3 - 1.4. Active Script ..............................................6 - 1.5. Quotas .....................................................6 - 1.6. Script Names ...............................................6 - 1.7. Capabilities ...............................................7 - 1.8. Transport ..................................................9 - 1.9. Conventions Used in This Document .........................10 - 2. Commands .......................................................10 - 2.1. AUTHENTICATE Command ......................................11 - 2.1.1. Use of SASL PLAIN Mechanism over TLS ...............16 - 2.2. STARTTLS Command ..........................................16 - 2.2.1. Server Identity Check ..............................17 - 2.3. LOGOUT Command ............................................20 - 2.4. CAPABILITY Command ........................................20 - 2.5. HAVESPACE Command .........................................20 - 2.6. PUTSCRIPT Command .........................................21 - 2.7. LISTSCRIPTS Command .......................................23 - 2.8. SETACTIVE Command .........................................24 - 2.9. GETSCRIPT Command .........................................25 - 2.10. DELETESCRIPT Command .....................................25 - 2.11. RENAMESCRIPT Command .....................................26 - 2.12. CHECKSCRIPT Command ......................................27 - 2.13. NOOP Command .............................................28 - 2.14. Recommended Extensions ...................................28 - 2.14.1. UNAUTHENTICATE Command ............................28 - 3. Sieve URL Scheme ...............................................29 - 4. 
Formal Syntax ..................................................31 - 5. Security Considerations ........................................37 - 6. IANA Considerations ............................................38 - 6.1. ManageSieve Capability Registration Template ..............39 - 6.2. Registration of Initial ManageSieve Capabilities ..........39 - 6.3. ManageSieve Response Code Registration Template ...........41 - 6.4. Registration of Initial ManageSieve Response Codes ........41 - 7. Internationalization Considerations ............................46 - 8. Acknowledgements ...............................................46 - 9. References .....................................................47 - 9.1. Normative References ......................................47 - 9.2. Informative References ....................................48 - - - - - - - - -Melnikov & Martin Standards Track [Page 2] - -RFC 5804 ManageSieve July 2010 - - -1. Introduction - -1.1. Commands and Responses - - A ManageSieve connection consists of the establishment of a client/ - server network connection, an initial greeting from the server, and - client/server interactions. These client/server interactions consist - of a client command, server data, and a server completion result - response. - - All interactions transmitted by client and server are in the form of - lines, that is, strings that end with a CRLF. The protocol receiver - of a ManageSieve client or server is either reading a line or reading - a sequence of octets with a known count followed by a line. - -1.2. Syntax - - ManageSieve is a line-oriented protocol much like [IMAP] or [ACAP], - which runs over TCP. There are three data types: atoms, numbers and - strings. Strings may be quoted or literal. See [ACAP] for detailed - descriptions of these types. - - Each command consists of an atom (the command name) followed by zero - or more strings and numbers terminated by CRLF. 
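The command framing described above can be sketched in Python as follows. This is an illustrative sketch, not code from rohrpost or from the RFC: the helper names (`quote`, `literal`, `encode_string`, `build_command`) are hypothetical, and the choice to fall back to a `{N+}` literal whenever the text contains CR, LF, or NUL is an assumption based on the ACAP-style quoted and literal string types the text refers to.

```python
CRLF = b"\r\n"


def quote(text: str) -> bytes:
    """Encode text as a quoted string, escaping backslash and double quote."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return b'"' + escaped.encode("utf-8") + b'"'


def literal(data: bytes) -> bytes:
    """Encode octets as a client-side literal: {count+}, CRLF, then the octets."""
    return b"{" + str(len(data)).encode("ascii") + b"+}" + CRLF + data


def encode_string(text: str) -> bytes:
    """Assumption: use a literal when the text holds CR, LF or NUL, else quote it."""
    if any(ch in text for ch in "\r\n\0"):
        return literal(text.encode("utf-8"))
    return quote(text)


def build_command(name: str, *args) -> bytes:
    """Join the command atom and its arguments with spaces; terminate with CRLF."""
    parts = [name.encode("ascii")]
    for arg in args:
        if isinstance(arg, int):
            parts.append(str(arg).encode("ascii"))  # numbers are sent bare
        else:
            parts.append(encode_string(arg))        # strings: quoted or literal
    return b" ".join(parts) + CRLF
```

For instance, `build_command("LOGOUT")` yields `b"LOGOUT\r\n"`, and a multi-line argument is sent as a counted literal so the receiver can switch from reading a line to reading a known number of octets.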
-
-   All client queries are replied to with either an OK, NO, or BYE
-   response. Each response may be followed by a response code (see
-   Section 1.3) and by a string consisting of human-readable text in the
-   local language (as returned by the LANGUAGE capability; see
-   Section 1.7), encoded in UTF-8 [UTF-8]. The contents of the string
-   SHOULD be shown to the user, and implementations MUST NOT attempt to
-   parse the message for meaning.
-
-   The BYE response SHOULD be used if the server wishes to close the
-   connection. A server may wish to do this because the client was idle
-   for too long or there were too many failed authentication attempts.
-   This response can be issued at any time and should be immediately
-   followed by a server hang-up of the connection. If a server has an
-   inactivity timeout resulting in client autologout, it MUST be no less
-   than 30 minutes after successful authentication. The inactivity
-   timeout MAY be less before authentication.
-
-1.3. Response Codes
-
-   An OK, NO, or BYE response from the server MAY contain a response
-   code to describe the event in a more detailed machine-parsable
-   fashion. A response code consists of data inside parentheses in the
-   form of an atom, possibly followed by a space and arguments.
-
-   Response codes are defined when there is a specific action that a
-   client can take based upon the additional information. In order to
-   support future extension, the response code is represented as a
-   slash-separated (Solidus, %x2F) hierarchy with each level of
-   hierarchy representing increasing detail about the error. Response
-   codes MUST NOT start with the Solidus character. Clients MUST
-   tolerate additional hierarchical response code detail that they don't
-   understand.
For example, if the client supports the "QUOTA" response - code, but doesn't understand the "QUOTA/MAXSCRIPTS" response code, it - should treat "QUOTA/MAXSCRIPTS" as "QUOTA". - - Client implementations MUST tolerate (ignore) response codes that - they do not recognize. - - The currently defined response codes are the following: - - AUTH-TOO-WEAK - - This response code is returned in the NO or BYE response from an - AUTHENTICATE command. It indicates that site security policy forbids - the use of the requested mechanism for the specified authentication - identity. - - ENCRYPT-NEEDED - - This response code is returned in the NO or BYE response from an - AUTHENTICATE command. It indicates that site security policy - requires the use of a strong encryption mechanism for the specified - authentication identity and mechanism. - - QUOTA - - If this response code is returned in the NO/BYE response, it means - that the command would have placed the user above the site-defined - quota constraints. If this response code is returned in the OK - response, it can mean that the user's storage is near its quota, or - it can mean that the account exceeded its quota but that the - condition is being allowed by the server (the server supports - so-called soft quotas). The QUOTA response code has two more - detailed variants: "QUOTA/MAXSCRIPTS" (the maximum number of per-user - scripts) and "QUOTA/MAXSIZE" (the maximum script size). - - REFERRAL - - This response code may be returned with a BYE result from any - command, and includes a mandatory parameter that indicates what - server to access to manage this user's Sieve scripts. The server - will be specified by a Sieve URL (see Section 3). The scriptname - - - -Melnikov & Martin Standards Track [Page 4] - -RFC 5804 ManageSieve July 2010 - - - portion of the URL MUST NOT be specified. The client should - authenticate to the specified server and use it for all further - commands in the current session. 
- - SASL - - This response code can occur in the OK response to a successful - AUTHENTICATE command and includes the optional final server response - data from the server as specified by [SASL]. - - TRANSITION-NEEDED - - This response code occurs in a NO response of an AUTHENTICATE - command. It indicates that the user name is valid, but the entry in - the authentication database needs to be updated in order to permit - authentication with the specified mechanism. This is typically done - by establishing a secure channel using TLS, verifying server identity - as specified in Section 2.2.1, and finally authenticating once using - the [PLAIN] authentication mechanism. The selected mechanism SHOULD - then work for authentications in subsequent sessions. - - This condition can happen if a user has an entry in a system - authentication database such as Unix /etc/passwd, but does not have - credentials suitable for use by the specified mechanism. - - TRYLATER - - A command failed due to a temporary server failure. The client MAY - continue using local information and try the command later. This - response code only makes sense when returned in a NO/BYE response. - - ACTIVE - - A command failed because it is not allowed on the active script, for - example, DELETESCRIPT on the active script. This response code only - makes sense when returned in a NO/BYE response. - - NONEXISTENT - - A command failed because the referenced script name doesn't exist. - This response code only makes sense when returned in a NO/BYE - response. - - ALREADYEXISTS - - A command failed because the referenced script name already exists. - This response code only makes sense when returned in a NO/BYE - response. - - - -Melnikov & Martin Standards Track [Page 5] - -RFC 5804 ManageSieve July 2010 - - - TAG - - This response code name is followed by a string specified in the - command. See Section 2.13 for a possible use case. 
- - WARNINGS - - This response code MAY be returned by the server in the OK response - (but it might be returned with the NO/BYE response as well) and - signals the client that even though the script is syntactically - valid, it might contain errors not intended by the script writer. - This response code is typically returned in response to PUTSCRIPT - and/or CHECKSCRIPT commands. A client seeing such response code - SHOULD present the returned warning text to the user. - -1.4. Active Script - - A user may have multiple Sieve scripts on the server, yet only one - script may be used for filtering of incoming messages. This is the - active script. Users may have zero or one active script and MUST use - the SETACTIVE command described below for changing the active script - or disabling Sieve processing. For example, users may have an - everyday script they normally use and a special script they use when - they go on vacation. Users can change which script is being used - without having to download and upload a script stored somewhere else. - -1.5. Quotas - - Servers SHOULD impose quotas to prevent malicious users from - overflowing available storage. If a command would place a user over - a quota setting, servers that impose such quotas MUST reply with a NO - response containing the QUOTA response code. Client implementations - MUST be able to handle commands failing because of quota - restrictions. - -1.6. Script Names - - A Sieve script name is a sequence of Unicode characters encoded in - UTF-8 [UTF-8]. 
A script name MUST comply with Net-Unicode Definition
-   (Section 2 of [NET-UNICODE]), with the additional restriction of
-   prohibiting the following Unicode characters:
-
-   o  0000-001F; [CONTROL CHARACTERS]
-
-   o  007F; DELETE
-
-   o  0080-009F; [CONTROL CHARACTERS]
-
-   o  2028; LINE SEPARATOR
-
-   o  2029; PARAGRAPH SEPARATOR
-
-   Sieve script names MUST be at least one octet (and hence Unicode
-   character) long. A zero-octet script name has a special meaning (see
-   Section 2.8). Servers MUST allow names of up to 128 Unicode
-   characters in length (which can take up to 512 bytes when encoded in
-   UTF-8, not counting the terminating NUL), and MAY allow longer names.
-   A server that receives a script name longer than its internal limit
-   MUST reject the corresponding operation; in particular, it MUST NOT
-   truncate the script name.
-
-1.7. Capabilities
-
-   Server capabilities are sent automatically by the server upon a
-   client connection, or after successful STARTTLS and AUTHENTICATE
-   (which establishes a Simple Authentication and Security Layer (SASL))
-   commands. Capabilities may change immediately after a successfully
-   completed STARTTLS command, and/or immediately after a successfully
-   completed AUTHENTICATE command, and/or after a successfully completed
-   UNAUTHENTICATE command (see Section 2.14.1). Capabilities MUST
-   remain static at all other times.
-
-   Clients MAY request the capabilities at a later time by issuing the
-   CAPABILITY command described later. The capabilities consist of a
-   series of lines each with one or two strings. The first string is
-   the name of the capability, which is case-insensitive. The second
-   optional string is the value associated with that capability. Order
-   of capabilities is arbitrary, but each capability name can appear at
-   most once.
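The capability listing format described above (one or two strings per line, case-insensitive names, each name at most once) can be sketched in Python. This is a minimal sketch under stated assumptions: `parse_capabilities` is a hypothetical helper, it only handles quoted strings (not literals), and it expects the caller to have already stripped the CRLF and the final OK line.

```python
import re

# Matches one quoted string, allowing backslash-escaped characters inside it.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_capabilities(lines):
    """Map upper-cased capability names to their value (None when absent)."""
    caps = {}
    for line in lines:
        # Extract the quoted strings and undo backslash escaping.
        strings = [re.sub(r"\\(.)", r"\1", s) for s in _QUOTED.findall(line)]
        if not 1 <= len(strings) <= 2:
            raise ValueError("expected one or two strings: %r" % line)
        name = strings[0].upper()  # capability names are case-insensitive
        if name in caps:
            raise ValueError("capability listed twice: %s" % name)
        caps[name] = strings[1] if len(strings) == 2 else None
    return caps


caps = parse_capabilities([
    '"IMPlemENTATION" "Example1 ManageSieved v001"',
    '"SASl" "DIGEST-MD5 GSSAPI"',
    '"StaRTTLS"',
])
mechanisms = caps["SASL"].split()  # space-separated SASL mechanism names
```

Keeping unknown names in the dictionary and simply never looking them up is one way for a client to ignore capabilities it does not understand.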
- - The following capabilities are defined in this document: - - IMPLEMENTATION - Name of implementation and version. This capability - MUST always be returned by the server. - - SASL - List of SASL mechanisms supported by the server, each - separated by a space. This list can be empty if and only if STARTTLS - is also advertised. This means that the client must negotiate TLS - encryption with STARTTLS first, at which point the SASL capability - will list a non-empty list of SASL mechanisms. - - SIEVE - List of space-separated Sieve extensions (as listed in Sieve - "require" action [SIEVE]) supported by the Sieve engine. This - capability MUST always be returned by the server. - - - - - -Melnikov & Martin Standards Track [Page 7] - -RFC 5804 ManageSieve July 2010 - - - STARTTLS - If TLS [TLS] is supported by this implementation. Before - advertising this capability a server MUST verify to the best of its - ability that TLS can be successfully negotiated by a client with - common cipher suites. Specifically, a server should verify that a - server certificate has been installed and that the TLS subsystem has - successfully initialized. This capability SHOULD NOT be advertised - once STARTTLS or AUTHENTICATE command completes successfully. Client - and server implementations MUST implement the STARTTLS extension. - - MAXREDIRECTS - Specifies the limit on the number of Sieve "redirect" - actions a script can perform during a single evaluation. Note that - this is different from the total number of "redirect" actions a - script can contain. The value is a non-negative number represented - as a ManageSieve string. - - NOTIFY - A space-separated list of URI schema parts for supported - notification methods. This capability MUST be specified if the Sieve - implementation supports the "enotify" extension [NOTIFY]. - - LANGUAGE - The language (<Language-Tag> from [RFC5646]) currently - used for human-readable error messages. 
If this capability is not - returned, the "i-default" [RFC2277] language is assumed. Note that - the current language MAY be per-user configurable (i.e., it MAY - change after authentication). - - OWNER - The canonical name of the logged-in user (SASL "authorization - identity") encoded in UTF-8. This capability MUST NOT be returned in - unauthenticated state and SHOULD be returned once the AUTHENTICATE - command succeeds. - - VERSION - This capability MUST be returned by servers compliant with - this document or its successor. For servers compliant with this - document, the capability value is the string "1.0". Lack of this - capability means that the server predates this specification and thus - doesn't support the following commands: RENAMESCRIPT, CHECKSCRIPT, - and NOOP. - - Section 2.14 defines some additional ManageSieve extensions and their - respective capabilities. - - A server implementation MUST return SIEVE, IMPLEMENTATION, and - VERSION capabilities. - - A client implementation MUST ignore any listed capabilities that it - does not understand. - - - - - - -Melnikov & Martin Standards Track [Page 8] - -RFC 5804 ManageSieve July 2010 - - - Example: - - S: "IMPlemENTATION" "Example1 ManageSieved v001" - S: "SASl" "DIGEST-MD5 GSSAPI" - S: "SIeVE" "fileinto vacation" - S: "StaRTTLS" - S: "NOTIFY" "xmpp mailto" - S: "MAXREdIRECTS" "5" - S: "VERSION" "1.0" - S: OK - - After successful authentication, this might look like this: - - Example: - - S: "IMPlemENTATION" "Example1 ManageSieved v001" - S: "SASl" "DIGEST-MD5 GSSAPI" - S: "SIeVE" "fileinto vacation" - S: "NOTIFY" "xmpp mailto" - S: "OWNER" "alexey@example.com" - S: "MAXREdIRECTS" "5" - S: "VERSION" "1.0" - S: OK - -1.8. Transport - - The ManageSieve protocol assumes a reliable data stream such as that - provided by TCP. When TCP is used, a ManageSieve server typically - listens on port 4190. 
- - Before opening the TCP connection, the ManageSieve client first MUST - resolve the Domain Name System (DNS) hostname associated with the - receiving entity and determine the appropriate TCP port for - communication with the receiving entity. The process is as follows: - - 1. Attempt to resolve the hostname using a [DNS-SRV] Service of - "sieve" and a Proto of "tcp" for the target domain (e.g., - "example.net"), resulting in resource records such as - "_sieve._tcp.example.net.". The result of the SRV lookup, if - successful, will be one or more combinations of a port and - hostname; the ManageSieve client MUST resolve the returned - hostnames to IPv4/IPv6 addresses according to returned SRV record - weight. IP addresses from the first successfully resolved - hostname (with the corresponding port number returned by SRV - lookup) are used to connect to the server. If connection using - one of the IP addresses fails, the next resolved IP address is - - - - - -Melnikov & Martin Standards Track [Page 9] - -RFC 5804 ManageSieve July 2010 - - - used to connect. If connection to all resolved IP addresses - fails, then the resolution/connect is repeated for the next - hostname returned by SRV lookup. - - 2. If the SRV lookup fails, the fallback SHOULD be a normal IPv4 or - IPv6 address record resolution to determine the IP address, where - the port used is the default ManageSieve port of 4190. - -1.9. Conventions Used in This Document - - The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", - "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this - document are to be interpreted as described in [KEYWORDS]. - - In examples, "C:" and "S:" indicate lines sent by the client and - server respectively. Line breaks that do not start a new "C:" or - "S:" exist for editorial reasons. - - Examples of authentication in this document are using DIGEST-MD5 - [DIGEST-MD5] and GSSAPI [GSSAPI] SASL mechanisms. - -2. 
Commands - - This section and its subsections describe valid ManageSieve commands. - Upon initial connection to the server, the client's session is in - non-authenticated state. Prior to successful authentication, only - the AUTHENTICATE, CAPABILITY, STARTTLS, LOGOUT, and NOOP (see Section - 2.13) commands are valid. ManageSieve extensions MAY define other - commands that are valid in non-authenticated state. Servers MUST - reject all other commands with a NO response. Clients may pipeline - commands (send more than one command at a time without waiting for - completion of the first command). However, a group of commands sent - together MUST NOT have an AUTHENTICATE (*), a STARTTLS, or a - HAVESPACE command anywhere but the last command in the list. - - (*) - The only exception to this rule is when the AUTHENTICATE - command contains an initial response for a SASL mechanism that allows - clients to send data first, the mechanism is known to complete in one - round trip, and the mechanism doesn't negotiate a SASL security - layer. Two examples of such SASL mechanisms are PLAIN [PLAIN] and - EXTERNAL [SASL]. - - - - - - - - - - -Melnikov & Martin Standards Track [Page 10] - -RFC 5804 ManageSieve July 2010 - - -2.1. AUTHENTICATE Command - - Arguments: String - mechanism - String - initial data (optional) - - The AUTHENTICATE command indicates a SASL [SASL] authentication - mechanism to the server. If the server supports the requested - authentication mechanism, it performs an authentication protocol - exchange to identify and authenticate the user. Optionally, it also - negotiates a security layer for subsequent protocol interactions. If - the requested authentication mechanism is not supported, the server - rejects the AUTHENTICATE command by sending the NO response. - - The authentication protocol exchange consists of a series of server - challenges and client responses that are specific to the selected - authentication mechanism. 
A server challenge consists of a string - (quoted or literal) followed by a CRLF. The contents of the string - is a base-64 encoding [BASE64] of the SASL data. A client response - consists of a string (quoted or literal) with the base-64 encoding of - the SASL data followed by a CRLF. If the client wishes to cancel the - authentication exchange, it issues a string containing a single "*". - If the server receives such a response, it MUST reject the - AUTHENTICATE command by sending a NO reply. - - Note that an empty challenge/response is sent as an empty string. If - the mechanism dictates that the final response is sent by the server, - this data MAY be placed within the data portion of the SASL response - code to save a round trip. - - The optional initial-response argument to the AUTHENTICATE command is - used to save a round trip when using authentication mechanisms that - are defined to send no data in the initial challenge. When the - initial-response argument is used with such a mechanism, the initial - empty challenge is not sent to the client and the server uses the - data in the initial-response argument as if it were sent in response - to the empty challenge. If the initial-response argument to the - AUTHENTICATE command is used with a mechanism that sends data in the - initial challenge, the server MUST reject the AUTHENTICATE command by - sending the NO response. - - The service name specified by this protocol's profile of SASL is - "sieve". - - Reauthentication is not supported by ManageSieve protocol's profile - of SASL. That is, after a successfully completed AUTHENTICATE - command, no more AUTHENTICATE commands may be issued in the same - session. After a successful AUTHENTICATE command completes, a server - MUST reject any further AUTHENTICATE commands with a NO reply. 
- - - -Melnikov & Martin Standards Track [Page 11] - -RFC 5804 ManageSieve July 2010 - - - However, note that a server may implement the UNAUTHENTICATE - extension described in Section 2.14.1. - - If a security layer is negotiated through the SASL authentication - exchange, it takes effect immediately following the CRLF that - concludes the successful authentication exchange for the client, and - the CRLF of the OK response for the server. - - When a security layer takes effect, the ManageSieve protocol is reset - to the initial state (the state in ManageSieve after a client has - connected to the server). The server MUST discard any knowledge - obtained from the client that was not obtained from the SASL (or TLS) - negotiation itself. Likewise, the client MUST discard any knowledge - obtained from the server, such as the list of ManageSieve extensions, - that was not obtained from the SASL (and/or TLS) negotiation itself. - (Note that a client MAY compare the advertised SASL mechanisms before - and after authentication in order to detect an active down- - negotiation attack. See below.) - - Once a SASL security layer is established, the server MUST re-issue - the capability results, followed by an OK response. This is - necessary to protect against man-in-the-middle attacks that alter the - capabilities list prior to SASL negotiation. The capability results - MUST include all SASL mechanisms the server was capable of - negotiating with that client. This is done in order to allow the - client to detect an active down-negotiation attack. If a user- - oriented client detects such a down-negotiation attack, it SHOULD - either notify the user (it MAY give the user the opportunity to - continue with the ManageSieve session in this case) or close the - transport connection and indicate that a down-negotiation attack - might be in progress. 
If an automated client detects a down- - negotiation attack, it SHOULD return or log an error indicating that - a possible attack might be in progress and/or SHOULD close the - transport connection. - - When both [TLS] and SASL security layers are in effect, the TLS - encoding MUST be applied (when sending data) after the SASL encoding. - - Server implementations SHOULD support SASL proxy authentication so - that an administrator can administer a user's scripts. Proxy - authentication is when a user authenticates as herself/himself but - requests the server to act (authorize) as another user. - - The authorization identity generated by this [SASL] exchange is a - "simple username" (in the sense defined in [SASLprep]), and both - client and server MUST use the [SASLprep] profile of the [StringPrep] - algorithm to prepare these names for transmission or comparison. If - preparation of the authorization identity fails or results in an - - - -Melnikov & Martin Standards Track [Page 12] - -RFC 5804 ManageSieve July 2010 - - - empty string (unless it was transmitted as the empty string), the - server MUST fail the authentication. - - If an AUTHENTICATE command fails with a NO response, the client MAY - try another authentication mechanism by issuing another AUTHENTICATE - command. In other words, the client may request authentication types - in decreasing order of preference. - - Note that a failed (NO) response to the AUTHENTICATE command may - contain one of the following response codes: AUTH-TOO-WEAK, ENCRYPT- - NEEDED, or TRANSITION-NEEDED. See Section 1.3 for detailed - description of the relevant conditions. - - To ensure interoperability, both client and server implementations of - the ManageSieve protocol MUST implement the SCRAM-SHA-1 [SCRAM] SASL - mechanism, as well as [PLAIN] over [TLS]. 
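As an illustration of the mandatory PLAIN-over-TLS mechanism, the following Python sketch builds an AUTHENTICATE command carrying a base-64 initial response. The function names are hypothetical. The message layout (authorization identity, NUL, authentication identity, NUL, password) comes from the PLAIN specification rather than from this text, and the sketch assumes the identities have already been prepared with SASLprep and that the command is only sent after TLS is established and the server identity has been verified.

```python
import base64


def plain_initial_response(authcid: str, password: str, authzid: str = "") -> str:
    """Base-64 encode the PLAIN message: authzid NUL authcid NUL password."""
    message = "\0".join([authzid, authcid, password]).encode("utf-8")
    return base64.b64encode(message).decode("ascii")


def authenticate_plain_command(authcid: str, password: str, authzid: str = "") -> bytes:
    """Build AUTHENTICATE "PLAIN" with the initial response as a quoted string."""
    response = plain_initial_response(authcid, password, authzid)
    # Base-64 output never contains a double quote or backslash,
    # so it can be placed inside a quoted string without escaping.
    return ('AUTHENTICATE "PLAIN" "%s"\r\n' % response).encode("ascii")


command = authenticate_plain_command("test", "test1234")
```

Sending the credentials as an initial response saves one round trip, because PLAIN is a mechanism in which the client sends data first.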
- - Note: use of PLAIN over TLS reflects current use of PLAIN over TLS in - other email-related protocols; however, a longer-term goal is to - migrate email-related protocols from using PLAIN over TLS to SCRAM- - SHA-1 mechanism. - - Examples (Note that long lines are folded for readability and are not - part of protocol exchange): - - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "SASL" "DIGEST-MD5 GSSAPI" - S: "SIEVE" "fileinto vacation" - S: "STARTTLS" - S: "VERSION" "1.0" - S: OK - C: Authenticate "DIGEST-MD5" - S: "cmVhbG09ImVsd29vZC5pbm5vc29mdC5leGFtcGxlLmNvbSIsbm9uY2U9Ik - 9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgiLGFsZ29yaXRobT1tZDUtc2Vz - cyxjaGFyc2V0PXV0Zi04" - C: "Y2hhcnNldD11dGYtOCx1c2VybmFtZT0iY2hyaXMiLHJlYWxtPSJlbHdvb2 - QuaW5ub3NvZnQuZXhhbXBsZS5jb20iLG5vbmNlPSJPQTZNRzl0RVFHbTJo - aCIsbmM9MDAwMDAwMDEsY25vbmNlPSJPQTZNSFhoNlZxVHJSayIsZGlnZX - N0LXVyaT0ic2lldmUvZWx3b29kLmlubm9zb2Z0LmV4YW1wbGUuY29tIixy - ZXNwb25zZT1kMzg4ZGFkOTBkNGJiZDc2MGExNTIzMjFmMjE0M2FmNyxxb3 - A9YXV0aA==" - S: OK (SASL "cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZ - mZmZA==") - - - - - - - - -Melnikov & Martin Standards Track [Page 13] - -RFC 5804 ManageSieve July 2010 - - - A slightly different variant of the same authentication exchange is: - - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "SASL" "DIGEST-MD5 GSSAPI" - S: "SIEVE" "fileinto vacation" - S: "VERSION" "1.0" - S: "STARTTLS" - S: OK - C: Authenticate "DIGEST-MD5" - S: {136} - S: cmVhbG09ImVsd29vZC5pbm5vc29mdC5leGFtcGxlLmNvbSIsbm9uY2U9Ik - 9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgiLGFsZ29yaXRobT1tZDUtc2Vz - cyxjaGFyc2V0PXV0Zi04 - C: {300+} - C: Y2hhcnNldD11dGYtOCx1c2VybmFtZT0iY2hyaXMiLHJlYWxtPSJlbHdvb2 - QuaW5ub3NvZnQuZXhhbXBsZS5jb20iLG5vbmNlPSJPQTZNRzl0RVFHbTJo - aCIsbmM9MDAwMDAwMDEsY25vbmNlPSJPQTZNSFhoNlZxVHJSayIsZGlnZX - N0LXVyaT0ic2lldmUvZWx3b29kLmlubm9zb2Z0LmV4YW1wbGUuY29tIixy - ZXNwb25zZT1kMzg4ZGFkOTBkNGJiZDc2MGExNTIzMjFmMjE0M2FmNyxxb3 - A9YXV0aA== - S: {56} - S: 
cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA==
-       C: ""
-       S: OK
-
-   Another example demonstrating use of the SASL PLAIN mechanism under
-   TLS follows. This example also demonstrates use of the SASL "initial
-   response" (the second parameter to the Authenticate command):
-
-       S: "IMPLEMENTATION" "Example1 ManageSieved v001"
-       S: "VERSION" "1.0"
-       S: "SASL" ""
-       S: "SIEVE" "fileinto vacation"
-       S: "STARTTLS"
-       S: OK
-       C: STARTTLS
-       S: OK
-       <TLS negotiation, further commands are under TLS layer>
-       S: "IMPLEMENTATION" "Example1 ManageSieved v001"
-       S: "VERSION" "1.0"
-       S: "SASL" "PLAIN"
-       S: "SIEVE" "fileinto vacation"
-       S: OK
-       C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xu"
-       S: NO
-       C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xz"
-       S: NO
-       C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xy"
-       S: BYE "Too many failed authentication attempts"
-       <Server closes connection>
-
-   The following example demonstrates use of SASL "initial response".
-   It also demonstrates that an empty response can be sent as a literal
-   and that negotiating a SASL security layer results in the server
-   re-issuing server capabilities:
-
-       C: AUTHENTICATE "GSSAPI" {1488+}
-       C: YIIE[...1480 octets here ...]dA==
-       S: {208}
-       S: YIGZBgkqhkiG9xIBAgICAG+BiTCBhqADAgEFoQMCAQ+iejB4oAMCARKic
-          [...114 octets here ...]
- /yzpAy9p+Y0LanLskOTvMc0MnjgAa4YEr3eJ6 - C: {0+} - C: - S: {44} - S: BQQF/wAMAAwAAAAAYRGFAo6W0vIHti8i1UXODgEAEAA= - C: {44+} - C: BQQE/wAMAAwAAAAAIsT1iv9UkZApw471iXt6cwEAAAE= - S: OK - <Further commands/responses are under SASL security layer> - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "VERSION" "1.0" - S: "SASL" "PLAIN DIGEST-MD5 GSSAPI" - S: "SIEVE" "fileinto vacation" - S: "LANGUAGE" "ru" - S: "MAXREDIRECTS" "3" - S: ok - -2.1.1. Use of SASL PLAIN Mechanism over TLS - - This section is normative for ManageSieve client implementations that - support SASL [PLAIN] over [TLS]. - - If a ManageSieve client is willing to use SASL PLAIN over TLS to - authenticate to the ManageSieve server, the client MUST verify the - server identity (see Section 2.2.1). If the server identity can't be - verified (e.g., the server has not provided any certificate, or if - the certificate verification fails), the client MUST NOT attempt to - authenticate using the SASL PLAIN mechanism. - -2.2. STARTTLS Command - - Support for STARTTLS command in servers is optional. Its - availability is advertised with "STARTTLS" capability as described in - Section 1.7. - - The STARTTLS command requests commencement of a TLS [TLS] - negotiation. The negotiation begins immediately after the CRLF in - the OK response. After a client issues a STARTTLS command, it MUST - - - -Melnikov & Martin Standards Track [Page 16] - -RFC 5804 ManageSieve July 2010 - - - NOT issue further commands until a server response is seen and the - TLS negotiation is complete. - - The STARTTLS command is only valid in non-authenticated state. The - server remains in non-authenticated state, even if client credentials - are supplied during the TLS negotiation. The SASL [SASL] EXTERNAL - mechanism MAY be used to authenticate once TLS client credentials are - successfully exchanged, but servers supporting the STARTTLS command - are not required to support the EXTERNAL mechanism. 
- - After the TLS layer is established, the server MUST re-issue the - capability results, followed by an OK response. This is necessary to - protect against man-in-the-middle attacks that alter the capabilities - list prior to STARTTLS. This capability result MUST NOT include the - STARTTLS capability. - - The client MUST discard cached capability information and replace it - with the new information. The server MAY advertise different - capabilities after STARTTLS. - - Example: - - C: StartTls - S: oK - <TLS negotiation, further commands are under TLS layer> - S: "IMPLEMENTATION" "Example1 ManageSieved v001" - S: "SASL" "PLAIN DIGEST-MD5 GSSAPI" - S: "SIEVE" "fileinto vacation" - S: "VERSION" "1.0" - S: "LANGUAGE" "fr" - S: ok - -2.2.1. Server Identity Check - - During the TLS negotiation, the ManageSieve client MUST check its - understanding of the server hostname/IP address against the server's - identity as presented in the server Certificate message, in order to - prevent man-in-the-middle attacks. In this section, the client's - understanding of the server's identity is called the "reference - identity". - - Checking is performed according to the following rules: - - o If the reference identity is a hostname: - - 1. If a subjectAltName extension of the SRVName [X509-SRV], - dNSName [X509] (in that order of preference) type is present - in the server's certificate, then it SHOULD be used as the - - - -Melnikov & Martin Standards Track [Page 17] - -RFC 5804 ManageSieve July 2010 - - - source of the server's identity. Matching is performed as - described in Section 2.2.1.1, with the exception that no - wildcard matching is allowed for SRVName type. If the - certificate contains multiple names (e.g., more than one - dNSName field), then a match with any one of the fields is - considered acceptable. - - 2. The client MAY use other types of subjectAltName for - performing comparison. - - 3. 
The server's identity MAY also be verified by comparing the - reference identity to the Common Name (CN) [RFC4519] value in - the leaf Relative Distinguished Name (RDN) of the subjectName - field of the server's certificate. This comparison is - performed using the rules for comparison of DNS names in - Section 2.2.1.1, below. Although the use of the Common Name - value is existing practice, it is deprecated, and - Certification Authorities are encouraged to provide - subjectAltName values instead. Note that the TLS - implementation may represent DNs in certificates according to - X.500 or other conventions. For example, some X.500 - implementations order the RDNs in a DN using a left-to-right - (most significant to least significant) convention instead of - LDAP's right-to-left convention. - - o When the reference identity is an IP address, the iPAddress - subjectAltName SHOULD be used by the client for comparison. The - comparison is performed as described in Section 2.2.1.2. - - If the server identity check fails, user-oriented clients SHOULD - either notify the user (clients MAY give the user the opportunity to - continue with the ManageSieve session in this case) or close the - transport connection and indicate that the server's identity is - suspect. Automated clients SHOULD return or log an error indicating - that the server's identity is suspect and/or SHOULD close the - transport connection. Automated clients MAY provide a configuration - setting that disables this check, but MUST provide a setting that - enables it. - - Beyond the server identity check described in this section, clients - should be prepared to do further checking to ensure that the server - is authorized to provide the service it is requested to provide. The - client may need to make use of local policy information in making - this determination. - - - - - - - -Melnikov & Martin Standards Track [Page 18] - -RFC 5804 ManageSieve July 2010 - - -2.2.1.1. 
Comparison of DNS Names
-
-   If the reference identity is an internationalized domain name,
-   conforming implementations MUST convert it to the ASCII Compatible
-   Encoding (ACE) format as specified in Section 4 of RFC 3490 [RFC3490]
-   before comparison with subjectAltName values of type dNSName.
-   Specifically, conforming implementations MUST perform the conversion
-   operation specified in Section 4 of [RFC3490] as follows:
-
-   o  in step 1, the domain name SHALL be considered a "stored string";
-
-   o  in step 3, set the flag called "UseSTD3ASCIIRules";
-
-   o  in step 4, process each label with the "ToASCII" operation; and
-
-   o  in step 5, change all label separators to U+002E (full stop).
-
-   After performing the "to-ASCII" conversion, the DNS labels and names
-   MUST be compared for equality according to the rules specified in
-   Section 3 of [RFC3490]; i.e., once all label separators are replaced
-   with U+002E (dot) they are compared in the case-insensitive manner.
-
-   The '*' (ASCII 42) wildcard character is allowed in subjectAltName
-   values of type dNSName, and then only as the left-most (least
-   significant) DNS label in that value. This wildcard matches any
-   left-most DNS label in the server name. That is, the subject
-   *.example.com matches the server names a.example.com and
-   b.example.com, but does not match example.com or a.b.example.com.
-
-2.2.1.2. Comparison of IP Addresses
-
-   When the reference identity is an IP address, the identity MUST be
-   converted to the "network byte order" octet string representation
-   [RFC791][RFC2460]. For IP Version 4, as specified in RFC 791, the
-   octet string will contain exactly four octets. For IP Version 6, as
-   specified in RFC 2460, the octet string will contain exactly sixteen
-   octets. This octet string is then compared against subjectAltName
-   values of type iPAddress. A match occurs if the reference identity
-   octet string and value octet strings are identical.
-
-2.2.1.3.
Comparison of Other subjectName Types
-
-   Client implementations MAY support matching against subjectAltName
-   values of other types as described in other documents.
-
-2.3. LOGOUT Command
-
-   The client sends the LOGOUT command when it is finished with a
-   connection and wishes to terminate it. The server MUST reply with an
-   OK response. The server MUST ignore commands issued by the client
-   after the LOGOUT command.
-
-   The client SHOULD wait for the OK response before closing the
-   connection. This avoids the TCP connection going into the TIME_WAIT
-   state on the server. In order to avoid going into the TIME_WAIT TCP
-   state, the server MAY wait for a short while for the client to close
-   the TCP connection first. Whether or not the server waits for the
-   client to close the connection, it MUST then close the connection
-   itself.
-
-   Example:
-
-     C: Logout
-     S: Ok
-     <connection is terminated>
-
-2.4. CAPABILITY Command
-
-   The CAPABILITY command requests the server capabilities as described
-   earlier in this document. It has no parameters.
-
-   Example:
-
-     C: CAPABILITY
-     S: "IMPLEMENTATION" "Example1 ManageSieved v001"
-     S: "VERSION" "1.0"
-     S: "SASL" "PLAIN SCRAM-SHA-1 GSSAPI"
-     S: "SIEVE" "fileinto vacation"
-     S: "STARTTLS"
-     S: OK
-
-2.5. HAVESPACE Command
-
-   Arguments:  String - name
-               Number - script size
-
-   The HAVESPACE command is used to query the server for available
-   space. Clients specify the name they wish to save the script as and
-   its size in octets. Both parameters can be used by the server to see
-   if the script with the specified name and size is within a user's
-   quota(s). For example, the server MAY use the script name to check
-   if a script would be replaced or a new one would be created.
-   Servers respond with a NO if storing a script with that name and
-   size would fail or OK otherwise. Clients SHOULD issue this command
-   before attempting to place a script on the server.
-
-   Note that the OK response from the HAVESPACE command does not
-   constitute a guarantee of success as server disk space conditions
-   could change between the client issuing the HAVESPACE and the client
-   issuing the PUTSCRIPT commands. A QUOTA response code (see
-   Section 1.3) remains a possible (albeit unlikely) response to a
-   subsequent PUTSCRIPT with the same name and size.
-
-   Example:
-
-     C: HAVESPACE "myscript" 999999
-     S: NO (QUOTA/MAXSIZE) "Quota exceeded"
-
-     C: HAVESPACE "foobar" 435
-     S: OK
-
-2.6. PUTSCRIPT Command
-
-   Arguments:  String - Script name
-               String - Script content
-
-   The PUTSCRIPT command is used by the client to submit a Sieve script
-   to the server.
-
-   If the script already exists, upon success the old script will be
-   overwritten. The old script MUST NOT be overwritten if PUTSCRIPT
-   fails in any way. A script of zero length SHOULD be disallowed.
-
-   This command places the script on the server. It does not affect
-   whether the script is processed on incoming mail, unless it replaces
-   the script that is already active. The SETACTIVE command is used to
-   mark a script as active.
-
-   When submitting large scripts, clients SHOULD use the HAVESPACE
-   command beforehand to query if the server is willing to accept a
-   script of that size.
-
-   The server MUST check the submitted script for validity, which
-   includes checking that the script complies with the Sieve grammar
-   [SIEVE] and that all Sieve extensions mentioned in the script's
-   "require" statement(s) are supported by the Sieve interpreter.
-   (Note that if the Sieve interpreter supports the Sieve "ihave"
-   extension [I-HAVE], any unrecognized/unsupported extension mentioned
-   in the "ihave" test MUST NOT cause the validation failure.) Other
-   checks such as validating the supplied command arguments for each
-   command MAY be performed. Essentially, the performed validation
-   SHOULD be the same as performed when compiling the script for
-   execution. Implementations that use a binary representation to store
-   compiled scripts can extend the validation to a full compilation, in
-   order to avoid validating uploaded scripts multiple times.
-
-   If the script fails the validation, the server MUST reply with a NO
-   response. Any script that fails the validity test MUST NOT be stored
-   on the server. The message given with a NO response MUST be human
-   readable and SHOULD contain a specific error message giving the line
-   number of the first error. Implementors should strive to produce
-   helpful error messages similar to those given by programming language
-   compilers. Client implementations should note that this may be a
-   multiline literal string with more than one error message separated
-   by CRLFs. The human-readable message is in the language returned in
-   the latest LANGUAGE capability (or in "i-default"; see Section 1.7),
-   encoded in UTF-8 [UTF-8].
-
-   An OK response MAY contain the WARNINGS response code. In such a
-   case the human-readable message that follows the OK response SHOULD
-   contain a specific warning message (or messages) giving the line
-   number(s) in the script that might contain errors not intended by the
-   script writer. The human-readable message is in the language
-   returned in the latest LANGUAGE capability (or in "i-default"; see
-   Section 1.7), encoded in UTF-8 [UTF-8]. A client seeing such a
-   response code SHOULD present the message to the user.
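As a rough illustration of the PUTSCRIPT description above, the sketch below frames the command the way the examples in this section do, sending the script as a client-to-server literal whose `{N+}` prefix counts the octets of the UTF-8 encoded script rather than its characters. The helper names `quote_name` and `putscript_command` are hypothetical, the script name is assumed to fit in a quoted string, and refusing an empty script on the client side is a choice made for this sketch, not a requirement of the text.

```python
def quote_name(name: str) -> bytes:
    """Encode a script name as a quoted string, escaping '\\' and '"'."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return b'"' + escaped.encode("utf-8") + b'"'


def putscript_command(name: str, script: str) -> bytes:
    """Build the bytes of a PUTSCRIPT command that uses a {N+} literal."""
    body = script.encode("utf-8")  # the literal length is counted in octets
    if not body:
        # Assumption of this sketch: fail early, since a server SHOULD
        # disallow a zero-length script anyway.
        raise ValueError("refusing to upload a zero-length script")
    size = str(len(body)).encode("ascii")
    header = b"PUTSCRIPT " + quote_name(name) + b" {" + size + b"+}\r\n"
    # The literal is followed by the CRLF that terminates the command.
    return header + body + b"\r\n"
```

A client would write the returned bytes to the connection and then read the OK or NO response, including any WARNINGS text.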
-
-   Examples:
-
-     C: Putscript "foo" {31+}
-     C: #comment
-     C: InvalidSieveCommand
-     C:
-     S: NO "line 2: Syntax error"
-
-     C: Putscript "mysievescript" {110+}
-     C: require ["fileinto"];
-     C:
-     C: if envelope :contains "to" "tmartin+sent" {
-     C:   fileinto "INBOX.sent";
-     C: }
-     S: OK
-
-     C: Putscript "myforwards" {190+}
-     C: redirect "111@example.net";
-     C:
-     C: if size :under 10k {
-     C:     redirect "mobile@cell.example.com";
-     C: }
-     C:
-     C: if envelope :contains "to" "tmartin+lists" {
-     C:     redirect "lists@groups.example.com";
-     C: }
-     S: OK (WARNINGS) "line 8: server redirect action
-             limit is 2, this redirect might be ignored"
-
-2.7. LISTSCRIPTS Command
-
-   This command lists the scripts the user has on the server. Upon
-   success, a list of CRLF-separated script names (each represented as a
-   quoted or literal string) is returned followed by an OK response. If
-   there exists an active script, the atom ACTIVE is appended to the
-   corresponding script name. The atom ACTIVE MUST NOT appear on more
-   than one response line.
-
-   Example:
-
-     C: Listscripts
-     S: "summer_script"
-     S: "vacation_script"
-     S: {13}
-     S: clever"script
-     S: "main_script" ACTIVE
-     S: OK
-
-     C: listscripts
-     S: "summer_script"
-     S: "main_script" active
-     S: OK
-
-2.8. SETACTIVE Command
-
-   Arguments:  String - script name
-
-   This command sets a script active. If the script name is the empty
-   string (i.e., ""), then any active script is disabled. Disabling an
-   active script when there is no script active is not an error and MUST
-   result in an OK reply.
-
-   If the script does not exist on the server, then the server MUST
-   reply with a NO response. Such a reply SHOULD contain the
-   NONEXISTENT response code.
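The SETACTIVE argument is an ordinary protocol string, so a client has to pick between the quoted and the literal form when sending it. The sketch below is one way to do that, assuming the rules of the formal syntax in Section 4: a quoted string cannot carry NUL, CR, or LF and is limited to 1024 octets, and anything else goes out as a `{N+}` literal. The names `encode_string` and `setactive_command` are hypothetical, and counting the escape characters against the 1024-octet limit is a conservative assumption of this sketch.

```python
def encode_string(value: str) -> bytes:
    """Encode a client-to-server string as quoted text or a {N+} literal."""
    data = value.encode("utf-8")
    # Inside a quoted string, backslash and double quote are escaped.
    escaped = data.replace(b"\\", b"\\\\").replace(b'"', b'\\"')
    has_forbidden = any(ch in data for ch in (b"\x00", b"\r", b"\n"))
    if has_forbidden or len(escaped) > 1024:
        return b"{" + str(len(data)).encode("ascii") + b"+}\r\n" + data
    return b'"' + escaped + b'"'


def setactive_command(name: str) -> bytes:
    """Build a SETACTIVE command; an empty name disables any active script."""
    return b"SETACTIVE " + encode_string(name) + b"\r\n"
```

Calling `setactive_command("")` produces the `Setactive ""` form that disables whichever script is currently active.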
-
-   Examples:
-
-     C: Setactive "vacationscript"
-     S: Ok
-
-     C: Setactive ""
-     S: Ok
-
-     C: Setactive "baz"
-     S: No (NONEXISTENT) "There is no script by that name"
-
-     C: Setactive "baz"
-     S: No (NONEXISTENT) {31}
-     S: There is no script by that name
-
-2.9. GETSCRIPT Command
-
-   Arguments:  String - script name
-
-   This command gets the contents of the specified script. If the
-   script does not exist, the server MUST reply with a NO response.
-   Such a reply SHOULD contain the NONEXISTENT response code.
-
-   Upon success, a string with the contents of the script is returned
-   followed by an OK response.
-
-   Example:
-
-     C: Getscript "myscript"
-     S: {54}
-     S: #this is my wonderful script
-     S: reject "I reject all";
-     S:
-     S: OK
-
-2.10. DELETESCRIPT Command
-
-   Arguments:  String - script name
-
-   This command is used to delete a user's Sieve script. Servers MUST
-   reply with a NO response if the script does not exist. Such
-   responses SHOULD include the NONEXISTENT response code.
-
-   The server MUST NOT allow the client to delete an active script, so
-   the server MUST reply with a NO response if attempted. Such a
-   response SHOULD contain the ACTIVE response code. If a client wishes
-   to delete an active script, it should use the SETACTIVE command to
-   disable the script first.
-
-   Example:
-
-     C: Deletescript "foo"
-     S: Ok
-
-     C: Deletescript "baz"
-     S: No (ACTIVE) "You may not delete an active script"
-
-2.11. RENAMESCRIPT Command
-
-   Arguments:  String - Old Script name
-               String - New Script name
-
-   This command is used to rename a user's Sieve script.
-   Servers MUST reply with a NO response if the old script does not
-   exist (in which case the NONEXISTENT response code SHOULD be
-   included), or a script with the new name already exists (in which
-   case the ALREADYEXISTS response code SHOULD be included). Renaming
-   the active script is allowed; the renamed script remains active.
-
-   Example:
-
-     C: Renamescript "foo" "bar"
-     S: Ok
-
-     C: Renamescript "baz" "bar"
-     S: No "bar already exists"
-
-   If the server doesn't support the RENAMESCRIPT command, the client
-   can emulate it by performing the following steps:
-
-   1.  List available scripts with LISTSCRIPTS. If the script with the
-       new script name exists, then the client should ask the user
-       whether to abort the operation, to replace the script (by issuing
-       the DELETESCRIPT <newname> after that), or to choose a different
-       name.
-
-   2.  Download the old script with GETSCRIPT <oldname>.
-
-   3.  Upload the old script with the new name: PUTSCRIPT <newname>.
-
-   4.  If the old script was active (as reported by LISTSCRIPTS in step
-       1), then make the new script active: SETACTIVE <newname>.
-
-   5.  Delete the old script: DELETESCRIPT <oldname>.
-
-   Note that these steps don't describe how to handle various other
-   error conditions (for example, NO response containing QUOTA response
-   code in step 3). Error handling is left as an exercise for the
-   reader.
-
-2.12. CHECKSCRIPT Command
-
-   Arguments:  String - Script content
-
-   The CHECKSCRIPT command is used by the client to verify Sieve script
-   validity without storing the script on the server.
-
-   The server MUST check the submitted script for syntactic validity,
-   which includes checking that all Sieve extensions mentioned in Sieve
-   script "require" statement(s) are supported by the Sieve interpreter.
-   (Note that if the Sieve interpreter supports the Sieve "ihave"
-   extension [I-HAVE], any unrecognized/unsupported extension mentioned
-   in the "ihave" test MUST NOT cause the syntactic validation failure.)
-   If the script fails this test, the server MUST reply with a NO
-   response. The message given with a NO response MUST be human
-   readable and SHOULD contain a specific error message giving the line
-   number of the first error. Implementors should strive to produce
-   helpful error messages similar to those given by programming language
-   compilers. Client implementations should note that this may be a
-   multiline literal string with more than one error message separated
-   by CRLFs. The human-readable message is in the language returned in
-   the latest LANGUAGE capability (or in "i-default"; see Section 1.7),
-   encoded in UTF-8 [UTF-8].
-
-   Examples:
-
-     C: CheckScript {31+}
-     C: #comment
-     C: InvalidSieveCommand
-     C:
-     S: NO "line 2: Syntax error"
-
-   A ManageSieve server supporting this command MUST NOT check if the
-   script will put the current user over its quota limit.
-
-   An OK response MAY contain the WARNINGS response code. In such a
-   case, the human-readable message that follows the OK response SHOULD
-   contain a specific warning message (or messages) giving the line
-   number(s) in the script that might contain errors not intended by the
-   script writer. The human-readable message is in the language
-   returned in the latest LANGUAGE capability (or in "i-default"; see
-   Section 1.7), encoded in UTF-8 [UTF-8]. A client seeing such a
-   response code SHOULD present the message to the user.
-
-2.13. NOOP Command
-
-   Arguments:  String - tag to echo back (optional)
-
-   The NOOP command does nothing, beyond returning a response to the
-   client. It may be used by clients for protocol re-synchronization or
-   to reset any inactivity auto-logout timer on the server.
-
-   The response to the NOOP command is always OK, followed by the TAG
-   response code together with the supplied string. If no string was
-   supplied in the NOOP command, the TAG response code MUST NOT be
-   included.
-
-   Examples:
-
-     C: NOOP
-     S: OK "NOOP completed"
-
-     C: NOOP "STARTTLS-SYNC-42"
-     S: OK (TAG {16}
-     S: STARTTLS-SYNC-42) "Done"
-
-2.14. Recommended Extensions
-
-   The UNAUTHENTICATE extension (advertised as the "UNAUTHENTICATE"
-   capability with no parameters) defines a new UNAUTHENTICATE command,
-   which allows a client to return the server to non-authenticated
-   state. Support for this extension is RECOMMENDED.
-
-2.14.1. UNAUTHENTICATE Command
-
-   The UNAUTHENTICATE command returns the server to the
-   non-authenticated state. It doesn't affect any previously
-   established TLS [TLS] or SASL (Section 2.1) security layer.
-
-   The UNAUTHENTICATE command is only valid in authenticated state. If
-   issued in a wrong state, the server MUST reject it with a NO
-   response.
-
-   The UNAUTHENTICATE command has no parameters.
-
-   When issued in the authenticated state, the UNAUTHENTICATE command
-   MUST NOT fail (i.e., it must never return anything other than OK or
-   BYE).
-
-3. Sieve URL Scheme
-
-   URI scheme name: sieve
-
-   Status: permanent
-
-   URI scheme syntax: Described using ABNF [ABNF]. Some ABNF
-      productions not defined below are from [URI-GEN].
-
-      sieveurl              = sieveurl-server / sieveurl-list-scripts /
-                              sieveurl-script
-
-      sieveurl-server       = "sieve://" authority
-
-      sieveurl-list-scripts = "sieve://" authority ["/"]
-
-      sieveurl-script       = "sieve://" authority "/"
-                              [owner "/"] scriptname
-
-      authority             = <defined in [URI-GEN]>
-
-      owner                 = *ochar
-                              ;; %-encoded version of [SASL] authorization
-                              ;; identity (script owner) or "userid".
-                              ;;
-                              ;; Empty owner is used to reference
-                              ;; global scripts.
-                              ;;
-                              ;; Note that ASCII characters such as " ", ";",
-                              ;; "&", "=", "/" and "?" must be %-encoded
-                              ;; as per rule specified in [URI-GEN].
-
-      scriptname            = 1*ochar
-                              ;; %-encoded version of UTF-8 representation
-                              ;; of the script name.
-                              ;; Note that ASCII characters such as " ", ";",
-                              ;; "&", "=", "/" and "?" must be %-encoded
-                              ;; as per rule specified in [URI-GEN].
-
-      ochar                 = unreserved / pct-encoded / sub-delims-sh /
-                              ":" / "@"
-                              ;; Same as [URI-GEN] 'pchar',
-                              ;; but without ";", "&" and "=".
-
-      unreserved            = <defined in [URI-GEN]>
-
-      pct-encoded           = <defined in [URI-GEN]>
-
-      sub-delims-sh         = "!" / "$" / "'" / "(" / ")" /
-                              "*" / "+" / ","
-                              ;; Same as [URI-GEN] sub-delims,
-                              ;; but without ";", "&" and "=".
-
-   URI scheme semantics:
-
-      A Sieve URL identifies a Sieve server or a Sieve script on a Sieve
-      server. The latter form is associated with the application/sieve
-      MIME type defined in [SIEVE]. There is no MIME type associated
-      with the former form of Sieve URI.
-
-      The server form is used in the REFERRAL response code (see Section
-      1.3) in order to designate another server where the client should
-      perform its operations.
-
-      The script form allows the client to retrieve (GETSCRIPT), update
-      (PUTSCRIPT), delete (DELETESCRIPT), or activate (SETACTIVE) the
-      named script; however, the most typical action would be to
-      retrieve the script. If the script name is empty (omitted), the
-      URI requests that the client list available scripts using the
-      LISTSCRIPTS command.
-
-   Encoding considerations:
-
-      The script name and/or the owner, if present, is in UTF-8.
-      Non-US-ASCII UTF-8 octets MUST be percent-encoded as described in
-      [URI-GEN]. US-ASCII characters such as " " (space), ";", "&",
-      "=", "/" and "?" MUST be %-encoded as described in [URI-GEN].
-      Note that "&" and "?" are in this list in order to allow for
-      future extensions.
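The encoding rules above can be sketched with Python's standard `urllib.parse.quote`, which percent-encodes the UTF-8 octets of a string and leaves only the characters listed in `safe` (plus the unreserved set) untouched. The function name `sieve_url` and the constant `OCHAR_SAFE` are hypothetical, the authority is assumed to be already well-formed, and an owner of `""` stands for the empty owner that the grammar reserves for global scripts.

```python
from urllib.parse import quote

# Characters the 'ochar' rule permits unescaped in addition to the
# unreserved set: sub-delims-sh plus ":" and "@".
OCHAR_SAFE = "!$'()*+,:@"


def sieve_url(authority, scriptname=None, owner=None):
    """Build a sieve: URL; owner=None means no owner part at all."""
    url = "sieve://" + authority
    if scriptname is None:
        return url  # server form
    if scriptname == "":
        raise ValueError("scriptname must contain at least one character")
    parts = []
    if owner is not None:
        parts.append(quote(owner, safe=OCHAR_SAFE))  # "" stays empty
    parts.append(quote(scriptname, safe=OCHAR_SAFE))
    return url + "/" + "/".join(parts)
```

Because "/" is not in the safe set, a slash inside a script name is emitted as `%2F` and cannot be mistaken for the owner separator.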
-
-      Note that the empty owner (e.g., sieve://example.com//script) is
-      different from the missing owner (e.g.,
-      sieve://example.com/script) and is reserved for referencing global
-      scripts.
-
-      The user name (in the "authority" part), if present, is in UTF-8.
-      Non-US-ASCII UTF-8 octets MUST be percent-encoded as described in
-      [URI-GEN].
-
-   Applications/protocols that use this URI scheme name:
-      ManageSieve [RFC5804] clients and servers. Clients that can store
-      user preferences in protocols such as [LDAP] or [ACAP].
-
-   Interoperability considerations: None.
-
-   Security considerations:
-      The <scriptname> part of a ManageSieve URL might potentially
-      disclose some confidential information about the author of the
-      script or, depending on a ManageSieve implementation, about
-      configuration of the mail system. The latter might be used to
-      prepare for a more complex attack on the mail system.
-
-      Clients resolving ManageSieve URLs that wish to achieve data
-      confidentiality and/or integrity SHOULD use the STARTTLS command
-      (if supported by the server) before starting authentication, or
-      use a SASL mechanism, such as GSSAPI, that provides a
-      confidentiality security layer.
-
-   Contact: Alexey Melnikov <alexey.melnikov@isode.com>
-
-   Author/Change controller: IESG.
-
-   References: This document and RFC 5228 [SIEVE].
-
-4. Formal Syntax
-
-   The following syntax specification uses the Augmented Backus-Naur
-   Form (ABNF) notation as specified in [ABNF]. This uses the ABNF core
-   rules as specified in Appendix A of the ABNF specification [ABNF].
-   The "UTF8-2", "UTF8-3", and "UTF8-4" non-terminals are defined in
-   [UTF-8].
-
-   Except as noted otherwise, all alphabetic characters are
-   case-insensitive. The use of upper- or lowercase characters to define
-   token strings is for editorial clarity only. Implementations MUST
-   accept these strings in a case-insensitive fashion.
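The case-insensitivity rule is visible in the examples earlier in this document, where servers answer with "oK", "Ok", and "No". A minimal sketch of how a client might normalize the status word of a final response line follows; the function name `response_status` is hypothetical, and the input is assumed to be a single line that has already been decoded to text.

```python
def response_status(line: str) -> str:
    """Return 'OK', 'NO' or 'BYE' for a final response line, ignoring case."""
    # The status word is everything up to the first space (or the whole line).
    word = line.split(" ", 1)[0].rstrip("\r\n").upper()
    if word not in ("OK", "NO", "BYE"):
        raise ValueError("not a final response line: %r" % line)
    return word
```

Lines that are not final responses, such as a quoted script name returned by LISTSCRIPTS, raise `ValueError` so the caller can treat them as data.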
-
-    SAFE-CHAR             = %x01-09 / %x0B-0C / %x0E-21 / %x23-5B /
-                            %x5D-7F
-                            ;; any TEXT-CHAR except QUOTED-SPECIALS
-
-    QUOTED-CHAR           = SAFE-UTF8-CHAR / "\" QUOTED-SPECIALS
-
-    QUOTED-SPECIALS       = DQUOTE / "\"
-
-    SAFE-UTF8-CHAR        = SAFE-CHAR / UTF8-2 / UTF8-3 / UTF8-4
-                            ;; <UTF8-2>, <UTF8-3>, and <UTF8-4>
-                            ;; are defined in [UTF-8].
-
-    ATOM-CHAR             = "!" / %x23-27 / %x2A-5B / %x5D-7A / %x7C-7E
-                            ;; Any CHAR except ATOM-SPECIALS
-
-    ATOM-SPECIALS         = "(" / ")" / "{" / SP / CTL / QUOTED-SPECIALS
-
-    NZDIGIT               = %x31-39
-                            ;; 1-9
-
-    atom                  = 1*1024ATOM-CHAR
-
-    iana-token            = atom
-                            ;; MUST be registered with IANA
-
-    auth-type             = DQUOTE auth-type-name DQUOTE
-
-    auth-type-name        = iana-token
-                            ;; as defined in SASL [SASL]
-
-    command               = (command-any / command-auth /
-                            command-nonauth) CRLF
-                            ;; Modal based on state
-
-    command-any           = command-capability / command-logout /
-                            command-noop
-                            ;; Valid in all states
-
-    command-auth          = command-getscript / command-setactive /
-                            command-listscripts / command-deletescript /
-                            command-putscript / command-checkscript /
-                            command-havespace /
-                            command-renamescript /
-                            command-unauthenticate
-                            ;; Valid only in Authenticated state
-
-    command-nonauth       = command-authenticate / command-starttls
-                            ;; Valid only when in Non-Authenticated
-                            ;; state
-
-    command-authenticate  = "AUTHENTICATE" SP auth-type [SP string]
-                            *(CRLF string)
-
-    command-capability    = "CAPABILITY"
-
-    command-deletescript  = "DELETESCRIPT" SP sieve-name
-
-    command-getscript     = "GETSCRIPT" SP sieve-name
-
-    command-havespace     = "HAVESPACE" SP sieve-name SP number
-
-    command-listscripts   = "LISTSCRIPTS"
-
-    command-noop          = "NOOP" [SP string]
-
-    command-logout        = "LOGOUT"
-
-    command-putscript     = "PUTSCRIPT" SP sieve-name SP sieve-script
-
-    command-checkscript   = "CHECKSCRIPT" SP sieve-script
-
-    sieve-script          = string
-
-    command-renamescript  = "RENAMESCRIPT" SP old-sieve-name SP
-                            new-sieve-name
-
-    old-sieve-name        = sieve-name
-
-    new-sieve-name        = sieve-name
-
-    command-setactive     = "SETACTIVE" SP active-sieve-name
-
-    command-starttls      = "STARTTLS"
-
-    command-unauthenticate= "UNAUTHENTICATE"
-
-    extend-token          = atom
-                            ;; MUST be defined by a Standards Track or
-                            ;; IESG-approved experimental protocol
-                            ;; extension
-
-    extension-data        = extension-item *(SP extension-item)
-
-    extension-item        = extend-token / string / number /
-                            "(" [extension-data] ")"
-
-    literal-c2s           = "{" number "+}" CRLF *OCTET
-                            ;; The number represents the number of
-                            ;; octets.
-                            ;; This type of literal can only be sent
-                            ;; from the client to the server.
-
-    literal-s2c           = "{" number "}" CRLF *OCTET
-                            ;; Almost identical to literal-c2s,
-                            ;; but with no '+' character.
-                            ;; The number represents the number of
-                            ;; octets.
-                            ;; This type of literal can only be sent
-                            ;; from the server to the client.
-
-    number                = (NZDIGIT *DIGIT) / "0"
-                            ;; A 32-bit unsigned number
-                            ;; with no extra leading zeros.
-                            ;; (0 <= n < 4,294,967,296)
-
-    number-str            = string
-                            ;; <number> encoded as a <string>.
-
-    quoted                = DQUOTE *1024QUOTED-CHAR DQUOTE
-                            ;; limited to 1024 octets between the <">s
-
-    resp-code             = "AUTH-TOO-WEAK" / "ENCRYPT-NEEDED" / "QUOTA"
-                            ["/" ("MAXSCRIPTS" / "MAXSIZE")] /
-                            resp-code-sasl /
-                            resp-code-referral /
-                            "TRANSITION-NEEDED" / "TRYLATER" /
-                            "ACTIVE" / "NONEXISTENT" /
-                            "ALREADYEXISTS" / "WARNINGS" /
-                            "TAG" SP string /
-                            resp-code-ext
-
-    resp-code-referral    = "REFERRAL" SP sieveurl
-
-    resp-code-sasl        = "SASL" SP string
-
-    resp-code-name        = iana-token
-                            ;; The response code name is hierarchical,
-                            ;; separated by '/'.
-                            ;; The response code name MUST NOT start
-                            ;; with '/'.
-
-    resp-code-ext         = resp-code-name [SP extension-data]
-                            ;; unknown response codes MUST be tolerated
-                            ;; by the client.
-
-    response              = response-authenticate /
-                            response-logout /
-                            response-getscript /
-                            response-setactive /
-                            response-listscripts /
-                            response-deletescript /
-                            response-putscript /
-                            response-checkscript /
-                            response-capability /
-                            response-havespace /
-                            response-starttls /
-                            response-renamescript /
-                            response-noop /
-                            response-unauthenticate
-
-    response-authenticate = *(string CRLF)
-                            ((response-ok [response-capability]) /
-                            response-nobye)
-                            ;; <response-capability> is REQUIRED if a
-                            ;; SASL security layer was negotiated and
-                            ;; MUST be omitted otherwise.
-
-    response-capability   = *(single-capability) response-oknobye
-
-    single-capability     = capability-name [SP string] CRLF
-
-    capability-name       = string
-
-                            ;; Note that literal-s2c is allowed.
-
-    initial-capabilities  = DQUOTE "IMPLEMENTATION" DQUOTE SP string /
-                            DQUOTE "SASL" DQUOTE SP sasl-mechs /
-                            DQUOTE "SIEVE" DQUOTE SP sieve-extensions /
-                            DQUOTE "MAXREDIRECTS" DQUOTE SP number-str /
-                            DQUOTE "NOTIFY" DQUOTE SP notify-mechs /
-                            DQUOTE "STARTTLS" DQUOTE /
-                            DQUOTE "LANGUAGE" DQUOTE SP language /
-                            DQUOTE "VERSION" DQUOTE SP version /
-                            DQUOTE "OWNER" DQUOTE SP string
-                            ;; Each capability conforms to
-                            ;; the syntax for single-capability.
-                            ;; Also, note that the capability name
-                            ;; can be returned as either literal-s2c
-                            ;; or quoted, even though only "quoted"
-                            ;; string is shown above.
-
-    version               = ( DQUOTE "1.0" DQUOTE ) / version-ext
-
-    version-ext           = DQUOTE ver-major "." ver-minor DQUOTE
-         ; Future versions specified in updates
-         ; to this document. An increment to
-         ; the ver-major means a backward-incompatible
-         ; change to the protocol, e.g., "3.5" (ver-major "3")
-         ; is not backward-compatible with any "2.X" version.
-         ; Any version "Z.W" MUST be backward compatible
-         ; with any version "Z.Q", where Q < W.
-         ; For example, version "2.4" is backward compatible
-         ; with version "2.0", "2.1", "2.2", and "2.3".
-
-    ver-major             = number
-
-    ver-minor             = number
-
-    sasl-mechs            = string
-         ; Space-separated list of SASL mechanisms,
-         ; each SASL mechanism name complies with rules
-         ; specified in [SASL].
-         ; Can be empty.
-
-    sieve-extensions      = string
-         ; Space-separated list of supported SIEVE extensions.
-         ; Can be empty.
-
-    language              = string
-         ; Contains <Language-Tag> from [RFC5646].
-
-    notify-mechs          = string
-         ; Space-separated list of URI schema parts
-         ; for supported notification [NOTIFY] methods.
-         ; MUST NOT be empty.
-
-    response-deletescript = response-oknobye
-
-    response-getscript    = (sieve-script CRLF response-ok) /
-                            response-nobye
-
-    response-havespace    = response-oknobye
-
-    response-listscripts  = *(sieve-name [SP "ACTIVE"] CRLF)
-                            response-oknobye
-                            ;; ACTIVE may only occur with one sieve-name
-
-    response-logout       = response-oknobye
-
-    response-unauthenticate= response-oknobye
-                            ;; "NO" response can only be returned when
-                            ;; the command is issued in a wrong state
-                            ;; or has a wrong number of parameters
-
-    response-ok           = "OK" [SP "(" resp-code ")"]
-                            [SP string] CRLF
-                            ;; The string contains human-readable text
-                            ;; encoded as UTF-8.
-
-    response-nobye        = ("NO" / "BYE") [SP "(" resp-code ")"]
-                            [SP string] CRLF
-                            ;; The string contains human-readable text
-                            ;; encoded as UTF-8.
-
-    response-oknobye      = response-ok / response-nobye
-
-    response-noop         = response-ok
-
-    response-putscript    = response-oknobye
-
-    response-checkscript  = response-oknobye
-
-    response-renamescript = response-oknobye
-
-    response-setactive    = response-oknobye
-
-    response-starttls     = (response-ok response-capability) /
-                            response-nobye
-
-    sieve-name            = string
-                            ;; See Section 1.6 for the full list of
-                            ;; prohibited characters.
-                            ;; Empty string is not allowed.
-
-    active-sieve-name     = string
-                            ;; See Section 1.6 for the full list of
-                            ;; prohibited characters.
-                            ;; This is similar to <sieve-name>, but
-                            ;; empty string is allowed and has a special
-                            ;; meaning.
-
-    string                = quoted / literal-c2s / literal-s2c
-                            ;; literal-c2s is only allowed when sent
-                            ;; from the client to the server.
-                            ;; literal-s2c is only allowed when sent
-                            ;; from the server to the client.
-                            ;; quoted is allowed in either direction.
-
-5. Security Considerations
-
-   The AUTHENTICATE command uses SASL [SASL] to provide authentication
-   and authorization services. Integrity and privacy services can be
-   provided by [SASL] and/or [TLS]. When a SASL mechanism is used, the
-   security considerations for that mechanism apply.
-
-   This protocol's transactions are susceptible to passive observers or
-   man-in-the-middle attacks that alter the data, unless the optional
-   encryption and integrity services of the SASL (via the AUTHENTICATE
-   command) and/or [TLS] (via the STARTTLS command) are enabled, or an
-   external security mechanism is used for protection. It may be useful
-   to allow configuration of both clients and servers to refuse to
-   transfer sensitive information in the absence of strong encryption.
-
-   If an implementation supports SASL mechanisms that are vulnerable to
-   passive eavesdropping attacks (such as [PLAIN]), then the
-   implementation MUST support at least one configuration where these
-   SASL mechanisms are not advertised or used without the presence of an
-   external security layer such as [TLS].
-
-   Some response codes returned on a failed AUTHENTICATE command may
-   disclose whether or not the username is valid (e.g.,
-   TRANSITION-NEEDED), so server implementations SHOULD provide the
-   ability to disable these features (or make them not conditional on a
-   per-user basis) for sites concerned about such disclosure.
-   In the case of ENCRYPT-NEEDED, if it is applied to all identities
-   then no extra information is disclosed, but if it is applied on a
-   per-user basis it can disclose information.
-
-   A compromised or malicious server can use the TRANSITION-NEEDED
-   response code to force the client that is configured to use a
-   mechanism that does not disclose the user's password to the server
-   (e.g., Kerberos), to send the bare password to the server. Clients
-   SHOULD have the ability to disable the password transition feature,
-   or disclose that risk to the user and offer the user an option of how
-   to proceed.
-
-6. IANA Considerations
-
-   IANA has reserved TCP port number 4190 for use with the ManageSieve
-   protocol described in this document.
-
-   IANA has registered the "sieve" URI scheme defined in Section 3 of
-   this document.
-
-   IANA has registered "sieve" in the "GSSAPI/Kerberos/SASL Service
-   Names" registry.
-
-   IANA has created a new registry for ManageSieve capabilities. The
-   registration template for ManageSieve capabilities is specified in
-   Section 6.1. ManageSieve protocol capabilities MUST be specified in
-   a Standards-Track or IESG-approved Experimental RFC.
-
-   IANA has created a new registry for ManageSieve response codes. The
-   registration template for ManageSieve response codes is specified in
-   Section 6.3. ManageSieve protocol response codes MUST be specified
-   in a Standards-Track or IESG-approved Experimental RFC.
-
-6.1. ManageSieve Capability Registration Template
-
-   To: iana@iana.org
-   Subject: ManageSieve Capability Registration
-
-   Please register the following ManageSieve capability:
-
-   Capability name:
-   Description:
-   Relevant publications:
-   Person & email address to contact for further information:
-   Author/Change controller:
-
-6.2.
Registration of Initial ManageSieve Capabilities - - To: iana@iana.org - Subject: ManageSieve Capability Registration - - Please register the following ManageSieve capabilities: - - Capability name: IMPLEMENTATION - Description: Its value contains the name of the server - implementation and its version. - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: SASL - Description: Its value contains a space-separated list of SASL - mechanisms supported by the server. - Relevant publications: this RFC, Sections 1.7 and 2.1. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: SIEVE - Description: Its value contains a space-separated list of supported - SIEVE extensions. - Relevant publications: this RFC, Section 1.7. Also [SIEVE]. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - -Melnikov & Martin Standards Track [Page 39] - -RFC 5804 ManageSieve July 2010 - - - Capability name: STARTTLS - Description: This capability is returned if the server supports TLS - (STARTTLS command). - Relevant publications: this RFC, Sections 1.7 and 2.2. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: NOTIFY - Description: This capability is returned if the server supports the - 'enotify' [NOTIFY] Sieve extension. - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. 
- - Capability name: MAXREDIRECTS - Description: This capability returns the limit on the number of - Sieve "redirect" actions a script can perform during a - single evaluation. The value is a non-negative number - represented as a ManageSieve string. - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: LANGUAGE - Description: The language (<Language-Tag> from [RFC5646]) currently - used for human-readable error messages. - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Capability name: OWNER - Description: Its value contains the UTF-8-encoded name of the - currently logged-in user ("authorization identity" - according to RFC 4422). - Relevant publications: this RFC, Section 1.7. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - - -Melnikov & Martin Standards Track [Page 40] - -RFC 5804 ManageSieve July 2010 - - - Capability name: VERSION - Description: This capability is returned if the server is compliant - with RFC 5804; i.e., that it supports RENAMESCRIPT, - CHECKSCRIPT, and NOOP commands. - Relevant publications: this RFC, Sections 2.11, 2.12, and 2.13. - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - -6.3. 
ManageSieve Response Code Registration Template - - To: iana@iana.org - Subject: ManageSieve Response Code Registration - - Please register the following ManageSieve response code: - - Response Code: - Arguments (use ABNF to specify syntax, or the word NONE if none - can be specified): - Purpose: - Published Specification(s): - Person & email address to contact for further information: - Author/Change controller: - -6.4. Registration of Initial ManageSieve Response Codes - - To: iana@iana.org - Subject: ManageSieve Response Code Registration - - Please register the following ManageSieve response codes: - - Response Code: AUTH-TOO-WEAK - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: This response code is returned in the NO response from - an AUTHENTICATE command. It indicates that site - security policy forbids the use of the requested - mechanism for the specified authentication identity. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - - -Melnikov & Martin Standards Track [Page 41] - -RFC 5804 ManageSieve July 2010 - - - Response Code: ENCRYPT-NEEDED - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: This response code is returned in the NO response from - an AUTHENTICATE command. It indicates that site - security policy requires the use of a strong - encryption mechanism for the specified authentication - identity and mechanism. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. 
- - Response Code: QUOTA - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: If this response code is returned in the NO/BYE - response, it means that the command would have placed - the user above the site-defined quota constraints. If - this response code is returned in the OK response, it - can mean that the user is near its quota or that the - user exceeded its quota, but the server supports soft - quotas. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: QUOTA/MAXSCRIPTS - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: If this response code is returned in the NO/BYE - response, it means that the command would have placed - the user above the site-defined limit on the number of - Sieve scripts. If this response code is returned in - the OK response, it can mean that the user is near its - quota or that the user exceeded its quota, but the - server supports soft quotas. This response code is a - more specific version of the QUOTA response code. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - -Melnikov & Martin Standards Track [Page 42] - -RFC 5804 ManageSieve July 2010 - - - Response Code: QUOTA/MAXSIZE - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: If this response code is returned in the NO/BYE - response, it means that the command would have placed - the user above the site-defined maximum script size. - If this response code is returned in the OK response, - it can mean that the user is near its quota or that - the user exceeded its quota, but the server supports - soft quotas. 
This response code is a more specific - version of the QUOTA response code. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: REFERRAL - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): <sieveurl> - Purpose: This response code may be returned with a BYE result - from any command, and includes a mandatory parameter - that indicates what server to access to manage this - user's Sieve scripts. The server will be specified by - a Sieve URL (see Section 3). The scriptname portion - of the URL MUST NOT be specified. The client should - authenticate to the specified server and use it for - all further commands in the current session. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: SASL - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): <string> - Purpose: This response code can occur in the OK response to a - successful AUTHENTICATE command and includes the - optional final server response data from the server as - specified by [SASL]. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - -Melnikov & Martin Standards Track [Page 43] - -RFC 5804 ManageSieve July 2010 - - - Response Code: TRANSITION-NEEDED - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: This response code occurs in a NO response of an - AUTHENTICATE command. It indicates that the user name - is valid, but the entry in the authentication database - needs to be updated in order to permit authentication - with the specified mechanism. 
This is typically done - by establishing a secure channel using TLS, followed - by authenticating once using the [PLAIN] - authentication mechanism. The selected mechanism - SHOULD then work for authentications in subsequent - sessions. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: TRYLATER - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: A command failed due to a temporary server failure. - The client MAY continue using local information and - try the command later. This response code only makes - sense when returned in a NO/BYE response. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: ACTIVE - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: A command failed because it is not allowed on the - active script, for example, DELETESCRIPT on the active - script. This response code only makes sense when - returned in a NO/BYE response. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - - - - - -Melnikov & Martin Standards Track [Page 44] - -RFC 5804 ManageSieve July 2010 - - - Response Code: NONEXISTENT - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: A command failed because the referenced script name - doesn't exist. This response code only makes sense - when returned in a NO/BYE response. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. 
- - Response Code: ALREADYEXISTS - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: A command failed because the referenced script name - already exists. This response code only makes sense - when returned in a NO/BYE response. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: WARNINGS - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): NONE - Purpose: This response code MAY be returned by the server in - the OK response (but it might be returned with the NO/ - BYE response as well) and signals the client that even - though the script is syntactically valid, it might - contain errors not intended by the script writer. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - Response Code: TAG - Arguments (use ABNF to specify syntax, or the word NONE if none can - be specified): string - Purpose: This response code name is followed by a string - specified in the command that caused this response. - It is typically used for client state synchronization. - Published Specification(s): [RFC5804] - Person & email address to contact for further information: - Alexey Melnikov <alexey.melnikov@isode.com> - Author/Change controller: IESG. - - - - - - -Melnikov & Martin Standards Track [Page 45] - -RFC 5804 ManageSieve July 2010 - - -7. Internationalization Considerations - - The LANGUAGE capability (see Section 1.7) allows a client to discover - the current language used in all human-readable responses that might - be returned at the end of any OK/NO/BYE response. 
Human-readable - text in OK responses typically doesn't need to be shown to the user, - unless it is returned in response to a PUTSCRIPT or CHECKSCRIPT - command that also contains the WARNINGS response code (Section 1.3). - Human-readable text from NO/BYE responses is intended to be shown to the - user, unless the client can automatically handle failure of the - command that caused such a response. Clients SHOULD use response - codes (Section 1.3) for automatic error handling. Response codes MAY - also be used by the client to present error messages in a language - understood by the user, for example, if the LANGUAGE capability - doesn't return a language understood by the user. - - Note that the human-readable text from OK (WARNINGS) or NO/BYE - responses for PUTSCRIPT/CHECKSCRIPT commands is intended for advanced - users that understand Sieve language. Such advanced users are often - sophisticated enough to be able to handle whatever language the - server is using, even if it is not their preferred language, and will - want to see error/warning text no matter what language the server - puts it in. - - A client that generates Sieve script automatically, for example, if - the script is generated without user intervention or from a UI that - presents an abstract list of conditions and corresponding actions, - SHOULD NOT present warning/error messages to the user, because the - user might not even be aware that the client is using Sieve - underneath. However, if the client has a debugging mode, such - warnings/errors SHOULD be available in the debugging mode. - - Note that this document doesn't provide a way to modify the currently - used language. It is expected that a future extension will address - that. - -8. 
Acknowledgements - - Thanks to Simon Josefsson, Larry Greenfield, Allen Johnson, Chris - Newman, Lyndon Nerenberg, Tim Showalter, Sarah Robeson, Walter Wong, - Barry Leiba, Arnt Gulbrandsen, Stephan Bosch, Ken Murchison, Phil - Pennock, Ned Freed, Jeffrey Hutzelman, Mark E. Mallett, Dilyan - Palauzov, Dave Cridland, Aaron Stone, Robert Burrell Donkin, Patrick - Ben Koetter, Bjoern Hoehrmann, Martin Duerst, Pasi Eronen, Magnus - Westerlund, Tim Polk, and Julien Coloos for help with this document. - Special thank you to Phil Pennock for providing text for the NOOP - command, as well as finding various bugs in the document. - - - - -Melnikov & Martin Standards Track [Page 46] - -RFC 5804 ManageSieve July 2010 - - -9. References - -9.1. Normative References - - [ABNF] Crocker, D. and P. Overell, "Augmented BNF for Syntax - Specifications: ABNF", STD 68, RFC 5234, January 2008. - - [ACAP] Newman, C. and J. Myers, "ACAP -- Application - Configuration Access Protocol", RFC 2244, November - 1997. - - [BASE64] Josefsson, S., "The Base16, Base32, and Base64 Data - Encodings", RFC 4648, October 2006. - - [DNS-SRV] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR - for specifying the location of services (DNS SRV)", - RFC 2782, February 2000. - - [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. - - [NET-UNICODE] Klensin, J. and M. Padlipsky, "Unicode Format for - Network Interchange", RFC 5198, March 2008. - - [NOTIFY] Melnikov, A., Leiba, B., Segmuller, W., and T. Martin, - "Sieve Email Filtering: Extension for Notifications", - RFC 5435, January 2009. - - [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and - Languages", BCP 18, RFC 2277, January 1998. - - [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version - 6 (IPv6) Specification", RFC 2460, December 1998. - - [RFC3490] Faltstrom, P., Hoffman, P., and A. 
Costello, - "Internationalizing Domain Names in Applications - (IDNA)", RFC 3490, March 2003. - - [RFC4519] Sciberras, A., "Lightweight Directory Access Protocol - (LDAP): Schema for User Applications", RFC 4519, June - 2006. - - [RFC5646] Phillips, A. and M. Davis, "Tags for Identifying - Languages", BCP 47, RFC 5646, September 2009. - - [RFC791] Postel, J., "Internet Protocol", STD 5, RFC 791, - September 1981. - - - - -Melnikov & Martin Standards Track [Page 47] - -RFC 5804 ManageSieve July 2010 - - - [SASL] Melnikov, A. and K. Zeilenga, "Simple Authentication - and Security Layer (SASL)", RFC 4422, June 2006. - - [SASLprep] Zeilenga, K., "SASLprep: Stringprep Profile for User - Names and Passwords", RFC 4013, February 2005. - - [SCRAM] Menon-Sen, A., Melnikov, A., Newman, C., and N. - Williams, "Salted Challenge Response Authentication - Mechanism (SCRAM) SASL and GSS-API Mechanisms", RFC - 5802, July 2010. - - [SIEVE] Guenther, P. and T. Showalter, "Sieve: An Email - Filtering Language", RFC 5228, January 2008. - - [StringPrep] Hoffman, P. and M. Blanchet, "Preparation of - Internationalized Strings ("stringprep")", RFC 3454, - December 2002. - - [TLS] Dierks, T. and E. Rescorla, "The Transport Layer - Security (TLS) Protocol Version 1.2", RFC 5246, August - 2008. - - [URI-GEN] Berners-Lee, T., Fielding, R., and L. Masinter, - "Uniform Resource Identifier (URI): Generic Syntax", - STD 66, RFC 3986, January 2005. - - [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO - 10646", STD 63, RFC 3629, November 2003. - - [X509] Cooper, D., Santesson, S., Farrell, S., Boeyen, S., - Housley, R., and W. Polk, "Internet X.509 Public Key - Infrastructure Certificate and Certificate Revocation - List (CRL) Profile", RFC 5280, May 2008. - - [X509-SRV] Santesson, S., "Internet X.509 Public Key - Infrastructure Subject Alternative Name for Expression - of Service Name", RFC 4985, August 2007. - -9.2. Informative References - - [DIGEST-MD5] Leach, P. and C. 
Newman, "Using Digest Authentication - as a SASL Mechanism", RFC 2831, May 2000. - - [GSSAPI] Melnikov, A., "The Kerberos V5 ("GSSAPI") Simple - Authentication and Security Layer (SASL) Mechanism", - RFC 4752, November 2006. - - - - - -Melnikov & Martin Standards Track [Page 48] - -RFC 5804 ManageSieve July 2010 - - - [I-HAVE] Freed, N., "Sieve Email Filtering: Ihave Extension", - RFC 5463, March 2009. - - [IMAP] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - - VERSION 4rev1", RFC 3501, March 2003. - - [LDAP] Zeilenga, K., "Lightweight Directory Access Protocol - (LDAP): Technical Specification Road Map", RFC 4510, - June 2006. - - [PLAIN] Zeilenga, K., "The PLAIN Simple Authentication and - Security Layer (SASL) Mechanism", RFC 4616, August - 2006. - -Authors' Addresses - - Alexey Melnikov (editor) - Isode Limited - 5 Castle Business Village - 36 Station Road - Hampton, Middlesex TW12 2BX - UK - - EMail: Alexey.Melnikov@isode.com - - - Tim Martin - BeThereBeSquare, Inc. - 672 Haight st. - San Francisco, CA 94117 - USA - - Phone: +1 510 260-4175 - EMail: timmartin@alumni.cmu.edu - - - - - - - - - - - - - - - - - -Melnikov & Martin Standards Track [Page 49] - diff --git a/rfc/rfc1341.txt b/rfc/rfc1341.txt @@ -0,0 +1,5265 @@ + + + + + + + Network Working Group N. Borenstein, Bellcore + Request for Comments: 1341 N. Freed, Innosoft + June 1992 + + + + MIME (Multipurpose Internet Mail Extensions): + + + Mechanisms for Specifying and Describing + the Format of Internet Message Bodies + + + Status of this Memo + + This RFC specifies an IAB standards track protocol for the + Internet community, and requests discussion and suggestions + for improvements. Please refer to the current edition of + the "IAB Official Protocol Standards" for the + standardization state and status of this protocol. + Distribution of this memo is unlimited. 
+ + Abstract + + RFC 822 defines a message representation protocol which + specifies considerable detail about message headers, but + which leaves the message content, or message body, as flat + ASCII text. This document redefines the format of message + bodies to allow multi-part textual and non-textual message + bodies to be represented and exchanged without loss of + information. This is based on earlier work documented in + RFC 934 and RFC 1049, but extends and revises that work. + Because RFC 822 said so little about message bodies, this + document is largely orthogonal to (rather than a revision + of) RFC 822. + + In particular, this document is designed to provide + facilities to include multiple objects in a single message, + to represent body text in character sets other than US- + ASCII, to represent formatted multi-font text messages, to + represent non-textual material such as images and audio + fragments, and generally to facilitate later extensions + defining new types of Internet mail for use by cooperating + mail agents. + + This document does NOT extend Internet mail header fields to + permit anything other than US-ASCII text data. It is + recognized that such extensions are necessary, and they are + the subject of a companion document [RFC-1342]. + + A table of contents appears at the end of this document. + + + + + + + Borenstein & Freed [Page i] + + + + + + + + 1 Introduction + + Since its publication in 1982, RFC 822 [RFC-822] has defined + the standard format of textual mail messages on the + Internet. Its success has been such that the RFC 822 format + has been adopted, wholly or partially, well beyond the + confines of the Internet and the Internet SMTP transport + defined by RFC 821 [RFC-821]. As the format has seen wider + use, a number of limitations have proven increasingly + restrictive for the user community. + + RFC 822 was intended to specify a format for text messages. 
+ As such, non-text messages, such as multimedia messages that + might include audio or images, are simply not mentioned. + Even in the case of text, however, RFC 822 is inadequate for + the needs of mail users whose languages require the use of + character sets richer than US ASCII [US-ASCII]. Since RFC + 822 does not specify mechanisms for mail containing audio, + video, Asian language text, or even text in most European + languages, additional specifications are needed. + + One of the notable limitations of RFC 821/822 based mail + systems is the fact that they limit the contents of + electronic mail messages to relatively short lines of + seven-bit ASCII. This forces users to convert any non- + textual data that they may wish to send into seven-bit bytes + representable as printable ASCII characters before invoking + a local mail UA (User Agent, a program with which human + users send and receive mail). Examples of such encodings + currently used in the Internet include pure hexadecimal, + uuencode, the 3-in-4 base 64 scheme specified in RFC 1113, + the Andrew Toolkit Representation [ATK], and many others. + + The limitations of RFC 822 mail become even more apparent as + gateways are designed to allow for the exchange of mail + messages between RFC 822 hosts and X.400 hosts. X.400 [X400] + specifies mechanisms for the inclusion of non-textual body + parts within electronic mail messages. The current + standards for the mapping of X.400 messages to RFC 822 + messages specify that either X.400 non-textual body parts + should be converted to (not encoded in) an ASCII format, or + that they should be discarded, notifying the RFC 822 user + that discarding has occurred. This is clearly undesirable, + as information that a user may wish to receive is lost. + Even though a user's UA may not have the capability of + dealing with the non-textual body part, the user might have + some mechanism external to the UA that can extract useful + information from the body part. 
Moreover, it does not allow + for the fact that the message may eventually be gatewayed + back into an X.400 message handling system (i.e., the X.400 + message is "tunneled" through Internet mail), where the + non-textual information would definitely become useful + again. + + + + + Borenstein & Freed [Page 1] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + This document describes several mechanisms that combine to + solve most of these problems without introducing any serious + incompatibilities with the existing world of RFC 822 mail. + In particular, it describes: + + 1. A MIME-Version header field, which uses a version number + to declare a message to be conformant with this + specification and allows mail processing agents to + distinguish between such messages and those generated + by older or non-conformant software, which is presumed + to lack such a field. + + 2. A Content-Type header field, generalized from RFC 1049 + [RFC-1049], which can be used to specify the type and + subtype of data in the body of a message and to fully + specify the native representation (encoding) of such + data. + + 2.a. A "text" Content-Type value, which can be used to + represent textual information in a number of + character sets and formatted text description + languages in a standardized manner. + + 2.b. A "multipart" Content-Type value, which can be + used to combine several body parts, possibly of + differing types of data, into a single message. + + 2.c. An "application" Content-Type value, which can be + used to transmit application data or binary data, + and hence, among other uses, to implement an + electronic mail file transfer service. + + 2.d. A "message" Content-Type value, for encapsulating + a mail message. + + 2.e. An "image" Content-Type value, for transmitting + still image (picture) data. + + 2.f. An "audio" Content-Type value, for transmitting + audio or voice data. + + 2.g. 
A "video" Content-Type value, for transmitting + video or moving image data, possibly with audio as + part of the composite video data format. + + 3. A Content-Transfer-Encoding header field, which can be + used to specify an auxiliary encoding that was applied + to the data in order to allow it to pass through mail + transport mechanisms which may have data or character + set limitations. + + 4. Two optional header fields that can be used to further + describe the data in a message body, the Content-ID and + Content-Description header fields. + + + + Borenstein & Freed [Page 2] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + MIME has been carefully designed as an extensible mechanism, + and it is expected that the set of content-type/subtype + pairs and their associated parameters will grow + significantly with time. Several other MIME fields, notably + including character set names, are likely to have new values + defined over time. In order to ensure that the set of such + values is developed in an orderly, well-specified, and + public manner, MIME defines a registration process which + uses the Internet Assigned Numbers Authority (IANA) as a + central registry for such values. Appendix F provides + details about how IANA registration is accomplished. + + Finally, to specify and promote interoperability, Appendix A + of this document provides a basic applicability statement + for a subset of the above mechanisms that defines a minimal + level of "conformance" with this document. + + HISTORICAL NOTE: Several of the mechanisms described in + this document may seem somewhat strange or even baroque at + first reading. It is important to note that compatibility + with existing standards AND robustness across existing + practice were two of the highest priorities of the working + group that developed this document. In particular, + compatibility was always favored over elegance. 
+ + 2 Notations, Conventions, and Generic BNF Grammar + + This document is being published in two versions, one as + plain ASCII text and one as PostScript. The latter is + recommended, though the textual contents are identical. An + Andrew-format copy of this document is also available from + the first author (Borenstein). + + Although the mechanisms specified in this document are all + described in prose, most are also described formally in the + modified BNF notation of RFC 822. Implementors will need to + be familiar with this notation in order to understand this + specification, and are referred to RFC 822 for a complete + explanation of the modified BNF notation. + + Some of the modified BNF in this document makes reference to + syntactic entities that are defined in RFC 822 and not in + this document. A complete formal grammar, then, is obtained + by combining the collected grammar appendix of this document + with that of RFC 822. + + The term CRLF, in this document, refers to the sequence of + the two ASCII characters CR (13) and LF (10) which, taken + together, in this order, denote a line break in RFC 822 + mail. + + The term "character set", wherever it is used in this + document, refers to a coded character set, in the sense of + ISO character set standardization work, and must not be + + + + Borenstein & Freed [Page 3] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + misinterpreted as meaning "a set of characters." + + The term "message", when not further qualified, means either + the (complete or "top-level") message being transferred on a + network, or a message encapsulated in a body of type + "message". + + The term "body part", in this document, means one of the + parts of the body of a multipart entity. A body part has a + header and a body, so it makes sense to speak about the body + of a body part. + + The term "entity", in this document, means either a message + or a body part. 
All kinds of entities share the property + that they have a header and a body. + + The term "body", when not further qualified, means the body + of an entity, that is the body of either a message or of a + body part. + + Note: the previous four definitions are clearly circular. + This is unavoidable, since the overall structure of a MIME + message is indeed recursive. + + In this document, all numeric and octet values are given in + decimal notation. + + It must be noted that Content-Type values, subtypes, and + parameter names as defined in this document are case- + insensitive. However, parameter values are case-sensitive + unless otherwise specified for the specific parameter. + + FORMATTING NOTE: This document has been carefully formatted + for ease of reading. The PostScript version of this + document, in particular, places notes like this one, which + may be skipped by the reader, in a smaller, italicized, + font, and indents it as well. In the text version, only the + indentation is preserved, so if you are reading the text + version of this you might consider using the PostScript + version instead. However, all such notes will be indented + and preceded by "NOTE:" or some similar introduction, even + in the text version. + + The primary purpose of these non-essential notes is to + convey information about the rationale of this document, or + to place this document in the proper historical or + evolutionary context. Such information may be skipped by + those who are focused entirely on building a compliant + implementation, but may be of use to those who wish to + understand why this document is written as it is. + + For ease of recognition, all BNF definitions have been + placed in a fixed-width font in the PostScript version of + this document. 
+ + 3 The MIME-Version Header Field + + Since RFC 822 was published in 1982, there has really been + only one format standard for Internet messages, and there + has been little perceived need to declare the format + standard in use. This document is an independent document + that complements RFC 822. Although the extensions in this + document have been defined in such a way as to be compatible + with RFC 822, there are still circumstances in which it + might be desirable for a mail-processing agent to know + whether a message was composed with the new standard in + mind. + + Therefore, this document defines a new header field, + "MIME-Version", which is to be used to declare the version of the + Internet message body format standard in use. + + Messages composed in accordance with this document MUST + include such a header field, with the following verbatim + text: + + MIME-Version: 1.0 + + The presence of this header field is an assertion that the + message has been composed in compliance with this document. + + Since it is possible that a future document might extend the + message format standard again, a formal BNF is given for the + content of the MIME-Version field: + + MIME-Version := text + + Thus, future format specifiers, which might replace or + extend "1.0", are (minimally) constrained by the definition + of "text", which appears in RFC 822. + + Note that the MIME-Version header field is required at the + top level of a message. It is not required for each body + part of a multipart entity. It is required for the embedded + headers of a body of type "message" if and only if the + embedded message is itself claimed to be MIME-compliant. 
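As an illustration of the rule above, the following minimal sketch shows how a receiving agent might test for the MIME-Version declaration. It is not part of the specification. It assumes the header has already been split into (name, value) pairs by some RFC 822 header parser; the function names `mime_version` and `is_mime_message` are hypothetical.

```python
def mime_version(headers):
    """Return the value of the MIME-Version field, or None if it is absent.

    `headers` is assumed to be a list of (name, value) pairs produced by a
    hypothetical RFC 822 header parser, with folded lines already joined.
    """
    for name, value in headers:
        # RFC 822 field names are matched without regard to case.
        if name.lower() == "mime-version":
            return value.strip()
    return None


def is_mime_message(headers):
    """True when the top-level header carries the verbatim text "1.0"."""
    return mime_version(headers) == "1.0"
```

A message lacking the field is not rejected by this sketch; it is simply reported as not claiming MIME compliance, which leaves the fallback policy to the caller.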
+ + 4 The Content-Type Header Field + + The purpose of the Content-Type field is to describe the + data contained in the body fully enough that the receiving + user agent can pick an appropriate agent or mechanism to + present the data to the user, or otherwise deal with the + data in an appropriate manner. + + HISTORICAL NOTE: The Content-Type header field was first + defined in RFC 1049. RFC 1049 Content-types used a simpler + and less powerful syntax, but one that is largely compatible + with the mechanism given here. + + The Content-Type header field is used to specify the nature + of the data in the body of an entity, by giving type and + subtype identifiers, and by providing auxiliary information + that may be required for certain types. After the type and + subtype names, the remainder of the header field is simply a + set of parameters, specified in an attribute/value notation. + The set of meaningful parameters differs for the different + types. The ordering of parameters is not significant. + Among the defined parameters is a "charset" parameter by + which the character set used in the body may be declared. + Comments are allowed in accordance with RFC 822 rules for + structured header fields. + + In general, the top-level Content-Type is used to declare + the general type of data, while the subtype specifies a + specific format for that type of data. Thus, a Content-Type + of "image/xyz" is enough to tell a user agent that the data + is an image, even if the user agent has no knowledge of the + specific image format "xyz". Such information can be used, + for example, to decide whether or not to show a user the raw + data from an unrecognized subtype -- such an action might be + reasonable for unrecognized subtypes of text, but not for + unrecognized subtypes of image or audio. 
For this reason, + registered subtypes of audio, image, text, and video, should + not contain embedded information that is really of a + different type. Such compound types should be represented + using the "multipart" or "application" types. + + Parameters are modifiers of the content-subtype, and do not + fundamentally affect the requirements of the host system. + Although most parameters make sense only with certain + content-types, others are "global" in the sense that they + might apply to any subtype. For example, the "boundary" + parameter makes sense only for the "multipart" content-type, + but the "charset" parameter might make sense with several + content-types. + + An initial set of seven Content-Types is defined by this + document. This set of top-level names is intended to be + substantially complete. It is expected that additions to + the larger set of supported types can generally be + + + + Borenstein & Freed [Page 6] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + accomplished by the creation of new subtypes of these + initial types. In the future, more top-level types may be + defined only by an extension to this standard. If another + primary type is to be used for any reason, it must be given + a name starting with "X-" to indicate its non-standard + status and to avoid a potential conflict with a future + official name. 
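The type, subtype, and attribute/value parameter structure described in this section can be sketched as a small parser. This is an illustrative sketch only, not a reference implementation: the name `parse_content_type` is hypothetical, RFC 822 comments are not handled, and the token test is simplified in that it does not reject non-ASCII characters.

```python
import re

# Characters that may not appear in a token (the "tspecials" of this section).
TSPECIALS = '()<>@,;:\\"/[]?.='
TOKEN = r"[^\s\x00-\x1f\x7f" + re.escape(TSPECIALS) + r"]+"

HEAD = re.compile(r"\s*(" + TOKEN + r")\s*/\s*(" + TOKEN + r")")
PARAM = re.compile(
    r"\s*;\s*(" + TOKEN + r")\s*=\s*"
    r'(?:(' + TOKEN + r')|"((?:[^"\\]|\\.)*)")'
)


def parse_content_type(value):
    """Split a Content-Type field value into (type, subtype, parameters)."""
    head = HEAD.match(value)
    if head is None:
        # A subtype is mandatory; there are no default subtypes.
        raise ValueError("expected type/subtype")
    # Type, subtype, and parameter names are not case sensitive.
    main_type, subtype = head.group(1).lower(), head.group(2).lower()
    params = {}
    pos = head.end()
    while pos < len(value):
        match = PARAM.match(value, pos)
        if match is None:
            if not value[pos:].strip():
                break  # only trailing white space is left
            raise ValueError("malformed parameter near: " + value[pos:])
        name = match.group(1).lower()
        if match.group(2) is not None:
            params[name] = match.group(2)
        else:
            # Undo backslash quoting inside a quoted-string.
            params[name] = re.sub(r"\\(.)", r"\1", match.group(3))
        pos = match.end()
    return main_type, subtype, params
```

Parameter values are returned unchanged, since their case sensitivity depends on the parameter; a caller would lower-case, for example, a "charset" value itself if it wants a case-insensitive comparison.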
+ + In the Extended BNF notation of RFC 822, a Content-Type + header field value is defined as follows: + + Content-Type := type "/" subtype *[";" parameter] + + type := "application" / "audio" + / "image" / "message" + / "multipart" / "text" + / "video" / x-token + + x-token := <The two characters "X-" followed, with no + intervening white space, by any token> + + subtype := token + + parameter := attribute "=" value + + attribute := token + + value := token / quoted-string + + token := 1*<any CHAR except SPACE, CTLs, or tspecials> + + tspecials := "(" / ")" / "<" / ">" / "@" ; Must be in + / "," / ";" / ":" / "\" / <"> ; quoted-string, + / "/" / "[" / "]" / "?" / "." ; to use within + / "=" ; parameter values + + Note that the definition of "tspecials" is the same as the + RFC 822 definition of "specials" with the addition of the + three characters "/", "?", and "=". + + Note also that a subtype specification is MANDATORY. There + are no default subtypes. + + The type, subtype, and parameter names are not case + sensitive. For example, TEXT, Text, and TeXt are all + equivalent. Parameter values are normally case sensitive, + but certain parameters are interpreted to be case- + insensitive, depending on the intended use. (For example, + multipart boundaries are case-sensitive, but the "access- + type" for message/External-body is not case-sensitive.) + + Beyond this syntax, the only constraint on the definition of + subtype names is the desire that their uses must not + conflict. That is, it would be undesirable to have two + + + + Borenstein & Freed [Page 7] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + different communities using "Content-Type: + application/foobar" to mean two different things. The + process of defining new content-subtypes, then, is not + intended to be a mechanism for imposing restrictions, but + simply a mechanism for publicizing the usages. 
There are, + therefore, two acceptable mechanisms for defining new + Content-Type subtypes: + + 1. Private values (starting with "X-") may be + defined bilaterally between two cooperating + agents without outside registration or + standardization. + + 2. New standard values must be documented, + registered with, and approved by IANA, as + described in Appendix F. Where intended for + public use, the formats they refer to must + also be defined by a published specification, + and possibly offered for standardization. + + The seven standard initial predefined Content-Types are + detailed in the bulk of this document. They are: + + text -- textual information. The primary subtype, + "plain", indicates plain (unformatted) text. No + special software is required to get the full + meaning of the text, aside from support for the + indicated character set. Subtypes are to be used + for enriched text in forms where application + software may enhance the appearance of the text, + but such software must not be required in order to + get the general idea of the content. Possible + subtypes thus include any readable word processor + format. A very simple and portable subtype, + richtext, is defined in this document. + multipart -- data consisting of multiple parts of + independent data types. Four initial subtypes + are defined, including the primary "mixed" + subtype, "alternative" for representing the same + data in multiple formats, "parallel" for parts + intended to be viewed simultaneously, and "digest" + for multipart entities in which each part is of + type "message". + message -- an encapsulated message. A body of + Content-Type "message" is itself a fully formatted + RFC 822 conformant message which may contain its + own different Content-Type header field. The + primary subtype is "rfc822". 
The "partial" + subtype is defined for partial messages, to permit + the fragmented transmission of bodies that are + thought to be too large to be passed through mail + transport facilities. Another subtype, + "External-body", is defined for specifying large + bodies by reference to an external data source. + image -- image data. Image requires a display device + (such as a graphical display, a printer, or a FAX + machine) to view the information. Initial + subtypes are defined for two widely-used image + formats, jpeg and gif. + audio -- audio data, with initial subtype "basic". + Audio requires an audio output device (such as a + speaker or a telephone) to "display" the contents. + video -- video data. Video requires the capability to + display moving images, typically including + specialized hardware and software. The initial + subtype is "mpeg". + application -- some other kind of data, typically + either uninterpreted binary data or information to + be processed by a mail-based application. The + primary subtype, "octet-stream", is to be used in + the case of uninterpreted binary data, in which + case the simplest recommended action is to offer + to write the information into a file for the user. + Two additional subtypes, "ODA" and "PostScript", + are defined for transporting ODA and PostScript + documents in bodies. Other expected uses for + "application" include spreadsheets, data for + mail-based scheduling systems, and languages for + "active" (computational) email. (Note that active + email entails several security considerations, + which are discussed later in this memo, + particularly in the context of + application/PostScript.) + + Default RFC 822 messages are typed by this protocol as plain + text in the US-ASCII character set, which can be explicitly + specified as "Content-type: text/plain; charset=us-ascii". 
+ If no Content-Type is specified, either by error or by an + older user agent, this default is assumed. In the presence + of a MIME-Version header field, a receiving User Agent can + also assume that plain US-ASCII text was the sender's + intent. In the absence of a MIME-Version specification, + plain US-ASCII text must still be assumed, but the sender's + intent might have been otherwise. + + RATIONALE: In the absence of any Content-Type header field + or MIME-Version header field, it is impossible to be certain + that a message is actually text in the US-ASCII character + set, since it might well be a message that, using the + conventions that predate this document, includes text in + another character set or non-textual data in a manner that + cannot be automatically recognized (e.g., a uuencoded + compressed UNIX tar file). Although there is no fully + acceptable alternative to treating such untyped messages as + "text/plain; charset=us-ascii", implementors should remain + aware that if a message lacks both the MIME-Version and the + Content-Type header fields, it may in practice contain + almost anything. + + + + Borenstein & Freed [Page 9] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + It should be noted that the list of Content-Type values + given here may be augmented in time, via the mechanisms + described above, and that the set of subtypes is expected to + grow substantially. + + When a mail reader encounters mail with an unknown Content- + type value, it should generally treat it as equivalent to + "application/octet-stream", as described later in this + document. + + 5 The Content-Transfer-Encoding Header Field + + Many Content-Types which could usefully be transported via + email are represented, in their "natural" format, as 8-bit + character or binary data. Such data cannot be transmitted + over some transport protocols. For example, RFC 821 + restricts mail messages to 7-bit US-ASCII data with 1000 + character lines. 
+ + It is necessary, therefore, to define a standard mechanism + for re-encoding such data into a 7-bit short-line format. + This document specifies that such encodings will be + indicated by a new "Content-Transfer-Encoding" header field. + The Content-Transfer-Encoding field is used to indicate the + type of transformation that has been used in order to + represent the body in an acceptable manner for transport. + + Unlike Content-Types, a proliferation of Content-Transfer- + Encoding values is undesirable and unnecessary. However, + establishing only a single Content-Transfer-Encoding + mechanism does not seem possible. There is a tradeoff + between the desire for a compact and efficient encoding of + largely-binary data and the desire for a readable encoding + of data that is mostly, but not entirely, 7-bit data. For + this reason, at least two encoding mechanisms are necessary: + a "readable" encoding and a "dense" encoding. + + The Content-Transfer-Encoding field is designed to specify + an invertible mapping between the "native" representation of + a type of data and a representation that can be readily + exchanged using 7 bit mail transport protocols, such as + those defined by RFC 821 (SMTP). This field has not been + defined by any previous standard. The field's value is a + single token specifying the type of encoding, as enumerated + below. Formally: + + Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" / + "8BIT" / "7BIT" / + "BINARY" / x-token + + These values are not case sensitive. That is, Base64 and + BASE64 and bAsE64 are all equivalent. An encoding type of + 7BIT requires that the body is already in a seven-bit mail- + ready representation. This is the default value -- that is, + + + + Borenstein & Freed [Page 10] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + "Content-Transfer-Encoding: 7BIT" is assumed if the + Content-Transfer-Encoding header field is not present. 
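The grammar and the default described above can be sketched as a small normalising function. This is an illustration under stated assumptions rather than a definitive implementation: the name `normalize_transfer_encoding` is hypothetical, and the caller is assumed to pass None when the header field is absent.

```python
STANDARD_ENCODINGS = {"base64", "quoted-printable", "8bit", "7bit", "binary"}


def normalize_transfer_encoding(field_value=None):
    """Return the lower-cased Content-Transfer-Encoding token.

    A missing field (None) yields the default, "7bit". Values are compared
    without regard to case. Anything that is neither one of the five
    standard values nor an "X-" token is rejected.
    """
    if field_value is None:
        return "7bit"
    token = field_value.strip().lower()
    if token in STANDARD_ENCODINGS:
        return token
    if token.startswith("x-") and len(token) > 2:
        return token  # private x-token, e.g. "x-my-new-encoding"
    raise ValueError("unknown Content-Transfer-Encoding: " + field_value)
```

Returning the lower-cased token lets later code compare against a single spelling, so that Base64, BASE64, and bAsE64 are all handled alike.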
+ + The values "8bit", "7bit", and "binary" all imply that NO + encoding has been performed. However, they are potentially + useful as indications of the kind of data contained in the + object, and therefore of the kind of encoding that might + need to be performed for transmission in a given transport + system. "7bit" means that the data is all represented as + short lines of US-ASCII data. "8bit" means that the lines + are short, but there may be non-ASCII characters (octets + with the high-order bit set). "Binary" means that not only + may non-ASCII characters be present, but also that the lines + are not necessarily short enough for SMTP transport. + + The difference between "8bit" (or any other conceivable + bit-width token) and the "binary" token is that "binary" + does not require adherence to any limits on line length or + to the SMTP CRLF semantics, while the bit-width tokens do + require such adherence. If the body contains data in any + bit-width other than 7-bit, the appropriate bit-width + Content-Transfer-Encoding token must be used (e.g., "8bit" + for unencoded 8 bit wide data). If the body contains binary + data, the "binary" Content-Transfer-Encoding token must be + used. + + NOTE: The distinction between the Content-Transfer-Encoding + values of "binary," "8bit," etc. may seem unimportant, in + that all of them really mean "none" -- that is, there has + been no encoding of the data for transport. However, clear + labeling will be of enormous value to gateways between + future mail transport systems with differing capabilities in + transporting data that do not meet the restrictions of RFC + 821 transport. + + As of the publication of this document, there are no + standardized Internet transports for which it is legitimate + to include unencoded 8-bit or binary data in mail bodies. + Thus there are no circumstances in which the "8bit" or + "binary" Content-Transfer-Encoding is actually legal on the + Internet. 
However, in the event that 8-bit or binary mail + transport becomes a reality in Internet mail, or when this + document is used in conjunction with any other 8-bit or + binary-capable transport mechanism, 8-bit or binary bodies + should be labeled as such using this mechanism. + + NOTE: The five values defined for the Content-Transfer- + Encoding field imply nothing about the Content-Type other + than the algorithm by which it was encoded or the transport + system requirements if unencoded. + + Implementors may, if necessary, define new Content- + Transfer-Encoding values, but must use an x-token, which is + a name prefixed by "X-" to indicate its non-standard status, + + + + Borenstein & Freed [Page 11] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + e.g., "Content-Transfer-Encoding: x-my-new-encoding". + However, unlike Content-Types and subtypes, the creation of + new Content-Transfer-Encoding values is explicitly and + strongly discouraged, as it seems likely to hinder + interoperability with little potential benefit. Their use + is allowed only as the result of an agreement between + cooperating user agents. + + If a Content-Transfer-Encoding header field appears as part + of a message header, it applies to the entire body of that + message. If a Content-Transfer-Encoding header field + appears as part of a body part's headers, it applies only to + the body of that body part. If an entity is of type + "multipart" or "message", the Content-Transfer-Encoding is + not permitted to have any value other than a bit width + (e.g., "7bit", "8bit", etc.) or "binary". + + It should be noted that email is character-oriented, so that + the mechanisms described here are mechanisms for encoding + arbitrary byte streams, not bit streams. 
If a bit stream is + to be encoded via one of these mechanisms, it must first be + converted to an 8-bit byte stream using the network standard + bit order ("big-endian"), in which the earlier bits in a + stream become the higher-order bits in a byte. A bit stream + not ending at an 8-bit boundary must be padded with zeroes. + This document provides a mechanism for noting the addition + of such padding in the case of the application Content-Type, + which has a "padding" parameter. + + The encoding mechanisms defined here explicitly encode all + data in ASCII. Thus, for example, suppose an entity has + header fields such as: + + Content-Type: text/plain; charset=ISO-8859-1 + Content-transfer-encoding: base64 + + This should be interpreted to mean that the body is a base64 + ASCII encoding of data that was originally in ISO-8859-1, + and will be in that character set again after decoding. + + The following sections will define the two standard encoding + mechanisms. The definition of new content-transfer- + encodings is explicitly discouraged and should only occur + when absolutely necessary. All content-transfer-encoding + namespace except that beginning with "X-" is explicitly + reserved to the IANA for future use. Private agreements + about content-transfer-encodings are also explicitly + discouraged. + + Certain Content-Transfer-Encoding values may only be used on + certain Content-Types. In particular, it is expressly + forbidden to use any encodings other than "7bit", "8bit", or + "binary" with any Content-Type that recursively includes + other Content-Type fields, notably the "multipart" and + + + + Borenstein & Freed [Page 12] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + "message" Content-Types. All encodings that are desired for + bodies of type multipart or message must be done at the + innermost level, by encoding the actual body that needs to + be encoded. 
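The restriction on "multipart" and "message" entities stated above can be sketched as a simple check. The function name `encoding_permitted` is hypothetical, and the sketch assumes the caller supplies the raw Content-Type value and the Content-Transfer-Encoding token (defaulting to "7bit" when the field is absent).

```python
# Values that imply no encoding has been performed.
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}


def encoding_permitted(content_type, transfer_encoding="7bit"):
    """Return False when an encoding is applied to a composite entity.

    Entities of type "multipart" or "message" may only carry "7bit",
    "8bit", or "binary"; any real encoding must be applied to the
    innermost bodies instead.
    """
    main_type = content_type.split("/", 1)[0].strip().lower()
    encoding = transfer_encoding.strip().lower()
    if main_type in ("multipart", "message"):
        return encoding in IDENTITY_ENCODINGS
    return True
```

A composing agent could apply such a check before emitting an entity, and a receiving agent could use it to flag a message that nests encodings.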
+ + NOTE ON ENCODING RESTRICTIONS: Though the prohibition + against using content-transfer-encodings on data of type + multipart or message may seem overly restrictive, it is + necessary to prevent nested encodings, in which data are + passed through an encoding algorithm multiple times, and + must be decoded multiple times in order to be properly + viewed. Nested encodings add considerable complexity to + user agents: aside from the obvious efficiency problems + with such multiple encodings, they can obscure the basic + structure of a message. In particular, they can imply that + several decoding operations are necessary simply to find out + what types of objects a message contains. Banning nested + encodings may complicate the job of certain mail gateways, + but this seems less of a problem than the effect of nested + encodings on user agents. + + NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT- + TRANSFER-ENCODING: It may seem that the Content-Transfer- + Encoding could be inferred from the characteristics of the + Content-Type that is to be encoded, or, at the very least, + that certain Content-Transfer-Encodings could be mandated + for use with specific Content-Types. There are several + reasons why this is not the case. First, given the varying + types of transports used for mail, some encodings may be + appropriate for some Content-Type/transport combinations and + not for others. (For example, in an 8-bit transport, no + encoding would be required for text in certain character + sets, while such encodings are clearly required for 7-bit + SMTP.) Second, certain Content-Types may require different + types of transfer encoding under different circumstances. + For example, many PostScript bodies might consist entirely + of short lines of 7-bit data and hence require little or no + encoding. 
Other PostScript bodies (especially those using + Level 2 PostScript's binary encoding mechanism) may only be + reasonably represented using a binary transport encoding. + Finally, since Content-Type is intended to be an open-ended + specification mechanism, strict specification of an + association between Content-Types and encodings effectively + couples the specification of an application protocol with a + specific lower-level transport. This is not desirable since + the developers of a Content-Type should not have to be aware + of all the transports in use and what their limitations are. + + NOTE ON TRANSLATING ENCODINGS: The quoted-printable and + base64 encodings are designed so that conversion between + them is possible. The only issue that arises in such a + conversion is the handling of line breaks. When converting + from quoted-printable to base64 a line break must be + converted into a CRLF sequence. Similarly, a CRLF sequence + + + + Borenstein & Freed [Page 13] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + in base64 data should be converted to a quoted-printable + line break, but ONLY when converting text data. + + NOTE ON CANONICAL ENCODING MODEL: There was some + confusion, in earlier drafts of this memo, regarding the + model for when email data was to be converted to canonical + form and encoded, and in particular how this process would + affect the treatment of CRLFs, given that the representation + of newlines varies greatly from system to system. For this + reason, a canonical model for encoding is presented as + Appendix H. + + 5.1 Quoted-Printable Content-Transfer-Encoding + + The Quoted-Printable encoding is intended to represent data + that largely consists of octets that correspond to printable + characters in the ASCII character set. It encodes the data + in such a way that the resulting octets are unlikely to be + modified by mail transport. 
If the data being encoded are + mostly ASCII text, the encoded form of the data remains + largely recognizable by humans. A body which is entirely + ASCII may also be encoded in Quoted-Printable to ensure the + integrity of the data should the message pass through a + character-translating, and/or line-wrapping gateway. + + In this encoding, octets are to be represented as determined + by the following rules: + + Rule #1: (General 8-bit representation) Any octet, + except those indicating a line break according to the + newline convention of the canonical form of the data + being encoded, may be represented by an "=" followed by + a two digit hexadecimal representation of the octet's + value. The digits of the hexadecimal alphabet, for this + purpose, are "0123456789ABCDEF". Uppercase letters must be + used when sending hexadecimal data, though a robust + implementation may choose to recognize lowercase + letters on receipt. Thus, for example, the value 12 + (ASCII form feed) can be represented by "=0C", and the + value 61 (ASCII EQUAL SIGN) can be represented by + "=3D". Except when the following rules allow an + alternative encoding, this rule is mandatory. + + Rule #2: (Literal representation) Octets with decimal + values of 33 through 60 inclusive, and 62 through 126, + inclusive, MAY be represented as the ASCII characters + which correspond to those octets (EXCLAMATION POINT + through LESS THAN, and GREATER THAN through TILDE, + respectively). + + Rule #3: (White Space): Octets with values of 9 and 32 + MAY be represented as ASCII TAB (HT) and SPACE + characters, respectively, but MUST NOT be so + represented at the end of an encoded line. Any TAB (HT) + or SPACE characters on an encoded line MUST thus be + followed on that line by a printable character. 
In + particular, an "=" at the end of an encoded line, + indicating a soft line break (see rule #5) may follow + one or more TAB (HT) or SPACE characters. It follows + that an octet with value 9 or 32 appearing at the end + of an encoded line must be represented according to + Rule #1. This rule is necessary because some MTAs + (Message Transport Agents, programs which transport + messages from one user to another, or perform a part of + such transfers) are known to pad lines of text with + SPACEs, and others are known to remove "white space" + characters from the end of a line. Therefore, when + decoding a Quoted-Printable body, any trailing white + space on a line must be deleted, as it will necessarily + have been added by intermediate transport agents. + + Rule #4 (Line Breaks): A line break in a text body + part, independent of what its representation is + following the canonical representation of the data + being encoded, must be represented by a (RFC 822) line + break, which is a CRLF sequence, in the Quoted-Printable + encoding. If isolated CRs and LFs, or LF CR + and CR LF sequences are allowed to appear in binary + data according to the canonical form, they must be + represented using the "=0D", "=0A", "=0A=0D" and + "=0D=0A" notations respectively. + + Note that many implementations may elect to encode the + local representation of various content types directly. + In particular, this may apply to plain text material on + systems that use newline conventions other than CRLF + delimiters. Such an implementation is permissible, but + the generation of line breaks must be generalized to + account for the case where alternate representations of + newline sequences are used. + + Rule #5 (Soft Line Breaks): The Quoted-Printable + encoding REQUIRES that encoded lines be no more than 76 + characters long. If longer lines are to be encoded with + the Quoted-Printable encoding, 'soft' line breaks must + be used. 
An equal sign as the last character on an + encoded line indicates such a non-significant ('soft') + line break in the encoded text. Thus if the "raw" form + of the line is a single unencoded line that says: + + Now's the time for all folk to come to the aid of + their country. + + This can be represented, in the Quoted-Printable + encoding, as + + Now's the time = + for all folk to come= + to the aid of their country. + + This provides a mechanism with which long lines are + encoded in such a way as to be restored by the user + agent. The 76 character limit does not count the + trailing CRLF, but counts all other characters, + including any equal signs. + + Since the hyphen character ("-") is represented as itself in + the Quoted-Printable encoding, care must be taken, when + encapsulating a quoted-printable encoded body in a multipart + entity, to ensure that the encapsulation boundary does not + appear anywhere in the encoded body. (A good strategy is to + choose a boundary that includes a character sequence such as + "=_" which can never appear in a quoted-printable body. See + the definition of multipart messages later in this + document.) + + NOTE: The quoted-printable encoding represents something of + a compromise between readability and reliability in + transport. Bodies encoded with the quoted-printable + encoding will work reliably over most mail gateways, but may + not work perfectly over a few gateways, notably those + involving translation into EBCDIC. (In theory, an EBCDIC + gateway could decode a quoted-printable body and re-encode + it using base64, but such gateways do not yet exist.) A + higher level of confidence is offered by the base64 + Content-Transfer-Encoding. A way to get reasonably reliable + transport through EBCDIC gateways is to also quote the ASCII + characters + + !"#$@[\]^`{|}~ + + according to rule #1. 
See Appendix B for more information. + + Because quoted-printable data is generally assumed to be + line-oriented, it is to be expected that the breaks between + the lines of quoted printable data may be altered in + transport, in the same manner that plain text mail has + always been altered in Internet mail when passing between + systems with differing newline conventions. If such + alterations are likely to constitute a corruption of the + data, it is probably more sensible to use the base64 + encoding rather than the quoted-printable encoding. + + + + + + + + + + + + Borenstein & Freed [Page 16] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + 5.2 Base64 Content-Transfer-Encoding + + The Base64 Content-Transfer-Encoding is designed to + represent arbitrary sequences of octets in a form that is + not humanly readable. The encoding and decoding algorithms + are simple, but the encoded data are consistently only about + 33 percent larger than the unencoded data. This encoding is + based on the one used in Privacy Enhanced Mail applications, + as defined in RFC 1113. The base64 encoding is adapted + from RFC 1113, with one change: base64 eliminates the "*" + mechanism for embedded clear text. + + A 65-character subset of US-ASCII is used, enabling 6 bits + to be represented per printable character. (The extra 65th + character, "=", is used to signify a special processing + function.) + + NOTE: This subset has the important property that it is + represented identically in all versions of ISO 646, + including US ASCII, and all characters in the subset are + also represented identically in all versions of EBCDIC. + Other popular encodings, such as the encoding used by the + UUENCODE utility and the base85 encoding specified as part + of Level 2 PostScript, do not share these properties, and + thus do not fulfill the portability requirements a binary + transport encoding for mail must meet. 
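The figure of "about 33 percent" follows from the fact that every group of 3 input octets becomes 4 output characters, a ratio of 4/3. The short sketch below works this out for a given input size. The function name `base64_encoded_size` is hypothetical, and the line count assumes output lines of at most 76 characters, the limit this section imposes on encoded output.

```python
def base64_encoded_size(n_octets, line_length=76):
    """Return (characters, lines) needed to base64-encode n_octets of input.

    Each started group of 3 octets yields exactly 4 characters (including
    any "=" padding), so the character count is 4 * ceil(n_octets / 3).
    Line terminators are not included in the character count.
    """
    quanta = (n_octets + 2) // 3          # ceil(n_octets / 3)
    characters = 4 * quanta
    lines = (characters + line_length - 1) // line_length
    return characters, lines
```

For example, 300 octets encode to 400 characters on 6 lines; each line break then adds its own terminator on top of the 4/3 growth.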
+ + The encoding process represents 24-bit groups of input bits + as output strings of 4 encoded characters. Proceeding from + left to right, a 24-bit input group is formed by + concatenating 3 8-bit input groups. These 24 bits are then + treated as 4 concatenated 6-bit groups, each of which is + translated into a single digit in the base64 alphabet. When + encoding a bit stream via the base64 encoding, the bit + stream must be presumed to be ordered with the + most-significant-bit first. That is, the first bit in the stream + will be the high-order bit in the first byte, and the eighth + bit will be the low-order bit in the first byte, and so on. + + Each 6-bit group is used as an index into an array of 64 + printable characters. The character referenced by the index + is placed in the output string. These characters, identified + in Table 1, below, are selected so as to be universally + representable, and the set excludes characters with + particular significance to SMTP (e.g., ".", "CR", "LF") and + to the encapsulation boundaries defined in this document + (e.g., "-"). + + Table 1: The Base64 Alphabet + + Value Encoding Value Encoding Value Encoding Value Encoding + 0 A 17 R 34 i 51 z + 1 B 18 S 35 j 52 0 + 2 C 19 T 36 k 53 1 + 3 D 20 U 37 l 54 2 + 4 E 21 V 38 m 55 3 + 5 F 22 W 39 n 56 4 + 6 G 23 X 40 o 57 5 + 7 H 24 Y 41 p 58 6 + 8 I 25 Z 42 q 59 7 + 9 J 26 a 43 r 60 8 + 10 K 27 b 44 s 61 9 + 11 L 28 c 45 t 62 + + 12 M 29 d 46 u 63 / + 13 N 30 e 47 v + 14 O 31 f 48 w (pad) = + 15 P 32 g 49 x + 16 Q 33 h 50 y + + The output stream (encoded bytes) must be represented in + lines of no more than 76 characters each. All line breaks + or other characters not found in Table 1 must be ignored by + decoding software. 
In base64 data, characters other than + those in Table 1, line breaks, and other white space + probably indicate a transmission error, about which a + warning message or even a message rejection might be + appropriate under some circumstances. + + Special processing is performed if fewer than 24 bits are + available at the end of the data being encoded. A full + encoding quantum is always completed at the end of a body. + When fewer than 24 input bits are available in an input + group, zero bits are added (on the right) to form an + integral number of 6-bit groups. Output character positions + which are not required to represent actual input data are + set to the character "=". Since all base64 input is an + integral number of octets, only the following cases can + arise: (1) the final quantum of encoding input is an + integral multiple of 24 bits; here, the final unit of + encoded output will be an integral multiple of 4 characters + with no "=" padding, (2) the final quantum of encoding input + is exactly 8 bits; here, the final unit of encoded output + will be two characters followed by two "=" padding + characters, or (3) the final quantum of encoding input is + exactly 16 bits; here, the final unit of encoded output will + be three characters followed by one "=" padding character. + + Care must be taken to use the proper octets for line breaks + if base64 encoding is applied directly to text material that + has not been converted to canonical form. In particular, + text line breaks should be converted into CRLF sequences + prior to base64 encoding. The important thing to note is + that this may be done directly by the encoder rather than in + a prior canonicalization step in some implementations. 
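The grouping and padding rules above can be sketched in C. This is a minimal illustration under stated assumptions, not code taken from this repository: the names b64_alphabet and b64_encode are hypothetical, and the function deliberately omits both the folding of output into lines of at most 76 characters and the CRLF canonicalization of text input, which a complete encoder would also need.

```c
#include <stddef.h>

/* Table 1 of the text: index 0..63 -> output character. */
static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Encode `len` octets from `in` into `out` as one unbroken,
 * NUL-terminated string (hypothetical helper; no line folding).
 * `out` must have room for 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the number of characters written, excluding the NUL.
 */
size_t b64_encode(const unsigned char *in, size_t len, char *out)
{
    size_t i = 0, o = 0;
    unsigned long q;

    /* Case (1): full 24-bit groups, three octets -> four 6-bit indices. */
    while (len - i >= 3) {
        q = ((unsigned long)in[i] << 16) |
            ((unsigned long)in[i + 1] << 8) |
            (unsigned long)in[i + 2];
        out[o++] = b64_alphabet[(q >> 18) & 0x3f];
        out[o++] = b64_alphabet[(q >> 12) & 0x3f];
        out[o++] = b64_alphabet[(q >> 6) & 0x3f];
        out[o++] = b64_alphabet[q & 0x3f];
        i += 3;
    }

    if (len - i == 1) {
        /* Case (2): 8 bits left; zero bits are added on the right. */
        q = (unsigned long)in[i] << 16;
        out[o++] = b64_alphabet[(q >> 18) & 0x3f];
        out[o++] = b64_alphabet[(q >> 12) & 0x3f];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {
        /* Case (3): 16 bits left; zero bits are added on the right. */
        q = ((unsigned long)in[i] << 16) |
            ((unsigned long)in[i + 1] << 8);
        out[o++] = b64_alphabet[(q >> 18) & 0x3f];
        out[o++] = b64_alphabet[(q >> 12) & 0x3f];
        out[o++] = b64_alphabet[(q >> 6) & 0x3f];
        out[o++] = '=';
    }

    out[o] = '\0';
    return o;
}
```

With this sketch the three-octet input "Man" encodes to "TWFu", the two-octet input "Ma" to "TWE=", and the single octet "M" to "TQ==", matching cases (1), (3) and (2) respectively.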
+ + NOTE: There is no need to worry about quoting apparent + encapsulation boundaries within base64-encoded parts of + multipart entities because no hyphen characters are used in + the base64 encoding. + + 6 Additional Optional Content- Header Fields + + 6.1 Optional Content-ID Header Field + + In constructing a high-level user agent, it may be desirable + to allow one body to make reference to another. + Accordingly, bodies may be labeled using the "Content-ID" + header field, which is syntactically identical to the + "Message-ID" header field: + + Content-ID := msg-id + + Like the Message-ID values, Content-ID values must be + generated to be as unique as possible. + + 6.2 Optional Content-Description Header Field + + The ability to associate some descriptive information with a + given body is often desirable. For example, it may be useful + to mark an "image" body as "a picture of the Space Shuttle + Endeavour." Such text may be placed in the Content- + Description header field. + + Content-Description := *text + + The description is presumed to be given in the US-ASCII + character set, although the mechanism specified in [RFC- + 1342] may be used for non-US-ASCII Content-Description + values. + + 7 The Predefined Content-Type Values + + This document defines seven initial Content-Type values and + an extension mechanism for private or experimental types. + Further standard types must be defined by new published + specifications. It is expected that most innovation in new + types of mail will take place as subtypes of the seven types + defined here. The most essential characteristics of the + seven content-types are summarized in Appendix G. + + 7.1 The Text Content-Type + + The text Content-Type is intended for sending material which + is principally textual in form. It is the default Content- + Type. 
A "charset" parameter may be used to indicate the + character set of the body text. The primary subtype of text + is "plain". This indicates plain (unformatted) text. The + default Content-Type for Internet mail is "text/plain; + charset=us-ascii". + + Beyond plain text, there are many formats for representing + what might be known as "extended text" -- text with embedded + formatting and presentation information. An interesting + characteristic of many such representations is that they are + to some extent readable even without the software that + interprets them. It is useful, then, to distinguish them, + at the highest level, from such unreadable data as images, + audio, or text represented in an unreadable form. In the + absence of appropriate interpretation software, it is + reasonable to show subtypes of text to the user, while it is + not reasonable to do so with most nontextual data. + + Such formatted textual data should be represented using + subtypes of text. Plausible subtypes of text are typically + given by the common name of the representation format, e.g., + "text/richtext". + + 7.1.1 The charset parameter + + A critical parameter that may be specified in the Content- + Type field for text data is the character set. This is + specified with a "charset" parameter, as in: + + Content-type: text/plain; charset=us-ascii + + Unlike some other parameter values, the values of the + charset parameter are NOT case sensitive. The default + character set, which must be assumed in the absence of a + charset parameter, is US-ASCII. + + An initial list of predefined character set names can be + found at the end of this section. Additional character sets + may be registered with IANA as described in Appendix F, + although the standardization of their use requires the usual + IAB review and approval. 
Note that if the specified + character set includes 8-bit data, a Content-Transfer- + Encoding header field and a corresponding encoding on the + data are required in order to transmit the body via some + mail transfer protocols, such as SMTP. + + The default character set, US-ASCII, has been the subject of + some confusion and ambiguity in the past. Not only were + there some ambiguities in the definition, there have been + wide variations in practice. In order to eliminate such + ambiguity and variations in the future, it is strongly + recommended that new user agents explicitly specify a + character set via the Content-Type header field. "US-ASCII" + does not indicate an arbitrary seven-bit character code, but + specifies that the body uses character coding that uses the + exact correspondence of codes to characters specified in + ASCII. National use variations of ISO 646 [ISO-646] are NOT + ASCII and their use in Internet mail is explicitly + discouraged. The omission of the ISO 646 character set is + deliberate in this regard. The character set name of "US- + ASCII" explicitly refers to ANSI X3.4-1986 [US-ASCII] only. + The character set name "ASCII" is reserved and must not be + used for any purpose. + + NOTE: RFC 821 explicitly specifies "ASCII", and references + an earlier version of the American Standard. Insofar as one + of the purposes of specifying a Content-Type and character + set is to permit the receiver to unambiguously determine how + the sender intended the coded message to be interpreted, + assuming anything other than "strict ASCII" as the default + would risk unintentional and incompatible changes to the + semantics of messages now being transmitted. 
This also + implies that messages containing characters coded according + to national variations on ISO 646, or using code-switching + procedures (e.g., those of ISO 2022), as well as 8-bit or + multiple octet character encodings MUST use an appropriate + character set specification to be consistent with this + specification. + + The complete US-ASCII character set is listed in [US-ASCII]. + Note that the control characters including DEL (0-31, 127) + have no defined meaning apart from the combination CRLF + (ASCII values 13 and 10) indicating a new line. Two of the + characters have de facto meanings in wide use: FF (12) often + means "start subsequent text on the beginning of a new + page"; and TAB or HT (9) often (though not always) means + "move the cursor to the next available column after the + current position where the column number is a multiple of 8 + (counting the first column as column 0)." Apart from this, + any use of the control characters or DEL in a body must be + part of a private agreement between the sender and + recipient. Such private agreements are discouraged and + should be replaced by the other capabilities of this + document. + + NOTE: Beyond US-ASCII, an enormous proliferation of + character sets is possible. It is the opinion of the IETF + working group that a large number of character sets is NOT a + good thing. We would prefer to specify a single character + set that can be used universally for representing all of the + world's languages in electronic mail. Unfortunately, + existing practice in several communities seems to point to + the continued use of multiple character sets in the near + future. For this reason, we define names for a small number + of character sets for which a strong constituent base + exists. 
It is our hope that ISO 10646 or some other + effort will eventually define a single world character set + which can then be specified for use in Internet mail, but in + the advance of that definition we cannot specify the use of + ISO 10646, Unicode, or any other character set whose + definition is, as of this writing, incomplete. + + The defined charset values are: + + US-ASCII -- as defined in [US-ASCII]. + + ISO-8859-X -- where "X" is to be replaced, as + necessary, for the parts of ISO-8859 [ISO- + 8859]. Note that the ISO 646 character sets + have deliberately been omitted in favor of + their 8859 replacements, which are the + designated character sets for Internet mail. + As of the publication of this document, the + legitimate values for "X" are the digits 1 + through 9. + + Note that the character set used, if anything other than + US-ASCII, must always be explicitly specified in the + Content-Type field. + + No other character set name may be used in Internet mail + without the publication of a formal specification and its + registration with IANA as described in Appendix F, or by + private agreement, in which case the character set name must + begin with "X-". + + Implementors are discouraged from defining new character + sets for mail use unless absolutely necessary. + + The "charset" parameter has been defined primarily for the + purpose of textual data, and is described in this section + for that reason. However, it is conceivable that non- + textual data might also wish to specify a charset value for + some purpose, in which case the same syntax and values + should be used. + + In general, mail-sending software should always use the + "lowest common denominator" character set possible. 
For + example, if a body contains only US-ASCII characters, it + should be marked as being in the US-ASCII character set, not + ISO-8859-1, which, like all the ISO-8859 family of character + sets, is a superset of US-ASCII. More generally, if a + widely-used character set is a subset of another character + set, and a body contains only characters in the widely-used + subset, it should be labeled as being in that subset. This + will increase the chances that the recipient will be able to + view the mail correctly. + + 7.1.2 The Text/plain subtype + + The primary subtype of text is "plain". This indicates + plain (unformatted) text. The default Content-Type for + Internet mail, "text/plain; charset=us-ascii", describes + existing Internet practice, that is, it is the type of body + defined by RFC 822. + + 7.1.3 The Text/richtext subtype + + In order to promote the wider interoperability of simple + formatted text, this document defines an extremely simple + subtype of "text", the "richtext" subtype. This subtype was + designed to meet the following criteria: + + 1. The syntax must be extremely simple to parse, + so that even teletype-oriented mail systems can + easily strip away the formatting information and + leave only the readable text. + + 2. The syntax must be extensible to allow for new + formatting commands that are deemed essential. + + 3. The capabilities must be extremely limited, to + ensure that it can represent no more than is + likely to be representable by the user's primary + word processor. While this limits what can be + sent, it increases the likelihood that what is + sent can be properly displayed. + + 4. The syntax must be compatible with SGML, so + that, with an appropriate DTD (Document Type + Definition, the standard mechanism for defining a + document type using SGML), a general SGML parser + could be made to parse richtext. 
However, despite + this compatibility, the syntax should be far + simpler than full SGML, so that no SGML knowledge + is required in order to implement it. + + The syntax of "richtext" is very simple. It is assumed, at + the top-level, to be in the US-ASCII character set, unless + of course a different charset parameter was specified in the + Content-type field. All characters represent themselves, + with the exception of the "<" character (ASCII 60), which is + used to mark the beginning of a formatting command. + + Formatting instructions consist of formatting commands + surrounded by angle brackets ("<>", ASCII 60 and 62). Each + formatting command may be no more than 40 characters in + length, all in US-ASCII, restricted to the alphanumeric and + hyphen ("-") characters. Formatting commands may be preceded + by a forward slash or solidus ("/", ASCII 47), making them + negations, and such negations must always exist to balance + the initial opening commands, except as noted below. Thus, + if the formatting command "<bold>" appears at some point, + there must later be a "</bold>" to balance it. There are + only three exceptions to this "balancing" rule: First, the + command "<lt>" is used to represent a literal "<" character. + Second, the command "<nl>" is used to represent a required + line break. (Otherwise, CRLFs in the data are treated as + equivalent to a single SPACE character.) Finally, the + command "<np>" is used to represent a page break. (NOTE: + The 40 character limit on formatting commands does not + include the "<", ">", or "/" characters that might be + attached to such commands.) + + Initially defined formatting commands, not all of which will + be implemented by all richtext implementations, include: + + Bold -- causes the subsequent text to be in a bold + font. + Italic -- causes the subsequent text to be in an italic + font. 
+ Fixed -- causes the subsequent text to be in a fixed + width font. + Smaller -- causes the subsequent text to be in a + smaller font. + Bigger -- causes the subsequent text to be in a bigger + font. + Underline -- causes the subsequent text to be + underlined. + Center -- causes the subsequent text to be centered. + FlushLeft -- causes the subsequent text to be left + justified. + FlushRight -- causes the subsequent text to be right + justified. + Indent -- causes the subsequent text to be indented at + the left margin. + IndentRight -- causes the subsequent text to be + indented at the right margin. + Outdent -- causes the subsequent text to be outdented + at the left margin. + OutdentRight -- causes the subsequent text to be + outdented at the right margin. + SamePage -- causes the subsequent text to be grouped, + if possible, on one page. + Subscript -- causes the subsequent text to be + interpreted as a subscript. + Superscript -- causes the subsequent text to be + interpreted as a superscript. + Heading -- causes the subsequent text to be interpreted + as a page heading. + Footing -- causes the subsequent text to be interpreted + as a page footing. + ISO-8859-X (for any value of X that is legal as a + "charset" parameter) -- causes the subsequent text + to be interpreted as text in the appropriate + character set. + US-ASCII -- causes the subsequent text to be + interpreted as text in the US-ASCII character set. + Excerpt -- causes the subsequent text to be interpreted + as a textual excerpt from another source. + Typically this will be displayed using indentation + and an alternate font, but such decisions are up + to the viewer. + Paragraph -- causes the subsequent text to be + interpreted as a single paragraph, with + appropriate paragraph breaks (typically blank + space) before and after. 
+ Signature -- causes the subsequent text to be + interpreted as a "signature". Some systems may + wish to display signatures in a smaller font or + otherwise set them apart from the main text of the + message. + Comment -- causes the subsequent text to be interpreted + as a comment, and hence not shown to the reader. + No-op -- has no effect on the subsequent text. + lt -- <lt> is replaced by a literal "<" character. No + balancing </lt> is allowed. + nl -- <nl> causes a line break. No balancing </nl> is + allowed. + np -- <np> causes a page break. No balancing </np> is + allowed. + + Each positive formatting command affects all subsequent text + until the matching negative formatting command. Such pairs + of formatting commands must be properly balanced and nested. + Thus, a proper way to describe text in bold italics is: + + <bold><italic>the-text</italic></bold> + + or, alternately, + + <italic><bold>the-text</bold></italic> + + but, in particular, the following is illegal + richtext: + + <bold><italic>the-text</bold></italic> + + NOTE: The nesting requirement for formatting commands + imposes a slightly higher burden upon the composers of + richtext bodies, but potentially simplifies richtext + displayers by allowing them to be stack-based. The main + goal of richtext is to be simple enough to make multifont, + formatted email widely readable, so that those with the + capability of sending it will be able to do so with + confidence. Thus slightly increased complexity in the + composing software was deemed a reasonable tradeoff for + simplified reading software. Nonetheless, implementors of + richtext readers are encouraged to follow the general + Internet guidelines of being conservative in what you send + and liberal in what you accept. Those implementations that + can do so are encouraged to deal reasonably with improperly + nested richtext. 
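The stack-based processing that the NOTE above alludes to can be sketched in C as a strict nesting check. This is an illustration only, not code from this repository: the names rt_is_standalone and rt_is_balanced and the depth limit RT_MAX_DEPTH are hypothetical, and comparing command names without regard to case is an assumption, since the text above does not say whether "<Bold>" and "<bold>" are the same command. The sketch does not verify that command names use only alphanumeric and hyphen characters, and it is stricter than the tolerant reader that the NOTE encourages.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define RT_MAX_CMD   40   /* command length limit stated in the text */
#define RT_MAX_DEPTH 64   /* arbitrary nesting limit for this sketch */

/* Commands that the text says take no balancing negation. */
static int rt_is_standalone(const char *cmd)
{
    return strcmp(cmd, "lt") == 0 || strcmp(cmd, "nl") == 0 ||
           strcmp(cmd, "np") == 0;
}

/*
 * Return 1 if every formatting command in `s` is properly balanced
 * and nested, 0 otherwise (including malformed or overlong commands).
 */
int rt_is_balanced(const char *s)
{
    char stack[RT_MAX_DEPTH][RT_MAX_CMD + 1];
    int depth = 0;

    while (*s != '\0') {
        char cmd[RT_MAX_CMD + 1];
        size_t n = 0;
        int negation = 0;

        if (*s++ != '<')
            continue;               /* ordinary text represents itself */
        if (*s == '/') {
            negation = 1;
            s++;
        }
        /* Collect the command name, folded to lower case (assumption). */
        while (*s != '\0' && *s != '>') {
            if (n == RT_MAX_CMD)
                return 0;           /* longer than 40 characters */
            cmd[n++] = (char)tolower((unsigned char)*s++);
        }
        if (*s != '>' || n == 0)
            return 0;               /* unterminated or empty command */
        s++;
        cmd[n] = '\0';

        if (rt_is_standalone(cmd)) {
            if (negation)
                return 0;           /* e.g. </nl> is not allowed */
        } else if (negation) {
            /* A negation must match the most recently opened command. */
            if (depth == 0 || strcmp(stack[depth - 1], cmd) != 0)
                return 0;
            depth--;
        } else {
            if (depth == RT_MAX_DEPTH)
                return 0;           /* nesting too deep for this sketch */
            strcpy(stack[depth++], cmd);
        }
    }
    return depth == 0;              /* nothing may be left open */
}
```

Applied to the examples above, the two properly nested bold-italic forms are accepted and the overlapping form is rejected. A displayer could use the same stack to track which formatting is currently in effect instead of merely validating.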
+ + Implementations must regard any unrecognized formatting + command as equivalent to "No-op", thus facilitating future + extensions to "richtext". Private extensions may be defined + using formatting commands that begin with "X-", by analogy + to Internet mail header field names. + + It is worth noting that no special behavior is required for + the TAB (HT) character. It is recommended, however, that, at + least when fixed-width fonts are in use, the common + semantics of the TAB (HT) character should be observed, + namely that it moves to the next column position that is a + multiple of 8. (In other words, if a TAB (HT) occurs in + column n, where the leftmost column is column 0, then that + TAB (HT) should be replaced by 8-(n mod 8) SPACE + characters.) + + Richtext also differentiates between "hard" and "soft" line + breaks. A line break (CRLF) in the richtext data stream is + interpreted as a "soft" line break, one that is included + only for purposes of mail transport, and is to be treated as + white space by richtext interpreters. To include a "hard" + line break (one that must be displayed as such), the "<nl>" + or "<paragraph>" formatting constructs should be used. In + general, a soft line break should be treated as white space, + but when soft line breaks immediately follow a <nl> or a + </paragraph> tag they should be ignored rather than treated + as white space. + + Putting all this together, the following "text/richtext" + body fragment: + + <bold>Now</bold> is the time for + <italic>all</italic> good men + <smaller>(and <lt>women>)</smaller> to + <ignoreme></ignoreme> come + + to the aid of their + <nl> + beloved <nl><nl>country. <comment> Stupid + quote! 
</comment> -- the end + + represents the following formatted text (which will, no + doubt, look cryptic in the text-only version of this + document): + + Now is the time for all good men (and <women>) to + come to the aid of their + beloved + + country. -- the end + + Richtext conformance: A minimal richtext implementation is + one that simply converts "<lt>" to "<", converts CRLFs to + SPACE, converts <nl> to a newline according to local newline + convention, removes everything between a <comment> command + and the next balancing </comment> command, and removes all + other formatting commands (all text enclosed in angle + brackets). + + NOTE ON THE RELATIONSHIP OF RICHTEXT TO SGML: Richtext is + decidedly not SGML, and must not be used to transport + arbitrary SGML documents. Those who wish to use SGML + document types as a mail transport format must define a new + text or application subtype, e.g., "text/sgml-dtd-whatever" + or "application/sgml-dtd-whatever", depending on the + perceived readability of the DTD in use. Richtext is + designed to be compatible with SGML, and specifically so + that it will be possible to define a richtext DTD if one is + needed. However, this does not imply that arbitrary SGML + can be called richtext, nor that richtext implementors have + any need to understand SGML; the description in this + document is a complete definition of richtext, which is far + simpler than complete SGML. + + NOTE ON THE INTENDED USE OF RICHTEXT: It is recognized that + implementors of future mail systems will want rich text + functionality far beyond that currently defined for + richtext. The intent of richtext is to provide a common + format for expressing that functionality in a form in which + much of it, at least, will be understood by interoperating + software. 
Thus, in particular, software with a richer + notion of formatted text than richtext can still use + richtext as its basic representation, but can extend it with + new formatting commands and by hiding information specific + to that software system in richtext comments. As such + systems evolve, it is expected that the definition of + richtext will be further refined by future published + specifications, but richtext as defined here provides a + platform on which evolutionary refinements can be based. + + IMPLEMENTATION NOTE: In some environments, it might be + impossible to combine certain richtext formatting commands, + whereas in others they might be combined easily. For + example, the combination of <bold> and <italic> might + produce bold italics on systems that support such fonts, but + there exist systems that can make text bold or italicized, + but not both. In such cases, the most recently issued + recognized formatting command should be preferred. + + One of the major goals in the design of richtext was to make + it so simple that even text-only mailers will implement + richtext-to-plain-text translators, thus increasing the + likelihood that multifont text will become "safe" to use + very widely. To demonstrate this simplicity, an extremely + simple 35-line C program that converts richtext input into + plain text output is included in Appendix D. + + 7.2 The Multipart Content-Type + + In the case of multiple part messages, in which one or more + different sets of data are combined in a single body, a + "multipart" Content-Type field must appear in the entity's + header. 
The body must then contain one or more "body parts," + each preceded by an encapsulation boundary, and the last one + followed by a closing boundary. Each part starts with an + encapsulation boundary, and then contains a body part + consisting of header area, a blank line, and a body area. + Thus a body part is similar to an RFC 822 message in syntax, + but different in meaning. + + A body part is NOT to be interpreted as actually being an + RFC 822 message. To begin with, NO header fields are + actually required in body parts. A body part that starts + with a blank line, therefore, is allowed and is a body part + for which all default values are to be assumed. In such a + case, the absence of a Content-Type header field implies + that the encapsulation is plain US-ASCII text. The only + header fields that have defined meaning for body parts are + those the names of which begin with "Content-". All other + header fields are generally to be ignored in body parts. + Although they should generally be retained in mail + processing, they may be discarded by gateways if necessary. + Such other fields are permitted to appear in body parts but + should not be depended on. "X-" fields may be created for + experimental or private purposes, with the recognition that + the information they contain may be lost at some gateways. + + The distinction between an RFC 822 message and a body part + is subtle, but important. A gateway between Internet and + X.400 mail, for example, must be able to tell the difference + between a body part that contains an image and a body part + that contains an encapsulated message, the body of which is + an image. In order to represent the latter, the body part + must have "Content-Type: message", and its body (after the + blank line) must be the encapsulated message, with its own + "Content-Type: image" header field. 
The use of similar + syntax facilitates the conversion of messages to body parts, + and vice versa, but the distinction between the two must be + understood by implementors. (For the special case in which + all parts actually are messages, a "digest" subtype is also + defined.) + + As stated previously, each body part is preceded by an + encapsulation boundary. The encapsulation boundary MUST NOT + appear inside any of the encapsulated parts. Thus, it is + crucial that the composing agent be able to choose and + specify the unique boundary that will separate the parts. + + All present and future subtypes of the "multipart" type must + use an identical syntax. Subtypes may differ in their + semantics, and may impose additional restrictions on syntax, + but must conform to the required syntax for the multipart + type. This requirement ensures that all conformant user + agents will at least be able to recognize and separate the + parts of any multipart entity, even of an unrecognized + subtype. + + As stated in the definition of the Content-Transfer-Encoding + field, no encoding other than "7bit", "8bit", or "binary" is + permitted for entities of type "multipart". The multipart + delimiters and header fields are always 7-bit ASCII in any + case, and data within the body parts can be encoded on a + part-by-part basis, with Content-Transfer-Encoding fields + for each appropriate body part. + + Mail gateways, relays, and other mail handling agents are + commonly known to alter the top-level header of an RFC 822 + message. In particular, they frequently add, remove, or + reorder header fields. Such alterations are explicitly + forbidden for the body part headers embedded in the bodies + of messages of type "multipart." + + 7.2.1 Multipart: The common syntax + + All subtypes of "multipart" share a common syntax, defined + in this section. 
A simple example of a multipart message + also appears in this section. An example of a more complex + multipart message is given in Appendix C. + + The Content-Type field for multipart entities requires one + parameter, "boundary", which is used to specify the + encapsulation boundary. The encapsulation boundary is + defined as a line consisting entirely of two hyphen + characters ("-", decimal code 45) followed by the boundary + parameter value from the Content-Type header field. + + NOTE: The hyphens are for rough compatibility with the + earlier RFC 934 method of message encapsulation, and for + ease of searching for the boundaries in some + implementations. However, it should be noted that multipart + messages are NOT completely compatible with RFC 934 + encapsulations; in particular, they do not obey RFC 934 + quoting conventions for embedded lines that begin with + hyphens. This mechanism was chosen over the RFC 934 + mechanism because the latter causes lines to grow with each + level of quoting. The combination of this growth with the + fact that SMTP implementations sometimes wrap long lines + made the RFC 934 mechanism unsuitable for use in the event + that deeply-nested multipart structuring is ever desired. 
+ + Thus, a typical multipart Content-Type header field might + look like this: + + Content-Type: multipart/mixed; + boundary=gc0p4Jq0M2Yt08jU534c0p + + This indicates that the entity consists of several parts, + each itself with a structure that is syntactically identical + to an RFC 822 message, except that the header area might be + completely empty, and that the parts are each preceded by + the line + + --gc0p4Jq0M2Yt08jU534c0p + + Note that the encapsulation boundary must occur at the + beginning of a line, i.e., following a CRLF, and that that + initial CRLF is considered to be part of the encapsulation + boundary rather than part of the preceding part. The + boundary must be followed immediately either by another CRLF + and the header fields for the next part, or by two CRLFs, in + which case there are no header fields for the next part (and + it is therefore assumed to be of Content-Type text/plain). + + NOTE: The CRLF preceding the encapsulation line is + considered part of the boundary so that it is possible to + have a part that does not end with a CRLF (line break). + Body parts that must be considered to end with line breaks, + therefore, should have two CRLFs preceding the encapsulation + line, the first of which is part of the preceding body part, + and the second of which is part of the encapsulation + boundary. + + The requirement that the encapsulation boundary begins with + a CRLF implies that the body of a multipart entity must + itself begin with a CRLF before the first encapsulation line + -- that is, if the "preamble" area is not used, the entity + headers must be followed by TWO CRLFs. This is indeed how + such entities should be composed. 
A tolerant mail reading + program, however, may interpret a body of type multipart + that begins with an encapsulation line NOT initiated by a + CRLF as also being an encapsulation boundary, but a + compliant mail sending program must not generate such + entities. + + Encapsulation boundaries must not appear within the + encapsulations, and must be no longer than 70 characters, + not counting the two leading hyphens. + + The encapsulation boundary following the last body part is a + distinguished delimiter that indicates that no further body + parts will follow. Such a delimiter is identical to the + previous delimiters, with the addition of two more hyphens + at the end of the line: + + --gc0p4Jq0M2Yt08jU534c0p-- + + There appears to be room for additional information prior to + the first encapsulation boundary and following the final + + + + Borenstein & Freed [Page 31] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + boundary. These areas should generally be left blank, and + implementations should ignore anything that appears before + the first boundary or after the last one. + + NOTE: These "preamble" and "epilogue" areas are not used + because of the lack of proper typing of these parts and the + lack of clear semantics for handling these areas at + gateways, particularly X.400 gateways. + + NOTE: Because encapsulation boundaries must not appear in + the body parts being encapsulated, a user agent must + exercise care to choose a unique boundary. The boundary in + the example above could have been the result of an algorithm + designed to produce boundaries with a very low probability + of already existing in the data to be encapsulated without + having to prescan the data. Alternate algorithms might + result in more 'readable' boundaries for a recipient with an + old user agent, but would require more attention to the + possibility that the boundary might appear in the + encapsulated part. 
The simplest boundary possible is + something like "---", with a closing boundary of "-----". + + As a very simple example, the following multipart message + has two parts, both of them plain text, one of them + explicitly typed and one of them implicitly typed: + + From: Nathaniel Borenstein <nsb@bellcore.com> + To: Ned Freed <ned@innosoft.com> + Subject: Sample message + MIME-Version: 1.0 + Content-type: multipart/mixed; boundary="simple + boundary" + + This is the preamble. It is to be ignored, though it + is a handy place for mail composers to include an + explanatory note to non-MIME compliant readers. + --simple boundary + + This is implicitly typed plain ASCII text. + It does NOT end with a linebreak. + --simple boundary + Content-type: text/plain; charset=us-ascii + + This is explicitly typed plain ASCII text. + It DOES end with a linebreak. + + --simple boundary-- + This is the epilogue. It is also to be ignored. + + The use of a Content-Type of multipart in a body part within + another multipart entity is explicitly allowed. In such + cases, for obvious reasons, care must be taken to ensure + that each nested multipart entity must use a different + boundary delimiter. See Appendix C for an example of nested + + + + Borenstein & Freed [Page 32] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + multipart entities. + + The use of the multipart Content-Type with only a single + body part may be useful in certain contexts, and is + explicitly permitted. + + The only mandatory parameter for the multipart Content-Type + is the boundary parameter, which consists of 1 to 70 + characters from a set of characters known to be very robust + through email gateways, and NOT ending with white space. + (If a boundary appears to end with white space, the white + space must be presumed to have been added by a gateway, and + should be deleted.) 
It is formally specified by the + following BNF: + + boundary := 0*69<bchars> bcharsnospace + + bchars := bcharsnospace / " " + + bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / + "_" + / "," / "-" / "." / "/" / ":" / "=" / "?" + + Overall, the body of a multipart entity may be specified as + follows: + + multipart-body := preamble 1*encapsulation + close-delimiter epilogue + + encapsulation := delimiter CRLF body-part + + delimiter := CRLF "--" boundary ; taken from Content-Type + field. + ; when content-type is + multipart + ; There must be no space + ; between "--" and boundary. + + close-delimiter := delimiter "--" ; Again, no space before + "--" + + preamble := *text ; to be ignored upon + receipt. + + epilogue := *text ; to be ignored upon + receipt. + + body-part = <"message" as defined in RFC 822, + with all header fields optional, and with the + specified delimiter not occurring anywhere in + the message body, either on a line by itself + or as a substring anywhere. Note that the + + + + + + Borenstein & Freed [Page 33] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + semantics of a part differ from the semantics + of a message, as described in the text.> + + NOTE: Conspicuously missing from the multipart type is a + notion of structured, related body parts. In general, it + seems premature to try to standardize interpart structure + yet. It is recommended that those wishing to provide a more + structured or integrated multipart messaging facility should + define a subtype of multipart that is syntactically + identical, but that always expects the inclusion of a + distinguished part that can be used to specify the structure + and integration of the other parts, probably referring to + them by their Content-ID field. 
If this approach is used, + other implementations will not recognize the new subtype, + but will treat it as the primary subtype (multipart/mixed) + and will thus be able to show the user the parts that are + recognized. + + 7.2.2 The Multipart/mixed (primary) subtype + + The primary subtype for multipart, "mixed", is intended for + use when the body parts are independent and intended to be + displayed serially. Any multipart subtypes that an + implementation does not recognize should be treated as being + of subtype "mixed". + + 7.2.3 The Multipart/alternative subtype + + The multipart/alternative type is syntactically identical to + multipart/mixed, but the semantics are different. In + particular, each of the parts is an "alternative" version of + the same information. User agents should recognize that the + content of the various parts are interchangeable. The user + agent should either choose the "best" type based on the + user's environment and preferences, or offer the user the + available alternatives. In general, choosing the best type + means displaying only the LAST part that can be displayed. + This may be used, for example, to send mail in a fancy text + format in such a way that it can easily be displayed + anywhere: + + From: Nathaniel Borenstein <nsb@bellcore.com> + To: Ned Freed <ned@innosoft.com> + Subject: Formatted text mail + MIME-Version: 1.0 + Content-Type: multipart/alternative; boundary=boundary42 + + + --boundary42 + Content-Type: text/plain; charset=us-ascii + + ...plain text version of message goes here.... + + + + + + Borenstein & Freed [Page 34] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + --boundary42 + Content-Type: text/richtext + + .... richtext version of same message goes here ... + --boundary42 + Content-Type: text/x-whatever + + .... fanciest formatted version of same message goes here + ... 
+ --boundary42-- + + In this example, users whose mail system understood the + "text/x-whatever" format would see only the fancy version, + while other users would see only the richtext or plain text + version, depending on the capabilities of their system. + + In general, user agents that compose multipart/alternative + entities should place the body parts in increasing order of + preference, that is, with the preferred format last. For + fancy text, the sending user agent should put the plainest + format first and the richest format last. Receiving user + agents should pick and display the last format they are + capable of displaying. In the case where one of the + alternatives is itself of type "multipart" and contains + unrecognized sub-parts, the user agent may choose either to + show that alternative, an earlier alternative, or both. + + NOTE: From an implementor's perspective, it might seem more + sensible to reverse this ordering, and have the plainest + alternative last. However, placing the plainest alternative + first is the friendliest possible option when + multipart/alternative entities are viewed using a non-MIME- + compliant mail reader. While this approach does impose some + burden on compliant mail readers, interoperability with + older mail readers was deemed to be more important in this + case. + + It may be the case that some user agents, if they can + recognize more than one of the formats, will prefer to offer + the user the choice of which format to view. This makes + sense, for example, if mail includes both a nicely-formatted + image version and an easily-edited text version. What is + most critical, however, is that the user not automatically + be shown multiple versions of the same data. Either the + user should be shown the last recognized version or should + explicitly be given the choice. 
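The selection rule for receiving user agents (display the LAST part that can be displayed) can be sketched as follows. This is only an illustration: the function name, the `can_display` callback, and the set of supported types are assumptions introduced here, not part of this specification.

```python
from typing import Callable, Optional, Sequence


def choose_alternative(
    part_types: Sequence[str],
    can_display: Callable[[str], bool],
) -> Optional[int]:
    """Return the index of the LAST part whose content type can be displayed."""
    for index in range(len(part_types) - 1, -1, -1):
        if can_display(part_types[index].lower()):
            return index
    return None  # nothing displayable; the caller decides what to do


# Content types in the order the parts appear in the entity.
parts = ["text/plain", "text/richtext", "text/x-whatever"]
supported = {"text/plain", "text/richtext"}  # hypothetical reader capabilities
chosen = choose_alternative(parts, lambda t: t in supported)
print(chosen)
```

A reader that prefers to offer the user a choice would instead collect every displayable index and present the list, never showing more than one version automatically.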
+ + + + + + + + + + + + Borenstein & Freed [Page 35] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + 7.2.4 The Multipart/digest subtype + + This document defines a "digest" subtype of the multipart + Content-Type. This type is syntactically identical to + multipart/mixed, but the semantics are different. In + particular, in a digest, the default Content-Type value for + a body part is changed from "text/plain" to + "message/rfc822". This is done to allow a more readable + digest format that is largely compatible (except for the + quoting convention) with RFC 934. + + A digest in this format might, then, look something like + this: + + From: Moderator-Address + MIME-Version: 1.0 + Subject: Internet Digest, volume 42 + Content-Type: multipart/digest; + boundary="---- next message ----" + + + ------ next message ---- + + From: someone-else + Subject: my opinion + + ...body goes here ... + + ------ next message ---- + + From: someone-else-again + Subject: my different opinion + + ... another body goes here... + + ------ next message ------ + + 7.2.5 The Multipart/parallel subtype + + This document defines a "parallel" subtype of the multipart + Content-Type. This type is syntactically identical to + multipart/mixed, but the semantics are different. In + particular, in a parallel entity, all of the parts are + intended to be presented in parallel, i.e., simultaneously, + on hardware and software that are capable of doing so. + Composing agents should be aware that many mail readers will + lack this capability and will show the parts serially in any + event. + + + + + + + + + + Borenstein & Freed [Page 36] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + 7.3 The Message Content-Type + + It is frequently desirable, in sending mail, to encapsulate + another mail message. For this common operation, a special + Content-Type, "message", is defined. 
The primary subtype, + message/rfc822, has no required parameters in the Content- + Type field. Additional subtypes, "partial" and "External- + body", do have required parameters. These subtypes are + explained below. + + NOTE: It has been suggested that subtypes of message might + be defined for forwarded or rejected messages. However, + forwarded and rejected messages can be handled as multipart + messages in which the first part contains any control or + descriptive information, and a second part, of type + message/rfc822, is the forwarded or rejected message. + Composing rejection and forwarding messages in this manner + will preserve the type information on the original message + and allow it to be correctly presented to the recipient, and + hence is strongly encouraged. + + As stated in the definition of the Content-Transfer-Encoding + field, no encoding other than "7bit", "8bit", or "binary" is + permitted for messages or parts of type "message". The + message header fields are always US-ASCII in any case, and + data within the body can still be encoded, in which case the + Content-Transfer-Encoding header field in the encapsulated + message will reflect this. Non-ASCII text in the headers of + an encapsulated message can be specified using the + mechanisms described in [RFC-1342]. + + Mail gateways, relays, and other mail handling agents are + commonly known to alter the top-level header of an RFC 822 + message. In particular, they frequently add, remove, or + reorder header fields. Such alterations are explicitly + forbidden for the encapsulated headers embedded in the + bodies of messages of type "message." + + 7.3.1 The Message/rfc822 (primary) subtype + + A Content-Type of "message/rfc822" indicates that the body + contains an encapsulated message, with the syntax of an RFC + 822 message. 
+ + 7.3.2 The Message/Partial subtype + + A subtype of message, "partial", is defined in order to + allow large objects to be delivered as several separate + pieces of mail and automatically reassembled by the + receiving user agent. (The concept is similar to IP + fragmentation/reassembly in the basic Internet Protocols.) + This mechanism can be used when intermediate transport + agents limit the size of individual messages that can be + sent. Content-Type "message/partial" thus indicates that + + + + Borenstein & Freed [Page 37] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + the body contains a fragment of a larger message. + + Three parameters must be specified in the Content-Type field + of type message/partial: The first, "id", is a unique + identifier, as close to a world-unique identifier as + possible, to be used to match the parts together. (In + general, the identifier is essentially a message-id; if + placed in double quotes, it can be any message-id, in + accordance with the BNF for "parameter" given earlier in + this specification.) The second, "number", an integer, is + the part number, which indicates where this part fits into + the sequence of fragments. The third, "total", another + integer, is the total number of parts. This third subfield + is required on the final part, and is optional on the + earlier parts. Note also that these parameters may be given + in any order. + + Thus, part 2 of a 3-part message may have either of the + following header fields: + + Content-Type: Message/Partial; + number=2; total=3; + id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; + + Content-Type: Message/Partial; + id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; + number=2 + + But part 3 MUST specify the total number of parts: + + Content-Type: Message/Partial; + number=3; total=3; + id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; + + Note that part numbering begins with 1, not 0. 
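A receiving agent has to decide when it holds every fragment for a given id. The sketch below is one way to express that check, under stated assumptions: each fragment's Content-Type parameters are already parsed into a dictionary with lower-case keys, all fragments passed in share the same "id", and the function name is hypothetical.

```python
from typing import Dict, Iterable, Optional


def is_complete(fragments: Iterable[Dict[str, str]]) -> bool:
    """Check whether the fragments of ONE id cover parts 1..total exactly."""
    numbers = set()
    total: Optional[int] = None
    for params in fragments:
        numbers.add(int(params["number"]))
        if "total" in params:  # optional on earlier parts, required on the last
            declared = int(params["total"])
            if total is not None and declared != total:
                return False  # conflicting totals; do not attempt reassembly
            total = declared
    if total is None:
        return False  # no part seen so far declares the total
    return numbers == set(range(1, total + 1))  # numbering begins with 1


received = [
    {"id": "oc=jpbe0M2Yt4s@thumper.bellcore.com", "number": "2"},
    {"id": "oc=jpbe0M2Yt4s@thumper.bellcore.com", "number": "3", "total": "3"},
    {"id": "oc=jpbe0M2Yt4s@thumper.bellcore.com", "number": "1", "total": "3"},
]
print(is_complete(received))
```

Because the parameters may arrive in any order and fragments may arrive out of sequence, the check works on the set of part numbers rather than on arrival order.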
+ + When the parts of a message broken up in this manner are put + together, the result is a complete RFC 822 format message, + which may have its own Content-Type header field, and thus + may contain any other data type. + + Message fragmentation and reassembly: The semantics of a + reassembled partial message must be those of the "inner" + message, rather than of a message containing the inner + message. This makes it possible, for example, to send a + large audio message as several partial messages, and still + have it appear to the recipient as a simple audio message + rather than as an encapsulated message containing an audio + message. That is, the encapsulation of the message is + considered to be "transparent". + + When generating and reassembling the parts of a + message/partial message, the headers of the encapsulated + message must be merged with the headers of the enclosing + + + + Borenstein & Freed [Page 38] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + entities. In this process the following rules must be + observed: + + (1) All of the headers from the initial enclosing + entity (part one), except those that start with + "Content-" and "Message-ID", must be copied, in + order, to the new message. + + (2) Only those headers in the enclosed message + which start with "Content-" and "Message-ID" must + be appended, in order, to the headers of the new + message. Any headers in the enclosed message + which do not start with "Content-" (except for + "Message-ID") will be ignored. + + (3) All of the headers from the second and any + subsequent messages will be ignored. 
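The three rules above can be sketched in Python as follows. This is an illustrative reading, not a reference implementation: headers are assumed to be already parsed into (name, value) pairs, names are compared case-insensitively, and the function names are hypothetical. The sketch applies the rules literally, so the inner "Content-" and "Message-ID" headers come after the retained outer headers.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def _is_content_or_message_id(name: str) -> bool:
    """True for header names starting with "Content-" and for "Message-ID"."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered == "message-id"


def merge_headers(outer_first: List[Header], inner: List[Header]) -> List[Header]:
    """Merge the headers of part one with those of the enclosed message."""
    # Rule 1: keep part one's headers except Content-* and Message-ID.
    kept = [h for h in outer_first if not _is_content_or_message_id(h[0])]
    # Rule 2: append only Content-* and Message-ID from the enclosed message.
    appended = [h for h in inner if _is_content_or_message_id(h[0])]
    # Rule 3: headers of later parts are never passed in, so they are ignored.
    return kept + appended


outer = [
    ("X-Weird-Header-1", "Foo"),
    ("From", "Bill@host.com"),
    ("Message-ID", "id1@host.com"),
    ("MIME-Version", "1.0"),
    ("Content-type", 'message/partial; id="ABC@host.com"; number=1; total=2'),
]
inner = [
    ("X-Weird-Header-1", "Bar"),
    ("Message-ID", "anotherid@foo.com"),
    ("Content-type", "audio/basic"),
]
merged = merge_headers(outer, inner)
for name, value in merged:
    print(f"{name}: {value}")
```

An implementation that wants the Message-ID to keep its original position among the outer headers would need an extra step; the rules as written only require that the inner value is the one that survives.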
+ + For example, if an audio message is broken into two parts, + the first part might look something like this: + + X-Weird-Header-1: Foo + From: Bill@host.com + To: joe@otherhost.com + Subject: Audio mail + Message-ID: id1@host.com + MIME-Version: 1.0 + Content-type: message/partial; + id="ABC@host.com"; + number=1; total=2 + + X-Weird-Header-1: Bar + X-Weird-Header-2: Hello + Message-ID: anotherid@foo.com + Content-type: audio/basic + Content-transfer-encoding: base64 + + ... first half of encoded audio data goes here... + + and the second half might look something like this: + + From: Bill@host.com + To: joe@otherhost.com + Subject: Audio mail + MIME-Version: 1.0 + Message-ID: id2@host.com + Content-type: message/partial; + id="ABC@host.com"; number=2; total=2 + + ... second half of encoded audio data goes here... + + Then, when the fragmented message is reassembled, the + resulting message to be displayed to the user should look + something like this: + + + + Borenstein & Freed [Page 39] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + X-Weird-Header-1: Foo + From: Bill@host.com + To: joe@otherhost.com + Subject: Audio mail + Message-ID: anotherid@foo.com + MIME-Version: 1.0 + Content-type: audio/basic + Content-transfer-encoding: base64 + + ... first half of encoded audio data goes here... + ... second half of encoded audio data goes here... + + It should be noted that, because some message transfer + agents may choose to automatically fragment large messages, + and because such agents may use different fragmentation + thresholds, it is possible that the pieces of a partial + message, upon reassembly, may prove themselves to comprise a + partial message. This is explicitly permitted. 
+ + It should also be noted that the inclusion of a "References" + field in the headers of the second and subsequent pieces of + a fragmented message that references the Message-Id on the + previous piece may be of benefit to mail readers that + understand and track references. However, the generation of + such "References" fields is entirely optional. + + 7.3.3 The Message/External-Body subtype + + The external-body subtype indicates that the actual body + data are not included, but merely referenced. In this case, + the parameters describe a mechanism for accessing the + external data. + + When a message body or body part is of type + "message/external-body", it consists of a header, two + consecutive CRLFs, and the message header for the + encapsulated message. If another pair of consecutive CRLFs + appears, this of course ends the message header for the + encapsulated message. However, since the encapsulated + message's body is itself external, it does NOT appear in the + area that follows. For example, consider the following + message: + + Content-type: message/external-body; access-type=local-file; + name="/u/nsb/Me.gif" + + Content-type: image/gif + + THIS IS NOT REALLY THE BODY! + + The area at the end, which might be called the "phantom + body", is ignored for most external-body messages. However, + it may be used to contain auxiliary information for some + such messages, as indeed it is when the access-type is + "mail-server". Of the access-types defined by this + document, the phantom body is used only when the access-type + is "mail-server". In all other cases, the phantom body is + ignored. + + The only always-mandatory parameter for + message/external-body is "access-type"; all of the other parameters may be + mandatory or optional depending on the value of access-type. 
+ + ACCESS-TYPE -- One or more case-insensitive words, + comma-separated, indicating supported access + mechanisms by which the file or data may be + obtained. Values include, but are not limited to, + "FTP", "ANON-FTP", "TFTP", "AFS", "LOCAL-FILE", + and "MAIL-SERVER". Future values, except for + experimental values beginning with "X-", must be + registered with IANA, as described in Appendix F. + + In addition, the following three parameters are optional for + ALL access-types: + + EXPIRATION -- The date (in the RFC 822 "date-time" + syntax, as extended by RFC 1123 to permit 4 digits + in the date field) after which the existence of + the external data is not guaranteed. + + SIZE -- The size (in octets) of the data. The + intent of this parameter is to help the recipient + decide whether or not to expend the necessary + resources to retrieve the external data. + + PERMISSION -- A field that indicates whether or + not it is expected that clients might also attempt + to overwrite the data. By default, or if + permission is "read", the assumption is that they + are not, and that if the data is retrieved once, + it is never needed again. If PERMISSION is "read-write", + this assumption is invalid, and any local + copy must be considered no more than a cache. + "Read" and "Read-write" are the only defined + values of permission. + + The precise semantics of the access-types defined here are + described in the sections that follow. + + 7.3.3.1 The "ftp" and "tftp" access-types + + An access-type of FTP or TFTP indicates that the message + body is accessible as a file using the FTP [RFC-959] or TFTP + [RFC-783] protocols, respectively. For these access-types, + the following additional parameters are mandatory: + + NAME -- The name of the file that contains the + actual body data. 
+ + SITE -- A machine from which the file may be + obtained, using the given protocol + + Before the data is retrieved, using these protocols, the + user will generally need to be asked to provide a login id + and a password for the machine named by the site parameter. + + In addition, the following optional parameters may also + appear when the access-type is FTP or ANON-FTP: + + DIRECTORY -- A directory from which the data named + by NAME should be retrieved. + + MODE -- A transfer mode for retrieving the + information, e.g. "image". + + 7.3.3.2 The "anon-ftp" access-type + + The "anon-ftp" access-type is identical to the "ftp" access + type, except that the user need not be asked to provide a + name and password for the specified site. Instead, the ftp + protocol will be used with login "anonymous" and a password + that corresponds to the user's email address. + + 7.3.3.3 The "local-file" and "afs" access-types + + An access-type of "local-file" indicates that the actual + body is accessible as a file on the local machine. An + access-type of "afs" indicates that the file is accessible + via the global AFS file system. In both cases, only a + single parameter is required: + + NAME -- The name of the file that contains the + actual body data. + + The following optional parameter may be used to describe the + locality of reference for the data, that is, the site or + sites at which the file is expected to be visible: + + SITE -- A domain specifier for a machine or set of + machines that are known to have access to the data + file. Asterisks may be used for wildcard matching + to a part of a domain name, such as + "*.bellcore.com", to indicate a set of machines on + which the data should be directly visible, while a + single asterisk may be used to indicate a file + that is expected to be universally available, + e.g., via a global file system. 
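The wildcard form of the SITE parameter can be checked with a few lines of Python. This is a sketch under assumptions: the function name is hypothetical, an asterisk is taken to match any run of characters (including dots), and host names are compared case-insensitively.

```python
import re


def site_matches(pattern: str, host: str) -> bool:
    """True if `host` matches a SITE value in which "*" is a wildcard."""
    # Escape everything literally, then turn each escaped asterisk into ".*".
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, host, flags=re.IGNORECASE) is not None


print(site_matches("*.bellcore.com", "thumper.bellcore.com"))
print(site_matches("*", "any.host.example"))
```

A full match is required, so a pattern such as "*.bellcore.com" does not accept a host name that merely contains that string somewhere in the middle.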
+ + 7.3.3.4 The "mail-server" access-type + + + + + Borenstein & Freed [Page 42] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + The "mail-server" access-type indicates that the actual body + is available from a mail server. The mandatory parameter + for this access-type is: + + SERVER -- The email address of the mail server + from which the actual body data can be obtained. + + Because mail servers accept a variety of syntax, some of + which is multiline, the full command to be sent to a mail + server is not included as a parameter on the content-type + line. Instead, it may be provided as the "phantom body" + when the content-type is message/external-body and the + access-type is mail-server. + + Note that MIME does not define a mail server syntax. + Rather, it allows the inclusion of arbitrary mail server + commands in the phantom body. Implementations should + include the phantom body in the body of the message it sends + to the mail server address to retrieve the relevant data. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Borenstein & Freed [Page 43] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + 7.3.3.5 Examples and Further Explanations + + With the emerging possibility of very wide-area file + systems, it becomes very hard to know in advance the set of + machines where a file will and will not be accessible + directly from the file system. Therefore it may make sense + to provide both a file name, to be tried directly, and the + name of one or more sites from which the file is known to be + accessible. An implementation can try to retrieve remote + files using FTP or any other protocol, using anonymous file + retrieval or prompting the user for the necessary name and + password. If an external body is accessible via multiple + mechanisms, the sender may include multiple parts of type + message/external-body within an entity of type + multipart/alternative. 
+ + However, the external-body mechanism is not intended to be + limited to file retrieval, as shown by the mail-server + access-type. Beyond this, one can imagine, for example, + using a video server for external references to video clips. + + If an entity is of type "message/external-body", then the + body of the entity will contain the header fields of the + encapsulated message. The body itself is to be found in the + external location. This means that if the body of the + "message/external-body" message contains two consecutive + CRLFs, everything after that pair is NOT part of the + message itself. For most message/external-body messages, + this trailing area must simply be ignored. However, it is a + convenient place for additional data that cannot be included + in the content-type header field. In particular, if the + "access-type" value is "mail-server", then the trailing area + must contain commands to be sent to the mail server at the + address given by the value of the SERVER parameter. + + The embedded message header fields which appear in the body + of the message/external-body data can be used to declare the + Content-type of the external body. 
Thus a complete + message/external-body message, referring to a document in + PostScript format, might look like this: + + From: Whomever + Subject: whatever + MIME-Version: 1.0 + Message-ID: id1@host.com + Content-Type: multipart/alternative; boundary=42 + + + --42 + Content-Type: message/external-body; + name="BodyFormats.ps"; + site="thumper.bellcore.com"; + access-type=ANON-FTP; + directory="pub"; + mode="image"; + expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" + + Content-type: application/postscript + + --42 + Content-Type: message/external-body; + name="/u/nsb/writing/rfcs/RFC-XXXX.ps"; + site="thumper.bellcore.com"; + access-type=AFS; + expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" + + Content-type: application/postscript + + --42 + Content-Type: message/external-body; + access-type=mail-server; + server="listserv@bogus.bitnet"; + expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" + + Content-type: application/postscript + + get rfc-xxxx doc + + --42-- + + Like the message/partial type, the message/external-body + type is intended to be transparent, that is, to convey the + data type in the external body rather than to convey a + message with a body of that type. Thus the headers on the + outer and inner parts must be merged using the same rules as + for message/partial. In particular, this means that the + Content-type header is overridden, but the From and Subject + headers are preserved. + + Note that since the external bodies are not transported as + mail, they need not conform to the 7-bit and line length + requirements, but might in fact be binary files. Thus a + Content-Transfer-Encoding is not generally necessary, though + it is permitted. + + Note that the body of a message of type + "message/external-body" is governed by the basic syntax for an RFC 822 + message. 
In particular, anything before the first + consecutive pair of CRLFs is header information, while + anything after it is body information, which is ignored for + most access-types. + + + + + + + + Borenstein & Freed [Page 45] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + 7.4 The Application Content-Type + + The "application" Content-Type is to be used for data which + do not fit in any of the other categories, and particularly + for data to be processed by mail-based uses of application + programs. This is information which must be processed by an + application before it is viewable or usable to a user. + Expected uses for Content-Type application include mail- + based file transfer, spreadsheets, data for mail-based + scheduling systems, and languages for "active" + (computational) email. (The latter, in particular, can pose + security problems which should be understood by + implementors, and are considered in detail in the discussion + of the application/PostScript content-type.) + + For example, a meeting scheduler might define a standard + representation for information about proposed meeting dates. + An intelligent user agent would use this information to + conduct a dialog with the user, and might then send further + mail based on that dialog. More generally, there have been + several "active" messaging languages developed in which + programs in a suitably specialized language are sent through + the mail and automatically run in the recipient's + environment. + + Such applications may be defined as subtypes of the + "application" Content-Type. This document defines three + subtypes: octet-stream, ODA, and PostScript. + + In general, the subtype of application will often be the + name of the application for which the data are intended. + This does not mean, however, that any application program + name may be used freely as a subtype of application. Such + usages must be registered with IANA, as described in + Appendix F. 
+ + 7.4.1 The Application/Octet-Stream (primary) subtype + + The primary subtype of application, "octet-stream", may be + used to indicate that a body contains binary data. The set + of possible parameters includes, but is not limited to: + + NAME -- a suggested name for the binary data if + stored as a file. + + TYPE -- the general type or category of binary + data. This is intended as information for the + human recipient rather than for any automatic + processing. + + CONVERSIONS -- the set of operations that have + been performed on the data before putting it in + the mail (and before any Content-Transfer-Encoding + that might have been applied). If multiple + conversions have occurred, they must be separated + by commas and specified in the order they were + applied -- that is, the leftmost conversion must + have occurred first, and conversions are undone + from right to left. Note that NO conversion + values are defined by this document. Any + conversion values that do not begin with "X-" + must be preceded by a published specification and + by registration with IANA, as described in + Appendix F. + + PADDING -- the number of bits of padding that were + appended to the bitstream comprising the actual + contents to produce the enclosed byte-oriented + data. This is useful for enclosing a bitstream in + a body when the total number of bits is not a + multiple of the byte size. + + The values for these attributes are left undefined at + present, but may require specification in the future. An + example of a common (though UNIX-specific) usage might be: + + Content-Type: application/octet-stream; + name=foo.tar.Z; type=tar; + conversions="x-encrypt,x-compress" + + However, it should be noted that the use of such conversions + is explicitly discouraged due to a lack of portability and + standardization. 
The use of uuencode is particularly
discouraged, in favor of the Content-Transfer-Encoding
mechanism, which is both more standardized and more portable
across mail boundaries.

The recommended action for an implementation that receives
application/octet-stream mail is to simply offer to put the
data in a file, with any Content-Transfer-Encoding undone,
or perhaps to use it as input to a user-specified process.

To reduce the danger of transmitting rogue programs through
the mail, it is strongly recommended that implementations
NOT implement a path-search mechanism whereby an arbitrary
program named in the Content-Type parameter (e.g., an
"interpreter=" parameter) is found and executed using the
mail body as input.

7.4.2 The Application/PostScript subtype

A Content-Type of "application/postscript" indicates a
PostScript program. The language is defined in
[POSTSCRIPT]. It is recommended that PostScript sent
through email use the PostScript document structuring
conventions, correctly, wherever possible.

The execution of general-purpose PostScript interpreters
entails serious security risks, and implementors are
discouraged from simply sending PostScript email bodies to
"off-the-shelf" interpreters. While it is usually safe to
send PostScript to a printer, where the potential for harm
is greatly constrained, implementors should consider all of
the following before they add interactive display of
PostScript bodies to their mail readers.

The remainder of this section outlines some, though probably
not all, of the possible problems with sending PostScript
through the mail.

Dangerous operations in the PostScript language include, but
may not be limited to, the PostScript operators deletefile,
renamefile, filenameforall, and file.
File is only
dangerous when applied to something other than standard
input or output. Implementations may also define additional
nonstandard file operators; these may also pose a threat to
security. Filenameforall, the wildcard file search
operator, may appear at first glance to be harmless. Note,
however, that this operator has the potential to reveal
information about what files the recipient has access to,
and this information may itself be sensitive. Message
senders should avoid the use of potentially dangerous file
operators, since these operators are quite likely to be
unavailable in secure PostScript implementations.
Message-receiving and -displaying software should either
completely disable all potentially dangerous file operators
or take special care not to delegate any special authority
to their operation. These operators should be viewed as
being done by an outside agency when interpreting PostScript
documents. Such disabling and/or checking should be done
completely outside of the reach of the PostScript language
itself; care should be taken to ensure that no method exists
for reenabling full-function versions of these operators.

The PostScript language provides facilities for exiting the
normal interpreter, or server, loop. Changes made in this
"outer" environment are customarily retained across
documents, and may in some cases be retained semipermanently
in nonvolatile memory. The operators associated with exiting
the interpreter loop have the potential to interfere with
subsequent document processing. As such, their unrestrained
use constitutes a threat of service denial. PostScript
operators that exit the interpreter loop include, but may
not be limited to, the exitserver and startjob operators.
Message-sending software should not generate PostScript that
depends on exiting the interpreter loop to operate.
The ability to exit will probably be unavailable in secure
PostScript implementations. Message-receiving and
-displaying software should, if possible, disable the
ability to make retained changes to the PostScript
environment. Eliminate the startjob and exitserver commands.
If these commands cannot be eliminated, at least set the
password associated with them to a hard-to-guess value.

PostScript provides operators for setting system-wide and
device-specific parameters. These parameter settings may be
retained across jobs and may potentially pose a threat to
the correct operation of the interpreter. The PostScript
operators that set system and device parameters include, but
may not be limited to, the setsystemparams and setdevparams
operators. Message-sending software should not generate
PostScript that depends on the setting of system or device
parameters to operate correctly. The ability to set these
parameters will probably be unavailable in secure PostScript
implementations. Message-receiving and -displaying software
should, if possible, disable the ability to change system
and device parameters. If these operators cannot be
disabled, at least set the password associated with them to
a hard-to-guess value.

Some PostScript implementations provide nonstandard
facilities for the direct loading and execution of machine
code. Such facilities are quite obviously open to
substantial abuse. Message-sending software should not
make use of such features. Besides being totally
hardware-specific, they are also likely to be unavailable in
secure implementations of PostScript. Message-receiving and
-displaying software should not allow such operators to be
used if they exist.
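
The operators named so far in this section (deletefile, renamefile,
filenameforall, file, exitserver, startjob, setsystemparams and
setdevparams) can be collected into a simple pre-check. The C sketch
below is only an illustration: the function name find_risky_operator
and the delimiter set are assumptions, not part of this document or
of any PostScript implementation. It reports the first listed
operator that appears as a whole token in a PostScript body.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Operators this section names as potentially dangerous. */
static const char *const risky_ops[] = {
    "deletefile", "renamefile", "filenameforall", "file",
    "exitserver", "startjob", "setsystemparams", "setdevparams"
};

/* Assumed token delimiters: NUL, whitespace, and a few PostScript
 * special characters. This is a simplification, not a real scanner. */
static int ends_token(int c)
{
    return c == '\0' || isspace(c) || strchr("()<>[]{}/%", c) != NULL;
}

/* Hypothetical helper: returns the first risky operator found as a
 * whole token in `ps`, or NULL if none of them appears. */
const char *find_risky_operator(const char *ps)
{
    const size_t nops = sizeof risky_ops / sizeof risky_ops[0];
    const char *p = ps;

    while (*p != '\0') {
        /* Skip delimiters to reach the start of the next token. */
        while (*p != '\0' && ends_token((unsigned char)*p))
            p++;

        /* Advance to the end of the token. */
        const char *start = p;
        while (!ends_token((unsigned char)*p))
            p++;

        /* Compare the whole token against each listed operator. */
        size_t len = (size_t)(p - start);
        for (size_t i = 0; i < nops; i++) {
            if (len > 0 && strlen(risky_ops[i]) == len &&
                strncmp(start, risky_ops[i], len) == 0)
                return risky_ops[i];
        }
    }
    return NULL;
}
```

A scan like this is at most a warning aid. It also matches operator
names inside strings and comments, and it cannot see names that a
program builds at run time, which is why the text asks that disabling
be done outside the reach of the PostScript language itself.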

PostScript is an extensible language, and many, if not most,
implementations of it provide a number of their own
extensions. This document does not deal with such extensions
explicitly since they constitute an unknown factor.
Message-sending software should not make use of nonstandard
extensions; they are likely to be missing from some
implementations. Message-receiving and -displaying software
should make sure that any nonstandard PostScript operators
are secure and don't present any kind of threat.

It is possible to write PostScript that consumes huge
amounts of various system resources. It is also possible to
write PostScript programs that loop infinitely. Both types
of programs have the potential to cause damage if sent to
unsuspecting recipients. Message-sending software should
avoid the construction and dissemination of such programs,
which is antisocial. Message-receiving and -displaying
software should provide appropriate mechanisms to abort
processing of a document after a reasonable amount of time
has elapsed. In addition, PostScript interpreters should be
limited to the consumption of only a reasonable amount of
any given system resource.

Finally, bugs may exist in some PostScript interpreters
which could possibly be exploited to gain unauthorized
access to a recipient's system. Beyond noting this
possibility, there is no specific action to take to prevent
this, apart from the timely correction of such bugs if any
are found.

7.4.3 The Application/ODA subtype

The "ODA" subtype of application is used to indicate that a
body contains information encoded according to the Office
Document Architecture [ODA] standards, using the ODIF
representation format.
For application/oda, the Content-Type
line should also specify an attribute/value pair that
indicates the document application profile (DAP), using the
key word "profile". Thus an appropriate header field might
look like this:

     Content-Type: application/oda; profile=Q112

Consult the ODA standard [ODA] for further information.

7.5 The Image Content-Type

A Content-Type of "image" indicates that the body contains an
image. The subtype names the specific image format. These
names are case insensitive. Two initial subtypes are "jpeg"
for the JPEG format, JFIF encoding, and "gif" for GIF format
[GIF].

The list of image subtypes given here is neither exclusive
nor exhaustive, and is expected to grow as more types are
registered with IANA, as described in Appendix F.

7.6 The Audio Content-Type

A Content-Type of "audio" indicates that the body contains
audio data. Although there is not yet a consensus on an
"ideal" audio format for use with computers, there is a
pressing need for a format capable of providing
interoperable behavior.

The initial subtype of "basic" is specified to meet this
requirement by providing an absolutely minimal lowest common
denominator audio format. It is expected that richer
formats for higher quality and/or lower bandwidth audio will
be defined by a later document.

The content of the "audio/basic" subtype is audio encoded
using 8-bit ISDN u-law [PCM]. When this subtype is present,
a sample rate of 8000 Hz and a single channel is assumed.

7.7 The Video Content-Type

A Content-Type of "video" indicates that the body contains a
time-varying-picture image, possibly with color and
coordinated sound.
The term "video" is used extremely
generically, rather than with reference to any particular
technology or format, and is not meant to preclude subtypes
such as animated drawings encoded compactly. The subtype
"mpeg" refers to video coded according to the MPEG standard
[MPEG].

Note that although in general this document strongly
discourages the mixing of multiple media in a single body,
it is recognized that many so-called "video" formats include
a representation for synchronized audio, and this is
explicitly permitted for subtypes of "video".

7.8 Experimental Content-Type Values

A Content-Type value beginning with the characters "X-" is a
private value, to be used by consenting mail systems by
mutual agreement. Any format without a rigorous and public
definition must be named with an "X-" prefix, and publicly
specified values shall never begin with "X-". (Older
versions of the widely-used Andrew system use the "X-BE2"
name, so new systems should probably choose a different
name.)

In general, the use of "X-" top-level types is strongly
discouraged. Implementors should invent subtypes of the
existing types whenever possible. The invention of new
types is intended to be restricted primarily to the
development of new media types for email, such as digital
odors or holography, and not for new data formats in
general. In many cases, a subtype of application will be
more appropriate than a new top-level type.
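
The "X-" naming rule is mechanical enough to check in code. The C
sketch below is a minimal illustration; the function name
is_experimental_token is hypothetical and not defined by this
document. Accepting a lowercase "x-" as well is an assumption, made
because type names are matched case-insensitively.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `name` begins with "X-" (or "x-",
 * an assumption) and has at least one character after the prefix,
 * and 0 otherwise. */
int is_experimental_token(const char *name)
{
    if (name == NULL)
        return 0;
    /* An empty string fails here, so name[1] is never read past
     * the terminator. */
    if (toupper((unsigned char)name[0]) != 'X')
        return 0;
    if (name[1] != '-')
        return 0;
    /* The prefix must be followed by a non-empty token. */
    return name[2] != '\0';
}
```

A mail reader could use such a check to decide whether an
unrecognized type is a private value agreed between systems or simply
an unknown name. The sketch does not verify that the characters
after the prefix form a valid token.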

Summary

Using the MIME-Version, Content-Type, and
Content-Transfer-Encoding header fields, it is possible to
include, in a standardized way, arbitrary types of data
objects with RFC 822 conformant mail messages. No
restrictions imposed by either RFC 821 or RFC 822 are
violated, and care has been taken to avoid problems caused
by additional restrictions imposed by the characteristics of
some Internet mail transport mechanisms (see Appendix B).
The "multipart" and "message" Content-Types allow mixing and
hierarchical structuring of objects of different types in a
single message. Further Content-Types provide a standardized
mechanism for tagging messages or body parts as audio,
image, or several other kinds of data. A distinguished
parameter syntax allows further specification of data format
details, particularly the specification of alternate
character sets. Additional optional header fields provide
mechanisms for certain extensions deemed desirable by many
implementors. Finally, a number of useful Content-Types are
defined for general use by consenting user agents, notably
text/richtext, message/partial, and message/external-body.

Acknowledgements

This document is the result of the collective effort of a
large number of people, at several IETF meetings, on the
IETF-SMTP and IETF-822 mailing lists, and elsewhere.
Although any enumeration seems doomed to suffer from
egregious omissions, the following are among the many
contributors to this effort:

     Harald Tveit Alvestrand       Timo Lehtinen
     Randall Atkinson              John R. MacMillan
     Philippe Brandon              Rick McGowan
     Kevin Carosso                 Leo Mclaughlin
     Uhhyung Choi                  Goli Montaser-Kohsari
     Cristian Constantinof         Keith Moore
     Mark Crispin                  Tom Moore
     Dave Crocker                  Erik Naggum
     Terry Crowley                 Mark Needleman
     Walt Daniels                  John Noerenberg
     Frank Dawson                  Mats Ohrman
     Hitoshi Doi                   Julian Onions
     Kevin Donnelly                Michael Patton
     Keith Edwards                 David J. Pepper
     Chris Eich                    Blake C. Ramsdell
     Johnny Eriksson               Luc Rooijakkers
     Craig Everhart                Marshall T. Rose
     Patrik Faeltstroem            Jonathan Rosenberg
     Erik E. Fair                  Jan Rynning
     Roger Fajman                  Harri Salminen
     Alain Fontaine                Michael Sanderson
     James M. Galvin               Masahiro Sekiguchi
     Philip Gladstone              Mark Sherman
     Thomas Gordon                 Keld Simonsen
     Phill Gross                   Bob Smart
     James Hamilton                Peter Speck
     Steve Hardcastle-Kille        Henry Spencer
     David Herron                  Einar Stefferud
     Bruce Howard                  Michael Stein
     Bill Janssen                  Klaus Steinberger
     Olle Jaernefors               Peter Svanberg
     Risto Kankkunen               James Thompson
     Phil Karn                     Steve Uhler
     Alan Katz                     Stuart Vance
     Tim Kehres                    Erik van der Poel
     Neil Katin                    Guido van Rossum
     Kyuho Kim                     Peter Vanderbilt
     Anders Klemets                Greg Vaudreuil
     John Klensin                  Ed Vielmetti
     Valdis Kletniek               Ryan Waldron
     Jim Knowles                   Wally Wedel
     Stev Knowles                  Sven-Ove Westberg
     Bob Kummerfeld                Brian Wideen
     Pekka Kytolaakso              John Wobus
     Stellan Lagerstr.m            Glenn Wright
     Vincent Lau                   Rayan Zachariassen
     Donald Lindsay                David Zimmerman

The authors apologize for any omissions from this list,
which are certainly unintentional.

Appendix A -- Minimal MIME-Conformance

The mechanisms described in this document are open-ended.
It is definitely not expected that all implementations will
support all of the Content-Types described, nor that they
will all share the same extensions. In order to promote
interoperability, however, it is useful to define the
concept of "MIME-conformance" to define a certain level of
implementation that allows the useful interworking of
messages with content that differs from US ASCII text. In
this section, we specify the requirements for such
conformance.

A mail user agent that is MIME-conformant MUST:

     1. Always generate a "MIME-Version: 1.0" header
     field.

     2. Recognize the Content-Transfer-Encoding header
     field, and decode all received data encoded with
     either the quoted-printable or base64
     implementations. Encode any data sent that is
     not in seven-bit mail-ready representation using
     one of these transformations and include the
     appropriate Content-Transfer-Encoding header
     field, unless the underlying transport mechanism
     supports non-seven-bit data, as SMTP does not.

     3. Recognize and interpret the Content-Type
     header field, and avoid showing users raw data
     with a Content-Type field other than text. Be
     able to send at least text/plain messages, with
     the character set specified as a parameter if it
     is not US-ASCII.

     4. Explicitly handle the following Content-Type
     values, to at least the following extents:

     Text:
          -- Recognize and display "text" mail
               with the character set "US-ASCII."
          -- Recognize other character sets at
               least to the extent of being able
               to inform the user about what
               character set the message uses.
          -- Recognize the "ISO-8859-*" character
               sets to the extent of being able to
               display those characters that are
               common to ISO-8859-* and US-ASCII,
               namely all characters represented
               by octet values 0-127.
          -- For unrecognized subtypes, show or
               offer to show the user the "raw"
               version of the data. An ability at
               least to convert "text/richtext" to
               plain text, as shown in Appendix D,
               is encouraged, but not required for
               conformance.
     Message:
          -- Recognize and display at least the
               primary (822) encapsulation.
     Multipart:
          -- Recognize the primary (mixed)
               subtype. Display all relevant
               information on the message level
               and the body part header level and
               then display or offer to display
               each of the body parts
               individually.
          -- Recognize the "alternative" subtype,
               and avoid showing the user
               redundant parts of
               multipart/alternative mail.
          -- Treat any unrecognized subtypes as if
               they were "mixed".
     Application:
          -- Offer the ability to remove either of
               the two types of
               Content-Transfer-Encoding defined in
               this document and put the resulting
               information in a user file.

     5. Upon encountering any unrecognized
     Content-Type, an implementation must treat it as
     if it had a Content-Type of
     "application/octet-stream" with no parameter
     sub-arguments. How such data are handled is up to
     an implementation, but likely options for
     handling such unrecognized data include offering
     the user to write it into a file (decoded from
     its mail transport format) or offering the user
     to name a program to which the decoded data
     should be passed as input. Unrecognized
     predefined types, which in a MIME-conformant
     mailer might still include audio, image, or
     video, should also be treated in this way.

A user agent that meets the above conditions is said to be
MIME-conformant. The meaning of this phrase is that it is
assumed to be "safe" to send virtually any kind of
properly-marked data to users of such mail systems, because
such systems will at least be able to treat the data as
undifferentiated binary, and will not simply splash it onto
the screen of unsuspecting users.
There is another sense
in which it is always "safe" to send data in a format that
is MIME-conformant, which is that such data will not break
or be broken by any known systems that are conformant with
RFC 821 and RFC 822. User agents that are MIME-conformant
have the additional guarantee that the user will not be
shown data that were never intended to be viewed as text.

Appendix B -- General Guidelines For Sending Email Data

Internet email is not a perfect, homogeneous system. Mail
may become corrupted at several stages in its travel to a
final destination. Specifically, email sent throughout the
Internet may travel across many networking technologies.
Many networking and mail technologies do not support the
full functionality possible in the SMTP transport
environment. Mail traversing these systems is likely to be
modified in such a way that it can be transported.

There exist many widely-deployed non-conformant MTAs in the
Internet. These MTAs, speaking the SMTP protocol, alter
messages on the fly to take advantage of the internal data
structure of the hosts they are implemented on, or are just
plain broken.

The following guidelines may be useful to anyone devising a
data format (Content-Type) that will survive the widest
range of networking technologies and known broken MTAs
unscathed. Note that anything encoded in the base64
encoding will satisfy these rules, but that some well-known
mechanisms, notably the UNIX uuencode facility, will not.
Note also that anything encoded in the Quoted-Printable
encoding will survive most gateways intact, but possibly not
some gateways to systems that use the EBCDIC character set.

     (1) Under some circumstances the encoding used for
     data may change as part of normal gateway or user
     agent operation. In particular, conversion from
     base64 to quoted-printable and vice versa may be
     necessary. This may result in the confusion of
     CRLF sequences with line breaks in text body
     parts. As such, the persistence of CRLF as
     something other than a line break should not be
     relied on.

     (2) Many systems may elect to represent and store
     text data using local newline conventions. Local
     newline conventions may not match the RFC822 CRLF
     convention -- systems are known that use plain CR,
     plain LF, CRLF, or counted records. The result is
     that isolated CR and LF characters are not well
     tolerated in general; they may be lost or
     converted to delimiters on some systems, and hence
     should not be relied on.

     (3) TAB (HT) characters may be misinterpreted or
     may be automatically converted to variable numbers
     of spaces. This is unavoidable in some
     environments, notably those not based on the ASCII
     character set. Such conversion is STRONGLY
     DISCOURAGED, but it may occur, and mail formats
     should not rely on the persistence of TAB (HT)
     characters.

     (4) Lines longer than 76 characters may be wrapped
     or truncated in some environments. Line wrapping
     and line truncation are STRONGLY DISCOURAGED, but
     unavoidable in some cases. Applications which
     require long lines should somehow differentiate
     between soft and hard line breaks. (A simple way
     to do this is to use the quoted-printable
     encoding.)

     (5) Trailing "white space" characters (SPACE, TAB
     (HT)) on a line may be discarded by some transport
     agents, while other transport agents may pad lines
     with these characters so that all lines in a mail
     file are of equal length. The persistence of
     trailing white space, therefore, should not be
     relied on.

     (6) Many mail domains use variations on the ASCII
     character set, or use character sets such as
     EBCDIC which contain most but not all of the
     US-ASCII characters. The correct translation of
     characters not in the "invariant" set cannot be
     depended on across character converting gateways.
     For example, this situation is a problem when
     sending uuencoded information across BITNET, an
     EBCDIC system. Similar problems can occur without
     crossing a gateway, since many Internet hosts use
     character sets other than ASCII internally. The
     definition of Printable Strings in X.400 adds
     further restrictions in certain special cases. In
     particular, the only characters that are known to
     be consistent across all gateways are the 73
     characters that correspond to the upper and lower
     case letters A-Z and a-z, the 10 digits 0-9, and
     the following eleven special characters:

          "'" (ASCII code 39)
          "(" (ASCII code 40)
          ")" (ASCII code 41)
          "+" (ASCII code 43)
          "," (ASCII code 44)
          "-" (ASCII code 45)
          "." (ASCII code 46)
          "/" (ASCII code 47)
          ":" (ASCII code 58)
          "=" (ASCII code 61)
          "?" (ASCII code 63)

     A maximally portable mail representation, such as
     the base64 encoding, will confine itself to
     relatively short lines of text in which the only
     meaningful characters are taken from this set of
     73 characters.

Please note that the above list is NOT a list of recommended
practices for MTAs. RFC 821 MTAs are prohibited from
altering the character of white space or wrapping long
lines.
These BAD and illegal practices are known to occur
on established networks, and implementations should be robust
in dealing with the bad effects they can cause.

Appendix C -- A Complex Multipart Example

What follows is the outline of a complex multipart message.
This message has five parts to be displayed serially: two
introductory plain text parts, an embedded multipart
message, a richtext part, and a closing encapsulated text
message in a non-ASCII character set. The embedded
multipart message has two parts to be displayed in parallel,
a picture and an audio fragment.

     MIME-Version: 1.0
     From: Nathaniel Borenstein <nsb@bellcore.com>
     Subject: A multipart example
     Content-Type: multipart/mixed;
          boundary=unique-boundary-1

     This is the preamble area of a multipart message.
     Mail readers that understand multipart format
     should ignore this preamble.
     If you are reading this text, you might want to
     consider changing to a mail reader that understands
     how to properly display multipart messages.
     --unique-boundary-1

     ...Some text appears here...
     [Note that the preceding blank line means
     no header fields were given and this is text,
     with charset US ASCII. It could have been
     done with explicit typing as in the next part.]

     --unique-boundary-1
     Content-type: text/plain; charset=US-ASCII

     This could have been part of the previous part,
     but illustrates explicit versus implicit
     typing of body parts.

     --unique-boundary-1
     Content-Type: multipart/parallel;
          boundary=unique-boundary-2


     --unique-boundary-2
     Content-Type: audio/basic
     Content-Transfer-Encoding: base64

     ... base64-encoded 8000 Hz single-channel
     u-law-format audio data goes here....

     --unique-boundary-2
     Content-Type: image/gif
     Content-Transfer-Encoding: Base64

     ... base64-encoded image data goes here....

     --unique-boundary-2--

     --unique-boundary-1
     Content-type: text/richtext

     This is <bold><italic>richtext.</italic></bold>
     <nl><nl>Isn't it
     <bigger><bigger>cool?</bigger></bigger>

     --unique-boundary-1
     Content-Type: message/rfc822

     From: (name in US-ASCII)
     Subject: (subject in US-ASCII)
     Content-Type: Text/plain; charset=ISO-8859-1
     Content-Transfer-Encoding: Quoted-printable

     ... Additional text in ISO-8859-1 goes here ...

     --unique-boundary-1--

Appendix D -- A Simple Richtext-to-Text Translator in C

One of the major goals in the design of the richtext subtype
of the text Content-Type is to make formatted text so simple
that even text-only mailers will implement
richtext-to-plain-text translators, thus increasing the
likelihood that multifont text will become "safe" to use
very widely. To demonstrate this simplicity, what follows is
an extremely simple C program that converts richtext input
into plain text output:

     #include <stdio.h>
     #include <ctype.h>
     #include <string.h>

     int main(void) {
         int c, i;
         char token[50];

         while ((c = getc(stdin)) != EOF) {
             if (c == '<') {
                 /* Read the command name, lowercased; keep
                    at most 49 characters of it. */
                 for (i = 0; (i < 49 && (c = getc(stdin)) != '>'
                       && c != EOF); ++i) {
                     token[i] = isupper(c) ? tolower(c) : c;
                 }
                 if (c == EOF) break;
                 /* Skip the rest of an over-long command. */
                 if (c != '>') while ((c = getc(stdin)) != '>'
                       && c != EOF) {;}
                 if (c == EOF) break;
                 token[i] = '\0';
                 if (!strcmp(token, "lt")) {
                     putc('<', stdout);
                 } else if (!strcmp(token, "nl")) {
                     putc('\n', stdout);
                 } else if (!strcmp(token, "/paragraph")) {
                     fputs("\n\n", stdout);
                 } else if (!strcmp(token, "comment")) {
                     /* Discard everything up to the matching
                        </comment>; comments may nest. */
                     int commct = 1;
                     while (commct > 0) {
                         while ((c = getc(stdin)) != '<'
                               && c != EOF) ;
                         if (c == EOF) break;
                         for (i = 0; (i < 49
                               && (c = getc(stdin)) != '>'
                               && c != EOF); ++i) {
                             token[i] = isupper(c) ?
                                  tolower(c) : c;
                         }
                         if (c == EOF) break;
                         if (c != '>') while ((c = getc(stdin))
                               != '>' && c != EOF) {;}
                         if (c == EOF) break;
                         token[i] = '\0';
                         if (!strcmp(token, "/comment"))
                             --commct;
                         if (!strcmp(token, "comment"))
                             ++commct;
                     }
                 } /* Ignore all other tokens */
             } else if (c != '\n') putc(c, stdout);
         }
         putc('\n', stdout); /* for good measure */
         return 0;
     }

It should be noted that one can do considerably better than
this in displaying richtext data on a dumb terminal. In
particular, one can replace font information such as "bold"
with textual emphasis (like *this* or _T_H_I_S_). One can
also properly handle the richtext formatting commands
regarding indentation, justification, and others. However,
the above program is all that is necessary in order to
present richtext on a dumb terminal.

Appendix E -- Collected Grammar

This appendix contains the complete BNF grammar for all the
syntax specified by this document.

By itself, however, this grammar is incomplete. It refers
to several entities that are defined by RFC 822.
Rather
than reproduce those definitions here, and risk
unintentional differences between the two, this document
simply refers the reader to RFC 822 for the remaining
definitions. Wherever a term is undefined, it refers to the
RFC 822 definition.

attribute := token

body-part := <"message" as defined in RFC 822,
          with all header fields optional, and with the
          specified delimiter not occurring anywhere in
          the message body, either on a line by itself
          or as a substring anywhere.>

boundary := 0*69<bchars> bcharsnospace

bchars := bcharsnospace / " "

bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
               / "," / "-" / "." / "/" / ":" / "=" / "?"

close-delimiter := delimiter "--"

Content-Description := *text

Content-ID := msg-id

Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" /
                             "8BIT"   / "7BIT" /
                             "BINARY" / x-token

Content-Type := type "/" subtype *[";" parameter]

delimiter := CRLF "--" boundary   ; taken from Content-Type
                                  ; field when content-type
                                  ; is multipart.
                                  ; There should be no space
                                  ; between "--" and boundary.

encapsulation := delimiter CRLF body-part

epilogue := *text                 ; to be ignored upon receipt.

MIME-Version := 1*text

multipart-body := preamble 1*encapsulation close-delimiter
                  epilogue

parameter := attribute "=" value

preamble := *text                 ; to be ignored upon receipt.

subtype := token

token := 1*<any CHAR except SPACE, CTLs, or tspecials>

tspecials := "(" / ")" / "<" / ">" / "@"  ; Must be in
           / "," / ";" / ":" / "\" / <">  ; quoted-string,
           / "/" / "[" / "]" / "?" / "."  ; to use within
           / "="                          ; parameter values

type := "application" / "audio"   ; case-insensitive
      / "image" / "message"
      / "multipart" / "text"
      / "video" / x-token

value := token / quoted-string

x-token := <The two characters "X-" followed, with no
           intervening white space, by any token>

Appendix F -- IANA Registration Procedures

MIME has been carefully designed to have extensible
mechanisms, and it is expected that the set of
content-type/subtype pairs and their associated parameters
will grow significantly with time. Several other MIME
fields, notably character set names, access-type parameters
for the message/external-body type, conversions parameters
for the application type, and possibly even
Content-Transfer-Encoding values, are likely to have new
values defined over time. In order to ensure that the set of
such values is developed in an orderly, well-specified, and
public manner, MIME defines a registration process which
uses the Internet Assigned Numbers Authority (IANA) as a
central registry for such values.

In general, parameters in the content-type header field are
used to convey supplemental information for various content
types, and their use is defined when the content-type and
subtype are defined. New parameters should not be defined
as a way to introduce new functionality.

In order to simplify and standardize the registration
process, this appendix gives templates for the registration
of new values with IANA. Each of these is given in the form
of an email message template, to be filled in by the
registering party.

F.1 Registration of New Content-type/subtype Values

Note that MIME is generally expected to be extended by
subtypes.
If a new fundamental top-level type is needed, + its specification should be published as an RFC or + submitted in a form suitable to become an RFC, and be + subject to the Internet standards process. + + To: IANA@isi.edu + Subject: Registration of new MIME content-type/subtype + + MIME type name: + + (If the above is not an existing top-level MIME type, + please explain why an existing type cannot be used.) + + MIME subtype name: + + Required parameters: + + Optional parameters: + + Encoding considerations: + + Security considerations: + + + + + Borenstein & Freed [Page 68] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + Published specification: + + (The published specification must be an Internet RFC or + RFC-to-be if a new top-level type is being defined, and + must be a publicly available specification in any + case.) + + Person & email address to contact for further + information: + F.2 Registration of New Character Set Values + + To: IANA@isi.edu + Subject: Registration of new MIME character set value + + MIME character set name: + + Published specification: + + (The published specification must be an Internet RFC or + RFC-to-be or an international standard.) + + Person & email address to contact for further + information: + + F.3 Registration of New Access-type Values for + Message/external-body + + To: IANA@isi.edu + Subject: Registration of new MIME Access-type for + Message/external-body content-type + + MIME access-type name: + + Required parameters: + + Optional parameters: + + Published specification: + + (The published specification must be an Internet RFC or + RFC-to-be.) 
+ + Person & email address to contact for further + information: + + + F.4 Registration of New Conversions Values for Application + + To: IANA@isi.edu + Subject: Registration of new MIME Conversions value + for Application content-type + + MIME Conversions name: + + + + + Borenstein & Freed [Page 69] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + Published specification: + + (The published specification must be an Internet RFC or + RFC-to-be.) + + Person & email address to contact for further + information: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Borenstein & Freed [Page 70] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + Appendix G -- Summary of the Seven Content-types + + Content-type: text + + Subtypes defined by this document: plain, richtext + + Important Parameters: charset + + Encoding notes: quoted-printable generally preferred if an + encoding is needed and the character set is mostly an + ASCII superset. + + Security considerations: Rich text formats such as TeX and + Troff often contain mechanisms for executing arbitrary + commands or file system operations, and should not be + used automatically unless these security problems have + been addressed. Even plain text may contain control + characters that can be used to exploit the capabilities + of "intelligent" terminals and cause security + violations. User interfaces designed to run on such + terminals should be aware of and try to prevent such + problems. + ________________________________________________________________ + + Content-type: multipart + + Subtypes defined by this document: mixed, alternative, + digest, parallel. + + Important Parameters: boundary + + Encoding notes: No content-transfer-encoding is permitted. 
+ + ________________________________________________________________ + + Content-type: message + + Subtypes defined by this document: rfc822, partial, + external-body + + Important Parameters: id, number, total + + Encoding notes: No content-transfer-encoding is permitted. + + ________________________________________________________________ + + Content-type: application + + Subtypes defined by this document: octet-stream, + postscript, oda + + Important Parameters: profile + + + + + + Borenstein & Freed [Page 71] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + Encoding notes: base64 generally preferred for octet-stream + or other unreadable subtypes. + + Security considerations: This type is intended for the + transmission of data to be interpreted by locally-installed + programs. If used, for example, to transmit executable + binary programs or programs in general-purpose interpreted + languages, such as LISP programs or shell scripts, severe + security problems could result. In general, authors of + mail-reading agents are cautioned against giving their + systems the power to execute mail-based application data + without carefully considering the security implications. + While it is certainly possible to define safe application + formats and even safe interpreters for unsafe formats, each + interpreter should be evaluated separately for possible + security problems. 
+ ________________________________________________________________ + + Content-type: image + + Subtypes defined by this document: jpeg, gif + + Important Parameters: none + + Encoding notes: base64 generally preferred + + ________________________________________________________________ + + Content-type: audio + + Subtypes defined by this document: basic + + Important Parameters: none + + Encoding notes: base64 generally preferred + + ________________________________________________________________ + + Content-type: video + + Subtypes defined by this document: mpeg + + Important Parameters: none + + Encoding notes: base64 generally preferred + + + + + + + + + + + + + Borenstein & Freed [Page 72] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + Appendix H -- Canonical Encoding Model + + + + There was some confusion, in earlier drafts of this memo, + regarding the model for when email data was to be converted + to canonical form and encoded, and in particular how this + process would affect the treatment of CRLFs, given that the + representation of newlines varies greatly from system to + system. For this reason, a canonical model for encoding is + presented below. + + The process of composing a MIME message part can be modelled + as being done in a number of steps. Note that these steps + are roughly similar to those steps used in RFC1113: + + Step 1. Creation of local form. + + The body part to be transmitted is created in the system's + native format. The native character set is used, and where + appropriate local end of line conventions are used as well. + This may be a UNIX-style text file, or a Sun raster image, or + a VMS indexed file, or audio data in a system-dependent + format stored only in memory, or anything else that + corresponds to the local model for the representation of + some form of information. + + Step 2. Conversion to canonical form. 
+ + The entire body part, including "out-of-band" information + such as record lengths and possibly file attribute + information, is converted to a universal canonical form. + The specific content type of the body part as well as its + associated attributes dictate the nature of the canonical + form that is used. Conversion to the proper canonical form + may involve character set conversion, transformation of + audio data, compression, or various other operations + specific to the various content types. + + For example, in the case of text/plain data, the text must + be converted to a supported character set and lines must be + delimited with CRLF delimiters in accordance with RFC822. + Note that the restriction on line lengths implied by RFC822 + is eliminated if the next step employs either quoted- + printable or base64 encoding. + + Step 3. Apply transfer encoding. + + A Content-Transfer-Encoding appropriate for this body part + is applied. Note that there is no fixed relationship + between the content type and the transfer encoding. In + particular, it may be appropriate to base the choice of + base64 or quoted-printable on character frequency counts + which are specific to a given instance of body part. + + + + Borenstein & Freed [Page 73] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + Step 4. Insertion into message. + + The encoded object is inserted into a MIME message with + appropriate body part headers and boundary markers. + + It is vital to note that these steps are only a model; they + are specifically NOT a blueprint for how an actual system + would be built. In particular, the model fails to account + for two common designs: + + 1. In many cases the conversion to a canonical + form prior to encoding will be subsumed into the + encoder itself, which understands local formats + directly. 
For example, the local newline + convention for text bodyparts might be carried + through to the encoder itself along with knowledge + of what that format is. + + 2. The output of the encoders may have to pass + through one or more additional steps prior to + being transmitted as a message. As such, the + output of the encoder may not be compliant with + the formats specified by RFC822. In particular, + once again it may be appropriate for the + converter's output to be expressed using local + newline conventions rather than using the standard + RFC822 CRLF delimiters. + + Other implementation variations are conceivable as well. + The only important aspect of this discussion is that the + resulting messages are consistent with those produced by the + model described here. + + + + + + + + + + + + + + + + + + + + + + + + + + Borenstein & Freed [Page 74] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + References + + [US-ASCII] Coded Character Set--7-Bit American Standard Code + for Information Interchange, ANSI X3.4-1986. + + [ATK] Borenstein, Nathaniel S., Multimedia Applications + Development with the Andrew Toolkit, Prentice-Hall, 1990. + + [GIF] Graphics Interchange Format (Version 89a), Compuserve, + Inc., Columbus, Ohio, 1990. + + [ISO-2022] International Standard--Information Processing-- + ISO 7-bit and 8-bit coded character sets--Code extension + techniques, ISO 2022:1986. + + [ISO-8859] Information Processing -- 8-bit Single-Byte Coded + Graphic Character Sets -- Part 1: Latin Alphabet No. 1, ISO + 8859-1:1987. Part 2: Latin alphabet No. 2, ISO 8859-2, + 1987. Part 3: Latin alphabet No. 3, ISO 8859-3, 1988. Part + 4: Latin alphabet No. 4, ISO 8859-4, 1988. Part 5: + Latin/Cyrillic alphabet, ISO 8859-5, 1988. Part 6: + Latin/Arabic alphabet, ISO 8859-6, 1987. Part 7: + Latin/Greek alphabet, ISO 8859-7, 1987. Part 8: + Latin/Hebrew alphabet, ISO 8859-8, 1988. Part 9: Latin + alphabet No. 5, ISO 8859-9, 1990. 
+ + [ISO-646] International Standard--Information Processing-- + ISO 7-bit coded character set for information interchange, + ISO 646:1983. + + [MPEG] Video Coding Draft Standard ISO 11172 CD, ISO + IEC/TJC1/SC2/WG11 (Motion Picture Experts Group), May, 1991. + + [ODA] ISO 8613; Information Processing: Text and Office + System; Office Document Architecture (ODA) and Interchange + Format (ODIF), Part 1-8, 1989. + + [PCM] CCITT, Fascicle III.4 - Recommendation G.711, Geneva, + 1972, "Pulse Code Modulation (PCM) of Voice Frequencies". + + [POSTSCRIPT] Adobe Systems, Inc., PostScript Language + Reference Manual, Addison-Wesley, 1985. + + [X400] Schicker, Pietro, "Message Handling Systems, X.400", + Message Handling Systems and Distributed Applications, E. + Stefferud, O-j. Jacobsen, and P. Schicker, eds., North- + Holland, 1989, pp. 3-41. + + [RFC-783] Sollins, K.R. TFTP Protocol (revision 2). June, + 1981, MIT, RFC-783. + + [RFC-821] Postel, J.B. Simple Mail Transfer Protocol. + August, 1982, USC/Information Sciences Institute, RFC-821. + + + + + Borenstein & Freed [Page 75] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + [RFC-822] Crocker, D. Standard for the format of ARPA + Internet text messages. August, 1982, UDEL, RFC-822. + + [RFC-934] Rose, M.T.; Stefferud, E.A. Proposed standard + for message encapsulation. January, 1985, Delaware + and NMA, RFC-934. + + [RFC-959] Postel, J.B.; Reynolds, J.K. File Transfer + Protocol. October, 1985, USC/Information Sciences + Institute, RFC-959. + + [RFC-1049] Sirbu, M.A. Content-Type header field for + Internet messages. March, 1988, CMU, RFC-1049. + + [RFC-1113] Linn, J. Privacy enhancement for Internet + electronic mail: Part I - message encipherment and + authentication procedures. August, 1989, IAB Privacy Task + Force, RFC-1113. + + [RFC-1154] Robinson, D.; Ullmann, R. Encoding header field + for Internet messages. April, 1990, Prime Computer, + Inc., RFC-1154. 
+ + [RFC-1342] Moore, Keith, Representation of Non-Ascii Text in + Internet Message Headers. June, 1992, University of + Tennessee, RFC-1342. + + Security Considerations + + Security issues are discussed in Section 7.4.2 and in + Appendix G. Implementors should pay special attention to + the security implications of any mail content-types that can + cause the remote execution of any actions in the recipient's + environment. In such cases, the discussion of the + application/postscript content-type in Section 7.4.2 may + serve as a model for considering other content-types with + remote execution capabilities. + + + + + + + + + + + + + + + + + + + + + + Borenstein & Freed [Page 76] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + Authors' Addresses + + For more information, the authors of this document may be + contacted via Internet mail: + + Nathaniel S. Borenstein + MRE 2D-296, Bellcore + 445 South St. + Morristown, NJ 07962-1910 + + Phone: +1 201 829 4270 + Fax: +1 201 829 7019 + Email: nsb@bellcore.com + + + Ned Freed + Innosoft International, Inc. + 250 West First Street + Suite 240 + Claremont, CA 91711 + + Phone: +1 714 624 7907 + Fax: +1 714 621 5319 + Email: ned@innosoft.com + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Borenstein & Freed [Page 77] + + + + + RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992 + + + + + + THIS PAGE INTENTIONALLY LEFT BLANK. + + Please discard this page and place the following table of + contents after the title page. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Borenstein & Freed [Page i] + + + + + + + + + Table of Contents + + + 1 Introduction....................................... 1 + 2 Notations, Conventions, and Generic BNF Grammar.... 3 + 3 The MIME-Version Header Field...................... 5 + 4 The Content-Type Header Field...................... 6 + 5 The Content-Transfer-Encoding Header Field......... 
10 + 5.1 Quoted-Printable Content-Transfer-Encoding......... 14 + 5.2 Base64 Content-Transfer-Encoding................... 17 + 6 Additional Optional Content- Header Fields......... 19 + 6.1 Optional Content-ID Header Field................... 19 + 6.2 Optional Content-Description Header Field.......... 19 + 7 The Predefined Content-Type Values................. 20 + 7.1 The Text Content-Type.............................. 20 + 7.1.1 The charset parameter.............................. 20 + 7.1.2 The Text/plain subtype............................. 23 + 7.1.3 The Text/richtext subtype.......................... 23 + 7.2 The Multipart Content-Type......................... 29 + 7.2.1 Multipart: The common syntax...................... 30 + 7.2.2 The Multipart/mixed (primary) subtype.............. 34 + 7.2.3 The Multipart/alternative subtype.................. 34 + 7.2.4 The Multipart/digest subtype....................... 36 + 7.2.5 The Multipart/parallel subtype..................... 36 + 7.3 The Message Content-Type........................... 37 + 7.3.1 The Message/rfc822 (primary) subtype............... 37 + 7.3.2 The Message/Partial subtype........................ 37 + 7.3.3 The Message/External-Body subtype.................. 40 + 7.4 The Application Content-Type....................... 46 + 7.4.1 The Application/Octet-Stream (primary) subtype..... 46 + 7.4.2 The Application/PostScript subtype................. 47 + 7.4.3 The Application/ODA subtype........................ 50 + 7.5 The Image Content-Type............................. 51 + 7.6 The Audio Content-Type............................. 51 + 7.7 The Video Content-Type............................. 51 + 7.8 Experimental Content-Type Values................... 51 + Summary............................................ 53 + Acknowledgements................................... 54 + Appendix A -- Minimal MIME-Conformance............. 
56 + Appendix B -- General Guidelines For Sending Email Data59 + Appendix C -- A Complex Multipart Example.......... 62 + Appendix D -- A Simple Richtext-to-Text Translator in C64 + Appendix E -- Collected Grammar.................... 66 + Appendix F -- IANA Registration Procedures......... 68 + F.1 Registration of New Content-type/subtype Values..68 + F.2 Registration of New Character Set Values...... 69 + F.3 Registration of New Access-type Values for Message/external-body69 + F.4 Registration of New Conversions Values for Application69 + Appendix G -- Summary of the Seven Content-types... 71 + Appendix H -- Canonical Encoding Model............. 73 + References......................................... 75 + Security Considerations............................ 76 + Authors' Addresses................................. 77 + + + + Borenstein & Freed [Page ii] + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Borenstein & Freed [Page iii] + diff --git a/rfc/rfc2045.txt b/rfc/rfc2045.txt @@ -0,0 +1,1739 @@ + + + + + + +Network Working Group N. Freed +Request for Comments: 2045 Innosoft +Obsoletes: 1521, 1522, 1590 N. Borenstein +Category: Standards Track First Virtual + November 1996 + + + Multipurpose Internet Mail Extensions + (MIME) Part One: + Format of Internet Message Bodies + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + STD 11, RFC 822, defines a message representation protocol specifying + considerable detail about US-ASCII message headers, and leaves the + message content, or message body, as flat US-ASCII text. 
This set of + documents, collectively called the Multipurpose Internet Mail + Extensions, or MIME, redefines the format of messages to allow for + + (1) textual message bodies in character sets other than + US-ASCII, + + (2) an extensible set of different formats for non-textual + message bodies, + + (3) multi-part message bodies, and + + (4) textual header information in character sets other than + US-ASCII. + + These documents are based on earlier work documented in RFC 934, STD + 11, and RFC 1049, but extend and revise them. Because RFC 822 said + so little about message bodies, these documents are largely + orthogonal to (rather than a revision of) RFC 822. + + This initial document specifies the various headers used to describe + the structure of MIME messages. The second document, RFC 2046, + defines the general structure of the MIME media typing system and + defines an initial set of media types. The third document, RFC 2047, + describes extensions to RFC 822 to allow non-US-ASCII text data in + + + +Freed & Borenstein Standards Track [Page 1] + +RFC 2045 Internet Message Bodies November 1996 + + + Internet mail header fields. The fourth document, RFC 2048, specifies + various IANA registration procedures for MIME-related facilities. The + fifth and final document, RFC 2049, describes MIME conformance + criteria as well as providing some illustrative examples of MIME + message formats, acknowledgements, and the bibliography. + + These documents are revisions of RFCs 1521, 1522, and 1590, which + themselves were revisions of RFCs 1341 and 1342. An appendix in RFC + 2049 describes differences and changes from previous versions. + +Table of Contents + + 1. Introduction ......................................... 3 + 2. Definitions, Conventions, and Generic BNF Grammar .... 5 + 2.1 CRLF ................................................ 5 + 2.2 Character Set ....................................... 6 + 2.3 Message ............................................. 
6 + 2.4 Entity .............................................. 6 + 2.5 Body Part ........................................... 7 + 2.6 Body ................................................ 7 + 2.7 7bit Data ........................................... 7 + 2.8 8bit Data ........................................... 7 + 2.9 Binary Data ......................................... 7 + 2.10 Lines .............................................. 7 + 3. MIME Header Fields ................................... 8 + 4. MIME-Version Header Field ............................ 8 + 5. Content-Type Header Field ............................ 10 + 5.1 Syntax of the Content-Type Header Field ............. 12 + 5.2 Content-Type Defaults ............................... 14 + 6. Content-Transfer-Encoding Header Field ............... 14 + 6.1 Content-Transfer-Encoding Syntax .................... 14 + 6.2 Content-Transfer-Encodings Semantics ................ 15 + 6.3 New Content-Transfer-Encodings ...................... 16 + 6.4 Interpretation and Use .............................. 16 + 6.5 Translating Encodings ............................... 18 + 6.6 Canonical Encoding Model ............................ 19 + 6.7 Quoted-Printable Content-Transfer-Encoding .......... 19 + 6.8 Base64 Content-Transfer-Encoding .................... 24 + 7. Content-ID Header Field .............................. 26 + 8. Content-Description Header Field ..................... 27 + 9. Additional MIME Header Fields ........................ 27 + 10. Summary ............................................. 27 + 11. Security Considerations ............................. 27 + 12. Authors' Addresses .................................. 28 + A. Collected Grammar .................................... 29 + + + + + + +Freed & Borenstein Standards Track [Page 2] + +RFC 2045 Internet Message Bodies November 1996 + + +1. 
Introduction + + Since its publication in 1982, RFC 822 has defined the standard + format of textual mail messages on the Internet. Its success has + been such that the RFC 822 format has been adopted, wholly or + partially, well beyond the confines of the Internet and the Internet + SMTP transport defined by RFC 821. As the format has seen wider use, + a number of limitations have proven increasingly restrictive for the + user community. + + RFC 822 was intended to specify a format for text messages. As such, + non-text messages, such as multimedia messages that might include + audio or images, are simply not mentioned. Even in the case of text, + however, RFC 822 is inadequate for the needs of mail users whose + languages require the use of character sets richer than US-ASCII. + Since RFC 822 does not specify mechanisms for mail containing audio, + video, Asian language text, or even text in most European languages, + additional specifications are needed. + + One of the notable limitations of RFC 821/822 based mail systems is + the fact that they limit the contents of electronic mail messages to + relatively short lines (e.g. 1000 characters or less [RFC-821]) of + 7bit US-ASCII. This forces users to convert any non-textual data + that they may wish to send into seven-bit bytes representable as + printable US-ASCII characters before invoking a local mail UA (User + Agent, a program with which human users send and receive mail). + Examples of such encodings currently used in the Internet include + pure hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in + RFC 1421, the Andrew Toolkit Representation [ATK], and many others. + + The limitations of RFC 822 mail become even more apparent as gateways + are designed to allow for the exchange of mail messages between RFC + 822 hosts and X.400 hosts. X.400 [X400] specifies mechanisms for the + inclusion of non-textual material within electronic mail messages. 
+ The current standards for the mapping of X.400 messages to RFC 822 + messages specify either that X.400 non-textual material must be + converted to (not encoded in) IA5Text format, or that they must be + discarded, notifying the RFC 822 user that discarding has occurred. + This is clearly undesirable, as information that a user may wish to + receive is lost. Even though a user agent may not have the + capability of dealing with the non-textual material, the user might + have some mechanism external to the UA that can extract useful + information from the material. Moreover, it does not allow for the + fact that the message may eventually be gatewayed back into an X.400 + message handling system (i.e., the X.400 message is "tunneled" + through Internet mail), where the non-textual information would + definitely become useful again. + + + + +Freed & Borenstein Standards Track [Page 3] + +RFC 2045 Internet Message Bodies November 1996 + + + This document describes several mechanisms that combine to solve most + of these problems without introducing any serious incompatibilities + with the existing world of RFC 822 mail. In particular, it + describes: + + (1) A MIME-Version header field, which uses a version + number to declare a message to be conformant with MIME + and allows mail processing agents to distinguish + between such messages and those generated by older or + non-conformant software, which are presumed to lack + such a field. + + (2) A Content-Type header field, generalized from RFC 1049, + which can be used to specify the media type and subtype + of data in the body of a message and to fully specify + the native representation (canonical form) of such + data. + + (3) A Content-Transfer-Encoding header field, which can be + used to specify both the encoding transformation that + was applied to the body and the domain of the result. 
+ Encoding transformations other than the identity + transformation are usually applied to data in order to + allow it to pass through mail transport mechanisms + which may have data or character set limitations. + + (4) Two additional header fields that can be used to + further describe the data in a body, the Content-ID and + Content-Description header fields. + + All of the header fields defined in this document are subject to the + general syntactic rules for header fields specified in RFC 822. In + particular, all of these header fields except for Content-Disposition + can include RFC 822 comments, which have no semantic content and + should be ignored during MIME processing. + + Finally, to specify and promote interoperability, RFC 2049 provides a + basic applicability statement for a subset of the above mechanisms + that defines a minimal level of "conformance" with this document. + + HISTORICAL NOTE: Several of the mechanisms described in this set of + documents may seem somewhat strange or even baroque at first reading. + It is important to note that compatibility with existing standards + AND robustness across existing practice were two of the highest + priorities of the working group that developed this set of documents. + In particular, compatibility was always favored over elegance. + + + + + +Freed & Borenstein Standards Track [Page 4] + +RFC 2045 Internet Message Bodies November 1996 + + + Please refer to the current edition of the "Internet Official + Protocol Standards" for the standardization state and status of this + protocol. RFC 822 and STD 3, RFC 1123 also provide essential + background for MIME since no conforming implementation of MIME can + violate them. In addition, several other informational RFC documents + will be of interest to the MIME implementor, in particular RFC 1344, + RFC 1345, and RFC 1524. + +2. 
Definitions, Conventions, and Generic BNF Grammar + + Although the mechanisms specified in this set of documents are all + described in prose, most are also described formally in the augmented + BNF notation of RFC 822. Implementors will need to be familiar with + this notation in order to understand this set of documents, and are + referred to RFC 822 for a complete explanation of the augmented BNF + notation. + + Some of the augmented BNF in this set of documents makes named + references to syntax rules defined in RFC 822. A complete formal + grammar, then, is obtained by combining the collected grammar + appendices in each document in this set with the BNF of RFC 822 plus + the modifications to RFC 822 defined in RFC 1123 (which specifically + changes the syntax for `return', `date' and `mailbox'). + + All numeric and octet values are given in decimal notation in this + set of documents. All media type values, subtype values, and + parameter names as defined are case-insensitive. However, parameter + values are case-sensitive unless otherwise specified for the specific + parameter. + + FORMATTING NOTE: Notes, such as this one, provide additional + nonessential information which may be skipped by the reader without + missing anything essential. The primary purpose of these non- + essential notes is to convey information about the rationale of this + set of documents, or to place these documents in the proper + historical or evolutionary context. Such information may in + particular be skipped by those who are focused entirely on building a + conformant implementation, but may be of use to those who wish to + understand why certain design choices were made. + +2.1. CRLF + + The term CRLF, in this set of documents, refers to the sequence of + octets corresponding to the two US-ASCII characters CR (decimal value + 13) and LF (decimal value 10) which, taken together, in this order, + denote a line break in RFC 822 mail. 
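As an illustration only (this sketch is not part of the RFC text), the CRLF definition above can be written down in a few lines of Python. The names CRLF and to_crlf are hypothetical, and treating a bare CR or a bare LF as a local line break is an assumption about the sending system, not something section 2.1 specifies.

```python
# CR is decimal 13 and LF is decimal 10; together, in this order, they
# form the line break used in RFC 822 mail.
CRLF = bytes([13, 10])


def to_crlf(data: bytes) -> bytes:
    """Rewrite local line breaks as CRLF.

    Assumption: a bare LF, a bare CR, or an existing CRLF pair each
    marks exactly one line break in the input.
    """
    # Collapse every convention to LF first so that existing CRLF
    # pairs are not expanded twice.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", CRLF)
```

A caller might apply to_crlf to text produced with local newline conventions before handing it to a transfer encoder; data that is not line-oriented text should not be passed through it.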
+ + + + + +Freed & Borenstein Standards Track [Page 5] + +RFC 2045 Internet Message Bodies November 1996 + + +2.2. Character Set + + The term "character set" is used in MIME to refer to a method of + converting a sequence of octets into a sequence of characters. Note + that unconditional and unambiguous conversion in the other direction + is not required, in that not all characters may be representable by a + given character set and a character set may provide more than one + sequence of octets to represent a particular sequence of characters. + + This definition is intended to allow various kinds of character + encodings, from simple single-table mappings such as US-ASCII to + complex table switching methods such as those that use ISO 2022's + techniques, to be used as character sets. However, the definition + associated with a MIME character set name must fully specify the + mapping to be performed. In particular, use of external profiling + information to determine the exact mapping is not permitted. + + NOTE: The term "character set" was originally used to describe such + straightforward schemes as US-ASCII and ISO-8859-1 which have a + simple one-to-one mapping from single octets to single characters. + Multi-octet coded character sets and switching techniques make the + situation more complex. For example, some communities use the term + "character encoding" for what MIME calls a "character set", while + using the phrase "coded character set" to denote an abstract mapping + from integers (not octets) to characters. + +2.3. Message + + The term "message", when not further qualified, means either a + (complete or "top-level") RFC 822 message being transferred on a + network, or a message encapsulated in a body of type "message/rfc822" + or "message/partial". + +2.4. Entity + + The term "entity" refers specifically to the MIME-defined header + fields and contents of either a message or one of the parts in the + body of a multipart entity. 
The specification of such entities is + the essence of MIME. Since the contents of an entity are often + called the "body", it makes sense to speak about the body of an + entity. Any sort of field may be present in the header of an entity, + but only those fields whose names begin with "content-" actually have + any MIME-related meaning. Note that this does NOT imply that they + have no meaning at all -- an entity that is also a message has non- + MIME header fields whose meanings are defined by RFC 822. + + + + + + +Freed & Borenstein Standards Track [Page 6] + +RFC 2045 Internet Message Bodies November 1996 + + +2.5. Body Part + + The term "body part" refers to an entity inside of a multipart + entity. + +2.6. Body + + The term "body", when not further qualified, means the body of an + entity, that is, the body of either a message or of a body part. + + NOTE: The previous four definitions are clearly circular. This is + unavoidable, since the overall structure of a MIME message is indeed + recursive. + +2.7. 7bit Data + + "7bit data" refers to data that is all represented as relatively + short lines with 998 octets or less between CRLF line separation + sequences [RFC-821]. No octets with decimal values greater than 127 + are allowed and neither are NULs (octets with decimal value 0). CR + (decimal value 13) and LF (decimal value 10) octets only occur as + part of CRLF line separation sequences. + +2.8. 8bit Data + + "8bit data" refers to data that is all represented as relatively + short lines with 998 octets or less between CRLF line separation + sequences [RFC-821], but octets with decimal values greater than 127 + may be used. As with "7bit data" CR and LF octets only occur as part + of CRLF line separation sequences and no NULs are allowed. + +2.9. Binary Data + + "Binary data" refers to data where any sequence of octets whatsoever + is allowed. + +2.10. Lines + + "Lines" are defined as sequences of octets separated by CRLF + sequences. 
This is consistent with both RFC 821 and RFC 822. + "Lines" only refers to a unit of data in a message, which may or may + not correspond to something that is actually displayed by a user + agent. + + + + + + + + +Freed & Borenstein Standards Track [Page 7] + +RFC 2045 Internet Message Bodies November 1996 + + +3. MIME Header Fields + + MIME defines a number of new RFC 822 header fields that are used to + describe the content of a MIME entity. These header fields occur in + at least two contexts: + + (1) As part of a regular RFC 822 message header. + + (2) In a MIME body part header within a multipart + construct. + + The formal definition of these header fields is as follows: + + entity-headers := [ content CRLF ] + [ encoding CRLF ] + [ id CRLF ] + [ description CRLF ] + *( MIME-extension-field CRLF ) + + MIME-message-headers := entity-headers + fields + version CRLF + ; The ordering of the header + ; fields implied by this BNF + ; definition should be ignored. + + MIME-part-headers := entity-headers + [ fields ] + ; Any field not beginning with + ; "content-" can have no defined + ; meaning and may be ignored. + ; The ordering of the header + ; fields implied by this BNF + ; definition should be ignored. + + The syntax of the various specific MIME header fields will be + described in the following sections. + +4. MIME-Version Header Field + + Since RFC 822 was published in 1982, there has really been only one + format standard for Internet messages, and there has been little + perceived need to declare the format standard in use. This document + is an independent specification that complements RFC 822. Although + the extensions in this document have been defined in such a way as to + be compatible with RFC 822, there are still circumstances in which it + might be desirable for a mail-processing agent to know whether a + message was composed with the new standard in mind. 
+ + + +Freed & Borenstein Standards Track [Page 8] + +RFC 2045 Internet Message Bodies November 1996 + + + Therefore, this document defines a new header field, "MIME-Version", + which is to be used to declare the version of the Internet message + body format standard in use. + + Messages composed in accordance with this document MUST include such + a header field, with the following verbatim text: + + MIME-Version: 1.0 + + The presence of this header field is an assertion that the message + has been composed in compliance with this document. + + Since it is possible that a future document might extend the message + format standard again, a formal BNF is given for the content of the + MIME-Version field: + + version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT + + Thus, future format specifiers, which might replace or extend "1.0", + are constrained to be two integer fields, separated by a period. If + a message is received with a MIME-version value other than "1.0", it + cannot be assumed to conform with this document. + + Note that the MIME-Version header field is required at the top level + of a message. It is not required for each body part of a multipart + entity. It is required for the embedded headers of a body of type + "message/rfc822" or "message/partial" if and only if the embedded + message is itself claimed to be MIME-conformant. + + It is not possible to fully specify how a mail reader that conforms + with MIME as defined in this document should treat a message that + might arrive in the future with some value of MIME-Version other than + "1.0". + + It is also worth noting that version control for specific media types + is not accomplished using the MIME-Version mechanism. In particular, + some formats (such as application/postscript) have version numbering + conventions that are internal to the media format. Where such + conventions exist, MIME does nothing to supersede them. 
Where no + such conventions exist, a MIME media type might use a "version" + parameter in the content-type field if necessary. + + + + + + + + + + +Freed & Borenstein Standards Track [Page 9] + +RFC 2045 Internet Message Bodies November 1996 + + + NOTE TO IMPLEMENTORS: When checking MIME-Version values any RFC 822 + comment strings that are present must be ignored. In particular, the + following four MIME-Version fields are equivalent: + + MIME-Version: 1.0 + + MIME-Version: 1.0 (produced by MetaSend Vx.x) + + MIME-Version: (produced by MetaSend Vx.x) 1.0 + + MIME-Version: 1.(produced by MetaSend Vx.x)0 + + In the absence of a MIME-Version field, a receiving mail user agent + (whether conforming to MIME requirements or not) may optionally + choose to interpret the body of the message according to local + conventions. Many such conventions are currently in use and it + should be noted that in practice non-MIME messages can contain just + about anything. + + It is impossible to be certain that a non-MIME mail message is + actually plain text in the US-ASCII character set since it might well + be a message that, using some set of nonstandard local conventions + that predate MIME, includes text in another character set or non- + textual data presented in a manner that cannot be automatically + recognized (e.g., a uuencoded compressed UNIX tar file). + +5. Content-Type Header Field + + The purpose of the Content-Type field is to describe the data + contained in the body fully enough that the receiving user agent can + pick an appropriate agent or mechanism to present the data to the + user, or otherwise deal with the data in an appropriate manner. The + value in this field is called a media type. + + HISTORICAL NOTE: The Content-Type header field was first defined in + RFC 1049. RFC 1049 used a simpler and less powerful syntax, but one + that is largely compatible with the mechanism given here. 
+ + The Content-Type header field specifies the nature of the data in the + body of an entity by giving media type and subtype identifiers, and + by providing auxiliary information that may be required for certain + media types. After the media type and subtype names, the remainder + of the header field is simply a set of parameters, specified in an + attribute=value notation. The ordering of parameters is not + significant. + + + + + + +Freed & Borenstein Standards Track [Page 10] + +RFC 2045 Internet Message Bodies November 1996 + + + In general, the top-level media type is used to declare the general + type of data, while the subtype specifies a specific format for that + type of data. Thus, a media type of "image/xyz" is enough to tell a + user agent that the data is an image, even if the user agent has no + knowledge of the specific image format "xyz". Such information can + be used, for example, to decide whether or not to show a user the raw + data from an unrecognized subtype -- such an action might be + reasonable for unrecognized subtypes of text, but not for + unrecognized subtypes of image or audio. For this reason, registered + subtypes of text, image, audio, and video should not contain embedded + information that is really of a different type. Such compound + formats should be represented using the "multipart" or "application" + types. + + Parameters are modifiers of the media subtype, and as such do not + fundamentally affect the nature of the content. The set of + meaningful parameters depends on the media type and subtype. Most + parameters are associated with a single specific subtype. However, a + given top-level media type may define parameters which are applicable + to any subtype of that type. Parameters may be required by their + defining content type or subtype or they may be optional. MIME + implementations must ignore any parameters whose names they do not + recognize. 
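As an illustration of the type/subtype and attribute=value structure described above, the following Python sketch reads a Content-Type field with the standard library's email package. The sample header and the parameter name "x-unknown" are assumptions made up for this example and do not come from the RFC. The sketch relies on Message.get_content_type(), Message.get_param() and Message.get_params(); it is one possible reading of the rules, not a reference implementation.

```python
from email import message_from_string

# Hypothetical header: mixed-case names, a quoted value, and a parameter
# that the receiving program does not know about.
raw = 'Content-Type: Text/Plain; CHARSET="us-ascii"; x-unknown=1\n\nbody\n'
msg = message_from_string(raw)

# Media type and subtype, normalized to lower case by the library.
media_type = msg.get_content_type()

# Parameter names are matched without regard to case, and the returned
# value does not include the surrounding quotes.
charset = msg.get_param("charset")

# get_params() returns (name, value) tuples; the first tuple holds the
# media type itself, so it is skipped here. Names are lower-cased so that
# lookups do not depend on how the sender spelled them.
params = {name.lower(): value for name, value in msg.get_params()[1:]}

# Parameter order is not significant, so values are looked up by name,
# and names the program does not recognize are simply left alone.
recognized = {name: params[name] for name in ("charset",) if name in params}
```

Keeping the unrecognized "x-unknown" entry out of `recognized`, rather than rejecting the header, mirrors the requirement that unknown parameter names be ignored.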
+ + For example, the "charset" parameter is applicable to any subtype of + "text", while the "boundary" parameter is required for any subtype of + the "multipart" media type. + + There are NO globally-meaningful parameters that apply to all media + types. Truly global mechanisms are best addressed, in the MIME + model, by the definition of additional Content-* header fields. + + An initial set of seven top-level media types is defined in RFC 2046. + Five of these are discrete types whose content is essentially opaque + as far as MIME processing is concerned. The remaining two are + composite types whose contents require additional handling by MIME + processors. + + This set of top-level media types is intended to be substantially + complete. It is expected that additions to the larger set of + supported types can generally be accomplished by the creation of new + subtypes of these initial types. In the future, more top-level types + may be defined only by a standards-track extension to this standard. + If another top-level type is to be used for any reason, it must be + given a name starting with "X-" to indicate its non-standard status + and to avoid a potential conflict with a future official name. + + + + + +Freed & Borenstein Standards Track [Page 11] + +RFC 2045 Internet Message Bodies November 1996 + + +5.1. Syntax of the Content-Type Header Field + + In the Augmented BNF notation of RFC 822, a Content-Type header field + value is defined as follows: + + content := "Content-Type" ":" type "/" subtype + *(";" parameter) + ; Matching of media type and subtype + ; is ALWAYS case-insensitive. 
+ + type := discrete-type / composite-type + + discrete-type := "text" / "image" / "audio" / "video" / + "application" / extension-token + + composite-type := "message" / "multipart" / extension-token + + extension-token := ietf-token / x-token + + ietf-token := <An extension token defined by a + standards-track RFC and registered + with IANA.> + + x-token := <The two characters "X-" or "x-" followed, with + no intervening white space, by any token> + + subtype := extension-token / iana-token + + iana-token := <A publicly-defined extension token. Tokens + of this form must be registered with IANA + as specified in RFC 2048.> + + parameter := attribute "=" value + + attribute := token + ; Matching of attributes + ; is ALWAYS case-insensitive. + + value := token / quoted-string + + token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, + or tspecials> + + tspecials := "(" / ")" / "<" / ">" / "@" / + "," / ";" / ":" / "\" / <"> + "/" / "[" / "]" / "?" / "=" + ; Must be in quoted-string, + ; to use within parameter values + + + +Freed & Borenstein Standards Track [Page 12] + +RFC 2045 Internet Message Bodies November 1996 + + + Note that the definition of "tspecials" is the same as the RFC 822 + definition of "specials" with the addition of the three characters + "/", "?", and "=", and the removal of ".". + + Note also that a subtype specification is MANDATORY -- it may not be + omitted from a Content-Type header field. As such, there are no + default subtypes. + + The type, subtype, and parameter names are not case sensitive. For + example, TEXT, Text, and TeXt are all equivalent top-level media + types. Parameter values are normally case sensitive, but sometimes + are interpreted in a case-insensitive fashion, depending on the + intended use. (For example, multipart boundaries are case-sensitive, + but the "access-type" parameter for message/External-body is not + case-sensitive.) + + Note that the value of a quoted string parameter does not include the + quotes. 
That is, the quotation marks in a quoted-string are not a + part of the value of the parameter, but are merely used to delimit + that parameter value. In addition, comments are allowed in + accordance with RFC 822 rules for structured header fields. Thus the + following two forms + + Content-type: text/plain; charset=us-ascii (Plain text) + + Content-type: text/plain; charset="us-ascii" + + are completely equivalent. + + Beyond this syntax, the only syntactic constraint on the definition + of subtype names is the desire that their uses must not conflict. + That is, it would be undesirable to have two different communities + using "Content-Type: application/foobar" to mean two different + things. The process of defining new media subtypes, then, is not + intended to be a mechanism for imposing restrictions, but simply a + mechanism for publicizing their definition and usage. There are, + therefore, two acceptable mechanisms for defining new media subtypes: + + (1) Private values (starting with "X-") may be defined + bilaterally between two cooperating agents without + outside registration or standardization. Such values + cannot be registered or standardized. + + (2) New standard values should be registered with IANA as + described in RFC 2048. + + The second document in this set, RFC 2046, defines the initial set of + media types for MIME. + + + +Freed & Borenstein Standards Track [Page 13] + +RFC 2045 Internet Message Bodies November 1996 + + +5.2. Content-Type Defaults + + Default RFC 822 messages without a MIME Content-Type header are taken + by this protocol to be plain text in the US-ASCII character set, + which can be explicitly specified as: + + Content-type: text/plain; charset=us-ascii + + This default is assumed if no Content-Type header field is specified. + It is also recommended that this default be assumed when a + syntactically invalid Content-Type header field is encountered.
In + the presence of a MIME-Version header field and the absence of any + Content-Type header field, a receiving User Agent can also assume + that plain US-ASCII text was the sender's intent. Plain US-ASCII + text may still be assumed in the absence of a MIME-Version or the + presence of a syntactically invalid Content-Type header field, but + the sender's intent might have been otherwise. + +6. Content-Transfer-Encoding Header Field + + Many media types which could be usefully transported via email are + represented, in their "natural" format, as 8bit character or binary + data. Such data cannot be transmitted over some transfer protocols. + For example, RFC 821 (SMTP) restricts mail messages to 7bit US-ASCII + data with lines no longer than 1000 characters including any trailing + CRLF line separator. + + It is necessary, therefore, to define a standard mechanism for + encoding such data into a 7bit short line format. Proper labelling + of unencoded material in less restrictive formats for direct use over + less restrictive transports is also desirable. This document + specifies that such encodings will be indicated by a new "Content- + Transfer-Encoding" header field. This field has not been defined by + any previous standard. + +6.1. Content-Transfer-Encoding Syntax + + The Content-Transfer-Encoding field's value is a single token + specifying the type of encoding, as enumerated below. Formally: + + encoding := "Content-Transfer-Encoding" ":" mechanism + + mechanism := "7bit" / "8bit" / "binary" / + "quoted-printable" / "base64" / + ietf-token / x-token + + These values are not case sensitive -- Base64 and BASE64 and bAsE64 + are all equivalent. An encoding type of 7BIT requires that the body + + + +Freed & Borenstein Standards Track [Page 14] + +RFC 2045 Internet Message Bodies November 1996 + + + is already in a 7bit mail-ready representation.
This is the default + value -- that is, "Content-Transfer-Encoding: 7BIT" is assumed if the + Content-Transfer-Encoding header field is not present. + +6.2. Content-Transfer-Encodings Semantics + + This single Content-Transfer-Encoding token actually provides two + pieces of information. It specifies what sort of encoding + transformation the body was subjected to and hence what decoding + operation must be used to restore it to its original form, and it + specifies what the domain of the result is. + + The transformation part of any Content-Transfer-Encodings specifies, + either explicitly or implicitly, a single, well-defined decoding + algorithm, which for any sequence of encoded octets either transforms + it to the original sequence of octets which was encoded, or shows + that it is illegal as an encoded sequence. Content-Transfer- + Encodings transformations never depend on any additional external + profile information for proper operation. Note that while decoders + must produce a single, well-defined output for a valid encoding no + such restrictions exist for encoders: Encoding a given sequence of + octets to different, equivalent encoded sequences is perfectly legal. + + Three transformations are currently defined: identity, the "quoted- + printable" encoding, and the "base64" encoding. The domains are + "binary", "8bit" and "7bit". + + The Content-Transfer-Encoding values "7bit", "8bit", and "binary" all + mean that the identity (i.e. NO) encoding transformation has been + performed. As such, they serve simply as indicators of the domain of + the body data, and provide useful information about the sort of + encoding that might be needed for transmission in a given transport + system. The terms "7bit data", "8bit data", and "binary data" are + all defined in Section 2. + + The quoted-printable and base64 encodings transform their input from + an arbitrary domain into material in the "7bit" range, thus making it + safe to carry over restricted transports. 
The specific definitions of + the transformations are given below. + + The proper Content-Transfer-Encoding label must always be used. + Labelling unencoded data containing 8bit characters as "7bit" is not + allowed, nor is labelling unencoded non-line-oriented data as + anything other than "binary" allowed. + + Unlike media subtypes, a proliferation of Content-Transfer-Encoding + values is both undesirable and unnecessary. However, establishing + only a single transformation into the "7bit" domain does not seem + + + +Freed & Borenstein Standards Track [Page 15] + +RFC 2045 Internet Message Bodies November 1996 + + + possible. There is a tradeoff between the desire for a compact and + efficient encoding of largely-binary data and the desire for a + somewhat readable encoding of data that is mostly, but not entirely, + 7bit. For this reason, at least two encoding mechanisms are + necessary: a more or less readable encoding (quoted-printable) and a + "dense" or "uniform" encoding (base64). + + Mail transport for unencoded 8bit data is defined in RFC 1652. As of + the initial publication of this document, there are no standardized + Internet mail transports for which it is legitimate to include + unencoded binary data in mail bodies. Thus there are no + circumstances in which the "binary" Content-Transfer-Encoding is + actually valid in Internet mail. However, in the event that binary + mail transport becomes a reality in Internet mail, or when MIME is + used in conjunction with any other binary-capable mail transport + mechanism, binary bodies must be labelled as such using this + mechanism. + + NOTE: The five values defined for the Content-Transfer-Encoding field + imply nothing about the media type other than the algorithm by which + it was encoded or the transport system requirements if unencoded. + +6.3.
New Content-Transfer-Encodings + + Implementors may, if necessary, define private Content-Transfer- + Encoding values, but must use an x-token, which is a name prefixed by + "X-", to indicate its non-standard status, e.g., "Content-Transfer- + Encoding: x-my-new-encoding". Additional standardized Content- + Transfer-Encoding values must be specified by a standards-track RFC. + The requirements such specifications must meet are given in RFC 2048. + As such, all content-transfer-encoding namespace except that + beginning with "X-" is explicitly reserved to the IETF for future + use. + + Unlike media types and subtypes, the creation of new Content- + Transfer-Encoding values is STRONGLY discouraged, as it seems likely + to hinder interoperability with little potential benefit. + +6.4. Interpretation and Use + + If a Content-Transfer-Encoding header field appears as part of a + message header, it applies to the entire body of that message. If a + Content-Transfer-Encoding header field appears as part of an entity's + headers, it applies only to the body of that entity. If an entity is + of type "multipart" the Content-Transfer-Encoding is not permitted to + have any value other than "7bit", "8bit" or "binary". Even more + severe restrictions apply to some subtypes of the "message" type. + + + + +Freed & Borenstein Standards Track [Page 16] + +RFC 2045 Internet Message Bodies November 1996 + + + It should be noted that most media types are defined in terms of + octets rather than bits, so that the mechanisms described here are + mechanisms for encoding arbitrary octet streams, not bit streams. If + a bit stream is to be encoded via one of these mechanisms, it must + first be converted to an 8bit byte stream using the network standard + bit order ("big-endian"), in which the earlier bits in a stream + become the higher-order bits in an 8bit byte. A bit stream not ending + at an 8bit boundary must be padded with zeroes.
RFC 2046 provides a + mechanism for noting the addition of such padding in the case of the + application/octet-stream media type, which has a "padding" parameter. + + The encoding mechanisms defined here explicitly encode all data in + US-ASCII. Thus, for example, suppose an entity has header fields + such as: + + Content-Type: text/plain; charset=ISO-8859-1 + Content-transfer-encoding: base64 + + This must be interpreted to mean that the body is a base64 US-ASCII + encoding of data that was originally in ISO-8859-1, and will be in + that character set again after decoding. + + Certain Content-Transfer-Encoding values may only be used on certain + media types. In particular, it is EXPRESSLY FORBIDDEN to use any + encodings other than "7bit", "8bit", or "binary" with any composite + media type, i.e. one that recursively includes other Content-Type + fields. Currently the only composite media types are "multipart" and + "message". All encodings that are desired for bodies of type + multipart or message must be done at the innermost level, by encoding + the actual body that needs to be encoded. + + It should also be noted that, by definition, if a composite entity + has a transfer-encoding value such as "7bit", but one of the enclosed + entities has a less restrictive value such as "8bit", then either the + outer "7bit" labelling is in error, because 8bit data are included, + or the inner "8bit" labelling placed an unnecessarily high demand on + the transport system because the actual included data were actually + 7bit-safe. + + NOTE ON ENCODING RESTRICTIONS: Though the prohibition against using + content-transfer-encodings on composite body data may seem overly + restrictive, it is necessary to prevent nested encodings, in which + data are passed through an encoding algorithm multiple times, and + must be decoded multiple times in order to be properly viewed. 
+ Nested encodings add considerable complexity to user agents: Aside + from the obvious efficiency problems with such multiple encodings, + they can obscure the basic structure of a message. In particular, + they can imply that several decoding operations are necessary simply + + + +Freed & Borenstein Standards Track [Page 17] + +RFC 2045 Internet Message Bodies November 1996 + + + to find out what types of bodies a message contains. Banning nested + encodings may complicate the job of certain mail gateways, but this + seems less of a problem than the effect of nested encodings on user + agents. + + Any entity with an unrecognized Content-Transfer-Encoding must be + treated as if it has a Content-Type of "application/octet-stream", + regardless of what the Content-Type header field actually says. + + NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT-TRANSFER- + ENCODING: It may seem that the Content-Transfer-Encoding could be + inferred from the characteristics of the media that is to be encoded, + or, at the very least, that certain Content-Transfer-Encodings could + be mandated for use with specific media types. There are several + reasons why this is not the case. First, given the varying types of + transports used for mail, some encodings may be appropriate for some + combinations of media types and transports but not for others. (For + example, in an 8bit transport, no encoding would be required for text + in certain character sets, while such encodings are clearly required + for 7bit SMTP.) + + Second, certain media types may require different types of transfer + encoding under different circumstances. For example, many PostScript + bodies might consist entirely of short lines of 7bit data and hence + require no encoding at all. Other PostScript bodies (especially + those using Level 2 PostScript's binary encoding mechanism) may only + be reasonably represented using a binary transport encoding. 
+ Finally, since the Content-Type field is intended to be an open-ended + specification mechanism, strict specification of an association + between media types and encodings effectively couples the + specification of an application protocol with a specific lower-level + transport. This is not desirable since the developers of a media + type should not have to be aware of all the transports in use and + what their limitations are. + +6.5. Translating Encodings + + The quoted-printable and base64 encodings are designed so that + conversion between them is possible. The only issue that arises in + such a conversion is the handling of hard line breaks in quoted- + printable encoding output. When converting from quoted-printable to + base64 a hard line break in the quoted-printable form represents a + CRLF sequence in the canonical form of the data. It must therefore be + converted to a corresponding encoded CRLF in the base64 form of the + data. Similarly, a CRLF sequence in the canonical form of the data + obtained after base64 decoding must be converted to a quoted- + printable hard line break, but ONLY when converting text data. + + + + +Freed & Borenstein Standards Track [Page 18] + +RFC 2045 Internet Message Bodies November 1996 + + +6.6. Canonical Encoding Model + + There was some confusion, in the previous versions of this RFC, + regarding the model for when email data was to be converted to + canonical form and encoded, and in particular how this process would + affect the treatment of CRLFs, given that the representation of + newlines varies greatly from system to system, and the relationship + between content-transfer-encodings and character sets. A canonical + model for encoding is presented in RFC 2049 for this reason. + +6.7. Quoted-Printable Content-Transfer-Encoding + + The Quoted-Printable encoding is intended to represent data that + largely consists of octets that correspond to printable characters in + the US-ASCII character set. 
It encodes the data in such a way that + the resulting octets are unlikely to be modified by mail transport. + If the data being encoded are mostly US-ASCII text, the encoded form + of the data remains largely recognizable by humans. A body which is + entirely US-ASCII may also be encoded in Quoted-Printable to ensure + the integrity of the data should the message pass through a + character-translating, and/or line-wrapping gateway. + + In this encoding, octets are to be represented as determined by the + following rules: + + (1) (General 8bit representation) Any octet, except a CR or + LF that is part of a CRLF line break of the canonical + (standard) form of the data being encoded, may be + represented by an "=" followed by a two digit + hexadecimal representation of the octet's value. The + digits of the hexadecimal alphabet, for this purpose, + are "0123456789ABCDEF". Uppercase letters must be + used; lowercase letters are not allowed. Thus, for + example, the decimal value 12 (US-ASCII form feed) can + be represented by "=0C", and the decimal value 61 (US- + ASCII EQUAL SIGN) can be represented by "=3D". This + rule must be followed except when the following rules + allow an alternative encoding. + + (2) (Literal representation) Octets with decimal values of + 33 through 60 inclusive, and 62 through 126, inclusive, + MAY be represented as the US-ASCII characters which + correspond to those octets (EXCLAMATION POINT through + LESS THAN, and GREATER THAN through TILDE, + respectively). + + (3) (White Space) Octets with values of 9 and 32 MAY be + represented as US-ASCII TAB (HT) and SPACE characters, + + + +Freed & Borenstein Standards Track [Page 19] + +RFC 2045 Internet Message Bodies November 1996 + + + respectively, but MUST NOT be so represented at the end + of an encoded line. Any TAB (HT) or SPACE characters + on an encoded line MUST thus be followed on that line + by a printable character. 
In particular, an "=" at the + end of an encoded line, indicating a soft line break + (see rule #5) may follow one or more TAB (HT) or SPACE + characters. It follows that an octet with decimal + value 9 or 32 appearing at the end of an encoded line + must be represented according to Rule #1. This rule is + necessary because some MTAs (Message Transport Agents, + programs which transport messages from one user to + another, or perform a portion of such transfers) are + known to pad lines of text with SPACEs, and others are + known to remove "white space" characters from the end + of a line. Therefore, when decoding a Quoted-Printable + body, any trailing white space on a line must be + deleted, as it will necessarily have been added by + intermediate transport agents. + + (4) (Line Breaks) A line break in a text body, represented + as a CRLF sequence in the text canonical form, must be + represented by a (RFC 822) line break, which is also a + CRLF sequence, in the Quoted-Printable encoding. Since + the canonical representation of media types other than + text do not generally include the representation of + line breaks as CRLF sequences, no hard line breaks + (i.e. line breaks that are intended to be meaningful + and to be displayed to the user) can occur in the + quoted-printable encoding of such types. Sequences + like "=0D", "=0A", "=0A=0D" and "=0D=0A" will routinely + appear in non-text data represented in quoted- + printable, of course. + + Note that many implementations may elect to encode the + local representation of various content types directly + rather than converting to canonical form first, + encoding, and then converting back to local + representation. In particular, this may apply to plain + text material on systems that use newline conventions + other than a CRLF terminator sequence. 
Such an + implementation optimization is permissible, but only + when the combined canonicalization-encoding step is + equivalent to performing the three steps separately. + + (5) (Soft Line Breaks) The Quoted-Printable encoding + REQUIRES that encoded lines be no more than 76 + characters long. If longer lines are to be encoded + with the Quoted-Printable encoding, "soft" line breaks + + + +Freed & Borenstein Standards Track [Page 20] + +RFC 2045 Internet Message Bodies November 1996 + + + must be used. An equal sign as the last character on an + encoded line indicates such a non-significant ("soft") + line break in the encoded text. + + Thus if the "raw" form of the line is a single unencoded line that + says: + + Now's the time for all folk to come to the aid of their country. + + This can be represented, in the Quoted-Printable encoding, as: + + Now's the time = + for all folk to come= + to the aid of their country. + + This provides a mechanism with which long lines are encoded in such a + way as to be restored by the user agent. The 76 character limit does + not count the trailing CRLF, but counts all other characters, + including any equal signs. + + Since the hyphen character ("-") may be represented as itself in the + Quoted-Printable encoding, care must be taken, when encapsulating a + quoted-printable encoded body inside one or more multipart entities, + to ensure that the boundary delimiter does not appear anywhere in the + encoded body. (A good strategy is to choose a boundary that includes + a character sequence such as "=_" which can never appear in a + quoted-printable body. See the definition of multipart messages in + RFC 2046.) + + NOTE: The quoted-printable encoding represents something of a + compromise between readability and reliability in transport. Bodies + encoded with the quoted-printable encoding will work reliably over + most mail gateways, but may not work perfectly over a few gateways, + notably those involving translation into EBCDIC.
A higher level of + confidence is offered by the base64 Content-Transfer-Encoding. A way + to get reasonably reliable transport through EBCDIC gateways is to + also quote the US-ASCII characters + + !"#$@[\]^`{|}~ + + according to rule #1. + + Because quoted-printable data is generally assumed to be line- + oriented, it is to be expected that the representation of the breaks + between the lines of quoted-printable data may be altered in + transport, in the same manner that plain text mail has always been + altered in Internet mail when passing between systems with differing + newline conventions. If such alterations are likely to constitute a + + + +Freed & Borenstein Standards Track [Page 21] + +RFC 2045 Internet Message Bodies November 1996 + + + corruption of the data, it is probably more sensible to use the + base64 encoding rather than the quoted-printable encoding. + + NOTE: Several kinds of substrings cannot be generated according to + the encoding rules for the quoted-printable content-transfer- + encoding, and hence are formally illegal if they appear in the output + of a quoted-printable encoder. This note enumerates these cases and + suggests ways to handle such illegal substrings if any are + encountered in quoted-printable data that is to be decoded. + + (1) An "=" followed by two hexadecimal digits, one or both + of which are lowercase letters in "abcdef", is formally + illegal. A robust implementation might choose to + recognize them as the corresponding uppercase letters. + + (2) An "=" followed by a character that is neither a + hexadecimal digit (including "abcdef") nor the CR + character of a CRLF pair is illegal. This case can be + the result of US-ASCII text having been included in a + quoted-printable part of a message without itself + having been subjected to quoted-printable encoding. 
A + reasonable approach by a robust implementation might be + to include the "=" character and the following + character in the decoded data without any + transformation and, if possible, indicate to the user + that proper decoding was not possible at this point in + the data. + + (3) An "=" cannot be the ultimate or penultimate character + in an encoded object. This could be handled as in case + (2) above. + + (4) Control characters other than TAB, or CR and LF as + parts of CRLF pairs, must not appear. The same is true + for octets with decimal values greater than 126. If + found in incoming quoted-printable data by a decoder, a + robust implementation might exclude them from the + decoded data and warn the user that illegal characters + were discovered. + + (5) Encoded lines must not be longer than 76 characters, + not counting the trailing CRLF. If longer lines are + found in incoming, encoded data, a robust + implementation might nevertheless decode the lines, and + might report the erroneous encoding to the user. + + + + + + +Freed & Borenstein Standards Track [Page 22] + +RFC 2045 Internet Message Bodies November 1996 + + + WARNING TO IMPLEMENTORS: If binary data is encoded in quoted- + printable, care must be taken to encode CR and LF characters as "=0D" + and "=0A", respectively. In particular, a CRLF sequence in binary + data should be encoded as "=0D=0A". Otherwise, if CRLF were + represented as a hard line break, it might be incorrectly decoded on + platforms with different line break conventions. 
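
The decoding rules and the robustness suggestions above can be sketched in Python as follows. This is a minimal illustration and not part of the RFC text: the name `qp_decode_lenient` is hypothetical, the input is assumed to use CRLF line breaks, and the lenient choices (accepting lowercase hex digits, passing a malformed "=" through unchanged) follow the "robust implementation" hints in cases (1) to (3) rather than any mandated behavior.

```python
HEX_DIGITS = b"0123456789ABCDEFabcdef"


def qp_decode_lenient(data: bytes) -> bytes:
    """Decode quoted-printable data, tolerating some illegal input."""
    out = bytearray()
    lines = data.split(b"\r\n")
    for index, line in enumerate(lines):
        # Literal trailing SPACE/TAB can only be transport padding.
        line = line.rstrip(b" \t")
        # A trailing "=" marks a soft (non-significant) line break.
        soft_break = line.endswith(b"=")
        if soft_break:
            line = line[:-1]
        i = 0
        while i < len(line):
            if (
                line[i] == ord("=")
                and i + 2 < len(line)
                and line[i + 1] in HEX_DIGITS
                and line[i + 2] in HEX_DIGITS
            ):
                # "=XX" stands for one octet; lowercase hex is accepted.
                out.append(int(line[i + 1:i + 3], 16))
                i += 3
            else:
                # Ordinary character, or a malformed "=" kept as-is.
                out.append(line[i])
                i += 1
        # Hard line breaks are preserved; soft ones are dropped.
        if not soft_break and index < len(lines) - 1:
            out += b"\r\n"
    return bytearray_to_bytes(out)


def bytearray_to_bytes(buf: bytearray) -> bytes:
    """Small helper so the decoder returns an immutable bytes object."""
    return bytes(buf)
```

Note that a CRLF inside binary data arrives as "=0D=0A" and is therefore decoded octet by octet, while a hard line break in the encoded text is emitted as CRLF. A caller on a platform with a different newline convention would convert only the latter.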
+ + For formalists, the syntax of quoted-printable data is described by + the following grammar: + + quoted-printable := qp-line *(CRLF qp-line) + + qp-line := *(qp-segment transport-padding CRLF) + qp-part transport-padding + + qp-part := qp-section + ; Maximum length of 76 characters + + qp-segment := qp-section *(SPACE / TAB) "=" + ; Maximum length of 76 characters + + qp-section := [*(ptext / SPACE / TAB) ptext] + + ptext := hex-octet / safe-char + + safe-char := <any octet with decimal value of 33 through + 60 inclusive, and 62 through 126> + ; Characters not listed as "mail-safe" in + ; RFC 2049 are also not recommended. + + hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") + ; Octet must be used for characters > 127, =, + ; SPACEs or TABs at the ends of lines, and is + ; recommended for any character not listed in + ; RFC 2049 as "mail-safe". + + transport-padding := *LWSP-char + ; Composers MUST NOT generate + ; non-zero length transport + ; padding, but receivers MUST + ; be able to handle padding + ; added by message transports. + + IMPORTANT: The addition of LWSP between the elements shown in this + BNF is NOT allowed since this BNF does not specify a structured + header field. + + + + + +Freed & Borenstein Standards Track [Page 23] + +RFC 2045 Internet Message Bodies November 1996 + + +6.8. Base64 Content-Transfer-Encoding + + The Base64 Content-Transfer-Encoding is designed to represent + arbitrary sequences of octets in a form that need not be humanly + readable. The encoding and decoding algorithms are simple, but the + encoded data are consistently only about 33 percent larger than the + unencoded data. This encoding is virtually identical to the one used + in Privacy Enhanced Mail (PEM) applications, as defined in RFC 1421. + + A 65-character subset of US-ASCII is used, enabling 6 bits to be + represented per printable character. (The extra 65th character, "=", + is used to signify a special processing function.) 
+ + NOTE: This subset has the important property that it is represented + identically in all versions of ISO 646, including US-ASCII, and all + characters in the subset are also represented identically in all + versions of EBCDIC. Other popular encodings, such as the encoding + used by the uuencode utility, Macintosh binhex 4.0 [RFC-1741], and + the base85 encoding specified as part of Level 2 PostScript, do not + share these properties, and thus do not fulfill the portability + requirements a binary transport encoding for mail must meet. + + The encoding process represents 24-bit groups of input bits as output + strings of 4 encoded characters. Proceeding from left to right, a + 24-bit input group is formed by concatenating 3 8bit input groups. + These 24 bits are then treated as 4 concatenated 6-bit groups, each + of which is translated into a single digit in the base64 alphabet. + When encoding a bit stream via the base64 encoding, the bit stream + must be presumed to be ordered with the most-significant-bit first. + That is, the first bit in the stream will be the high-order bit in + the first 8bit byte, and the eighth bit will be the low-order bit in + the first 8bit byte, and so on. + + Each 6-bit group is used as an index into an array of 64 printable + characters. The character referenced by the index is placed in the + output string. These characters, identified in Table 1, below, are + selected so as to be universally representable, and the set excludes + characters with particular significance to SMTP (e.g., ".", CR, LF) + and to the multipart boundary delimiters defined in RFC 2046 (e.g., + "-"). 
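
The encoding process described above can be sketched in Python as follows. This is an illustrative sketch and not part of the RFC text: the names `ALPHABET` and `base64_encode` are hypothetical, the alphabet string is the one listed in Table 1, and the "=" padding and the 76-character line limit are the rules given in the remainder of this section.

```python
# The 64 characters of Table 1, in index order 0..63.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def base64_encode(data: bytes, line_length: int = 76) -> str:
    """Encode octets as base64 text broken into CRLF-separated lines."""
    chars = []
    for start in range(0, len(data), 3):
        group = data[start:start + 3]
        missing = 3 - len(group)
        # Form a 24-bit group, most significant bit first; a short
        # final group is padded on the right with zero bits.
        value = int.from_bytes(group + b"\x00" * missing, "big")
        # Split the 24 bits into four 6-bit indexes into the alphabet.
        sextets = [(value >> shift) & 0x3F for shift in (18, 12, 6, 0)]
        # 1 octet yields 2 characters, 2 octets yield 3, 3 octets yield 4.
        chars.extend(ALPHABET[s] for s in sextets[:len(group) + 1])
        # Complete the 4-character quantum with "=" padding.
        chars.append("=" * missing)
    encoded = "".join(chars)
    return "\r\n".join(
        encoded[i:i + line_length]
        for i in range(0, len(encoded), line_length)
    )
```

For example, the three octets of "Man" form one full 24-bit group and encode to "TWFu", while "Ma" and "M" encode to "TWE=" and "TQ==" respectively.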
+ + + + + + + + + + + +Freed & Borenstein Standards Track [Page 24] + +RFC 2045 Internet Message Bodies November 1996 + + + Table 1: The Base64 Alphabet + + Value Encoding Value Encoding Value Encoding Value Encoding + 0 A 17 R 34 i 51 z + 1 B 18 S 35 j 52 0 + 2 C 19 T 36 k 53 1 + 3 D 20 U 37 l 54 2 + 4 E 21 V 38 m 55 3 + 5 F 22 W 39 n 56 4 + 6 G 23 X 40 o 57 5 + 7 H 24 Y 41 p 58 6 + 8 I 25 Z 42 q 59 7 + 9 J 26 a 43 r 60 8 + 10 K 27 b 44 s 61 9 + 11 L 28 c 45 t 62 + + 12 M 29 d 46 u 63 / + 13 N 30 e 47 v + 14 O 31 f 48 w (pad) = + 15 P 32 g 49 x + 16 Q 33 h 50 y + + The encoded output stream must be represented in lines of no more + than 76 characters each. All line breaks or other characters not + found in Table 1 must be ignored by decoding software. In base64 + data, characters other than those in Table 1, line breaks, and other + white space probably indicate a transmission error, about which a + warning message or even a message rejection might be appropriate + under some circumstances. + + Special processing is performed if fewer than 24 bits are available + at the end of the data being encoded. A full encoding quantum is + always completed at the end of a body. When fewer than 24 input bits + are available in an input group, zero bits are added (on the right) + to form an integral number of 6-bit groups. Padding at the end of + the data is performed using the "=" character. 
Since all base64 + input is an integral number of octets, only the following cases can + arise: (1) the final quantum of encoding input is an integral + multiple of 24 bits; here, the final unit of encoded output will be + an integral multiple of 4 characters with no "=" padding, (2) the + final quantum of encoding input is exactly 8 bits; here, the final + unit of encoded output will be two characters followed by two "=" + padding characters, or (3) the final quantum of encoding input is + exactly 16 bits; here, the final unit of encoded output will be three + characters followed by one "=" padding character. + + Because it is used only for padding at the end of the data, the + occurrence of any "=" characters may be taken as evidence that the + end of the data has been reached (without truncation in transit). No + + + +Freed & Borenstein Standards Track [Page 25] + +RFC 2045 Internet Message Bodies November 1996 + + + such assurance is possible, however, when the number of octets + transmitted was a multiple of three and no "=" characters are + present. + + Any characters outside of the base64 alphabet are to be ignored in + base64-encoded data. + + Care must be taken to use the proper octets for line breaks if base64 + encoding is applied directly to text material that has not been + converted to canonical form. In particular, text line breaks must be + converted into CRLF sequences prior to base64 encoding. The + important thing to note is that this may be done directly by the + encoder rather than in a prior canonicalization step in some + implementations. + + NOTE: There is no need to worry about quoting potential boundary + delimiters within base64-encoded bodies within multipart entities + because no hyphen characters are used in the base64 encoding. + +7. Content-ID Header Field + + In constructing a high-level user agent, it may be desirable to allow + one body to make reference to another. 
Accordingly, bodies may be + labelled using the "Content-ID" header field, which is syntactically + identical to the "Message-ID" header field: + + id := "Content-ID" ":" msg-id + + Like the Message-ID values, Content-ID values must be generated to be + world-unique. + + The Content-ID value may be used for uniquely identifying MIME + entities in several contexts, particularly for caching data + referenced by the message/external-body mechanism. Although the + Content-ID header is generally optional, its use is MANDATORY in + implementations which generate data of the optional MIME media type + "message/external-body". That is, each message/external-body entity + must have a Content-ID field to permit caching of such data. + + It is also worth noting that the Content-ID value has special + semantics in the case of the multipart/alternative media type. This + is explained in the section of RFC 2046 dealing with + multipart/alternative. + + + + + + + + +Freed & Borenstein Standards Track [Page 26] + +RFC 2045 Internet Message Bodies November 1996 + + +8. Content-Description Header Field + + The ability to associate some descriptive information with a given + body is often desirable. For example, it may be useful to mark an + "image" body as "a picture of the Space Shuttle Endeavor." Such text + may be placed in the Content-Description header field. This header + field is always optional. + + description := "Content-Description" ":" *text + + The description is presumed to be given in the US-ASCII character + set, although the mechanism specified in RFC 2047 may be used for + non-US-ASCII Content-Description values. + +9. Additional MIME Header Fields + + Future documents may elect to define additional MIME header fields + for various purposes. 
Any new header field that further describes + the content of a message should begin with the string "Content-" to + allow such fields which appear in a message header to be + distinguished from ordinary RFC 822 message header fields. + + MIME-extension-field := <Any RFC 822 header field which + begins with the string + "Content-"> + +10. Summary + + Using the MIME-Version, Content-Type, and Content-Transfer-Encoding + header fields, it is possible to include, in a standardized way, + arbitrary types of data with RFC 822 conformant mail messages. No + restrictions imposed by either RFC 821 or RFC 822 are violated, and + care has been taken to avoid problems caused by additional + restrictions imposed by the characteristics of some Internet mail + transport mechanisms (see RFC 2049). + + The next document in this set, RFC 2046, specifies the initial set of + media types that can be labelled and transported using these headers. + +11. Security Considerations + + Security issues are discussed in the second document in this set, RFC + 2046. + + + + + + + + +Freed & Borenstein Standards Track [Page 27] + +RFC 2045 Internet Message Bodies November 1996 + + +12. Authors' Addresses + + For more information, the authors of this document are best contacted + via Internet mail: + + Ned Freed + Innosoft International, Inc. + 1050 East Garvey Avenue South + West Covina, CA 91790 + USA + + Phone: +1 818 919 3600 + Fax: +1 818 919 3614 + EMail: ned@innosoft.com + + + Nathaniel S. Borenstein + First Virtual Holdings + 25 Washington Avenue + Morristown, NJ 07960 + USA + + Phone: +1 201 540 8967 + Fax: +1 201 993 3032 + EMail: nsb@nsb.fv.com + + + MIME is a result of the work of the Internet Engineering Task Force + Working Group on RFC 822 Extensions. The chairman of that group, + Greg Vaudreuil, may be reached at: + + Gregory M. 
Vaudreuil + Octel Network Services + 17080 Dallas Parkway + Dallas, TX 75248-1905 + USA + + EMail: Greg.Vaudreuil@Octel.Com + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 28] + +RFC 2045 Internet Message Bodies November 1996 + + +Appendix A -- Collected Grammar + + This appendix contains the complete BNF grammar for all the syntax + specified by this document. + + By itself, however, this grammar is incomplete. It refers by name to + several syntax rules that are defined by RFC 822. Rather than + reproduce those definitions here, and risk unintentional differences + between the two, this document simply refers the reader to RFC 822 + for the remaining definitions. Wherever a term is undefined, it + refers to the RFC 822 definition. + + attribute := token + ; Matching of attributes + ; is ALWAYS case-insensitive. + + composite-type := "message" / "multipart" / extension-token + + content := "Content-Type" ":" type "/" subtype + *(";" parameter) + ; Matching of media type and subtype + ; is ALWAYS case-insensitive. + + description := "Content-Description" ":" *text + + discrete-type := "text" / "image" / "audio" / "video" / + "application" / extension-token + + encoding := "Content-Transfer-Encoding" ":" mechanism + + entity-headers := [ content CRLF ] + [ encoding CRLF ] + [ id CRLF ] + [ description CRLF ] + *( MIME-extension-field CRLF ) + + extension-token := ietf-token / x-token + + hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") + ; Octet must be used for characters > 127, =, + ; SPACEs or TABs at the ends of lines, and is + ; recommended for any character not listed in + ; RFC 2049 as "mail-safe". + + iana-token := <A publicly-defined extension token. 
Tokens + of this form must be registered with IANA + as specified in RFC 2048.> + + + + +Freed & Borenstein Standards Track [Page 29] + +RFC 2045 Internet Message Bodies November 1996 + + + ietf-token := <An extension token defined by a + standards-track RFC and registered + with IANA.> + + id := "Content-ID" ":" msg-id + + mechanism := "7bit" / "8bit" / "binary" / + "quoted-printable" / "base64" / + ietf-token / x-token + + MIME-extension-field := <Any RFC 822 header field which + begins with the string + "Content-"> + + MIME-message-headers := entity-headers + fields + version CRLF + ; The ordering of the header + ; fields implied by this BNF + ; definition should be ignored. + + MIME-part-headers := entity-headers + [fields] + ; Any field not beginning with + ; "content-" can have no defined + ; meaning and may be ignored. + ; The ordering of the header + ; fields implied by this BNF + ; definition should be ignored. + + parameter := attribute "=" value + + ptext := hex-octet / safe-char + + qp-line := *(qp-segment transport-padding CRLF) + qp-part transport-padding + + qp-part := qp-section + ; Maximum length of 76 characters + + qp-section := [*(ptext / SPACE / TAB) ptext] + + qp-segment := qp-section *(SPACE / TAB) "=" + ; Maximum length of 76 characters + + quoted-printable := qp-line *(CRLF qp-line) + + + + + +Freed & Borenstein Standards Track [Page 30] + +RFC 2045 Internet Message Bodies November 1996 + + + safe-char := <any octet with decimal value of 33 through + 60 inclusive, and 62 through 126> + ; Characters not listed as "mail-safe" in + ; RFC 2049 are also not recommended. + + subtype := extension-token / iana-token + + token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, + or tspecials> + + transport-padding := *LWSP-char + ; Composers MUST NOT generate + ; non-zero length transport + ; padding, but receivers MUST + ; be able to handle padding + ; added by message transports. 
+ + tspecials := "(" / ")" / "<" / ">" / "@" / + "," / ";" / ":" / "\" / <"> / + "/" / "[" / "]" / "?" / "=" + ; Must be in quoted-string, + ; to use within parameter values + + type := discrete-type / composite-type + + value := token / quoted-string + + version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT + + x-token := <The two characters "X-" or "x-" followed, with + no intervening white space, by any token> + + + + + + + + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 31] + diff --git a/rfc/rfc2046.txt b/rfc/rfc2046.txt @@ -0,0 +1,2467 @@ + + + + + + +Network Working Group N. Freed +Request for Comments: 2046 Innosoft +Obsoletes: 1521, 1522, 1590 N. Borenstein +Category: Standards Track First Virtual + November 1996 + + + Multipurpose Internet Mail Extensions + (MIME) Part Two: + Media Types + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + STD 11, RFC 822 defines a message representation protocol specifying + considerable detail about US-ASCII message headers, but which leaves + the message content, or message body, as flat US-ASCII text. This + set of documents, collectively called the Multipurpose Internet Mail + Extensions, or MIME, redefines the format of messages to allow for + + (1) textual message bodies in character sets other than + US-ASCII, + + (2) an extensible set of different formats for non-textual + message bodies, + + (3) multi-part message bodies, and + + (4) textual header information in character sets other than + US-ASCII. + + These documents are based on earlier work documented in RFC 934, STD + 11, and RFC 1049, but extend and revise them. 
Because RFC 822 said + so little about message bodies, these documents are largely + orthogonal to (rather than a revision of) RFC 822. + + The initial document in this set, RFC 2045, specifies the various + headers used to describe the structure of MIME messages. This second + document defines the general structure of the MIME media typing + system and defines an initial set of media types. The third document, + RFC 2047, describes extensions to RFC 822 to allow non-US-ASCII text + + + +Freed & Borenstein Standards Track [Page 1] + +RFC 2046 Media Types November 1996 + + + data in Internet mail header fields. The fourth document, RFC 2048, + specifies various IANA registration procedures for MIME-related + facilities. The fifth and final document, RFC 2049, describes MIME + conformance criteria as well as providing some illustrative examples + of MIME message formats, acknowledgements, and the bibliography. + + These documents are revisions of RFCs 1521 and 1522, which themselves + were revisions of RFCs 1341 and 1342. An appendix in RFC 2049 + describes differences and changes from previous versions. + +Table of Contents + + 1. Introduction ......................................... 3 + 2. Definition of a Top-Level Media Type ................. 4 + 3. Overview Of The Initial Top-Level Media Types ........ 4 + 4. Discrete Media Type Values ........................... 6 + 4.1 Text Media Type ..................................... 6 + 4.1.1 Representation of Line Breaks ..................... 7 + 4.1.2 Charset Parameter ................................. 7 + 4.1.3 Plain Subtype ..................................... 11 + 4.1.4 Unrecognized Subtypes ............................. 11 + 4.2 Image Media Type .................................... 11 + 4.3 Audio Media Type .................................... 11 + 4.4 Video Media Type .................................... 12 + 4.5 Application Media Type .............................. 
12 + 4.5.1 Octet-Stream Subtype .............................. 13 + 4.5.2 PostScript Subtype ................................ 14 + 4.5.3 Other Application Subtypes ........................ 17 + 5. Composite Media Type Values .......................... 17 + 5.1 Multipart Media Type ................................ 17 + 5.1.1 Common Syntax ..................................... 19 + 5.1.2 Handling Nested Messages and Multiparts ........... 24 + 5.1.3 Mixed Subtype ..................................... 24 + 5.1.4 Alternative Subtype ............................... 24 + 5.1.5 Digest Subtype .................................... 26 + 5.1.6 Parallel Subtype .................................. 27 + 5.1.7 Other Multipart Subtypes .......................... 28 + 5.2 Message Media Type .................................. 28 + 5.2.1 RFC822 Subtype .................................... 28 + 5.2.2 Partial Subtype ................................... 29 + 5.2.2.1 Message Fragmentation and Reassembly ............ 30 + 5.2.2.2 Fragmentation and Reassembly Example ............ 31 + 5.2.3 External-Body Subtype ............................. 33 + 5.2.4 Other Message Subtypes ............................ 40 + 6. Experimental Media Type Values ....................... 40 + 7. Summary .............................................. 41 + 8. Security Considerations .............................. 41 + 9. Authors' Addresses ................................... 42 + + + +Freed & Borenstein Standards Track [Page 2] + +RFC 2046 Media Types November 1996 + + + A. Collected Grammar .................................... 43 + +1. Introduction + + The first document in this set, RFC 2045, defines a number of header + fields, including Content-Type. The Content-Type field is used to + specify the nature of the data in the body of a MIME entity, by + giving media type and subtype identifiers, and by providing auxiliary + information that may be required for certain media types. 
After the + type and subtype names, the remainder of the header field is simply a + set of parameters, specified in an attribute/value notation. The + ordering of parameters is not significant. + + In general, the top-level media type is used to declare the general + type of data, while the subtype specifies a specific format for that + type of data. Thus, a media type of "image/xyz" is enough to tell a + user agent that the data is an image, even if the user agent has no + knowledge of the specific image format "xyz". Such information can + be used, for example, to decide whether or not to show a user the raw + data from an unrecognized subtype -- such an action might be + reasonable for unrecognized subtypes of "text", but not for + unrecognized subtypes of "image" or "audio". For this reason, + registered subtypes of "text", "image", "audio", and "video" should + not contain embedded information that is really of a different type. + Such compound formats should be represented using the "multipart" or + "application" types. + + Parameters are modifiers of the media subtype, and as such do not + fundamentally affect the nature of the content. The set of + meaningful parameters depends on the media type and subtype. Most + parameters are associated with a single specific subtype. However, a + given top-level media type may define parameters which are applicable + to any subtype of that type. Parameters may be required by their + defining media type or subtype or they may be optional. MIME + implementations must also ignore any parameters whose names they do + not recognize. + + MIME's Content-Type header field and media type mechanism has been + carefully designed to be extensible, and it is expected that the set + of media type/subtype pairs and their associated parameters will grow + significantly over time. Several other MIME facilities, such as + transfer encodings and "message/external-body" access types, are + likely to have new values defined over time. 
In order to ensure that + the set of such values is developed in an orderly, well-specified, + and public manner, MIME sets up a registration process which uses the + Internet Assigned Numbers Authority (IANA) as a central registry for + MIME's various areas of extensibility. The registration process for + these areas is described in a companion document, RFC 2048. + + + +Freed & Borenstein Standards Track [Page 3] + +RFC 2046 Media Types November 1996 + + + The initial seven standard top-level media types are defined and + described in the remainder of this document. + +2. Definition of a Top-Level Media Type + + The definition of a top-level media type consists of: + + (1) a name and a description of the type, including + criteria for whether a particular type would qualify + under that type, + + (2) the names and definitions of parameters, if any, which + are defined for all subtypes of that type (including + whether such parameters are required or optional), + + (3) how a user agent and/or gateway should handle unknown + subtypes of this type, + + (4) general considerations on gatewaying entities of this + top-level type, if any, and + + (5) any restrictions on content-transfer-encodings for + entities of this top-level type. + +3. Overview Of The Initial Top-Level Media Types + + The five discrete top-level media types are: + + (1) text -- textual information. The subtype "plain" in + particular indicates plain text containing no + formatting commands or directives of any sort. Plain + text is intended to be displayed "as-is". No special + software is required to get the full meaning of the + text, aside from support for the indicated character + set. Other subtypes are to be used for enriched text in + forms where application software may enhance the + appearance of the text, but such software must not be + required in order to get the general idea of the + content. 
Possible subtypes of "text" thus include any + word processor format that can be read without + resorting to software that understands the format. In + particular, formats that employ embedded binary + formatting information are not considered directly + readable. A very simple and portable subtype, + "richtext", was defined in RFC 1341, with a further + revision in RFC 1896 under the name "enriched". + + + + + +Freed & Borenstein Standards Track [Page 4] + +RFC 2046 Media Types November 1996 + + + (2) image -- image data. "Image" requires a display device + (such as a graphical display, a graphics printer, or a + FAX machine) to view the information. An initial + subtype is defined for the widely-used image format + JPEG. + + (3) audio -- audio data. "Audio" requires an audio output + device (such as a speaker or a telephone) to "display" + the contents. An initial subtype "basic" is defined in + this document. + + (4) video -- video data. "Video" requires the capability + to display moving images, typically including + specialized hardware and software. An initial subtype + "mpeg" is defined in this document. + + (5) application -- some other kind of data, typically + either uninterpreted binary data or information to be + processed by an application. The subtype "octet- + stream" is to be used in the case of uninterpreted + binary data, in which case the simplest recommended + action is to offer to write the information into a file + for the user. The "PostScript" subtype is also defined + for the transport of PostScript material. Other + expected uses for "application" include spreadsheets, + data for mail-based scheduling systems, and languages + for "active" (computational) messaging, and word + processing formats that are not directly readable. 
+ Note that security considerations may exist for some + types of application data, most notably + "application/PostScript" and any form of active + messaging. These issues are discussed later in this + document. + + The two composite top-level media types are: + + (1) multipart -- data consisting of multiple entities of + independent data types. Four subtypes are initially + defined, including the basic "mixed" subtype specifying + a generic mixed set of parts, "alternative" for + representing the same data in multiple formats, + "parallel" for parts intended to be viewed + simultaneously, and "digest" for multipart entities in + which each part has a default type of "message/rfc822". + + + + + + +Freed & Borenstein Standards Track [Page 5] + +RFC 2046 Media Types November 1996 + + + (2) message -- an encapsulated message. A body of media + type "message" is itself all or a portion of some kind + of message object. Such objects may or may not in turn + contain other entities. The "rfc822" subtype is used + when the encapsulated content is itself an RFC 822 + message. The "partial" subtype is defined for partial + RFC 822 messages, to permit the fragmented transmission + of bodies that are thought to be too large to be passed + through transport facilities in one piece. Another + subtype, "external-body", is defined for specifying + large bodies by reference to an external data source. + + It should be noted that the list of media type values given here may + be augmented in time, via the mechanisms described above, and that + the set of subtypes is expected to grow substantially. + +4. Discrete Media Type Values + + Five of the seven initial media type values refer to discrete bodies. + The content of these types must be handled by non-MIME mechanisms; + they are opaque to MIME processors. + +4.1. Text Media Type + + The "text" media type is intended for sending material which is + principally textual in form. 
A "charset" parameter may be used to + indicate the character set of the body text for "text" subtypes, + notably including the subtype "text/plain", which is a generic + subtype for plain text. Plain text does not provide for or allow + formatting commands, font attribute specifications, processing + instructions, interpretation directives, or content markup. Plain + text is seen simply as a linear sequence of characters, possibly + interrupted by line breaks or page breaks. Plain text may allow the + stacking of several characters in the same position in the text. + Plain text in scripts like Arabic and Hebrew may also include + facilities that allow the arbitrary mixing of text segments with + opposite writing directions. + + Beyond plain text, there are many formats for representing what might + be known as "rich text". An interesting characteristic of many such + representations is that they are to some extent readable even without + the software that interprets them. It is useful, then, to + distinguish them, at the highest level, from such unreadable data as + images, audio, or text represented in an unreadable form. In the + absence of appropriate interpretation software, it is reasonable to + show subtypes of "text" to the user, while it is not reasonable to do + so with most nontextual data. Such formatted textual data should be + represented using subtypes of "text". + + + +Freed & Borenstein Standards Track [Page 6] + +RFC 2046 Media Types November 1996 + + +4.1.1. Representation of Line Breaks + + The canonical form of any MIME "text" subtype MUST always represent a + line break as a CRLF sequence. Similarly, any occurrence of CRLF in + MIME "text" MUST represent a line break. Use of CR and LF outside of + line break sequences is also forbidden. + + This rule applies regardless of format or character set or sets + involved. + + NOTE: The proper interpretation of line breaks when a body is + displayed depends on the media type. 
In particular, while it is + appropriate to treat a line break as a transition to a new line when + displaying a "text/plain" body, this treatment is actually incorrect + for other subtypes of "text" like "text/enriched" [RFC-1896]. + Similarly, whether or not line breaks should be added during display + operations is also a function of the media type. It should not be + necessary to add any line breaks to display "text/plain" correctly, + whereas proper display of "text/enriched" requires the appropriate + addition of line breaks. + + NOTE: Some protocols define a maximum line length. E.g. SMTP + [RFC-821] allows a maximum of 998 octets before the next CRLF sequence. + To be transported by such protocols, data which includes too long + segments without CRLF sequences must be encoded with a suitable + content-transfer-encoding. + +4.1.2. Charset Parameter + + A critical parameter that may be specified in the Content-Type field + for "text/plain" data is the character set. This is specified with a + "charset" parameter, as in: + + Content-type: text/plain; charset=iso-8859-1 + + Unlike some other parameter values, the values of the charset + parameter are NOT case sensitive. The default character set, which + must be assumed in the absence of a charset parameter, is US-ASCII. + + The specification for any future subtypes of "text" must specify + whether or not they will also utilize a "charset" parameter, and may + possibly restrict its values as well. For other subtypes of "text" + than "text/plain", the semantics of the "charset" parameter should be + defined to be identical to those specified here for "text/plain", + i.e., the body consists entirely of characters in the given charset. + In particular, definers of future "text" subtypes should pay close + attention to the implications of multioctet character sets for their + subtype definitions. 
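As a rough illustration of the charset rules above, the following Python sketch reads the "charset" parameter from a Content-Type value and falls back to US-ASCII when the parameter or the whole field is missing. The helper name text_charset is hypothetical and not part of any specification; it leans on the standard library's email.message.Message, whose get_content_charset() lowercases the value, which fits the rule that charset values are not case sensitive.

```python
from email.message import Message


def text_charset(content_type_value=None):
    """Return the lowercased charset for a "text" entity.

    Hypothetical helper: content_type_value is the raw value of a
    Content-Type field, or None when the field is absent.
    """
    msg = Message()
    if content_type_value is not None:
        msg["Content-Type"] = content_type_value
    # The default that must be assumed without a charset parameter is US-ASCII.
    return msg.get_content_charset(failobj="us-ascii")


if __name__ == "__main__":
    print(text_charset("text/plain; charset=ISO-8859-1"))
    print(text_charset("text/plain"))
```

The sketch only covers the lookup and the default; whether a receiving agent can actually render the named character set is a separate question.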
+ + + +Freed & Borenstein Standards Track [Page 7] + +RFC 2046 Media Types November 1996 + + + The charset parameter for subtypes of "text" gives a name of a + character set, as "character set" is defined in RFC 2045. The rules + regarding line breaks detailed in the previous section must also be + observed -- a character set whose definition does not conform to + these rules cannot be used in a MIME "text" subtype. + + An initial list of predefined character set names can be found at the + end of this section. Additional character sets may be registered + with IANA. + + Other media types than subtypes of "text" might choose to employ the + charset parameter as defined here, but with the CRLF/line break + restriction removed. Therefore, all character sets that conform to + the general definition of "character set" in RFC 2045 can be + registered for MIME use. + + Note that if the specified character set includes 8-bit characters + and such characters are used in the body, a Content-Transfer-Encoding + header field and a corresponding encoding on the data are required in + order to transmit the body via some mail transfer protocols, such as + SMTP [RFC-821]. + + The default character set, US-ASCII, has been the subject of some + confusion and ambiguity in the past. Not only were there some + ambiguities in the definition, there have been wide variations in + practice. In order to eliminate such ambiguity and variations in the + future, it is strongly recommended that new user agents explicitly + specify a character set as a media type parameter in the Content-Type + header field. "US-ASCII" does not indicate an arbitrary 7-bit + character set, but specifies that all octets in the body must be + interpreted as characters according to the US-ASCII character set. + National and application-oriented versions of ISO 646 [ISO-646] are + usually NOT identical to US-ASCII, and in that case their use in + Internet mail is explicitly discouraged. 
The omission of the ISO 646 + character set from this document is deliberate in this regard. The + character set name of "US-ASCII" explicitly refers to the character + set defined in ANSI X3.4-1986 [US-ASCII]. The new international + reference version (IRV) of the 1991 edition of ISO 646 is identical + to US-ASCII. The character set name "ASCII" is reserved and must not + be used for any purpose. + + NOTE: RFC 821 explicitly specifies "ASCII", and references an earlier + version of the American Standard. Insofar as one of the purposes of + specifying a media type and character set is to permit the receiver + to unambiguously determine how the sender intended the coded message + to be interpreted, assuming anything other than "strict ASCII" as the + default would risk unintentional and incompatible changes to the + semantics of messages now being transmitted. This also implies that + + + +Freed & Borenstein Standards Track [Page 8] + +RFC 2046 Media Types November 1996 + + + messages containing characters coded according to other versions of + ISO 646 than US-ASCII and the 1991 IRV, or using code-switching + procedures (e.g., those of ISO 2022), as well as 8bit or multiple + octet character encodings MUST use an appropriate character set + specification to be consistent with MIME. + + The complete US-ASCII character set is listed in ANSI X3.4-1986. + Note that the control characters including DEL (0-31, 127) have no + defined meaning apart from the combination CRLF (US-ASCII values + 13 and 10) indicating a new line. Two of the characters have de + facto meanings in wide use: FF (12) often means "start subsequent + text on the beginning of a new page"; and TAB or HT (9) often (though + not always) means "move the cursor to the next available column after + the current position where the column number is a multiple of 8 + (counting the first column as column 0)." 
Aside from these + conventions, any use of the control characters or DEL in a body must + either occur + + (1) because a subtype of text other than "plain" + specifically assigns some additional meaning, or + + (2) within the context of a private agreement between the + sender and recipient. Such private agreements are + discouraged and should be replaced by the other + capabilities of this document. + + NOTE: An enormous proliferation of character sets exists beyond + US-ASCII. A large number of partially or totally overlapping character + sets is NOT a good thing. A SINGLE character set that can be used + universally for representing all of the world's languages in Internet + mail would be preferable. Unfortunately, existing practice in + several communities seems to point to the continued use of multiple + character sets in the near future. A small number of standard + character sets are, therefore, defined for Internet use in this + document. + + The defined charset values are: + + (1) US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII]. + + (2) ISO-8859-X -- where "X" is to be replaced, as + necessary, for the parts of ISO-8859 [ISO-8859]. Note + that the ISO 646 character sets have deliberately been + omitted in favor of their 8859 replacements, which are + the designated character sets for Internet mail. As of + the publication of this document, the legitimate values + for "X" are the digits 1 through 10. + + + + +Freed & Borenstein Standards Track [Page 9] + +RFC 2046 Media Types November 1996 + + + Characters in the range 128-159 have no assigned meaning in + ISO-8859-X. Characters with values below 128 in ISO-8859-X have the same + assigned meaning as they do in US-ASCII. 
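The list of defined charset values and the note on the unassigned range can be checked mechanically. The Python sketch below is one possible reading, not a normative test: the function names is_predefined_charset and has_unassigned_octets are made up for illustration, and the check covers only the names listed above (US-ASCII and ISO-8859-1 through ISO-8859-10), not character sets registered later with IANA.

```python
import re

# ISO-8859-X with X in 1..10, compared case-insensitively via lower().
_ISO_8859 = re.compile(r"iso-8859-(?:[1-9]|10)")


def is_predefined_charset(name):
    """True if name is one of the charset values defined in this section."""
    lowered = name.lower()
    return lowered == "us-ascii" or _ISO_8859.fullmatch(lowered) is not None


def has_unassigned_octets(body):
    """True if the bytes contain values 128-159, which carry no assigned
    meaning in any ISO-8859-X character set."""
    return any(128 <= octet <= 159 for octet in body)


if __name__ == "__main__":
    print(is_predefined_charset("ISO-8859-10"))
    print(has_unassigned_octets(b"caf\xe9"))
```

Note that the reserved name "ASCII" is deliberately rejected by the first function, since only "US-ASCII" is defined.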
 + + Part 6 of ISO 8859 (Latin/Arabic alphabet) and part 8 (Latin/Hebrew + alphabet) include both characters for which the normal writing + direction is right to left and characters for which it is left to + right, but do not define a canonical ordering method for representing + bi-directional text. The charset values "ISO-8859-6" and + "ISO-8859-8", however, specify that the visual method is used [RFC-1556]. + + All of these character sets are used as pure 7bit or 8bit sets + without any shift or escape functions. The meaning of shift and + escape sequences in these character sets is not defined. + + The character sets specified above are the ones that were relatively + uncontroversial during the drafting of MIME. This document does not + endorse the use of any particular character set other than US-ASCII, + and recognizes that the future evolution of world character sets + remains unclear. + + Note that the character set used, if anything other than US-ASCII, + must always be explicitly specified in the Content-Type field. + + No character set name other than those defined above may be used in + Internet mail without the publication of a formal specification and + its registration with IANA, or by private agreement, in which case + the character set name must begin with "X-". + + Implementors are discouraged from defining new character sets unless + absolutely necessary. + + The "charset" parameter has been defined primarily for the purpose of + textual data, and is described in this section for that reason. + However, it is conceivable that non-textual data might also wish to + specify a charset value for some purpose, in which case the same + syntax and values should be used. + + In general, composition software should always use the "lowest common + denominator" character set possible. 
For example, if a body contains + only US-ASCII characters, it SHOULD be marked as being in the + US-ASCII character set, not ISO-8859-1, which, like all the ISO-8859 + family of character sets, is a superset of US-ASCII. More generally, + if a widely-used character set is a subset of another character set, + and a body contains only characters in the widely-used subset, it + should be labelled as being in that subset. This will increase the + chances that the recipient will be able to view the resulting entity + correctly. + + + +Freed & Borenstein Standards Track [Page 10] + +RFC 2046 Media Types November 1996 + + +4.1.3. Plain Subtype + + The simplest and most important subtype of "text" is "plain". This + indicates plain text that does not contain any formatting commands or + directives. Plain text is intended to be displayed "as-is", that is, + no interpretation of embedded formatting commands, font attribute + specifications, processing instructions, interpretation directives, + or content markup should be necessary for proper display. The + default media type of "text/plain; charset=us-ascii" for Internet + mail describes existing Internet practice. That is, it is the type + of body defined by RFC 822. + + No other "text" subtype is defined by this document. + +4.1.4. Unrecognized Subtypes + + Unrecognized subtypes of "text" should be treated as subtype "plain" + as long as the MIME implementation knows how to handle the charset. + Unrecognized subtypes which also specify an unrecognized charset + should be treated as "application/octet-stream". + +4.2. Image Media Type + + A media type of "image" indicates that the body contains an image. + The subtype names the specific image format. These names are not + case sensitive. An initial subtype is "jpeg" for the JPEG format + using JFIF encoding [JPEG]. 
 + + The list of "image" subtypes given here is neither exclusive nor + exhaustive, and is expected to grow as more types are registered with + IANA, as described in RFC 2048. + + Unrecognized subtypes of "image" should at a minimum be treated as + "application/octet-stream". Implementations may optionally elect to + pass subtypes of "image" that they do not specifically recognize to a + secure and robust general-purpose image viewing application, if such + an application is available. + + NOTE: Use of a general-purpose image viewing application in this way + inherits the security problems of the most dangerous type supported + by the application. + +4.3. Audio Media Type + + A media type of "audio" indicates that the body contains audio data. + Although there is not yet a consensus on an "ideal" audio format for + use with computers, there is a pressing need for a format capable of + providing interoperable behavior. + + + +Freed & Borenstein Standards Track [Page 11] + +RFC 2046 Media Types November 1996 + + + The initial subtype of "basic" is specified to meet this requirement + by providing an absolutely minimal lowest common denominator audio + format. It is expected that richer formats for higher quality and/or + lower bandwidth audio will be defined by a later document. + + The content of the "audio/basic" subtype is single channel audio + encoded using 8bit ISDN mu-law [PCM] at a sample rate of 8000 Hz. + + Unrecognized subtypes of "audio" should at a minimum be treated as + "application/octet-stream". Implementations may optionally elect to + pass subtypes of "audio" that they do not specifically recognize to a + robust general-purpose audio playing application, if such an + application is available. + +4.4. Video Media Type + + A media type of "video" indicates that the body contains a + time-varying-picture image, possibly with color and coordinated sound. 
 + The term 'video' is used in its most generic sense, rather than with + reference to any particular technology or format, and is not meant to + preclude subtypes such as animated drawings encoded compactly. The + subtype "mpeg" refers to video coded according to the MPEG standard + [MPEG]. + + Note that although in general this document strongly discourages the + mixing of multiple media in a single body, it is recognized that many + so-called video formats include a representation for synchronized + audio, and this is explicitly permitted for subtypes of "video". + + Unrecognized subtypes of "video" should at a minimum be treated as + "application/octet-stream". Implementations may optionally elect to + pass subtypes of "video" that they do not specifically recognize to a + robust general-purpose video display application, if such an + application is available. + +4.5. Application Media Type + + The "application" media type is to be used for discrete data which do + not fit in any of the other categories, and particularly for data to + be processed by some type of application program. This is + information which must be processed by an application before it is + viewable or usable by a user. Expected uses for the "application" + media type include file transfer, spreadsheets, data for mail-based + scheduling systems, and languages for "active" (computational) + material. (The latter, in particular, can pose security problems + which must be understood by implementors, and are considered in + detail in the discussion of the "application/PostScript" media type.) + + + + +Freed & Borenstein Standards Track [Page 12] + +RFC 2046 Media Types November 1996 + + + For example, a meeting scheduler might define a standard + representation for information about proposed meeting dates. An + intelligent user agent would use this information to conduct a dialog + with the user, and might then send additional material based on that + dialog. 
More generally, there have been several "active" messaging + languages developed in which programs in a suitably specialized + language are transported to a remote location and automatically run + in the recipient's environment. + + Such applications may be defined as subtypes of the "application" + media type. This document defines two subtypes: + + octet-stream, and PostScript. + + The subtype of "application" will often be either the name or include + part of the name of the application for which the data are intended. + This does not mean, however, that any application program name may be + used freely as a subtype of "application". + +4.5.1. Octet-Stream Subtype + + The "octet-stream" subtype is used to indicate that a body contains + arbitrary binary data. The set of currently defined parameters is: + + (1) TYPE -- the general type or category of binary data. + This is intended as information for the human recipient + rather than for any automatic processing. + + (2) PADDING -- the number of bits of padding that were + appended to the bit-stream comprising the actual + contents to produce the enclosed 8bit byte-oriented + data. This is useful for enclosing a bit-stream in a + body when the total number of bits is not a multiple of + 8. + + Both of these parameters are optional. + + An additional parameter, "CONVERSIONS", was defined in RFC 1341 but + has since been removed. RFC 1341 also defined the use of a "NAME" + parameter which gave a suggested file name to be used if the data + were to be written to a file. This has been deprecated in + anticipation of a separate Content-Disposition header field, to be + defined in a subsequent RFC. + + The recommended action for an implementation that receives an + "application/octet-stream" entity is to simply offer to put the data + in a file, with any Content-Transfer-Encoding undone, or perhaps to + use it as input to a user-specified process. 
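Sections 4.1.4 through 4.4 each state a fallback for unrecognized subtypes, and "application/octet-stream" is where most of them end up. The Python sketch below shows one way a reader might encode those fallbacks for the discrete media types only. The function name effective_media_type and the two lookup tables are assumptions for illustration; a real agent would fill them from whatever types and character sets it can actually handle.

```python
DISCRETE_TYPES = {"text", "image", "audio", "video", "application"}

# Hypothetical table of what this imaginary agent can display or process.
KNOWN_SUBTYPES = {
    ("text", "plain"), ("image", "jpeg"), ("audio", "basic"),
    ("video", "mpeg"), ("application", "octet-stream"),
    ("application", "postscript"),
}
KNOWN_CHARSETS = {"us-ascii"} | {"iso-8859-%d" % n for n in range(1, 11)}


def effective_media_type(maintype, subtype, charset="us-ascii"):
    """Map a discrete media type to the type the agent should treat it as."""
    maintype, subtype = maintype.lower(), subtype.lower()
    if maintype not in DISCRETE_TYPES:
        # Composite types (multipart, message) follow different rules.
        raise ValueError("only discrete media types are handled here")
    if (maintype, subtype) in KNOWN_SUBTYPES:
        return maintype, subtype
    # Unrecognized "text" subtypes degrade to plain text if the charset is known.
    if maintype == "text" and charset.lower() in KNOWN_CHARSETS:
        return "text", "plain"
    # Everything else is treated as opaque binary data.
    return "application", "octet-stream"


if __name__ == "__main__":
    print(effective_media_type("text", "enriched", "ISO-8859-1"))
    print(effective_media_type("image", "png"))
```

Once an entity has been mapped to "application/octet-stream", the recommended action above applies: undo the Content-Transfer-Encoding and offer to save the data, rather than guessing at a program to run.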
 + + + +Freed & Borenstein Standards Track [Page 13] + +RFC 2046 Media Types November 1996 + + + To reduce the danger of transmitting rogue programs, it is strongly + recommended that implementations NOT implement a path-search + mechanism whereby an arbitrary program named in the Content-Type + parameter (e.g., an "interpreter=" parameter) is found and executed + using the message body as input. + +4.5.2. PostScript Subtype + + A media type of "application/postscript" indicates a PostScript + program. Currently two variants of the PostScript language are + allowed; the original level 1 variant is described in [POSTSCRIPT] + and the more recent level 2 variant is described in [POSTSCRIPT2]. + + PostScript is a registered trademark of Adobe Systems, Inc. Use of + the MIME media type "application/postscript" implies recognition of + that trademark and all the rights it entails. + + The PostScript language definition provides facilities for internal + labelling of the specific language features a given program uses. + This labelling, called the PostScript document structuring + conventions, or DSC, is very general and provides substantially more + information than just the language level. The use of document + structuring conventions, while not required, is strongly recommended + as an aid to interoperability. Documents which lack proper + structuring conventions cannot be tested to see whether or not they + will work in a given environment. As such, some systems may assume + the worst and refuse to process unstructured documents. + + The execution of general-purpose PostScript interpreters entails + serious security risks, and implementors are discouraged from simply + sending PostScript bodies to "off-the-shelf" interpreters. 
While it + is usually safe to send PostScript to a printer, where the potential + for harm is greatly constrained by typical printer environments, + implementors should consider all of the following before they add + interactive display of PostScript bodies to their MIME readers. + + The remainder of this section outlines some, though probably not all, + of the possible problems with the transport of PostScript entities. + + (1) Dangerous operations in the PostScript language + include, but may not be limited to, the PostScript + operators "deletefile", "renamefile", "filenameforall", + and "file". "File" is only dangerous when applied to + something other than standard input or output. + Implementations may also define additional nonstandard + file operators; these may also pose a threat to + security. "Filenameforall", the wildcard file search + operator, may appear at first glance to be harmless. + + + +Freed & Borenstein Standards Track [Page 14] + +RFC 2046 Media Types November 1996 + + + Note, however, that this operator has the potential to + reveal information about what files the recipient has + access to, and this information may itself be + sensitive. Message senders should avoid the use of + potentially dangerous file operators, since these + operators are quite likely to be unavailable in secure + PostScript implementations. Message receiving and + displaying software should either completely disable + all potentially dangerous file operators or take + special care not to delegate any special authority to + their operation. These operators should be viewed as + being done by an outside agency when interpreting + PostScript documents. Such disabling and/or checking + should be done completely outside of the reach of the + PostScript language itself; care should be taken to + insure that no method exists for re-enabling full- + function versions of these operators. 
+ + (2) The PostScript language provides facilities for exiting + the normal interpreter, or server, loop. Changes made + in this "outer" environment are customarily retained + across documents, and may in some cases be retained + semipermanently in nonvolatile memory. The operators + associated with exiting the interpreter loop have the + potential to interfere with subsequent document + processing. As such, their unrestrained use + constitutes a threat of service denial. PostScript + operators that exit the interpreter loop include, but + may not be limited to, the exitserver and startjob + operators. Message sending software should not + generate PostScript that depends on exiting the + interpreter loop to operate, since the ability to exit + will probably be unavailable in secure PostScript + implementations. Message receiving and displaying + software should completely disable the ability to make + retained changes to the PostScript environment by + eliminating or disabling the "startjob" and + "exitserver" operations. If these operations cannot be + eliminated or completely disabled the password + associated with them should at least be set to a hard- + to-guess value. + + (3) PostScript provides operators for setting system-wide + and device-specific parameters. These parameter + settings may be retained across jobs and may + potentially pose a threat to the correct operation of + the interpreter. The PostScript operators that set + system and device parameters include, but may not be + + + +Freed & Borenstein Standards Track [Page 15] + +RFC 2046 Media Types November 1996 + + + limited to, the "setsystemparams" and "setdevparams" + operators. Message sending software should not + generate PostScript that depends on the setting of + system or device parameters to operate correctly. The + ability to set these parameters will probably be + unavailable in secure PostScript implementations. 
+ Message receiving and displaying software should + disable the ability to change system and device + parameters. If these operators cannot be completely + disabled the password associated with them should at + least be set to a hard-to-guess value. + + (4) Some PostScript implementations provide nonstandard + facilities for the direct loading and execution of + machine code. Such facilities are quite obviously open + to substantial abuse. Message sending software should + not make use of such features. Besides being totally + hardware-specific, they are also likely to be + unavailable in secure implementations of PostScript. + Message receiving and displaying software should not + allow such operators to be used if they exist. + + (5) PostScript is an extensible language, and many, if not + most, implementations of it provide a number of their + own extensions. This document does not deal with such + extensions explicitly since they constitute an unknown + factor. Message sending software should not make use + of nonstandard extensions; they are likely to be + missing from some implementations. Message receiving + and displaying software should make sure that any + nonstandard PostScript operators are secure and don't + present any kind of threat. + + (6) It is possible to write PostScript that consumes huge + amounts of various system resources. It is also + possible to write PostScript programs that loop + indefinitely. Both types of programs have the + potential to cause damage if sent to unsuspecting + recipients. Message-sending software should avoid the + construction and dissemination of such programs, which + is antisocial. Message receiving and displaying + software should provide appropriate mechanisms to abort + processing after a reasonable amount of time has + elapsed. In addition, PostScript interpreters should be + limited to the consumption of only a reasonable amount + of any given system resource. 
 + + + + + +Freed & Borenstein Standards Track [Page 16] + +RFC 2046 Media Types November 1996 + + + (7) It is possible to include raw binary information inside + PostScript in various forms. This is not recommended + for use in Internet mail, both because it is not + supported by all PostScript interpreters and because it + significantly complicates the use of a MIME + Content-Transfer-Encoding. (Without such binary, PostScript + may typically be viewed as line-oriented data. The + treatment of CRLF sequences becomes extremely + problematic if binary and line-oriented data are mixed + in a single PostScript data stream.) + + (8) Finally, bugs may exist in some PostScript interpreters + which could possibly be exploited to gain unauthorized + access to a recipient's system. Apart from noting this + possibility, there is no specific action to take to + prevent this, apart from the timely correction of such + bugs if any are found. + +4.5.3. Other Application Subtypes + + It is expected that many other subtypes of "application" will be + defined in the future. MIME implementations must at a minimum treat + any unrecognized subtypes as being equivalent to + "application/octet-stream". + +5. Composite Media Type Values + + The remaining two of the seven initial Content-Type values refer to + composite entities. Composite entities are handled using MIME + mechanisms -- a MIME processor typically handles the body directly. + +5.1. Multipart Media Type + + In the case of multipart entities, in which one or more different + sets of data are combined in a single body, a "multipart" media type + field must appear in the entity's header. The body must then contain + one or more body parts, each preceded by a boundary delimiter line, + and the last one followed by a closing boundary delimiter line. + After its boundary delimiter line, each body part then consists of a + header area, a blank line, and a body area. 
Thus a body part is + similar to an RFC 822 message in syntax, but different in meaning. + + A body part is an entity and hence is NOT to be interpreted as + actually being an RFC 822 message. To begin with, NO header fields + are actually required in body parts. A body part that starts with a + blank line, therefore, is allowed and is a body part for which all + default values are to be assumed. In such a case, the absence of a + Content-Type header usually indicates that the corresponding body has + + + +Freed & Borenstein Standards Track [Page 17] + +RFC 2046 Media Types November 1996 + + + a content-type of "text/plain; charset=US-ASCII". + + The only header fields that have defined meaning for body parts are + those the names of which begin with "Content-". All other header + fields may be ignored in body parts. Although they should generally + be retained if at all possible, they may be discarded by gateways if + necessary. Such other fields are permitted to appear in body parts + but must not be depended on. "X-" fields may be created for + experimental or private purposes, with the recognition that the + information they contain may be lost at some gateways. + + NOTE: The distinction between an RFC 822 message and a body part is + subtle, but important. A gateway between Internet and X.400 mail, + for example, must be able to tell the difference between a body part + that contains an image and a body part that contains an encapsulated + message, the body of which is a JPEG image. In order to represent + the latter, the body part must have "Content-Type: message/rfc822", + and its body (after the blank line) must be the encapsulated message, + with its own "Content-Type: image/jpeg" header field. The use of + similar syntax facilitates the conversion of messages to body parts, + and vice versa, but the distinction between the two must be + understood by implementors. 
(For the special case in which parts + actually are messages, a "digest" subtype is also defined.) + + As stated previously, each body part is preceded by a boundary + delimiter line that contains the boundary delimiter. The boundary + delimiter MUST NOT appear inside any of the encapsulated parts, on a + line by itself or as the prefix of any line. This implies that it is + crucial that the composing agent be able to choose and specify a + unique boundary parameter value that does not contain the boundary + parameter value of an enclosing multipart as a prefix. + + All present and future subtypes of the "multipart" type must use an + identical syntax. Subtypes may differ in their semantics, and may + impose additional restrictions on syntax, but must conform to the + required syntax for the "multipart" type. This requirement ensures + that all conformant user agents will at least be able to recognize + and separate the parts of any multipart entity, even those of an + unrecognized subtype. + + As stated in the definition of the Content-Transfer-Encoding field + [RFC 2045], no encoding other than "7bit", "8bit", or "binary" is + permitted for entities of type "multipart". The "multipart" boundary + delimiters and header fields are always represented as 7bit US-ASCII + in any case (though the header fields may encode non-US-ASCII header + text as per RFC 2047) and data within the body parts can be encoded + on a part-by-part basis, with Content-Transfer-Encoding fields for + each appropriate body part. + + + +Freed & Borenstein Standards Track [Page 18] + +RFC 2046 Media Types November 1996 + + +5.1.1. Common Syntax + + This section defines a common syntax for subtypes of "multipart". + All subtypes of "multipart" must use this syntax. A simple example + of a multipart message also appears in this section. An example of a + more complex multipart message is given in RFC 2049. + + The Content-Type field for multipart entities requires one parameter, + "boundary". 
The boundary delimiter line is then defined as a line + consisting entirely of two hyphen characters ("-", decimal value 45) + followed by the boundary parameter value from the Content-Type header + field, optional linear whitespace, and a terminating CRLF. + + NOTE: The hyphens are for rough compatibility with the earlier RFC + 934 method of message encapsulation, and for ease of searching for + the boundaries in some implementations. However, it should be noted + that multipart messages are NOT completely compatible with RFC 934 + encapsulations; in particular, they do not obey RFC 934 quoting + conventions for embedded lines that begin with hyphens. This + mechanism was chosen over the RFC 934 mechanism because the latter + causes lines to grow with each level of quoting. The combination of + this growth with the fact that SMTP implementations sometimes wrap + long lines made the RFC 934 mechanism unsuitable for use in the event + that deeply-nested multipart structuring is ever desired. + + WARNING TO IMPLEMENTORS: The grammar for parameters on the Content- + type field is such that it is often necessary to enclose the boundary + parameter values in quotes on the Content-type line. This is not + always necessary, but never hurts. Implementors should be sure to + study the grammar carefully in order to avoid producing invalid + Content-type fields. 
Thus, a typical "multipart" Content-Type header + field might look like this: + + Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p + + But the following is not valid: + + Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p + + (because of the colon) and must instead be represented as + + Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p" + + This Content-Type value indicates that the content consists of one or + more parts, each with a structure that is syntactically identical to + an RFC 822 message, except that the header area is allowed to be + completely empty, and that the parts are each preceded by the line + + + + +Freed & Borenstein Standards Track [Page 19] + +RFC 2046 Media Types November 1996 + + + --gc0pJq0M:08jU534c0p + + The boundary delimiter MUST occur at the beginning of a line, i.e., + following a CRLF, and the initial CRLF is considered to be attached + to the boundary delimiter line rather than part of the preceding + part. The boundary may be followed by zero or more characters of + linear whitespace. It is then terminated by either another CRLF and + the header fields for the next part, or by two CRLFs, in which case + there are no header fields for the next part. If no Content-Type + field is present it is assumed to be "message/rfc822" in a + "multipart/digest" and "text/plain" otherwise. + + NOTE: The CRLF preceding the boundary delimiter line is conceptually + attached to the boundary so that it is possible to have a part that + does not end with a CRLF (line break). Body parts that must be + considered to end with line breaks, therefore, must have two CRLFs + preceding the boundary delimiter line, the first of which is part of + the preceding body part, and the second of which is part of the + encapsulation boundary. + + Boundary delimiters must not appear within the encapsulated material, + and must be no longer than 70 characters, not counting the two + leading hyphens. 
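The quoting warning and the 70-character limit above lend themselves to a small check. The Python sketch below builds the "boundary" parameter and the matching delimiter line under two stated assumptions: the set of characters allowed in a boundary follows the "bchars" grammar given later in section 5.1.1, and the characters that force quoting are the "tspecials" from the Content-Type grammar in RFC 2045. The function names boundary_parameter and delimiter_line are hypothetical.

```python
import re

# Assumed from the boundary grammar: up to 70 characters, not ending in a space.
_BOUNDARY = re.compile(r"[0-9A-Za-z'()+_,\-./:=? ]{0,69}[0-9A-Za-z'()+_,\-./:=?]")

# Assumed from RFC 2045: characters that may not appear in an unquoted token.
_TSPECIALS = set('()<>@,;:\\"/[]?=')


def boundary_parameter(boundary):
    """Return the boundary parameter text, quoted only when required."""
    if _BOUNDARY.fullmatch(boundary) is None:
        raise ValueError("not a valid boundary: %r" % boundary)
    if any(ch in _TSPECIALS or ch == " " for ch in boundary):
        return 'boundary="%s"' % boundary
    return "boundary=" + boundary


def delimiter_line(boundary):
    """Return the delimiter line that precedes each body part."""
    return "--" + boundary + "\r\n"


if __name__ == "__main__":
    print("Content-Type: multipart/mixed; " + boundary_parameter("gc0pJq0M:08jU534c0p"))
    print(delimiter_line("gc0pJq0M:08jU534c0p"), end="")
```

Quoting every boundary unconditionally would also be acceptable, since the text notes that quoting never hurts; the conditional form is shown only to make the rule visible.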
+ + The boundary delimiter line following the last body part is a + distinguished delimiter that indicates that no further body parts + will follow. Such a delimiter line is identical to the previous + delimiter lines, with the addition of two more hyphens after the + boundary parameter value. + + --gc0pJq0M:08jU534c0p-- + + NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the + boundary value with the beginning of each candidate line. An exact + match of the entire candidate line is not required; it is sufficient + that the boundary appear in its entirety following the CRLF. + + There appears to be room for additional information prior to the + first boundary delimiter line and following the final boundary + delimiter line. These areas should generally be left blank, and + implementations must ignore anything that appears before the first + boundary delimiter line or after the last one. + + NOTE: These "preamble" and "epilogue" areas are generally not used + because of the lack of proper typing of these parts and the lack of + clear semantics for handling these areas at gateways, particularly + X.400 gateways. However, rather than leaving the preamble area + blank, many MIME implementations have found this to be a convenient + + + +Freed & Borenstein Standards Track [Page 20] + +RFC 2046 Media Types November 1996 + + + place to insert an explanatory note for recipients who read the + message with pre-MIME software, since such notes will be ignored by + MIME-compliant software. + + NOTE: Because boundary delimiters must not appear in the body parts + being encapsulated, a user agent must exercise care to choose a + unique boundary parameter value. The boundary parameter value in the + example above could have been the result of an algorithm designed to + produce boundary delimiters with a very low probability of already + existing in the data to be encapsulated without having to prescan the + data. 
Alternate algorithms might result in more "readable" boundary + delimiters for a recipient with an old user agent, but would require + more attention to the possibility that the boundary delimiter might + appear at the beginning of some line in the encapsulated part. The + simplest boundary delimiter line possible is something like "---", + with a closing boundary delimiter line of "-----". + + As a very simple example, the following multipart message has two + parts, both of them plain text, one of them explicitly typed and one + of them implicitly typed: + + From: Nathaniel Borenstein <nsb@bellcore.com> + To: Ned Freed <ned@innosoft.com> + Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST) + Subject: Sample message + MIME-Version: 1.0 + Content-type: multipart/mixed; boundary="simple boundary" + + This is the preamble. It is to be ignored, though it + is a handy place for composition agents to include an + explanatory note to non-MIME conformant readers. + + --simple boundary + + This is implicitly typed plain US-ASCII text. + It does NOT end with a linebreak. + --simple boundary + Content-type: text/plain; charset=us-ascii + + This is explicitly typed plain US-ASCII text. + It DOES end with a linebreak. + + --simple boundary-- + + This is the epilogue. It is also to be ignored. + + + + + + +Freed & Borenstein Standards Track [Page 21] + +RFC 2046 Media Types November 1996 + + + The use of a media type of "multipart" in a body part within another + "multipart" entity is explicitly allowed. In such cases, for obvious + reasons, care must be taken to ensure that each nested "multipart" + entity uses a different boundary delimiter. See RFC 2049 for an + example of nested "multipart" entities. + + The use of the "multipart" media type with only a single body part + may be useful in certain contexts, and is explicitly permitted. + + NOTE: Experience has shown that a "multipart" media type with a + single body part is useful for sending non-text media types. 
It has + the advantage of providing the preamble as a place to include + decoding instructions. In addition, a number of SMTP gateways move + or remove the MIME headers, and a clever MIME decoder can take a good + guess at multipart boundaries even in the absence of the Content-Type + header and thereby successfully decode the message. + + The only mandatory global parameter for the "multipart" media type is + the boundary parameter, which consists of 1 to 70 characters from a + set of characters known to be very robust through mail gateways, and + NOT ending with white space. (If a boundary delimiter line appears to + end with white space, the white space must be presumed to have been + added by a gateway, and must be deleted.) It is formally specified + by the following BNF: + + boundary := 0*69<bchars> bcharsnospace + + bchars := bcharsnospace / " " + + bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / + "+" / "_" / "," / "-" / "." / + "/" / ":" / "=" / "?" + + Overall, the body of a "multipart" entity may be specified as + follows: + + dash-boundary := "--" boundary + ; boundary taken from the value of + ; boundary parameter of the + ; Content-Type field. + + multipart-body := [preamble CRLF] + dash-boundary transport-padding CRLF + body-part *encapsulation + close-delimiter transport-padding + [CRLF epilogue] + + + + + +Freed & Borenstein Standards Track [Page 22] + +RFC 2046 Media Types November 1996 + + + transport-padding := *LWSP-char + ; Composers MUST NOT generate + ; non-zero length transport + ; padding, but receivers MUST + ; be able to handle padding + ; added by message transports. + + encapsulation := delimiter transport-padding + CRLF body-part + + delimiter := CRLF dash-boundary + + close-delimiter := delimiter "--" + + preamble := discard-text + + epilogue := discard-text + + discard-text := *(*text CRLF) *text + ; May be ignored or discarded. 
+ + body-part := MIME-part-headers [CRLF *OCTET] + ; Lines in a body-part must not start + ; with the specified dash-boundary and + ; the delimiter must not appear anywhere + ; in the body part. Note that the + ; semantics of a body-part differ from + ; the semantics of a message, as + ; described in the text. + + OCTET := <any 0-255 octet value> + + IMPORTANT: The free insertion of linear-white-space and RFC 822 + comments between the elements shown in this BNF is NOT allowed since + this BNF does not specify a structured header field. + + NOTE: In certain transport enclaves, RFC 822 restrictions such as + the one that limits bodies to printable US-ASCII characters may not + be in force. (That is, the transport domains may exist that resemble + standard Internet mail transport as specified in RFC 821 and assumed + by RFC 822, but without certain restrictions.) The relaxation of + these restrictions should be construed as locally extending the + definition of bodies, for example to include octets outside of the + US-ASCII range, as long as these extensions are supported by the + transport and adequately documented in the Content- Transfer-Encoding + header field. However, in no event are headers (either message + headers or body part headers) allowed to contain anything other than + US-ASCII characters. + + + +Freed & Borenstein Standards Track [Page 23] + +RFC 2046 Media Types November 1996 + + + NOTE: Conspicuously missing from the "multipart" type is a notion of + structured, related body parts. It is recommended that those wishing + to provide more structured or integrated multipart messaging + facilities should define subtypes of multipart that are syntactically + identical but define relationships between the various parts. For + example, subtypes of multipart could be defined that include a + distinguished part which in turn is used to specify the relationships + between the other parts, probably referring to them by their + Content-ID field. 
Old implementations will not recognize the new + subtype if this approach is used, but will treat it as + multipart/mixed and will thus be able to show the user the parts that + are recognized. + +5.1.2. Handling Nested Messages and Multiparts + + The "message/rfc822" subtype defined in a subsequent section of this + document has no terminating condition other than running out of data. + Similarly, an improperly truncated "multipart" entity may not have + any terminating boundary marker, and can turn up operationally due to + mail system malfunctions. + + It is essential that such entities be handled correctly when they are + themselves embedded inside another "multipart" structure. MIME + implementations are therefore required to recognize outer level + boundary markers at ANY level of inner nesting. It is not sufficient + to only check for the next expected marker or other terminating + condition. + +5.1.3. Mixed Subtype + + The "mixed" subtype of "multipart" is intended for use when the body + parts are independent and need to be bundled in a particular order. + Any "multipart" subtypes that an implementation does not recognize + must be treated as being of subtype "mixed". + +5.1.4. Alternative Subtype + + The "multipart/alternative" type is syntactically identical to + "multipart/mixed", but the semantics are different. In particular, + each of the body parts is an "alternative" version of the same + information. + + Systems should recognize that the contents of the various parts are + interchangeable. Systems should choose the "best" type based on the + local environment and preferences, in some cases even through user + interaction. As with "multipart/mixed", the order of body parts is + significant. In this case, the alternatives appear in an order of + increasing faithfulness to the original content. 
In general, the + + + +Freed & Borenstein Standards Track [Page 24] + +RFC 2046 Media Types November 1996 + + + best choice is the LAST part of a type supported by the recipient + system's local environment. + + "Multipart/alternative" may be used, for example, to send a message + in a fancy text format in such a way that it can easily be displayed + anywhere: + + From: Nathaniel Borenstein <nsb@bellcore.com> + To: Ned Freed <ned@innosoft.com> + Date: Mon, 22 Mar 1993 09:41:09 -0800 (PST) + Subject: Formatted text mail + MIME-Version: 1.0 + Content-Type: multipart/alternative; boundary=boundary42 + + --boundary42 + Content-Type: text/plain; charset=us-ascii + + ... plain text version of message goes here ... + + --boundary42 + Content-Type: text/enriched + + ... RFC 1896 text/enriched version of same message + goes here ... + + --boundary42 + Content-Type: application/x-whatever + + ... fanciest version of same message goes here ... + + --boundary42-- + + In this example, users whose mail systems understood the + "application/x-whatever" format would see only the fancy version, + while other users would see only the enriched or plain text version, + depending on the capabilities of their system. + + In general, user agents that compose "multipart/alternative" entities + must place the body parts in increasing order of preference, that is, + with the preferred format last. For fancy text, the sending user + agent should put the plainest format first and the richest format + last. Receiving user agents should pick and display the last format + they are capable of displaying. In the case where one of the + alternatives is itself of type "multipart" and contains unrecognized + sub-parts, the user agent may choose either to show that alternative, + an earlier alternative, or both. 
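The selection rule for receiving user agents can be sketched as follows. This is a minimal illustration, not a complete user agent: choose_alternative is a hypothetical name, the part types are assumed to be bare "type/subtype" strings with any parameters already stripped, and the comparison is case-insensitive because media type names are.

```python
def choose_alternative(part_types, supported):
    """Return the index of the LAST part whose media type is supported,
    or None when no alternative can be displayed."""
    supported = {media_type.lower() for media_type in supported}
    # Walk backwards: later parts are more faithful to the original.
    for index in range(len(part_types) - 1, -1, -1):
        if part_types[index].lower() in supported:
            return index
    return None
```

For the three-part example above, an agent that supports only "text/plain" and "text/enriched" would select the second part, and an agent that supports none of the listed types gets None and must fall back to some other behaviour, such as offering the parts for saving.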
+ + + + + +Freed & Borenstein Standards Track [Page 25] + +RFC 2046 Media Types November 1996 + + + NOTE: From an implementor's perspective, it might seem more sensible + to reverse this ordering, and have the plainest alternative last. + However, placing the plainest alternative first is the friendliest + possible option when "multipart/alternative" entities are viewed + using a non-MIME-conformant viewer. While this approach does impose + some burden on conformant MIME viewers, interoperability with older + mail readers was deemed to be more important in this case. + + It may be the case that some user agents, if they can recognize more + than one of the formats, will prefer to offer the user the choice of + which format to view. This makes sense, for example, if a message + includes both a nicely- formatted image version and an easily-edited + text version. What is most critical, however, is that the user not + automatically be shown multiple versions of the same data. Either + the user should be shown the last recognized version or should be + given the choice. + + THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: Each part of a + "multipart/alternative" entity represents the same data, but the + mappings between the two are not necessarily without information + loss. For example, information is lost when translating ODA to + PostScript or plain text. It is recommended that each part should + have a different Content-ID value in the case where the information + content of the two parts is not identical. And when the information + content is identical -- for example, where several parts of type + "message/external-body" specify alternate ways to access the + identical data -- the same Content-ID field value should be used, to + optimize any caching mechanisms that might be present on the + recipient's end. 
However, the Content-ID values used by the parts + should NOT be the same Content-ID value that describes the + "multipart/alternative" as a whole, if there is any such Content-ID + field. That is, one Content-ID value will refer to the + "multipart/alternative" entity, while one or more other Content-ID + values will refer to the parts inside it. + +5.1.5. Digest Subtype + + This document defines a "digest" subtype of the "multipart" Content- + Type. This type is syntactically identical to "multipart/mixed", but + the semantics are different. In particular, in a digest, the default + Content-Type value for a body part is changed from "text/plain" to + "message/rfc822". This is done to allow a more readable digest + format that is largely compatible (except for the quoting convention) + with RFC 934. + + Note: Though it is possible to specify a Content-Type value for a + body part in a digest which is other than "message/rfc822", such as a + "text/plain" part containing a description of the material in the + digest, actually doing so is undesirable. The "multipart/digest" + Content-Type is intended to be used to send collections of messages. + If a "text/plain" part is needed, it should be included as a separate + part of a "multipart/mixed" message. + + A digest in this format might, then, look something like this: + + From: Moderator-Address + To: Recipient-List + Date: Mon, 22 Mar 1994 13:34:51 +0000 + Subject: Internet Digest, volume 42 + MIME-Version: 1.0 + Content-Type: multipart/mixed; + boundary="---- main boundary ----" + + ------ main boundary ---- + + ...Introductory text or table of contents... + + ------ main boundary ---- + Content-Type: multipart/digest; + boundary="---- next message ----" + + ------ next message ---- + + From: someone-else + Date: Fri, 26 Mar 1993 11:13:32 +0200 + Subject: my opinion + + ...body goes here ... 
+ + ------ next message ---- + + From: someone-else-again + Date: Fri, 26 Mar 1993 10:07:13 -0500 + Subject: my different opinion + + ... another body goes here ... + + ------ next message ------ + + ------ main boundary ------ + +5.1.6. Parallel Subtype + + This document defines a "parallel" subtype of the "multipart" + Content-Type. This type is syntactically identical to + "multipart/mixed", but the semantics are different. In particular, + + + +Freed & Borenstein Standards Track [Page 27] + +RFC 2046 Media Types November 1996 + + + in a parallel entity, the order of body parts is not significant. + + A common presentation of this type is to display all of the parts + simultaneously on hardware and software that are capable of doing so. + However, composing agents should be aware that many mail readers will + lack this capability and will show the parts serially in any event. + +5.1.7. Other Multipart Subtypes + + Other "multipart" subtypes are expected in the future. MIME + implementations must in general treat unrecognized subtypes of + "multipart" as being equivalent to "multipart/mixed". + +5.2. Message Media Type + + It is frequently desirable, in sending mail, to encapsulate another + mail message. A special media type, "message", is defined to + facilitate this. In particular, the "rfc822" subtype of "message" is + used to encapsulate RFC 822 messages. + + NOTE: It has been suggested that subtypes of "message" might be + defined for forwarded or rejected messages. However, forwarded and + rejected messages can be handled as multipart messages in which the + first part contains any control or descriptive information, and a + second part, of type "message/rfc822", is the forwarded or rejected + message. Composing rejection and forwarding messages in this manner + will preserve the type information on the original message and allow + it to be correctly presented to the recipient, and hence is strongly + encouraged. 
+ + Subtypes of "message" often impose restrictions on what encodings are + allowed. These restrictions are described in conjunction with each + specific subtype. + + Mail gateways, relays, and other mail handling agents are commonly + known to alter the top-level header of an RFC 822 message. In + particular, they frequently add, remove, or reorder header fields. + These operations are explicitly forbidden for the encapsulated + headers embedded in the bodies of messages of type "message." + +5.2.1. RFC822 Subtype + + A media type of "message/rfc822" indicates that the body contains an + encapsulated message, with the syntax of an RFC 822 message. + However, unlike top-level RFC 822 messages, the restriction that each + "message/rfc822" body must include a "From", "Date", and at least one + destination header is removed and replaced with the requirement that + at least one of "From", "Subject", or "Date" must be present. + + + +Freed & Borenstein Standards Track [Page 28] + +RFC 2046 Media Types November 1996 + + + It should be noted that, despite the use of the numbers "822", a + "message/rfc822" entity isn't restricted to material in strict + conformance to RFC822, nor are the semantics of "message/rfc822" + objects restricted to the semantics defined in RFC822. More + specifically, a "message/rfc822" message could well be a News article + or a MIME message. + + No encoding other than "7bit", "8bit", or "binary" is permitted for + the body of a "message/rfc822" entity. The message header fields are + always US-ASCII in any case, and data within the body can still be + encoded, in which case the Content-Transfer-Encoding header field in + the encapsulated message will reflect this. Non-US-ASCII text in the + headers of an encapsulated message can be specified using the + mechanisms described in RFC 2047. + +5.2.2. 
Partial Subtype + + The "partial" subtype is defined to allow large entities to be + delivered as several separate pieces of mail and automatically + reassembled by a receiving user agent. (The concept is similar to IP + fragmentation and reassembly in the basic Internet Protocols.) This + mechanism can be used when intermediate transport agents limit the + size of individual messages that can be sent. The media type + "message/partial" thus indicates that the body contains a fragment of + a larger entity. + + Because data of type "message" may never be encoded in base64 or + quoted-printable, a problem might arise if "message/partial" entities + are constructed in an environment that supports binary or 8bit + transport. The problem is that the binary data would be split into + multiple "message/partial" messages, each of them requiring binary + transport. If such messages were encountered at a gateway into a + 7bit transport environment, there would be no way to properly encode + them for the 7bit world, aside from waiting for all of the fragments, + reassembling the inner message, and then encoding the reassembled + data in base64 or quoted-printable. Since it is possible that + different fragments might go through different gateways, even this is + not an acceptable solution. For this reason, it is specified that + entities of type "message/partial" must always have a content- + transfer-encoding of 7bit (the default). In particular, even in + environments that support binary or 8bit transport, the use of a + content- transfer-encoding of "8bit" or "binary" is explicitly + prohibited for MIME entities of type "message/partial". This in turn + implies that the inner message must not use "8bit" or "binary" + encoding. 
+ + + + + + +Freed & Borenstein Standards Track [Page 29] + +RFC 2046 Media Types November 1996 + + + Because some message transfer agents may choose to automatically + fragment large messages, and because such agents may use very + different fragmentation thresholds, it is possible that the pieces of + a partial message, upon reassembly, may prove themselves to comprise + a partial message. This is explicitly permitted. + + Three parameters must be specified in the Content-Type field of type + "message/partial": The first, "id", is a unique identifier, as close + to a world-unique identifier as possible, to be used to match the + fragments together. (In general, the identifier is essentially a + message-id; if placed in double quotes, it can be ANY message-id, in + accordance with the BNF for "parameter" given in RFC 2045.) The + second, "number", an integer, is the fragment number, which indicates + where this fragment fits into the sequence of fragments. The third, + "total", another integer, is the total number of fragments. This + third subfield is required on the final fragment, and is optional + (though encouraged) on the earlier fragments. Note also that these + parameters may be given in any order. + + Thus, the second piece of a 3-piece message may have either of the + following header fields: + + Content-Type: Message/Partial; number=2; total=3; + id="oc=jpbe0M2Yt4s@thumper.bellcore.com" + + Content-Type: Message/Partial; + id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; + number=2 + + But the third piece MUST specify the total number of fragments: + + Content-Type: Message/Partial; number=3; total=3; + id="oc=jpbe0M2Yt4s@thumper.bellcore.com" + + Note that fragment numbering begins with 1, not 0. + + When the fragments of an entity broken up in this manner are put + together, the result is always a complete MIME entity, which may have + its own Content-Type header field, and thus may contain any other + data type. + +5.2.2.1. 
Message Fragmentation and Reassembly + + The semantics of a reassembled partial message must be those of the + "inner" message, rather than of a message containing the inner + message. This makes it possible, for example, to send a large audio + message as several partial messages, and still have it appear to the + recipient as a simple audio message rather than as an encapsulated + message containing an audio message. That is, the encapsulation of + the message is considered to be "transparent". + + When generating and reassembling the pieces of a "message/partial" + message, the headers of the encapsulated message must be merged with + the headers of the enclosing entities. In this process the following + rules must be observed: + + (1) Fragmentation agents must split messages at line + boundaries only. This restriction is imposed because + splits at points other than the ends of lines in turn + depend on message transports being able to preserve + the semantics of messages that don't end with a CRLF + sequence. Many transports are incapable of preserving + such semantics. + + (2) All of the header fields from the initial enclosing + message, except those that start with "Content-" and + the specific header fields "Subject", "Message-ID", + "Encrypted", and "MIME-Version", must be copied, in + order, to the new message. + + (3) The header fields in the enclosed message which start + with "Content-", plus the "Subject", "Message-ID", + "Encrypted", and "MIME-Version" fields, must be + appended, in order, to the header fields of the new + message. Any header fields in the enclosed message + which do not start with "Content-" (except for the + "Subject", "Message-ID", "Encrypted", and "MIME- + Version" fields) will be ignored and dropped. + + (4) All of the header fields from the second and any + subsequent enclosing messages are discarded by the + reassembly process. 
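Rules (2) through (4) above can be sketched as a small header merge. This is an illustration under stated assumptions: merge_partial_headers and taken_from_inner are hypothetical names, header fields are represented as (name, value) pairs in their original order, and only the headers of the FIRST enclosing message are passed in, since rule (4) discards the rest.

```python
# Fields that always come from the enclosed (inner) message.
INNER_ONLY = {"subject", "message-id", "encrypted", "mime-version"}


def taken_from_inner(name):
    """True for "Content-*" fields and the four specially named fields."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in INNER_ONLY


def merge_partial_headers(first_outer, inner):
    """Build the header list of the reassembled message.

    first_outer: (name, value) pairs from the first enclosing message.
    inner:       (name, value) pairs from the enclosed message.
    """
    # Rule (2): copy the outer fields, in order, except the special ones.
    merged = [(n, v) for n, v in first_outer if not taken_from_inner(n)]
    # Rule (3): append the special inner fields, in order; drop the rest.
    merged += [(n, v) for n, v in inner if taken_from_inner(n)]
    return merged
```

Header field names are compared case-insensitively, so "Content-type" and "Content-Type" are treated alike. The sketch preserves the relative order of fields within each source, which is all that the phrase "in order" in the rules requires.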
+ +5.2.2.2. Fragmentation and Reassembly Example + + If an audio message is broken into two pieces, the first piece might + look something like this: + + X-Weird-Header-1: Foo + From: Bill@host.com + To: joe@otherhost.com + Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) + Subject: Audio mail (part 1 of 2) + Message-ID: <id1@host.com> + MIME-Version: 1.0 + Content-type: message/partial; id="ABC@host.com"; + + + +Freed & Borenstein Standards Track [Page 31] + +RFC 2046 Media Types November 1996 + + + number=1; total=2 + + X-Weird-Header-1: Bar + X-Weird-Header-2: Hello + Message-ID: <anotherid@foo.com> + Subject: Audio mail + MIME-Version: 1.0 + Content-type: audio/basic + Content-transfer-encoding: base64 + + ... first half of encoded audio data goes here ... + + and the second half might look something like this: + + From: Bill@host.com + To: joe@otherhost.com + Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) + Subject: Audio mail (part 2 of 2) + MIME-Version: 1.0 + Message-ID: <id2@host.com> + Content-type: message/partial; + id="ABC@host.com"; number=2; total=2 + + ... second half of encoded audio data goes here ... + + Then, when the fragmented message is reassembled, the resulting + message to be displayed to the user should look something like this: + + X-Weird-Header-1: Foo + From: Bill@host.com + To: joe@otherhost.com + Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) + Subject: Audio mail + Message-ID: <anotherid@foo.com> + MIME-Version: 1.0 + Content-type: audio/basic + Content-transfer-encoding: base64 + + ... first half of encoded audio data goes here ... + ... second half of encoded audio data goes here ... + + The inclusion of a "References" field in the headers of the second + and subsequent pieces of a fragmented message that references the + Message-Id on the previous piece may be of benefit to mail readers + that understand and track references. However, the generation of + such "References" fields is entirely optional. 
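The body side of reassembly can be sketched in the same spirit. The function name join_fragments is hypothetical, and the sketch makes several assumptions: the fragments passed in all carry the same "id", each is given as a (number, total, body) triple with total set to None where the parameter was omitted, bodies are text split at line boundaries, and a duplicated fragment is resolved by keeping the first copy seen.

```python
def join_fragments(fragments):
    """Concatenate fragment bodies in fragment-number order.

    fragments: iterable of (number, total_or_None, body) triples that
    share one "id". Raises ValueError when the set is incomplete or
    inconsistent.
    """
    fragments = list(fragments)
    # "total" is required only on the final fragment, so collect
    # whatever values are present and insist that they agree.
    totals = {total for _, total, _ in fragments if total is not None}
    if len(totals) != 1:
        raise ValueError("need exactly one consistent 'total' value")
    total = totals.pop()

    bodies = {}
    for number, _, body in fragments:
        # Fragment numbering begins with 1, not 0.
        if not 1 <= number <= total:
            raise ValueError("fragment number %d out of range" % number)
        bodies.setdefault(number, body)  # keep the first copy of a duplicate

    missing = [n for n in range(1, total + 1) if n not in bodies]
    if missing:
        raise ValueError("missing fragments: %s" % missing)
    return "".join(bodies[n] for n in range(1, total + 1))
```

The joined text is the complete inner message, that is, its header fields followed by its body, so it is the input from which the enclosed header fields are read before the header merge rules are applied.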
+ + + + + +Freed & Borenstein Standards Track [Page 32] + +RFC 2046 Media Types November 1996 + + + Finally, it should be noted that the "Encrypted" header field has + been made obsolete by Privacy Enhanced Messaging (PEM) [RFC-1421, + RFC-1422, RFC-1423, RFC-1424], but the rules above are nevertheless + believed to describe the correct way to treat it if it is encountered + in the context of conversion to and from "message/partial" fragments. + +5.2.3. External-Body Subtype + + The external-body subtype indicates that the actual body data are not + included, but merely referenced. In this case, the parameters + describe a mechanism for accessing the external data. + + When a MIME entity is of type "message/external-body", it consists of + a header, two consecutive CRLFs, and the message header for the + encapsulated message. If another pair of consecutive CRLFs appears, + this of course ends the message header for the encapsulated message. + However, since the encapsulated message's body is itself external, it + does NOT appear in the area that follows. For example, consider the + following message: + + Content-type: message/external-body; + access-type=local-file; + name="/u/nsb/Me.jpeg" + + Content-type: image/jpeg + Content-ID: <id42@guppylake.bellcore.com> + Content-Transfer-Encoding: binary + + THIS IS NOT REALLY THE BODY! + + The area at the end, which might be called the "phantom body", is + ignored for most external-body messages. However, it may be used to + contain auxiliary information for some such messages, as indeed it is + when the access-type is "mail- server". The only access-type defined + in this document that uses the phantom body is "mail-server", but + other access-types may be defined in the future in other + specifications that use this area. + + The encapsulated headers in ALL "message/external-body" entities MUST + include a Content-ID header field to give a unique identifier by + which to reference the data. 
This identifier may be used for caching + mechanisms, and for recognizing the receipt of the data when the + access-type is "mail-server". + + Note that, as specified here, the tokens that describe external-body + data, such as file names and mail server commands, are required to be + in the US-ASCII character set. + If this proves problematic in practice, a new mechanism may be + required as a future extension to MIME, either as newly defined + access-types for "message/external-body" or by some other mechanism. + + As with "message/partial", MIME entities of type "message/external- + body" MUST have a content-transfer-encoding of 7bit (the default). + In particular, even in environments that support binary or 8bit + transport, the use of a content-transfer-encoding of "8bit" or + "binary" is explicitly prohibited for entities of type + "message/external-body". + +5.2.3.1. General External-Body Parameters + + The parameters that may be used with any "message/external-body" + are: + + (1) ACCESS-TYPE -- A word indicating the supported access + mechanism by which the file or data may be obtained. + This word is not case sensitive. Values include, but + are not limited to, "FTP", "ANON-FTP", "TFTP", "LOCAL- + FILE", and "MAIL-SERVER". Future values, except for + experimental values beginning with "X-", must be + registered with IANA, as described in RFC 2048. + This parameter is unconditionally mandatory and MUST be + present on EVERY "message/external-body". + + (2) EXPIRATION -- The date (in the RFC 822 "date-time" + syntax, as extended by RFC 1123 to permit 4 digits in + the year field) after which the existence of the + external data is not guaranteed. This parameter may be + used with ANY access-type and is ALWAYS optional. + + (3) SIZE -- The size (in octets) of the data. 
The intent + of this parameter is to help the recipient decide + whether or not to expend the necessary resources to + retrieve the external data. Note that this describes + the size of the data in its canonical form, that is, + before any Content-Transfer-Encoding has been applied + or after the data have been decoded. This parameter + may be used with ANY access-type and is ALWAYS + optional. + + (4) PERMISSION -- A case-insensitive field that indicates + whether or not it is expected that clients might also + attempt to overwrite the data. By default, or if + permission is "read", the assumption is that they are + not, and that if the data is retrieved once, it is + never needed again. If PERMISSION is "read-write", + + + +Freed & Borenstein Standards Track [Page 34] + +RFC 2046 Media Types November 1996 + + + this assumption is invalid, and any local copy must be + considered no more than a cache. "Read" and "Read- + write" are the only defined values of permission. This + parameter may be used with ANY access-type and is + ALWAYS optional. + + The precise semantics of the access-types defined here are described + in the sections that follow. + +5.2.3.2. The 'ftp' and 'tftp' Access-Types + + An access-type of FTP or TFTP indicates that the message body is + accessible as a file using the FTP [RFC-959] or TFTP [RFC- 783] + protocols, respectively. For these access-types, the following + additional parameters are mandatory: + + (1) NAME -- The name of the file that contains the actual + body data. + + (2) SITE -- A machine from which the file may be obtained, + using the given protocol. This must be a fully + qualified domain name, not a nickname. + + (3) Before any data are retrieved, using FTP, the user will + generally need to be asked to provide a login id and a + password for the machine named by the site parameter. + For security reasons, such an id and password are not + specified as content-type parameters, but must be + obtained from the user. 
+ + In addition, the following parameters are optional: + + (1) DIRECTORY -- A directory from which the data named by + NAME should be retrieved. + + (2) MODE -- A case-insensitive string indicating the mode + to be used when retrieving the information. The valid + values for access-type "TFTP" are "NETASCII", "OCTET", + and "MAIL", as specified by the TFTP protocol [RFC- + 783]. The valid values for access-type "FTP" are + "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a + decimal integer, typically 8. These correspond to the + representation types "A" "E" "I" and "L n" as specified + by the FTP protocol [RFC-959]. Note that "BINARY" and + "TENEX" are not valid values for MODE and that "OCTET" + or "IMAGE" or "LOCAL8" should be used instead. IF MODE + is not specified, the default value is "NETASCII" for + TFTP and "ASCII" otherwise. + + + +Freed & Borenstein Standards Track [Page 35] + +RFC 2046 Media Types November 1996 + + +5.2.3.3. The 'anon-ftp' Access-Type + + The "anon-ftp" access-type is identical to the "ftp" access type, + except that the user need not be asked to provide a name and password + for the specified site. Instead, the ftp protocol will be used with + login "anonymous" and a password that corresponds to the user's mail + address. + +5.2.3.4. The 'local-file' Access-Type + + An access-type of "local-file" indicates that the actual body is + accessible as a file on the local machine. Two additional parameters + are defined for this access type: + + (1) NAME -- The name of the file that contains the actual + body data. This parameter is mandatory for the + "local-file" access-type. + + (2) SITE -- A domain specifier for a machine or set of + machines that are known to have access to the data + file. This optional parameter is used to describe the + locality of reference for the data, that is, the site + or sites at which the file is expected to be visible. 
+ Asterisks may be used for wildcard matching to a part + of a domain name, such as "*.bellcore.com", to indicate + a set of machines on which the data should be directly + visible, while a single asterisk may be used to + indicate a file that is expected to be universally + available, e.g., via a global file system. + +5.2.3.5. The 'mail-server' Access-Type + + The "mail-server" access-type indicates that the actual body is + available from a mail server. Two additional parameters are defined + for this access-type: + + (1) SERVER -- The addr-spec of the mail server from which + the actual body data can be obtained. This parameter + is mandatory for the "mail-server" access-type. + + (2) SUBJECT -- The subject that is to be used in the mail + that is sent to obtain the data. Note that keying mail + servers on Subject lines is NOT recommended, but such + mail servers are known to exist. This is an optional + parameter. + + + + + + +Freed & Borenstein Standards Track [Page 36] + +RFC 2046 Media Types November 1996 + + + Because mail servers accept a variety of syntaxes, some of which are + multiline, the full command to be sent to a mail server is not + included as a parameter in the content-type header field. Instead, + it is provided as the "phantom body" when the media type is + "message/external-body" and the access-type is mail-server. + + Note that MIME does not define a mail server syntax. Rather, it + allows the inclusion of arbitrary mail server commands in the phantom + body. Implementations must include the phantom body in the body of + the message they send to the mail server address to retrieve the + relevant data. + + Unlike other access-types, mail-server access is asynchronous and + will happen at an unpredictable time in the future. For this reason, + it is important that there be a mechanism by which the returned data + can be matched up with the original "message/external-body" entity. 
+ MIME mail servers must use the same Content-ID field on the returned + message that was used in the original "message/external-body" + entities, to facilitate such matching. + +5.2.3.6. External-Body Security Issues + + "Message/external-body" entities give rise to two important security + issues: + + (1) Accessing data via a "message/external-body" reference + effectively results in the message recipient performing + an operation that was specified by the message + originator. It is therefore possible for the message + originator to trick a recipient into doing something + they would not have done otherwise. For example, an + originator could specify an action that attempts + retrieval of material that the recipient is not + authorized to obtain, causing the recipient to + unwittingly violate some security policy. For this + reason, user agents capable of resolving external + references must always take steps to describe the + action they are to take to the recipient and ask for + explicit permission prior to performing it. + + The 'mail-server' access-type is particularly + vulnerable, in that it causes the recipient to send a + new message whose contents are specified by the + original message's originator. Given the potential for + abuse, any such request messages that are constructed + should contain a clear indication that they were + generated automatically (e.g. in a Comments: header + field) in an attempt to resolve a MIME + + + +Freed & Borenstein Standards Track [Page 37] + +RFC 2046 Media Types November 1996 + + + "message/external-body" reference. + + (2) MIME will sometimes be used in environments that + provide some guarantee of message integrity and + authenticity. If present, such guarantees may apply + only to the actual direct content of messages -- they + may or may not apply to data accessed through MIME's + "message/external-body" mechanism. 
In particular, it + may be possible to subvert certain access mechanisms + even when the messaging system itself is secure. + + It should be noted that this problem exists either with + or without the availability of MIME mechanisms. A + casual reference to an FTP site containing a document + in the text of a secure message brings up similar + issues -- the only difference is that MIME provides for + automatic retrieval of such material, and users may + place unwarranted trust in such automatic retrieval + mechanisms. + +5.2.3.7. Examples and Further Explanations + + When the external-body mechanism is used in conjunction with the + "multipart/alternative" media type it extends the functionality of + "multipart/alternative" to include the case where the same entity is + provided in the same format but via different access mechanisms. + When this is done the originator of the message must order the parts + first in terms of preferred formats and then by preferred access + mechanisms. The recipient's viewer should then evaluate the list + both in terms of format and access mechanisms. + + With the emerging possibility of very wide-area file systems, it + becomes very hard to know in advance the set of machines where a file + will and will not be accessible directly from the file system. + Therefore it may make sense to provide both a file name, to be tried + directly, and the name of one or more sites from which the file is + known to be accessible. An implementation can try to retrieve remote + files using FTP or any other protocol, using anonymous file retrieval + or prompting the user for the necessary name and password. If an + external body is accessible via multiple mechanisms, the sender may + include multiple entities of type "message/external-body" within the + body parts of an enclosing "multipart/alternative" entity. + + However, the external-body mechanism is not intended to be limited to + file retrieval, as shown by the mail-server access-type. 
Beyond + this, one can imagine, for example, using a video server for external + references to video clips. + + + + +Freed & Borenstein Standards Track [Page 38] + +RFC 2046 Media Types November 1996 + + + The embedded message header fields which appear in the body of the + "message/external-body" data must be used to declare the media type + of the external body if it is anything other than plain US-ASCII + text, since the external body does not have a header section to + declare its type. Similarly, any Content-transfer-encoding other + than "7bit" must also be declared here. Thus a complete + "message/external-body" message, referring to an object in PostScript + format, might look like this: + + From: Whomever + To: Someone + Date: Whenever + Subject: whatever + MIME-Version: 1.0 + Message-ID: <id1@host.com> + Content-Type: multipart/alternative; boundary=42 + Content-ID: <id001@guppylake.bellcore.com> + + --42 + Content-Type: message/external-body; name="BodyFormats.ps"; + site="thumper.bellcore.com"; mode="image"; + access-type=ANON-FTP; directory="pub"; + expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" + + Content-type: application/postscript + Content-ID: <id42@guppylake.bellcore.com> + + --42 + Content-Type: message/external-body; access-type=local-file; + name="/u/nsb/writing/rfcs/RFC-MIME.ps"; + site="thumper.bellcore.com"; + expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" + + Content-type: application/postscript + Content-ID: <id42@guppylake.bellcore.com> + + --42 + Content-Type: message/external-body; + access-type=mail-server + server="listserv@bogus.bitnet"; + expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" + + Content-type: application/postscript + Content-ID: <id42@guppylake.bellcore.com> + + get RFC-MIME.DOC + + --42-- + + + +Freed & Borenstein Standards Track [Page 39] + +RFC 2046 Media Types November 1996 + + + Note that in the above examples, the default Content-transfer- + encoding of "7bit" is assumed for the external postscript 
data. + + Like the "message/partial" type, the "message/external-body" media + type is intended to be transparent, that is, to convey the data type + in the external body rather than to convey a message with a body of + that type. Thus the headers on the outer and inner parts must be + merged using the same rules as for "message/partial". In particular, + this means that the Content-type and Subject fields are overridden, + but the From field is preserved. + + Note that since the external bodies are not transported along with + the external body reference, they need not conform to transport + limitations that apply to the reference itself. In particular, + Internet mail transports may impose 7bit and line length limits, but + these do not automatically apply to binary external body references. + Thus a Content-Transfer-Encoding is not generally necessary, though + it is permitted. + + Note that the body of a message of type "message/external-body" is + governed by the basic syntax for an RFC 822 message. In particular, + anything before the first consecutive pair of CRLFs is header + information, while anything after it is body information, which is + ignored for most access-types. + +5.2.4. Other Message Subtypes + + MIME implementations must in general treat unrecognized subtypes of + "message" as being equivalent to "application/octet-stream". + + Future subtypes of "message" intended for use with email should be + restricted to "7bit" encoding. A type other than "message" should be + used if restriction to "7bit" is not possible. + +6. Experimental Media Type Values + + A media type value beginning with the characters "X-" is a private + value, to be used by consenting systems by mutual agreement. Any + format without a rigorous and public definition must be named with an + "X-" prefix, and publicly specified values shall never begin with + "X-". 
(Older versions of the widely used Andrew system use the "X- + BE2" name, so new systems should probably choose a different name.) + + In general, the use of "X-" top-level types is strongly discouraged. + Implementors should invent subtypes of the existing types whenever + possible. In many cases, a subtype of "application" will be more + appropriate than a new top-level type. + + + + +Freed & Borenstein Standards Track [Page 40] + +RFC 2046 Media Types November 1996 + + +7. Summary + + The five discrete media types provide a standardized + mechanism for tagging entities as "audio", "image", or several other + kinds of data. The composite "multipart" and "message" media types + allow mixing and hierarchical structuring of entities of different + types in a single message. A distinguished parameter syntax allows + further specification of data format details, particularly the + specification of alternate character sets. Additional optional + header fields provide mechanisms for certain extensions deemed + desirable by many implementors. Finally, a number of useful media + types are defined for general use by consenting user agents, notably + "message/partial" and "message/external-body". + +8. Security Considerations + + Security issues are discussed in the context of the + "application/postscript" type, the "message/external-body" type, and + in RFC 2048. Implementors should pay special attention to the + security implications of any media types that can cause the remote + execution of any actions in the recipient's environment. In such + cases, the discussion of the "application/postscript" type may serve + as a model for considering other media types with remote execution + capabilities. + + + + + + + + + + + + + + + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 41] + +RFC 2046 Media Types November 1996 + + +9. 
Authors' Addresses + + For more information, the authors of this document are best contacted + via Internet mail: + + Ned Freed + Innosoft International, Inc. + 1050 East Garvey Avenue South + West Covina, CA 91790 + USA + + Phone: +1 818 919 3600 + Fax: +1 818 919 3614 + EMail: ned@innosoft.com + + + Nathaniel S. Borenstein + First Virtual Holdings + 25 Washington Avenue + Morristown, NJ 07960 + USA + + Phone: +1 201 540 8967 + Fax: +1 201 993 3032 + EMail: nsb@nsb.fv.com + + + MIME is a result of the work of the Internet Engineering Task Force + Working Group on RFC 822 Extensions. The chairman of that group, + Greg Vaudreuil, may be reached at: + + Gregory M. Vaudreuil + Octel Network Services + 17080 Dallas Parkway + Dallas, TX 75248-1905 + USA + + EMail: Greg.Vaudreuil@Octel.Com + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 42] + +RFC 2046 Media Types November 1996 + + +Appendix A -- Collected Grammar + + This appendix contains the complete BNF grammar for all the syntax + specified by this document. + + By itself, however, this grammar is incomplete. It refers by name to + several syntax rules that are defined by RFC 822. Rather than + reproduce those definitions here, and risk unintentional differences + between the two, this document simply refers the reader to RFC 822 + for the remaining definitions. Wherever a term is undefined, it + refers to the RFC 822 definition. + + boundary := 0*69<bchars> bcharsnospace + + bchars := bcharsnospace / " " + + bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / + "+" / "_" / "," / "-" / "." / + "/" / ":" / "=" / "?" + + body-part := <"message" as defined in RFC 822, with all + header fields optional, not starting with the + specified dash-boundary, and with the + delimiter not occurring anywhere in the + body part. 
Note that the semantics of a + part differ from the semantics of a message, + as described in the text.> + + close-delimiter := delimiter "--" + + dash-boundary := "--" boundary + ; boundary taken from the value of + ; boundary parameter of the + ; Content-Type field. + + delimiter := CRLF dash-boundary + + discard-text := *(*text CRLF) + ; May be ignored or discarded. + + encapsulation := delimiter transport-padding + CRLF body-part + + epilogue := discard-text + + multipart-body := [preamble CRLF] + dash-boundary transport-padding CRLF + body-part *encapsulation + + + +Freed & Borenstein Standards Track [Page 43] + +RFC 2046 Media Types November 1996 + + + close-delimiter transport-padding + [CRLF epilogue] + + preamble := discard-text + + transport-padding := *LWSP-char + ; Composers MUST NOT generate + ; non-zero length transport + ; padding, but receivers MUST + ; be able to handle padding + ; added by message transports. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 44] + diff --git a/rfc/rfc2047.txt b/rfc/rfc2047.txt @@ -0,0 +1,843 @@ + + + + + + +Network Working Group K. Moore +Request for Comments: 2047 University of Tennessee +Obsoletes: 1521, 1522, 1590 November 1996 +Category: Standards Track + + + MIME (Multipurpose Internet Mail Extensions) Part Three: + Message Header Extensions for Non-ASCII Text + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + STD 11, RFC 822, defines a message representation protocol specifying + considerable detail about US-ASCII message headers, and leaves the + message content, or message body, as flat US-ASCII text. 
This set of + documents, collectively called the Multipurpose Internet Mail + Extensions, or MIME, redefines the format of messages to allow for + + (1) textual message bodies in character sets other than US-ASCII, + + (2) an extensible set of different formats for non-textual message + bodies, + + (3) multi-part message bodies, and + + (4) textual header information in character sets other than US-ASCII. + + These documents are based on earlier work documented in RFC 934, STD + 11, and RFC 1049, but extends and revises them. Because RFC 822 said + so little about message bodies, these documents are largely + orthogonal to (rather than a revision of) RFC 822. + + This particular document is the third document in the series. It + describes extensions to RFC 822 to allow non-US-ASCII text data in + Internet mail header fields. + + + + + + + + + +Moore Standards Track [Page 1] + +RFC 2047 Message Header Extensions November 1996 + + + Other documents in this series include: + + + RFC 2045, which specifies the various headers used to describe + the structure of MIME messages. + + + RFC 2046, which defines the general structure of the MIME media + typing system and defines an initial set of media types, + + + RFC 2048, which specifies various IANA registration procedures + for MIME-related facilities, and + + + RFC 2049, which describes MIME conformance criteria and + provides some illustrative examples of MIME message formats, + acknowledgements, and the bibliography. + + These documents are revisions of RFCs 1521, 1522, and 1590, which + themselves were revisions of RFCs 1341 and 1342. An appendix in RFC + 2049 describes differences and changes from previous versions. + +1. Introduction + + RFC 2045 describes a mechanism for denoting textual body parts which + are coded in various character sets, as well as methods for encoding + such body parts as sequences of printable US-ASCII characters. 
This + memo describes similar techniques to allow the encoding of non-ASCII + text in various portions of a RFC 822 [2] message header, in a manner + which is unlikely to confuse existing message handling software. + + Like the encoding techniques described in RFC 2045, the techniques + outlined here were designed to allow the use of non-ASCII characters + in message headers in a way which is unlikely to be disturbed by the + quirks of existing Internet mail handling programs. In particular, + some mail relaying programs are known to (a) delete some message + header fields while retaining others, (b) rearrange the order of + addresses in To or Cc fields, (c) rearrange the (vertical) order of + header fields, and/or (d) "wrap" message headers at different places + than those in the original message. In addition, some mail reading + programs are known to have difficulty correctly parsing message + headers which, while legal according to RFC 822, make use of + backslash-quoting to "hide" special characters such as "<", ",", or + ":", or which exploit other infrequently-used features of that + specification. + + While it is unfortunate that these programs do not correctly + interpret RFC 822 headers, to "break" these programs would cause + severe operational problems for the Internet mail system. The + extensions described in this memo therefore do not rely on little- + used features of RFC 822. + + + +Moore Standards Track [Page 2] + +RFC 2047 Message Header Extensions November 1996 + + + Instead, certain sequences of "ordinary" printable ASCII characters + (known as "encoded-words") are reserved for use as encoded data. The + syntax of encoded-words is such that they are unlikely to + "accidentally" appear as normal text in message headers. + Furthermore, the characters used in encoded-words are restricted to + those which do not have special meanings in the context in which the + encoded-word appears. 
+ + Generally, an "encoded-word" is a sequence of printable ASCII + characters that begins with "=?", ends with "?=", and has two "?"s in + between. It specifies a character set and an encoding method, and + also includes the original text encoded as graphic ASCII characters, + according to the rules for that encoding method. + + A mail composer that implements this specification will provide a + means of inputting non-ASCII text in header fields, but will + translate these fields (or appropriate portions of these fields) into + encoded-words before inserting them into the message header. + + A mail reader that implements this specification will recognize + encoded-words when they appear in certain portions of the message + header. Instead of displaying the encoded-word "as is", it will + reverse the encoding and display the original text in the designated + character set. + +NOTES + + This memo relies heavily on notation and terms defined in RFC 822 and + RFC 2045. In particular, the syntax for the ABNF used in this memo + is defined in RFC 822, and many of the terminal or nonterminal + symbols from RFC 822 are used in the grammar for the header + extensions defined here. Among the symbols defined in RFC 822 and + referenced in this memo are: 'addr-spec', 'atom', 'CHAR', 'comment', + 'CTLs', 'ctext', 'linear-white-space', 'phrase', 'quoted-pair', + 'quoted-string', 'SPACE', and 'word'. Successful implementation of + this protocol extension requires careful attention to the RFC 822 + definitions of these terms. + + When the term "ASCII" appears in this memo, it refers to the "7-Bit + American Standard Code for Information Interchange", ANSI X3.4-1986. + The MIME charset name for this character set is "US-ASCII". When not + specifically referring to the MIME charset name, this document uses + the term "ASCII", both for brevity and for consistency with RFC 822. 
+ However, implementors are warned that the character set name must be + spelled "US-ASCII" in MIME message and body part headers. + + + + + + +Moore Standards Track [Page 3] + +RFC 2047 Message Header Extensions November 1996 + + + This memo specifies a protocol for the representation of non-ASCII + text in message headers. It specifically DOES NOT define any + translation between "8-bit headers" and pure ASCII headers, nor is + any such translation assumed to be possible. + +2. Syntax of encoded-words + + An 'encoded-word' is defined by the following ABNF grammar. The + notation of RFC 822 is used, with the exception that white space + characters MUST NOT appear between components of an 'encoded-word'. + + encoded-word = "=?" charset "?" encoding "?" encoded-text "?=" + + charset = token ; see section 3 + + encoding = token ; see section 4 + + token = 1*<Any CHAR except SPACE, CTLs, and especials> + + especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" / + <"> / "/" / "[" / "]" / "?" / "." / "=" + + encoded-text = 1*<Any printable ASCII character other than "?" + or SPACE> + ; (but see "Use of encoded-words in message + ; headers", section 5) + + Both 'encoding' and 'charset' names are case-independent. Thus the + charset name "ISO-8859-1" is equivalent to "iso-8859-1", and the + encoding named "Q" may be spelled either "Q" or "q". + + An 'encoded-word' may not be more than 75 characters long, including + 'charset', 'encoding', 'encoded-text', and delimiters. If it is + desirable to encode more text than will fit in an 'encoded-word' of + 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may + be used. + + While there is no limit to the length of a multiple-line header + field, each line of a header field that contains one or more + 'encoded-word's is limited to 76 characters. 
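The grammar and the 75-character limit above can be turned into a small recognizer. This is a sketch in Python under stated assumptions: the names TOKEN, ENCODED_TEXT, MAX_LENGTH and parse_encoded_word are hypothetical, backslash is treated as one of the 'especials', and the context-dependent restrictions of section 5 are not checked.

```python
import re

# token: printable ASCII except SPACE, CTLs and especials.
# Assumption: backslash counts as an especial, as it is an RFC 822 special.
TOKEN = r"[!#$%&'*+\-0-9A-Z^_`a-z{|}~]+"
# encoded-text: printable ASCII other than "?" (0x3F) or SPACE (0x20).
ENCODED_TEXT = r"[\x21-\x3e\x40-\x7e]+"

ENCODED_WORD = re.compile(
    r"=\?(" + TOKEN + r")\?(" + TOKEN + r")\?(" + ENCODED_TEXT + r")\?="
)

MAX_LENGTH = 75  # includes charset, encoding, encoded-text and delimiters


def parse_encoded_word(word):
    """Return (charset, encoding, encoded_text), or None if 'word' does not qualify."""
    if len(word) > MAX_LENGTH:
        return None
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        return None
    charset, encoding, encoded_text = match.groups()
    # Charset and encoding names are case-independent, so normalise them.
    return charset.lower(), encoding.upper(), encoded_text


print(parse_encoded_word("=?ISO-8859-1?q?this=20is=20some=20text?="))
print(parse_encoded_word("=?iso-8859-1?q?this is some text?="))
```

The first call yields the three components with the names normalised; the second returns None because an unencoded SPACE cannot be part of 'encoded-text'.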
+ + The length restrictions are included both to ease interoperability + through internetwork mail gateways, and to impose a limit on the + amount of lookahead a header parser must employ (while looking for a + final ?= delimiter) before it can decide whether a token is an + "encoded-word" or something else. + + + + + +Moore Standards Track [Page 4] + +RFC 2047 Message Header Extensions November 1996 + + + IMPORTANT: 'encoded-word's are designed to be recognized as 'atom's + by an RFC 822 parser. As a consequence, unencoded white space + characters (such as SPACE and HTAB) are FORBIDDEN within an + 'encoded-word'. For example, the character sequence + + =?iso-8859-1?q?this is some text?= + + would be parsed as four 'atom's, rather than as a single 'atom' (by + an RFC 822 parser) or 'encoded-word' (by a parser which understands + 'encoded-words'). The correct way to encode the string "this is some + text" is to encode the SPACE characters as well, e.g. + + =?iso-8859-1?q?this=20is=20some=20text?= + + The characters which may appear in 'encoded-text' are further + restricted by the rules in section 5. + +3. Character sets + + The 'charset' portion of an 'encoded-word' specifies the character + set associated with the unencoded text. A 'charset' can be any of + the character set names allowed in an MIME "charset" parameter of a + "text/plain" body part, or any character set name registered with + IANA for use with the MIME text/plain content-type. + + Some character sets use code-switching techniques to switch between + "ASCII mode" and other modes. If unencoded text in an 'encoded-word' + contains a sequence which causes the charset interpreter to switch + out of ASCII mode, it MUST contain additional control codes such that + ASCII mode is again selected at the end of the 'encoded-word'. (This + rule applies separately to each 'encoded-word', including adjacent + 'encoded-word's within a single header field.) 
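The code-switching rule can be illustrated with ISO-2022-JP, which uses escape sequences to leave and re-enter ASCII mode. The sketch below assumes Python's built-in iso2022_jp codec, which this memo does not mention, and the helper name returns_to_ascii is hypothetical. It checks that a piece of text, encoded on its own as it would be for a single 'encoded-word', ends in ASCII mode.

```python
ESC_TO_ASCII = b"\x1b(B"  # ISO-2022-JP escape sequence that selects ASCII


def returns_to_ascii(text, charset="iso2022_jp"):
    """Check that 'text', encoded by itself, leaves the charset in ASCII mode."""
    raw = text.encode(charset)
    last_escape = raw.rfind(b"\x1b")
    # No escape sequence at all means ASCII mode was never left.
    if last_escape == -1:
        return True
    return raw[last_escape:last_escape + 3] == ESC_TO_ASCII


# Text split across two encoded-words: each piece is encoded separately,
# so each piece must switch back to ASCII on its own.
pieces = ["\u65e5\u672c", "\u8a9e"]
print([returns_to_ascii(piece) for piece in pieces])
```

Encoding each piece separately, rather than encoding the whole string and cutting the bytes afterwards, is what keeps every 'encoded-word' self-contained.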
+ + When there is a possibility of using more than one character set to + represent the text in an 'encoded-word', and in the absence of + private agreements between sender and recipients of a message, it is + recommended that members of the ISO-8859-* series be used in + preference to other character sets. + +4. Encodings + + Initially, the legal values for "encoding" are "Q" and "B". These + encodings are described below. The "Q" encoding is recommended for + use when most of the characters to be encoded are in the ASCII + character set; otherwise, the "B" encoding should be used. + Nevertheless, a mail reader which claims to recognize 'encoded-word's + MUST be able to accept either encoding for any character set which it + supports. + + + +Moore Standards Track [Page 5] + +RFC 2047 Message Header Extensions November 1996 + + + Only a subset of the printable ASCII characters may be used in + 'encoded-text'. Space and tab characters are not allowed, so that + the beginning and end of an 'encoded-word' are obvious. The "?" + character is used within an 'encoded-word' to separate the various + portions of the 'encoded-word' from one another, and thus cannot + appear in the 'encoded-text' portion. Other characters are also + illegal in certain contexts. For example, an 'encoded-word' in a + 'phrase' preceding an address in a From header field may not contain + any of the "specials" defined in RFC 822. Finally, certain other + characters are disallowed in some contexts, to ensure reliability for + messages that pass through internetwork mail gateways. + + The "B" encoding automatically meets these requirements. The "Q" + encoding allows a wide range of printable characters to be used in + non-critical locations in the message header (e.g., Subject), with + fewer characters available for use in other locations. + +4.1. The "B" encoding + + The "B" encoding is identical to the "BASE64" encoding defined by RFC + 2045. + +4.2. 
The "Q" encoding + + The "Q" encoding is similar to the "Quoted-Printable" content- + transfer-encoding defined in RFC 2045. It is designed to allow text + containing mostly ASCII characters to be decipherable on an ASCII + terminal without decoding. + + (1) Any 8-bit value may be represented by a "=" followed by two + hexadecimal digits. For example, if the character set in use + were ISO-8859-1, the "=" character would thus be encoded as + "=3D", and a SPACE by "=20". (Upper case should be used for + hexadecimal digits "A" through "F".) + + (2) The 8-bit hexadecimal value 20 (e.g., ISO-8859-1 SPACE) may be + represented as "_" (underscore, ASCII 95.). (This character may + not pass through some internetwork mail gateways, but its use + will greatly enhance readability of "Q" encoded data with mail + readers that do not support this encoding.) Note that the "_" + always represents hexadecimal 20, even if the SPACE character + occupies a different code position in the character set in use. + + (3) 8-bit values which correspond to printable ASCII characters other + than "=", "?", and "_" (underscore), MAY be represented as those + characters. (But see section 5 for restrictions.) In + particular, SPACE and TAB MUST NOT be represented as themselves + within encoded words. + + + +Moore Standards Track [Page 6] + +RFC 2047 Message Header Extensions November 1996 + + +5. Use of encoded-words in message headers + + An 'encoded-word' may appear in a message header or body part header + according to the following rules: + +(1) An 'encoded-word' may replace a 'text' token (as defined by RFC 822) + in any Subject or Comments header field, any extension message + header field, or any MIME body part field for which the field body + is defined as '*text'. An 'encoded-word' may also appear in any + user-defined ("X-") message or body part header field. + + Ordinary ASCII text and 'encoded-word's may appear together in the + same header field. 
However, an 'encoded-word' that appears in a + header field defined as '*text' MUST be separated from any adjacent + 'encoded-word' or 'text' by 'linear-white-space'. + +(2) An 'encoded-word' may appear within a 'comment' delimited by "(" and + ")", i.e., wherever a 'ctext' is allowed. More precisely, the RFC + 822 ABNF definition for 'comment' is amended as follows: + + comment = "(" *(ctext / quoted-pair / comment / encoded-word) ")" + + A "Q"-encoded 'encoded-word' which appears in a 'comment' MUST NOT + contain the characters "(", ")" or "\". An + 'encoded-word' that appears in a 'comment' MUST be separated from + any adjacent 'encoded-word' or 'ctext' by 'linear-white-space'. + + It is important to note that 'comment's are only recognized inside + "structured" field bodies. In fields whose bodies are defined as + '*text', "(" and ")" are treated as ordinary characters rather than + comment delimiters, and rule (1) of this section applies. (See RFC + 822, sections 3.1.2 and 3.1.3) + +(3) As a replacement for a 'word' entity within a 'phrase', for example, + one that precedes an address in a From, To, or Cc header. The ABNF + definition for 'phrase' from RFC 822 thus becomes: + + phrase = 1*( encoded-word / word ) + + In this case the set of characters that may be used in a "Q"-encoded + 'encoded-word' is restricted to: <upper and lower case ASCII + letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_" + (underscore, ASCII 95.)>. An 'encoded-word' that appears within a + 'phrase' MUST be separated from any adjacent 'word', 'text' or + 'special' by 'linear-white-space'. + + + + + + +Moore Standards Track [Page 7] + +RFC 2047 Message Header Extensions November 1996 + + + These are the ONLY locations where an 'encoded-word' may appear. In + particular: + + + An 'encoded-word' MUST NOT appear in any portion of an 'addr-spec'. + + + An 'encoded-word' MUST NOT appear within a 'quoted-string'. + + + An 'encoded-word' MUST NOT be used in a Received header field. 
+ + + An 'encoded-word' MUST NOT be used in parameter of a MIME + Content-Type or Content-Disposition field, or in any structured + field body except within a 'comment' or 'phrase'. + + The 'encoded-text' in an 'encoded-word' must be self-contained; + 'encoded-text' MUST NOT be continued from one 'encoded-word' to + another. This implies that the 'encoded-text' portion of a "B" + 'encoded-word' will be a multiple of 4 characters long; for a "Q" + 'encoded-word', any "=" character that appears in the 'encoded-text' + portion will be followed by two hexadecimal characters. + + Each 'encoded-word' MUST encode an integral number of octets. The + 'encoded-text' in each 'encoded-word' must be well-formed according + to the encoding specified; the 'encoded-text' may not be continued in + the next 'encoded-word'. (For example, "=?charset?Q?=?= + =?charset?Q?AB?=" would be illegal, because the two hex digits "AB" + must follow the "=" in the same 'encoded-word'.) + + Each 'encoded-word' MUST represent an integral number of characters. + A multi-octet character may not be split across adjacent 'encoded- + word's. + + Only printable and white space character data should be encoded using + this scheme. However, since these encoding schemes allow the + encoding of arbitrary octet values, mail readers that implement this + decoding should also ensure that display of the decoded data on the + recipient's terminal will not cause unwanted side-effects. + + Use of these methods to encode non-textual data (e.g., pictures or + sounds) is not defined by this memo. Use of 'encoded-word's to + represent strings of purely ASCII characters is allowed, but + discouraged. In rare cases it may be necessary to encode ordinary + text that looks like an 'encoded-word'. + + + + + + + + + +Moore Standards Track [Page 8] + +RFC 2047 Message Header Extensions November 1996 + + +6. Support of 'encoded-word's by mail readers + +6.1. 
Recognition of 'encoded-word's in message headers + + A mail reader must parse the message and body part headers according + to the rules in RFC 822 to correctly recognize 'encoded-word's. + + 'encoded-word's are to be recognized as follows: + + (1) Any message or body part header field defined as '*text', or any + user-defined header field, should be parsed as follows: Beginning + at the start of the field-body and immediately following each + occurrence of 'linear-white-space', each sequence of up to 75 + printable characters (not containing any 'linear-white-space') + should be examined to see if it is an 'encoded-word' according to + the syntax rules in section 2. Any other sequence of printable + characters should be treated as ordinary ASCII text. + + (2) Any header field not defined as '*text' should be parsed + according to the syntax rules for that header field. However, + any 'word' that appears within a 'phrase' should be treated as an + 'encoded-word' if it meets the syntax rules in section 2. + Otherwise it should be treated as an ordinary 'word'. + + (3) Within a 'comment', any sequence of up to 75 printable characters + (not containing 'linear-white-space'), that meets the syntax + rules in section 2, should be treated as an 'encoded-word'. + Otherwise it should be treated as normal comment text. + + (4) A MIME-Version header field is NOT required to be present for + 'encoded-word's to be interpreted according to this + specification. One reason for this is that the mail reader is + not expected to parse the entire message header before displaying + lines that may contain 'encoded-word's. + +6.2. Display of 'encoded-word's + + Any 'encoded-word's so recognized are decoded, and if possible, the + resulting unencoded text is displayed in the original character set. + + NOTE: Decoding and display of encoded-words occurs *after* a + structured field body is parsed into tokens. 
It is therefore + possible to hide 'special' characters in encoded-words which, when + displayed, will be indistinguishable from 'special' characters in the + surrounding text. For this and other reasons, it is NOT generally + possible to translate a message header containing 'encoded-word's to + an unencoded form which can be parsed by an RFC 822 mail reader. + + + + +Moore Standards Track [Page 9] + +RFC 2047 Message Header Extensions November 1996 + + + When displaying a particular header field that contains multiple + 'encoded-word's, any 'linear-white-space' that separates a pair of + adjacent 'encoded-word's is ignored. (This is to allow the use of + multiple 'encoded-word's to represent long strings of unencoded text, + without having to separate 'encoded-word's where spaces occur in the + unencoded text.) + + In the event other encodings are defined in the future, and the mail + reader does not support the encoding used, it may either (a) display + the 'encoded-word' as ordinary text, or (b) substitute an appropriate + message indicating that the text could not be decoded. + + If the mail reader does not support the character set used, it may + (a) display the 'encoded-word' as ordinary text (i.e., as it appears + in the header), (b) make a "best effort" to display using such + characters as are available, or (c) substitute an appropriate message + indicating that the decoded text could not be displayed. + + If the character set being used employs code-switching techniques, + display of the encoded text implicitly begins in "ASCII mode". In + addition, the mail reader must ensure that the output device is once + again in "ASCII mode" after the 'encoded-word' is displayed. + +6.3. Mail reader handling of incorrectly formed 'encoded-word's + + It is possible that an 'encoded-word' that is legal according to the + syntax defined in section 2, is incorrectly formed according to the + rules for the encoding being used. 
For example: + + (1) An 'encoded-word' which contains characters which are not legal + for a particular encoding (for example, a "-" in the "B" + encoding, or a SPACE or HTAB in either the "B" or "Q" encoding), + is incorrectly formed. + + (2) Any 'encoded-word' which encodes a non-integral number of + characters or octets is incorrectly formed. + + A mail reader need not attempt to display the text associated with an + 'encoded-word' that is incorrectly formed. However, a mail reader + MUST NOT prevent the display or handling of a message because an + 'encoded-word' is incorrectly formed. + +7. Conformance + + A mail composing program claiming compliance with this specification + MUST ensure that any string of non-white-space printable ASCII + characters within a '*text' or '*ctext' that begins with "=?" and + ends with "?=" be a valid 'encoded-word'. ("begins" means: at the + + + +Moore Standards Track [Page 10] + +RFC 2047 Message Header Extensions November 1996 + + + start of the field-body, immediately following 'linear-white-space', + or immediately following a "(" for an 'encoded-word' within '*ctext'; + "ends" means: at the end of the field-body, immediately preceding + 'linear-white-space', or immediately preceding a ")" for an + 'encoded-word' within '*ctext'.) In addition, any 'word' within a + 'phrase' that begins with "=?" and ends with "?=" must be a valid + 'encoded-word'. + + A mail reading program claiming compliance with this specification + must be able to distinguish 'encoded-word's from 'text', 'ctext', or + 'word's, according to the rules in section 6, anytime they appear in + appropriate places in message headers. It must support both the "B" + and "Q" encodings for any character set which it supports. The + program must be able to display the unencoded text if the character + set is "US-ASCII". For the ISO-8859-* character sets, the mail + reading program must at least be able to display the characters which + are also in the ASCII set. 
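The recognition and display rules in sections 6 and 7 can be sketched in a few lines of Python. This is a minimal illustration for '*text' fields only, not a conforming implementation: the regular expression only approximates the section 2 syntax, and the names `ENCODED_WORD`, `is_encoded_word`, `decode_q`, `decode_word`, and `decode_text_field` are assumptions made for this example. Charset support is whatever the Python codec registry provides.

```python
import base64
import re

# Approximation of the section 2 syntax: =?charset?encoding?encoded-text?=
# The pattern and every function name below are illustrative assumptions.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")
HEX_DIGITS = set("0123456789ABCDEFabcdef")


def is_encoded_word(token: str) -> bool:
    """True if the token looks like an 'encoded-word' of at most 75 characters."""
    return len(token) <= 75 and ENCODED_WORD.fullmatch(token) is not None


def decode_q(text: str) -> bytes:
    """Decode "Q" encoded-text: "_" is octet 0x20, "=XX" is one hex octet."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "_":
            out.append(0x20)
            i += 1
        elif ch == "=":
            pair = text[i + 1:i + 3]
            if len(pair) != 2 or not set(pair) <= HEX_DIGITS:
                raise ValueError("'=' must be followed by two hex digits")
            out.append(int(pair, 16))
            i += 3
        else:
            out += ch.encode("ascii")  # raises ValueError on non-ASCII input
            i += 1
    return bytes(out)


def decode_word(word: str) -> str:
    """Decode one encoded-word; return it unchanged if it cannot be decoded."""
    if not is_encoded_word(word):
        return word
    charset, encoding, payload = ENCODED_WORD.fullmatch(word).groups()
    try:
        if encoding.upper() == "B":
            raw = base64.b64decode(payload, validate=True)
        else:
            raw = decode_q(payload)
        return raw.decode(charset)
    except (ValueError, LookupError):
        # Incorrectly formed text or unsupported charset: show as ordinary text.
        return word


def decode_text_field(value: str) -> str:
    """Decode a '*text' field body, dropping white space between encoded-words."""
    out = []
    pending_ws = ""
    prev_encoded = False
    for token in re.split(r"(\s+)", value):
        if token == "":
            continue
        if token.isspace():
            pending_ws = token
            continue
        encoded = is_encoded_word(token)
        if not (prev_encoded and encoded):
            out.append(pending_ws)
        out.append(decode_word(token))
        pending_ws = ""
        prev_encoded = encoded
    out.append(pending_ws)
    return "".join(out)
```

The sketch falls back to displaying the raw 'encoded-word' whenever decoding fails, which is one of the behaviours the text permits. It does not parse structured fields, so encoded-words inside a 'comment' or 'phrase' are out of scope. For real use, Python's standard library already provides `email.header.decode_header`.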
+ +8. Examples + + The following are examples of message headers containing 'encoded- + word's: + + From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu> + To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk> + CC: =?ISO-8859-1?Q?Andr=E9?= Pirard <PIRARD@vm1.ulg.ac.be> + Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?= + =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?= + + Note: In the first 'encoded-word' of the Subject field above, the + last "=" at the end of the 'encoded-text' is necessary because each + 'encoded-word' must be self-contained (the "=" character completes a + group of 4 base64 characters representing 2 octets). An additional + octet could have been encoded in the first 'encoded-word' (so that + the encoded-word would contain an exact multiple of 3 encoded + octets), except that the second 'encoded-word' uses a different + 'charset' than the first one. + + From: =?ISO-8859-1?Q?Olle_J=E4rnefors?= <ojarnef@admin.kth.se> + To: ietf-822@dimacs.rutgers.edu, ojarnef@admin.kth.se + Subject: Time for ISO 10646? + + To: Dave Crocker <dcrocker@mordor.stanford.edu> + Cc: ietf-822@dimacs.rutgers.edu, paf@comsol.se + From: =?ISO-8859-1?Q?Patrik_F=E4ltstr=F6m?= <paf@nada.kth.se> + Subject: Re: RFC-HDR care and feeding + + + + + +Moore Standards Track [Page 11] + +RFC 2047 Message Header Extensions November 1996 + + + From: Nathaniel Borenstein <nsb@thumper.bellcore.com> + (=?iso-8859-8?b?7eXs+SDv4SDp7Oj08A==?=) + To: Greg Vaudreuil <gvaudre@NRI.Reston.VA.US>, Ned Freed + <ned@innosoft.com>, Keith Moore <moore@cs.utk.edu> + Subject: Test of new header generator + MIME-Version: 1.0 + Content-type: text/plain; charset=ISO-8859-1 + + The following examples illustrate how text containing 'encoded-word's + which appear in a structured field body. The rules are slightly + different for fields defined as '*text' because "(" and ")" are not + recognized as 'comment' delimiters. [Section 5, paragraph (1)]. 
+ + In each of the following examples, if the same sequence were to occur + in a '*text' field, the "displayed as" form would NOT be treated as + encoded words, but be identical to the "encoded form". This is + because each of the encoded-words in the following examples is + adjacent to a "(" or ")" character. + + encoded form displayed as + --------------------------------------------------------------------- + (=?ISO-8859-1?Q?a?=) (a) + + (=?ISO-8859-1?Q?a?= b) (a b) + + Within a 'comment', white space MUST appear between an + 'encoded-word' and surrounding text. [Section 5, + paragraph (2)]. However, white space is not needed between + the initial "(" that begins the 'comment', and the + 'encoded-word'. + + + (=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=) (ab) + + White space between adjacent 'encoded-word's is not + displayed. + + (=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=) (ab) + + Even multiple SPACEs between 'encoded-word's are ignored + for the purpose of display. + + (=?ISO-8859-1?Q?a?= (ab) + =?ISO-8859-1?Q?b?=) + + Any amount of linear-white-space between 'encoded-word's, + even if it includes a CRLF followed by one or more SPACEs, + is ignored for the purposes of display. + + + +Moore Standards Track [Page 12] + +RFC 2047 Message Header Extensions November 1996 + + + (=?ISO-8859-1?Q?a_b?=) (a b) + + In order to cause a SPACE to be displayed within a portion + of encoded text, the SPACE MUST be encoded as part of the + 'encoded-word'. + + (=?ISO-8859-1?Q?a?= =?ISO-8859-2?Q?_b?=) (a b) + + In order to cause a SPACE to be displayed between two strings + of encoded text, the SPACE MAY be encoded as part of one of + the 'encoded-word's. + +9. References + + [RFC 822] Crocker, D., "Standard for the Format of ARPA Internet Text + Messages", STD 11, RFC 822, UDEL, August 1982. + + [RFC 2049] Borenstein, N., and N. Freed, "Multipurpose Internet Mail + Extensions (MIME) Part Five: Conformance Criteria and Examples", + RFC 2049, November 1996.
+ + [RFC 2045] Borenstein, N., and N. Freed, "Multipurpose Internet Mail + Extensions (MIME) Part One: Format of Internet Message Bodies", + RFC 2045, November 1996. + + [RFC 2046] Borenstein N., and N. Freed, "Multipurpose Internet Mail + Extensions (MIME) Part Two: Media Types", RFC 2046, + November 1996. + + [RFC 2048] Freed, N., Klensin, J., and J. Postel, "Multipurpose + Internet Mail Extensions (MIME) Part Four: Registration + Procedures", RFC 2048, November 1996. + + + + + + + + + + + + + + + + + + + +Moore Standards Track [Page 13] + +RFC 2047 Message Header Extensions November 1996 + + +10. Security Considerations + + Security issues are not discussed in this memo. + +11. Acknowledgements + + The author wishes to thank Nathaniel Borenstein, Issac Chan, Lutz + Donnerhacke, Paul Eggert, Ned Freed, Andreas M. Kirchwitz, Olle + Jarnefors, Mike Rosin, Yutaka Sato, Bart Schaefer, and Kazuhiko + Yamamoto, for their helpful advice, insightful comments, and + illuminating questions in response to earlier versions of this + specification. + +12. Author's Address + + Keith Moore + University of Tennessee + 107 Ayres Hall + Knoxville TN 37996-1301 + + EMail: moore@cs.utk.edu + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Moore Standards Track [Page 14] + +RFC 2047 Message Header Extensions November 1996 + + +Appendix - changes since RFC 1522 (in no particular order) + + + explicitly state that the MIME-Version is not required to use + 'encoded-word's. + + + add explicit note that SPACEs and TABs are not allowed within + 'encoded-word's, explaining that an 'encoded-word' must look like an + 'atom' to an RFC822 parser.values, to be precise). + + + add examples from Olle Jarnefors (thanks!) which illustrate how + encoded-words with adjacent linear-white-space are displayed.
+ + + explicitly list terms defined in RFC822 and referenced in this memo + + + fix transcription typos that caused one or two lines and a couple of + characters to disappear in the resulting text, due to nroff quirks. + + + clarify that encoded-words are allowed in '*text' fields in both + RFC822 headers and MIME body part headers, but NOT as parameter + values. + + + clarify the requirement to switch back to ASCII within the encoded + portion of an 'encoded-word', for any charset that uses code switching + sequences. + + + add a note about 'encoded-word's being delimited by "(" and ")" + within a comment, but not in a *text (how bizarre!). + + + fix the Andre Pirard example to get rid of the trailing "_" after + the =E9. (no longer needed post-1342). + + + clarification: an 'encoded-word' may appear immediately following + the initial "(" or immediately before the final ")" that delimits a + comment, not just adjacent to "(" and ")" *within* *ctext. + + + add a note to explain that a "B" 'encoded-word' will always have a + multiple of 4 characters in the 'encoded-text' portion. + + + add note about the "=" in the examples + + + note that processing of 'encoded-word's occurs *after* parsing, and + some of the implications thereof. + + + explicitly state that you can't expect to translate between + 1522 and either vanilla 822 or so-called "8-bit headers". + + + explicitly state that 'encoded-word's are not valid within a + 'quoted-string'. + + + +Moore Standards Track [Page 15] + diff --git a/rfc/rfc2048.txt b/rfc/rfc2048.txt @@ -0,0 +1,1180 @@ + + + + + + +Network Working Group N. Freed +Request for Comments: 2048 Innosoft +BCP: 13 J. Klensin +Obsoletes: 1521, 1522, 1590 MCI +Category: Best Current Practice J. 
Postel + ISI + November 1996 + + + Multipurpose Internet Mail Extensions + (MIME) Part Four: + Registration Procedures + +Status of this Memo + + This document specifies an Internet Best Current Practices for the + Internet Community, and requests discussion and suggestions for + improvements. Distribution of this memo is unlimited. + +Abstract + + STD 11, RFC 822, defines a message representation protocol specifying + considerable detail about US-ASCII message headers, and leaves the + message content, or message body, as flat US-ASCII text. This set of + documents, collectively called the Multipurpose Internet Mail + Extensions, or MIME, redefines the format of messages to allow for + + (1) textual message bodies in character sets other than + US-ASCII, + + (2) an extensible set of different formats for non-textual + message bodies, + + (3) multi-part message bodies, and + + (4) textual header information in character sets other than + US-ASCII. + + These documents are based on earlier work documented in RFC 934, STD + 11, and RFC 1049, but extends and revises them. Because RFC 822 said + so little about message bodies, these documents are largely + orthogonal to (rather than a revision of) RFC 822. + + + + + + + + + +Freed, et. al. Best Current Practice [Page 1] + +RFC 2048 MIME Registration Procedures November 1996 + + + This fourth document, RFC 2048, specifies various IANA registration + procedures for the following MIME facilities: + + (1) media types, + + (2) external body access types, + + (3) content-transfer-encodings. + + Registration of character sets for use in MIME is covered elsewhere + and is no longer addressed by this document. + + These documents are revisions of RFCs 1521 and 1522, which themselves + were revisions of RFCs 1341 and 1342. An appendix in RFC 2049 + describes differences and changes from previous versions. + +Table of Contents + + 1. Introduction ......................................... 3 + 2. 
Media Type Registration .............................. 4 + 2.1 Registration Trees and Subtype Names ................ 4 + 2.1.1 IETF Tree ......................................... 4 + 2.1.2 Vendor Tree ....................................... 4 + 2.1.3 Personal or Vanity Tree ........................... 5 + 2.1.4 Special `x.' Tree ................................. 5 + 2.1.5 Additional Registration Trees ..................... 6 + 2.2 Registration Requirements ........................... 6 + 2.2.1 Functionality Requirement ......................... 6 + 2.2.2 Naming Requirements ............................... 6 + 2.2.3 Parameter Requirements ............................ 7 + 2.2.4 Canonicalization and Format Requirements .......... 7 + 2.2.5 Interchange Recommendations ....................... 8 + 2.2.6 Security Requirements ............................. 8 + 2.2.7 Usage and Implementation Non-requirements ......... 9 + 2.2.8 Publication Requirements .......................... 10 + 2.2.9 Additional Information ............................ 10 + 2.3 Registration Procedure .............................. 11 + 2.3.1 Present the Media Type to the Community for Review 11 + 2.3.2 IESG Approval ..................................... 12 + 2.3.3 IANA Registration ................................. 12 + 2.4 Comments on Media Type Registrations ................ 12 + 2.5 Location of Registered Media Type List .............. 12 + 2.6 IANA Procedures for Registering Media Types ......... 12 + 2.7 Change Control ...................................... 13 + 2.8 Registration Template ............................... 14 + 3. External Body Access Types ........................... 14 + 3.1 Registration Requirements ........................... 15 + 3.1.1 Naming Requirements ............................... 15 + + + +Freed, et. al. Best Current Practice [Page 2] + +RFC 2048 MIME Registration Procedures November 1996 + + + 3.1.2 Mechanism Specification Requirements .............. 
15 + 3.1.3 Publication Requirements .......................... 15 + 3.1.4 Security Requirements ............................. 15 + 3.2 Registration Procedure .............................. 15 + 3.2.1 Present the Access Type to the Community .......... 16 + 3.2.2 Access Type Reviewer .............................. 16 + 3.2.3 IANA Registration ................................. 16 + 3.3 Location of Registered Access Type List ............. 16 + 3.4 IANA Procedures for Registering Access Types ........ 16 + 4. Transfer Encodings ................................... 17 + 4.1 Transfer Encoding Requirements ...................... 17 + 4.1.1 Naming Requirements ............................... 17 + 4.1.2 Algorithm Specification Requirements .............. 18 + 4.1.3 Input Domain Requirements ......................... 18 + 4.1.4 Output Range Requirements ......................... 18 + 4.1.5 Data Integrity and Generality Requirements ........ 18 + 4.1.6 New Functionality Requirements .................... 18 + 4.2 Transfer Encoding Definition Procedure .............. 19 + 4.3 IANA Procedures for Transfer Encoding Registration... 19 + 4.4 Location of Registered Transfer Encodings List ...... 19 + 5. Authors' Addresses ................................... 20 + A. Grandfathered Media Types ............................ 21 + +1. Introduction + + Recent Internet protocols have been carefully designed to be easily + extensible in certain areas. In particular, MIME [RFC 2045] is an + open-ended framework and can accommodate additional object types, + character sets, and access methods without any changes to the basic + protocol. A registration process is needed, however, to ensure that + the set of such values is developed in an orderly, well-specified, + and public manner. + + This document defines registration procedures which use the Internet + Assigned Numbers Authority (IANA) as a central registry for such + values. 
+ + Historical Note: The registration process for media types was + initially defined in the context of the asynchronous Internet mail + environment. In this mail environment there is a need to limit the + number of possible media types to increase the likelihood of + interoperability when the capabilities of the remote mail system are + not known. As media types are used in new environments, where the + proliferation of media types is not a hindrance to interoperability, + the original procedure was excessively restrictive and had to be + generalized. + + + + + +Freed, et. al. Best Current Practice [Page 3] + +RFC 2048 MIME Registration Procedures November 1996 + + +2. Media Type Registration + + Registration of a new media type or types starts with the + construction of a registration proposal. Registration may occur in + several different registration trees, which have different + requirements as discussed below. In general, the new registration + proposal is circulated and reviewed in a fashion appropriate to the + tree involved. The media type is then registered if the proposal is + acceptable. The following sections describe the requirements and + procedures used for each of the different registration trees. + +2.1. Registration Trees and Subtype Names + + In order to increase the efficiency and flexibility of the + registration process, different structures of subtype names may be + registered to accommodate the different natural requirements for, + e.g., a subtype that will be recommended for wide support and + implementation by the Internet Community or a subtype that is used to + move files associated with proprietary software. The following + subsections define registration "trees", distinguished by the use of + faceted names (e.g., names of the form "tree.subtree...type"). Note + that some media types defined prior to this document do not conform + to the naming conventions described below. See Appendix A for a + discussion of them. + +2.1.1.
IETF Tree + + The IETF tree is intended for types of general interest to the + Internet Community. Registration in the IETF tree requires approval + by the IESG and publication of the media type registration as some + form of RFC. + + Media types in the IETF tree are normally denoted by names that are + not explicitly faceted, i.e., do not contain period (".", full stop) + characters. + + The "owner" of a media type registration in the IETF tree is assumed + to be the IETF itself. Modification or alteration of the + specification requires the same level of processing (e.g. standards + track) required for the initial registration. + +2.1.2. Vendor Tree + + The vendor tree is used for media types associated with commercially + available products. "Vendor" or "producer" are construed as + equivalent and very broadly in this context. + + + + + +Freed, et. al. Best Current Practice [Page 4] + +RFC 2048 MIME Registration Procedures November 1996 + + + A registration may be placed in the vendor tree by anyone who has + need to interchange files associated with the particular product. + However, the registration formally belongs to the vendor or + organization producing the software or file format. Changes to the + specification will be made at their request, as discussed in + subsequent sections. + + Registrations in the vendor tree will be distinguished by the leading + facet "vnd.". That may be followed, at the discretion of the + registration, by either a media type name from a well-known producer + (e.g., "vnd.mudpie") or by an IANA-approved designation of the + producer's name which is then followed by a media type or product + designation (e.g., vnd.bigcompany.funnypictures). + + While public exposure and review of media types to be registered in + the vendor tree is not required, using the ietf-types list for review + is strongly encouraged to improve the quality of those + specifications. Registrations in the vendor tree may be submitted + directly to the IANA. 
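The naming conventions for the IETF and vendor trees can be illustrated with a small classifier. This is only a sketch under stated assumptions: the function name `registration_tree` and its return labels are hypothetical, it covers just the two conventions described in sections 2.1.1 and 2.1.2, and it cannot tell whether an unfaceted name was actually registered (media types that predate these conventions may not follow them). Lower-casing reflects that MIME type names are matched case-insensitively.

```python
def registration_tree(subtype: str) -> str:
    """Guess the registration tree from the shape of a subtype name.

    Hypothetical helper for illustration; the labels are not defined by the RFC.
    """
    name = subtype.lower()
    if "." not in name:
        # Not explicitly faceted: the form normally used in the IETF tree.
        return "unfaceted"
    if name.startswith("vnd."):
        # Leading facet "vnd." marks the vendor tree.
        return "vendor"
    # Some other leading facet; not classified by this sketch.
    return "other-faceted"
```

For the two vendor examples given above, `registration_tree("vnd.mudpie")` and `registration_tree("vnd.bigcompany.funnypictures")` both return `"vendor"`.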
+ +2.1.3. Personal or Vanity Tree + + Registrations for media types created experimentally or as part of + products that are not distributed commercially may be registered in + the personal or vanity tree. The registrations are distinguished by + the leading facet "prs.". + + The owner of "personal" registrations and associated specifications + is the person or entity making the registration, or one to whom + responsibility has been transferred as described below. + + While public exposure and review of media types to be registered in + the personal tree is not required, using the ietf-types list for + review is strongly encouraged to improve the quality of those + specifications. Registrations in the personal tree may be submitted + directly to the IANA. + +2.1.4. Special `x.' Tree + + For convenience and symmetry with this registration scheme, media + type names with "x." as the first facet may be used for the same + purposes for which names starting in "x-" are normally used. These + types are unregistered, experimental, and should be used only with + the active agreement of the parties exchanging them. + + + + + + + +Freed, et. al. Best Current Practice [Page 5] + +RFC 2048 MIME Registration Procedures November 1996 + + + However, with the simplified registration procedures described above + for vendor and personal trees, it should rarely, if ever, be + necessary to use unregistered experimental types, and as such use of + both "x-" and "x." forms is discouraged. + +2.1.5. Additional Registration Trees + + From time to time and as required by the community, the IANA may, + with the advice and consent of the IESG, create new top-level + registration trees. It is explicitly assumed that these trees may be + created for external registration and management by well-known + permanent bodies, such as scientific societies for media types + specific to the sciences they cover.
In general, the quality of + review of specifications for one of these additional registration + trees is expected to be equivalent to that which IETF would give to + registrations in its own tree. Establishment of these new trees will + be announced through RFC publication approved by the IESG. + +2.2. Registration Requirements + + Media type registration proposals are all expected to conform to + various requirements laid out in the following sections. Note that + requirement specifics sometimes vary depending on the registration + tree, again as detailed in the following sections. + +2.2.1. Functionality Requirement + + Media types must function as an actual media format: Registration of + things that are better thought of as a transfer encoding, as a + character set, or as a collection of separate entities of another + type, is not allowed. For example, although applications exist to + decode the base64 transfer encoding [RFC 2045], base64 cannot be + registered as a media type. + + This requirement applies regardless of the registration tree + involved. + +2.2.2. Naming Requirements + + All registered media types must be assigned MIME type and subtype + names. The combination of these names then serves to uniquely + identify the media type and the format of the subtype name identifies + the registration tree. + + The choice of top-level type name must take the nature of media type + involved into account. For example, media normally used for + representing still images should be a subtype of the image content + type, whereas media capable of representing audio information belongs + + + +Freed, et. al. Best Current Practice [Page 6] + +RFC 2048 MIME Registration Procedures November 1996 + + + under the audio content type. See RFC 2046 for additional information + on the basic set of top-level types and their characteristics. + + New subtypes of top-level types must conform to the restrictions of + the top-level type, if any. 
For example, all subtypes of the + multipart content type must use the same encapsulation syntax. + + In some cases a new media type may not "fit" under any currently + defined top-level content type. Such cases are expected to be quite + rare. However, if such a case arises a new top-level type can be + defined to accommodate it. Such a definition must be done via + standards-track RFC; no other mechanism can be used to define + additional top-level content types. + + These requirements apply regardless of the registration tree + involved. + +2.2.3. Parameter Requirements + + Media types may elect to use one or more MIME content type + parameters, or some parameters may be automatically made available to + the media type by virtue of being a subtype of a content type that + defines a set of parameters applicable to any of its subtypes. In + either case, the names, values, and meanings of any parameters must + be fully specified when a media type is registered in the IETF tree, + and should be specified as completely as possible when media types + are registered in the vendor or personal trees. + + New parameters must not be defined as a way to introduce new + functionality in types registered in the IETF tree, although new + parameters may be added to convey additional information that does + not otherwise change existing functionality. An example of this + would be a "revision" parameter to indicate a revision level of an + external specification such as JPEG. Similar behavior is encouraged + for media types registered in the vendor or personal trees but is not + required. + +2.2.4. Canonicalization and Format Requirements + + All registered media types must employ a single, canonical data + format, regardless of registration tree. 
+ + A precise and openly available specification of the format of each + media type is required for all types registered in the IETF tree and + must at a minimum be referenced by, if it isn't actually included in, + the media type registration proposal itself. + + + + + +Freed, et. al. Best Current Practice [Page 7] + +RFC 2048 MIME Registration Procedures November 1996 + + + The specifications of format and processing particulars may or may + not be publicly available for media types registered in the vendor + tree, and such registration proposals are explicitly permitted to + include only a specification of which software and version produce or + process such media types. References to or inclusion of format + specifications in registration proposals is encouraged but not + required. + + Format specifications are still required for registration in the + personal tree, but may be either published as RFCs or otherwise + deposited with IANA. The deposited specifications will meet the same + criteria as those required to register a well-known TCP port and, in + particular, need not be made public. + + Some media types involve the use of patented technology. The + registration of media types involving patented technology is + specifically permitted. However, the restrictions set forth in RFC + 1602 on the use of patented technology in standards-track protocols + must be respected when the specification of a media type is part of a + standards-track protocol. + +2.2.5. Interchange Recommendations + + Media types should, whenever possible, interoperate across as many + systems and applications as possible. However, some media types will + inevitably have problems interoperating across different platforms. + Problems with different versions, byte ordering, and specifics of + gateway handling can and will arise. + + Universal interoperability of media types is not required, but known + interoperability issues should be identified whenever possible.
+ Publication of a media type does not require an exhaustive review of + interoperability, and the interoperability considerations section is + subject to continuing evaluation. + + These recommendations apply regardless of the registration tree + involved. + +2.2.6. Security Requirements + + An analysis of security issues is required for all types + registered in the IETF tree. (This is in accordance with the basic + requirements for all IETF protocols.) A similar analysis for media + types registered in the vendor or personal trees is encouraged but + not required. However, regardless of what security analysis has or + has not been done, all descriptions of security issues must be as + accurate as possible regardless of registration tree. In particular, + a statement that there are "no security issues associated with this + + + +Freed, et. al. Best Current Practice [Page 8] + +RFC 2048 MIME Registration Procedures November 1996 + + + type" must not be confused with "the security issues associated with + this type have not been assessed". + + There is absolutely no requirement that media types registered in any + tree be secure or completely free from risks. Nevertheless, all + known security risks must be identified in the registration of a + media type, again regardless of registration tree. + + The security considerations section of all registrations is subject + to continuing evaluation and modification, and in particular may be + extended by use of the "comments on media types" mechanism described + in subsequent sections. + + Some of the issues that should be looked at in a security analysis of + a media type are: + + (1) Complex media types may include provisions for + directives that institute actions on a recipient's + files or other resources. In many cases provision is + made for originators to specify arbitrary actions in an + unrestricted fashion which may then have devastating + effects. 
See the registration of the + application/postscript media type in RFC 2046 for + an example of such directives and how to handle them. + + (2) Complex media types may include provisions for + directives that institute actions which, while not + directly harmful to the recipient, may result in + disclosure of information that either facilitates a + subsequent attack or else violates a recipient's + privacy in some way. Again, the registration of the + application/postscript media type illustrates how such + directives can be handled. + + (3) A media type might be targeted for applications that + require some sort of security assurance but not provide + the necessary security mechanisms themselves. For + example, a media type could be defined for storage of + confidential medical information which in turn requires + an external confidentiality service. + +2.2.7. Usage and Implementation Non-requirements + + In the asynchronous mail environment, where information on the + capabilities of the remote mail agent is frequently not available to + the sender, maximum interoperability is attained by restricting the + number of media types used to those "common" formats expected to be + widely implemented. This was asserted in the past as a reason to + + + +Freed, et. al. Best Current Practice [Page 9] + +RFC 2048 MIME Registration Procedures November 1996 + + + limit the number of possible media types and resulted in a + registration process with a significant hurdle and delay for those + registering media types. + + However, the need for "common" media types does not require limiting + the registration of new media types. If a limited set of media types + is recommended for a particular application, that should be asserted + by a separate applicability statement specific for the application + and/or environment. + + As such, universal support and implementation of a media type is NOT + a requirement for registration. 
If, however, a media type is + explicitly intended for limited use, this should be noted in its + registration. + +2.2.8. Publication Requirements + + Proposals for media types registered in the IETF tree must be + published as RFCs. RFC publication of vendor and personal media type + proposals is encouraged but not required. In all cases IANA will + retain copies of all media type proposals and "publish" them as part + of the media types registration tree itself. + + Other than in the IETF tree, the registration of a data type does not + imply endorsement, approval, or recommendation by IANA or IETF or + even certification that the specification is adequate. To become + Internet Standards, protocols, data objects, or whatever must go + through the IETF standards process. This is too difficult and too + lengthy a process for the convenient registration of media types. + + The IETF tree exists for media types that do require a + substantive review and approval process, while the vendor and personal + trees exist for those that do not. It is expected that applicability + statements for particular applications will be published from time to + time that recommend implementation of, and support for, media types + that have proven particularly useful in those contexts. + + As discussed above, registration of a top-level type requires + standards-track processing and, hence, RFC publication. + +2.2.9. Additional Information + + Various sorts of optional information may be included in the + specification of a media type if it is available: + + (1) Magic number(s) (length, octet values). Magic numbers + are byte sequences that are always present and thus can + be used to identify entities as being of a given media + + + +Freed, et. al. Best Current Practice [Page 10] + +RFC 2048 MIME Registration Procedures November 1996 + + + type. + + (2) File extension(s) commonly used on one or more + platforms to indicate that some file contains a given + type of media. 
+ + (3) Macintosh File Type code(s) (4 octets) used to label + files containing a given type of media. + + Such information is often quite useful to implementors and if + available should be provided. + +2.3. Registration Procedure + + The following procedure has been implemented by the IANA for review + and approval of new media types. This is not a formal standards + process, but rather an administrative procedure intended to allow + community comment and sanity checking without excessive time delay. + For registration in the IETF tree, the normal IETF processes should + be followed, treating posting of an internet-draft and announcement + on the ietf-types list (as described in the next subsection) as a + first step. For registrations in the vendor or personal tree, the + initial review step described below may be omitted and the type + registered directly by submitting the template and an explanation + directly to IANA (at iana@iana.org). However, authors of vendor or + personal media type specifications are encouraged to seek community + review and comment whenever that is feasible. + +2.3.1. Present the Media Type to the Community for Review + + Send a proposed media type registration to the "ietf-types@iana.org" + mailing list for a two week review period. This mailing list has + been established for the purpose of reviewing proposed media and + access types. Proposed media types are not formally registered and + must not be used; the "x-" prefix specified in RFC 2045 can be used + until registration is complete. + + The intent of the public posting is to solicit comments and feedback + on the choice of type/subtype name, the unambiguity of the references + with respect to versions and external profiling information, and a + review of any interoperability or security considerations. The + submitter may submit a revised registration, or withdraw the + registration completely, at any time. + + + + + + + + +Freed, et. al. 
Best Current Practice [Page 11] + +RFC 2048 MIME Registration Procedures November 1996 + + +2.3.2. IESG Approval + + Media types registered in the IETF tree must be submitted to the IESG + for approval. + +2.3.3. IANA Registration + + Provided that the media type meets the requirements for media types + and has obtained approval that is necessary, the author may submit + the registration request to the IANA, which will register the media + type and make the media type registration available to the community. + +2.4. Comments on Media Type Registrations + + Comments on registered media types may be submitted by members of the + community to IANA. These comments will be passed on to the "owner" + of the media type if possible. Submitters of comments may request + that their comment be attached to the media type registration itself, + and if IANA approves of this the comment will be made accessible in + conjunction with the type registration itself. + +2.5. Location of Registered Media Type List + + Media type registrations will be posted in the anonymous FTP + directory "ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/" + and all registered media types will be listed in the periodically + issued "Assigned Numbers" RFC [currently STD 2, RFC 1700]. The media + type description and other supporting material may also be published + as an Informational RFC by sending it to "rfc-editor@isi.edu" (please + follow the instructions to RFC authors [RFC-1543]). + +2.6. IANA Procedures for Registering Media Types + + The IANA will only register media types in the IETF tree in response + to a communication from the IESG stating that a given registration + has been approved. Vendor and personal types will be registered by + the IANA automatically and without any formal review as long as the + following minimal conditions are met: + + (1) Media types must function as an actual media format. 
+ In particular, character sets and transfer encodings + may not be registered as media types. + + (2) All media types must have properly formed type and + subtype names. All type names must be defined by a + standards-track RFC. All subtype names must be unique, + must conform to the MIME grammar for such names, and + must contain the proper tree prefix. + + + +Freed, et. al. Best Current Practice [Page 12] + +RFC 2048 MIME Registration Procedures November 1996 + + + (3) Types registered in the personal tree must either + provide a format specification or a pointer to one. + + (4) Any security considerations given must not be obviously + bogus. (It is neither possible nor necessary for the + IANA to conduct a comprehensive security review of + media type registrations. Nevertheless, IANA has the + authority to identify obviously incompetent material + and exclude it.) + +2.7. Change Control + + Once a media type has been published by IANA, the author may request + a change to its definition. The descriptions of the different + registration trees above designate the "owners" of each type of + registration. The change request follows the same procedure as the + registration request: + + (1) Publish the revised template on the ietf-types list. + + (2) Leave at least two weeks for comments. + + (3) Publish using IANA after formal review if required. + + Changes should be requested only when there are serious omission or + errors in the published specification. When review is required, a + change request may be denied if it renders entities that were valid + under the previous definition invalid under the new definition. + + The owner of a content type may pass responsibility for the content + type to another person or agency by informing IANA and the ietf-types + list; this can be done without discussion or review. + + The IESG may reassign responsibility for a media type. 
The most + common case of this will be to enable changes to be made to types + where the author of the registration has died, moved out of contact + or is otherwise unable to make changes that are important to the + community. + + Media type registrations may not be deleted; media types which are no + longer believed appropriate for use can be declared OBSOLETE by a + change to their "intended use" field; such media types will be + clearly marked in the lists published by IANA. + + + + + + + + +Freed, et. al. Best Current Practice [Page 13] + +RFC 2048 MIME Registration Procedures November 1996 + + +2.8. Registration Template + + To: ietf-types@iana.org + Subject: Registration of MIME media type XXX/YYY + + MIME media type name: + + MIME subtype name: + + Required parameters: + + Optional parameters: + + Encoding considerations: + + Security considerations: + + Interoperability considerations: + + Published specification: + + Applications which use this media type: + + Additional information: + + Magic number(s): + File extension(s): + Macintosh File Type Code(s): + + Person & email address to contact for further information: + + Intended usage: + + (One of COMMON, LIMITED USE or OBSOLETE) + + Author/Change controller: + + (Any other information that the author deems interesting may be + added below this line.) + +3. External Body Access Types + + RFC 2046 defines the message/external-body media type, whereby a MIME + entity can act as pointer to the actual body data in lieu of + including the data directly in the entity body. Each + message/external-body reference specifies an access type, which + determines the mechanism used to retrieve the actual body data. RFC + 2046 defines an initial set of access types, but allows for the + + + +Freed, et. al. Best Current Practice [Page 14] + +RFC 2048 MIME Registration Procedures November 1996 + + + registration of additional access types to accommodate new retrieval + mechanisms. + +3.1. 
Registration Requirements + + New access type specifications must conform to a number of + requirements as described below. + +3.1.1. Naming Requirements + + Each access type must have a unique name. This name appears in the + access-type parameter in the message/external-body content-type + header field, and must conform to MIME content type parameter syntax. + +3.1.2. Mechanism Specification Requirements + + All of the protocols, transports, and procedures used by a given + access type must be described, either in the specification of the + access type itself or in some other publicly available specification, + in sufficient detail for the access type to be implemented by any + competent implementor. Use of secret and/or proprietary methods in + access types is expressly prohibited. The restrictions imposed by + RFC 1602 on the standardization of patented algorithms must be + respected as well. + +3.1.3. Publication Requirements + + All access types must be described by an RFC. The RFC may be + informational rather than standards-track, although standards-track + review and approval are encouraged for all access types. + +3.1.4. Security Requirements + + Any known security issues that arise from the use of the access type + must be completely and fully described. It is not required that the + access type be secure or that it be free from risks, but that the + known risks be identified. Publication of a new access type does not + require an exhaustive security review, and the security + considerations section is subject to continuing evaluation. + Additional security considerations should be addressed by publishing + revised versions of the access type specification. + +3.2. Registration Procedure + + Registration of a new access type starts with the construction of a + draft of an RFC. + + + + + +Freed, et. al. Best Current Practice [Page 15] + +RFC 2048 MIME Registration Procedures November 1996 + + +3.2.1. 
Present the Access Type to the Community + + Send a proposed access type specification to the "ietf- + types@iana.org" mailing list for a two week review period. This + mailing list has been established for the purpose of reviewing + proposed access and media types. Proposed access types are not + formally registered and must not be used. + + The intent of the public posting is to solicit comments and feedback + on the access type specification and a review of any security + considerations. + +3.2.2. Access Type Reviewer + + When the two week period has passed, the access type reviewer, who is + appointed by the IETF Applications Area Director, either forwards the + request to iana@isi.edu, or rejects it because of significant + objections raised on the list. + + Decisions made by the reviewer must be posted to the ietf-types + mailing list within 14 days. Decisions made by the reviewer may be + appealed to the IESG. + +3.2.3. IANA Registration + + Provided that the access type has either passed review or has been + successfully appealed to the IESG, the IANA will register the access + type and make the registration available to the community. The + specification of the access type must also be published as an RFC. + Informational RFCs are published by sending them to "rfc- + editor@isi.edu" (please follow the instructions to RFC authors [RFC- + 1543]). + +3.3. Location of Registered Access Type List + + Access type registrations will be posted in the anonymous FTP + directory "ftp://ftp.isi.edu/in-notes/iana/assignments/access-types/" + and all registered access types will be listed in the periodically + issued "Assigned Numbers" RFC [currently RFC-1700]. + +3.4. IANA Procedures for Registering Access Types + + The identity of the access type reviewer is communicated to the IANA + by the IESG. 
The IANA then only acts in response to access type + definitions that either are approved by the access type reviewer and + forwarded by the reviewer to the IANA for registration, or in + response to a communication from the IESG that an access type + definition appeal has overturned the access type reviewer's ruling. + + + +Freed, et. al. Best Current Practice [Page 16] + +RFC 2048 MIME Registration Procedures November 1996 + + +4. Transfer Encodings + + Transfer encodings are transformations applied to MIME media types + after conversion to the media type's canonical form. Transfer + encodings are used for several purposes: + + (1) Many transports, especially message transports, can + only handle data consisting of relatively short lines + of text. There can also be severe restrictions on what + characters can be used in these lines of text -- some + transports are restricted to a small subset of US-ASCII + and others cannot handle certain character sequences. + Transfer encodings are used to transform binary data + into textual form that can survive such transports. + Examples of this sort of transfer encoding include the + base64 and quoted-printable transfer encodings defined + in RFC 2045. + + (2) Image, audio, video, and even application entities are + sometimes quite large. Compression algorithms are often + quite effective in reducing the size of large entities. + Transfer encodings can be used to apply general-purpose + non-lossy compression algorithms to MIME entities. + + (3) Transfer encodings can be defined as a means of + representing existing encoding formats in a MIME + context. + + IMPORTANT: The standardization of a large number of different + transfer encodings is seen as a significant barrier to widespread + interoperability and is expressly discouraged. Nevertheless, the + following procedure has been defined to provide a means of defining + additional transfer encodings, should standardization actually be + justified. + +4.1. 
Transfer Encoding Requirements + + Transfer encoding specifications must conform to a number of + requirements as described below. + +4.1.1. Naming Requirements + + Each transfer encoding must have a unique name. This name appears in + the Content-Transfer-Encoding header field and must conform to the + syntax of that field. + + + + + + +Freed, et. al. Best Current Practice [Page 17] + +RFC 2048 MIME Registration Procedures November 1996 + + +4.1.2. Algorithm Specification Requirements + + All of the algorithms used in a transfer encoding (e.g. conversion + to printable form, compression) must be described in their entirety + in the transfer encoding specification. Use of secret and/or + proprietary algorithms in standardized transfer encodings is + expressly prohibited. The restrictions imposed by RFC 1602 on the + standardization of patented algorithms must be respected as well. + +4.1.3. Input Domain Requirements + + All transfer encodings must be applicable to an arbitrary sequence of + octets of any length. Dependence on particular input forms is not + allowed. + + It should be noted that the 7bit and 8bit encodings do not conform to + this requirement. Aside from the undesirability of having + specialized encodings, the intent here is to forbid the addition of + further encodings along the lines of 7bit and 8bit. + +4.1.4. Output Range Requirements + + There is no requirement that a particular transfer encoding produce a + particular form of encoded output. However, the output format for + each transfer encoding must be fully and completely documented. In + particular, each specification must clearly state whether the output + format always lies within the confines of 7bit data, 8bit data, or is + simply pure binary data. + +4.1.5. Data Integrity and Generality Requirements + + All transfer encodings must be fully invertible on any platform; it + must be possible for anyone to recover the original data by + performing the corresponding decoding operation. 
Note that this + requirement effectively excludes all forms of lossy compression as + well as all forms of encryption from use as a transfer encoding. + +4.1.6. New Functionality Requirements + + All transfer encodings must provide some sort of new functionality. + Some degree of functionality overlap with previously defined transfer + encodings is acceptable, but any new transfer encoding must also + offer something no other transfer encoding provides. + + + + + + + + +Freed, et. al. Best Current Practice [Page 18] + +RFC 2048 MIME Registration Procedures November 1996 + + +4.2. Transfer Encoding Definition Procedure + + Definition of a new transfer encoding starts with the construction of + a draft of a standards-track RFC. The RFC must define the transfer + encoding precisely and completely, and must also provide substantial + justification for defining and standardizing a new transfer encoding. + This specification must then be presented to the IESG for + consideration. The IESG can + + (1) reject the specification outright as being + inappropriate for standardization, + + (2) approve the formation of an IETF working group to work + on the specification in accordance with IETF + procedures, or, + + (3) accept the specification as-is and put it directly on + the standards track. + + Transfer encoding specifications on the standards track follow normal + IETF rules for standards track documents. A transfer encoding is + considered to be defined and available for use once it is on the + standards track. + +4.3. IANA Procedures for Transfer Encoding Registration + + There is no need for a special procedure for registering Transfer + Encodings with the IANA. All legitimate transfer encoding + registrations must appear as a standards-track RFC, so it is the + IESG's responsibility to notify the IANA when a new transfer encoding + has been approved. + +4.4. 
Location of Registered Transfer Encodings List + + Transfer encoding registrations will be posted in the anonymous FTP + directory "ftp://ftp.isi.edu/in-notes/iana/assignments/transfer- + encodings/" and all registered transfer encodings will be listed in + the periodically issued "Assigned Numbers" RFC [currently RFC-1700]. + + + + + + + + + + + + + +Freed, et. al. Best Current Practice [Page 19] + +RFC 2048 MIME Registration Procedures November 1996 + + +5. Authors' Addresses + + For more information, the authors of this document are best + contacted via Internet mail: + + Ned Freed + Innosoft International, Inc. + 1050 East Garvey Avenue South + West Covina, CA 91790 + USA + + Phone: +1 818 919 3600 + Fax: +1 818 919 3614 + EMail: ned@innosoft.com + + + John Klensin + MCI + 2100 Reston Parkway + Reston, VA 22091 + + Phone: +1 703 715-7361 + Fax: +1 703 715-7436 + EMail: klensin@mci.net + + + Jon Postel + USC/Information Sciences Institute + 4676 Admiralty Way + Marina del Rey, CA 90292 + USA + + + Phone: +1 310 822 1511 + Fax: +1 310 823 6714 + EMail: Postel@ISI.EDU + + + + + + + + + + + + + + + +Freed, et. al. Best Current Practice [Page 20] + +RFC 2048 MIME Registration Procedures November 1996 + + +Appendix A -- Grandfathered Media Types + + A number of media types, registered prior to 1996, would, if + registered under the guidelines in this document, be placed into + either the vendor or personal trees. Reregistration of those types + to reflect the appropriate trees is encouraged, but not required. + Ownership and change control principles outlined in this document + apply to those types as if they had been registered in the trees + described above. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Freed, et. al. Best Current Practice [Page 21] + diff --git a/rfc/rfc2049.txt b/rfc/rfc2049.txt @@ -0,0 +1,1347 @@ + + + + + + +Network Working Group N. 
Freed +Request for Comments: 2049 Innosoft +Obsoletes: 1521, 1522, 1590 N. Borenstein +Category: Standards Track First Virtual + November 1996 + + + Multipurpose Internet Mail Extensions + (MIME) Part Five: + Conformance Criteria and Examples + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + STD 11, RFC 822, defines a message representation protocol specifying + considerable detail about US-ASCII message headers, and leaves the + message content, or message body, as flat US-ASCII text. This set of + documents, collectively called the Multipurpose Internet Mail + Extensions, or MIME, redefines the format of messages to allow for + + (1) textual message bodies in character sets other than + US-ASCII, + + (2) an extensible set of different formats for non-textual + message bodies, + + (3) multi-part message bodies, and + + (4) textual header information in character sets other than + US-ASCII. + + These documents are based on earlier work documented in RFC 934, STD + 11, and RFC 1049, but extends and revises them. Because RFC 822 said + so little about message bodies, these documents are largely + orthogonal to (rather than a revision of) RFC 822. + + The initial document in this set, RFC 2045, specifies the various + headers used to describe the structure of MIME messages. The second + document defines the general structure of the MIME media typing + system and defines an initial set of media types. The third + document, RFC 2047, describes extensions to RFC 822 to allow non-US- + + + +Freed & Borenstein Standards Track [Page 1] + +RFC 2049 MIME Conformance November 1996 + + + ASCII text data in Internet mail header fields. 
The fourth document, + RFC 2048, specifies various IANA registration procedures for MIME- + related facilities. This fifth and final document describes MIME + conformance criteria as well as providing some illustrative examples + of MIME message formats, acknowledgements, and the bibliography. + + These documents are revisions of RFCs 1521, 1522, and 1590, which + themselves were revisions of RFCs 1341 and 1342. Appendix B of this + document describes differences and changes from previous versions. + +Table of Contents + + 1. Introduction .......................................... 2 + 2. MIME Conformance ...................................... 2 + 3. Guidelines for Sending Email Data ..................... 6 + 4. Canonical Encoding Model .............................. 9 + 5. Summary ............................................... 12 + 6. Security Considerations ............................... 12 + 7. Authors' Addresses .................................... 12 + 8. Acknowledgements ...................................... 13 + A. A Complex Multipart Example ........................... 15 + B. Changes from RFC 1521, 1522, and 1590 ................. 16 + C. References ............................................ 20 + +1. Introduction + + The first and second documents in this set define MIME header fields + and the initial set of MIME media types. The third document + describes extensions to RFC822 formats to allow for character sets + other than US-ASCII. This document describes what portions of MIME + must be supported by a conformant MIME implementation. It also + describes various pitfalls of contemporary messaging systems as well + as the canonical encoding model MIME is based on. + +2. MIME Conformance + + The mechanisms described in these documents are open-ended. It is + definitely not expected that all implementations will support all + available media types, nor that they will all share the same + extensions. 
In order to promote interoperability, however, it is + useful to define the concept of "MIME-conformance" to define a + certain level of implementation that allows the useful interworking + of messages with content that differs from US-ASCII text. In this + section, we specify the requirements for such conformance. + + + + + + + +Freed & Borenstein Standards Track [Page 2] + +RFC 2049 MIME Conformance November 1996 + + + A mail user agent that is MIME-conformant MUST: + + (1) Always generate a "MIME-Version: 1.0" header field in + any message it creates. + + (2) Recognize the Content-Transfer-Encoding header field + and decode all received data encoded by either quoted- + printable or base64 implementations. The identity + transformations 7bit, 8bit, and binary must also be + recognized. + + Any non-7bit data that is sent without encoding must be + properly labelled with a content-transfer-encoding of + 8bit or binary, as appropriate. If the underlying + transport does not support 8bit or binary (as SMTP + [RFC-821] does not), the sender is required to both + encode and label data using an appropriate Content- + Transfer-Encoding such as quoted-printable or base64. + + (3) Must treat any unrecognized Content-Transfer-Encoding + as if it had a Content-Type of "application/octet- + stream", regardless of whether or not the actual + Content-Type is recognized. + + (4) Recognize and interpret the Content-Type header field, + and avoid showing users raw data with a Content-Type + field other than text. Implementations must be able + to send at least text/plain messages, with the + character set specified with the charset parameter if + it is not US-ASCII. + + (5) Ignore any content type parameters whose names they do + not recognize. + + (6) Explicitly handle the following media type values, to + at least the following extents: + + Text: + + -- Recognize and display "text" mail with the + character set "US-ASCII." 
+ + -- Recognize other character sets at least to the + extent of being able to inform the user about what + character set the message uses. + + + + + + +Freed & Borenstein Standards Track [Page 3] + +RFC 2049 MIME Conformance November 1996 + + + -- Recognize the "ISO-8859-*" character sets to the + extent of being able to display those characters that + are common to ISO-8859-* and US-ASCII, namely all + characters represented by octet values 1-127. + + -- For unrecognized subtypes in a known character + set, show or offer to show the user the "raw" version + of the data after conversion of the content from + canonical form to local form. + + -- Treat material in an unknown character set as if + it were "application/octet-stream". + + Image, audio, and video: + + -- At a minimum, provide facilities to treat any + unrecognized subtypes as if they were + "application/octet-stream". + + Application: + + -- Offer the ability to remove either of the quoted- + printable or base64 encodings defined in this + document if they were used and put the resulting + information in a user file. + + Multipart: + + -- Recognize the "mixed" subtype. Display all relevant + information on the message level and the body part + header level and then display or offer to display + each of the body parts individually. + + -- Recognize the "alternative" subtype, and avoid + showing the user redundant parts of + multipart/alternative mail. + + -- Recognize the "multipart/digest" subtype, + specifically using "message/rfc822" rather than + "text/plain" as the default media type for body parts + inside "multipart/digest" entities. + + -- Treat any unrecognized subtypes as if they were + "mixed". 
+ + + + + + + +Freed & Borenstein Standards Track [Page 4] + +RFC 2049 MIME Conformance November 1996 + + + Message: + + -- Recognize and display at least the RFC822 message + encapsulation (message/rfc822) in such a way as to + preserve any recursive structure, that is, displaying + or offering to display the encapsulated data in + accordance with its media type. + + -- Treat any unrecognized subtypes as if they were + "application/octet-stream". + + (7) Upon encountering any unrecognized Content-Type field, + an implementation must treat it as if it had a media + type of "application/octet-stream" with no parameter + sub-arguments. How such data are handled is up to an + implementation, but likely options for handling such + unrecognized data include offering the user to write it + into a file (decoded from its mail transport format) or + offering the user to name a program to which the + decoded data should be passed as input. + + (8) Conformant user agents are required, if they provide + non-standard support for non-MIME messages employing + character sets other than US-ASCII, to do so on + received messages only. Conforming user agents must not + send non-MIME messages containing anything other than + US-ASCII text. + + In particular, the use of non-US-ASCII text in mail + messages without a MIME-Version field is strongly + discouraged as it impedes interoperability when sending + messages between regions with different localization + conventions. Conforming user agents MUST include proper + MIME labelling when sending anything other than plain + text in the US-ASCII character set. + + In addition, non-MIME user agents should be upgraded if + at all possible to include appropriate MIME header + information in the messages they send even if nothing + else in MIME is supported. This upgrade will have + little, if any, effect on non-MIME recipients and will + aid MIME in correctly displaying such messages. 
It + also provides a smooth transition path to eventual + adoption of other MIME capabilities. + + (9) Conforming user agents must ensure that any string of + non-white-space printable US-ASCII characters within a + "*text" or "*ctext" that begins with "=?" and ends with + + + +Freed & Borenstein Standards Track [Page 5] + +RFC 2049 MIME Conformance November 1996 + + + "?=" be a valid encoded-word. ("begins" means: At the + start of the field-body or immediately following + linear-white-space; "ends" means: At the end of the + field-body or immediately preceding linear-white- + space.) In addition, any "word" within a "phrase" that + begins with "=?" and ends with "?=" must be a valid + encoded-word. + + (10) Conforming user agents must be able to distinguish + encoded-words from "text", "ctext", or "word"s, + according to the rules in section 4, anytime they + appear in appropriate places in message headers. It + must support both the "B" and "Q" encodings for any + character set which it supports. The program must be + able to display the unencoded text if the character set + is "US-ASCII". For the ISO-8859-* character sets, the + mail reading program must at least be able to display + the characters which are also in the US-ASCII set. + + A user agent that meets the above conditions is said to be MIME- + conformant. The meaning of this phrase is that it is assumed to be + "safe" to send virtually any kind of properly-marked data to users of + such mail systems, because such systems will at least be able to + treat the data as undifferentiated binary, and will not simply splash + it onto the screen of unsuspecting users. + + There is another sense in which it is always "safe" to send data in a + format that is MIME-conformant, which is that such data will not + break or be broken by any known systems that are conformant with RFC + 821 and RFC 822. 
User agents that are MIME-conformant have the + additional guarantee that the user will not be shown data that were + never intended to be viewed as text. + +3. Guidelines for Sending Email Data + + Internet email is not a perfect, homogeneous system. Mail may become + corrupted at several stages in its travel to a final destination. + Specifically, email sent throughout the Internet may travel across + many networking technologies. Many networking and mail technologies + do not support the full functionality possible in the SMTP transport + environment. Mail traversing these systems is likely to be modified + in order that it can be transported. + + There exist many widely-deployed non-conformant MTAs in the Internet. + These MTAs, speaking the SMTP protocol, alter messages on the fly to + take advantage of the internal data structure of the hosts they are + implemented on, or are just plain broken. + + + + +Freed & Borenstein Standards Track [Page 6] + +RFC 2049 MIME Conformance November 1996 + + + The following guidelines may be useful to anyone devising a data + format (media type) that is supposed to survive the widest range of + networking technologies and known broken MTAs unscathed. Note that + anything encoded in the base64 encoding will satisfy these rules, but + that some well-known mechanisms, notably the UNIX uuencode facility, + will not. Note also that anything encoded in the Quoted-Printable + encoding will survive most gateways intact, but possibly not some + gateways to systems that use the EBCDIC character set. + + (1) Under some circumstances the encoding used for data may + change as part of normal gateway or user agent + operation. In particular, conversion from base64 to + quoted-printable and vice versa may be necessary. This + may result in the confusion of CRLF sequences with line + breaks in text bodies. As such, the persistence of + CRLF as something other than a line break must not be + relied on. 
+ + (2) Many systems may elect to represent and store text data + using local newline conventions. Local newline + conventions may not match the RFC822 CRLF convention -- + systems are known that use plain CR, plain LF, CRLF, or + counted records. The result is that isolated CR and LF + characters are not well tolerated in general; they may + be lost or converted to delimiters on some systems, and + hence must not be relied on. + + (3) The transmission of NULs (US-ASCII value 0) is + problematic in Internet mail. (This is largely the + result of NULs being used as a termination character by + many of the standard runtime library routines in the C + programming language.) The practice of using NULs as + termination characters is so entrenched now that + messages should not rely on them being preserved. + + (4) TAB (HT) characters may be misinterpreted or may be + automatically converted to variable numbers of spaces. + This is unavoidable in some environments, notably those + not based on the US-ASCII character set. Such + conversion is STRONGLY DISCOURAGED, but it may occur, + and mail formats must not rely on the persistence of + TAB (HT) characters. + + (5) Lines longer than 76 characters may be wrapped or + truncated in some environments. Line wrapping or line + truncation imposed by mail transports is STRONGLY + DISCOURAGED, but unavoidable in some cases. + Applications which require long lines must somehow + + + +Freed & Borenstein Standards Track [Page 7] + +RFC 2049 MIME Conformance November 1996 + + + differentiate between soft and hard line breaks. (A + simple way to do this is to use the quoted-printable + encoding.) + + (6) Trailing "white space" characters (SPACE, TAB (HT)) on + a line may be discarded by some transport agents, while + other transport agents may pad lines with these + characters so that all lines in a mail file are of + equal length. The persistence of trailing white space, + therefore, must not be relied on. 
+ + (7) Many mail domains use variations on the US-ASCII + character set, or use character sets such as EBCDIC + which contain most but not all of the US-ASCII + characters. The correct translation of characters not + in the "invariant" set cannot be depended on across + character converting gateways. For example, this + situation is a problem when sending uuencoded + information across BITNET, an EBCDIC system. Similar + problems can occur without crossing a gateway, since + many Internet hosts use character sets other than US- + ASCII internally. The definition of Printable Strings + in X.400 adds further restrictions in certain special + cases. In particular, the only characters that are + known to be consistent across all gateways are the 73 + characters that correspond to the upper and lower case + letters A-Z and a-z, the 10 digits 0-9, and the + following eleven special characters: + + "'" (US-ASCII decimal value 39) + "(" (US-ASCII decimal value 40) + ")" (US-ASCII decimal value 41) + "+" (US-ASCII decimal value 43) + "," (US-ASCII decimal value 44) + "-" (US-ASCII decimal value 45) + "." (US-ASCII decimal value 46) + "/" (US-ASCII decimal value 47) + ":" (US-ASCII decimal value 58) + "=" (US-ASCII decimal value 61) + "?" (US-ASCII decimal value 63) + + A maximally portable mail representation will confine + itself to relatively short lines of text in which the + only meaningful characters are taken from this set of + 73 characters. The base64 encoding follows this rule. + + (8) Some mail transport agents will corrupt data that + includes certain literal strings. In particular, a + period (".") alone on a line is known to be corrupted + by some (incorrect) SMTP implementations, and a line + that starts with the five characters "From " (the fifth + character is a SPACE) is commonly corrupted as well. 
+ A careful composition agent can prevent these + corruptions by encoding the data (e.g., in the quoted- + printable encoding using "=46rom " in place of "From " + at the start of a line, and "=2E" in place of "." alone + on a line). + + Please note that the above list is NOT a list of recommended + practices for MTAs. RFC 821 MTAs are prohibited from altering the + character of white space or wrapping long lines. These BAD and + invalid practices are known to occur on established networks, and + implementations should be robust in dealing with the bad effects they + can cause. + +4. Canonical Encoding Model + + There was some confusion, in earlier versions of these documents, + regarding the model for when email data was to be converted to + canonical form and encoded, and in particular how this process would + affect the treatment of CRLFs, given that the representation of + newlines varies greatly from system to system. For this reason, a + canonical model for encoding is presented below. + + The process of composing a MIME entity can be modeled as being done + in a number of steps. Note that these steps are roughly similar to + those steps used in PEM [RFC-1421] and are performed for each + "innermost level" body: + + (1) Creation of local form. + + The body to be transmitted is created in the system's + native format. The native character set is used and, + where appropriate, local end of line conventions are + used as well. The body may be a UNIX-style text file, + or a Sun raster image, or a VMS indexed file, or audio + data in a system-dependent format stored only in + memory, or anything else that corresponds to the local + model for the representation of some form of + information. Fundamentally, the data is created in the + "native" form that corresponds to the type specified by + the media type. + + + + + + + +Freed & Borenstein Standards Track [Page 9] + +RFC 2049 MIME Conformance November 1996 + + + (2) Conversion to canonical form. 
+ + The entire body, including "out-of-band" information + such as record lengths and possibly file attribute + information, is converted to a universal canonical + form. The specific media type of the body as well as + its associated attributes dictate the nature of the + canonical form that is used. Conversion to the proper + canonical form may involve character set conversion, + transformation of audio data, compression, or various + other operations specific to the various media types. + If character set conversion is involved, however, care + must be taken to understand the semantics of the media + type, which may have strong implications for any + character set conversion, e.g. with regard to + syntactically meaningful characters in a text subtype + other than "plain". + + For example, in the case of text/plain data, the text + must be converted to a supported character set and + lines must be delimited with CRLF delimiters in + accordance with RFC 822. Note that the restriction on + line lengths implied by RFC 822 is eliminated if the + next step employs either quoted-printable or base64 + encoding. + + (3) Apply transfer encoding. + + A Content-Transfer-Encoding appropriate for this body + is applied. Note that there is no fixed relationship + between the media type and the transfer encoding. In + particular, it may be appropriate to base the choice of + base64 or quoted-printable on character frequency + counts which are specific to a given instance of a + body. + + (4) Insertion into entity. + + The encoded body is inserted into a MIME entity with + appropriate headers. The entity is then inserted into + the body of a higher-level entity (message or + multipart) as needed. + + Conversion from entity form to local form is accomplished by + reversing these steps. Note that reversal of these steps may produce + differing results since there is no guarantee that the original and + final local forms are the same. 
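The four steps above can be sketched for the simplest case, a text/plain body sent with the base64 encoding. This is only an illustration of the model, not part of RFC 2049: the function names (to_canonical, apply_base64, compose_entity, to_local) and the default charset are assumptions chosen for the example, and a real implementation would also deal with header parsing, other encodings, and multipart nesting.

```python
import base64


def to_canonical(local_text, charset):
    """Step 2: use CRLF line delimiters, then convert to the labelled charset."""
    # Accept CRLF, bare CR, or bare LF as the local newline convention.
    unified = local_text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n").encode(charset)


def apply_base64(canonical):
    """Step 3: base64 transfer encoding, wrapped at 76 characters per line."""
    encoded = base64.b64encode(canonical).decode("ascii")
    return "\r\n".join(encoded[i:i + 76] for i in range(0, len(encoded), 76))


def compose_entity(local_text, charset="iso-8859-1"):
    """Steps 2-4 for a text/plain body that already exists in local form."""
    body = apply_base64(to_canonical(local_text, charset))
    return (
        f"Content-Type: text/plain; charset={charset}\r\n"
        "Content-Transfer-Encoding: base64\r\n"
        "\r\n"
        f"{body}\r\n"
    )


def to_local(entity, charset="iso-8859-1", newline="\n"):
    """Reverse the steps: drop the headers, decode, restore local newlines."""
    _headers, body = entity.split("\r\n\r\n", 1)
    canonical = base64.b64decode(body)  # CR and LF in the body are ignored
    return canonical.decode(charset).replace("\r\n", newline)


if __name__ == "__main__":
    local = "caf\u00e9\nline two\n"  # local form with LF newlines
    entity = compose_entity(local)
    print(entity)
    print(to_local(entity) == local)
```

In this sketch the round trip returns the original text only because the receiving side happens to use the same newline convention as the sender. Passing a different newline argument to to_local shows why the original and final local forms are not guaranteed to match.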
+ + + + +Freed & Borenstein Standards Track [Page 10] + +RFC 2049 MIME Conformance November 1996 + + + It is vital to note that these steps are only a model; they are + specifically NOT a blueprint for how an actual system would be built. + In particular, the model fails to account for two common designs: + + (1) In many cases the conversion to a canonical form prior + to encoding will be subsumed into the encoder itself, + which understands local formats directly. For example, + the local newline convention for text bodies might be + carried through to the encoder itself along with + knowledge of what that format is. + + (2) The output of the encoders may have to pass through one + or more additional steps prior to being transmitted as + a message. As such, the output of the encoder may not + be conformant with the formats specified by RFC 822. + In particular, once again it may be appropriate for the + converter's output to be expressed using local newline + conventions rather than using the standard RFC 822 CRLF + delimiters. + + Other implementation variations are conceivable as well. The vital + aspect of this discussion is that, in spite of any optimizations, + collapsings of required steps, or insertion of additional processing, + the resulting messages must be consistent with those produced by the + model described here. For example, a message with the following + header fields: + + Content-type: text/foo; charset=bar + Content-Transfer-Encoding: base64 + + must be first represented in the text/foo form, then (if necessary) + represented in the "bar" character set, and finally transformed via + the base64 algorithm into a mail-safe form. + + NOTE: Some confusion has been caused by systems that represent + messages in a format which uses local newline conventions which + differ from the RFC822 CRLF convention. It is important to note that + these formats are not canonical RFC822/MIME. 
These formats are + instead *encodings* of RFC822, where CRLF sequences in the canonical + representation of the message are encoded as the local newline + convention. Note that formats which encode CRLF sequences as, for + example, LF are not capable of representing MIME messages containing + binary data which contains LF octets not part of CRLF line separation + sequences. + + + + + + + +Freed & Borenstein Standards Track [Page 11] + +RFC 2049 MIME Conformance November 1996 + + +5. Summary + + This document defines what is meant by MIME Conformance. It also + details various problems known to exist in the Internet email system + and how to use MIME to overcome them. Finally, it describes MIME's + canonical encoding model. + +6. Security Considerations + + Security issues are discussed in the second document in this set, RFC + 2046. + +7. Authors' Addresses + + For more information, the authors of this document are best contacted + via Internet mail: + + Ned Freed + Innosoft International, Inc. + 1050 East Garvey Avenue South + West Covina, CA 91790 + USA + + Phone: +1 818 919 3600 + Fax: +1 818 919 3614 + EMail: ned@innosoft.com + + Nathaniel S. Borenstein + First Virtual Holdings + 25 Washington Avenue + Morristown, NJ 07960 + USA + + Phone: +1 201 540 8967 + Fax: +1 201 993 3032 + EMail: nsb@nsb.fv.com + + MIME is a result of the work of the Internet Engineering Task Force + Working Group on RFC 822 Extensions. The chairman of that group, + Greg Vaudreuil, may be reached at: + + Gregory M. Vaudreuil + Octel Network Services + 17080 Dallas Parkway + Dallas, TX 75248-1905 + USA + + EMail: Greg.Vaudreuil@Octel.Com + + + +Freed & Borenstein Standards Track [Page 12] + +RFC 2049 MIME Conformance November 1996 + + +8. Acknowledgements + + This document is the result of the collective effort of a large + number of people, at several IETF meetings, on the IETF-SMTP and + IETF-822 mailing lists, and elsewhere. 
Although any enumeration + seems doomed to suffer from egregious omissions, the following are + among the many contributors to this effort: + + Harald Tveit Alvestrand Marc Andreessen + Randall Atkinson Bob Braden + Philippe Brandon Brian Capouch + Kevin Carosso Uhhyung Choi + Peter Clitherow Dave Collier-Brown + Cristian Constantinof John Coonrod + Mark Crispin Dave Crocker + Stephen Crocker Terry Crowley + Walt Daniels Jim Davis + Frank Dawson Axel Deininger + Hitoshi Doi Kevin Donnelly + Steve Dorner Keith Edwards + Chris Eich Dana S. Emery + Johnny Eriksson Craig Everhart + Patrik Faltstrom Erik E. Fair + Roger Fajman Alain Fontaine + Martin Forssen James M. Galvin + Stephen Gildea Philip Gladstone + Thomas Gordon Keld Simonsen + Terry Gray Phill Gross + James Hamilton David Herron + Mark Horton Bruce Howard + Bill Janssen Olle Jarnefors + Risto Kankkunen Phil Karn + Alan Katz Tim Kehres + Neil Katin Steve Kille + Kyuho Kim Anders Klemets + John Klensin Valdis Kletniek + Jim Knowles Stev Knowles + Bob Kummerfeld Pekka Kytolaakso + Stellan Lagerstrom Vincent Lau + Timo Lehtinen Donald Lindsay + Warner Losh Carlyn Lowery + Laurence Lundblade Charles Lynn + John R. MacMillan Larry Masinter + Rick McGowan Michael J. McInerny + Leo Mclaughlin Goli Montaser-Kohsari + Tom Moore John Gardiner Myers + Erik Naggum Mark Needleman + Chris Newman John Noerenberg + + + +Freed & Borenstein Standards Track [Page 13] + +RFC 2049 MIME Conformance November 1996 + + + Mats Ohrman Julian Onions + Michael Patton David J. Pepper + Erik van der Poel Blake C. Ramsdell + Christer Romson Luc Rooijakkers + Marshall T. 
Rose Jonathan Rosenberg + Guido van Rossum Jan Rynning + Harri Salminen Michael Sanderson + Yutaka Sato Markku Savela + Richard Alan Schafer Masahiro Sekiguchi + Mark Sherman Bob Smart + Peter Speck Henry Spencer + Einar Stefferud Michael Stein + Klaus Steinberger Peter Svanberg + James Thompson Steve Uhler + Stuart Vance Peter Vanderbilt + Greg Vaudreuil Ed Vielmetti + Larry W. Virden Ryan Waldron + Rhys Weatherly Jay Weber + Dave Wecker Wally Wedel + Sven-Ove Westberg Brian Wideen + John Wobus Glenn Wright + Rayan Zachariassen David Zimmerman + + The authors apologize for any omissions from this list, which are + certainly unintentional. + + + + + + + + + + + + + + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 14] + +RFC 2049 MIME Conformance November 1996 + + +Appendix A -- A Complex Multipart Example + + What follows is the outline of a complex multipart message. This + message contains five parts that are to be displayed serially: two + introductory plain text objects, an embedded multipart message, a + text/enriched object, and a closing encapsulated text message in a + non-ASCII character set. The embedded multipart message itself + contains two objects to be displayed in parallel, a picture and an + audio fragment. + + MIME-Version: 1.0 + From: Nathaniel Borenstein <nsb@nsb.fv.com> + To: Ned Freed <ned@innosoft.com> + Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT) + Subject: A multipart example + Content-Type: multipart/mixed; + boundary=unique-boundary-1 + + This is the preamble area of a multipart message. + Mail readers that understand multipart format + should ignore this preamble. + + If you are reading this text, you might want to + consider changing to a mail reader that understands + how to properly display multipart messages. + + --unique-boundary-1 + + ... Some text appears here ... 
+ + [Note that the blank between the boundary and the start + of the text in this part means no header fields were + given and this is text in the US-ASCII character set. + It could have been done with explicit typing as in the + next part.] + + --unique-boundary-1 + Content-type: text/plain; charset=US-ASCII + + This could have been part of the previous part, but + illustrates explicit versus implicit typing of body + parts. + + --unique-boundary-1 + Content-Type: multipart/parallel; boundary=unique-boundary-2 + + --unique-boundary-2 + Content-Type: audio/basic + + + +Freed & Borenstein Standards Track [Page 15] + +RFC 2049 MIME Conformance November 1996 + + + Content-Transfer-Encoding: base64 + + ... base64-encoded 8000 Hz single-channel + mu-law-format audio data goes here ... + + --unique-boundary-2 + Content-Type: image/jpeg + Content-Transfer-Encoding: base64 + + ... base64-encoded image data goes here ... + + --unique-boundary-2-- + + --unique-boundary-1 + Content-type: text/enriched + + This is <bold><italic>enriched.</italic></bold> + <smaller>as defined in RFC 1896</smaller> + + Isn't it + <bigger><bigger>cool?</bigger></bigger> + + --unique-boundary-1 + Content-Type: message/rfc822 + + From: (mailbox in US-ASCII) + To: (address in US-ASCII) + Subject: (subject in US-ASCII) + Content-Type: Text/plain; charset=ISO-8859-1 + Content-Transfer-Encoding: Quoted-printable + + ... Additional text in ISO-8859-1 goes here ... + + --unique-boundary-1-- + +Appendix B -- Changes from RFC 1521, 1522, and 1590 + + These documents are a revision of RFC 1521, 1522, and 1590. For the + convenience of those familiar with the earlier documents, the changes + from those documents are summarized in this appendix. For further + history, note that Appendix H in RFC 1521 specified how that document + differed from its predecessor, RFC 1341. + + (1) This document has been completely reformatted and split + into multiple documents. 
This was done to improve the + quality of the plain text version of this document, + which is required to be the reference copy. + + (2) BNF describing the overall structure of MIME object + headers has been added. This is a documentation change + only -- the underlying syntax has not changed in any + way. + + (3) The specific BNF for the seven media types in MIME has + been removed. This BNF was incorrect, incomplete, and + inconsistent with the type-independent BNF. And + since the type-independent BNF already fully specifies + the syntax of the various MIME headers, the type- + specific BNF was, in the final analysis, completely + unnecessary and caused more problems than it solved. + + (4) The more specific "US-ASCII" character set name has + replaced the use of the informal term ASCII in many + parts of these documents. + + (5) The informal concept of a primary subtype has been + removed. + + (6) The term "object" was being used inconsistently. The + definition of this term has been clarified, along with + the related terms "body", "body part", and "entity", + and usage has been corrected where appropriate. + + (7) The BNF for the multipart media type has been + rearranged to make it clear that the CRLF preceding + the boundary marker is actually part of the marker + itself rather than the preceding body part. + + (8) The prose and BNF describing the multipart media type + have been changed to make it clear that the body parts + within a multipart object MUST NOT contain any lines + beginning with the boundary parameter string. + + (9) In the rules on reassembling "message/partial" MIME + entities, "Subject" is added to the list of headers to + take from the inner message, and the example is + modified to clarify this point. + + (10) "Message/partial" fragmenters are restricted to + splitting MIME objects only at line boundaries. 
+ + (11) In the discussion of the application/postscript type, + an additional paragraph has been added warning about + possible interoperability problems caused by embedding + of binary data inside a PostScript MIME entity. + + + + +Freed & Borenstein Standards Track [Page 17] + +RFC 2049 MIME Conformance November 1996 + + + (12) Added a clarifying note to the basic syntax rules for + the Content-Type header field to make it clear that the + following two forms: + + Content-type: text/plain; charset=us-ascii (comment) + + Content-type: text/plain; charset="us-ascii" + + are completely equivalent. + + (13) The following sentence has been removed from the + discussion of the MIME-Version header: "However, + conformant software is encouraged to check the version + number and at least warn the user if an unrecognized + MIME-version is encountered." + + (14) A typo was fixed that said "application/external-body" + instead of "message/external-body". + + (15) The definition of a character set has been reorganized + to make the requirements clearer. + + (16) The definition of the "image/gif" media type has been + moved to a separate document. This change was made + because of potential conflicts with IETF rules + governing the standardization of patented technology. + + (17) The definitions of "7bit" and "8bit" have been + tightened so that use of bare CR, LF can only be used + as end-of-line sequences. The document also no longer + requires that NUL characters be preserved, which brings + MIME into alignment with real-world implementations. + + (18) The definition of canonical text in MIME has been + tightened so that line breaks must be represented by a + CRLF sequence. CR and LF characters are not allowed + outside of this usage. The definition of quoted- + printable encoding has been altered accordingly. 
+ + (19) The definition of the quoted-printable encoding now + includes a number of suggestions for how quoted- + printable encoders might best handle improperly encoded + material. + + (20) Prose was added to clarify the use of the "7bit", + "8bit", and "binary" transfer-encodings on multipart or + message entities encapsulating "8bit" or "binary" data. + + (21) In the section on MIME Conformance, "multipart/digest" + support was added to the list of requirements for + minimal MIME conformance. Also, the requirements for + "message/rfc822" support were strengthened to clarify + the importance of recognizing recursive structure. + + (22) The various restrictions on subtypes of "message" are + now specified entirely on a subtype by subtype basis. + + (23) The definition of "message/rfc822" was changed to + indicate that at least one of the "From", "Subject", or + "Date" headers must be present. + + (24) The required handling of unrecognized subtypes as + "application/octet-stream" has been made more explicit + in both the type definitions sections and the + conformance guidelines. + + (25) Examples using text/richtext were changed to + text/enriched. + + (26) The BNF definition of subtype has been changed to make + it clear that either an IANA registered subtype or a + nonstandard "X-" subtype must be used in a Content-Type + header field. + + (27) MIME media types that are simply registered for use and + those that are standardized by the IETF are now + distinguished in the MIME BNF. + + (28) All of the various MIME registration procedures have + been extensively revised. IANA registration procedures + for character sets have been moved to a separate + document that is not included in this set of documents. 
+ + (29) The use of escape and shift mechanisms in the US-ASCII + and ISO-8859-X character sets these documents define + has been clarified: Such mechanisms should never be + used in conjunction with these character sets and their + effect if they are used is undefined. + + (30) The definition of the AFS access-type for + message/external-body has been removed. + + (31) The handling of the combination of + multipart/alternative and message/external-body is now + specifically addressed. + + (32) Security issues specific to message/external-body are + now discussed in some detail. + +Appendix C -- References + + [ATK] + Borenstein, Nathaniel S., Multimedia Applications + Development with the Andrew Toolkit, Prentice-Hall, 1990. + + [ISO-2022] + International Standard -- Information Processing -- + Character Code Structure and Extension Techniques, + ISO/IEC 2022:1994, 4th ed. + + [ISO-8859] + International Standard -- Information Processing -- 8-bit + Single-Byte Coded Graphic Character Sets + - Part 1: Latin Alphabet No. 1, ISO 8859-1:1987, 1st ed. + - Part 2: Latin Alphabet No. 2, ISO 8859-2:1987, 1st ed. + - Part 3: Latin Alphabet No. 3, ISO 8859-3:1988, 1st ed. + - Part 4: Latin Alphabet No. 4, ISO 8859-4:1988, 1st ed. + - Part 5: Latin/Cyrillic Alphabet, ISO 8859-5:1988, 1st + ed. + - Part 6: Latin/Arabic Alphabet, ISO 8859-6:1987, 1st ed. + - Part 7: Latin/Greek Alphabet, ISO 8859-7:1987, 1st ed. + - Part 8: Latin/Hebrew Alphabet, ISO 8859-8:1988, 1st ed. + - Part 9: Latin Alphabet No. 5, ISO/IEC 8859-9:1989, 1st + ed. + International Standard -- Information Technology -- 8-bit + Single-Byte Coded Graphic Character Sets + - Part 10: Latin Alphabet No. 6, ISO/IEC 8859-10:1992, + 1st ed. + + [ISO-646] + International Standard -- Information Technology -- ISO + 7-bit Coded Character Set for Information Interchange, + ISO 646:1991, 3rd ed. 
+ + [JPEG] + JPEG Draft Standard ISO 10918-1 CD. + + [MPEG] + Video Coding Draft Standard ISO 11172 CD, ISO + IEC/JTC1/SC2/WG11 (Motion Picture Experts Group), May, + 1991. + + + + + + +Freed & Borenstein Standards Track [Page 20] + +RFC 2049 MIME Conformance November 1996 + + + [PCM] + CCITT, Fascicle III.4 - Recommendation G.711, "Pulse Code + Modulation (PCM) of Voice Frequencies", Geneva, 1972. + + [POSTSCRIPT] + Adobe Systems, Inc., PostScript Language Reference + Manual, Addison-Wesley, 1985. + + [POSTSCRIPT2] + Adobe Systems, Inc., PostScript Language Reference + Manual, Addison-Wesley, Second Ed., 1990. + + [RFC-783] + Sollins, K.R., "TFTP Protocol (revision 2)", RFC-783, + MIT, June 1981. + + [RFC-821] + Postel, J.B., "Simple Mail Transfer Protocol", STD 10, + RFC 821, USC/Information Sciences Institute, August 1982. + + [RFC-822] + Crocker, D., "Standard for the Format of ARPA Internet + Text Messages", STD 11, RFC 822, UDEL, August 1982. + + [RFC-934] + Rose, M. and E. Stefferud, "Proposed Standard for Message + Encapsulation", RFC 934, Delaware and NMA, January 1985. + + [RFC-959] + Postel, J. and J. Reynolds, "File Transfer Protocol", STD + 9, RFC 959, USC/Information Sciences Institute, October + 1985. + + [RFC-1049] + Sirbu, M., "Content-Type Header Field for Internet + Messages", RFC 1049, CMU, March 1988. + + [RFC-1154] + Robinson, D., and R. Ullmann, "Encoding Header Field for + Internet Messages", RFC 1154, Prime Computer, Inc., April + 1990. + + [RFC-1341] + Borenstein, N., and N. Freed, "MIME (Multipurpose + Internet Mail Extensions): Mechanisms for Specifying and + Describing the Format of Internet Message Bodies", RFC + 1341, Bellcore, Innosoft, June 1992. + + + + +Freed & Borenstein Standards Track [Page 21] + +RFC 2049 MIME Conformance November 1996 + + + [RFC-1342] + Moore, K., "Representation of Non-Ascii Text in Internet + Message Headers", RFC 1342, University of Tennessee, June + 1992. 
+ + [RFC-1344] + Borenstein, N., "Implications of MIME for Internet Mail + Gateways", RFC 1344, Bellcore, June 1992. + + [RFC-1345] + Simonsen, K., "Character Mnemonics & Character Sets", RFC + 1345, Rationel Almen Planlaegning, June 1992. + + [RFC-1421] + Linn, J., "Privacy Enhancement for Internet Electronic + Mail: Part I -- Message Encryption and Authentication + Procedures", RFC 1421, IAB IRTF PSRG, IETF PEM WG, + February 1993. + + [RFC-1422] + Kent, S., "Privacy Enhancement for Internet Electronic + Mail: Part II -- Certificate-Based Key Management", RFC + 1422, IAB IRTF PSRG, IETF PEM WG, February 1993. + + [RFC-1423] + Balenson, D., "Privacy Enhancement for Internet + Electronic Mail: Part III -- Algorithms, Modes, and + Identifiers", IAB IRTF PSRG, IETF PEM WG, February 1993. + + [RFC-1424] + Kaliski, B., "Privacy Enhancement for Internet Electronic + Mail: Part IV -- Key Certification and Related + Services", IAB IRTF PSRG, IETF PEM WG, February 1993. + + [RFC-1521] + Borenstein, N., and Freed, N., "MIME (Multipurpose + Internet Mail Extensions): Mechanisms for Specifying and + Describing the Format of Internet Message Bodies", RFC + 1521, Bellcore, Innosoft, September, 1993. + + [RFC-1522] + Moore, K., "Representation of Non-ASCII Text in Internet + Message Headers", RFC 1522, University of Tennessee, + September 1993. + + + + + + + +Freed & Borenstein Standards Track [Page 22] + +RFC 2049 MIME Conformance November 1996 + + + [RFC-1524] + Borenstein, N., "A User Agent Configuration Mechanism for + Multimedia Mail Format Information", RFC 1524, Bellcore, + September 1993. + + [RFC-1543] + Postel, J., "Instructions to RFC Authors", RFC 1543, + USC/Information Sciences Institute, October 1993. + + [RFC-1556] + Nussbacher, H., "Handling of Bi-directional Texts in + MIME", RFC 1556, Israeli Inter-University Computer + Center, December 1993. 
+
+   [RFC-1590]
+        Postel, J., "Media Type Registration Procedure", RFC
+        1590, USC/Information Sciences Institute, March 1994.
+
+   [RFC-1602]
+        Internet Architecture Board, Internet Engineering
+        Steering Group, Huitema, C., Gross, P., "The Internet
+        Standards Process -- Revision 2", March 1994.
+
+   [RFC-1652]
+        Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M.,
+        Stefferud, E., and Crocker, D., "SMTP Service Extension
+        for 8bit-MIME transport", RFC 1652, United Nations
+        University, Innosoft, Dover Beach Consulting, Inc.,
+        Network Management Associates, Inc., The Branch Office,
+        March 1994.
+
+   [RFC-1700]
+        Reynolds, J. and J. Postel, "Assigned Numbers", STD 2,
+        RFC 1700, USC/Information Sciences Institute, October
+        1994.
+
+   [RFC-1741]
+        Faltstrom, P., Crocker, D., and Fair, E., "MIME Content
+        Type for BinHex Encoded Files", December 1994.
+
+   [RFC-1896]
+        Resnick, P., and A. Walker, "The text/enriched MIME
+        Content-type", RFC 1896, February, 1996.
+
+
+
+
+
+
+
+
+Freed & Borenstein          Standards Track                    [Page 23]
+
+RFC 2049                    MIME Conformance               November 1996
+
+
+   [RFC-2045]
+        Freed, N., and N. Borenstein, "Multipurpose Internet Mail
+        Extensions (MIME) Part One: Format of Internet Message
+        Bodies", RFC 2045, Innosoft, First Virtual Holdings,
+        November 1996.
+
+   [RFC-2046]
+        Freed, N., and N. Borenstein, "Multipurpose Internet Mail
+        Extensions (MIME) Part Two: Media Types", RFC 2046,
+        Innosoft, First Virtual Holdings, November 1996.
+
+   [RFC-2047]
+        Moore, K., "Multipurpose Internet Mail Extensions (MIME)
+        Part Three: Representation of Non-ASCII Text in Internet
+        Message Headers", RFC 2047, University of
+        Tennessee, November 1996.
+
+   [RFC-2048]
+        Freed, N., Klensin, J., and J. Postel, "Multipurpose
+        Internet Mail Extensions (MIME) Part Four: MIME
+        Registration Procedures", RFC 2048, Innosoft, MCI,
+        ISI, November 1996.
+
+   [RFC-2049]
+        Freed, N. and N.
Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part Five: Conformance Criteria and + Examples", RFC 2049 (this document), Innosoft, First + Virtual Holdings, November 1996. + + [US-ASCII] + Coded Character Set -- 7-Bit American Standard Code for + Information Interchange, ANSI X3.4-1986. + + [X400] + Schicker, Pietro, "Message Handling Systems, X.400", + Message Handling Systems and Distributed Applications, E. + Stefferud, O-j. Jacobsen, and P. Schicker, eds., North- + Holland, 1989, pp. 3-41. + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 24] + diff --git a/rfc/rfc2183.txt b/rfc/rfc2183.txt @@ -0,0 +1,675 @@ + + + + + + +Network Working Group R. Troost +Request for Comments: 2183 New Century Systems +Updates: 1806 S. Dorner +Category: Standards Track QUALCOMM Incorporated + K. Moore, Editor + University of Tennessee + August 1997 + + + Communicating Presentation Information in + Internet Messages: + The Content-Disposition Header Field + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + This memo provides a mechanism whereby messages conforming to the + MIME specifications [RFC 2045, RFC 2046, RFC 2047, RFC 2048, RFC + 2049] can convey presentational information. It specifies the + "Content-Disposition" header field, which is optional and valid for + any MIME entity ("message" or "body part"). Two values for this + header field are described in this memo; one for the ordinary linear + presentation of the body part, and another to facilitate the use of + mail to transfer files. 
It is expected that more values will be + defined in the future, and procedures are defined for extending this + set of values. + + This document is intended as an extension to MIME. As such, the + reader is assumed to be familiar with the MIME specifications, and + [RFC 822]. The information presented herein supplements but does not + replace that found in those documents. + + This document is a revision to the Experimental protocol defined in + RFC 1806. As compared to RFC 1806, this document contains minor + editorial updates, adds new parameters needed to support the File + Transfer Body Part, and references a separate specification for the + handling of non-ASCII and/or very long parameter values. + + + + + + + +Troost, et. al. Standards Track [Page 1] + +RFC 2183 Content-Disposition August 1997 + + +1. Introduction + + MIME specifies a standard format for encapsulating multiple pieces of + data into a single Internet message. That document does not address + the issue of presentation styles; it provides a framework for the + interchange of message content, but leaves presentation issues solely + in the hands of mail user agent (MUA) implementors. + + Two common ways of presenting multipart electronic messages are as a + main document with a list of separate attachments, and as a single + document with the various parts expanded (displayed) inline. The + display of an attachment is generally construed to require positive + action on the part of the recipient, while inline message components + are displayed automatically when the message is viewed. A mechanism + is needed to allow the sender to transmit this sort of presentational + information to the recipient; the Content-Disposition header provides + this mechanism, allowing each component of a message to be tagged + with an indication of its desired presentation semantics. + + Tagging messages in this manner will often be sufficient for basic + message formatting. 
However, in many cases a more powerful and + flexible approach will be necessary. The definition of such + approaches is beyond the scope of this memo; however, such approaches + can benefit from additional Content-Disposition values and + parameters, to be defined at a later date. + + In addition to allowing the sender to specify the presentational + disposition of a message component, it is desirable to allow her to + indicate a default archival disposition; a filename. The optional + "filename" parameter provides for this. Further, the creation-date, + modification-date, and read-date parameters allow preservation of + those file attributes when the file is transmitted over MIME email. + + NB: The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, + SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this + document, are to be interpreted as described in [RFC 2119]. + +2. The Content-Disposition Header Field + + Content-Disposition is an optional header field. In its absence, the + MUA may use whatever presentation method it deems suitable. + + It is desirable to keep the set of possible disposition types small + and well defined, to avoid needless complexity. Even so, evolving + usage will likely require the definition of additional disposition + types or parameters, so the set of disposition values is extensible; + see below. + + + + +Troost, et. al. 
Standards Track [Page 2]
+
+RFC 2183                  Content-Disposition                August 1997
+
+
+   In the extended BNF notation of [RFC 822], the Content-Disposition
+   header field is defined as follows:
+
+        disposition := "Content-Disposition" ":"
+                       disposition-type
+                       *(";" disposition-parm)
+
+        disposition-type := "inline"
+                          / "attachment"
+                          / extension-token
+                          ; values are not case-sensitive
+
+        disposition-parm := filename-parm
+                          / creation-date-parm
+                          / modification-date-parm
+                          / read-date-parm
+                          / size-parm
+                          / parameter
+
+        filename-parm := "filename" "=" value
+
+        creation-date-parm := "creation-date" "=" quoted-date-time
+
+        modification-date-parm := "modification-date" "=" quoted-date-time
+
+        read-date-parm := "read-date" "=" quoted-date-time
+
+        size-parm := "size" "=" 1*DIGIT
+
+        quoted-date-time := quoted-string
+                         ; contents MUST be an RFC 822 `date-time'
+                         ; numeric timezones (+HHMM or -HHMM) MUST be used
+
+
+
+   NOTE ON PARAMETER VALUE LENGTHS: A short (length <= 78 characters)
+   parameter value containing only non-`tspecials' characters SHOULD be
+   represented as a single `token'. A short parameter value containing
+   only ASCII characters, but including `tspecials' characters, SHOULD
+   be represented as `quoted-string'. Parameter values longer than 78
+   characters, or which contain non-ASCII characters, MUST be encoded as
+   specified in [RFC 2184].
+
+   `Extension-token', `parameter', `tspecials' and `value' are defined
+   according to [RFC 2045] (which references [RFC 822] in the definition
+   of some of these tokens). `quoted-string' and `DIGIT' are defined in
+   [RFC 822].
+
+
+
+
+Troost, et. al.             Standards Track                     [Page 3]
+
+RFC 2183                  Content-Disposition                August 1997
+
+
+2.1 The Inline Disposition Type
+
+   A bodypart should be marked `inline' if it is intended to be
+   displayed automatically upon display of the message. Inline
+   bodyparts should be presented in the order in which they occur,
+   subject to the normal semantics of multipart messages.
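Editorial sketch, not part of the committed RFC text: the grammar above can be exercised with Python's standard `email` package. This is a minimal illustration, assuming the header arrives unfolded on one line; `SAMPLE` and `read_disposition` are hypothetical names introduced here, and the header values are borrowed from the RFC's own examples.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical sample entity; header values follow the RFC's examples.
SAMPLE = (
    "Content-Type: image/jpeg\n"
    "Content-Disposition: ATTACHMENT; filename=genome.jpeg; "
    'modification-date="Wed, 12 Feb 1997 16:29:51 -0500"\n'
    "\n"
    "<jpeg data>\n"
)


def read_disposition(raw_entity):
    """Return (disposition-type, filename, modification date) for one entity."""
    msg = message_from_string(raw_entity)
    # Disposition types are not case-sensitive; the library lowercases them.
    dtype = msg.get_content_disposition()
    # The filename is only a suggestion and still needs local checks.
    filename = msg.get_filename()
    # quoted-date-time is a quoted-string holding an RFC 822 date-time.
    raw_date = msg.get_param("modification-date", header="content-disposition")
    modified = parsedate_to_datetime(raw_date) if raw_date else None
    return dtype, filename, modified


if __name__ == "__main__":
    dtype, filename, modified = read_disposition(SAMPLE)
    print(dtype, filename, modified.isoformat())
```

An entity without a Content-Disposition header yields `None` for all three fields, which matches the rule that the MUA may then pick whatever presentation it deems suitable.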
+ +2.2 The Attachment Disposition Type + + Bodyparts can be designated `attachment' to indicate that they are + separate from the main body of the mail message, and that their + display should not be automatic, but contingent upon some further + action of the user. The MUA might instead present the user of a + bitmap terminal with an iconic representation of the attachments, or, + on character terminals, with a list of attachments from which the + user could select for viewing or storage. + +2.3 The Filename Parameter + + The sender may want to suggest a filename to be used if the entity is + detached and stored in a separate file. If the receiving MUA writes + the entity to a file, the suggested filename should be used as a + basis for the actual filename, where possible. + + It is important that the receiving MUA not blindly use the suggested + filename. The suggested filename SHOULD be checked (and possibly + changed) to see that it conforms to local filesystem conventions, + does not overwrite an existing file, and does not present a security + problem (see Security Considerations below). + + The receiving MUA SHOULD NOT respect any directory path information + that may seem to be present in the filename parameter. The filename + should be treated as a terminal component only. Portable + specification of directory paths might possibly be done in the future + via a separate Content-Disposition parameter, but no provision is + made for it in this draft. + + Current [RFC 2045] grammar restricts parameter values (and hence + Content-Disposition filenames) to US-ASCII. We recognize the great + desirability of allowing arbitrary character sets in filenames, but + it is beyond the scope of this document to define the necessary + mechanisms. We expect that the basic [RFC 1521] `value' + specification will someday be amended to allow use of non-US-ASCII + characters, at which time the same mechanism should be used in the + Content-Disposition filename parameter. 
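Editorial sketch, not part of the committed RFC text: one way a receiving MUA might treat the suggested filename as a terminal component only. The concrete policy here is an assumption, not something the RFC mandates: unusual characters become underscores, leading dots are stripped, and a numeric suffix avoids overwriting. `safe_filename` and its `existing` argument are hypothetical names.

```python
import os
import re


def safe_filename(suggested, existing=()):
    """Derive a local filename from a sender-suggested one (assumed policy)."""
    # Ignore any directory path information: keep only the last component,
    # splitting on both "/" and "\" separators.
    name = re.split(r"[\\/]", suggested)[-1]
    # Keep a conservative character set; everything else becomes "_".
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    # Do not create hidden or startup files such as ".login".
    name = name.lstrip(".")
    if not name:
        name = "attachment"
    # Never overwrite an existing file: append a counter until unused.
    stem, ext = os.path.splitext(name)
    candidate, counter = name, 1
    while candidate in existing:
        candidate = f"{stem}-{counter}{ext}"
        counter += 1
    return candidate


if __name__ == "__main__":
    for hostile in ("/etc/passwd", "../../.login", "| sh", "genome.jpeg"):
        print(repr(hostile), "->", safe_filename(hostile, {"genome.jpeg"}))
```

The caller is expected to pass the names already present in the target directory as `existing`; checking the filesystem directly at write time would be more robust against races.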
+
+
+
+
+
+Troost, et. al.             Standards Track                     [Page 4]
+
+RFC 2183                  Content-Disposition                August 1997
+
+
+   Beyond the limitation to US-ASCII, the sending MUA may wish to bear
+   in mind the limitations of common filesystems. Many have severe
+   length and character set restrictions. Short alphanumeric filenames
+   are least likely to require modification by the receiving system.
+
+   The presence of the filename parameter does not force an
+   implementation to write the entity to a separate file. It is
+   perfectly acceptable for implementations to leave the entity as part
+   of the normal mail stream unless the user requests otherwise. As a
+   consequence, the parameter may be used on any MIME entity, even
+   `inline' ones. These will not normally be written to files, but the
+   parameter could be used to provide a filename if the receiving user
+   should choose to write the part to a file.
+
+2.4 The Creation-Date parameter
+
+   The creation-date parameter MAY be used to indicate the date at which
+   the file was created. If this parameter is included, the parameter
+   value MUST be a quoted-string which contains a representation of the
+   creation date of the file in [RFC 822] `date-time' format.
+
+   UNIX and POSIX implementors are cautioned that the `st_ctime' file
+   attribute of the `stat' structure is not the creation time of the
+   file; it is thus not appropriate as a source for the creation-date
+   parameter value.
+
+2.5 The Modification-Date parameter
+
+   The modification-date parameter MAY be used to indicate the date at
+   which the file was last modified. If the modification-date parameter
+   is included, the parameter value MUST be a quoted-string which
+   contains a representation of the last modification date of the file
+   in [RFC 822] `date-time' format.
+
+2.6 The Read-Date parameter
+
+   The read-date parameter MAY be used to indicate the date at which the
+   file was last read.
If the read-date parameter is included, the + parameter value MUST be a quoted-string which contains a + representation of the last-read date of the file in [RFC 822] `date- + time' format. + +2.7 The Size parameter + + The size parameter indicates an approximate size of the file in + octets. It can be used, for example, to pre-allocate space before + attempting to store the file, or to determine whether enough space + exists. + + + +Troost, et. al. Standards Track [Page 5] + +RFC 2183 Content-Disposition August 1997 + + +2.8 Future Extensions and Unrecognized Disposition Types + + In the likely event that new parameters or disposition types are + needed, they should be registered with the Internet Assigned Numbers + Authority (IANA), in the manner specified in Section 9 of this memo. + + Once new disposition types and parameters are defined, there is of + course the likelihood that implementations will see disposition types + and parameters they do not understand. Furthermore, since x-tokens + are allowed, implementations may also see entirely unregistered + disposition types and parameters. + + Unrecognized parameters should be ignored. Unrecognized disposition + types should be treated as `attachment'. The choice of `attachment' + for unrecognized types is made because a sender who goes to the + trouble of producing a Content-Disposition header with a new + disposition type is more likely aiming for something more elaborate + than inline presentation. + + Unless noted otherwise in the definition of a parameter, Content- + Disposition parameters are valid for all dispositions. (In contrast + to MIME content-type parameters, which are defined on a per-content- + type basis.) Thus, for example, the `filename' parameter still means + the name of the file to which the part should be written, even if the + disposition itself is unrecognized. 
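Editorial sketch, not part of the committed RFC text: the fallback rule for unrecognized disposition types in section 2.8 can be written in a few lines. `effective_disposition` is a hypothetical helper, and the simple split on ";" assumes the disposition type itself is a bare token, as the grammar requires.

```python
KNOWN_DISPOSITIONS = ("inline", "attachment")


def effective_disposition(header_value):
    """Map a raw Content-Disposition value to the type an MUA should act on."""
    if header_value is None:
        # Header absent: the MUA may use whatever presentation it prefers.
        return None
    # The disposition type is everything before the first ";".
    dtype = header_value.split(";", 1)[0].strip().lower()
    if dtype in KNOWN_DISPOSITIONS:
        return dtype
    # Unrecognized (including x-token) types are treated as `attachment'.
    return "attachment"


if __name__ == "__main__":
    for value in ("INLINE", "x-preview; filename=a.txt", None):
        print(repr(value), "->", effective_disposition(value))
```

Parameters are deliberately left untouched here: unrecognized ones are simply ignored, and `filename` keeps its meaning even when the disposition type is unknown.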
+
+2.9 Content-Disposition and Multipart
+
+   If a Content-Disposition header is used on a multipart body part, it
+   applies to the multipart as a whole, not the individual subparts.
+   The disposition types of the subparts do not need to be consulted
+   until the multipart itself is presented. When the multipart is
+   displayed, then the dispositions of the subparts should be respected.
+
+   If the `inline' disposition is used, the multipart should be
+   displayed as normal; however, an `attachment' subpart should require
+   action from the user to display.
+
+   If the `attachment' disposition is used, presentation of the
+   multipart should not proceed without explicit user action. Once the
+   user has chosen to display the multipart, the individual subpart
+   dispositions should be consulted to determine how to present the
+   subparts.
+
+
+
+
+
+
+
+
+Troost, et. al.             Standards Track                     [Page 6]
+
+RFC 2183                  Content-Disposition                August 1997
+
+
+2.10 Content-Disposition and the Main Message
+
+   It is permissible to use Content-Disposition on the main body of an
+   [RFC 822] message.
+
+3. Examples
+
+   Here is an example of a body part containing a JPEG image that is
+   intended to be viewed by the user immediately:
+
+        Content-Type: image/jpeg
+        Content-Disposition: inline
+        Content-Description: just a small picture of me
+
+        <jpeg data>
+
+   The following body part contains a JPEG image that should be
+   displayed to the user only if the user requests it. If the JPEG is
+   written to a file, the file should be named "genome.jpeg". The
+   recipient's user might also choose to set the last-modified date of
+   the stored file to the date in the modification-date parameter:
+
+        Content-Type: image/jpeg
+        Content-Disposition: attachment; filename=genome.jpeg;
+          modification-date="Wed, 12 Feb 1997 16:29:51 -0500"
+        Content-Description: a complete map of the human genome
+
+        <jpeg data>
+
+   The following is an example of the use of the `attachment'
+   disposition with a multipart body part.
The user should see text-
+   part-1 immediately, then take some action to view multipart-2. After
+   taking action to view multipart-2, the user will see text-part-2
+   right away, and be required to take action to view jpeg-1. Subparts
+   are indented for clarity; they would not be so indented in a real
+   message.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Troost, et. al.             Standards Track                     [Page 7]
+
+RFC 2183                  Content-Disposition                August 1997
+
+
+        Content-Type: multipart/mixed; boundary=outer
+        Content-Description: multipart-1
+
+        --outer
+          Content-Type: text/plain
+          Content-Disposition: inline
+          Content-Description: text-part-1
+
+          Some text goes here
+
+        --outer
+          Content-Type: multipart/mixed; boundary=inner
+          Content-Disposition: attachment
+          Content-Description: multipart-2
+
+          --inner
+            Content-Type: text/plain
+            Content-Disposition: inline
+            Content-Description: text-part-2
+
+            Some more text here.
+
+          --inner
+            Content-Type: image/jpeg
+            Content-Disposition: attachment
+            Content-Description: jpeg-1
+
+            <jpeg data>
+          --inner--
+        --outer--
+
+4. Summary
+
+   Content-Disposition takes one of two values, `inline' and
+   `attachment'. `Inline' indicates that the entity should be
+   immediately displayed to the user, whereas `attachment' means that
+   the user should take additional action to view the entity.
+
+   The `filename' parameter can be used to suggest a filename for
+   storing the bodypart, if the user wishes to store it in an external
+   file.
+
+
+
+
+
+
+
+
+
+
+Troost, et. al.             Standards Track                     [Page 8]
+
+RFC 2183                  Content-Disposition                August 1997
+
+
+5. Security Considerations
+
+   There are security issues involved any time users exchange data.
+   While these are not to be minimized, neither does this memo change
+   the status quo in that regard, except in one instance.
+
+   Since this memo provides a way for the sender to suggest a filename,
+   a receiving MUA must take care that the sender's suggested filename
+   does not represent a hazard.
Using UNIX as an example, some hazards
+   would be:
+
+        + Creating startup files (e.g., ".login").
+
+        + Creating or overwriting system files (e.g., "/etc/passwd").
+
+        + Overwriting any existing file.
+
+        + Placing executable files into any command search path
+          (e.g., "~/bin/more").
+
+        + Sending the file to a pipe (e.g., "| sh").
+
+   In general, the receiving MUA should not name or place the file such
+   that it will get interpreted or executed without the user explicitly
+   initiating the action.
+
+   It is very important to note that this is not an exhaustive list; it
+   is intended as a small set of examples only. Implementors must be
+   alert to the potential hazards on their target systems.
+
+6. References
+
+   [RFC 2119]
+        Bradner, S., "Key words for use in RFCs to Indicate Requirement
+        Levels", RFC 2119, March 1997.
+
+   [RFC 2184]
+        Freed, N. and K. Moore, "MIME Parameter value and Encoded Words:
+        Character Sets, Language, and Continuations", RFC 2184, August
+        1997.
+
+   [RFC 2045]
+        Freed, N. and N. Borenstein, "MIME (Multipurpose Internet Mail
+        Extensions) Part One: Format of Internet Message Bodies", RFC
+        2045, December 1996.
+
+
+
+
+
+
+Troost, et. al.             Standards Track                     [Page 9]
+
+RFC 2183                  Content-Disposition                August 1997
+
+
+   [RFC 2046]
+        Freed, N. and N. Borenstein, "MIME (Multipurpose Internet Mail
+        Extensions) Part Two: Media Types", RFC 2046, December 1996.
+
+   [RFC 2047]
+        Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part
+        Three: Message Header Extensions for non-ASCII Text", RFC 2047,
+        December 1996.
+
+   [RFC 2048]
+        Freed, N., Klensin, J. and J. Postel, "MIME (Multipurpose
+        Internet Mail Extensions) Part Four: Registration Procedures",
+        RFC 2048, December 1996.
+
+   [RFC 2049]
+        Freed, N. and N. Borenstein, "MIME (Multipurpose Internet Mail
+        Extensions) Part Five: Conformance Criteria and Examples", RFC
+        2049, December 1996.
+ + [RFC 822] + Crocker, D., "Standard for the Format of ARPA Internet Text + Messages", STD 11, RFC 822, UDEL, August 1982. + +7. Acknowledgements + + We gratefully acknowledge the help these people provided during the + preparation of this draft: + + Nathaniel Borenstein + Ned Freed + Keith Moore + Dave Crocker + Dan Pritchett + + + + + + + + + + + + + + + + + + +Troost, et. al. Standards Track [Page 10] + +RFC 2183 Content-Disposition August 1997 + + +8. Authors' Addresses + + You should blame the editor of this version of the document for any + changes since RFC 1806: + + Keith Moore + Department of Computer Science + University of Tennessee, Knoxville + 107 Ayres Hall + Knoxville TN 37996-1301 + USA + + Phone: +1 (423) 974-5067 + Fax: +1 (423) 974-8296 + Email: moore@cs.utk.edu + + + The authors of RFC 1806 are: + + Rens Troost + New Century Systems + 324 East 41st Street #804 + New York, NY, 10017 USA + + Phone: +1 (212) 557-2050 + Fax: +1 (212) 557-2049 + EMail: rens@century.com + + + Steve Dorner + QUALCOMM Incorporated + 6455 Lusk Boulevard + San Diego, CA 92121 + USA + + EMail: sdorner@qualcomm.com + + +9. Registration of New Content-Disposition Values and Parameters + + New Content-Disposition values (besides "inline" and "attachment") + may be defined only by Internet standards-track documents, or in + Experimental documents approved by the Internet Engineering Steering + Group. + + + + + + + +Troost, et. al. Standards Track [Page 11] + +RFC 2183 Content-Disposition August 1997 + + + New content-disposition parameters may be registered by supplying the + information in the following template and sending it via electronic + mail to IANA@IANA.ORG: + + To: IANA@IANA.ORG + Subject: Registration of new Content-Disposition parameter + + Content-Disposition parameter name: + + Allowable values for this parameter: + (If the parameter can only assume a small number of values, + list each of those values. 
Otherwise, describe the values
+        that the parameter can assume.)
+     Description:
+        (What is the purpose of this parameter and how is it used?)
+
+10. Changes since RFC 1806
+
+   The following changes have been made since the earlier version of
+   this document, published in RFC 1806 as an Experimental protocol:
+
+   + Updated references to MIME documents. In some cases this
+     involved substituting a reference to one of the current MIME
+     RFCs for a reference to RFC 1521; in other cases, a reference to
+     RFC 1521 was simply replaced with the word "MIME".
+
+   + Added a section on registration procedures, since none of the
+     procedures in RFC 2048 seemed to be appropriate.
+
+   + Added new parameter types: creation-date, modification-date,
+     read-date, and size.
+
+
+   + Incorporated a reference to draft-freed-pvcsc-* for encoding
+     long or non-ASCII parameter values.
+
+   + Added reference to RFC 2119 to define MUST, SHOULD, etc.
+     keywords.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Troost, et. al.             Standards Track                    [Page 12]
+
diff --git a/rfc/rfc2231.txt b/rfc/rfc2231.txt
@@ -0,0 +1,563 @@
+
+
+
+
+
+
+Network Working Group                                           N. Freed
+Request for Comments: 2231                                      Innosoft
+Updates: 2045, 2047, 2183                                       K. Moore
+Obsoletes: 2184                                  University of Tennessee
+Category: Standards Track                                  November 1997
+
+
+         MIME Parameter Value and Encoded Word Extensions:
+          Character Sets, Languages, and Continuations
+
+
+Status of this Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements. Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol. Distribution of this memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The Internet Society (1997). All Rights Reserved.
+
+1.
Abstract + + This memo defines extensions to the RFC 2045 media type and RFC 2183 + disposition parameter value mechanisms to provide + + (1) a means to specify parameter values in character sets + other than US-ASCII, + + (2) to specify the language to be used should the value be + displayed, and + + (3) a continuation mechanism for long parameter values to + avoid problems with header line wrapping. + + This memo also defines an extension to the encoded words defined in + RFC 2047 to allow the specification of the language to be used for + display as well as the character set. + +2. Introduction + + The Multipurpose Internet Mail Extensions, or MIME [RFC-2045, RFC- + 2046, RFC-2047, RFC-2048, RFC-2049], define a message format that + allows for: + + + + + +Freed & Moore Standards Track [Page 1] + +RFC 2231 MIME Value and Encoded Word Extensions November 1997 + + + (1) textual message bodies in character sets other than + US-ASCII, + + (2) non-textual message bodies, + + (3) multi-part message bodies, and + + (4) textual header information in character sets other than + US-ASCII. + + MIME is now widely deployed and is used by a variety of Internet + protocols, including, of course, Internet email. However, MIME's + success has resulted in the need for additional mechanisms that were + not provided in the original protocol specification. + + In particular, existing MIME mechanisms provide for named media type + (content-type field) parameters as well as named disposition + (content-disposition field). A MIME media type may specify any + number of parameters associated with all of its subtypes, and any + specific subtype may specify additional parameters for its own use. A + MIME disposition value may specify any number of associated + parameters, the most important of which is probably the attachment + disposition's filename parameter. 
+ + These parameter names and values end up appearing in the content-type + and content-disposition header fields in Internet email. This + inherently imposes three crucial limitations: + + (1) Lines in Internet email header fields are folded + according to RFC 822 folding rules. This makes long + parameter values problematic. + + (2) MIME headers, like the RFC 822 headers they often + appear in, are limited to 7bit US-ASCII, and the + encoded-word mechanisms of RFC 2047 are not available + to parameter values. This makes it impossible to have + parameter values in character sets other than US-ASCII + without specifying some sort of private per-parameter + encoding. + + (3) It has recently become clear that character set + information is not sufficient to properly display some + sorts of information -- language information is also + needed [RFC-2130]. For example, support for + handicapped users may require reading text string + + + + + + +Freed & Moore Standards Track [Page 2] + +RFC 2231 MIME Value and Encoded Word Extensions November 1997 + + + aloud. The language the text is written in is needed + for this to be done correctly. Some parameter values + may need to be displayed, hence there is a need to + allow for the inclusion of language information. + + The last problem on this list is also an issue for the encoded words + defined by RFC 2047, as encoded words are intended primarily for + display purposes. + + This document defines extensions that address all of these + limitations. All of these extensions are implemented in a fashion + that is completely compatible at a syntactic level with existing MIME + implementations. In addition, the extensions are designed to have as + little impact as possible on existing uses of MIME. + + IMPORTANT NOTE: These mechanisms end up being somewhat gibbous when + they actually are used. As such, these mechanisms should not be used + lightly; they should be reserved for situations where a real need for + them exists. 
+ +2.1. Requirements notation + + This document occasionally uses terms that appear in capital letters. + When the terms "MUST", "SHOULD", "MUST NOT", "SHOULD NOT", and "MAY" + appear capitalized, they are being used to indicate particular + requirements of this specification. A discussion of the meanings of + these terms appears in [RFC- 2119]. + +3. Parameter Value Continuations + + Long MIME media type or disposition parameter values do not interact + well with header line wrapping conventions. In particular, proper + header line wrapping depends on there being places where linear + whitespace (LWSP) is allowed, which may or may not be present in a + parameter value, and even if present may not be recognizable as such + since specific knowledge of parameter value syntax may not be + available to the agent doing the line wrapping. The result is that + long parameter values may end up getting truncated or otherwise + damaged by incorrect line wrapping implementations. + + A mechanism is therefore needed to break up parameter values into + smaller units that are amenable to line wrapping. Any such mechanism + MUST be compatible with existing MIME processors. This means that + + (1) the mechanism MUST NOT change the syntax of MIME media + type and disposition lines, and + + + + + +Freed & Moore Standards Track [Page 3] + +RFC 2231 MIME Value and Encoded Word Extensions November 1997 + + + (2) the mechanism MUST NOT depend on parameter ordering + since MIME states that parameters are not order + sensitive. Note that while MIME does prohibit + modification of MIME headers during transport, it is + still possible that parameters will be reordered when + user agent level processing is done. + + The obvious solution, then, is to use multiple parameters to contain + a single parameter value and to use some kind of distinguished name + to indicate when this is being done. 
And this obvious solution is + exactly what is specified here: The asterisk character ("*") followed + by a decimal count is employed to indicate that multiple parameters + are being used to encapsulate a single parameter value. The count + starts at 0 and increments by 1 for each subsequent section of the + parameter value. Decimal values are used and neither leading zeroes + nor gaps in the sequence are allowed. + + The original parameter value is recovered by concatenating the + various sections of the parameter, in order. For example, the + content-type field + + Content-Type: message/external-body; access-type=URL; + URL*0="ftp://"; + URL*1="cs.utk.edu/pub/moore/bulk-mailer/bulk-mailer.tar" + + is semantically identical to + + Content-Type: message/external-body; access-type=URL; + URL="ftp://cs.utk.edu/pub/moore/bulk-mailer/bulk-mailer.tar" + + Note that quotes around parameter values are part of the value + syntax; they are NOT part of the value itself. Furthermore, it is + explicitly permitted to have a mixture of quoted and unquoted + continuation fields. + +4. Parameter Value Character Set and Language Information + + Some parameter values may need to be qualified with character set or + language information. It is clear that a distinguished parameter + name is needed to identify when this information is present along + with a specific syntax for the information in the value itself. In + addition, a lightweight encoding mechanism is needed to accommodate 8 + bit information in parameter values. + + + + + + + + +Freed & Moore Standards Track [Page 4] + +RFC 2231 MIME Value and Encoded Word Extensions November 1997 + + + Asterisks ("*") are reused to provide the indicator that language and + character set information is present and encoding is being used. A + single quote ("'") is used to delimit the character set and language + information at the beginning of the parameter value. 
Percent signs + ("%") are used as the encoding flag, which agrees with RFC 2047. + + Specifically, an asterisk at the end of a parameter name acts as an + indicator that character set and language information may appear at + the beginning of the parameter value. A single quote is used to + separate the character set, language, and actual value information in + the parameter value string, and an percent sign is used to flag + octets encoded in hexadecimal. For example: + + Content-Type: application/x-stuff; + title*=us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A + + Note that it is perfectly permissible to leave either the character + set or language field blank. Note also that the single quote + delimiters MUST be present even when one of the field values is + omitted. This is done when either character set, language, or both + are not relevant to the parameter value at hand. This MUST NOT be + done in order to indicate a default character set or language -- + parameter field definitions MUST NOT assign a default character set + or language. + +4.1. Combining Character Set, Language, and Parameter Continuations + + Character set and language information may be combined with the + parameter continuation mechanism. For example: + + Content-Type: application/x-stuff + title*0*=us-ascii'en'This%20is%20even%20more%20 + title*1*=%2A%2A%2Afun%2A%2A%2A%20 + title*2="isn't it!" + + Note that: + + (1) Language and character set information only appear at + the beginning of a given parameter value. + + (2) Continuations do not provide a facility for using more + than one character set or language in the same + parameter value. + + (3) A value presented using multiple continuations may + contain a mixture of encoded and unencoded segments. + + + + + +Freed & Moore Standards Track [Page 5] + +RFC 2231 MIME Value and Encoded Word Extensions November 1997 + + + (4) The first segment of a continuation MUST be encoded if + language and character set information are given. 
+ + (5) If the first segment of a continued parameter value is + encoded the language and character set field delimiters + MUST be present even when the fields are left blank. + +5. Language specification in Encoded Words + + RFC 2047 provides support for non-US-ASCII character sets in RFC 822 + message header comments, phrases, and any unstructured text field. + This is done by defining an encoded word construct which can appear + in any of these places. Given that these are fields intended for + display, it is sometimes necessary to associate language information + with encoded words as well as just the character set. This + specification extends the definition of an encoded word to allow the + inclusion of such information. This is simply done by suffixing the + character set specification with an asterisk followed by the language + tag. For example: + + From: =?US-ASCII*EN?Q?Keith_Moore?= <moore@cs.utk.edu> + +6. IMAP4 Handling of Parameter Values + + IMAP4 [RFC-2060] servers SHOULD decode parameter value continuations + when generating the BODY and BODYSTRUCTURE fetch attributes. + +7. Modifications to MIME ABNF + + The ABNF for MIME parameter values given in RFC 2045 is: + + parameter := attribute "=" value + + attribute := token + ; Matching of attributes + ; is ALWAYS case-insensitive. 
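The grammar above is why the mechanisms of sections 3 through 4.1 stay compatible with existing MIME processors: "*" is not a tspecial, so a name such as URL*0 or title*1* is still an ordinary attribute token under RFC 2045. As an illustration only, a receiver might reassemble such parameters along the lines of the Python sketch below. The function name decode_rfc2231, the input format (name/value pairs whose surrounding quotes have already been removed), and the fallback to US-ASCII when the charset field is blank are assumptions of this sketch, not part of the specification.

```python
import re
from urllib.parse import unquote_to_bytes

# attribute, optional "*<section>" (no leading zeroes), optional trailing "*"
_NAME = re.compile(r"^([^*]+)(?:\*(0|[1-9][0-9]*))?(\*)?$")


def decode_rfc2231(params):
    """Reassemble parameter values from (name, value) pairs.

    Hypothetical helper: values are expected without their surrounding
    quotes, and the pairs may arrive in any order.
    Returns {attribute: (charset, language, text)}.
    """
    pieces = {}
    for name, value in params:
        match = _NAME.match(name)
        if match is None:
            raise ValueError(f"bad parameter name: {name!r}")
        attr, section, extended = match.groups()
        # Attribute matching is case-insensitive; a name without a
        # section number is treated as section 0.
        sections = pieces.setdefault(attr.lower(), {})
        sections[int(section or 0)] = (value, bool(extended))

    result = {}
    for attr, sections in pieces.items():
        # Section numbers must run 0, 1, 2, ... without gaps.
        if sorted(sections) != list(range(len(sections))):
            raise ValueError(f"gap in continuation sequence for {attr!r}")
        charset, language, raw = "", "", b""
        for index in range(len(sections)):
            value, extended = sections[index]
            if extended and index == 0:
                # charset'language'value, delimiters always present
                charset, language, value = value.split("'", 2)
            # Only segments whose name ends in "*" are percent-encoded.
            raw += unquote_to_bytes(value) if extended else value.encode("ascii")
        # Assumption of this sketch: a blank charset is read as US-ASCII.
        result[attr] = (charset, language, raw.decode(charset or "us-ascii"))
    return result
```

Applied to the three title segments of the section 4.1 example, this returns the charset us-ascii, the language en, and the text "This is even more ***fun*** isn't it!". The percent decoder used here also accepts lowercase hexadecimal digits, which is more lenient than the ext-octet rule.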
+ + This specification changes this ABNF to: + + parameter := regular-parameter / extended-parameter + + regular-parameter := regular-parameter-name "=" value + + regular-parameter-name := attribute [section] + + attribute := 1*attribute-char + + + + + +Freed & Moore Standards Track [Page 6] + +RFC 2231 MIME Value and Encoded Word Extensions November 1997 + + + attribute-char := <any (US-ASCII) CHAR except SPACE, CTLs, + "*", "'", "%", or tspecials> + + section := initial-section / other-sections + + initial-section := "*0" + + other-sections := "*" ("1" / "2" / "3" / "4" / "5" / + "6" / "7" / "8" / "9") *DIGIT) + + extended-parameter := (extended-initial-name "=" + extended-value) / + (extended-other-names "=" + extended-other-values) + + extended-initial-name := attribute [initial-section] "*" + + extended-other-names := attribute other-sections "*" + + extended-initial-value := [charset] "'" [language] "'" + extended-other-values + + extended-other-values := *(ext-octet / attribute-char) + + ext-octet := "%" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") + + charset := <registered character set name> + + language := <registered language tag [RFC-1766]> + + The ABNF given in RFC 2047 for encoded-words is: + + encoded-word := "=?" charset "?" encoding "?" encoded-text "?=" + + This specification changes this ABNF to: + + encoded-word := "=?" charset ["*" language] "?" encoded-text "?=" + +8. Character sets which allow specification of language + + In the future it is likely that some character sets will provide + facilities for inline language labeling. Such facilities are + inherently more flexible than those defined here as they allow for + language switching in the middle of a string. + + + + + + + +Freed & Moore Standards Track [Page 7] + +RFC 2231 MIME Value and Encoded Word Extensions November 1997 + + + If and when such facilities are developed they SHOULD be used in + preference to the language labeling facilities specified here. 
Note + that all the mechanisms defined here allow for the omission of + language labels so as to be able to accommodate this possible future + usage. + +9. Security Considerations + + This RFC does not discuss security issues and is not believed to + raise any security issues not already endemic in electronic mail and + present in fully conforming implementations of MIME. + +10. References + + [RFC-822] + Crocker, D., "Standard for the Format of ARPA Internet + Text Messages", STD 11, RFC 822 August 1982. + + [RFC-1766] + Alvestrand, H., "Tags for the Identification of + Languages", RFC 1766, March 1995. + + [RFC-2045] + Freed, N., and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part One: Format of Internet Message + Bodies", RFC 2045, December 1996. + + [RFC-2046] + Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part Two: Media Types", RFC 2046, + December 1996. + + [RFC-2047] + Moore, K., "Multipurpose Internet Mail Extensions (MIME) + Part Three: Representation of Non-ASCII Text in Internet + Message Headers", RFC 2047, December 1996. + + [RFC-2048] + Freed, N., Klensin, J. and J. Postel, "Multipurpose + Internet Mail Extensions (MIME) Part Four: MIME + Registration Procedures", RFC 2048, December 1996. + + [RFC-2049] + Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part Five: Conformance Criteria and + Examples", RFC 2049, December 1996. + + + + + +Freed & Moore Standards Track [Page 8] + +RFC 2231 MIME Value and Encoded Word Extensions November 1997 + + + [RFC-2060] + Crispin, M., "Internet Message Access Protocol - Version + 4rev1", RFC 2060, December 1996. + + [RFC-2119] + Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", RFC 2119, March 1997. + + [RFC-2130] + Weider, C., Preston, C., Simonsen, K., Alvestrand, H., + Atkinson, R., Crispin, M., and P. Svanberg, "Report from the + IAB Character Set Workshop", RFC 2130, April 1997. 
+ + [RFC-2183] + Troost, R., Dorner, S. and K. Moore, "Communicating + Presentation Information in Internet Messages: The + Content-Disposition Header", RFC 2183, August 1997. + +11. Authors' Addresses + + Ned Freed + Innosoft International, Inc. + 1050 Lakes Drive + West Covina, CA 91790 + USA + + Phone: +1 626 919 3600 + Fax: +1 626 919 3614 + EMail: ned.freed@innosoft.com + + + Keith Moore + Computer Science Dept. + University of Tennessee + 107 Ayres Hall + Knoxville, TN 37996-1301 + USA + + EMail: moore@cs.utk.edu + + + + + + + + + + + + +Freed & Moore Standards Track [Page 9] + +RFC 2231 MIME Value and Encoded Word Extensions November 1997 + + +12. Full Copyright Statement + + Copyright (C) The Internet Society (1997). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. 
+ + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + + + + + + + + + + + + + + + + + + + + + + + + +Freed & Moore Standards Track [Page 10] + diff --git a/rfc/rfc2387.txt b/rfc/rfc2387.txt @@ -0,0 +1,563 @@ + + + + + + +Network Working Group E. Levinson +Request for Comments: 2387 August 1998 +Obsoletes: 2112 +Category: Standards Track + + + The MIME Multipart/Related Content-type + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (1998). All Rights Reserved. + +Abstract + + The Multipart/Related content-type provides a common mechanism for + representing objects that are aggregates of related MIME body parts. + This document defines the Multipart/Related content-type and provides + examples of its use. + +1. Introduction + + Several applications of MIME, including MIME-PEM, and MIME-Macintosh + and other proposals, require multiple body parts that make sense only + in the aggregate. The present approach to these compound objects has + been to define specific multipart subtypes for each new object. In + keeping with the MIME philosophy of having one mechanism to achieve + the same goal for different purposes, this document describes a + single mechanism for such aggregate or compound objects. 
+ + The Multipart/Related content-type addresses the MIME representation + of compound objects. The object is categorized by a "type" + parameter. Additional parameters are provided to indicate a specific + starting body part or root and auxiliary information which may be + required when unpacking or processing the object. + + Multipart/Related MIME entities may contain Content-Disposition + headers that provide suggestions for the storage and display of a + body part. Multipart/Related processing takes precedence over + Content-Disposition; the interaction between them is discussed in + section 4. + + + +Levinson Standards Track [Page 1] + +RFC 2387 Multipart/Related August 1998 + + + Responsibility for the display or processing of a Multipart/Related's + constituent entities rests with the application that handles the + compound object. + +2. Multipart/Related Registration Information + + The following form is copied from RFC 1590, Appendix A. + + To: IANA@isi.edu + Subject: Registration of new Media Type content-type/subtype + + Media Type name: Multipart + + Media subtype name: Related + + Required parameters: Type, a media type/subtype. + + Optional parameters: Start + Start-info + + Encoding considerations: Multipart content-types cannot have + encodings. + + Security considerations: Depends solely on the referenced type. + + Published specification: RFC-REL (this document). + + Person & email address to contact for further information: + Edward Levinson + 47 Clive Street + Metuchen, NJ 08840-1060 + +1 908 494 1606 + XIson@cnj.digex.net + +3. Intended usage + + The Multipart/Related media type is intended for compound objects + consisting of several inter-related body parts. For a + Multipart/Related object, proper display cannot be achieved by + individually displaying the constituent body parts. The content-type + of the Multipart/Related object is specified by the type parameter. 
+ The "start" parameter, if given, points, via a content-ID, to the + body part that contains the object root. The default root is the + first body part within the Multipart/Related body. + + The relationships among the body parts of a compound object + distinguishes it from other object types. These relationships are + often represented by links internal to the object's components that + + + +Levinson Standards Track [Page 2] + +RFC 2387 Multipart/Related August 1998 + + + reference the other components. Within a single operating + environment the links are often file names, such links may be + represented within a MIME message using content-IDs or the value of + some other "Content-" headers. + +3.1. The Type Parameter + + The type parameter must be specified and its value is the MIME media + type of the "root" body part. It permits a MIME user agent to + determine the content-type without reference to the enclosed body + part. If the value of the type parameter and the root body part's + content-type differ then the User Agent's behavior is undefined. + +3.2. The Start Parameter + + The start parameter, if given, is the content-ID of the compound + object's "root". If not present the "root" is the first body part in + the Multipart/Related entity. The "root" is the element the + applications processes first. + +3.3. The Start-Info Parameter + + Additional information can be provided to an application by the + start-info parameter. It contains either a string or points, via a + content-ID, to another MIME entity in the message. A typical use + might be to provide additional command line parameters or a MIME + entity giving auxiliary information for processing the compound + object. + + Applications that use Multipart/Related must specify the + interpretation of start-info. User Agents shall provide the + parameter's value to the processing application. 
Processes can + distinguish a start-info reference from a token or quoted-string by + examining the first non-white-space character, "<" indicates a + reference. + +3.4. Syntax + + related-param := [ ";" "start" "=" cid ] + [ ";" "start-info" "=" + ( cid-list / value ) ] + [ ";" "type" "=" type "/" subtype ] + ; order independent + + cid-list := cid cid-list + + cid := msg-id ; c.f. [822] + + + + +Levinson Standards Track [Page 3] + +RFC 2387 Multipart/Related August 1998 + + + value := token / quoted-string ; c.f. [MIME] + ; value cannot begin with "<" + + Note that the parameter values will usually require quoting. Msg-id + contains the special characters "<", ">", "@", and perhaps other + special characters. If msg-id contains quoted-strings, those quote + marks must be escaped. Similarly, the type parameter contains the + special character "/". + +4. Handling Content-Disposition Headers + + Content-Disposition Headers [DISP] suggest presentation styles for + MIME body parts. [DISP] describes two presentation styles, called + the disposition type, INLINE and ATTACHMENT. These, used within a + multipart entity, allow the sender to suggest presentation + information. [DISP] also provides for an optional storage (file) + name. Content-Disposition headers could appear in one or more body + parts contained within a Multipart/Related entity. + + Using Content-Disposition headers in addition to Multipart/Related + provides presentation information to User Agents that do not + recognize Multipart/Related. They will treat the multipart as + Multipart/Mixed and they may find the Content-Disposition information + useful. + + With Multipart/Related however, the application processing the + compound object determines the presentation style for all the + contained parts. In that context the Content-Disposition header + information is redundant or even misleading. 
Hence, User Agents that + understand Multipart/Related shall ignore the disposition type within + a Multipart/Related body part. + + It may be possible for a User Agent capable of handling both + Multipart/Related and Content-Disposition headers to provide the + invoked application the Content-Disposition header's optional + filename parameter to the Multipart/Related. The use of that + information will depend on the specific application and should be + specified when describing the handling of the corresponding compound + object. Such descriptions would be appropriate in an RFC registering + that object's media type. + +5. Examples + +5.1 Application/X-FixedRecord + + The X-FixedRecord content-type consists of one or more octet-streams + and a list of the lengths of each record. The root, which lists the + record lengths of each record within the streams. The record length + + + +Levinson Standards Track [Page 4] + +RFC 2387 Multipart/Related August 1998 + + + list, type Application/X-FixedRecord, consists of a set of INTEGERs + in ASCII format, one per line. Each INTEGER gives the number of + octets from the octet-stream body part that constitute the next + "record". + + The example below, uses a single data block. 
+ + Content-Type: Multipart/Related; boundary=example-1 + start="<950120.aaCC@XIson.com>"; + type="Application/X-FixedRecord" + start-info="-o ps" + + --example-1 + Content-Type: Application/X-FixedRecord + Content-ID: <950120.aaCC@XIson.com> + + 25 + 10 + 34 + 10 + 25 + 21 + 26 + 10 + --example-1 + Content-Type: Application/octet-stream + Content-Description: The fixed length records + Content-Transfer-Encoding: base64 + Content-ID: <950120.aaCB@XIson.com> + + T2xkIE1hY0RvbmFsZCBoYWQgYSBmYXJtCkUgSS + BFIEkgTwpBbmQgb24gaGlzIGZhcm0gaGUgaGFk + IHNvbWUgZHVja3MKRSBJIEUgSSBPCldpdGggYS + BxdWFjayBxdWFjayBoZXJlLAphIHF1YWNrIHF1 + YWNrIHRoZXJlLApldmVyeSB3aGVyZSBhIHF1YW + NrIHF1YWNrCkUgSSBFIEkgTwo= + + --example-1-- + + + + + + + + + + + + + +Levinson Standards Track [Page 5] + +RFC 2387 Multipart/Related August 1998 + + +5.2 Text/X-Okie + + The Text/X-Okie is an invented markup language permitting the + inclusion of images with text. A feature of this example is the + inclusion of two additional body parts, both picture. They are + referred to internally by the encapsulated document via each + picture's body part content-ID. Usage of "cid:", as in this example, + may be useful for a variety of compound objects. It is not, however, + a part of the Multipart/Related specification. + + Content-Type: Multipart/Related; boundary=example-2; + start="<950118.AEBH@XIson.com>" + type="Text/x-Okie" + + --example-2 + Content-Type: Text/x-Okie; charset=iso-8859-1; + declaration="<950118.AEB0@XIson.com>" + Content-ID: <950118.AEBH@XIson.com> + Content-Description: Document + + {doc} + This picture was taken by an automatic camera mounted ... + {image file=cid:950118.AECB@XIson.com} + {para} + Now this is an enlargement of the area ... 
+ {image file=cid:950118:AFDH@XIson.com} + {/doc} + --example-2 + Content-Type: image/jpeg + Content-ID: <950118.AFDH@XIson.com> + Content-Transfer-Encoding: BASE64 + Content-Description: Picture A + + [encoded jpeg image] + --example-2 + Content-Type: image/jpeg + Content-ID: <950118.AECB@XIson.com> + Content-Transfer-Encoding: BASE64 + Content-Description: Picture B + + [encoded jpeg image] + --example-2-- + +5.3 Content-Disposition + + In the above example each image body part could also have a Content- + Disposition header. For example, + + + + +Levinson Standards Track [Page 6] + +RFC 2387 Multipart/Related August 1998 + + + --example-2 + Content-Type: image/jpeg + Content-ID: <950118.AECB@XIson.com> + Content-Transfer-Encoding: BASE64 + Content-Description: Picture B + Content-Disposition: INLINE + + [encoded jpeg image] + --example-2-- + + User Agents that recognize Multipart/Related will ignore the + Content-Disposition header's disposition type. Other User Agents + will process the Multipart/Related as Multipart/Mixed and may make + use of that header's information. + +6. User Agent Requirements + + User agents that do not recognize Multipart/Related shall, in + accordance with [MIME], treat the entire entity as Multipart/Mixed. + MIME User Agents that do recognize Multipart/Related entities but are + unable to process the given type should give the user the option of + suppressing the entire Multipart/Related body part shall be. + + Existing MIME-capable mail user agents (MUAs) handle the existing + media types in a straightforward manner. For discrete media types + (e.g. text, image, etc.) the body of the entity can be directly + passed to a display process. Similarly the existing composite + subtypes can be reduced to handing one or more discrete types. + Handling Multipart/Related differs in that processing cannot be + reduced to handling the individual entities. + + The following sections discuss what information the processing + application requires. 
+ + It is possible that an application specific "receiving agent" will + manipulate the entities for display prior to invoking actual + application process. Okie, above, is an example of this; it may need + a receiving agent to parse the document and substitute local file + names for the originator's file names. Other applications may just + require a table showing the correspondence between the local file + names and the originator's. The receiving agent takes responsibility + for such processing. + +6.1 Data Requirements + + MIME-capable mail user agents (MUAs) are required to provide the + application: + + + + +Levinson Standards Track [Page 7] + +RFC 2387 Multipart/Related August 1998 + + + (a) the bodies of the MIME entities and the entity Content-* headers, + + (b) the parameters of the Multipart/Related Content-type header, and + + (c) the correspondence between each body's local file name, that + body's header data, and, if present, the body part's content-ID. + +6.2 Storing Multipart/Related Entities + + The Multipart/Related media type will be used for objects that have + internal linkages between the body parts. When the objects are + stored the linkages may require processing by the application or its + receiving agent. + +6.3 Recursion + + MIME is a recursive structure. Hence one must expect a + Multipart/Related entity to contain other Multipart/Related entities. + When a Multipart/Related entity is being processed for display or + storage, any enclosed Multipart/Related entities shall be processed + as though they were being stored. + +6.4 Configuration Considerations + + It is suggested that MUAs that use configuration mechanisms, see + [CFG] for an example, refer to Multipart/Related as Multi- + part/Related/<type>, were <type> is the value of the "type" + parameter. + +7. Security Considerations + + Security considerations relevant to Multipart/Related are identical + to those of the underlying content-type. + +8. 
Acknowledgments + + This proposal is the result of conversations the author has had with + many people. In particular, Harald A. Alvestrand, James Clark, + Charles Goldfarb, Gary Houston, Ned Freed, Ray Moody, and Don + Stinchfield, provided both encouragement and invaluable help. The + author, however, take full responsibility for all errors contained in + this document. + + + + + + + + + +Levinson Standards Track [Page 8] + +RFC 2387 Multipart/Related August 1998 + + +9. References + + [822] Crocker, D., "Standard for the Format of ARPA Internet + Text Messages", STD 11, RFC 822, August 1982. + + [CID] Levinson, E., and J. Clark, "Message/External-Body + Content-ID Access Type", RFC 1873, December 1995, + Levinson, E., "Message/External-Body Content-ID Access + Type", Work in Progress. + + [CFG] Borenstein, N., "A User Agent Configuration Mechanism For + Multimedia Mail Format Information", RFC 1524, September + 1993. + + [DISP] Troost, R., and S. Dorner, "Communicating Presentation + Information in Internet Messages: The Content- + Disposition Header", RFC 1806, June 1995. + + [MIME] Borenstein, N., and Freed, N., "Multipurpose Internet + Mail Extensions (MIME) Part One: Format of Internet + Message Bodies", RFC 2045, November 1996. + +9. Author's Address + + Edward Levinson + 47 Clive Street + Metuchen, NJ 08840-1060 + USA + + Phone: +1 908 494 1606 + EMail: XIson@cnj.digex.com + +10. Changes from previous draft (RFC 2112) + + Corrected cid urls to conform to RFC 2111; the angle brackets were + removed. + + + + + + + + + + + + + + + +Levinson Standards Track [Page 9] + +RFC 2387 Multipart/Related August 1998 + + +11. Full Copyright Statement + + Copyright (C) The Internet Society (1998). All Rights Reserved. 
+ + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + + + + + + + + + + + + + + + + + + + + + + + + +Levinson Standards Track [Page 10] + diff --git a/rfc/rfc2425.txt b/rfc/rfc2425.txt @@ -0,0 +1,1851 @@ + + + + + + +Network Working Group T. Howes +Request for Comments: 2425 M. Smith +Category: Standards Track Netscape Communications Corp. + F. Dawson + Lotus Development Corporation + September 1998 + + + A MIME Content-Type for Directory Information + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. 
Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (1998). All Rights Reserved. + +1. Abstract + + This document defines a MIME Content-Type for holding directory + information. The definition is independent of any particular + directory service or protocol. The text/directory Content-Type is + defined for holding a variety of directory information, for example, + name, or email address, or logo. The text/directory Content-Type can + also be used as the root body part in a multipart/related Content- + Type for handling more complicated situations, especially those in + which non-textual information that already has a natural MIME + representation, for example, a photograph or sound, is to be + represented. + + The text/directory Content-Type defines a general framework and + format for holding directory information in a simple "type:value" + form. We refer to "type" in this context meaning a property or + attribute with which the value is associated. Mechanisms are defined + to specify alternate languages, encodings and other meta-information. + This document also defines the procedure by which particular formats, + called profiles, for carrying application-specific information within + a text/directory Content-Type can be defined and registered, and the + conventions such formats must follow. It is expected that other + documents will be produced that define such formats for various + applications (e.g., white pages). + + + + + +Howes, et. al. Standards Track [Page 1] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY" and "OPTIONAL" in this + document are to be interpreted as described in [RFC-2119]. + +2. 
Table of Contents + + Status of the Memo................................................ 1 + Copyright Notice.................................................. 1 + 1. Abstract...................................................... 1 + 2. Table of Contents............................................. 2 + 3. Need for a MIME Directory Type................................ 3 + 4. Overview...................................................... 4 + 5. The text/directory Content-Type............................... 4 + 5.1. MIME media type name........................................ 4 + 5.2. MIME subtype name........................................... 5 + 5.3. Required parameters......................................... 5 + 5.4. Optional parameters......................................... 5 + 5.5. Encoding considerations..................................... 5 + 5.6. Security considerations..................................... 6 + 5.7. Interoperability considerations............................. 6 + 5.8. Published specification..................................... 6 + 5.8.1. Line delimiting and folding............................... 6 + 5.8.2. ABNF content-type definition.............................. 7 + 5.8.3. Pre-defined Parameters.................................... 9 + 5.8.4. Pre-defined Value Types...................................11 + 5.9. Applications which use this media type......................14 + 5.10. Additional information.....................................14 + 5.11. Person & email address to contact for further information..14 + 5.12. Intended usage.............................................14 + 5.13. Author/Change controller...................................15 + 6. Predefined Types..............................................15 + 6.1. SOURCE Type Definition......................................15 + 6.2. NAME Type Definition........................................16 + 6.3. PROFILE Type Definition.....................................16 + 6.4. 
BEGIN Type Definition.......................................17 + 6.5. END Type Definition.........................................17 + 7. Use of the multipart/related Content-Type.....................18 + 8. Examples.......................................................18 + 8.1. Example 1...................................................19 + 8.2. Example 2...................................................19 + 8.3. Example 3...................................................20 + 8.4. Example 4...................................................21 + 9. Registration of new profiles..................................22 + 9.1. Define the profile..........................................22 + 9.2. Post the profile definition.................................23 + 9.3. Allow a comment period......................................23 + 9.4. Submit the profile for approval.............................23 + 10. Profile Change Control.......................................23 + + + +Howes, et. al. Standards Track [Page 2] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + 11. Registration of new types....................................24 + 11.1. Define the type............................................24 + 11.2. Post the type definition...................................25 + 11.3. Allow a comment period.....................................25 + 11.4. Submit the type for approval...............................25 + 12. Type Change Control..........................................25 + 13. Registration of new parameters...............................26 + 13.1. Define the parameter.......................................26 + 13.2. Post the parameter definition..............................27 + 13.3. Allow a comment period.....................................27 + 13.4. Submit the parameter for approval..........................27 + 14. Parameter Change Control.....................................28 + 15. 
Registration of new value types..............................28 + 15.1. Define the value type......................................28 + 15.2. Post the value type definition.............................29 + 15.3. Allow a comment period.....................................29 + 15.4. Submit the value type for approval.........................29 + 16. Security Considerations......................................30 + 17. Acknowledgements..............................................30 + 18. References....................................................30 + 19. Authors' Addresses...........................................32 + 20. Full Copyright Statement......................................33 + +3. Need for a MIME Directory Type + + For purposes of this document, a directory is a special-purpose + database that contains typed information. A directory usually + supports both read and search of the information it contains, and can + support creation and modification of the information as well. + Directory information is usually accessed far more often than it is + updated. Directories can be local or global in scope. They can be + distributed or centralized. The information they contain can be + replicated, with weak or strong consistency requirements. + + There are several situations in which users of Internet mail might + wish to exchange directory information: the email analogy of a + "business card" exchange; the conveyance of directory information to + a user having only email access to the Internet; the provision of + machine-parseable address information when purchasing goods or + services over the Internet; etc. As MIME [RFC-2045, RFC-2046] is + used increasingly by other protocols, most notably HTTP, it can also + be useful for these protocols to carry directory information in MIME + format. 
Such a format, for example, could be used to represent URC + (uniform resource characteristics) information about resources on the + World Wide Web, or to provide a rudimentary directory service over + HTTP. + + + + + +Howes, et. al. Standards Track [Page 3] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +4. Overview + + The scheme defined here for representing directory information in a + MIME Content-Type has two parts. First, the text/directory Content- + Type is defined for use in holding directory information within a + single body part, for example name, title, or email address. In its + simplest form, the format uses a "type:value" approach, which should + be easily parseable by existing MIME implementations and + understandable by users. More complicated situations can be + represented also. This document defines the general form the + information in the Content-Type should have, and the procedure by + which specific types and values (properties) for particular + applications can be defined. The framework is general enough to + handle information from any number of end directory services, + including LDAP [RFC-1777, RFC-1778], WHOIS++ [RFC-1835], and X.500 + [X500]. + + Directory entries can include far more than just textual information. + Some such information (e.g., an image or sound) overlaps with + predefined MIME Content-Types. In these cases it can be desirable to + include the information in its well-known MIME format. This situation + is handled by using a multipart/related Content-Type as defined in + [RFC-2112]. The root component of this type is a text/directory body + part specifying any in-line information, and for information + contained in other Content-Types, the Content-IDs (in URI form) of + those parts. + + In some applications, it can be useful to include a pointer (e.g, a + URI) to some directory information rather than the information + itself. 
This document defines a general mechanism for accomplishing + this. + +5. The text/directory Content-Type + + The text/directory Content-Type is used to hold basic directory + information and URIs referencing other information, including other + MIME body parts holding supplementary or non-textual directory + information, such as an image or sound. It is defined as follows, + using the MIME media type registration template from [RFC-2048]. + + To: ietf-types@uninett.no + Subject: Registration of MIME media type text/directory + +5.1. MIME media type name + + MIME media type name: text + + + + + +Howes, et. al. Standards Track [Page 4] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +5.2. MIME subtype name + + MIME subtype name: directory + +5.3. Required parameters + + Required parameters: charset + + The "charset" parameter is as defined in [RFC-2046] for other body + parts. It is used to identify the default character set used within + the body part. + +5.4. Optional parameters + + Optional parameters: profile + + The "profile" parameter is used to convey the type(s) of entity(ies) + to which the directory information pertains and the likely set of + information associated with the entity(ies). It is intended only as a + guide to applications interpreting the information contained within + the body part. It SHOULD NOT be used to exclude or require particular + pieces of information unless a profile definition specifically calls + for this behavior. Unless specifically forbidden by a particular + profile definition, a text/directory content type can contain + arbitrary attribute/value pairs. + + The value of the "profile" parameter is defined as follows. Profile + names are case insensitive (i.e., the profile name "vCard" is the + same as "VCARD" and "vcard" and "vcArD"). 
+ + profile = x-name / iana-token + + x-name = "x-" 1*(ALPHA / DIGIT / "-") + ; Names beginning with "x-" or "X-" are + ; reserved for experimental use not intended for released + ; products, or for use in bilateral agreements. + + iana-token = <a publicly-defined extension token, registered + with IANA, as specified in Section 9 of this + document> + +5.5. Encoding considerations + + The default encoding is 8bit. Otherwise, as specified by the + Content-Transfer-Encoding header field. + + + + + + +Howes, et. al. Standards Track [Page 5] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +5.6. Security considerations + + Directory information can be public or it can be protected from + unauthorized access by the directory service in which it resides. + Once the information leaves its native service, there can be no + guarantee that the same care will be taken by all services handling + the information. Furthermore, this specification defines no access + control mechanism by which information can be protected, or by which + access control information can be conveyed. Note that the integrity + and privacy of a text/directory body part can be protected by + enclosing it within an appropriate MIME-based security mechanism. + +5.7. Interoperability considerations + + In order to make sense of directory information, applications must + share a common understanding of the types of information contained + within the Content-Type (the directory schema). This schema + information is not defined in this document, but rather in companion + documents (e.g., [MIME-VCARD]) that follow the requirements specified + in this document, or in bilateral agreements between communicating + parties. + +5.8. Published specification + + The text/directory Content-Type contains directory information, + typically pertaining to a single directory entity or group of + entities. The content consists of one or more lines in the format + given below. + +5.8.1. 
Line delimiting and folding + + Individual lines within the MIME text/directory Content Type body are + delimited by the [RFC-822] line break, which is a CRLF sequence + (ASCII decimal 13, followed by ASCII decimal 10). Long logical lines + of text can be split into a multiple-physical-line representation + using the following folding technique. + + A logical line MAY be continued on the next physical line anywhere + between two characters by inserting a CRLF immediately followed by a + single white space character (space, ASCII decimal 32, or horizontal + tab, ASCII decimal 9). At least one character must be present on the + folded line. Any sequence of CRLF followed immediately by a single + white space character is ignored (removed) when processing the + content type. For example the line: + + DESCRIPTION:This is a long description that exists on a long line. + + Can be represented as: + + + +Howes, et. al. Standards Track [Page 6] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + DESCRIPTION:This is a long description + that exists on a long line. + + It could also be represented as: + + DESCRIPTION:This is a long descrip + tion that exists o + n a long line. + + The process of moving from this folded multiple-line representation + of a type definition to its single line representation is called + unfolding. Unfolding is accomplished by regarding CRLF immediately + followed by a white space character (namely HTAB ASCII decimal 9 or + SPACE ASCII decimal 32) as equivalent to no characters at all (i.e., + the CRLF and single white space character are removed). + +5.8.2. ABNF content-type definition + + The following ABNF uses the notation of RFC 2234, which also defines + CRLF, WSP, DQUOTE, VCHAR, ALPHA, and DIGIT. 
After the unfolding of + any folded lines as described above, the syntax for a line of this + content type is as follows: + + contentline = [group "."] name *(";" param) ":" value CRLF + ; When parsing a content line, folded lines MUST first + ; be unfolded according to the unfolding procedure + ; described above. + ; When generating a content line, lines longer than 75 + ; characters SHOULD be folded according to the folding + ; procedure described above. + + group = 1*(ALPHA / DIGIT / "-") + + name = x-name / iana-token + + iana-token = 1*(ALPHA / DIGIT / "-") + ; identifier registered with IANA + + x-name = "x-" 1*(ALPHA / DIGIT / "-") + ; Names that begin with "x-" or "X-" are + ; reserved for experimental use, not intended for released + ; products, or for use in bilateral agreements. + + param = param-name "=" param-value *("," param-value) + + param-name = x-name / iana-token + + param-value = ptext / quoted-string + + ptext = *SAFE-CHAR + + value = *VALUE-CHAR + / valuespec ; valuespec defined in section 5.8.4 + + quoted-string = DQUOTE *QSAFE-CHAR DQUOTE + + NON-ASCII = %x80-FF + ; use restricted by charset parameter + ; on outer MIME object (UTF-8 preferred) + + QSAFE-CHAR = WSP / %x21 / %x23-7E / NON-ASCII + ; Any character except CTLs, DQUOTE + + SAFE-CHAR = WSP / %x21 / %x23-2B / %x2D-39 / %x3C-7E / NON-ASCII + ; Any character except CTLs, DQUOTE, ";", ":", "," + + VALUE-CHAR = WSP / VCHAR / NON-ASCII + ; any textual character + + A line that begins with a white space character is a continuation of + the previous line, as described above. The white space character and + immediately preceding CRLF should be discarded when reconstructing + the original line. 
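The unfolding rule and the contentline grammar above can be sketched in Python as follows. This is an illustrative sketch only and is not part of the specification: the names unfold, parse_contentline and _split_unquoted are hypothetical, the input is assumed to be already decoded to text, and no characters are checked against SAFE-CHAR or VALUE-CHAR.

```python
import re


def unfold(text):
    """Remove every CRLF that is immediately followed by one space or tab."""
    return re.sub(r"\r\n[ \t]", "", text)


def _split_unquoted(text, sep):
    """Split text on sep, ignoring separators inside a quoted-string."""
    parts, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == sep and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_contentline(line):
    """Split one unfolded content line (without its CRLF) into its parts."""
    head, *rest = _split_unquoted(line, ":")
    if not rest:
        raise ValueError("content line has no ':' separator")
    # Everything after the first unquoted ":" is the value, colons included.
    value = ":".join(rest)
    name_part, *raw_params = _split_unquoted(head, ";")
    # An optional group precedes the type name, separated by ".".
    group, _, name = name_part.rpartition(".")
    params = {}
    for raw in raw_params:
        pname, _, pvalue = raw.partition("=")
        values = _split_unquoted(pvalue, ",") if pvalue else []
        # Parameter names are case insensitive; values are kept as written.
        params[pname.lower()] = [v.strip('"') for v in values]
    return {
        "group": group or None,
        "name": name.lower(),  # type names are case insensitive
        "params": params,
        "value": value,
    }
```

Applied to the folded DESCRIPTION example of section 5.8.1, unfold() yields the single logical line, and parse_contentline() then separates the type name from its value. Lower-casing the type and parameter names is a design choice of the sketch that reflects their case insensitivity; parameter values are left untouched because their case sensitivity depends on their definition.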
Note that this line-folding convention differs + from that found in RFC 822, in that the sequence <CRLF><WSP> found + anywhere in the content indicates a continued line and should be + removed. + + Various type names and the format of the corresponding values are + defined as specified in Section 11. Specifications MAY impose + ordering on the type constructs within a body part, though none is + required by default. The various x-name constructs are used for + bilaterally-agreed upon type names, parameter names and parameter + values, or for use in experimental settings. + + Type names and parameter names are case insensitive (e.g., the type + name "fn" is the same as "FN" and "Fn"). Parameter values MAY be case + sensitive or case insensitive, depending on their definition. + + The group construct is used to group related attributes together. + The group name is a syntactic convention used to indicate that all + type names prefaced with the same group name SHOULD be grouped + together when displayed by an application. It has no other + significance. Implementations that do not understand or support + grouping MAY simply strip off any text before a "." to the left of + the type name and present the types and values as normal. + + + + + +Howes, et. al. Standards Track [Page 8] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + Each attribute defined in the text/directory body MAY have multiple + values, if allowed in the definition of the profile in which the + attribute is used. The general rule for encoding multi-valued items + is to simply create a new content line for each value (including the + type name). However, it should be noted that some value types + support encoding multiple values in a single content line by + separating the values with a comma ",". This approach has been taken + for several of the content types defined below (date, time, integer, + float), for space-saving reasons. + +5.8.3. 
Pre-defined Parameters + + The following parameters and value types are defined for general use. + + predefined-param = encodingparm + / valuetypeparm + / languageparm + / contextparm + + encodingparm = "encoding" "=" encodingtype + + encodingtype = "b" ; from RFC 2047 + / iana-token ; registered as described in + ; section 15 of this document + + valuetypeparm = "value" "=" valuetype + + valuetype = "uri" ; genericurl from section 5 of RFC 1738 + / "text" + / "date" + / "time" + / "date-time" ; date time + / "integer" + / "boolean" + / "float" + / x-name + / iana-token ; registered as described in + ; section 15 of this document + + languageparm = "language" "=" Language-Tag + ; Language-Tag is defined in section 2 of RFC 1766 + + contextparm = "context" "=" context + + context = x-name + / iana-token + + The "language" type parameter is used to identify data in multiple + languages. There is no concept of "default" language, except as + specified by any "Content-Language" MIME header parameter that is + present. The value of the "language" type parameter is a language + tag as defined in Section 2 of [RFC-1766]. + + The "context" type parameter is used to identify a context (e.g., a + protocol) used in interpreting the value. This is used, for example, + in the "source" type, defined below. + + The "encoding" type parameter is used to specify an alternate + encoding for a value. If the value contains a CRLF, it must be + encoded, since CRLF is used to separate lines in the content-type + itself. Currently, only the "b" encoding is supported. + + The "b" encoding can also be useful for binary values that are mixed + with other text information in the body part (e.g., a certificate). + Using a per-value "b" encoding in this case leaves the other + information in a more readable form. 
The encoded base 64 value can be + split across multiple physical lines in the content type by using the + line folding technique described above. + + The Content-Transfer-Encoding header field is used to specify the + encoding used for the body part as a whole. The "encoding" type + parameter is used to specify an encoding for a particular value + (e.g., a certificate). In this case, the Content-Transfer-Encoding + header might specify "8bit", while the one certificate value might + specify an encoding of "b" via an "encoding=b" type parameter. + + The Content-Transfer-Encoding and the encodings of individual types + given by the "encoding" type parameter are independent of one + another. When encoding a text/directory body part for transmission, + individual type encodings are performed first, then the entire body + part is encoded according to the Content-Transfer-Encoding. When + decoding a text/directory body part, the Content-Transfer-Encoding is + decoded first, and then any individual types with an "encoding" type + parameter are decoded. + + The "value" parameter is optional, and is used to identify the value + type (data type) and format of the value. The use of these + predefined formats is encouraged even if the value parameter is not + explicitly used. By defining a standard set of value types and their + formats, existing parsing and processing code can be leveraged. + + Including the value type explicitly as part of each property provides + an extra hint to keep parsing simple and support more generalized + applications. For example a search engine would not have to know the + particular value types for all of the items for which it is + searching. Because the value type is explicit in the definition, the + search engine could look for dates in any item type and provide + results that can still be interpreted. + +5.8.4. 
Pre-defined Value Types + + The formats for values corresponding to the predefined valuetype + specifications given above are defined as follows. + + valuespec = text-list + / genericurl ; from section 5 of RFC 1738 + / date-list + / time-list + / date-time-list + / boolean + / integer-list + / float-list + / iana-valuespec + + text-list = *TEXT-LIST-CHAR *("," *TEXT-LIST-CHAR) + + TEXT-LIST-CHAR = "\\" / "\," / "\n" + / <any VALUE-CHAR except , or \ or newline> + ; Backslashes, newlines, and commas must be encoded. + ; \n or \N can be used to encode a newline. + + date-list = date *("," date) + + time-list = time *("," time) + + date-time-list = date "T" time *("," date "T" time) + + boolean = "TRUE" / "FALSE" + + integer-list = integer *("," integer) + + integer = [sign] 1*DIGIT + + float-list = float *("," float) + + float = [sign] 1*DIGIT ["." 1*DIGIT] + + sign = "+" / "-" + + date = date-fullyear ["-"] date-month ["-"] date-mday + + date-fullyear = 4 DIGIT + + date-month = 2 DIGIT ;01-12 + + date-mday = 2 DIGIT ;01-28, 01-29, 01-30, 01-31 + ;based on month/year + + time = time-hour [":"] time-minute [":"] time-second [time-secfrac] + [time-zone] + + time-hour = 2 DIGIT ;00-23 + + time-minute = 2 DIGIT ;00-59 + + time-second = 2 DIGIT ;00-60 (leap second) + + time-secfrac = "," 1*DIGIT + + time-zone = "Z" / time-numzone + + time-numzone = sign time-hour [":"] time-minute + + iana-valuespec = <a publicly-defined valuetype format, registered + with IANA, as defined in section 15 of this + document> + + Some specific notes on the value types and formats: + + "text": The "text" value type should be used to identify values that + contain human-readable text. The character set and language in which + the text is represented are controlled by the charset content-header + and the language type parameter and content-header. 
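The text-list grammar above can be read as "split on unescaped commas, then decode the backslash escapes". The following Python sketch illustrates that reading; it is not part of the specification, the name split_text_list is hypothetical, and keeping unknown escape sequences unchanged is an assumption of the sketch (the grammar does not define them).

```python
def split_text_list(value):
    """Split an unfolded text-list value into its decoded items."""
    items, current = [], []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            if nxt in "nN":
                current.append("\n")      # \n or \N encodes a newline
            elif nxt in ",\\":
                current.append(nxt)       # \, and \\ encode themselves
            else:
                # Assumption: an escape the grammar does not define is kept as is.
                current.append(ch + nxt)
            i += 2
            continue
        if ch == ",":
            # An unescaped comma separates two values.
            items.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    items.append("".join(current))
    return items
```

Decoding while splitting, rather than splitting first and decoding afterwards, avoids mistaking the comma in "\," for a separator.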
+ + Examples for "text": + this is a text value + this is one value,this is another + this is a single value\, with a comma encoded + + A formatted text line break in a text value type MUST be represented + as the character sequence backslash (ASCII decimal 92) followed by a + Latin small letter n (ASCII decimal 110) or a Latin capital letter N + (ASCII decimal 78), that is "\n" or "\N". + + For example a multiple line DESCRIPTION value of: + + Mythical Manager + Hyjinx Software Division + BabsCo, Inc. + + could be represented as: + + + +Howes, et. al. Standards Track [Page 12] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + DESCRIPTION:Mythical Manager\nHyjinx Software Division\n + BabsCo\, Inc.\n + + demonstrating the \n literal formatted line break technique, the + CRLF-followed-by-space line folding technique, and the backslash + escape technique. + + "uri": The "uri" value type should be used to identify values that + are referenced by a URI (including a Content-ID URI), instead of + encoded in-line. These value references might be used if the value is + too large, or otherwise undesirable to include directly. The format + for the URI is as defined in RFC 1738. + + Examples for "uri": + http://www.foobar.com/my/picture.jpg + ldap://ldap.foobar.com/cn=babs%20jensen + + "date", "time", and "date-time": Each of these value types is based + on a subset of the definitions in ISO 8601 standard. Profiles MAY + place further restrictions on "date" and "time" values. Multiple + "date" and "time" values can be specified using the comma-separated + notation, unless restricted by a profile. 
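As an illustration of the comma-separated notation, a "date" list following the date grammar of this section could be parsed along these lines. This is a hedged sketch and not part of the specification: the name parse_date_list is hypothetical, and relying on Python's datetime.date to reject impossible month and day values is a choice of the sketch. The "time" value type is deliberately left out.

```python
import re
from datetime import date

# date = date-fullyear ["-"] date-month ["-"] date-mday
_DATE_RE = re.compile(r"([0-9]{4})-?([0-9]{2})-?([0-9]{2})")


def parse_date_list(value):
    """Parse a comma-separated list of "date" values into date objects."""
    result = []
    for item in value.split(","):
        match = _DATE_RE.fullmatch(item)
        if match is None:
            raise ValueError(f"not a date value: {item!r}")
        year, month, day = (int(group) for group in match.groups())
        # datetime.date raises ValueError for out-of-range months or days.
        result.append(date(year, month, day))
    return result
```

Both the extended form with hyphens and the basic form without them are accepted, since each hyphen is optional in the grammar.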
 + + Examples for "date": + 1985-04-12 + 1996-08-05,1996-11-11 + 19850412 + + Examples for "time": + 10:22:00 + 102200 + 10:22:00.33 + 10:22:00.33Z + 10:22:33,11:22:00 + 10:22:00-08:00 + + Examples for "date-time": + 1996-10-22T14:00:00Z + 1996-08-11T12:34:56Z + 19960811T123456Z + 1996-10-22T14:00:00Z,1996-08-11T12:34:56Z + + "boolean": The "boolean" value type is used to express boolean values. + These values are case insensitive. + + Examples: TRUE + false + True + + "integer": The "integer" value type is used to express signed + integers in decimal format. If sign is not specified, the value is + assumed positive "+". Multiple "integer" values can be specified + using the comma-separated notation, unless restricted by a profile. + + Examples: 1234567890 + -1234556790 + +1234556790,432109876 + + "float": The "float" value type is used to express real numbers. If + sign is not specified, the value is assumed positive "+". Multiple + "float" values can be specified using the comma-separated notation, + unless restricted by a profile. + + Examples: 20.30 + 1000000.0000001 + 1.333,3.14 + +5.9. Applications which use this media type + + Applications which use this media type: Various + +5.10. Additional information + + Additional information: None + +5.11. Person & email address to contact for further information + + Tim Howes + Netscape Communications Corp. + 501 East Middlefield Rd. + Mountain View, CA 94041 + USA + howes@netscape.com + +1 415 937 3419 + +5.12. Intended usage + + Intended usage: COMMON + +5.13. Author/Change controller + + Tim Howes + Netscape Communications Corp. + 501 East Middlefield Rd. 
 + Mountain View, CA 94041 + USA + howes@netscape.com + +1 415 937 3419 + + Mark Smith + Netscape Communications Corp. + 501 East Middlefield Rd. + Mountain View, CA 94041 + USA + mcs@netscape.com + +1 415 937 3477 + + Frank Dawson + Lotus Development Corporation + 6544 Battleford Drive + Raleigh, NC 27613-3502 + USA + frank_dawson@lotus.com + +1-919-676-9515 + +6. Predefined Types + + The following types are generally useful regardless of the profile + being carried and are defined below using the text/directory MIME + type registration template defined in Section 11.1 of this document. + These types MAY be included in any profile, unless explicitly + forbidden in the profile definition. + +6.1. SOURCE Type Definition + + To: ietf-mime-direct@imc.org + Subject: Registration of text/directory MIME type SOURCE + + Type name: SOURCE + + Type purpose: To identify the source of directory information + contained in the content type. + + Type encoding: 8bit + + Type valuetype: uri + + Type special notes: The SOURCE type is used to provide the means by + which applications knowledgeable in the given directory service + protocol can obtain additional or more up-to-date information from + the directory service. It contains a URI as defined in [RFC-1738] + and/or other information referencing the directory entity or entities + to which the information pertains. When directory information is + available from more than one source, the sending entity can pick what + it considers to be the best source, or multiple SOURCE types can be + included. The interpretation of the value for a SOURCE type can + depend on the setting of the CONTEXT type parameter. The value of the + CONTEXT type parameter MUST be compatible with the value of the uri + prefix. + + Type example: + SOURCE;CONTEXT=LDAP:ldap://ldap.host/cn=Babs%20Jensen, + %20o=Babsco,%20c=US + +6.2. 
NAME Type Definition + + To: ietf-mime-direct@imc.org + Subject: Registration of text/directory MIME type NAME + + Type name: NAME + + Type purpose: To identify the displayable name of the directory + entity to which information in the content type pertains. + + Type encoding: 8bit + + Type valuetype: text + + Type special notes: The NAME type is used to convey the display name + of the entity to which the directory information pertains. + + Type example: + NAME:Babs Jensen's Contact Information + +6.3. PROFILE Type Definition + + To: ietf-mime-direct@imc.org + Subject: Registration of text/directory MIME type PROFILE + + Type name: PROFILE + + Type purpose: To identify the type of directory entity to which + information in the content type pertains. + + Type encoding: 8bit + + Type valuetype: A profile name, registered as described in Section 9 + of this document or bilaterally agreed upon as described in Section + 5. + + Type special notes: The PROFILE type is used to convey the type of + the entity to which the directory information in the rest of the body + part pertains. It should be the same as the "profile" header + parameter, if present. + + Type example: + PROFILE:vCard + +6.4. BEGIN Type Definition + + To: ietf-mime-direct@imc.org + Subject: Registration of text/directory MIME type BEGIN + + Type name: BEGIN + + Type purpose: To denote the beginning of a syntactic entity within a + text/directory content-type. + + Type encoding: 8bit + + Type valuetype: text, containing a profile name, registered as + described in Section 9 of this document or bilaterally-agreed upon as + described in Section 5. + + Type special notes: The BEGIN type is used in conjunction with the + END type to delimit a profile containing a related set of properties + within a text/directory content-type. 
This construct can be used + instead of or in addition to wrapping separate sets of information + inside additional MIME headers. It is provided for applications that + wish to define content that can contain multiple entities within the + same text/directory content-type or to define content that can be + identifiable outside of a MIME environment. + + Type example: + BEGIN:VCARD + +6.5. END Type Definition + + To: ietf-mime-direct@imc.org + Subject: Registration of text/directory MIME type END + + Type name: END + + Type purpose: To denote the end of a syntactic entity within a + text/directory content-type. + + Type encoding: 8bit + + Type valuetype: text, containing a profile name, registered as + described in Section 9 of this document or bilaterally-agreed upon as + described in Section 5. + + Type special notes: The END type is used in conjunction with the + BEGIN type to delimit a profile containing a related set of + properties within a text/directory content-type. This construct can + be used instead of or in addition to wrapping separate sets of + information inside additional MIME headers. It is provided for + applications that wish to define content that can contain multiple + entities within the same text/directory content-type or to define + content that can be identifiable outside of a MIME environment. + + Type example: + END: VCARD + +7. Use of the multipart/related Content-Type + + The multipart/related Content-Type can be used to hold directory + information comprised of both text and non-text information or + directory information that already has a natural MIME representation. + The root body part within the multipart/related body part is + specified as defined in [RFC-2112] by a "start" parameter, or it is + the first body part in the absence of such a parameter. 
The root + body part must have a Content-Type of "text/directory". This part + holds inline information and makes reference to subsequent body parts + holding additional text or non-text directory information via their + Content-ID URIs as explained in Section 5. + + The body parts referred to do not have to be in any particular order, + except as noted above for the root body part. + +8. Examples + + The following examples are for illustrative purposes only and are not + part of the definition. + + + + + + + + + + +Howes, et. al. Standards Track [Page 18] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +8.1. Example 1 + + The first example illustrates simple use of the text/directory + Content-Type. Note that no "profile" parameter is given, so an + application may not know what kind of directory entity the + information applies to. Note also the use of both hypothetical + official and bilaterally agreed upon types. + + From: Whomever@wherever.com + To: Someone@somewhere.com + Subject: whatever + MIME-Version: 1.0 + Message-ID: <id1@host.net> + Content-Type: text/directory + Content-ID: <id2@host.com> + + cn:Babs Jensen + cn:Barbara J Jensen + sn:Jensen + email:babs@umich.edu + phone:+1 313 747-4454 + x-id:1234567890 + +8.2. Example 2 + + The next example illustrates the use of the Quoted-Printable transfer + encoding defined in [RFC 2045] to include non-ASCII character in some + of the information returned, and the use of the optional "name" and + "source" types. It also illustrates the use of an "encoding" type + parameter to encode a certificate value in "b". A "vCard" profile + [MIME- VCARD] is used for the example. 
+ +Content-Type: text/directory; + charset="iso-8859-1"; + profile="vCard" +Content-ID: <id3@host.com> +Content-Transfer-Encoding: Quoted-Printable + +begin:VCARD +source:ldap://cn=bjorn%20Jensen, o=university%20of%20Michigan, c=US +name:Bjorn Jensen +fn:Bj=F8rn Jensen +n:Jensen;Bj=F8rn +email;type=internet:bjorn@umich.edu +tel;type=work,voice,msg:+1 313 747-4454 +key;type=x509;encoding=B:dGhpcyBjb3VsZCBiZSAKbXkgY2VydGlmaWNhdGUK +end:VCARD + + + + +Howes, et. al. Standards Track [Page 19] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +8.3. Example 3 + + The next example illustrates the use of multi-valued type parameters, + the "language" type parameter, the "value" type parameter, folding of + long lines, the \n encoding for formatted lines, attribute grouping, + and the inline "b" encoding. A "vCard" profile [MIME-VCARD] is used + for the example. + +Content-Type: text/directory; profile="vcard"; charset=iso-8859-1 +Content-ID: <id3@host.com> +Content-Transfer-Encoding: Quoted-Printable + +begin:vcard +source:ldap://cn=Meister%20Berger,o=Universitaet%20Goerlitz,c=DE +name:Meister Berger +fn:Meister Berger +n:Berger;Meister +bday;value=date:1963-09-21 +o:Universit=E6t G=F6rlitz +title:Mayor +title;language=de;value=text:Burgermeister +note:The Mayor of the great city of + Goerlitz in the great country of Germany. 
+email;internet:mb@goerlitz.de +home.tel;type=fax,voice,msg:+49 3581 123456 +home.label:Hufenshlagel 1234\n + 02828 Goerlitz\n + Deutschland +key;type=X509;encoding=b:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcNAQEEBQ + AwdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENvbW11bmljYXRpb25zI + ENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0ZW1zMRwwGgYDVQQD + ExNyb290Y2EubmV0c2NhcGUuY29tMB4XDTk3MDYwNjE5NDc1OVoXDTk3MTIwMzE5NDc + 1OVowgYkxCzAJBgNVBAYTAlVTMSYwJAYDVQQKEx1OZXRzY2FwZSBDb21tdW5pY2F0aW + 9ucyBDb3JwLjEYMBYGA1UEAxMPVGltb3RoeSBBIEhvd2VzMSEwHwYJKoZIhvcNAQkBF + hJob3dlc0BuZXRzY2FwZS5jb20xFTATBgoJkiaJk/IsZAEBEwVob3dlczBcMA0GCSqG + SIb3DQEBAQUAA0sAMEgCQQC0JZf6wkg8pLMXHHCUvMfL5H6zjSk4vTTXZpYyrdN2dXc + oX49LKiOmgeJSzoiFKHtLOIboyludF90CgqcxtwKnAgMBAAGjNjA0MBEGCWCGSAGG+E + IBAQQEAwIAoDAfBgNVHSMEGDAWgBT84FToB/GV3jr3mcau+hUMbsQukjANBgkqhkiG9 + w0BAQQFAAOBgQBexv7o7mi3PLXadkmNP9LcIPmx93HGp0Kgyx1jIVMyNgsemeAwBM+M + SlhMfcpbTrONwNjZYW8vJDSoi//yrZlVt9bJbs7MNYZVsyF1unsqaln4/vy6Uawfg8V + UMk1U7jt8LYpo4YULU7UZHPYVUaSgVttImOHZIKi4hlPXBOhcUQ== +end:vcard + + + + + + + + + +Howes, et. al. Standards Track [Page 20] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +8.4. Example 4 + + The final example illustrates the use of the multipart/related + Content-Type to include non-textual directory data via the "uri" + encoding to refer to other body parts within the same message, or to + external values. Note that no "profile" parameter is given, so an + application may not know what kind of directory entity the + information applies to. Note also the use of both hypothetical + official and bilaterally agreed upon types. 
+ +Content-Type: multipart/related; + boundary=woof; + type="text/directory"; + start="<id5@host.com>" +Content-ID: <id4@host.com> + +--woof +Content-Type: text/directory; charset="iso-8859-1" +Content-ID: <id5@host.com> +Content-Transfer-Encoding: Quoted-Printable + +source:ldap://cn=Bjorn%20Jensen,o=University%20of%20Michigan,c=US +cn:Bj=F8rn Jensen +sn:Jensen +email:bjorn@umich.edu +image;value=uri:cid:id6@host.com +image;value=uri;format=jpeg:ftp://some.host/some/path.jpg +sound;value=uri:cid:id7@host.com +phone:+1 313 747-4454 + +--woof +Content-Type: image/jpeg +Content-ID: <id6@host.com> + +<...image data...> + +--woof +Content-Type: message/external-body; + name="myvoice.au"; + site="myhost.com"; + access-type=ANON-FTP; + directory="pub/myname"; + mode="image" + +Content-Type: audio/basic +Content-ID: <id7@host.com> + +--woof-- + + + +Howes, et. al. Standards Track [Page 21] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +9. Registration of new profiles + + This section defines procedures by which new profiles are registered + with the IANA and made available to the Internet community. Note that + non-IANA profiles can be used by bilateral agreement, provided the + associated profile names follow the "X-" convention defined above. + + The procedures defined here are designed to allow public comment and + review of new profiles, while posing only a small impediment to the + definition of new profiles. + + Registration of a new profile is accomplished by the following steps. + +9.1. Define the profile + + A profile is defined by completing the following template. + + To: ietf-mime-direct@imc.org + Subject: Registration of text/directory MIME profile XXX + + Profile name: + + Profile purpose: + + Profile types: + + Profile special notes (optional): + + Intended usage: (one of COMMON, LIMITED USE or OBSOLETE) + + The explanation of what goes in each field in the template follows. 
+ + Profile name: The name of the profile as it will appear in the + text/directory MIME Content-Type "profile" header parameter, or the + predefined "profile" type name. + + Profile purpose: The purpose of the profile (e.g., to represent + information about people, printers, documents, etc.). Give a short + but clear description. + + Profile types: The list of types associated with the profile. This + list of types is to be expected but not required in the profile, + unless otherwise noted in the profile definition. Other types not + mentioned in the profile definition MAY also be present. Note that + any new types referenced by the profile MUST be defined separately as + described in Section 10. + + + + + +Howes, et. al. Standards Track [Page 22] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + Profile special notes: Any special notes about the profile, how it is + to be used, etc. This section of the template can also be used to + define an ordering on the types that appear in the Content-Type, if + such an ordering is required. + +9.2. Post the profile definition + + The profile description must be posted to the new profile discussion + list, ietf-mime-direct@imc.org + +9.3. Allow a comment period + + Discussion on the new profile must be allowed to take place on the + list for a minimum of two weeks. Consensus must be reached on the + profile before proceeding to step 4. + +9.4. Submit the profile for approval + + Once the two-week comment period has elapsed, and the proposer is + convinced consensus has been reached on the profile, the registration + application should be submitted to the Profile Reviewer for approval. + The Profile Reviewer is appointed by the Application Area Directors + and can either accept or reject the profile registration. An accepted + registration is passed on by the Profile Reviewer to the IANA for + inclusion in the official IANA profile registry. 
The registration may + be rejected for any of the following reasons. 1) Insufficient comment + period; 2) Consensus not reached; 3) Technical deficiencies raised on + the list or elsewhere have not been addressed. The Profile Reviewer's + decision to reject a profile can be appealed by the proposer to the + IESG, or the objections raised can be addressed by the proposer and + the profile resubmitted. + +10. Profile Change Control + + Existing profiles can be changed using the same process by which they + were registered. + + Define the change + + Post the change + + Allow a comment period + + Submit the changed profile for approval + + + + + + + +Howes, et. al. Standards Track [Page 23] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + Note that the original author or any other interested party can + propose a change to an existing profile, but that such changes should + only be proposed when there are serious omissions or errors in the + published specification. The Profile Reviewer can object to a change + if it is not backwards compatible, but is not required to do so. + + Profile definitions can never be deleted from the IANA registry, but + profiles which are no longer believed to be useful can be declared + OBSOLETE by a change to their "intended use" field. + +11. Registration of new types + + This section defines procedures by which new types are registered + with the IANA. Note that non-IANA types can be used by bilateral + agreement, provided the associated types names follow the "X-" + convention defined above. + + The procedures defined here are designed to allow public comment and + review of new types, while posing only a small impediment to the + definition of new types. + + Registration of a new type is accomplished by the following steps. + +11.1. Define the type + + A type is defined by completing the following template. 
+ + To: ietf-mime-direct@imc.org + Subject: Registration of text/directory MIME type XXX + + Type name: + + Type purpose: + + Type encoding: + + Type valuetype: + + Type special notes (optional): + + Intended usage: (one of COMMON, LIMITED USE or OBSOLETE) + + The meaning of each field in the template is as follows. + + Type name: The name of the type, as it will appear in the body of an + text/directory MIME Content-Type "type: value" line to the left of + the colon ":". + + + + +Howes, et. al. Standards Track [Page 24] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + Type purpose: The purpose of the type (e.g., to represent a name, + postal address, IP address, etc.). Give a short but clear + description. + + Type encoding: The default encoding a value of the type must have in + the body of a text/directory MIME Content-Type. + + Type valuetype: The format a value of the type must have in the body + of a text/directory MIME Content-Type. This description must be + precise and must not violate the general encoding rules defined in + section 5 of this document. + + Type special notes: Any special notes about the type, how it is to be + used, etc. + +11.2. Post the type definition + + The type description must be posted to the new type discussion list, + ietf-mime-direct@imc.org + +11.3. Allow a comment period + + Discussion on the new type must be allowed to take place on the list + for a minimum of two weeks. Consensus must be reached on the type + before proceeding to step 4. + +11.4. Submit the type for approval + + Once the two-week comment period has elapsed, and the proposer is + convinced consensus has been reached on the type, the registration + application should be submitted to the Profile Reviewer for approval. + The Profile Reviewer is appointed by the Application Area Directors + and can either accept or reject the type registration. 
An accepted + registration is passed on by the Profile Reviewer to the IANA for + inclusion in the official IANA profile registry. The registration can + be rejected for any of the following reasons. 1) Insufficient comment + period; 2) Consensus not reached; 3) Technical deficiencies raised on + the list or elsewhere have not been addressed. The Profile + Reviewer's decision to reject a type can be appealed by the proposer + to the IESG, or the objections raised can be addressed by the + proposer and the type resubmitted. + + + + + + + + + + +Howes, et. al. Standards Track [Page 25] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +12. Type Change Control + + Existing types can be changed using the same process by which they + were registered. + + Define the change + + Post the change + + Allow a comment period + + Submit the type for approval + + Note that the original author or any other interested party can + propose a change to an existing type, but that such changes should + only be proposed when there are serious omissions or errors in the + published specification. The Profile Reviewer can object to a change + if it is not backwards compatible, but is not required to do so. + + Type definitions can never be deleted from the IANA registry, but + types which are no longer believed to be useful can be declared + OBSOLETE by a change to their "intended use" field. + +13. Registration of new parameters + + This section defines procedures by which new parameters are + registered with the IANA and made available to the Internet + community. Note that non-IANA parameters can be used by bilateral + agreement, provided the associated parameters names follow the "X-" + convention defined above. + + The procedures defined here are designed to allow public comment and + review of new parameters, while posing only a small impediment to the + definition of new parameters.
+ + Registration of a new parameter is accomplished by the following + steps. + +13.1. Define the parameter + + A parameter is defined by completing the following template. + + To: ietf-mime-direct@imc.org + Subject: Registration of text/directory MIME type parameter XXX + + Parameter name: + + Parameter purpose: + + + +Howes, et. al. Standards Track [Page 26] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + Parameter values: + + Parameter special notes (optional): + + Intended usage: (one of COMMON, LIMITED USE or OBSOLETE) + + The explanation of what goes in each field in the template follows. + + Parameter name: The name of the parameter as it will appear in the + text/directory MIME Content-Type. + + Parameter purpose: The purpose of the parameter (e.g., to represent + the format of an image, type of a phone number, etc.). Give a short + but clear description. If defining a general parameter like "format" + or "type" keep in mind that other applications might wish to extend + its use. + + Parameter values: The list or description of values associated with + the parameter. + + Parameter special notes: Any special notes about the parameter, how + it is to be used, etc. + +13.2. Post the parameter definition + + The parameter description must be posted to the new parameter + discussion list, ietf-mime-direct@imc.org + +13.3. Allow a comment period + + Discussion on the new parameter must be allowed to take place on the + list for a minimum of two weeks. Consensus must be reached on the + parameter before proceeding to step 4. + +13.4. Submit the parameter for approval + + Once the two-week comment period has elapsed, and the proposer is + convinced consensus has been reached on the parameter, the + registration application should be submitted to the Profile Reviewer + for approval. The Profile Reviewer is appointed by the Application + Area Directors and can either accept or reject the parameter + registration.
An accepted registration is passed on by the Profile + Reviewer to the IANA for inclusion in the official IANA parameter + registry. The registration can be rejected for any of the following + reasons. 1) Insufficient comment period; 2) Consensus not reached; 3) + Technical deficiencies raised on the list or elsewhere have not been + addressed. The Profile Reviewer's decision to reject a profile can be + appealed by the proposer to the IESG, or the objections raised can be + + + +Howes, et. al. Standards Track [Page 27] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + addressed by the proposer and the parameter registration resubmitted. + +14. Parameter Change Control + + Existing parameters can be changed using the same process by which + they were registered. + + Define the change + + Post the change + + Allow a comment period + + Submit the parameter for approval + + Note that the original author or any other interested party can + propose a change to an existing parameter, but that such changes + should only be proposed when there are serious omissions or errors in + the published specification. The Profile Reviewer can object to a + change if it is not backwards compatible, but is not required to do + so. + + Parameter definitions can never be deleted from the IANA registry, + but parameters which are no longer believed to be useful can be + declared OBSOLETE by a change to their "intended use" field. + +15. Registration of new value types + + This section defines procedures by which new value types are + registered with the IANA and made available to the Internet + community. Note that non-IANA value types can be used by bilateral + agreement, provided the associated value types names follow the "X-" + convention defined above. + + The procedures defined here are designed to allow public comment and + review of new value types, while posing only a small impediment to + the definition of new value types.
+ + Registration of a new value type is accomplished by the following + steps. + +15.1. Define the value type + + A value type is defined by completing the following template. + + To: ietf-mime-direct@imc.org + Subject: Registration of text/directory MIME value type XXX + + + + +Howes, et. al. Standards Track [Page 28] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + value type name: + + value type purpose: + + value type format: + + value type special notes (optional): + + Intended usage: (one of COMMON, LIMITED USE or OBSOLETE) + + The explanation of what goes in each field in the template follows. + + value type name: The name of the value type as it will appear in the + text/directory MIME Content-Type. + + value type purpose: The purpose of the value type. Give a short but + clear description. + + value type format: The definition of the format for the value, + usually using ABNF grammar. + + value type special notes: Any special notes about the value type, how + it is to be used, etc. + +15.2. Post the value type definition + + The value type description must be posted to the new value type + discussion list, ietf-mime-direct@imc.org + +15.3. Allow a comment period + + Discussion on the new value type must be allowed to take place on the + list for a minimum of two weeks. Consensus must be reached before + proceeding to step 4. + +15.4. Submit the value type for approval + + Once the two-week comment period has elapsed, and the proposer is + convinced consensus has been reached on the value type, the + registration application should be submitted to the Profile Reviewer + for approval. The Profile Reviewer is appointed by the Application + Area Directors and can either accept or reject the value type + registration. An accepted registration should be passed on by the + Profile Reviewer to the IANA for inclusion in the official IANA value + type registry. The registration can be rejected for any of the + following reasons.
1) Insufficient comment period; 2) Consensus not + reached; 3) Technical deficiencies raised on the list or elsewhere + have not been addressed. The Profile Reviewer's decision to reject a + + + +Howes, et. al. Standards Track [Page 29] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + profile can be appealed by the proposer to the IESG, or the + objections raised can be addressed by the proposer and the value type + registration resubmitted. + +16. Security Considerations + + Internet mail is subject to many well known security attacks, + including monitoring, replay, and forgery. Care should be taken by + any directory service in allowing information to leave the scope of + the service itself, where any access controls can no longer be + guaranteed. Applications should also take care to display directory + data in a "safe" environment (e.g., PostScript-valued types). + +17. Acknowledgements + + The registration procedures defined here were shamelessly lifted from + the MIME registration RFC. + + The many valuable comments contributed by members of the IETF ASID + working group are gratefully acknowledged, as are the contributions + of the Versit Consortium. Chris Newman was especially helpful in + navigating the intricacies of ABNF lore. + +18. References + + [RFC-1777] Yeong, W., Howes, T., and S. Kille, "Lightweight + Directory Access Protocol", RFC 1777, March 1995. + + [RFC-1778] Howes, T., Kille, S., Yeong, W., and C. Robbins, "The + String Representation of Standard Attribute Syntaxes", + RFC 1778, March 1995. + + [RFC-822] Crocker, D., "Standard for the Format of ARPA Internet + Text Messages", STD 11, RFC 822, August 1982. + + [RFC-2045] Borenstein, N., and N. Freed, "Multipurpose Internet + Mail Extensions (MIME) Part One: Format of Internet + Message Bodies", RFC 2045, November 1996. + + [RFC-2046] Moore, K., "Multipurpose Internet Mail Extensions (MIME) + Part Two: Media Types", RFC 2046, November 1996. 
+ + [RFC-2048] Freed, N., Klensin, J., and J. Postel, "Multipurpose + Internet Mail Extensions (MIME) Part Four: Registration + Procedures", RFC 2048, November 1996. + + [RFC-1766] Alvestrand, H., "Tags for the Identification of + Languages", RFC 1766, March 1995. + + + +Howes, et. al. Standards Track [Page 30] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + + [RFC-2112] Levinson, E., "The MIME Multipart/Related Content-type", + RFC 2112, March 1997. + + [X500] "Information Processing Systems - Open Systems + Interconnection - The Directory: Overview of Concepts, + Models and Services", ISO/IEC JTC 1/SC21, International + Standard 9594-1, 1988. + + [RFC-1835] Deutsch, P., Schoultz, R., Faltstrom, P., and C. Weider, + "Architecture of the WHOIS++ service", RFC 1835, August + 1995. + + [RFC-1738] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform + Resource Locators (URL)", RFC 1738, December 1994. + + [MIME-VCARD] Dawson, F., and T. Howes, "VCard MIME Directory + Profile", RFC 2426, September 1998. + + [VCARD] Internet Mail Consortium, "vCard - The Electronic + Business Card", Version 2.1, + http://www.imc.com/pdi/vcard-21.txt, September, 1996. + + [RFC-2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC-2234] Crocker, D., and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", RFC 2234, November 1997. + + + + + + + + + + + + + + + + + + + + + + + + +Howes, et. al. Standards Track [Page 31] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +19. Authors' Addresses + + Tim Howes + Netscape Communications Corp. + 501 East Middlefield Rd. + Mountain View, CA 94041 + USA + + Phone: +1.415.937.3419 + EMail: howes@netscape.com + + + Mark Smith + Netscape Communications Corp. + 501 East Middlefield Rd. 
+ Mountain View, CA 94041 + USA + + Phone: +1.415.937.3477 + EMail: mcs@netscape.com + + + Frank Dawson + Lotus Development Corporation + 6544 Battleford Drive + Raleigh, NC 27613 + USA + + Phone: +1-919-676-9515 + EMail: frank_dawson@lotus.com + + + + + + + + + + + + + + + + + + + + + +Howes, et. al. Standards Track [Page 32] + +RFC 2425 MIME Content-Type for Directory Information September 1998 + + +20. Full Copyright Statement + + Copyright (C) The Internet Society (1998). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + + + + + + + + + + + + + + + + + + + + + + + + +Howes, et. al. 
Standards Track [Page 33] + diff --git a/rfc/rfc2426.txt b/rfc/rfc2426.txt @@ -0,0 +1,2355 @@ + + + + + + +Network Working Group F. Dawson +Request for Comments: 2426 Lotus Development Corporation +Category: Standards Track T. Howes + Netscape Communications + September 1998 + + + vCard MIME Directory Profile + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (1998). All Rights Reserved. + +Abstract + + This memo defines the profile of the MIME Content-Type [MIME-DIR] for + directory information for a white-pages person object, based on a + vCard electronic business card. The profile definition is independent + of any particular directory service or protocol. The profile is + defined for representing and exchanging a variety of information + about an individual (e.g., formatted and structured name and delivery + addresses, email address, multiple telephone numbers, photograph, + logo, audio clips, etc.). The directory information used by this + profile is based on the attributes for the person object defined in + the X.520 and X.521 directory services recommendations. The profile + also provides the method for including a [VCARD] representation of a + white-pages directory entry within the MIME Content-Type defined by + the [MIME-DIR] document. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY" and "OPTIONAL" in this + document are to be interpreted as described in [RFC 2119]. 
+ + + + + + + + + + + +Dawson & Howes Standards Track [Page 1] + +RFC 2426 vCard MIME Directory Profile September 1998 + + +Table of Contents + + Overview.........................................................3 + 1. THE VCARD MIME DIRECTORY PROFILE REGISTRATION.................4 + 2. MIME DIRECTORY FEATURES.......................................5 + 2.1 PREDEFINED TYPE USAGE ......................................5 + 2.1.1 BEGIN and END Type ......................................5 + 2.1.2 NAME Type ...............................................5 + 2.1.3 PROFILE Type ............................................5 + 2.1.4 SOURCE Type .............................................5 + 2.2 PREDEFINED TYPE PARAMETER USAGE ............................6 + 2.3 PREDEFINED VALUE TYPE USAGE ................................6 + 2.4 EXTENSIONS TO THE PREDEFINED VALUE TYPES ...................6 + 2.4.1 BINARY ..................................................6 + 2.4.2 VCARD ...................................................6 + 2.4.3 PHONE-NUMBER ............................................7 + 2.4.4 UTC-OFFSET ..............................................7 + 2.5 STRUCTURED TYPE VALUES .....................................7 + 2.6 LINE DELIMITING AND FOLDING ................................8 + 3. 
VCARD PROFILE FEATURES........................................8 + 3.1 IDENTIFICATION TYPES .......................................8 + 3.1.1 FN Type Definition ......................................8 + 3.1.2 N Type Definition .......................................9 + 3.1.3 NICKNAME Type Definition ................................9 + 3.1.4 PHOTO Type Definition ..................................10 + 3.1.5 BDAY Type Definition ...................................11 + 3.2 DELIVERY ADDRESSING TYPES .................................11 + 3.2.1 ADR Type Definition ....................................11 + 3.2.2 LABEL Type Definition ..................................13 + 3.3 TELECOMMUNICATIONS ADDRESSING TYPES .......................13 + 3.3.1 TEL Type Definition ....................................14 + 3.3.2 EMAIL Type Definition ..................................15 + 3.3.3 MAILER Type Definition .................................15 + 3.4 GEOGRAPHICAL TYPES ........................................16 + 3.4.1 TZ Type Definition .....................................16 + 3.4.2 GEO Type Definition ....................................16 + 3.5 ORGANIZATIONAL TYPES ......................................17 + 3.5.1 TITLE Type Definition ..................................17 + 3.5.2 ROLE Type Definition ...................................18 + 3.5.3 LOGO Type Definition ...................................18 + 3.5.4 AGENT Type Definition ..................................19 + 3.5.5 ORG Type Definition ....................................20 + 3.6 EXPLANATORY TYPES .........................................20 + 3.6.1 CATEGORIES Type Definition .............................20 + 3.6.2 NOTE Type Definition ...................................21 + 3.6.3 PRODID Type Definition .................................21 + 3.6.4 REV Type Definition ....................................22 + 3.6.5 SORT-STRING Type Definition ............................22 + + + +Dawson & Howes Standards Track [Page 2] + +RFC 
2426 vCard MIME Directory Profile September 1998 + + + 3.6.6 SOUND Type Definition ..................................23 + 3.6.7 UID Type Definition ....................................24 + 3.6.8 URL Type Definition ....................................25 + 3.6.9 VERSION Type Definition ................................25 + 3.7 SECURITY TYPES ............................................25 + 3.7.1 CLASS Type Definition ..................................26 + 3.7.2 KEY Type Definition ....................................26 + 3.8 EXTENDED TYPES ............................................27 + 4. FORMAL GRAMMAR...............................................27 + 5. DIFFERENCES FROM VCARD V2.1..................................37 + 6. ACKNOWLEDGEMENTS.............................................39 + 7. AUTHORS' ADDRESSES...........................................39 + 8. SECURITY CONSIDERATIONS......................................39 + 9. REFERENCES...................................................40 + 10. FULL COPYRIGHT STATEMENT....................................42 + +Overview + + The [MIME-DIR] document defines a MIME Content-Type for holding + different kinds of directory information. The directory information + can be based on any of a number of directory schemas. This document + defines a [MIME-DIR] usage profile for conveying directory + information based on one such schema; that of the white-pages type of + person object. + + The schema is based on the attributes for the person object defined + in the X.520 and X.521 directory services recommendations. The schema + has augmented the basic attributes defined in the X.500 series + recommendation in order to provide for an electronic representation + of the information commonly found on a paper business card. This + schema was first defined in the [VCARD] document. Hence, this [MIME- + DIR] profile is referred to as the vCard MIME Directory Profile. 
+ + A directory entry based on this usage profile can include traditional + directory, white-pages information such as the distinguished name + used to uniquely identify the entry, a formatted representation of + the name used for user-interface or presentation purposes, both the + structured and presentation form of the delivery address, various + telephone numbers and organizational information associated with the + entry. In addition, traditional paper business card information such + as an image of an organizational logo or identify photograph can be + included in this person object. + + The vCard MIME Directory Profile also provides support for + representing other important information about the person associated + with the directory entry. For instance, the date of birth of the + person; an audio clip describing the pronunciation of the name + associated with the directory entry, or some other application of the + + + +Dawson & Howes Standards Track [Page 3] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + digital sound; longitude and latitude geo-positioning information + related to the person associated with the directory entry; date and + time that the directory information was last updated; annotations + often written on a business card; Uniform Resource Locators (URL) for + a website; public key information. The profile also provides support + for non-standard extensions to the schema. This provides the + flexibility for implementations to augment the current capabilities + of the profile in a standardized way. More information about this + electronic business card format can be found in [VCARD]. + +1. The vCard Mime Directory Profile Registration + + This profile is identified by the following [MIME-DIR] registration + template information. Subsequent sections define the profile + definition. 
+ + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME profile VCARD + + Profile name: VCARD + + Profile purpose: To hold person object or white-pages type of + directory information. The person schema captured in the directory + entries is that commonly found in an electronic business card. + + Predefined MIME Directory value specifications used: uri, date, + date-time, float + + New value specifications: This profile places further constraints on + the [MIME-DIR] text value specification. In addition, it adds a + binary, phone-number, utc-offset and vcard value specifications. + + Predefined MIME Directory types used: SOURCE, NAME, PROFILE, BEGIN, + END. + + Predefined MIME Directory parameters used: ENCODING, VALUE, CHARSET, + LANGUAGE, CONTEXT. + + New types: FN, N, NICKNAME, PHOTO, BDAY, ADR, LABEL, TEL, EMAIL, + MAILER, TZ, GEO, TITLE, ROLE, LOGO, AGENT, ORG, CATEGORIES, NOTE, + PRODID, REV, SORT-STRING, SOUND, URL, UID, VERSION, CLASS, KEY + + New parameters: TYPE + + Profile special notes: The vCard object MUST contain the FN, N and + VERSION types. The type-grouping feature of [MIME-DIR] is supported + by this profile to group related vCard properties about a directory + + + +Dawson & Howes Standards Track [Page 4] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + entry. For example, vCard properties describing WORK or HOME related + characteristics can be grouped with a unique group label. + + The profile permits the use of non-standard types (i.e., those + identified with the prefix string "X-") as a flexible method for + implementations to extend the functionality currently defined within + this profile. + +2. MIME Directory Features + + The vCard MIME Directory Profile makes use of many of the features + defined by [MIME-DIR]. The following sections either clarify or + extend the content-type definition of [MIME-DIR]. 
+ +2.1 Predefined Type Usage + + The vCard MIME Directory Profile uses the following predefined types + from [MIME-DIR]. + +2.1.1 BEGIN and END Type + + The content entity MUST begin with the BEGIN type with a value of + "VCARD". The content entity MUST end with the END type with a value + of "VCARD". + +2.1.2 NAME Type + + If the NAME type is present, then its value is the displayable, + presentation text associated with the source for the vCard, as + specified in the SOURCE type. + +2.1.3 PROFILE Type + + If the PROFILE type is present, then its value MUST be "VCARD". + +2.1.4 SOURCE Type + + If the SOURCE type is present, then its value provides information + how to find the source for the vCard. + + + + + + + + + + + + +Dawson & Howes Standards Track [Page 5] + +RFC 2426 vCard MIME Directory Profile September 1998 + + +2.2 Predefined Type Parameter Usage + + The vCard MIME Directory Profile uses the following predefined type + parameters as defined by [MIME-DIR]. + + - LANGUAGE + + - ENCODING + + - VALUE + +2.3 Predefined VALUE Type Usage + + The predefined data type values specified in [MIME-DIR] MUST NOT be + repeated in COMMA separated value lists except within the N, + NICKNAME, ADR and CATEGORIES value types. + + The text value type defined in [MIME-DIR] is further restricted such + that any SEMI-COLON character (ASCII decimal 59) in the value MUST be + escaped with the BACKSLASH character (ASCII decimal 92). + +2.4 Extensions To The Predefined VALUE Types + + The predefined data type values specified in [MIME-DIR] have been + extended by the vCard profile to include a number of value types that + are specific to this profile. + +2.4.1 BINARY + + The "binary" value type specifies that the type value is inline, + encoded binary data. This value type can be specified in the PHOTO, + LOGO, SOUND, and KEY types. + + If inline encoded binary data is specified, the ENCODING type + parameter MUST be used to specify the encoding format. 
The binary + data MUST be encoded using the "B" encoding format. Long lines of + encoded binary data SHOULD be folded to 75 characters using the + folding method defined in [MIME-DIR]. + + The value type is defined by the following notation: + + binary = <A "B" binary encoded string as defined by [RFC 2047].> + +2.4.2 VCARD + + The "vcard" value type specifies that the type value is another + vCard. This value type can be specified in the AGENT type. The value + type is defined by this specification. Since each of the type + declarations within the vcard value type is itself specified within + a text value, each MUST be terminated with the backslash + escape sequence "\n" or "\N", instead of the normal newline character + sequence CRLF. In addition, any COMMA character (ASCII decimal 44), + SEMI-COLON character (ASCII decimal 59) and COLON character (ASCII + decimal 58) MUST be escaped with the BACKSLASH character (ASCII + decimal 92). For example, with the AGENT type a value would be + specified as: + + AGENT:BEGIN:VCARD\nFN:Joe Friday\nTEL:+1-919-555-7878\n + TITLE:Area Administrator\, Assistant\n EMAIL\;TYPE=INTERN\n + ET:jfriday@host.com\nEND:VCARD\n + +2.4.3 PHONE-NUMBER + + The "phone-number" value type specifies that the type value is a + telephone number. This value type can be specified in the TEL type. + The value type is a text value that has the special semantics of a + telephone number as defined in [CCITT E.163] and [CCITT X.121]. + +2.4.4 UTC-OFFSET + + The "utc-offset" value type specifies that the type value is a signed + offset from UTC. This value type can be specified in the TZ type. + + The value type is an offset from Coordinated Universal Time (UTC). It + is specified as a positive or negative difference in units of hours + and minutes (e.g., +hh:mm). The time is specified as a 24-hour clock. 
+ Hour values are from 00 to 23, and minute values are from 00 to 59. + Hours and minutes are two digits each, with high-order zeroes required to + maintain digit count. The extended format for ISO 8601 UTC offsets + MUST be used. The extended format makes use of a colon character as a + separator of the hour and minute text fields. + + The value is defined by the following notation: + + time-hour = 2DIGIT ;00-23 + time-minute = 2DIGIT ;00-59 + utc-offset = ("+" / "-") time-hour ":" time-minute + +2.5 Structured Type Values + + Compound type values are delimited by a field delimiter, specified by + the SEMI-COLON character (ASCII decimal 59). A SEMI-COLON in a + component of a compound property value MUST be escaped with a + BACKSLASH character (ASCII decimal 92). + + Lists of values are delimited by a list delimiter, specified by the + COMMA character (ASCII decimal 44). A COMMA character in a value MUST + be escaped with a BACKSLASH character (ASCII decimal 92). + + This profile supports the type grouping mechanism defined in [MIME- + DIR]. Grouping of related types is a useful technique to communicate + common semantics concerning the properties of a vCard. + +2.6 Line Delimiting and Folding + + This profile supports the same line delimiting and folding methods + defined in [MIME-DIR]. Specifically, when parsing a content line, + folded lines must first be unfolded according to the unfolding + procedure described in [MIME-DIR]. After generating a content line, + lines longer than 75 characters SHOULD be folded according to the + folding procedure described in [MIME-DIR]. + + Folding is done after any content encoding of a type value. Unfolding + is done before any decoding of a type value in a content line. + +3. vCard Profile Features + + The vCard MIME Directory Profile Type contains directory information, + typically pertaining to a single directory entry. 
The information is + described using an attribute schema that is tailored for capturing + personal contact information. The vCard can include attributes that + describe identification, delivery addressing, telecommunications + addressing, geographical, organizational, general explanatory and + security and access information about the particular object + associated with the vCard. + +3.1 Identification Types + + These types are used in the vCard profile to capture information + associated with the identification and naming of the person or + resource associated with the vCard. + +3.1.1 FN Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type FN + + Type name:FN + + Type purpose: To specify the formatted text corresponding to the name + of the object the vCard represents. + + + + +Dawson & Howes Standards Track [Page 8] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type encoding: 8bit + + Type value: A single text value. + + Type special notes: This type is based on the semantics of the X.520 + Common Name attribute. The property MUST be present in the vCard + object. + + Type example: + + FN:Mr. John Q. Public\, Esq. + +3.1.2 N Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type N + + Type name: N + + Type purpose: To specify the components of the name of the object the + vCard represents. + + Type encoding: 8bit + + Type value: A single structured text value. Each component can have + multiple values. + + Type special note: The structured type value corresponds, in + sequence, to the Family Name, Given Name, Additional Names, Honorific + Prefixes, and Honorific Suffixes. The text components are separated + by the SEMI-COLON character (ASCII decimal 59). Individual text + components can include multiple text values (e.g., multiple + Additional Names) separated by the COMMA character (ASCII decimal + 44). 
This type is based on the semantics of the X.520 individual name + attributes. The property MUST be present in the vCard object. + + Type example: + + N:Public;John;Quinlan;Mr.;Esq. + + N:Stevenson;John;Philip,Paul;Dr.;Jr.,M.D.,A.C.P. + +3.1.3 NICKNAME Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type NICKNAME + + + +Dawson & Howes Standards Track [Page 9] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type name: NICKNAME + + Type purpose: To specify the text corresponding to the nickname of + the object the vCard represents. + + Type encoding: 8bit + + Type value: One or more text values separated by a COMMA character + (ASCII decimal 44). + + Type special note: The nickname is the descriptive name given instead + of or in addition to the one belonging to a person, place, or thing. + It can also be used to specify a familiar form of a proper name + specified by the FN or N types. + + Type example: + + NICKNAME:Robbie + + NICKNAME:Jim,Jimmie + +3.1.4 PHOTO Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type PHOTO + + Type name: PHOTO + + Type purpose: To specify an image or photograph information that + annotates some aspect of the object the vCard represents. + + Type encoding: The encoding MUST be reset to "b" using the ENCODING + parameter in order to specify inline, encoded binary data. If the + value is referenced by a URI value, then the default encoding of 8bit + is used and no explicit ENCODING parameter is needed. + + Type value: A single value. The default is binary value. It can also + be reset to uri value. The uri value can be used to specify a value + outside of this MIME entity. + + Type special notes: The type can include the type parameter "TYPE" to + specify the graphic image format type. The TYPE parameter values MUST + be one of the IANA registered image formats or a non-standard image + format. 
+ + + + + + +Dawson & Howes Standards Track [Page 10] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type example: + + PHOTO;VALUE=uri:http://www.abc.com/pub/photos + /jqpublic.gif + + + PHOTO;ENCODING=b;TYPE=JPEG:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcN + AQEEBQAwdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENvbW11bm + ljYXRpb25zIENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0 + <...remainder of "B" encoded binary data...> + +3.1.5 BDAY Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type BDAY + + Type name: BDAY + + Type purpose: To specify the birth date of the object the vCard + represents. + + Type encoding: 8bit + + Type value: The default is a single date value. It can also be reset + to a single date-time value. + + Type examples: + + BDAY:1996-04-15 + + BDAY:1953-10-15T23:10:00Z + + BDAY:1987-09-27T08:30:00-06:00 + +3.2 Delivery Addressing Types + + These types are concerned with information related to the delivery + addressing or label for the vCard object. + +3.2.1 ADR Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type ADR + + Type name: ADR + + + + +Dawson & Howes Standards Track [Page 11] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type purpose: To specify the components of the delivery address for + the vCard object. + + Type encoding: 8bit + + Type value: A single structured text value, separated by the + SEMI-COLON character (ASCII decimal 59). + + Type special notes: The structured type value consists of a sequence + of address components. The component values MUST be specified in + their corresponding position. The structured type value corresponds, + in sequence, to the post office box; the extended address; the street + address; the locality (e.g., city); the region (e.g., state or + province); the postal code; the country name. 
When a component value + is missing, the associated component separator MUST still be + specified. + + The text components are separated by the SEMI-COLON character (ASCII + decimal 59). Where it makes semantic sense, individual text + components can include multiple text values (e.g., a "street" + component with multiple lines) separated by the COMMA character + (ASCII decimal 44). + + The type can include the type parameter "TYPE" to specify the + delivery address type. The TYPE parameter values can include "dom" to + indicate a domestic delivery address; "intl" to indicate an + international delivery address; "postal" to indicate a postal + delivery address; "parcel" to indicate a parcel delivery address; + "home" to indicate a delivery address for a residence; "work" to + indicate delivery address for a place of work; and "pref" to indicate + the preferred delivery address when more than one address is + specified. These type parameter values can be specified as a + parameter list (i.e., "TYPE=dom;TYPE=postal") or as a value list + (i.e., "TYPE=dom,postal"). This type is based on semantics of the + X.520 geographical and postal addressing attributes. The default is + "TYPE=intl,postal,parcel,work". The default can be overridden to some + other set of values by specifying one or more alternate values. For + example, the default can be reset to "TYPE=dom,postal,work,home" to + specify a domestic delivery address for postal delivery to a + residence that is also used for work. + + Type example: In this example the post office box and the extended + address are absent. 
+ + ADR;TYPE=dom,home,postal,parcel:;;123 Main + Street;Any Town;CA;91921-1234 + +3.2.2 LABEL Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type LABEL + + Type name: LABEL + + Type purpose: To specify the formatted text corresponding to the delivery + address of the object the vCard represents. + + Type encoding: 8bit + + Type value: A single text value. + + Type special notes: The type value is formatted text that can be used + to present a delivery address label for the vCard object. The type + can include the type parameter "TYPE" to specify the delivery label type. + The TYPE parameter values can include "dom" to indicate a domestic + delivery label; "intl" to indicate an international delivery label; + "postal" to indicate a postal delivery label; "parcel" to indicate a + parcel delivery label; "home" to indicate a delivery label for a + residence; "work" to indicate a delivery label for a place of work; and + "pref" to indicate the preferred delivery label when more than one + label is specified. These type parameter values can be specified as a + parameter list (i.e., "TYPE=dom;TYPE=postal") or as a value list + (i.e., "TYPE=dom,postal"). This type is based on semantics of the + X.520 geographical and postal addressing attributes. The default is + "TYPE=intl,postal,parcel,work". The default can be overridden to some + other set of values by specifying one or more alternate values. For + example, the default can be reset to "TYPE=intl,postal,parcel,home" to + specify an international delivery label for both postal and parcel + delivery to a residential location. + + Type example: A multi-line address label. + + LABEL;TYPE=dom,home,postal,parcel:Mr. John Q. Public\, Esq.\n + Mail Drop: TNE QB\n123 Main Street\nAny Town\, CA 91921-1234 + \nU.S.A. 
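
The escaping rules for structured values (SEMI-COLON between components, COMMA between list items, BACKSLASH escapes, and "\n" for line breaks in text) can be sketched in a few lines. This is only an illustrative sketch, not part of the profile: the helper names `split_unescaped`, `unescape_text`, `parse_adr` and the `ADR_FIELDS` keys are invented here, and padding missing trailing components with empty strings is an assumption made so that the ADR example above (which carries six components) still parses.

```python
# Illustrative sketch only; helper names and field keys are hypothetical.

def split_unescaped(value: str, delimiter: str) -> list[str]:
    """Split on delimiter, skipping over backslash-escaped characters."""
    parts, current, i = [], [], 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            current.append(value[i:i + 2])  # keep the escape pair intact
            i += 2
            continue
        if ch == delimiter:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def unescape_text(value: str) -> str:
    """Turn "\\n"/"\\N" into a newline and drop the backslash of other escapes."""
    out, i = [], 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            out.append("\n" if nxt in "nN" else nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Component order taken from the ADR special notes; the key names are made up.
ADR_FIELDS = ("pobox", "extended", "street", "locality",
              "region", "postal_code", "country")


def parse_adr(value: str) -> dict[str, list[str]]:
    """Map an unfolded ADR value onto its seven positional components."""
    components = split_unescaped(value, ";")
    # Assumption: tolerate missing trailing components by padding with "".
    components += [""] * (len(ADR_FIELDS) - len(components))
    return {
        name: [unescape_text(item) for item in split_unescaped(comp, ",")]
        for name, comp in zip(ADR_FIELDS, components)
    }
```

Splitting happens before unescaping, so an escaped "\;" or "\," inside a component is never mistaken for a delimiter. Both functions expect a value that has already been unfolded.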
+ +3.3 Telecommunications Addressing Types + + These types are concerned with information associated with the + telecommunications addressing of the object the vCard represents. + + + + + + + +Dawson & Howes Standards Track [Page 13] + +RFC 2426 vCard MIME Directory Profile September 1998 + + +3.3.1 TEL Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type TEL + + Type name: TEL + + Type purpose: To specify the telephone number for telephony + communication with the object the vCard represents. + + Type encoding: 8bit + + Type value: A single phone-number value. + + Type special notes: The value of this type is specified in a + canonical form in order to specify an unambiguous representation of + the globally unique telephone endpoint. This type is based on the + X.500 Telephone Number attribute. + + The type can include the type parameter "TYPE" to specify intended + use for the telephone number. The TYPE parameter values can include: + "home" to indicate a telephone number associated with a residence, + "msg" to indicate the telephone number has voice messaging support, + "work" to indicate a telephone number associated with a place of + work, "pref" to indicate a preferred-use telephone number, "voice" to + indicate a voice telephone number, "fax" to indicate a facsimile + telephone number, "cell" to indicate a cellular telephone number, + "video" to indicate a video conferencing telephone number, "pager" to + indicate a paging device telephone number, "bbs" to indicate a + bulletin board system telephone number, "modem" to indicate a MODEM + connected telephone number, "car" to indicate a car-phone telephone + number, "isdn" to indicate an ISDN service telephone number, "pcs" to + indicate a personal communication services telephone number. The + default type is "voice". 
These type parameter values can be specified + as a parameter list (i.e., "TYPE=work;TYPE=voice") or as a value list + (i.e., "TYPE=work,voice"). The default can be overridden to another + set of values by specifying one or more alternate values. For + example, the default TYPE of "voice" can be reset to a WORK and HOME, + VOICE and FAX telephone number by the value list + "TYPE=work,home,voice,fax". + + Type example: + + TEL;TYPE=work,voice,pref,msg:+1-213-555-1234 + + + + + + +Dawson & Howes Standards Track [Page 14] + +RFC 2426 vCard MIME Directory Profile September 1998 + + +3.3.2 EMAIL Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type EMAIL + + Type name: EMAIL + + Type purpose: To specify the electronic mail address for + communication with the object the vCard represents. + + Type encoding: 8bit + + Type value: A single text value. + + Type special notes: The type can include the type parameter "TYPE" to + specify the format or preference of the electronic mail address. The + TYPE parameter values can include: "internet" to indicate an Internet + addressing type, "x400" to indicate a X.400 addressing type or "pref" + to indicate a preferred-use email address when more than one is + specified. Another IANA registered address type can also be + specified. The default email type is "internet". A non-standard value + can also be specified. + + Type example: + + EMAIL;TYPE=internet:jqpublic@xyz.dom1.com + + EMAIL;TYPE=internet:jdoe@isp.net + + EMAIL;TYPE=internet,pref:jane_doe@abc.com + +3.3.3 MAILER Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type MAILER + + Type name: MAILER + + Type purpose: To specify the type of electronic mail software that is + used by the individual associated with the vCard. + + Type encoding: 8bit + + Type value: A single text value. 
+ + Type special notes: This information can provide assistance to a + correspondent regarding the type of data representation which can be + used, and how it can be packaged. This property is based on the + private MIME type X-Mailer that is generally implemented by MIME user + agent products. + + Type example: + + MAILER:PigeonMail 2.1 + +3.4 Geographical Types + + These types are concerned with information associated with + geographical positions or regions associated with the object the + vCard represents. + +3.4.1 TZ Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type TZ + + Type name: TZ + + Type purpose: To specify information related to the time zone of the + object the vCard represents. + + Type encoding: 8bit + + Type value: The default is a single utc-offset value. It can also be + reset to a single text value. + + Type special notes: The type value consists of a single value. + + Type examples: + + TZ:-05:00 + + TZ;VALUE=text:-05:00; EST; Raleigh/North America + ;This example has a single value, not a structured text value. + +3.4.2 GEO Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type GEO + + Type name: GEO + + Type purpose: To specify information related to the global + positioning of the object the vCard represents. + + Type encoding: 8bit + + Type value: A single structured value consisting of two float values + separated by the SEMI-COLON character (ASCII decimal 59). + + Type special notes: This type specifies information related to the + global position of the object associated with the vCard. The value + specifies latitude and longitude, in that order (i.e., "LAT LON" + ordering). 
The longitude represents the location east and west of the + prime meridian as a positive or negative real number, respectively. + The latitude represents the location north and south of the equator + as a positive or negative real number, respectively. The longitude + and latitude values MUST be specified as decimal degrees and should + be specified to six decimal places. This will allow for granularity + within a meter of the geographical position. The text components are + separated by the SEMI-COLON character (ASCII decimal 59). The simple + formula for converting degrees-minutes-seconds into decimal degrees + is: + + decimal = degrees + minutes/60 + seconds/3600. + + Type example: + + GEO:37.386013;-122.082932 + +3.5 Organizational Types + + These types are concerned with information associated with + characteristics of the organization or organizational units of the + object the vCard represents. + +3.5.1 TITLE Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type TITLE + + Type name: TITLE + + Type purpose: To specify the job title, functional position or + function of the object the vCard represents. + + Type encoding: 8bit + + Type value: A single text value. + + + +Dawson & Howes Standards Track [Page 17] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type special notes: This type is based on the X.520 Title attribute. + + Type example: + + TITLE:Director\, Research and Development + +3.5.2 ROLE Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type ROLE + + Type name: ROLE + + Type purpose: To specify information concerning the role, occupation, + or business category of the object the vCard represents. + + Type encoding: 8bit + + Type value: A single text value. + + Type special notes: This type is based on the X.520 Business Category + explanatory attribute. 
This property is included as an organizational + type to avoid confusion with the semantics of the TITLE type and + incorrect usage of that type when the semantics of this type is + intended. + + Type example: + + ROLE:Programmer + +3.5.3 LOGO Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type LOGO + + Type name: LOGO + + Type purpose: To specify a graphic image of a logo associated with + the object the vCard represents. + + Type encoding: The encoding MUST be reset to "b" using the ENCODING + parameter in order to specify inline, encoded binary data. If the + value is referenced by a URI value, then the default encoding of 8bit + is used and no explicit ENCODING parameter is needed. + + + + + +Dawson & Howes Standards Track [Page 18] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type value: A single value. The default is binary value. It can also + be reset to uri value. The uri value can be used to specify a value + outside of this MIME entity. + + Type special notes: The type can include the type parameter "TYPE" to + specify the graphic image format type. The TYPE parameter values MUST + be one of the IANA registered image formats or a non-standard image + format. + + Type example: + + LOGO;VALUE=uri:http://www.abc.com/pub/logos/abccorp.jpg + + LOGO;ENCODING=b;TYPE=JPEG:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcN + AQEEBQAwdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENvbW11bm + ljYXRpb25zIENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0 + <...the remainder of "B" encoded binary data...> + +3.5.4 AGENT Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type AGENT + + Type name: AGENT + + Type purpose: To specify information about another person who will + act on behalf of the individual or resource associated with the + vCard. + + Type encoding: 8-bit + + Type value: The default is a single vcard value. 
It can also be reset + to either a single text or uri value. The text value can be used to + specify textual information. The uri value can be used to specify + information outside of this MIME entity. + + Type special notes: This type is typically used to specify an area + administrator, assistant, or secretary for the individual associated + with the vCard. A key characteristic of the AGENT type is that it + represents somebody or something that is separately addressable. + + Type example: + + AGENT;VALUE=uri: + CID:JQPUBLIC.part3.960129T083020.xyzMail@host3.com + + AGENT:BEGIN:VCARD\nFN:Susan Thomas\nTEL:+1-919-555- + 1234\nEMAIL\;INTERNET:sthomas@host.com\nEND:VCARD\n + +3.5.5 ORG Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type ORG + + Type name: ORG + + Type purpose: To specify the organizational name and units associated + with the vCard. + + Type encoding: 8bit + + Type value: A single structured text value consisting of components + separated by the SEMI-COLON character (ASCII decimal 59). + + Type special notes: The type is based on the X.520 Organization Name + and Organization Unit attributes. The type value is a structured type + consisting of the organization name, followed by one or more levels + of organizational unit names. + + Type example: A type value consisting of an organizational name, + organizational unit #1 name and organizational unit #2 name. + + ORG:ABC\, Inc.;North American Division;Marketing + +3.6 Explanatory Types + + These types are concerned with additional explanations, such as that + related to informational notes or revisions specific to the vCard. 
+ +3.6.1 CATEGORIES Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type CATEGORIES + + Type name: CATEGORIES + + Type purpose: To specify application category information about the + vCard. + + Type encoding: 8bit + + + + + +Dawson & Howes Standards Track [Page 20] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type value: One or more text values separated by a COMMA character + (ASCII decimal 44). + + Type example: + + CATEGORIES:TRAVEL AGENT + + CATEGORIES:INTERNET,IETF,INDUSTRY,INFORMATION TECHNOLOGY + +3.6.2 NOTE Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type NOTE + + Type name: NOTE + + Type purpose: To specify supplemental information or a comment that + is associated with the vCard. + + Type encoding: 8bit + + Type value: A single text value. + + Type special notes: The type is based on the X.520 Description + attribute. + + Type example: + + NOTE:This fax number is operational 0800 to 1715 + EST\, Mon-Fri. + +3.6.3 PRODID Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type PRODID + + Type name: PRODID + + Type purpose: To specify the identifier for the product that created + the vCard object. + + Type encoding: 8-bit + + Type value: A single text value. + + + + + +Dawson & Howes Standards Track [Page 21] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type special notes: Implementations SHOULD use a method such as that + specified for Formal Public Identifiers in ISO 9070 to assure that + the text value is unique. + + Type example: + + PRODID:-//ONLINE DIRECTORY//NONSGML Version 1//EN + +3.6.4 REV Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type REV + + Type name: REV + + Type purpose: To specify revision information about the current + vCard. 
+ + Type encoding: 8bit + + Type value: The default is a single date-time value. It can also be + reset to a single date value. + + Type special notes: The value distinguishes the current revision of + the information in this vCard from other renditions of the + information. + + Type example: + + REV:1995-10-31T22:27:10Z + + REV:1997-11-15 + +3.6.5 SORT-STRING Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type SORT-STRING + + Type name: SORT-STRING + + Type purpose: To specify the family name or given name text to be + used for national-language-specific sorting of the FN and N types. + + Type encoding: 8bit + + Type value: A single text value. + + Type special notes: The sort string is used to provide family name or + given name text that is to be used in locale- or national-language- + specific sorting of the formatted name and structured name types. + Without this information, sorting algorithms could incorrectly sort + this vCard within a sequence of sorted vCards. When this type is + present in a vCard, then this family name or given name value is used + for sorting the vCard. + + Type examples: For the case of family name sorting, the following + examples define common sort string usage with the FN and N types. + + FN:Rene van der Harten + N:van der Harten;Rene;J.;Sir;R.D.O.N. 
+ SORT-STRING:Harten + + FN:Robert Pau Shou Chang + N:Pau;Shou Chang;Robert + SORT-STRING:Pau + + FN:Osamu Koura + N:Koura;Osamu + SORT-STRING:Koura + + FN:Oscar del Pozo + N:del Pozo Triscon;Oscar + SORT-STRING:Pozo + + FN:Christine d'Aboville + N:d'Aboville;Christine + SORT-STRING:Aboville + +3.6.6 SOUND Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type SOUND + + Type name: SOUND + + Type purpose: To specify digital sound content information that + annotates some aspect of the vCard. By default this type is used to + specify the proper pronunciation of the name type value of the vCard. + + Type encoding: The encoding MUST be reset to "b" using the ENCODING + parameter in order to specify inline, encoded binary data. If the + value is referenced by a URI value, then the default encoding of 8bit + is used and no explicit ENCODING parameter is needed. + + + + +Dawson & Howes Standards Track [Page 23] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type value: A single value. The default is binary value. It can also + be reset to uri value. The uri value can be used to specify a value + outside of this MIME entity. + + Type special notes: The type can include the type parameter "TYPE" to + specify the audio format type. The TYPE parameter values MUST be one + of the IANA registered audio formats or a non-standard audio format. + + Type example: + + SOUND;TYPE=BASIC;VALUE=uri:CID:JOHNQPUBLIC.part8. 
+ 19960229T080000.xyzMail@host1.com + + SOUND;TYPE=BASIC;ENCODING=b:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcN + AQEEBQAwdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENvbW11bm + ljYXRpb25zIENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0 + <...the remainder of "B" encoded binary data...> + +3.6.7 UID Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type UID + + Type name: UID + + Type purpose: To specify a value that represents a globally unique + identifier corresponding to the individual or resource associated + with the vCard. + + Type encoding: 8bit + + Type value: A single text value. + + Type special notes: The type is used to uniquely identify the object + that the vCard represents. + + The type can include the type parameter "TYPE" to specify the format + of the identifier. The TYPE parameter value should be an IANA + registered identifier format. The value can also be a non-standard + format. + + Type example: + + UID:19950401-080045-40000F192713-0052 + + + + + + +Dawson & Howes Standards Track [Page 24] + +RFC 2426 vCard MIME Directory Profile September 1998 + + +3.6.8 URL Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type URL + + Type name: URL + + Type purpose: To specify a uniform resource locator associated with + the object that the vCard refers to. + + Type encoding: 8bit + + Type value: A single uri value. + + Type example: + + URL:http://www.swbyps.restaurant.french/~chezchic.html + +3.6.9 VERSION Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type VERSION + + Type name: VERSION + + Type purpose: To specify the version of the vCard specification used + to format this vCard. + + Type encoding: 8bit + + Type value: A single text value. + + Type special notes: The property MUST be present in the vCard object. + The value MUST be "3.0" if the vCard corresponds to this + specification. 
+ + Type example: + + VERSION:3.0 + +3.7 Security Types + + These types are concerned with the security of communication pathways + or access to the vCard. + + + + + +Dawson & Howes Standards Track [Page 25] + +RFC 2426 vCard MIME Directory Profile September 1998 + + +3.7.1 CLASS Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type CLASS + + Type name: CLASS + + Type purpose: To specify the access classification for a vCard + object. + + Type encoding: 8bit + + Type value: A single text value. + + Type special notes: An access classification is only one component of + the general security model for a directory service. The + classification attribute provides a method of capturing the intent of + the owner for general access to information described by the vCard + object. + + Type examples: + + CLASS:PUBLIC + + CLASS:PRIVATE + + CLASS:CONFIDENTIAL + +3.7.2 KEY Type Definition + + To: ietf-mime-directory@imc.org + + Subject: Registration of text/directory MIME type KEY + + Type name: KEY + + Type purpose: To specify a public key or authentication certificate + associated with the object that the vCard represents. + + Type encoding: The encoding MUST be reset to "b" using the ENCODING + parameter in order to specify inline, encoded binary data. If the + value is a text value, then the default encoding of 8bit is used and + no explicit ENCODING parameter is needed. + + Type value: A single value. The default is binary. It can also be + reset to text value. The text value can be used to specify a text + key. + + + +Dawson & Howes Standards Track [Page 26] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + Type special notes: The type can also include the type parameter TYPE + to specify the public key or authentication certificate format. The + parameter type should specify an IANA registered public key or + authentication certificate format. The parameter type can also + specify a non-standard format. 
+ + Type example: + + KEY;ENCODING=b:MIICajCCAdOgAwIBAgICBEUwDQYJKoZIhvcNAQEEBQA + wdzELMAkGA1UEBhMCVVMxLDAqBgNVBAoTI05ldHNjYXBlIENbW11bmljYX + Rpb25zIENvcnBvcmF0aW9uMRwwGgYDVQQLExNJbmZvcm1hdGlvbiBTeXN0 + ZW1zMRwwGgYDVQQDExNyb290Y2EubmV0c2NhcGUuY29tMB4XDTk3MDYwNj + E5NDc1OVoXDTk3MTIwMzE5NDc1OVowgYkxCzAJBgNVBAYTAlVTMSYwJAYD + VQQKEx1OZXRzY2FwZSBDb21tdW5pY2F0aW9ucyBDb3JwLjEYMBYGA1UEAx + MPVGltb3RoeSBBIEhvd2VzMSEwHwYJKoZIhvcNAQkBFhJob3dlc0BuZXRz + Y2FwZS5jb20xFTATBgoJkiaJk/IsZAEBEwVob3dlczBcMA0GCSqGSIb3DQ + EBAQUAA0sAMEgCQQC0JZf6wkg8pLMXHHCUvMfL5H6zjSk4vTTXZpYyrdN2 + dXcoX49LKiOmgeJSzoiFKHtLOIboyludF90CgqcxtwKnAgMBAAGjNjA0MB + EGCWCGSAGG+EIBAQQEAwIAoDAfBgNVHSMEGDAWgBT84FToB/GV3jr3mcau + +hUMbsQukjANBgkqhkiG9w0BAQQFAAOBgQBexv7o7mi3PLXadkmNP9LcIP + mx93HGp0Kgyx1jIVMyNgsemeAwBM+MSlhMfcpbTrONwNjZYW8vJDSoi//y + rZlVt9bJbs7MNYZVsyF1unsqaln4/vy6Uawfg8VUMk1U7jt8LYpo4YULU7 + UZHPYVUaSgVttImOHZIKi4hlPXBOhcUQ== + +3.8 Extended Types + + The types defined by this document can be extended with private types + using the non-standard, private values mechanism defined in [RFC + 2045]. Non-standard, private types with a name starting with "X-" may + be defined bilaterally between two cooperating agents without outside + registration or standardization. + +4. Formal Grammar + + The following formal grammar is provided to assist developers in + building parsers for the vCard. 
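As a rough illustration of what such a parser does, the following Python sketch splits one already-unfolded content line into group, name, parameters and value, following the shape of the "contentline" production in the grammar of this section. It is a sketch under stated assumptions, not a conforming implementation: the function name parse_contentline is hypothetical, the trailing CRLF is assumed to be stripped already, names are upper-cased on the assumption that they are case insensitive, and commas inside quoted parameter values are not handled.

```python
def parse_contentline(line):
    """Split one unfolded line: [group "."] name *(";" param) ":" value.

    Hypothetical helper for illustration only; no per-type validation.
    """
    # Find the first ":" that is not inside a double-quoted parameter value.
    in_quotes = False
    colon = -1
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == ":" and not in_quotes:
            colon = i
            break
    if colon < 0:
        raise ValueError("no ':' separator in content line")
    head, value = line[:colon], line[colon + 1:]

    # Split the head on ";" outside quotes: name first, then parameters.
    pieces, current, in_quotes = [], "", False
    for ch in head:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == ";" and not in_quotes:
            pieces.append(current)
            current = ""
        else:
            current += ch
    pieces.append(current)

    # An optional group prefix is separated from the name by ".".
    group, _, name = pieces[0].rpartition(".")

    params = {}
    for piece in pieces[1:]:
        pname, eq, pvalue = piece.partition("=")
        if not eq:
            raise ValueError("parameter without '=': %r" % piece)
        # Simplification: a comma inside a quoted value would be split too.
        values = [v.strip('"') for v in pvalue.split(",")]
        params.setdefault(pname.upper(), []).extend(values)

    return (group or None, name.upper(), params, value)


print(parse_contentline("TEL;TYPE=VOICE,MSG,WORK:+1-919-676-9515"))
```

Scanning for the first unquoted ":" rather than calling split(":") keeps URL values such as "http://..." intact, because everything after that separator belongs to the value.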
+ + This syntax is written according to the form described in RFC 2234, + but it references just this small subset of RFC 2234 literals: + + ;******************************************* + ; Commonly Used Literal Definition + ;******************************************* + + ALPHA = %x41-5A / %x61-7A + ; Latin Capital Letter A-Latin Capital Letter Z / + ; Latin Small Letter a-Latin Small Letter z + + + + +Dawson & Howes Standards Track [Page 27] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + CHAR = %x01-7F + ; Any C0 Controls and Basic Latin, excluding NULL from + ; Code Charts, pages 7-6 through 7-9 in [UNICODE] + + CR = %x0D + ; Carriage Return + + LF = %x0A + ; Line Feed + + CRLF = CR LF + ; Internet standard newline + + ;CTL = %x00-1F / %x7F + ; Controls. Not used, but referenced in comments. + + DIGIT = %x30-39 + ; Digit Zero-Digit Nine + + DQUOTE = %x22 + ; Quotation Mark + + HTAB = %x09 + ; Horizontal Tabulation + + SP = %x20 + ; space + + VCHAR = %x21-7E + ; Visible (printing) characters + + WSP = SP / HTAB + ; White Space + + ;******************************************* + ; Basic vCard Definition + ;******************************************* + + vcard_entity = 1*(vcard) + + vcard = [group "."] "BEGIN" ":" "VCARD" 1*CRLF + 1*(contentline) + ;A vCard object MUST include the VERSION, FN and N types. + [group "."] "END" ":" "VCARD" 1*CRLF + + contentline = [group "."] name *(";" param ) ":" value CRLF + ; When parsing a content line, folded lines must first + ; be unfolded according to the unfolding procedure + + + +Dawson & Howes Standards Track [Page 28] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + ; described above. When generating a content line, lines + ; longer than 75 characters SHOULD be folded according to + ; the folding procedure described in [MIME-DIR]. 
+ + group = 1*(ALPHA / DIGIT / "-") + + name = iana-token / x-name + ; Parsing of the param and value is + ; based on the "name" or type identifier + ; as defined in ABNF sections below + + iana-token = 1*(ALPHA / DIGIT / "-") + ; vCard type or parameter identifier registered with IANA + + x-name = "X-" 1*(ALPHA / DIGIT / "-") + ; Reserved for non-standard use + + param = param-name "=" param-value *("," param-value) + + param-name = iana-token / x-name + + param-value = ptext / quoted-string + + ptext = *SAFE-CHAR + + value = *VALUE-CHAR + + quoted-string = DQUOTE *QSAFE-CHAR DQUOTE + + NON-ASCII = %x80-FF + ; Use is restricted by CHARSET parameter + ; on outer MIME object (UTF-8 preferred) + + QSAFE-CHAR = WSP / %x21 / %x23-7E / NON-ASCII + ; Any character except CTLs, DQUOTE + + SAFE-CHAR = WSP / %x21 / %x23-2B / %x2D-39 / %x3C-7E / NON-ASCII + ; Any character except CTLs, DQUOTE, ";", ":", "," + + VALUE-CHAR = WSP / VCHAR / NON-ASCII + ; Any textual character + + ;******************************************* + ; vCard Type Definition + ; + ; Provides type-specific definitions for how the + ; "value" and "param" are defined. + ;******************************************* + + + +Dawson & Howes Standards Track [Page 29] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + ;For name="NAME" + param = "" + ; No parameters allowed + + value = text-value + + ;For name="PROFILE" + param = "" + ; No parameters allowed + + value = text-value + ; Value MUST be the case insensitive value "VCARD" + + ;For name="SOURCE" + param = source-param + ; Only source parameters allowed + + value = uri + + source-param = ("VALUE" "=" "uri") + / ("CONTEXT" "=" "word") + ; Parameter value specifies the protocol context + ; for the uri value. + / (x-name "=" *SAFE-CHAR) + + ;For name="FN" + ;This type MUST be included in a vCard object. + param = text-param + ; Text parameters allowed + + value = text-value + + ;For name="N" + ;This type MUST be included in a vCard object. 
+ + param = text-param + ; Text parameters allowed + + value = n-value + + n-value = 0*4(text-value *("," text-value) ";") + text-value *("," text-value) + ; Family; Given; Middle; Prefix; Suffix. + ; Example: Public;John;Quincy,Adams;Reverend Dr. III + + ;For name="NICKNAME" + param = text-param + ; Text parameters allowed + + + +Dawson & Howes Standards Track [Page 30] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + value = text-list + + ;For name="PHOTO" + param = img-inline-param + ; Only image parameters allowed + + param =/ img-refer-param + ; Only image parameters allowed + + value = img-inline-value + ; Value and parameter MUST match + + value =/ img-refer-value + ; Value and parameter MUST match + + ;For name="BDAY" + param = ("VALUE" "=" "date") + ; Only value parameter allowed + + param =/ ("VALUE" "=" "date-time") + ; Only value parameter allowed + + value = date-value + ; Value MUST match value type + + value =/ date-time-value + ; Value MUST match value type + + ;For name="ADR" + param = adr-param / text-param + ; Only adr and text parameters allowed + + value = adr-value + + ;For name="LABEL" + param = adr-param / text-param + ; Only adr and text parameters allowed + + value = text-value + + ;For name="TEL" + param = tel-param + ; Only tel parameters allowed + + value = phone-number-value + + tel-param = "TYPE" "=" tel-type *("," tel-type) + + + + +Dawson & Howes Standards Track [Page 31] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + tel-type = "HOME" / "WORK" / "PREF" / "VOICE" / "FAX" / "MSG" + / "CELL" / "PAGER" / "BBS" / "MODEM" / "CAR" / "ISDN" + / "VIDEO" / "PCS" / iana-token / x-name + ; Values are case insensitive + + ;For name="EMAIL" + param = email-param + ; Only email parameters allowed + + value = text-value + + email-param = "TYPE" "=" email-type ["," "PREF"] + ; Value is case insensitive + + email-type = "INTERNET" / "X400" / iana-token / "X-" word + ; Values are case insensitive + + ;For name="MAILER" + 
param = text-param + ; Only text parameters allowed + + value = text-value + + ;For name="TZ" + param = "" + ; No parameters allowed + + value = utc-offset-value + + ;For name="GEO" + param = "" + ; No parameters allowed + + value = float-value ";" float-value + + ;For name="TITLE" + param = text-param + ; Only text parameters allowed + + value = text-value + + ;For name="ROLE" + param = text-param + ; Only text parameters allowed + + value = text-value + + ;For name="LOGO" + + + +Dawson & Howes Standards Track [Page 32] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + param = img-inline-param / img-refer-param + ; Only image parameters allowed + + value = img-inline-value / img-refer-value + ; Value and parameter MUST match + + ;For name="AGENT" + param = agent-inline-param + + param =/ agent-refer-param + + value = agent-inline-value + ; Value and parameter MUST match + + value =/ agent-refer-value + ; Value and parameter MUST match + + agent-inline-param = "" + ; No parameters allowed + + agent-refer-param = "VALUE" "=" "uri" + ; Only value parameter allowed + + agent-inline-value = text-value + ; Value MUST be a valid vCard object + + agent-refer-value = uri + ; URI MUST refer to image content of given type + + ;For name="ORG" + + param = text-param + ; Only text parameters allowed + + value = org-value + + org-value = *(text-value ";") text-value + ; First is Organization Name, remainder are Organization Units. + + ;For name="CATEGORIES" + param = text-param + ; Only text parameters allowed + + value = text-list + + ;For name="NOTE" + param = text-param + ; Only text parameters allowed + + + +Dawson & Howes Standards Track [Page 33] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + value = text-value + + ;For name="PRODID" + param = "" + ; No parameters allowed + + value = text-value + + ;For name="REV" + param = ["VALUE" "=" "date-time"] + ; Only value parameters allowed. Values are case insensitive. 
+ + param =/ "VALUE" "=" "date" + ; Only value parameters allowed. Values are case insensitive. + + value = date-time-value + + value =/ date-value + + ;For name="SORT-STRING" + param = text-param + ; Only text parameters allowed + + value = text-value + + ;For name="SOUND" + param = snd-inline-param + ; Only sound parameters allowed + + param =/ snd-refer-param + ; Only sound parameters allowed + + value = snd-inline-value + ; Value MUST match value type + + value =/ snd-refer-value + ; Value MUST match value type + + snd-inline-value = binary-value CRLF + ; Value MUST be "b" encoded audio content + + snd-inline-param = ("VALUE" "=" "binary") + / ("ENCODING" "=" "b") + / ("TYPE" "=" *SAFE-CHAR) + ; Value MUST be an IANA registered audio type + + snd-refer-value = uri + ; URI MUST refer to audio content of given type + + + +Dawson & Howes Standards Track [Page 34] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + snd-refer-param = ("VALUE" "=" "uri") + / ("TYPE" "=" word) + ; Value MUST be an IANA registered audio type + + ;For name="UID" + param = "" + ; No parameters allowed + + value = text-value + + ;For name="URL" + param = "" + ; No parameters allowed + + value = uri + + ;For name="VERSION" + ;This type MUST be included in a vCard object. 
+ param = "" + ; No parameters allowed + + value = text-value + ; Value MUST be "3.0" + + ;For name="CLASS" + param = "" + ; No parameters allowed + + value = "PUBLIC" / "PRIVATE" / "CONFIDENTIAL" + / iana-token / x-name + ; Values are case insensitive + + ;For name="KEY" + param = key-txt-param + ; Only value and type parameters allowed + + param =/ key-bin-param + ; Only value and type parameters allowed + + value = text-value + + value =/ binary-value + + key-txt-param = "TYPE" "=" keytype + + key-bin-param = ("TYPE" "=" keytype) + / ("ENCODING" "=" "b") + ; Value MUST be a "b" encoded key or certificate + + + +Dawson & Howes Standards Track [Page 35] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + keytype = "X509" / "PGP" / iana-token / x-name + ; Values are case insensitive + + ;For name="X-" non-standard type + param = text-param / (x-name "=" param-value) + ; Only text or non-standard parameters allowed + + value = text-value + + ;******************************************* + ; vCard Commonly Used Parameter Definition + ;******************************************* + + text-param = ("VALUE" "=" "ptext") + / ("LANGUAGE" "=" langval) + / (x-name "=" param-value) + + langval = <a language string as defined in RFC 1766> + + img-inline-value = binary-value + ;Value MUST be "b" encoded image content + + img-inline-param = ("VALUE" "=" "binary") + / ("ENCODING" "=" "b") + / ("TYPE" "=" param-value) + ;TYPE value MUST be an IANA registered image type + + img-refer-value = uri + ;URI MUST refer to image content of given type + + img-refer-param = ("VALUE" "=" "uri") + / ("TYPE" "=" param-value) + ;TYPE value MUST be an IANA registered image type + + adr-param = ("TYPE" "=" adr-type *("," adr-type)) + / (text-param) + + adr-type = "dom" / "intl" / "postal" / "parcel" / "home" + / "work" / "pref" / iana-token / x-name + + adr-value = 0*6(text-value ";") text-value + ; PO Box, Extended Address, Street, Locality, Region, Postal + ; Code, 
Country Name + + + + + + +Dawson & Howes Standards Track [Page 36] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + ;******************************************* + ; vCard Type Value Definition + ;******************************************* + + text-value-list = 1*text-value *("," 1*text-value) + + text-value = *(SAFE-CHAR / ":" / DQUOTE / ESCAPED-CHAR) + + ESCAPED-CHAR = "\\" / "\;" / "\," / "\n" / "\N" + ; \\ encodes \, \n or \N encodes newline + ; \; encodes ;, \, encodes , + + binary-value = <A "b" encoded text value as defined in [RFC 2047]> + + date-value = <A single date value as defined in [MIME-DIR]> + + time-value = <A single time value as defined in [MIME-DIR]> + + date-time-value = <A single date-time value as defined in [MIME-DIR]> + + float-value = <A single float value as defined in [MIME-DIR]> + + phone-number-value = <A single text value as defined in [CCITT + E.163] and [CCITT X.121]> + + uri-value = <A uri value as defined in [MIME-DIR]> + + utc-offset-value = ("+" / "-") time-hour ":" time-minute + time-hour = 2DIGIT ;00-23 + time-minute = 2DIGIT ;00-59 + +5. Differences From vCard v2.1 + + This specification has been reviewed by the IETF community. The + review process introduced a number of differences from the [VCARD] + version 2.1. These differences require that vCard objects conforming + to this specification have a different version number than a vCard + conforming to [VCARD]. The differences include the following: + + . The QUOTED-PRINTABLE inline encoding has been eliminated. + Only the "B" encoding of [RFC 2047] is an allowed value for + the ENCODING parameter. + + . The method for specifying CRLF character sequences in text + type values has been changed. The CRLF character sequence in + a text type value is specified with the backslash character + sequence "\n" or "\N". + + + + +Dawson & Howes Standards Track [Page 37] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + . 
Any COMMA or SEMICOLON in a text type value must be backslash + escaped. + + . VERSION value corresponding to this specification MUST be + "3.0". + + . The [MIME-DIR] predefined types of SOURCE, NAME and PROFILE + are allowed. + + . The [MIME-DIR] VALUE type parameter for value data typing is + allowed. In addition, there are extensions made to these type + values for additional value types used in this specification. + + . The [VCARD] CHARSET type parameter has been eliminated. + Character set can only be specified on the CHARSET parameter + on the Content-Type MIME header field. + + . The [VCARD] support for non-significant WSP character has + been eliminated. + + . The "TYPE=" prefix to parameter values is required. In + [VCARD] this was optional. + + . LOGO, PHOTO and SOUND multimedia formats MUST be either IANA + registered types or non-standard types. + + . Inline binary content must be "B" encoded and folded. A blank + line after the encoded binary content is no longer required. + + . TEL values can be identified as personal communication + services telephone numbers with the PCS type parameter value. + + . The CATEGORIES, CLASS, NICKNAME, PRODID and SORT-STRING types + have been added. + + . The VERSION, N and FN types MUST be specified in a vCard. + This identifies the version of the specification that the + object was formatted to. It also assures that every vCard + will include both a structured and formatted name that can be + used to identify the object. + + + + + + + + + + + +Dawson & Howes Standards Track [Page 38] + +RFC 2426 vCard MIME Directory Profile September 1998 + + +6. Acknowledgements + + The many valuable comments contributed by members of the IETF ASID + working group are gratefully acknowledged, as are the contributions + by Roland Alden, Stephen Bartlett, Alec Dun, Patrik Faltstrom, Daniel + Gurney, Bruce Johnston, Daniel Klaussen, Pete Miller, Keith Moore, + Vinod Seraphin, Michelle Watkins. 
Chris Newman was especially helpful + in navigating the intricacies of ABNF lore. + +7. Authors' Addresses + + BEGIN:vCard + VERSION:3.0 + FN:Frank Dawson + ORG:Lotus Development Corporation + ADR;TYPE=WORK,POSTAL,PARCEL:;;6544 Battleford Drive + ;Raleigh;NC;27613-3502;U.S.A. + TEL;TYPE=VOICE,MSG,WORK:+1-919-676-9515 + TEL;TYPE=FAX,WORK:+1-919-676-9564 + EMAIL;TYPE=INTERNET,PREF:Frank_Dawson@Lotus.com + EMAIL;TYPE=INTERNET:fdawson@earthlink.net + URL:http://home.earthlink.net/~fdawson + END:vCard + + + BEGIN:vCard + VERSION:3.0 + FN:Tim Howes + ORG:Netscape Communications Corp. + ADR;TYPE=WORK:;;501 E. Middlefield Rd.;Mountain View; + CA; 94043;U.S.A. + TEL;TYPE=VOICE,MSG,WORK:+1-415-937-3419 + TEL;TYPE=FAX,WORK:+1-415-528-4164 + EMAIL;TYPE=INTERNET:howes@netscape.com + END:vCard + +8. Security Considerations + + vCards can carry cryptographic keys or certificates, as described in + Section 3.7.2. + + Section 3.7.1 specifies a desired security classification policy for + a particular vCard. That policy is not enforced in any way. + + The vCard objects have no inherent authentication or privacy, but can + easily be carried by any security mechanism that transfers MIME + objects with authentication or privacy. In cases where threats of + "spoofed" vCard information is a concern, the vCard SHOULD BE + + + +Dawson & Howes Standards Track [Page 39] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + transported using one of these secure mechanisms. + + The information in a vCard may become out of date. In cases where the + vitality of data is important to an originator of a vCard, the "URL" + type described in section 3.6.8 SHOULD BE specified. In addition, the + "REV" type described in section 3.6.4 can be specified to indicate + the last time that the vCard data was updated. + +9. 
References + + [ISO 8601] ISO 8601:1988 - Data elements and interchange formats - + Information interchange - Representation of dates and + times - The International Organization for + Standardization, June, 1988. + + [ISO 8601 TC] ISO 8601, Technical Corrigendum 1 - Data elements and + interchange formats - Information interchange - + Representation of dates and times - The International + Organization for Standardization, May, 1991. + + [ISO 9070] ISO 9070, Information Processing - SGML support + facilities - Registration Procedures for Public Text + Owner Identifiers, April, 1991. + + [CCITT E.163] Recommendation E.163 - Numbering Plan for The + International Telephone Service, CCITT Blue Book, + Fascicle II.2, pp. 128-134, November, 1988. + + [CCITT X.121] Recommendation X.121 - International Numbering Plan for + Public Data Networks, CCITT Blue Book, Fascicle VIII.3, + pp. 317-332, November, 1988. + + [CCITT X.520] Recommendation X.520 - The Directory - Selected + Attribute Types, November 1988. + + [CCITT X.521] Recommendation X.521 - The Directory - Selected Object + Classes, November 1988. + + [MIME-DIR] Howes, T., Smith, M., and F. Dawson, "A MIME Content- + Type for Directory Information", RFC 2425, September + 1998. + + [RFC 1738] Berners-Lee, T., Masinter, L., and M. McCahill, + "Uniform Resource Locators (URL)", RFC 1738, December + 1994. + + [RFC 1766] Alvestrand, H., "Tags for the Identification of + Languages", RFC 1766, March 1995. + + + +Dawson & Howes Standards Track [Page 40] + +RFC 2426 vCard MIME Directory Profile September 1998 + + + [RFC 1872] Levinson, E., "The MIME Multipart/Related Content- + type", RFC 1872, December 1995. + + [RFC 2045] Freed, N., and N. Borenstein, "Multipurpose Internet + Mail Extensions (MIME) - Part One: Format of Internet + Message Bodies", RFC 2045, November 1996. + + [RFC 2046] Freed, N., and N. Borenstein, "Multipurpose Internet + Mail Extensions (MIME) - Part Two: Media Types", RFC + 2046, November 1996. 
+ + [RFC 2047] Moore, K., "Multipurpose Internet Mail Extensions + (MIME) - Part Three: Message Header Extensions for + Non-ASCII Text", RFC 2047, November 1996. + + [RFC 2048] Freed, N., Klensin, J., and J. Postel, "Multipurpose + Internet Mail Extensions (MIME) - Part Four: + Registration Procedures", RFC 2048, January 1997. + + [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC 2234] Crocker, D., and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", RFC 2234, November 1997. + + [UNICODE] "The Unicode Standard - Version 2.0", The Unicode + Consortium, July 1996. + + [VCARD] Internet Mail Consortium, "vCard - The Electronic + Business Card Version 2.1", + http://www.imc.org/pdi/vcard-21.txt, September 18, + 1996. + + + + + + + + + + + + + + + + + + + +Dawson & Howes Standards Track [Page 41] + +RFC 2426 vCard MIME Directory Profile September 1998 + + +10. Full Copyright Statement + + Copyright (C) The Internet Society (1998). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. 
+ + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + + + + + + + + + + + + + + + + + + + + + + + + + +Dawson & Howes Standards Track [Page 42] + diff --git a/rfc/rfc2595.txt b/rfc/rfc2595.txt @@ -0,0 +1,843 @@ + + + + + + +Network Working Group C. Newman +Request for Comments: 2595 Innosoft +Category: Standards Track June 1999 + + + Using TLS with IMAP, POP3 and ACAP + + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (1999). All Rights Reserved. + +1. Motivation + + The TLS protocol (formerly known as SSL) provides a way to secure an + application protocol from tampering and eavesdropping. The option of + using such security is desirable for IMAP, POP and ACAP due to common + connection eavesdropping and hijacking attacks [AUTH]. Although + advanced SASL authentication mechanisms can provide a lightweight + version of this service, TLS is complementary to simple + authentication-only SASL mechanisms or deployed clear-text password + login commands. 
+ + Many sites have a high investment in authentication infrastructure + (e.g., a large database of a one-way-function applied to user + passwords), so a privacy layer which is not tightly bound to user + authentication can protect against network eavesdropping attacks + without requiring a new authentication infrastructure and/or forcing + all users to change their password. Recognizing that such sites will + desire simple password authentication in combination with TLS + encryption, this specification defines the PLAIN SASL mechanism for + use with protocols which lack a simple password authentication + command such as ACAP and SMTP. (Note there is a separate RFC for the + STARTTLS command in SMTP [SMTPTLS].) + + There is a strong desire in the IETF to eliminate the transmission of + clear-text passwords over unencrypted channels. While SASL can be + used for this purpose, TLS provides an additional tool with different + deployability characteristics. A server supporting both TLS with + + + + +Newman Standards Track [Page 1] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + + simple passwords and a challenge/response SASL mechanism is likely to + interoperate with a wide variety of clients without resorting to + unencrypted clear-text passwords. + + The STARTTLS command rectifies a number of the problems with using a + separate port for a "secure" protocol variant. Some of these are + mentioned in section 7. + +1.1. Conventions Used in this Document + + The key words "REQUIRED", "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", + "MAY", and "OPTIONAL" in this document are to be interpreted as + described in "Key words for use in RFCs to Indicate Requirement + Levels" [KEYWORDS]. + + Terms related to authentication are defined in "On Internet + Authentication" [AUTH]. + + Formal syntax is defined using ABNF [ABNF]. + + In examples, "C:" and "S:" indicate lines sent by the client and + server respectively. + +2. 
Basic Interoperability and Security Requirements + + The following requirements apply to all implementations of the + STARTTLS extension for IMAP, POP3 and ACAP. + +2.1. Cipher Suite Requirements + + Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [TLS] cipher + suite is REQUIRED. This is important as it assures that any two + compliant implementations can be configured to interoperate. + + All other cipher suites are OPTIONAL. + +2.2. Privacy Operational Mode Security Requirements + + Both clients and servers SHOULD have a privacy operational mode which + refuses authentication unless successful activation of an encryption + layer (such as that provided by TLS) occurs prior to or at the time + of authentication and which will terminate the connection if that + encryption layer is deactivated. Implementations are encouraged to + have flexibility with respect to the minimal encryption strength or + cipher suites permitted. A minimalist approach to this + recommendation would be an operational mode where the + TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA cipher suite is mandatory prior to + permitting authentication. + + + +Newman Standards Track [Page 2] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + + Clients MAY have an operational mode which uses encryption only when + it is advertised by the server, but authentication continues + regardless. For backwards compatibility, servers SHOULD have an + operational mode where only the authentication mechanisms required by + the relevant base protocol specification are needed to successfully + authenticate. + +2.3. Clear-Text Password Requirements + + Clients and servers which implement STARTTLS MUST be configurable to + refuse all clear-text login commands or mechanisms (including both + standards-track and nonstandard mechanisms) unless an encryption + layer of adequate strength is active. 
Servers which allow + unencrypted clear-text logins SHOULD be configurable to refuse + clear-text logins both for the entire server, and on a per-user + basis. + +2.4. Server Identity Check + + During the TLS negotiation, the client MUST check its understanding + of the server hostname against the server's identity as presented in + the server Certificate message, in order to prevent man-in-the-middle + attacks. Matching is performed according to these rules: + + - The client MUST use the server hostname it used to open the + connection as the value to compare against the server name as + expressed in the server certificate. The client MUST NOT use any + form of the server hostname derived from an insecure remote source + (e.g., insecure DNS lookup). CNAME canonicalization is not done. + + - If a subjectAltName extension of type dNSName is present in the + certificate, it SHOULD be used as the source of the server's + identity. + + - Matching is case-insensitive. + + - A "*" wildcard character MAY be used as the left-most name + component in the certificate. For example, *.example.com would + match a.example.com, foo.example.com, etc. but would not match + example.com. + + - If the certificate contains multiple names (e.g. more than one + dNSName field), then a match with any one of the fields is + considered acceptable. + + If the match fails, the client SHOULD either ask for explicit user + confirmation, or terminate the connection and indicate the server's + identity is suspect. + + + +Newman Standards Track [Page 3] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + +2.5. TLS Security Policy Check + + Both the client and server MUST check the result of the STARTTLS + command and subsequent TLS negotiation to see whether acceptable + authentication or privacy was achieved. Ignoring this step + completely invalidates using TLS for security. 
The decision about + whether acceptable authentication or privacy was achieved is made + locally, is implementation-dependent, and is beyond the scope of this + document. + +3. IMAP STARTTLS extension + + When the TLS extension is present in IMAP, "STARTTLS" is listed as a + capability in response to the CAPABILITY command. This extension + adds a single command, "STARTTLS" to the IMAP protocol which is used + to begin a TLS negotiation. + +3.1. STARTTLS Command + + Arguments: none + + Responses: no specific responses for this command + + Result: OK - begin TLS negotiation + BAD - command unknown or arguments invalid + + A TLS negotiation begins immediately after the CRLF at the end of + the tagged OK response from the server. Once a client issues a + STARTTLS command, it MUST NOT issue further commands until a + server response is seen and the TLS negotiation is complete. + + The STARTTLS command is only valid in non-authenticated state. + The server remains in non-authenticated state, even if client + credentials are supplied during the TLS negotiation. The SASL + [SASL] EXTERNAL mechanism MAY be used to authenticate once TLS + client credentials are successfully exchanged, but servers + supporting the STARTTLS command are not required to support the + EXTERNAL mechanism. + + Once TLS has been started, the client MUST discard cached + information about server capabilities and SHOULD re-issue the + CAPABILITY command. This is necessary to protect against + man-in-the-middle attacks which alter the capabilities list prior + to STARTTLS. The server MAY advertise different capabilities + after STARTTLS. 
+ + The formal syntax for IMAP is amended as follows: + + + + +Newman Standards Track [Page 4] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + + command_any =/ "STARTTLS" + + Example: C: a001 CAPABILITY + S: * CAPABILITY IMAP4rev1 STARTTLS LOGINDISABLED + S: a001 OK CAPABILITY completed + C: a002 STARTTLS + S: a002 OK Begin TLS negotiation now + <TLS negotiation, further commands are under TLS layer> + C: a003 CAPABILITY + S: * CAPABILITY IMAP4rev1 AUTH=EXTERNAL + S: a003 OK CAPABILITY completed + C: a004 LOGIN joe password + S: a004 OK LOGIN completed + +3.2. IMAP LOGINDISABLED capability + + The current IMAP protocol specification (RFC 2060) requires the + implementation of the LOGIN command which uses clear-text passwords. + Many sites may choose to disable this command unless encryption is + active for security reasons. An IMAP server MAY advertise that the + LOGIN command is disabled by including the LOGINDISABLED capability + in the capability response. Such a server will respond with a tagged + "NO" response to any attempt to use the LOGIN command. + + An IMAP server which implements STARTTLS MUST implement support for + the LOGINDISABLED capability on unencrypted connections. + + An IMAP client which complies with this specification MUST NOT issue + the LOGIN command if this capability is present. + + This capability is useful to prevent clients compliant with this + specification from sending an unencrypted password in an environment + subject to passive attacks. It has no impact on an environment + subject to active attacks as a man-in-the-middle attacker can remove + this capability. Therefore this does not relieve clients of the need + to follow the privacy mode recommendation in section 2.2. + + Servers advertising this capability will fail to interoperate with + many existing compliant IMAP clients and will be unable to prevent + those clients from disclosing the user's password. + +4. 
POP3 STARTTLS extension + + The POP3 STARTTLS extension adds the STLS command to POP3 servers. + If this is implemented, the POP3 extension mechanism [POP3EXT] MUST + also be implemented to avoid the need for client probing of multiple + commands. The capability name "STLS" indicates this command is + present and permitted in the current state. + + + +Newman Standards Track [Page 5] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + + STLS + + Arguments: none + + Restrictions: + Only permitted in AUTHORIZATION state. + + Discussion: + A TLS negotiation begins immediately after the CRLF at the + end of the +OK response from the server. A -ERR response + MAY result if a security layer is already active. Once a + client issues a STLS command, it MUST NOT issue further + commands until a server response is seen and the TLS + negotiation is complete. + + The STLS command is only permitted in AUTHORIZATION state + and the server remains in AUTHORIZATION state, even if + client credentials are supplied during the TLS negotiation. + The AUTH command [POP-AUTH] with the EXTERNAL mechanism + [SASL] MAY be used to authenticate once TLS client + credentials are successfully exchanged, but servers + supporting the STLS command are not required to support the + EXTERNAL mechanism. + + Once TLS has been started, the client MUST discard cached + information about server capabilities and SHOULD re-issue + the CAPA command. This is necessary to protect against + man-in-the-middle attacks which alter the capabilities list + prior to STLS. The server MAY advertise different + capabilities after STLS. + + Possible Responses: + +OK -ERR + + Examples: + C: STLS + S: +OK Begin TLS negotiation + <TLS negotiation, further commands are under TLS layer> + ... + C: STLS + S: -ERR Command not permitted when TLS active + + + + + + + + + + +Newman Standards Track [Page 6] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + +5. 
ACAP STARTTLS extension + + When the TLS extension is present in ACAP, "STARTTLS" is listed as a + capability in the ACAP greeting. No arguments to this capability are + defined at this time. This extension adds a single command, + "STARTTLS" to the ACAP protocol which is used to begin a TLS + negotiation. + +5.1. STARTTLS Command + + Arguments: none + + Responses: no specific responses for this command + + Result: OK - begin TLS negotiation + BAD - command unknown or arguments invalid + + A TLS negotiation begins immediately after the CRLF at the end of + the tagged OK response from the server. Once a client issues a + STARTTLS command, it MUST NOT issue further commands until a + server response is seen and the TLS negotiation is complete. + + The STARTTLS command is only valid in non-authenticated state. + The server remains in non-authenticated state, even if client + credentials are supplied during the TLS negotiation. The SASL + [SASL] EXTERNAL mechanism MAY be used to authenticate once TLS + client credentials are successfully exchanged, but servers + supporting the STARTTLS command are not required to support the + EXTERNAL mechanism. + + After the TLS layer is established, the server MUST re-issue an + untagged ACAP greeting. This is necessary to protect against + man-in-the-middle attacks which alter the capabilities list prior + to STARTTLS. The client MUST discard cached capability + information and replace it with the information from the new ACAP + greeting. The server MAY advertise different capabilities after + STARTTLS. + + The formal syntax for ACAP is amended as follows: + + command_any =/ "STARTTLS" + + Example: S: * ACAP (SASL "CRAM-MD5") (STARTTLS) + C: a002 STARTTLS + S: a002 OK "Begin TLS negotiation now" + <TLS negotiation, further commands are under TLS layer> + S: * ACAP (SASL "CRAM-MD5" "PLAIN" "EXTERNAL") + + + + +Newman Standards Track [Page 7] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + +6. 
PLAIN SASL mechanism + + Clear-text passwords are simple, interoperate with almost all + existing operating system authentication databases, and are useful + for a smooth transition to a more secure password-based + authentication mechanism. The drawback is that they are unacceptable + for use over an unencrypted network connection. + + This defines the "PLAIN" SASL mechanism for use with ACAP and other + protocols with no clear-text login command. The PLAIN SASL mechanism + MUST NOT be advertised or used unless a strong encryption layer (such + as the provided by TLS) is active or backwards compatibility dictates + otherwise. + + The mechanism consists of a single message from the client to the + server. The client sends the authorization identity (identity to + login as), followed by a US-ASCII NUL character, followed by the + authentication identity (identity whose password will be used), + followed by a US-ASCII NUL character, followed by the clear-text + password. The client may leave the authorization identity empty to + indicate that it is the same as the authentication identity. + + The server will verify the authentication identity and password with + the system authentication database and verify that the authentication + credentials permit the client to login as the authorization identity. + If both steps succeed, the user is logged in. + + The server MAY also use the password to initialize any new + authentication database, such as one suitable for CRAM-MD5 + [CRAM-MD5]. + + Non-US-ASCII characters are permitted as long as they are represented + in UTF-8 [UTF-8]. Use of non-visible characters or characters which + a user may be unable to enter on some keyboards is discouraged. + + The formal grammar for the client message using Augmented BNF [ABNF] + follows. 
+ + message = [authorize-id] NUL authenticate-id NUL password + authenticate-id = 1*UTF8-SAFE ; MUST accept up to 255 octets + authorize-id = 1*UTF8-SAFE ; MUST accept up to 255 octets + password = 1*UTF8-SAFE ; MUST accept up to 255 octets + NUL = %x00 + UTF8-SAFE = %x01-09 / %x0B-0C / %x0E-7F / UTF8-2 / + UTF8-3 / UTF8-4 / UTF8-5 / UTF8-6 + UTF8-1 = %x80-BF + UTF8-2 = %xC0-DF UTF8-1 + UTF8-3 = %xE0-EF 2UTF8-1 + + + +Newman Standards Track [Page 8] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + + UTF8-4 = %xF0-F7 3UTF8-1 + UTF8-5 = %xF8-FB 4UTF8-1 + UTF8-6 = %xFC-FD 5UTF8-1 + + Here is an example of how this might be used to initialize a CRAM-MD5 + authentication database for ACAP: + + Example: S: * ACAP (SASL "CRAM-MD5") (STARTTLS) + C: a001 AUTHENTICATE "CRAM-MD5" + S: + "<1896.697170952@postoffice.reston.mci.net>" + C: "tim b913a602c7eda7a495b4e6e7334d3890" + S: a001 NO (TRANSITION-NEEDED) + "Please change your password, or use TLS to login" + C: a002 STARTTLS + S: a002 OK "Begin TLS negotiation now" + <TLS negotiation, further commands are under TLS layer> + S: * ACAP (SASL "CRAM-MD5" "PLAIN" "EXTERNAL") + C: a003 AUTHENTICATE "PLAIN" {21+} + C: <NUL>tim<NUL>tanstaaftanstaaf + S: a003 OK CRAM-MD5 password initialized + + Note: In this example, <NUL> represents a single ASCII NUL octet. + +7. imaps and pop3s ports + + Separate "imaps" and "pop3s" ports were registered for use with SSL. + Use of these ports is discouraged in favor of the STARTTLS or STLS + commands. + + A number of problems have been observed with separate ports for + "secure" variants of protocols. This is an attempt to enumerate some + of those problems. + + - Separate ports lead to a separate URL scheme which intrudes into + the user interface in inappropriate ways. For example, many web + pages use language like "click here if your browser supports SSL." + This is a decision the browser is often more capable of making than + the user. 
+ + - Separate ports imply a model of either "secure" or "not secure." + This can be misleading in a number of ways. First, the "secure" + port may not in fact be acceptably secure as an export-crippled + cipher suite might be in use. This can mislead users into a false + sense of security. Second, the normal port might in fact be + secured by using a SASL mechanism which includes a security layer. + Thus the separate port distinction makes the complex topic of + security policy even more confusing. One common result of this + confusion is that firewall administrators are often misled into + + + +Newman Standards Track [Page 9] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + + permitting the "secure" port and blocking the standard port. This + could be a poor choice given the common use of SSL with a 40-bit + key encryption layer and plain-text password authentication is less + secure than strong SASL mechanisms such as GSSAPI with Kerberos 5. + + - Use of separate ports for SSL has caused clients to implement only + two security policies: use SSL or don't use SSL. The desirable + security policy "use TLS when available" would be cumbersome with + the separate port model, but is simple with STARTTLS. + + - Port numbers are a limited resource. While they are not yet in + short supply, it is unwise to set a precedent that could double (or + worse) the speed of their consumption. + + +8. IANA Considerations + + This constitutes registration of the "STARTTLS" and "LOGINDISABLED" + IMAP capabilities as required by section 7.2.1 of RFC 2060 [IMAP]. + + The registration for the POP3 "STLS" capability follows: + + CAPA tag: STLS + Arguments: none + Added commands: STLS + Standard commands affected: May enable USER/PASS as a side-effect. + CAPA command SHOULD be re-issued after successful completion. + Announced states/Valid states: AUTHORIZATION state only. 
+ Specification reference: this memo + + The registration for the ACAP "STARTTLS" capability follows: + + Capability name: STARTTLS + Capability keyword: STARTTLS + Capability arguments: none + Published Specification(s): this memo + Person and email address for further information: + see author's address section below + + The registration for the PLAIN SASL mechanism follows: + + SASL mechanism name: PLAIN + Security Considerations: See section 9 of this memo + Published specification: this memo + Person & email address to contact for further information: + see author's address section below + Intended usage: COMMON + Author/Change controller: see author's address section below + + + +Newman Standards Track [Page 10] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + +9. Security Considerations + + TLS only provides protection for data sent over a network connection. + Messages transferred over IMAP or POP3 are still available to server + administrators and usually subject to eavesdropping, tampering and + forgery when transmitted through SMTP or NNTP. TLS is no substitute + for an end-to-end message security mechanism using MIME security + multiparts [MIME-SEC]. + + A man-in-the-middle attacker can remove STARTTLS from the capability + list or generate a failure response to the STARTTLS command. In + order to detect such an attack, clients SHOULD warn the user when + session privacy is not active and/or be configurable to refuse to + proceed without an acceptable level of security. + + A man-in-the-middle attacker can always cause a down-negotiation to + the weakest authentication mechanism or cipher suite available. For + this reason, implementations SHOULD be configurable to refuse weak + mechanisms or cipher suites. + + Any protocol interactions prior to the TLS handshake are performed in + the clear and can be modified by a man-in-the-middle attacker. 
For + this reason, clients MUST discard cached information about server + capabilities advertised prior to the start of the TLS handshake. + + Clients are encouraged to clearly indicate when the level of + encryption active is known to be vulnerable to attack using modern + hardware (such as encryption keys with 56 bits of entropy or less). + + The LOGINDISABLED IMAP capability (discussed in section 3.2) only + reduces the potential for passive attacks, it provides no protection + against active attacks. The responsibility remains with the client + to avoid sending a password over a vulnerable channel. + + The PLAIN mechanism relies on the TLS encryption layer for security. + When used without TLS, it is vulnerable to a common network + eavesdropping attack. Therefore PLAIN MUST NOT be advertised or used + unless a suitable TLS encryption layer is active or backwards + compatibility dictates otherwise. + + When the PLAIN mechanism is used, the server gains the ability to + impersonate the user to all services with the same password + regardless of any encryption provided by TLS or other network privacy + mechanisms. While many other authentication mechanisms have similar + weaknesses, stronger SASL mechanisms such as Kerberos address this + issue. Clients are encouraged to have an operational mode where all + mechanisms which are likely to reveal the user's password to the + server are disabled. + + + +Newman Standards Track [Page 11] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + + The security considerations for TLS apply to STARTTLS and the + security considerations for SASL apply to the PLAIN mechanism. + Additional security requirements are discussed in section 2. + +10. References + + [ABNF] Crocker, D. and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", RFC 2234, November 1997. + + [ACAP] Newman, C. and J. Myers, "ACAP -- Application + Configuration Access Protocol", RFC 2244, November 1997. + + [AUTH] Haller, N. and R. 
Atkinson, "On Internet Authentication", + RFC 1704, October 1994. + + [CRAM-MD5] Klensin, J., Catoe, R. and P. Krumviede, "IMAP/POP + AUTHorize Extension for Simple Challenge/Response", RFC + 2195, September 1997. + + [IMAP] Crispin, M., "Internet Message Access Protocol - Version + 4rev1", RFC 2060, December 1996. + + [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [MIME-SEC] Galvin, J., Murphy, S., Crocker, S. and N. Freed, + "Security Multiparts for MIME: Multipart/Signed and + Multipart/Encrypted", RFC 1847, October 1995. + + [POP3] Myers, J. and M. Rose, "Post Office Protocol - Version 3", + STD 53, RFC 1939, May 1996. + + [POP3EXT] Gellens, R., Newman, C. and L. Lundblade, "POP3 Extension + Mechanism", RFC 2449, November 1998. + + [POP-AUTH] Myers, J., "POP3 AUTHentication command", RFC 1734, + December 1994. + + [SASL] Myers, J., "Simple Authentication and Security Layer + (SASL)", RFC 2222, October 1997. + + [SMTPTLS] Hoffman, P., "SMTP Service Extension for Secure SMTP over + TLS", RFC 2487, January 1999. + + [TLS] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", + RFC 2246, January 1999. + + + + + +Newman Standards Track [Page 12] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + + [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO + 10646", RFC 2279, January 1998. + + +11. Author's Address + + Chris Newman + Innosoft International, Inc. + 1050 Lakes Drive + West Covina, CA 91790 USA + + EMail: chris.newman@innosoft.com + + +A. Appendix -- Compliance Checklist + + An implementation is not compliant if it fails to satisfy one or more + of the MUST requirements for the protocols it implements. 
An + implementation that satisfies all the MUST and all the SHOULD + requirements for its protocols is said to be "unconditionally + compliant"; one that satisfies all the MUST requirements but not all + the SHOULD requirements for its protocols is said to be + "conditionally compliant". + + Rules Section + ----- ------- + Mandatory-to-implement Cipher Suite 2.1 + SHOULD have mode where encryption required 2.2 + server SHOULD have mode where TLS not required 2.2 + MUST be configurable to refuse all clear-text login + commands or mechanisms 2.3 + server SHOULD be configurable to refuse clear-text + login commands on entire server and on per-user basis 2.3 + client MUST check server identity 2.4 + client MUST use hostname used to open connection 2.4 + client MUST NOT use hostname from insecure remote lookup 2.4 + client SHOULD support subjectAltName of dNSName type 2.4 + client SHOULD ask for confirmation or terminate on fail 2.4 + MUST check result of STARTTLS for acceptable privacy 2.5 + client MUST NOT issue commands after STARTTLS + until server response and negotiation done 3.1,4,5.1 + client MUST discard cached information 3.1,4,5.1,9 + client SHOULD re-issue CAPABILITY/CAPA command 3.1,4 + IMAP server with STARTTLS MUST implement LOGINDISABLED 3.2 + IMAP client MUST NOT issue LOGIN if LOGINDISABLED 3.2 + POP server MUST implement POP3 extensions 4 + ACAP server MUST re-issue ACAP greeting 5.1 + + + + +Newman Standards Track [Page 13] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + + client SHOULD warn when session privacy not active and/or + refuse to proceed without acceptable security level 9 + SHOULD be configurable to refuse weak mechanisms or + cipher suites 9 + + The PLAIN mechanism is an optional part of this specification. 
+ However if it is implemented the following rules apply: + + Rules Section + ----- ------- + MUST NOT use PLAIN unless strong encryption active + or backwards compatibility dictates otherwise 6,9 + MUST use UTF-8 encoding for characters in PLAIN 6 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Newman Standards Track [Page 14] + +RFC 2595 Using TLS with IMAP, POP3 and ACAP June 1999 + + +Full Copyright Statement + + Copyright (C) The Internet Society (1999). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. 
+ + + + + + + + + + + + + + + + + + + +Newman Standards Track [Page 15] + diff --git a/rfc/rfc2646.txt b/rfc/rfc2646.txt @@ -0,0 +1,787 @@ + + + + + + +Network Working Group R. Gellens, Editor +Request for Comments: 2646 Qualcomm +Updates: 2046 August 1999 +Category: Standards Track + + + The Text/Plain Format Parameter + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (1999). All Rights Reserved. + +Table of Contents + + 1. Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . 2 + 2. Conventions Used in this Document . . . . . . . . . . . . . 2 + 3. The Problem . . . . . . . . . . . . . . . . . . . . . . . . 2 + 3.1. Paragraph Text . . . . . . . . . . . . . . . . . . . . 3 + 3.2. Embarrassing Line Wrap . . . . . . . . . . . . . . . . . 3 + 3.3. New Media Types . . . . . . . . . . . . . . . . . . . . 4 + 4. The Format Parameter to the Text/Plain Media Type . . . . . 4 + 4.1. Generating Format=Flowed . . . . . . . . . . . . . . . 5 + 4.2. Interpreting Format=Flowed . . . . . . . . . . . . . . . 6 + 4.3. Usenet Signature Convention . . . . . . . . . . . . . . 7 + 4.4. Space-Stuffing . . . . . . . . . . . . . . . . . . . . . 7 + 4.5. Quoting . . . . . . . . . . . . . . . . . . . . . . . . 8 + 4.6. Digital Signatures and Encryption . . . . . . . . . . . 9 + 4.7. Line Analysis Table . . . . . . . . . . . . . . . . . . 10 + 4.8. Examples . . . . . . . . . . . . . . . . . . . . . . . . 10 + 5. ABNF . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 + 6. Failure Modes . . . . . . . . . . . . . . . . . . . . . . . 11 + 6.1. Trailing White Space Corruption . . . . . . . . . . . . 11 + 7. 
Security Considerations . . . . . . . . . . . . . . . . . . 12 + 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . 12 + 9. Internationalization Considerations . . . . . . . . . . . . 12 + 10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . 12 + 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 13 + 12. Editor's Address . . . . . . . . . . . . . . . . . . . . . 13 + 13. Full Copyright Statement . . . . . . . . . . . . . . . . . . 14 + + + + +Gellens Standards Track [Page 1] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + +1. Abstract + + Interoperability problems have been observed with erroneous labelling + of paragraph text as Text/Plain, and with various forms of + "embarrassing line wrap." (See section 3.) + + Attempts to deploy new media types, such as Text/Enriched [RICH] and + Text/HTML [HTML] have suffered from a lack of backwards compatibility + and an often hostile user reaction at the receiving end. + + What is required is a format which is in all significant ways + Text/Plain, and therefore is quite suitable for display as + Text/Plain, and yet allows the sender to express to the receiver + which lines can be considered a logical paragraph, and thus flowed + (wrapped and joined) as appropriate. + + This memo proposes a new parameter to be used with Text/Plain, and, + in the presence of this parameter, the use of trailing whitespace to + indicate flowed lines. This results in an encoding which appears as + normal Text/Plain in older implementations, since it is in fact + normal Text/Plain. + +2. Conventions Used in this Document + + The key words "REQUIRED", "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", + and "MAY" in this document are to be interpreted as described in "Key + words for use in RFCs to Indicate Requirement Levels" [KEYWORDS]. + +3. 
The Problem + + The Text/Plain media type is the lowest common denominator of + Internet email, with lines of no more than 997 characters (by + convention usually no more than 80), and where the CRLF sequence + represents a line break [MIME-IMT]. + + Text/Plain is usually displayed as preformatted text, often in a + fixed font. That is, the characters start at the left margin of the + display window, and advance to the right until a CRLF sequence is + seen, at which point a new line is started, again at the left margin. + When a line length exceeds the display window, some clients will wrap + the line, while others invoke a horizontal scroll bar. + + Text which meets this description is defined by this memo as "fixed". + + Some interoperability problems have been observed with this media + type: + + + + + +Gellens Standards Track [Page 2] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + +3.1. Paragraph Text + + Many modern programs use a proportional-spaced font and CRLF to + represent paragraph breaks. Line breaks are "soft", occurring as + needed on display. That is, characters are grouped into a paragraph + until a CRLF sequence is seen, at which point a new paragraph is + started. Each paragraph is displayed, starting at the left margin + (or paragraph indent), and continuing to the right until a word is + encountered which does not fit in the remaining display width. This + word is displayed at the left margin of the next line. This + continues until the paragraph ends (a CRLF is seen). Extra vertical + space is left between paragraphs. + + Text which meets this description is defined by this memo as + "flowed". + + Numerous software products erroneously label this media type as + Text/Plain, resulting in much user discomfort. + +3.2. Embarrassing Line Wrap + + As Text/Plain messages get quoted in replies or forwarded messages, + the length of each line gradually increases, resulting in + "embarrassing line wrap." 
This results in text which is at best hard + to read, and often confuses attributions. + + Example: + + >>>>>>This is a comment from the first message to show a + >quoting example. + >>>>>This is a comment from the second message to show a + >quoting example. + >>>>This is a comment from the third message. + >>>This is a comment from the fourth message. + + It can be confusing to assign attribution to lines 2 and 4 above. + + In addition, as devices with display widths smaller than 80 + characters become more popular, embarrassing line wrap has become + even more prevalent, even with unquoted text. + + + + + + + + + + + +Gellens Standards Track [Page 3] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + + Example: + + This is paragraph text that is + meant to be flowed across + several lines. + However, the sending mailer is + converting it to fixed text at + a width of 72 + characters, which causes it to + look like this when shown on a + PDA with only + 30 character lines. + +3.3. New Media Types + + Attempts to deploy new media types, such as Text/Enriched [RICH] and + Text/HTML [HTML] have suffered from a lack of backwards compatibility + and an often hostile user reaction at the receiving end. + + In particular, Text/Enriched requires that open angle brackets ("<") + and hard line breaks be doubled, with resulting user unhappiness when + viewed as Text/Plain. Text/HTML requires even more alteration of + text, with a corresponding increase in user complaints. + + A proposal to define a new media type to explicitly represent the + paragraph form suffered from a lack of interoperability with + currently deployed software. Some programs treat unknown subtypes of + Text as an attachment. 
+ + What is desired is a format which is in all significant ways + Text/Plain, and therefore is quite suitable for display as + Text/Plain, and yet allows the sender to express to the receiver + which lines can be considered a logical paragraph, and thus flowed + (wrapped and joined) as appropriate. + +4. The Format Parameter to the Text/Plain Media Type + + This document defines a new MIME parameter for use with Text/Plain: + + Name: Format + Value: Fixed, Flowed + + (Neither the parameter name nor its value are case sensitive.) + + If not specified, a value of Fixed is assumed. The semantics of the + Fixed value are the usual associated with Text/Plain [MIME-IMT]. + + + + + +Gellens Standards Track [Page 4] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + + A value of Flowed indicates that the definition of flowed text (as + specified in this memo) was used on generation, and MAY be used on + reception. + + This section discusses flowed text; section 5 provides a formal + definition. + + Because flowed lines are all-but-indistinguishable from fixed lines, + currently deployed software treats flowed lines as normal Text/Plain + (which is what they are). Thus, no interoperability problems are + expected. + + Note that this memo describes an on-the-wire format. It does not + address formats for local file storage. + +4.1. Generating Format=Flowed + + When generating Format=Flowed text, lines SHOULD be shorter than 80 + characters. As suggested values, any paragraph longer than 79 + characters in total length could be wrapped using lines of 72 or + fewer characters. While the specific line length used is a matter of + aesthetics and preference, longer lines are more likely to require + rewrapping and to encounter difficulties with older mailers. It has + been suggested that 66 character lines are the most readable. 
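As a rough sketch of the generation rules in section 4.1, the function below wraps one unquoted paragraph into Format=Flowed lines. The name `flow_paragraph` and the default width of 72 are choices made for this illustration, not something the memo prescribes. Every line except the last ends in a space, which is what marks it as flowed; words are never split, so a word longer than the width is sent as is; and lines that would start with a space, ">" or "From " are space-stuffed. Quoted text and signatures are outside the scope of the sketch.

```python
def flow_paragraph(text, width=72):
    """Wrap one unquoted paragraph into Format=Flowed lines (no CRLFs)."""
    lines = []
    current = ""  # words collected so far, each followed by one space
    for word in text.split():
        # The trailing space counts towards the line length.
        if current and len(current) + len(word) + 1 > width:
            lines.append(current)  # ends in a space: a flowed line
            current = ""
        current += word + " "
    # The last line of the paragraph is fixed: no trailing space.
    lines.append(current.rstrip(" "))

    # Space-stuff lines that would otherwise be misread on reception.
    stuffed = []
    for line in lines:
        if line.startswith((" ", ">", "From ")):
            line = " " + line
        stuffed.append(line)
    return stuffed


# Usage: join the lines with CRLF to obtain the on-the-wire body.
paragraph = "This is paragraph text that is meant to be flowed across several lines. " * 3
body = "\r\n".join(flow_paragraph(paragraph))
```

The width of 72 leaves room for the single stuffing space without exceeding the 79-character recommendation.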
+ + (The reason for the restriction to 79 or fewer characters between + CRLFs on the wire is to ensure that all lines, even when displayed by + a non-flowed-aware program, will fit in a standard 80-column screen + without having to be wrapped. The limit is 79, not 80, because while + 80 fit on a line, the last column is often reserved for a line-wrap + indicator.) + + When creating flowed text, the generating agent wraps, that is, + inserts 'soft' line breaks as needed. Soft line breaks are added + between words. Because a soft line break is a SP CRLF sequence, the + generating agent creates one by inserting a CRLF after the occurrence + of a space. + + A generating agent SHOULD NOT insert white space into a word (a + sequence of printable characters not containing spaces). If faced + with a word which exceeds 79 characters (but less than 998 + characters, the [SMTP] limit on line length), the agent SHOULD send + the word as is and exceed the 79-character limit on line length. + + + + + + + + +Gellens Standards Track [Page 5] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + + A generating agent SHOULD: + + 1. Ensure all lines (fixed and flowed) are 79 characters or + fewer in length, counting the trailing space but not + counting the CRLF, unless a word by itself exceeds 79 + characters. + 2. Trim spaces before user-inserted hard line breaks. + 3. Space-stuff lines which start with a space, "From ", or + ">". + + In order to create messages which do not require space-stuffing, and + are thus more aesthetically pleasing when viewed as Format=Fixed, a + generating agent MAY avoid wrapping immediately before ">", "From ", + or space. + + (See sections 4.4 and 4.5 for more information on space-stuffing and + quoting, respectively.) + + A Format=Flowed message consists of zero or more paragraphs, each + containing one or more flowed lines followed by one fixed line. 
The + usual case is a series of flowed text lines with blank (empty) fixed + lines between them. + + Any number of fixed lines can appear between paragraphs. + + [Quoted-Printable] encoding SHOULD NOT be used with Format=Flowed + unless absolutely necessary (for example, non-US-ASCII (8-bit) + characters over a strictly 7-bit transport such as unextended SMTP). + In particular, a message SHOULD NOT be encoded in Quoted-Printable + for the sole purpose of protecting the trailing space on flowed lines + unless the body part is cryptographically signed or encrypted (see + Section 4.6). + + The intent of Format=Flowed is to allow user agents to generate + flowed text which is non-obnoxious when viewed as pure, raw + Text/Plain (without any decoding); use of Quoted-Printable hinders + this and may cause Format=Flowed to be rejected by end users. + +4.2. Interpreting Format=Flowed + + If the first character of a line is a quote mark (">"), the line is + considered to be quoted (see section 4.5). Logically, all quote + marks are counted and deleted, resulting in a line with a non-zero + quote depth, and content. (The agent is of course free to display the + content with quote marks or excerpt bars or anything else.) + Logically, this test for quoted lines is done before any other tests + (that is, before checking for space-stuffed and flowed). + + + + +Gellens Standards Track [Page 6] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + + If the first character of a line is a space, the line has been + space-stuffed (see section 4.4). Logically, this leading space is + deleted before examining the line further (that is, before checking + for flowed). + + If the line ends in one or more spaces, the line is flowed. + Otherwise it is fixed. Trailing spaces are part of the line's + content, but the CRLF of a soft line break is not. 
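The three tests described above (quoted, space-stuffed, flowed) and their order can be sketched as follows. This is an illustration under stated assumptions, not code from the RFC: `ParsedLine` and `parse_flowed_line` are hypothetical names, the CRLF is assumed to have been removed already, and the "-- " signature separator special case of section 4.3 is left out.

```python
from typing import NamedTuple


class ParsedLine(NamedTuple):
    quote_depth: int  # number of leading ">" characters
    content: str      # text left after quote marks and any stuffed space
    flowed: bool      # True if the content ends in one or more spaces


def parse_flowed_line(line: str) -> ParsedLine:
    """Classify one received line (hypothetical helper, CRLF removed)."""
    # 1. Quote marks are counted and deleted before any other test.
    depth = len(line) - len(line.lstrip(">"))
    rest = line[depth:]
    # 2. A leading space at this point is a stuffed space; delete one.
    if rest.startswith(" "):
        rest = rest[1:]
    # 3. Trailing spaces mark the line as flowed and stay in the content.
    return ParsedLine(depth, rest, rest.endswith(" "))


if __name__ == "__main__":
    for raw in (">> quoted and fixed", "unquoted and flowed ", " From stuffed"):
        print(parse_flowed_line(raw))
```

Running the quote test first matters: a space after the quote marks is treated as stuffing, so it is never mistaken for content.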
+ + A series of one or more flowed lines followed by one fixed line is + considered a paragraph, and MAY be flowed (wrapped and unwrapped) as + appropriate on display and in the construction of new messages (see + section 4.5). + + A line consisting of one or more spaces (after deleting a stuffed + space) is considered a flowed line. + +4.3. Usenet Signature Convention + + There is a convention in Usenet news of using "-- " as the separator + line between the body and the signature of a message. When + generating a Format=Flowed message containing a Usenet-style + separator before the signature, the separator line is sent as-is. + This is a special case; an (optionally quoted) line consisting of + DASH DASH SP is not considered flowed. + +4.4. Space-Stuffing + + In order to allow for unquoted lines which start with ">", and to + protect against systems which "From-munge" in-transit messages + (modifying any line which starts with "From " to ">From "), + Format=Flowed provides for space-stuffing. + + Space-stuffing adds a single space to the start of any line which + needs protection when the message is generated. On reception, if the + first character of a line is a space, it is logically deleted. This + occurs after the test for a quoted line, and before the test for a + flowed line. + + On generation, any unquoted lines which start with ">", and any lines + which start with a space or "From " SHOULD be space-stuffed. Other + lines MAY be space-stuffed as desired. + + (Note that space-stuffing is similar to dot-stuffing as specified in + [SMTP].) + + + + + + +Gellens Standards Track [Page 7] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + + If a space-stuffed message is received by an agent which handles + Format=Flowed, the space-stuffing is reversed and thus the message + appears unchanged. 
An agent which is not aware of Format=Flowed will + of course not undo any space-stuffing, thus Format=Flowed messages + may appear with a leading space on some lines (those which start with + a space, ">" which is not a quote indicator, or "From "). Since + lines which require space-stuffing rarely occur, and the aesthetic + consequences of unreversed space-stuffing are minimal, this is not + expected to be a significant problem. + +4.5. Quoting + + In Format=Flowed, the canonical quote indicator (or quote mark) is + one or more close angle bracket (">") characters. Lines which start + with the quote indicator are considered quoted. The number of ">" + characters at the start of the line specifies the quote depth. + Flowed lines which are also quoted may require special handling on + display and when copied to new messages. + + When creating quoted flowed lines, each such line starts with the + quote indicator. + + Note that because of space-stuffing, the lines + >> Exit, Stage Left + and + >>Exit, Stage Left + are semantically identical; both have a quote-depth of two, and a + content of "Exit, Stage Left". + + However, the line + > > Exit, Stage Left + is different. It has a quote-depth of one, and a content of + "> Exit, Stage Left". + + When generating quoted flowed lines, an agent needs to pay attention + to changes in quote depth. A sequence of quoted lines of the same + quote depth SHOULD be encoded as a paragraph, with the last line + generated as fixed and prior lines generated as flowed. + + If a receiving agent wishes to reformat flowed quoted lines (joining + and/or wrapping them) on display or when generating new messages, the + lines SHOULD be de-quoted, reformatted, and then re-quoted. To + de-quote, the number of close angle brackets in the quote indicator + at the start of each line is counted. Consecutive lines with the + same quoting depth are considered one paragraph and are reformatted + together. 
To re-quote after reformatting, a quote indicator + containing the same number of close angle brackets originally present + is prefixed to each line. + + + +Gellens Standards Track [Page 8] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + + On reception, if a change in quoting depth occurs on a flowed line, + this is an improperly formatted message. The receiver SHOULD handle + this error by using the 'quote-depth-wins' rule, which is to ignore + the flowed indicator and treat the line as fixed. That is, the + change in quote depth ends the paragraph. + + For example, consider the following sequence of lines (using '*' to + indicate a soft line break, i.e., SP CRLF, and '#' to indicate a hard + line break, i.e., CRLF): + + > Thou villainous ill-breeding spongy dizzy-eyed* + > reeky elf-skinned pigeon-egg!* <--- problem ---< + >> Thou artless swag-bellied milk-livered* + >> dismal-dreaming idle-headed scut!# + >>> Thou errant folly-fallen spleeny reeling-ripe* + >>> unmuzzled ratsbane!# + >>>> Henceforth, the coding style is to be strictly* + >>>> enforced, including the use of only upper case.# + >>>>> I've noticed a lack of adherence to the coding* + >>>>> styles, of late.# + >>>>>> Any complaints?# + + The second line ends in a soft line break, even though it is the last + line of the one-deep quote block. The question then arises as to how + this line should be interpreted, considering that the next line is + the first line of the two-deep quote block. + + The example text above, when processed according to quote-depth wins, + results in the first two lines being considered as one quoted, flowed + section, with a quote depth of 1; the third and fourth lines become a + quoted, flowed section, with a quote depth of 2. + + A generating agent SHOULD NOT create this situation; a receiving + agent SHOULD handle it using quote-depth wins. + +4.6. 
Digital Signatures and Encryption + + If a message is digitally signed or encrypted it is important that + cryptographic processing use the on-the-wire Format=Flowed format. + That is, during generation the message SHOULD be prepared for + transmission, including addition of soft line breaks, space-stuffing, + and [Quoted-Printable] encoding (to protect soft line breaks) before + being digitally signed or encrypted; similarly, on receipt the + message SHOULD have the signature verified or be decrypted before + [Quoted-Printable] decoding and removal of stuffed spaces, soft line + breaks and quote marks, and reflowing. + + + + + +Gellens Standards Track [Page 9] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + +4.7. Line Analysis Table + + Lines contained in a Text/Plain body part with Format=Flowed can be + analyzed by examining the start and end of the line. If the line + starts with the quote indicator, it is quoted. If the line ends with + one or more space characters, it is flowed. This is summarized by + the following table: + + Starts Ends in + with One or Line + Quote More Spaces Type + ------ ----------- --------------- + no no unquoted, fixed + yes no quoted, fixed + no yes unquoted, flowed + yes yes quoted, flowed + +4.8. Examples + + The following example contains three paragraphs: + + `Take some more tea,' the March Hare said to Alice, very + earnestly. + + `I've had nothing yet,' Alice replied in an offended tone, `so I + can't take more.' + + `You mean you can't take LESS,' said the Hatter: `it's very easy + to take MORE than nothing.' 
+ + This could be encoded as follows (using '*' to indicate a soft line + break, that is, SP CRLF sequence, and '#' to indicate a hard line + break, that is, CRLF): + + `Take some more tea,' the March Hare said to Alice, very* + earnestly.* + # + `I've had nothing yet,' Alice replied in an offended tone, `so* + I can't take more.'* + # + `You mean you can't take LESS,' said the Hatter: `it's very* + easy to take MORE than nothing.'# + + + + + + + + + +Gellens Standards Track [Page 10] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + + To show an example of quoting, here we have the same exchange, + presented as a series of direct quotes: + + >>>Take some more tea.# + >>I've had nothing yet, so I can't take more.# + >You mean you can't take LESS, it's very easy to take* + >MORE than nothing.# + +5. ABNF + + The constructs used in Text/Plain; Format=Flowed body parts are + described using [ABNF], including the Core Rules: + + paragraph = 1*flowed-line fixed-line + fixed-line = fixed / sig-sep + fixed = [quote] [stuffing] *text-char non-sp CRLF + flowed-line = flow-qt / flow-unqt + flow-qt = quote [stuffing] *text-char 1*SP CRLF + flow-unqt = [stuffing] *text-char 1*SP CRLF + non-sp = %x01-09 / %x0B / %x0C / %x0E-1F / %x21-7F + ; any 7-bit US-ASCII character, excluding + ; NUL, CR, LF, and SP + quote = 1*">" + sig-sep = [quote] "--" SP CRLF + stuffing = [SP] ; space-stuffed, added on generation if + ; needed, deleted on reception + text-char = non-sp / SP + +6. Failure Modes + +6.1. Trailing White Space Corruption + + There are systems in existence which alter trailing whitespace on + messages which pass through them. Such systems may strip, or in + rarer cases, add trailing whitespace, in violation of RFC 821 [SMTP] + section 4.5.2. + + Stripping trailing whitespace has the effect of converting flowed + lines to fixed lines, which results in a message no worse than if + Format=Flowed had not been used. 
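The effect of stripped trailing whitespace can be illustrated with a small decoder. This is a sketch under stated assumptions, not a complete Format=Flowed implementation: `join_flowed` is a hypothetical name, the input lines are unquoted, and their CRLFs have already been removed.

```python
def join_flowed(lines: list[str]) -> list[str]:
    """Join unquoted Format=Flowed lines into paragraphs (hypothetical helper).

    A line ending in a space is flowed and continues on the next line;
    any other line is fixed and ends the paragraph.
    """
    paragraphs: list[str] = []
    buffer = ""
    for line in lines:
        if line == "-- ":
            # The signature separator is never treated as flowed.
            if buffer:
                paragraphs.append(buffer)
                buffer = ""
            paragraphs.append(line)
            continue
        if line.startswith(" "):
            line = line[1:]  # undo space-stuffing
        buffer += line
        if not line.endswith(" "):
            paragraphs.append(buffer)  # fixed line: the paragraph is complete
            buffer = ""
    if buffer:
        paragraphs.append(buffer)  # message ended on a flowed line
    return paragraphs


if __name__ == "__main__":
    wire = ["`Take some more tea,' the March Hare said to Alice, very ",
            "earnestly."]
    print(join_flowed(wire))                            # joined into one paragraph
    print(join_flowed([l.rstrip(" ") for l in wire]))   # lines stay separate
```

With the trailing spaces stripped, each line is seen as fixed, so the text is displayed exactly as it would be without Format=Flowed.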
+ + Adding trailing whitespace to a Format=Flowed message may result in a + malformed display or reply. + + Since most systems which add trailing white space do so to create a + line which fills an internal record format, the result is almost + always a line which contains an even number of characters (counting + the added trailing white space). + + + +Gellens Standards Track [Page 11] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + + One possible avoidance, therefore, would be to define Format=Flowed + lines to use either one or two trailing space characters to indicate + a flowed line, such that the total line length is odd. However, + considering the scarcity of such systems today, it is not worth the + added complexity. + +7. Security Considerations + + This parameter introduces no security considerations beyond those + which apply to Text/Plain. + + Section 4.6 discusses the interaction between Format=Flowed and + digital signatures or encryption. + +8. IANA Considerations + + IANA is requested to add a reference to this specification in the + Text/Plain Media Type registration. + +9. Internationalization Considerations + + The line wrap and quoting specifications of Format=Flowed may not be + suitable for certain charsets, such as for Arabic and Hebrew + characters that read from right to left. Care should be taken in + applying format=flowed in these cases, as format=fixed combined with + quoted-printable encoding may be more suitable. + +10. Acknowledgments + + This proposal evolved from a discussion of Chris Newman's + Text/Paragraph draft which took place on the IETF 822 mailing list. + Special thanks to Ian Bell, Steve Dorner, Brian Kelley, Dan Kohn, + Laurence Lundblade, and Dan Wing for their reviews, comments, + suggestions, and discussions. + +11. References + + [ABNF] Crocker, D. and P. Overell, "Augmented BNF for + Syntax Specifications: ABNF", RFC 2234, November + 1997. + + [KEYWORDS] S. 
Bradner, "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RICH] Resnick, P. and A. Walker, "The text/enriched MIME + Content-type", RFC 1896, February 1996. + + + + + +Gellens Standards Track [Page 12] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + + [MIME-IMT] Freed, N. and N. Borenstein, "Multipurpose + Internet Mail Extensions (MIME) Part Two: Media + Types", RFC 2046, November 1996. + + [Quoted-Printable] Freed, N. and N. Borenstein, "Multipurpose + Internet Mail Extensions (MIME) Part One: Format + of Internet Message Bodies", RFC 2045, November + 1996. + + [SMTP] Postel, J., "Simple Mail Transfer Protocol", STD + 10, RFC 821, August 1982. + + [HTML] Berners-Lee, T. and D. Connolly, "Hypertext Markup + Language -- 2.0", RFC 1866, November 1995. + + +12. Editor's Address + + Randall Gellens + QUALCOMM Incorporated + 5775 Morehouse Dr. + San Diego, CA 92121-2779 + USA + + Phone: +1 619 651 5115 + EMail: randy@qualcomm.com + + + + + + + + + + + + + + + + + + + + + + + + + +Gellens Standards Track [Page 13] + +RFC 2646 The Text/Plain Format Parameter August 1999 + + +13. Full Copyright Statement + + Copyright (C) The Internet Society (1999). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. 
However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Gellens Standards Track [Page 14] + diff --git a/rfc/rfc2821.txt b/rfc/rfc2821.txt @@ -0,0 +1,4427 @@ + + + + + + +Network Working Group J. Klensin, Editor +Request for Comments: 2821 AT&T Laboratories +Obsoletes: 821, 974, 1869 April 2001 +Updates: 1123 +Category: Standards Track + + + Simple Mail Transfer Protocol + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2001). All Rights Reserved. 
+ +Abstract + + This document is a self-contained specification of the basic protocol + for the Internet electronic mail transport. It consolidates, updates + and clarifies, but doesn't add new or change existing functionality + of the following: + + - the original SMTP (Simple Mail Transfer Protocol) specification of + RFC 821 [30], + + - domain name system requirements and implications for mail + transport from RFC 1035 [22] and RFC 974 [27], + + - the clarifications and applicability statements in RFC 1123 [2], + and + + - material drawn from the SMTP Extension mechanisms [19]. + + It obsoletes RFC 821, RFC 974, and updates RFC 1123 (replaces the + mail transport materials of RFC 1123). However, RFC 821 specifies + some features that were not in significant use in the Internet by the + mid-1990s and (in appendices) some additional transport models. + Those sections are omitted here in the interest of clarity and + brevity; readers needing them should refer to RFC 821. + + + + + + +Klensin Standards Track [Page 1] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + It also includes some additional material from RFC 1123 that required + amplification. This material has been identified in multiple ways, + mostly by tracking flaming on various lists and newsgroups and + problems of unusual readings or interpretations that have appeared as + the SMTP extensions have been deployed. Where this specification + moves beyond consolidation and actually differs from earlier + documents, it supersedes them technically as well as textually. + + Although SMTP was designed as a mail transport and delivery protocol, + this specification also contains information that is important to its + use as a 'mail submission' protocol, as recommended for POP [3, 26] + and IMAP [6]. Additional submission issues are discussed in RFC 2476 + [15]. + + Section 2.3 provides definitions of terms specific to this document. 
+ Except when the historical terminology is necessary for clarity, this + document uses the current 'client' and 'server' terminology to + identify the sending and receiving SMTP processes, respectively. + + A companion document [32] discusses message headers, message bodies + and formats and structures for them, and their relationship. + +Table of Contents + + 1. Introduction .................................................. 4 + 2. The SMTP Model ................................................ 5 + 2.1 Basic Structure .............................................. 5 + 2.2 The Extension Model .......................................... 7 + 2.2.1 Background ................................................. 7 + 2.2.2 Definition and Registration of Extensions .................. 8 + 2.3 Terminology .................................................. 9 + 2.3.1 Mail Objects ............................................... 10 + 2.3.2 Senders and Receivers ...................................... 10 + 2.3.3 Mail Agents and Message Stores ............................. 10 + 2.3.4 Host ....................................................... 11 + 2.3.5 Domain ..................................................... 11 + 2.3.6 Buffer and State Table ..................................... 11 + 2.3.7 Lines ...................................................... 12 + 2.3.8 Originator, Delivery, Relay, and Gateway Systems ........... 12 + 2.3.9 Message Content and Mail Data .............................. 13 + 2.3.10 Mailbox and Address ....................................... 13 + 2.3.11 Reply ..................................................... 13 + 2.4 General Syntax Principles and Transaction Model .............. 13 + 3. The SMTP Procedures: An Overview .............................. 15 + 3.1 Session Initiation ........................................... 15 + 3.2 Client Initiation ............................................ 
16 + 3.3 Mail Transactions ............................................ 16 + 3.4 Forwarding for Address Correction or Updating ................ 19 + + + +Klensin Standards Track [Page 2] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + 3.5 Commands for Debugging Addresses ............................. 20 + 3.5.1 Overview ................................................... 20 + 3.5.2 VRFY Normal Response ....................................... 22 + 3.5.3 Meaning of VRFY or EXPN Success Response ................... 22 + 3.5.4 Semantics and Applications of EXPN ......................... 23 + 3.6 Domains ...................................................... 23 + 3.7 Relaying ..................................................... 24 + 3.8 Mail Gatewaying .............................................. 25 + 3.8.1 Header Fields in Gatewaying ................................ 26 + 3.8.2 Received Lines in Gatewaying ............................... 26 + 3.8.3 Addresses in Gatewaying .................................... 26 + 3.8.4 Other Header Fields in Gatewaying .......................... 27 + 3.8.5 Envelopes in Gatewaying .................................... 27 + 3.9 Terminating Sessions and Connections ......................... 27 + 3.10 Mailing Lists and Aliases ................................... 28 + 3.10.1 Alias ..................................................... 28 + 3.10.2 List ...................................................... 28 + 4. The SMTP Specifications ....................................... 29 + 4.1 SMTP Commands ................................................ 29 + 4.1.1 Command Semantics and Syntax ............................... 29 + 4.1.1.1 Extended HELLO (EHLO) or HELLO (HELO) ................... 29 + 4.1.1.2 MAIL (MAIL) .............................................. 31 + 4.1.1.3 RECIPIENT (RCPT) ......................................... 31 + 4.1.1.4 DATA (DATA) .............................................. 
33 + 4.1.1.5 RESET (RSET) ............................................. 34 + 4.1.1.6 VERIFY (VRFY) ............................................ 35 + 4.1.1.7 EXPAND (EXPN) ............................................ 35 + 4.1.1.8 HELP (HELP) .............................................. 35 + 4.1.1.9 NOOP (NOOP) .............................................. 35 + 4.1.1.10 QUIT (QUIT) ............................................. 36 + 4.1.2 Command Argument Syntax .................................... 36 + 4.1.3 Address Literals ........................................... 38 + 4.1.4 Order of Commands .......................................... 39 + 4.1.5 Private-use Commands ....................................... 40 + 4.2 SMTP Replies ................................................ 40 + 4.2.1 Reply Code Severities and Theory ........................... 42 + 4.2.2 Reply Codes by Function Groups ............................. 44 + 4.2.3 Reply Codes in Numeric Order .............................. 45 + 4.2.4 Reply Code 502 ............................................. 46 + 4.2.5 Reply Codes After DATA and the Subsequent <CRLF>.<CRLF> .... 46 + 4.3 Sequencing of Commands and Replies ........................... 47 + 4.3.1 Sequencing Overview ........................................ 47 + 4.3.2 Command-Reply Sequences .................................... 48 + 4.4 Trace Information ............................................ 49 + 4.5 Additional Implementation Issues ............................. 53 + 4.5.1 Minimum Implementation ..................................... 53 + 4.5.2 Transparency ............................................... 53 + 4.5.3 Sizes and Timeouts ......................................... 54 + + + +Klensin Standards Track [Page 3] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + 4.5.3.1 Size limits and minimums ................................. 54 + 4.5.3.2 Timeouts ................................................. 
56 + 4.5.4 Retry Strategies ........................................... 57 + 4.5.4.1 Sending Strategy ......................................... 58 + 4.5.4.2 Receiving Strategy ....................................... 59 + 4.5.5 Messages with a null reverse-path .......................... 59 + 5. Address Resolution and Mail Handling .......................... 60 + 6. Problem Detection and Handling ................................ 62 + 6.1 Reliable Delivery and Replies by Email ....................... 62 + 6.2 Loop Detection ............................................... 63 + 6.3 Compensating for Irregularities .............................. 63 + 7. Security Considerations ....................................... 64 + 7.1 Mail Security and Spoofing ................................... 64 + 7.2 "Blind" Copies ............................................... 65 + 7.3 VRFY, EXPN, and Security ..................................... 65 + 7.4 Information Disclosure in Announcements ...................... 66 + 7.5 Information Disclosure in Trace Fields ....................... 66 + 7.6 Information Disclosure in Message Forwarding ................. 67 + 7.7 Scope of Operation of SMTP Servers ........................... 67 + 8. IANA Considerations ........................................... 67 + 9. References .................................................... 68 + 10. Editor's Address ............................................. 70 + 11. Acknowledgments .............................................. 70 + Appendices ....................................................... 71 + A. TCP Transport Service ......................................... 71 + B. Generating SMTP Commands from RFC 822 Headers ................. 71 + C. Source Routes ................................................. 72 + D. Scenarios ..................................................... 73 + E. Other Gateway Issues .......................................... 76 + F. 
Deprecated Features of RFC 821 ................................ 76 + Full Copyright Statement ......................................... 79 + +1. Introduction + + The objective of the Simple Mail Transfer Protocol (SMTP) is to + transfer mail reliably and efficiently. + + SMTP is independent of the particular transmission subsystem and + requires only a reliable ordered data stream channel. While this + document specifically discusses transport over TCP, other transports + are possible. Appendices to RFC 821 describe some of them. + + An important feature of SMTP is its capability to transport mail + across networks, usually referred to as "SMTP mail relaying" (see + section 3.8). A network consists of the mutually-TCP-accessible + hosts on the public Internet, the mutually-TCP-accessible hosts on a + firewall-isolated TCP/IP Intranet, or hosts in some other LAN or WAN + environment utilizing a non-TCP transport-level protocol. Using + + + +Klensin Standards Track [Page 4] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + SMTP, a process can transfer mail to another process on the same + network or to some other network via a relay or gateway process + accessible to both networks. + + In this way, a mail message may pass through a number of intermediate + relay or gateway hosts on its path from sender to ultimate recipient. + The Mail eXchanger mechanisms of the domain name system [22, 27] (and + section 5 of this document) are used to identify the appropriate + next-hop destination for a message being transported. + +2. 
The SMTP Model + +2.1 Basic Structure + + The SMTP design can be pictured as: + + +----------+ +----------+ + +------+ | | | | + | User |<-->| | SMTP | | + +------+ | Client- |Commands/Replies| Server- | + +------+ | SMTP |<-------------->| SMTP | +------+ + | File |<-->| | and Mail | |<-->| File | + |System| | | | | |System| + +------+ +----------+ +----------+ +------+ + SMTP client SMTP server + + When an SMTP client has a message to transmit, it establishes a two- + way transmission channel to an SMTP server. The responsibility of an + SMTP client is to transfer mail messages to one or more SMTP servers, + or report its failure to do so. + + The means by which a mail message is presented to an SMTP client, and + how that client determines the domain name(s) to which mail messages + are to be transferred is a local matter, and is not addressed by this + document. In some cases, the domain name(s) transferred to, or + determined by, an SMTP client will identify the final destination(s) + of the mail message. In other cases, common with SMTP clients + associated with implementations of the POP [3, 26] or IMAP [6] + protocols, or when the SMTP client is inside an isolated transport + service environment, the domain name determined will identify an + intermediate destination through which all mail messages are to be + relayed. SMTP clients that transfer all traffic, regardless of the + target domain names associated with the individual messages, or that + do not maintain queues for retrying message transmissions that + initially cannot be completed, may otherwise conform to this + specification but are not considered fully-capable. Fully-capable + SMTP implementations, including the relays used by these less capable + + + + +Klensin Standards Track [Page 5] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + ones, and their destinations, are expected to support all of the + queuing, retrying, and alternate address functions discussed in this + specification. 
+ + The means by which an SMTP client, once it has determined a target + domain name, determines the identity of an SMTP server to which a + copy of a message is to be transferred, and then performs that + transfer, is covered by this document. To effect a mail transfer to + an SMTP server, an SMTP client establishes a two-way transmission + channel to that SMTP server. An SMTP client determines the address + of an appropriate host running an SMTP server by resolving a + destination domain name to either an intermediate Mail eXchanger host + or a final target host. + + An SMTP server may be either the ultimate destination or an + intermediate "relay" (that is, it may assume the role of an SMTP + client after receiving the message) or "gateway" (that is, it may + transport the message further using some protocol other than SMTP). + SMTP commands are generated by the SMTP client and sent to the SMTP + server. SMTP replies are sent from the SMTP server to the SMTP + client in response to the commands. + + In other words, message transfer can occur in a single connection + between the original SMTP-sender and the final SMTP-recipient, or can + occur in a series of hops through intermediary systems. In either + case, a formal handoff of responsibility for the message occurs: the + protocol requires that a server accept responsibility for either + delivering a message or properly reporting the failure to do so. + + Once the transmission channel is established and initial handshaking + completed, the SMTP client normally initiates a mail transaction. + Such a transaction consists of a series of commands to specify the + originator and destination of the mail and transmission of the + message content (including any headers or other structure) itself. + When the same message is sent to multiple recipients, this protocol + encourages the transmission of only one copy of the data for all + recipients at the same destination (or intermediate relay) host. 
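The shape of a mail transaction can be sketched as the list of lines a client would send. This is a simplified illustration, not a working client: the MAIL, RCPT and DATA command names are those defined in section 4.1 of the RFC, while `mail_transaction` and the example addresses are hypothetical. Server replies are ignored here; a real client has to read the reply to each command before sending the next.

```python
def mail_transaction(sender: str, recipients: list[str], message: str) -> list[str]:
    """Return the client's lines for one transaction (hypothetical helper).

    Lines are returned without their terminating CRLF. Server replies,
    which a real client must wait for after each command, are not modelled.
    """
    if not recipients:
        raise ValueError("at least one recipient is required")
    lines = [f"MAIL FROM:<{sender}>"]                    # originator
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]  # one per recipient
    lines.append("DATA")                                 # content follows
    for body_line in message.splitlines():
        # Transparency: a line starting with "." gets an extra leading ".".
        lines.append("." + body_line if body_line.startswith(".") else body_line)
    lines.append(".")                                    # end of mail data
    return lines


if __name__ == "__main__":
    for line in mail_transaction(
        "alice@example.org",
        ["bob@example.com", "carol@example.com"],
        "Subject: tea\n\nTake some more tea.",
    ):
        print(line)
```

Both recipients are named before a single DATA command, so only one copy of the message content is transmitted.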
+ + The server responds to each command with a reply; replies may + indicate that the command was accepted, that additional commands are + expected, or that a temporary or permanent error condition exists. + Commands specifying the sender or recipients may include server- + permitted SMTP service extension requests as discussed in section + 2.2. The dialog is purposely lock-step, one-at-a-time, although this + can be modified by mutually-agreed extension requests such as command + pipelining [13]. + + + + + +Klensin Standards Track [Page 6] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + Once a given mail message has been transmitted, the client may either + request that the connection be shut down or may initiate other mail + transactions. In addition, an SMTP client may use a connection to an + SMTP server for ancillary services such as verification of email + addresses or retrieval of mailing list subscriber addresses. + + As suggested above, this protocol provides mechanisms for the + transmission of mail. This transmission normally occurs directly + from the sending user's host to the receiving user's host when the + two hosts are connected to the same transport service. When they are + not connected to the same transport service, transmission occurs via + one or more relay SMTP servers. An intermediate host that acts as + either an SMTP relay or as a gateway into some other transmission + environment is usually selected through the use of the domain name + service (DNS) Mail eXchanger mechanism. + + Usually, intermediate hosts are determined via the DNS MX record, not + by explicit "source" routing (see section 5 and appendices C and + F.2). 
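The reply categories mentioned above (accepted, more input expected, temporary error, permanent error) follow the first digit of the three-digit reply code. A minimal sketch of that mapping is given here; the function name and the category labels are assumptions made for this example, and only the 2yz through 5yz ranges are handled.

```python
def classify_reply(code):
    """Map a three-digit SMTP reply code to a coarse category.

    The category labels are illustrative names chosen for this sketch,
    not terms defined by the specification.
    """
    if not 200 <= code <= 599:
        raise ValueError(f"reply code outside the range handled here: {code}")
    categories = {
        2: "accepted",             # e.g. 250 OK
        3: "more input expected",  # e.g. 354 after DATA
        4: "temporary failure",    # the same command may succeed later
        5: "permanent failure",    # e.g. 550, 553, 554
    }
    return categories[code // 100]


for code in (250, 354, 450, 550):
    print(code, classify_reply(code))
```

A client would typically retry later on a temporary failure and report a permanent failure back to the originator.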
+ +2.2 The Extension Model + +2.2.1 Background + + In an effort that started in 1990, approximately a decade after RFC + 821 was completed, the protocol was modified with a "service + extensions" model that permits the client and server to agree to + utilize shared functionality beyond the original SMTP requirements. + The SMTP extension mechanism defines a means whereby an extended SMTP + client and server may recognize each other, and the server can inform + the client as to the service extensions that it supports. + + Contemporary SMTP implementations MUST support the basic extension + mechanisms. For instance, servers MUST support the EHLO command even + if they do not implement any specific extensions and clients SHOULD + preferentially utilize EHLO rather than HELO. (However, for + compatibility with older conforming implementations, SMTP clients and + servers MUST support the original HELO mechanisms as a fallback.) + Unless the different characteristics of HELO must be identified for + interoperability purposes, this document discusses only EHLO. + + SMTP is widely deployed and high-quality implementations have proven + to be very robust. However, the Internet community now considers + some services to be important that were not anticipated when the + protocol was first designed. If support for those services is to be + added, it must be done in a way that permits older implementations to + continue working acceptably. The extension framework consists of: + + + + +Klensin Standards Track [Page 7] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + - The SMTP command EHLO, superseding the earlier HELO, + + - a registry of SMTP service extensions, + + - additional parameters to the SMTP MAIL and RCPT commands, and + + - optional replacements for commands defined in this protocol, such + as for DATA in non-ASCII transmissions [33]. + + SMTP's strength comes primarily from its simplicity. 
Experience with + many protocols has shown that protocols with few options tend towards + ubiquity, whereas protocols with many options tend towards obscurity. + + Each and every extension, regardless of its benefits, must be + carefully scrutinized with respect to its implementation, deployment, + and interoperability costs. In many cases, the cost of extending the + SMTP service will likely outweigh the benefit. + +2.2.2 Definition and Registration of Extensions + + The IANA maintains a registry of SMTP service extensions. A + corresponding EHLO keyword value is associated with each extension. + Each service extension registered with the IANA must be defined in a + formal standards-track or IESG-approved experimental protocol + document. The definition must include: + + - the textual name of the SMTP service extension; + + - the EHLO keyword value associated with the extension; + + - the syntax and possible values of parameters associated with the + EHLO keyword value; + + - any additional SMTP verbs associated with the extension + (additional verbs will usually be, but are not required to be, the + same as the EHLO keyword value); + + - any new parameters the extension associates with the MAIL or RCPT + verbs; + + - a description of how support for the extension affects the + behavior of a server and client SMTP; and, + + - the increment by which the extension is increasing the maximum + length of the commands MAIL and/or RCPT, over that specified in + this standard. + + + + + +Klensin Standards Track [Page 8] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + In addition, any EHLO keyword value starting with an upper or lower + case "X" refers to a local SMTP service extension used exclusively + through bilateral agreement. Keywords beginning with "X" MUST NOT be + used in a registered service extension. 
Conversely, keyword values + presented in the EHLO response that do not begin with "X" MUST + correspond to a standard, standards-track, or IESG-approved + experimental SMTP service extension registered with IANA. A + conforming server MUST NOT offer non-"X"-prefixed keyword values that + are not described in a registered extension. + + Additional verbs and parameter names are bound by the same rules as + EHLO keywords; specifically, verbs beginning with "X" are local + extensions that may not be registered or standardized. Conversely, + verbs not beginning with "X" must always be registered. + +2.3 Terminology + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described below. + + 1. MUST This word, or the terms "REQUIRED" or "SHALL", mean that + the definition is an absolute requirement of the specification. + + 2. MUST NOT This phrase, or the phrase "SHALL NOT", mean that the + definition is an absolute prohibition of the specification. + + 3. SHOULD This word, or the adjective "RECOMMENDED", mean that + there may exist valid reasons in particular circumstances to + ignore a particular item, but the full implications must be + understood and carefully weighed before choosing a different + course. + + 4. SHOULD NOT This phrase, or the phrase "NOT RECOMMENDED" mean + that there may exist valid reasons in particular circumstances + when the particular behavior is acceptable or even useful, but the + full implications should be understood and the case carefully + weighed before implementing any behavior described with this + label. + + 5. MAY This word, or the adjective "OPTIONAL", mean that an item is + truly optional. One vendor may choose to include the item because + a particular marketplace requires it or because the vendor feels + that it enhances the product while another vendor may omit the + same item. 
An implementation which does not include a particular + option MUST be prepared to interoperate with another + implementation which does include the option, though perhaps with + reduced functionality. In the same vein an implementation which + + + +Klensin Standards Track [Page 9] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + does include a particular option MUST be prepared to interoperate + with another implementation which does not include the option + (except, of course, for the feature the option provides.) + +2.3.1 Mail Objects + + SMTP transports a mail object. A mail object contains an envelope + and content. + + The SMTP envelope is sent as a series of SMTP protocol units + (described in section 3). It consists of an originator address (to + which error reports should be directed); one or more recipient + addresses; and optional protocol extension material. Historically, + variations on the recipient address specification command (RCPT TO) + could be used to specify alternate delivery modes, such as immediate + display; those variations have now been deprecated (see appendix F, + section F.6). + + The SMTP content is sent in the SMTP DATA protocol unit and has two + parts: the headers and the body. If the content conforms to other + contemporary standards, the headers form a collection of field/value + pairs structured as in the message format specification [32]; the + body, if structured, is defined according to MIME [12]. The content + is textual in nature, expressed using the US-ASCII repertoire [1]. + Although SMTP extensions (such as "8BITMIME" [20]) may relax this + restriction for the content body, the content headers are always + encoded using the US-ASCII repertoire. A MIME extension [23] defines + an algorithm for representing header values outside the US-ASCII + repertoire, while still encoding them using the US-ASCII repertoire. 
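The split between envelope and content described in section 2.3.1 can be pictured as a small data structure. The sketch below is an assumption-laden illustration: the class and field names are invented for this example, and the only rule it checks is that there is at least one recipient and that header fields stay within US-ASCII.

```python
from dataclasses import dataclass, field


@dataclass
class Envelope:
    originator: str        # address to which error reports are directed
    recipients: list       # one or more recipient addresses
    extensions: dict = field(default_factory=dict)  # optional extension material


@dataclass
class Content:
    headers: list          # (field, value) pairs, US-ASCII only
    body: str = ""


@dataclass
class MailObject:
    envelope: Envelope
    content: Content

    def validate(self):
        """Check the two constraints this sketch models; return True if met."""
        if not self.envelope.recipients:
            raise ValueError("envelope needs at least one recipient")
        for name, value in self.content.headers:
            if not (name.isascii() and value.isascii()):
                raise ValueError(f"header {name!r} is not US-ASCII")
        return True


mail = MailObject(
    Envelope("alice@example.com", ["bob@example.org"]),
    Content([("Subject", "Hello")], "Hi Bob.\r\n"),
)
print(mail.validate())
```

Keeping the envelope separate from the headers reflects that delivery is driven by the envelope addresses, not by the To or Cc fields inside the content.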
+ +2.3.2 Senders and Receivers + + In RFC 821, the two hosts participating in an SMTP transaction were + described as the "SMTP-sender" and "SMTP-receiver". This document + has been changed to reflect current industry terminology and hence + refers to them as the "SMTP client" (or sometimes just "the client") + and "SMTP server" (or just "the server"), respectively. Since a + given host may act both as server and client in a relay situation, + "receiver" and "sender" terminology is still used where needed for + clarity. + +2.3.3 Mail Agents and Message Stores + + Additional mail system terminology became common after RFC 821 was + published and, where convenient, is used in this specification. In + particular, SMTP servers and clients provide a mail transport service + and therefore act as "Mail Transfer Agents" (MTAs). "Mail User + Agents" (MUAs or UAs) are normally thought of as the sources and + + + +Klensin Standards Track [Page 10] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + targets of mail. At the source, an MUA might collect mail to be + transmitted from a user and hand it off to an MTA; the final + ("delivery") MTA would be thought of as handing the mail off to an + MUA (or at least transferring responsibility to it, e.g., by + depositing the message in a "message store"). However, while these + terms are used with at least the appearance of great precision in + other environments, the implied boundaries between MUAs and MTAs + often do not accurately match common, and conforming, practices with + Internet mail. Hence, the reader should be cautious about inferring + the strong relationships and responsibilities that might be implied + if these terms were used elsewhere. + +2.3.4 Host + + For the purposes of this specification, a host is a computer system + attached to the Internet (or, in some cases, to a private TCP/IP + network) and supporting the SMTP protocol. 
Hosts are known by names + (see "domain"); identifying them by numerical address is discouraged. + +2.3.5 Domain + + A domain (or domain name) consists of one or more dot-separated + components. These components ("labels" in DNS terminology [22]) are + restricted for SMTP purposes to consist of a sequence of letters, + digits, and hyphens drawn from the ASCII character set [1]. Domain + names are used as names of hosts and of other entities in the domain + name hierarchy. For example, a domain may refer to an alias (label + of a CNAME RR) or the label of Mail eXchanger records to be used to + deliver mail instead of representing a host name. See [22] and + section 5 of this specification. + + The domain name, as described in this document and in [22], is the + entire, fully-qualified name (often referred to as an "FQDN"). A + domain name that is not in FQDN form is no more than a local alias. + Local aliases MUST NOT appear in any SMTP transaction. + +2.3.6 Buffer and State Table + + SMTP sessions are stateful, with both parties carefully maintaining a + common view of the current state. In this document we model this + state by a virtual "buffer" and a "state table" on the server which + may be used by the client to, for example, "clear the buffer" or + "reset the state table," causing the information in the buffer to be + discarded and the state to be returned to some previous state. + + + + + + + +Klensin Standards Track [Page 11] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +2.3.7 Lines + + SMTP commands and, unless altered by a service extension, message + data, are transmitted in "lines". Lines consist of zero or more data + characters terminated by the sequence ASCII character "CR" (hex value + 0D) followed immediately by ASCII character "LF" (hex value 0A). + This termination sequence is denoted as <CRLF> in this document. 
+ Conforming implementations MUST NOT recognize or generate any other + character or character sequence as a line terminator. Limits MAY be + imposed on line lengths by servers (see section 4.5.3). + + In addition, the appearance of "bare" "CR" or "LF" characters in text + (i.e., either without the other) has a long history of causing + problems in mail implementations and applications that use the mail + system as a tool. SMTP client implementations MUST NOT transmit + these characters except when they are intended as line terminators + and then MUST, as indicated above, transmit them only as a <CRLF> + sequence. + +2.3.8 Originator, Delivery, Relay, and Gateway Systems + + This specification makes a distinction among four types of SMTP + systems, based on the role those systems play in transmitting + electronic mail. An "originating" system (sometimes called an SMTP + originator) introduces mail into the Internet or, more generally, + into a transport service environment. A "delivery" SMTP system is + one that receives mail from a transport service environment and + passes it to a mail user agent or deposits it in a message store + which a mail user agent is expected to subsequently access. A + "relay" SMTP system (usually referred to just as a "relay") receives + mail from an SMTP client and transmits it, without modification to + the message data other than adding trace information, to another SMTP + server for further relaying or for delivery. + + A "gateway" SMTP system (usually referred to just as a "gateway") + receives mail from a client system in one transport environment and + transmits it to a server system in another transport environment. + Differences in protocols or message semantics between the transport + environments on either side of a gateway may require that the gateway + system perform transformations to the message that are not permitted + to SMTP relay systems. 
For the purposes of this specification, + firewalls that rewrite addresses should be considered as gateways, + even if SMTP is used on both sides of them (see [11]). + + + + + + + + +Klensin Standards Track [Page 12] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +2.3.9 Message Content and Mail Data + + The terms "message content" and "mail data" are used interchangeably + in this document to describe the material transmitted after the DATA + command is accepted and before the end of data indication is + transmitted. Message content includes message headers and the + possibly-structured message body. The MIME specification [12] + provides the standard mechanisms for structured message bodies. + +2.3.10 Mailbox and Address + + As used in this specification, an "address" is a character string + that identifies a user to whom mail will be sent or a location into + which mail will be deposited. The term "mailbox" refers to that + depository. The two terms are typically used interchangeably unless + the distinction between the location in which mail is placed (the + mailbox) and a reference to it (the address) is important. An + address normally consists of user and domain specifications. The + standard mailbox naming convention is defined to be "local- + part@domain": contemporary usage permits a much broader set of + applications than simple "user names". Consequently, and due to a + long history of problems when intermediate hosts have attempted to + optimize transport by modifying them, the local-part MUST be + interpreted and assigned semantics only by the host specified in the + domain part of the address. + +2.3.11 Reply + + An SMTP reply is an acknowledgment (positive or negative) sent from + receiver to sender via the transmission channel in response to a + command. The general form of a reply is a numeric completion code + (indicating failure or success) usually followed by a text string. 
+ The codes are for use by programs and the text is usually intended + for human users. Recent work [34] has specified further structuring + of the reply strings, including the use of supplemental and more + specific completion codes. + +2.4 General Syntax Principles and Transaction Model + + SMTP commands and replies have a rigid syntax. All commands begin + with a command verb. All replies begin with a three-digit numeric + code. In some commands and replies, arguments MUST follow the verb + or reply code. Some commands do not accept arguments (after the + verb), and some reply codes are followed, sometimes optionally, by + free form text. In both cases, where text appears, it is separated + from the verb or reply code by a space character. Complete + definitions of commands and replies appear in section 4. + + Verbs and argument values (e.g., "TO:" or "to:" in the RCPT command + and extension name keywords) are not case sensitive, with the sole + exception in this specification of a mailbox local-part (SMTP + Extensions may explicitly specify case-sensitive elements). That is, + a command verb, an argument value other than a mailbox local-part, + and free form text MAY be encoded in upper case, lower case, or any + mixture of upper and lower case with no impact on its meaning. This + is NOT true of a mailbox local-part. The local-part of a mailbox + MUST be treated as case sensitive. Therefore, SMTP implementations + MUST take care to preserve the case of mailbox local-parts. Mailbox + domains are not case sensitive. In particular, for some hosts the + user "smith" is different from the user "Smith". However, exploiting + the case sensitivity of mailbox local-parts impedes interoperability + and is discouraged. + + A few SMTP servers, in violation of this specification (and RFC 821), + require that command verbs be encoded by clients in upper case. 
+ Implementations MAY wish to employ this encoding to accommodate those + servers. + + The argument field consists of a variable length character string + ending with the end of the line, i.e., with the character sequence + <CRLF>. The receiver will take no action until this sequence is + received. + + The syntax for each command is shown with the discussion of that + command. Common elements and parameters are shown in section 4.1.2. + + Commands and replies are composed of characters from the ASCII + character set [1]. When the transport service provides an 8-bit byte + (octet) transmission channel, each 7-bit character is transmitted + right justified in an octet with the high order bit cleared to zero. + More specifically, the unextended SMTP service provides seven bit + transport only. An originating SMTP client which has not + successfully negotiated an appropriate extension with a particular + server MUST NOT transmit messages with information in the high-order + bit of octets. If such messages are transmitted in violation of this + rule, receiving SMTP servers MAY clear the high-order bit or reject + the message as invalid. In general, a relay SMTP SHOULD assume that + the message content it has received is valid and, assuming that the + envelope permits doing so, relay it without inspecting that content. + Of course, if the content is mislabeled and the data path cannot + accept the actual content, this may result in ultimate delivery of a + severely garbled message to the recipient. Delivery SMTP systems MAY + reject ("bounce") such messages rather than deliver them. No sending + SMTP system is permitted to send envelope commands in any character + + + + + +Klensin Standards Track [Page 14] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + set other than US-ASCII; receiving systems SHOULD reject such + commands, normally using "500 syntax error - invalid character" + replies. 
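The line rules of section 2.3.7 and the seven-bit rule above lend themselves to a small pre-transmission check. The following is a sketch under stated assumptions: both function names are hypothetical, and the check covers only the three properties named in the comments (CRLF termination, no stray CR or LF, no octet with the high-order bit set).

```python
def check_command_line(line: bytes) -> None:
    """Raise ValueError if a command line breaks the rules sketched here."""
    if not line.endswith(b"\r\n"):
        raise ValueError("line must end with <CRLF>")
    payload = line[:-2]
    # Any CR or LF before the terminator is either "bare" or an
    # embedded second line; both are rejected by this sketch.
    if b"\r" in payload or b"\n" in payload:
        raise ValueError("CR or LF inside the line")
    if any(octet & 0x80 for octet in payload):
        raise ValueError("high-order bit set; commands must be US-ASCII")


def clear_high_bits(data: bytes) -> bytes:
    """Clear the high-order bit of every octet, as a receiving server MAY do."""
    return bytes(octet & 0x7F for octet in data)


check_command_line(b"EHLO client.example.com\r\n")  # passes silently
print(clear_high_bits(b"\xc1BC"))
```

Whether a receiving system clears the bit or rejects the message is a policy choice; the sketch only shows what each option amounts to at the octet level.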
+ + Eight-bit message content transmission MAY be requested of the server + by a client using extended SMTP facilities, notably the "8BITMIME" + extension [20]. 8BITMIME SHOULD be supported by SMTP servers. + However, it MUST NOT be construed as authorization to transmit + unrestricted eight-bit material. 8BITMIME MUST NOT be requested by + senders for material with the high bit on that is not in MIME format + with an appropriate content-transfer encoding; servers MAY reject + such messages. + + The metalinguistic notation used in this document corresponds to the + "Augmented BNF" used in other Internet mail system documents. The + reader who is not familiar with that syntax should consult the ABNF + specification [8]. Metalanguage terms used in running text are + surrounded by pointed brackets (e.g., <CRLF>) for clarity. + +3. The SMTP Procedures: An Overview + + This section contains descriptions of the procedures used in SMTP: + session initiation, the mail transaction, forwarding mail, verifying + mailbox names and expanding mailing lists, and the opening and + closing exchanges. Comments on relaying, a note on mail domains, and + a discussion of changing roles are included at the end of this + section. Several complete scenarios are presented in appendix D. + +3.1 Session Initiation + + An SMTP session is initiated when a client opens a connection to a + server and the server responds with an opening message. + + SMTP server implementations MAY include identification of their + software and version information in the connection greeting reply + after the 220 code, a practice that permits more efficient isolation + and repair of any problems. Implementations MAY make provision for + SMTP servers to disable the software and version announcement where + it causes security concerns. While some systems also identify their + contact point for mail problems, this is not a substitute for + maintaining the required "postmaster" address (see section 4.5.1). 
+ + The SMTP protocol allows a server to formally reject a transaction + while still allowing the initial connection as follows: a 554 + response MAY be given in the initial connection opening message + instead of the 220. A server taking this approach MUST still wait + for the client to send a QUIT (see section 4.1.1.10) before closing + the connection and SHOULD respond to any intervening commands with + + + +Klensin Standards Track [Page 15] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + "503 bad sequence of commands". Since an attempt to make an SMTP + connection to such a system is probably in error, a server returning + a 554 response on connection opening SHOULD provide enough + information in the reply text to facilitate debugging of the sending + system. + +3.2 Client Initiation + + Once the server has sent the welcoming message and the client has + received it, the client normally sends the EHLO command to the + server, indicating the client's identity. In addition to opening the + session, use of EHLO indicates that the client is able to process + service extensions and requests that the server provide a list of the + extensions it supports. Older SMTP systems which are unable to + support service extensions and contemporary clients which do not + require service extensions in the mail session being initiated, MAY + use HELO instead of EHLO. Servers MUST NOT return the extended + EHLO-style response to a HELO command. For a particular connection + attempt, if the server returns a "command not recognized" response to + EHLO, the client SHOULD be able to fall back and send HELO. + + In the EHLO command the host sending the command identifies itself; + the command may be interpreted as saying "Hello, I am <domain>" (and, + in the case of EHLO, "and I support service extension requests"). + +3.3 Mail Transactions + + There are three steps to SMTP mail transactions. 
The transaction + starts with a MAIL command which gives the sender identification. + (In general, the MAIL command may be sent only when no mail + transaction is in progress; see section 4.1.4.) A series of one or + more RCPT commands follows giving the receiver information. Then a + DATA command initiates transfer of the mail data and is terminated by + the "end of mail" data indicator, which also confirms the + transaction. + + The first step in the procedure is the MAIL command. + + MAIL FROM:<reverse-path> [SP <mail-parameters> ] <CRLF> + + This command tells the SMTP-receiver that a new mail transaction is + starting and to reset all its state tables and buffers, including any + recipients or mail data. The <reverse-path> portion of the first or + only argument contains the source mailbox (between "<" and ">" + brackets), which can be used to report errors (see section 4.2 for a + discussion of error reporting). If accepted, the SMTP server returns + a 250 OK reply. If the mailbox specification is not acceptable for + some reason, the server MUST return a reply indicating whether the + failure is permanent (i.e., will occur again if the client tries to + send the same address again) or temporary (i.e., the address might be + accepted if the client tries again later). Despite the apparent + scope of this requirement, there are circumstances in which the + acceptability of the reverse-path may not be determined until one or + more forward-paths (in RCPT commands) can be examined. In those + cases, the server MAY reasonably accept the reverse-path (with a 250 + reply) and then report problems after the forward-paths are received + and examined. Normally, failures produce 550 or 553 replies. + + Historically, the <reverse-path> can contain more than just a + mailbox; however, contemporary systems SHOULD NOT use source routing + (see appendix C). 
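The MAIL command syntax shown above can be assembled mechanically. The sketch below is illustrative only: `mail_command` is a hypothetical helper, it performs no address validation, and the `BODY=8BITMIME` parameter in the usage lines comes from the 8BITMIME extension and would only be legitimate after the server has advertised that extension.

```python
def mail_command(reverse_path: str, parameters=()) -> bytes:
    """Build one MAIL command line.

    Follows the shape MAIL FROM:<reverse-path> [SP <mail-parameters>] <CRLF>.
    Encoding as ASCII makes the call fail loudly (UnicodeEncodeError) if a
    non-ASCII character slips into the envelope.
    """
    line = f"MAIL FROM:<{reverse_path}>"
    for parameter in parameters:
        line += " " + parameter
    return line.encode("ascii") + b"\r\n"


print(mail_command("alice@example.com"))
print(mail_command("alice@example.com", ["BODY=8BITMIME"]))
```

The verb could equally be written in lower case, since command verbs are not case sensitive, but the reverse-path is passed through untouched so that the local-part keeps its case.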
+ + The optional <mail-parameters> are associated with negotiated SMTP + service extensions (see section 2.2). + + The second step in the procedure is the RCPT command. + + RCPT TO:<forward-path> [ SP <rcpt-parameters> ] <CRLF> + + The first or only argument to this command includes a forward-path + (normally a mailbox and domain, always surrounded by "<" and ">" + brackets) identifying one recipient. If accepted, the SMTP server + returns a 250 OK reply and stores the forward-path. If the recipient + is known not to be a deliverable address, the SMTP server returns a + 550 reply, typically with a string such as "no such user - " and the + mailbox name (other circumstances and reply codes are possible). + This step of the procedure can be repeated any number of times. + + The <forward-path> can contain more than just a mailbox. + Historically, the <forward-path> can be a source routing list of + hosts and the destination mailbox, however, contemporary SMTP clients + SHOULD NOT utilize source routes (see appendix C). Servers MUST be + prepared to encounter a list of source routes in the forward path, + but SHOULD ignore the routes or MAY decline to support the relaying + they imply. Similarly, servers MAY decline to accept mail that is + destined for other hosts or systems. These restrictions make a + server useless as a relay for clients that do not support full SMTP + functionality. Consequently, restricted-capability clients MUST NOT + assume that any SMTP server on the Internet can be used as their mail + processing (relaying) site. If a RCPT command appears without a + previous MAIL command, the server MUST return a 503 "Bad sequence of + commands" response. The optional <rcpt-parameters> are associated + with negotiated SMTP service extensions (see section 2.2). + + The third step in the procedure is the DATA command (or some + alternative specified in a service extension). 
+ + + +Klensin Standards Track [Page 17] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + DATA <CRLF> + + If accepted, the SMTP server returns a 354 Intermediate reply and + considers all succeeding lines up to but not including the end of + mail data indicator to be the message text. When the end of text is + successfully received and stored the SMTP-receiver sends a 250 OK + reply. + + Since the mail data is sent on the transmission channel, the end of + mail data must be indicated so that the command and reply dialog can + be resumed. SMTP indicates the end of the mail data by sending a + line containing only a "." (period or full stop). A transparency + procedure is used to prevent this from interfering with the user's + text (see section 4.5.2). + + The end of mail data indicator also confirms the mail transaction and + tells the SMTP server to now process the stored recipients and mail + data. If accepted, the SMTP server returns a 250 OK reply. The DATA + command can fail at only two points in the protocol exchange: + + - If there was no MAIL, or no RCPT, command, or all such commands + were rejected, the server MAY return a "command out of sequence" + (503) or "no valid recipients" (554) reply in response to the DATA + command. If one of those replies (or any other 5yz reply) is + received, the client MUST NOT send the message data; more + generally, message data MUST NOT be sent unless a 354 reply is + received. + + - If the verb is initially accepted and the 354 reply issued, the + DATA command should fail only if the mail transaction was + incomplete (for example, no recipients), or if resources were + unavailable (including, of course, the server unexpectedly + becoming unavailable), or if the server determines that the + message should be rejected for policy or other reasons. + + However, in practice, some servers do not perform recipient + verification until after the message text is received. 
These servers + SHOULD treat a failure for one or more recipients as a "subsequent + failure" and return a mail message as discussed in section 6. Using + a "550 mailbox not found" (or equivalent) reply code after the data + are accepted makes it difficult or impossible for the client to + determine which recipients failed. + + When RFC 822 format [7, 32] is being used, the mail data include the + memo header items such as Date, Subject, To, Cc, From. Server SMTP + systems SHOULD NOT reject messages based on perceived defects in the + RFC 822 or MIME [12] message header or message body. In particular, + + + + +Klensin Standards Track [Page 18] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + they MUST NOT reject messages in which the numbers of Resent-fields + do not match or Resent-to appears without Resent-from and/or Resent- + date. + + Mail transaction commands MUST be used in the order discussed above. + +3.4 Forwarding for Address Correction or Updating + + Forwarding support is most often required to consolidate and simplify + addresses within, or relative to, some enterprise and less frequently + to establish addresses to link a person's prior address with current + one. Silent forwarding of messages (without server notification to + the sender), for security or non-disclosure purposes, is common in + the contemporary Internet. + + In both the enterprise and the "new address" cases, information + hiding (and sometimes security) considerations argue against exposure + of the "final" address through the SMTP protocol as a side-effect of + the forwarding activity. This may be especially important when the + final address may not even be reachable by the sender. Consequently, + the "forwarding" mechanisms described in section 3.2 of RFC 821, and + especially the 251 (corrected destination) and 551 reply codes from + RCPT must be evaluated carefully by implementers and, when they are + available, by those configuring systems. 
+ + In particular: + + * Servers MAY forward messages when they are aware of an address + change. When they do so, they MAY either provide address-updating + information with a 251 code, or may forward "silently" and return + a 250 code. But, if a 251 code is used, they MUST NOT assume that + the client will actually update address information or even return + that information to the user. + + Alternately, + + * Servers MAY reject or bounce messages when they are not + deliverable when addressed. When they do so, they MAY either + provide address-updating information with a 551 code, or may + reject the message as undeliverable with a 550 code and no + address-specific information. But, if a 551 code is used, they + MUST NOT assume that the client will actually update address + information or even return that information to the user. + + SMTP server implementations that support the 251 and/or 551 reply + codes are strongly encouraged to provide configuration mechanisms so + that sites which conclude that they would undesirably disclose + information can disable or restrict their use. + + + +Klensin Standards Track [Page 19] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +3.5 Commands for Debugging Addresses + +3.5.1 Overview + + SMTP provides commands to verify a user name or obtain the content of + a mailing list. This is done with the VRFY and EXPN commands, which + have character string arguments. Implementations SHOULD support VRFY + and EXPN (however, see section 3.5.2 and 7.3). + + For the VRFY command, the string is a user name or a user name and + domain (see below). If a normal (i.e., 250) response is returned, + the response MAY include the full name of the user and MUST include + the mailbox of the user. 
It MUST be in either of the following + forms: + + User Name <local-part@domain> + local-part@domain + + When a name that is the argument to VRFY could identify more than one + mailbox, the server MAY either note the ambiguity or identify the + alternatives. In other words, any of the following are legitimate + response to VRFY: + + 553 User ambiguous + + or + + 553- Ambiguous; Possibilities are + 553-Joe Smith <jsmith@foo.com> + 553-Harry Smith <hsmith@foo.com> + 553 Melvin Smith <dweep@foo.com> + + or + + 553-Ambiguous; Possibilities + 553- <jsmith@foo.com> + 553- <hsmith@foo.com> + 553 <dweep@foo.com> + + Under normal circumstances, a client receiving a 553 reply would be + expected to expose the result to the user. Use of exactly the forms + given, and the "user ambiguous" or "ambiguous" keywords, possibly + supplemented by extended reply codes such as those described in [34], + will facilitate automated translation into other languages as needed. + Of course, a client that was highly automated or that was operating + in another language than English, might choose to try to translate + the response, to return some other indication to the user than the + + + + +Klensin Standards Track [Page 20] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + literal text of the reply, or to take some automated action such as + consulting a directory service for additional information before + reporting to the user. + + For the EXPN command, the string identifies a mailing list, and the + successful (i.e., 250) multiline response MAY include the full name + of the users and MUST give the mailboxes on the mailing list. + + In some hosts the distinction between a mailing list and an alias for + a single mailbox is a bit fuzzy, since a common data structure may + hold both types of entries, and it is possible to have mailing lists + containing only one mailbox. 
If a request is made to apply VRFY to a + mailing list, a positive response MAY be given if a message so + addressed would be delivered to everyone on the list, otherwise an + error SHOULD be reported (e.g., "550 That is a mailing list, not a + user" or "252 Unable to verify members of mailing list"). If a + request is made to expand a user name, the server MAY return a + positive response consisting of a list containing one name, or an + error MAY be reported (e.g., "550 That is a user name, not a mailing + list"). + + In the case of a successful multiline reply (normal for EXPN) exactly + one mailbox is to be specified on each line of the reply. The case + of an ambiguous request is discussed above. + + "User name" is a fuzzy term and has been used deliberately. An + implementation of the VRFY or EXPN commands MUST include at least + recognition of local mailboxes as "user names". However, since + current Internet practice often results in a single host handling + mail for multiple domains, hosts, especially hosts that provide this + functionality, SHOULD accept the "local-part@domain" form as a "user + name"; hosts MAY also choose to recognize other strings as "user + names". + + The case of expanding a mailbox list requires a multiline reply, such + as: + + C: EXPN Example-People + S: 250-Jon Postel <Postel@isi.edu> + S: 250-Fred Fonebone <Fonebone@physics.foo-u.edu> + S: 250 Sam Q. Smith <SQSmith@specific.generic.com> + + or + + C: EXPN Executive-Washroom-List + S: 550 Access Denied to You. + + + + + +Klensin Standards Track [Page 21] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + The character string arguments of the VRFY and EXPN commands cannot + be further restricted due to the variety of implementations of the + user name and mailbox list concepts. 
On some systems it may be + appropriate for the argument of the EXPN command to be a file name + for a file containing a mailing list, but again there is a variety + of file naming conventions in the Internet. Similarly, historical + variations in what is returned by these commands are such that the + response SHOULD be interpreted very carefully, if at all, and SHOULD + generally only be used for diagnostic purposes. + +3.5.2 VRFY Normal Response + + When normal (2yz or 551) responses are returned from a VRFY or EXPN + request, the reply normally includes the mailbox name; that is, + "<local-part@domain>", where "domain" is a fully qualified domain + name, MUST appear in the syntax. In circumstances exceptional enough + to justify violating the intent of this specification, free-form text + MAY be returned. In order to facilitate parsing by both computers + and people, addresses SHOULD appear in pointed brackets. When + addresses, rather than free-form debugging information, are returned, + EXPN and VRFY MUST return only valid domain addresses that are usable + in SMTP RCPT commands. Consequently, if an address implies delivery + to a program or other system, the mailbox name used to reach that + target MUST be given. Paths (explicit source routes) MUST NOT be + returned by VRFY or EXPN. + + Server implementations SHOULD support both VRFY and EXPN. For + security reasons, implementations MAY provide local installations a + way to disable either or both of these commands through configuration + options or the equivalent. When these commands are supported, they + are not required to work across relays when relaying is supported. + Since they were both optional in RFC 821, they MUST be listed as + service extensions in an EHLO response, if they are supported. + +3.5.3 Meaning of VRFY or EXPN Success Response + + A server MUST NOT return a 250 code in response to a VRFY or EXPN + command unless it has actually verified the address. 
In particular, + a server MUST NOT return 250 if all it has done is to verify that the + syntax given is valid. In that case, 502 (Command not implemented) + or 500 (Syntax error, command unrecognized) SHOULD be returned. As + stated elsewhere, implementation (in the sense of actually validating + addresses and returning information) of VRFY and EXPN are strongly + recommended. Hence, implementations that return 500 or 502 for VRFY + are not in full compliance with this specification. + + + + + + +Klensin Standards Track [Page 22] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + There may be circumstances where an address appears to be valid but + cannot reasonably be verified in real time, particularly when a + server is acting as a mail exchanger for another server or domain. + "Apparent validity" in this case would normally involve at least + syntax checking and might involve verification that any domains + specified were ones to which the host expected to be able to relay + mail. In these situations, reply code 252 SHOULD be returned. These + cases parallel the discussion of RCPT verification discussed in + section 2.1. Similarly, the discussion in section 3.4 applies to the + use of reply codes 251 and 551 with VRFY (and EXPN) to indicate + addresses that are recognized but that would be forwarded or bounced + were mail received for them. Implementations generally SHOULD be + more aggressive about address verification in the case of VRFY than + in the case of RCPT, even if it takes a little longer to do so. + +3.5.4 Semantics and Applications of EXPN + + EXPN is often very useful in debugging and understanding problems + with mailing lists and multiple-target-address aliases. Some systems + have attempted to use source expansion of mailing lists as a means of + eliminating duplicates. 
The propagation of aliasing systems with + mail on the Internet, for hosts (typically with MX and CNAME DNS + records), for mailboxes (various types of local host aliases), and in + various proxying arrangements, has made it nearly impossible for + these strategies to work consistently, and mail systems SHOULD NOT + attempt them. + +3.6 Domains + + Only resolvable, fully-qualified, domain names (FQDNs) are permitted + when domain names are used in SMTP. In other words, names that can + be resolved to MX RRs or A RRs (as discussed in section 5) are + permitted, as are CNAME RRs whose targets can be resolved, in turn, + to MX or A RRs. Local nicknames or unqualified names MUST NOT be + used. There are two exceptions to the rule requiring FQDNs: + + - The domain name given in the EHLO command MUST BE either a primary + host name (a domain name that resolves to an A RR) or, if the host + has no name, an address literal as described in section 4.1.1.1. + + - The reserved mailbox name "postmaster" may be used in a RCPT + command without domain qualification (see section 4.1.1.3) and + MUST be accepted if so used. + + + + + + + + +Klensin Standards Track [Page 23] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +3.7 Relaying + + In general, the availability of Mail eXchanger records in the domain + name system [22, 27] makes the use of explicit source routes in the + Internet mail system unnecessary. Many historical problems with + their interpretation have made their use undesirable. SMTP clients + SHOULD NOT generate explicit source routes except under unusual + circumstances. SMTP servers MAY decline to act as mail relays or to + accept addresses that specify source routes. When route information + is encountered, SMTP servers are also permitted to ignore the route + information and simply send to the final destination specified as the + last element in the route and SHOULD do so. 
There has been an + invalid practice of using names that do not appear in the DNS as + destination names, with the senders counting on the intermediate + hosts specified in source routing to resolve any problems. If source + routes are stripped, this practice will cause failures. This is one + of several reasons why SMTP clients MUST NOT generate invalid source + routes or depend on serial resolution of names. + + When source routes are not used, the process described in RFC 821 for + constructing a reverse-path from the forward-path is not applicable + and the reverse-path at the time of delivery will simply be the + address that appeared in the MAIL command. + + A relay SMTP server is usually the target of a DNS MX record that + designates it, rather than the final delivery system. The relay + server may accept or reject the task of relaying the mail in the same + way it accepts or rejects mail for a local user. If it accepts the + task, it then becomes an SMTP client, establishes a transmission + channel to the next SMTP server specified in the DNS (according to + the rules in section 5), and sends it the mail. If it declines to + relay mail to a particular address for policy reasons, a 550 response + SHOULD be returned. + + Many mail-sending clients exist, especially in conjunction with + facilities that receive mail via POP3 or IMAP, that have limited + capability to support some of the requirements of this specification, + such as the ability to queue messages for subsequent delivery + attempts. For these clients, it is common practice to make private + arrangements to send all messages to a single server for processing + and subsequent distribution. SMTP, as specified here, is not ideally + suited for this role, and work is underway on standardized mail + submission protocols that might eventually supersede the current + practices. 
In any event, because these arrangements are private and + fall outside the scope of this specification, they are not described + here. + + + + + +Klensin Standards Track [Page 24] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + It is important to note that MX records can point to SMTP servers + which act as gateways into other environments, not just SMTP relays + and final delivery systems; see sections 3.8 and 5. + + If an SMTP server has accepted the task of relaying the mail and + later finds that the destination is incorrect or that the mail cannot + be delivered for some other reason, then it MUST construct an + "undeliverable mail" notification message and send it to the + originator of the undeliverable mail (as indicated by the reverse- + path). Formats specified for non-delivery reports by other standards + (see, for example, [24, 25]) SHOULD be used if possible. + + This notification message must be from the SMTP server at the relay + host or the host that first determines that delivery cannot be + accomplished. Of course, SMTP servers MUST NOT send notification + messages about problems transporting notification messages. One way + to prevent loops in error reporting is to specify a null reverse-path + in the MAIL command of a notification message. When such a message + is transmitted the reverse-path MUST be set to null (see section + 4.5.5 for additional discussion). A MAIL command with a null + reverse-path appears as follows: + + MAIL FROM:<> + + As discussed in section 2.4.1, a relay SMTP has no need to inspect or + act upon the headers or body of the message data and MUST NOT do so + except to add its own "Received:" header (section 4.4) and, + optionally, to attempt to detect looping in the mail system (see + section 6.2). 
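The loop-prevention rule above can be illustrated with a minimal sketch. It assumes a hypothetical helper, bounce_envelope, that receives the reverse-path of the undeliverable message as a plain string, with the empty string standing for the null reverse-path "<>". It only builds the envelope commands for the notification; constructing the notification body is out of scope here.

```python
from typing import List, Optional


def bounce_envelope(original_reverse_path: str) -> Optional[List[str]]:
    """Build envelope commands for an "undeliverable mail" notification.

    Hypothetical helper for illustration only. An empty string stands for
    the null reverse-path "<>".
    """
    if original_reverse_path == "":
        # The failed message was itself sent with a null reverse-path, so it
        # is treated as a notification: no notification about a notification.
        return None
    return [
        "MAIL FROM:<>",                        # null reverse-path prevents loops
        f"RCPT TO:<{original_reverse_path}>",  # report goes to the originator
    ]
```

Treating every message with a null reverse-path as a notification is a simplification made for this sketch; it errs on the side of never bouncing such messages.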
+ +3.8 Mail Gatewaying + + While the relay function discussed above operates within the Internet + SMTP transport service environment, MX records or various forms of + explicit routing may require that an intermediate SMTP server perform + a translation function between one transport service and another. As + discussed in section 2.3.8, when such a system is at the boundary + between two transport service environments, we refer to it as a + "gateway" or "gateway SMTP". + + Gatewaying mail between different mail environments, such as + different mail formats and protocols, is complex and does not easily + yield to standardization. However, some general requirements may be + given for a gateway between the Internet and another mail + environment. + + + + + + +Klensin Standards Track [Page 25] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +3.8.1 Header Fields in Gatewaying + + Header fields MAY be rewritten when necessary as messages are + gatewayed across mail environment boundaries. This may involve + inspecting the message body or interpreting the local-part of the + destination address in spite of the prohibitions in section 2.4.1. + + Other mail systems gatewayed to the Internet often use a subset of + RFC 822 headers or provide similar functionality with a different + syntax, but some of these mail systems do not have an equivalent to + the SMTP envelope. Therefore, when a message leaves the Internet + environment, it may be necessary to fold the SMTP envelope + information into the message header. A possible solution would be to + create new header fields to carry the envelope information (e.g., + "X-SMTP-MAIL:" and "X-SMTP-RCPT:"); however, this would require + changes in mail programs in foreign environments and might risk + disclosure of private information (see section 7.2). 
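As a rough sketch of the "possible solution" mentioned above, the following folds envelope information into the header before a message leaves the Internet environment. The function name fold_envelope is hypothetical, the field names are the examples given in the text rather than registered header fields, and the message is assumed to be a single string with CRLF line endings. The caveat about disclosing private information applies to anything written this way.

```python
from typing import Iterable


def fold_envelope(reverse_path: str, forward_paths: Iterable[str], message: str) -> str:
    """Prepend envelope data to a message as extra header fields.

    Illustrative only: "X-SMTP-MAIL:" and "X-SMTP-RCPT:" are the example
    names from the text, and the foreign environment must know how to
    interpret them.
    """
    fields = [f"X-SMTP-MAIL: <{reverse_path}>"]
    fields += [f"X-SMTP-RCPT: <{path}>" for path in forward_paths]
    # New fields go in front of the existing header; the rest is untouched.
    return "\r\n".join(fields) + "\r\n" + message
```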
+ +3.8.2 Received Lines in Gatewaying + + When forwarding a message into or out of the Internet environment, a + gateway MUST prepend a Received: line, but it MUST NOT alter in any + way a Received: line that is already in the header. + + "Received:" fields of messages originating from other environments + may not conform exactly to this specification. However, the most + important use of Received: lines is for debugging mail faults, and + this debugging can be severely hampered by well-meaning gateways that + try to "fix" a Received: line. As another consequence of trace + fields arising in non-SMTP environments, receiving systems MUST NOT + reject mail based on the format of a trace field and SHOULD be + extremely robust in the light of unexpected information or formats in + those fields. + + The gateway SHOULD indicate the environment and protocol in the "via" + clauses of Received field(s) that it supplies. + +3.8.3 Addresses in Gatewaying + + From the Internet side, the gateway SHOULD accept all valid address + formats in SMTP commands and in RFC 822 headers, and all valid RFC + 822 messages. Addresses and headers generated by gateways MUST + conform to applicable Internet standards (including this one and RFC + 822). Gateways are, of course, subject to the same rules for + handling source routes as those described for other SMTP systems in + section 3.3. + + + + + +Klensin Standards Track [Page 26] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +3.8.4 Other Header Fields in Gatewaying + + The gateway MUST ensure that all header fields of a message that it + forwards into the Internet mail environment meet the requirements for + Internet mail. In particular, all addresses in "From:", "To:", + "Cc:", etc., fields MUST be transformed (if necessary) to satisfy RFC + 822 syntax, MUST reference only fully-qualified domain names, and + MUST be effective and useful for sending replies. 
The translation + algorithm used to convert mail from the Internet protocols to another + environment's protocol SHOULD ensure that error messages from the + foreign mail environment are delivered to the return path from the + SMTP envelope, not to the sender listed in the "From:" field (or + other fields) of the RFC 822 message. + +3.8.5 Envelopes in Gatewaying + + Similarly, when forwarding a message from another environment into + the Internet, the gateway SHOULD set the envelope return path in + accordance with an error message return address, if supplied by the + foreign environment. If the foreign environment has no equivalent + concept, the gateway must select and use a best approximation, with + the message originator's address as the default of last resort. + +3.9 Terminating Sessions and Connections + + An SMTP connection is terminated when the client sends a QUIT + command. The server responds with a positive reply code, after which + it closes the connection. + + An SMTP server MUST NOT intentionally close the connection except: + + - After receiving a QUIT command and responding with a 221 reply. + + - After detecting the need to shut down the SMTP service and + returning a 421 response code. This response code can be issued + after the server receives any command or, if necessary, + asynchronously from command receipt (on the assumption that the + client will receive it after the next command is issued). + + In particular, a server that closes connections in response to + commands that are not understood is in violation of this + specification. Servers are expected to be tolerant of unknown + commands, issuing a 500 reply and awaiting further instructions from + the client. + + + + + + + +Klensin Standards Track [Page 27] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + An SMTP server which is forcibly shut down via external means SHOULD + attempt to send a line containing a 421 response code to the SMTP + client before exiting. 
The SMTP client will normally read the 421 + response code after sending its next command. + + SMTP clients that experience a connection close, reset, or other + communications failure due to circumstances not under their control + (in violation of the intent of this specification but sometimes + unavoidable) SHOULD, to maintain the robustness of the mail system, + treat the mail transaction as if a 451 response had been received and + act accordingly. + +3.10 Mailing Lists and Aliases + + An SMTP-capable host SHOULD support both the alias and the list + models of address expansion for multiple delivery. When a message is + delivered or forwarded to each address of an expanded list form, the + return address in the envelope ("MAIL FROM:") MUST be changed to be + the address of a person or other entity who administers the list. + However, in this case, the message header [32] MUST be left + unchanged; in particular, the "From" field of the message header is + unaffected. + + An important mail facility is a mechanism for multi-destination + delivery of a single message, by transforming (or "expanding" or + "exploding") a pseudo-mailbox address into a list of destination + mailbox addresses. When a message is sent to such a pseudo-mailbox + (sometimes called an "exploder"), copies are forwarded or + redistributed to each mailbox in the expanded list. Servers SHOULD + simply utilize the addresses on the list; application of heuristics + or other matching rules to eliminate some addresses, such as that of + the originator, is strongly discouraged. We classify such a pseudo- + mailbox as an "alias" or a "list", depending upon the expansion + rules. + +3.10.1 Alias + + To expand an alias, the recipient mailer simply replaces the pseudo- + mailbox address in the envelope with each of the expanded addresses + in turn; the rest of the envelope and the message body are left + unchanged. The message is then delivered or forwarded to each + expanded address. 
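The difference between the two expansion models can be sketched as follows. This is a minimal illustration, not an implementation of any particular mailer: the Envelope type, the alias table, and the list administrator address are all hypothetical, and addresses are matched by exact string comparison for simplicity.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Envelope:
    reverse_path: str   # argument of MAIL FROM
    forward_path: str   # argument of RCPT TO


def expand_alias(env: Envelope, aliases: Dict[str, List[str]]) -> List[Envelope]:
    """Alias model: swap in each expanded address, keep the return address."""
    targets = aliases.get(env.forward_path)
    if targets is None:
        return [env]  # not an alias; deliver as addressed
    return [replace(env, forward_path=target) for target in targets]


def expand_list(env: Envelope, members: List[str], list_admin: str) -> List[Envelope]:
    """List model: the return address becomes the list administrator."""
    return [Envelope(reverse_path=list_admin, forward_path=m) for m in members]
```

In both cases the message header and body are not part of the envelope and stay unchanged; only the envelope addresses differ between the two models.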
+ +3.10.2 List + + A mailing list may be said to operate by "redistribution" rather than + by "forwarding". To expand a list, the recipient mailer replaces the + pseudo-mailbox address in the envelope with all of the expanded + + + +Klensin Standards Track [Page 28] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + addresses. The return address in the envelope is changed so that all + error messages generated by the final deliveries will be returned to + a list administrator, not to the message originator, who generally + has no control over the contents of the list and will typically find + error messages annoying. + +4. The SMTP Specifications + +4.1 SMTP Commands + +4.1.1 Command Semantics and Syntax + + The SMTP commands define the mail transfer or the mail system + function requested by the user. SMTP commands are character strings + terminated by <CRLF>. The commands themselves are alphabetic + characters terminated by <SP> if parameters follow and <CRLF> + otherwise. (In the interest of improved interoperability, SMTP + receivers are encouraged to tolerate trailing white space before the + terminating <CRLF>.) The syntax of the local part of a mailbox must + conform to receiver site conventions and the syntax specified in + section 4.1.2. The SMTP commands are discussed below. The SMTP + replies are discussed in section 4.2. + + A mail transaction involves several data objects which are + communicated as arguments to different commands. The reverse-path is + the argument of the MAIL command, the forward-path is the argument of + the RCPT command, and the mail data is the argument of the DATA + command. These arguments or data objects must be transmitted and + held pending the confirmation communicated by the end of mail data + indication which finalizes the transaction. 
The model for this is + that distinct buffers are provided to hold the types of data objects, + that is, there is a reverse-path buffer, a forward-path buffer, and a + mail data buffer. Specific commands cause information to be appended + to a specific buffer, or cause one or more buffers to be cleared. + + Several commands (RSET, DATA, QUIT) are specified as not permitting + parameters. In the absence of specific extensions offered by the + server and accepted by the client, clients MUST NOT send such + parameters and servers SHOULD reject commands containing them as + having invalid syntax. + +4.1.1.1 Extended HELLO (EHLO) or HELLO (HELO) + + These commands are used to identify the SMTP client to the SMTP + server. The argument field contains the fully-qualified domain name + of the SMTP client if one is available. In situations in which the + SMTP client system does not have a meaningful domain name (e.g., when + its address is dynamically allocated and no reverse mapping record is + available), the client SHOULD send an address literal (see section + 4.1.3), optionally followed by information that will help to identify + the client system. The SMTP server identifies itself to the SMTP + client in the connection greeting reply and in the response to this + command. + + A client SMTP SHOULD start an SMTP session by issuing the EHLO + command. If the SMTP server supports the SMTP service extensions it + will give a successful response, a failure response, or an error + response. If the SMTP server, in violation of this specification, + does not support any SMTP service extensions it will generate an + error response. Older client SMTP systems MAY, as discussed above, + use HELO (as specified in RFC 821) instead of EHLO, and servers MUST + support the HELO command and reply properly to it. 
In any event, a + client MUST issue HELO or EHLO before starting a mail transaction. + + These commands, and a "250 OK" reply to one of them, confirm that + both the SMTP client and the SMTP server are in the initial state, + that is, there is no transaction in progress and all state tables and + buffers are cleared. + + Syntax: + + ehlo = "EHLO" SP Domain CRLF + helo = "HELO" SP Domain CRLF + + Normally, the response to EHLO will be a multiline reply. Each line + of the response contains a keyword and, optionally, one or more + parameters. Following the normal syntax for multiline replies, these + keywords follow the code (250) and a hyphen for all but the last + line, and the code and a space for the last line. The syntax for a + positive response, using the ABNF notation and terminal symbols of + [8], is: + + ehlo-ok-rsp = ( "250" domain [ SP ehlo-greet ] CRLF ) + / ( "250-" domain [ SP ehlo-greet ] CRLF + *( "250-" ehlo-line CRLF ) + "250" SP ehlo-line CRLF ) + + ehlo-greet = 1*(%d0-9 / %d11-12 / %d14-127) + ; string of any characters other than CR or LF + + ehlo-line = ehlo-keyword *( SP ehlo-param ) + + ehlo-keyword = (ALPHA / DIGIT) *(ALPHA / DIGIT / "-") + ; additional syntax of ehlo-params depends on + ; ehlo-keyword + + ehlo-param = 1*(%d33-127) + ; any CHAR excluding <SP> and all + ; control characters (US-ASCII 0-31 inclusive) + + Although EHLO keywords may be specified in upper, lower, or mixed + case, they MUST always be recognized and processed in a case- + insensitive manner. This is simply an extension of practices + specified in RFC 821 and section 2.4.1. + +4.1.1.2 MAIL (MAIL) + + This command is used to initiate a mail transaction in which the mail + data is delivered to an SMTP server which may, in turn, deliver it to + one or more mailboxes or pass it on to another system (possibly using + SMTP). 
The argument field contains a reverse-path and may contain + optional parameters. In general, the MAIL command may be sent only + when no mail transaction is in progress; see section 4.1.4. + + The reverse-path consists of the sender mailbox. Historically, that + mailbox might optionally have been preceded by a list of hosts, but + that behavior is now deprecated (see appendix C). In some types of + reporting messages for which a reply is likely to cause a mail loop + (for example, mail delivery and nondelivery notifications), the + reverse-path may be null (see section 3.7). + + This command clears the reverse-path buffer, the forward-path buffer, + and the mail data buffer; and inserts the reverse-path information + from this command into the reverse-path buffer. + + If service extensions were negotiated, the MAIL command may also + carry parameters associated with a particular service extension. + + Syntax: + + "MAIL FROM:" ("<>" / Reverse-Path) + [SP Mail-parameters] CRLF + +4.1.1.3 RECIPIENT (RCPT) + + This command is used to identify an individual recipient of the mail + data; multiple recipients are specified by multiple uses of this + command. The argument field contains a forward-path and may contain + optional parameters. + + The forward-path normally consists of the required destination + mailbox. Sending systems SHOULD NOT generate the optional list of + hosts known as a source route. Receiving systems MUST recognize + source route syntax but SHOULD strip off the source route + specification and utilize the domain name associated with the mailbox + as if the source route had not been provided. + + Similarly, relay hosts SHOULD strip or ignore source routes, and + names MUST NOT be copied into the reverse-path. 
When mail reaches + its ultimate destination (the forward-path contains only a + destination mailbox), the SMTP server inserts it into the destination + mailbox in accordance with its host mail conventions. + + For example, mail received at relay host xyz.com with envelope + commands + + MAIL FROM:<userx@y.foo.org> + RCPT TO:<@hosta.int,@jkl.org:userc@d.bar.org> + + will normally be sent directly on to host d.bar.org with envelope + commands + + MAIL FROM:<userx@y.foo.org> + RCPT TO:<userc@d.bar.org> + + As provided in appendix C, xyz.com MAY also choose to relay the + message to hosta.int, using the envelope commands + + MAIL FROM:<userx@y.foo.org> + RCPT TO:<@hosta.int,@jkl.org:userc@d.bar.org> + + or to jkl.org, using the envelope commands + + MAIL FROM:<userx@y.foo.org> + RCPT TO:<@jkl.org:userc@d.bar.org> + + Of course, since hosts are not required to relay mail at all, xyz.com + may also reject the message entirely when the RCPT command is + received, using a 550 code (since this is a "policy reason"). + + If service extensions were negotiated, the RCPT command may also + carry parameters associated with a particular service extension + offered by the server. The client MUST NOT transmit parameters other + than those associated with a service extension offered by the server + in its EHLO response. + +Syntax: + "RCPT TO:" ("<Postmaster@" domain ">" / "<Postmaster>" / Forward-Path) + [SP Rcpt-parameters] CRLF + + + + + +Klensin Standards Track [Page 32] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +4.1.1.4 DATA (DATA) + + The receiver normally sends a 354 response to DATA, and then treats + the lines (strings ending in <CRLF> sequences, as described in + section 2.3.7) following the command as mail data from the sender. + This command causes the mail data to be appended to the mail data + buffer. 
The mail data may contain any of the 128 ASCII character + codes, although experience has indicated that use of control + characters other than SP, HT, CR, and LF may cause problems and + SHOULD be avoided when possible. + + The mail data is terminated by a line containing only a period, that + is, the character sequence "<CRLF>.<CRLF>" (see section 4.5.2). This + is the end of mail data indication. Note that the first <CRLF> of + this terminating sequence is also the <CRLF> that ends the final line + of the data (message text) or, if there was no data, ends the DATA + command itself. An extra <CRLF> MUST NOT be added, as that would + cause an empty line to be added to the message. The only exception + to this rule would arise if the message body were passed to the + originating SMTP-sender with a final "line" that did not end in + <CRLF>; in that case, the originating SMTP system MUST either reject + the message as invalid or add <CRLF> in order to have the receiving + SMTP server recognize the "end of data" condition. + + The custom of accepting lines ending only in <LF>, as a concession to + non-conforming behavior on the part of some UNIX systems, has proven + to cause more interoperability problems than it solves, and SMTP + server systems MUST NOT do this, even in the name of improved + robustness. In particular, the sequence "<LF>.<LF>" (bare line + feeds, without carriage returns) MUST NOT be treated as equivalent to + <CRLF>.<CRLF> as the end of mail data indication. + + Receipt of the end of mail data indication requires the server to + process the stored mail transaction information. This processing + consumes the information in the reverse-path buffer, the forward-path + buffer, and the mail data buffer, and on the completion of this + command these buffers are cleared. If the processing is successful, + the receiver MUST send an OK reply. If the processing fails the + receiver MUST send a failure reply. 
The SMTP model does not allow + for partial failures at this point: either the message is accepted by + the server for delivery and a positive response is returned or it is + not accepted and a failure reply is returned. In sending a positive + completion reply to the end of data indication, the receiver takes + full responsibility for the message (see section 6.1). Errors that + are diagnosed subsequently MUST be reported in a mail message, as + discussed in section 4.4. + + + + + +Klensin Standards Track [Page 33] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + When the SMTP server accepts a message either for relaying or for + final delivery, it inserts a trace record (also referred to + interchangeably as a "time stamp line" or "Received" line) at the top + of the mail data. This trace record indicates the identity of the + host that sent the message, the identity of the host that received + the message (and is inserting this time stamp), and the date and time + the message was received. Relayed messages will have multiple time + stamp lines. Details for formation of these lines, including their + syntax, are specified in section 4.4. + + Additional discussion about the operation of the DATA command appears + in section 3.3. + + Syntax: + "DATA" CRLF + +4.1.1.5 RESET (RSET) + + This command specifies that the current mail transaction will be + aborted. Any stored sender, recipients, and mail data MUST be + discarded, and all buffers and state tables cleared. The receiver + MUST send a "250 OK" reply to a RSET command with no arguments. A + reset command may be issued by the client at any time. It is + effectively equivalent to a NOOP (i.e., it has no effect) if issued + immediately after EHLO, before EHLO is issued in the session, after + an end-of-data indicator has been sent and acknowledged, or + immediately before a QUIT. 
An SMTP server MUST NOT close the + connection as the result of receiving a RSET; that action is reserved + for QUIT (see section 4.1.1.10). + + Since EHLO implies some additional processing and response by the + server, RSET will normally be more efficient than reissuing that + command, even though the formal semantics are the same. + + There are circumstances, contrary to the intent of this + specification, in which an SMTP server may receive an indication that + the underlying TCP connection has been closed or reset. To preserve + the robustness of the mail system, SMTP servers SHOULD be prepared + for this condition and SHOULD treat it as if a QUIT had been received + before the connection disappeared. + + Syntax: + "RSET" CRLF + + + + + + + + +Klensin Standards Track [Page 34] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +4.1.1.6 VERIFY (VRFY) + + This command asks the receiver to confirm that the argument + identifies a user or mailbox. If it is a user name, information is + returned as specified in section 3.5. + + This command has no effect on the reverse-path buffer, the forward- + path buffer, or the mail data buffer. + + Syntax: + "VRFY" SP String CRLF + +4.1.1.7 EXPAND (EXPN) + + This command asks the receiver to confirm that the argument + identifies a mailing list, and if so, to return the membership of + that list. If the command is successful, a reply is returned + containing information as described in section 3.5. This reply will + have multiple lines except in the trivial case of a one-member list. + + This command has no effect on the reverse-path buffer, the forward- + path buffer, or the mail data buffer and may be issued at any time. + + Syntax: + "EXPN" SP String CRLF + +4.1.1.8 HELP (HELP) + + This command causes the server to send helpful information to the + client. The command MAY take an argument (e.g., any command name) + and return more specific information as a response. 
+ + This command has no effect on the reverse-path buffer, the forward- + path buffer, or the mail data buffer and may be issued at any time. + + SMTP servers SHOULD support HELP without arguments and MAY support it + with arguments. + + Syntax: + "HELP" [ SP String ] CRLF + +4.1.1.9 NOOP (NOOP) + + This command does not affect any parameters or previously entered + commands. It specifies no action other than that the receiver send + an OK reply. + + + + + +Klensin Standards Track [Page 35] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + This command has no effect on the reverse-path buffer, the forward- + path buffer, or the mail data buffer and may be issued at any time. + If a parameter string is specified, servers SHOULD ignore it. + + Syntax: + "NOOP" [ SP String ] CRLF + +4.1.1.10 QUIT (QUIT) + + This command specifies that the receiver MUST send an OK reply, and + then close the transmission channel. + + The receiver MUST NOT intentionally close the transmission channel + until it receives and replies to a QUIT command (even if there was an + error). The sender MUST NOT intentionally close the transmission + channel until it sends a QUIT command and SHOULD wait until it + receives the reply (even if there was an error response to a previous + command). If the connection is closed prematurely due to violations + of the above or system or network failure, the server MUST cancel any + pending transaction, but not undo any previously completed + transaction, and generally MUST act as if the command or transaction + in progress had received a temporary error (i.e., a 4yz response). + + The QUIT command may be issued at any time. + + Syntax: + "QUIT" CRLF + +4.1.2 Command Argument Syntax + + The syntax of the argument fields of the above commands (using the + syntax specified in [8] where applicable) is given below. Some of + the productions given below are used only in conjunction with source + routes as described in appendix C. 
Terminals not defined in this + document, such as ALPHA, DIGIT, SP, CR, LF, CRLF, are as defined in + the "core" syntax [8 (section 6)] or in the message format syntax + [32]. + + Reverse-path = Path + Forward-path = Path + Path = "<" [ A-d-l ":" ] Mailbox ">" + A-d-l = At-domain *( "," A-d-l ) + ; Note that this form, the so-called "source route", + ; MUST BE accepted, SHOULD NOT be generated, and SHOULD be + ; ignored. + At-domain = "@" domain + Mail-parameters = esmtp-param *(SP esmtp-param) + Rcpt-parameters = esmtp-param *(SP esmtp-param) + + + +Klensin Standards Track [Page 36] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + esmtp-param = esmtp-keyword ["=" esmtp-value] + esmtp-keyword = (ALPHA / DIGIT) *(ALPHA / DIGIT / "-") + esmtp-value = 1*(%d33-60 / %d62-127) + ; any CHAR excluding "=", SP, and control characters + Keyword = Ldh-str + Argument = Atom + Domain = (sub-domain 1*("." sub-domain)) / address-literal + sub-domain = Let-dig [Ldh-str] + + address-literal = "[" IPv4-address-literal / + IPv6-address-literal / + General-address-literal "]" + ; See section 4.1.3 + + Mailbox = Local-part "@" Domain + + Local-part = Dot-string / Quoted-string + ; MAY be case-sensitive + + Dot-string = Atom *("." Atom) + + Atom = 1*atext + + Quoted-string = DQUOTE *qcontent DQUOTE + + String = Atom / Quoted-string + + While the above definition for Local-part is relatively permissive, + for maximum interoperability, a host that expects to receive mail + SHOULD avoid defining mailboxes where the Local-part requires (or + uses) the Quoted-string form or where the Local-part is case- + sensitive. For any purposes that require generating or comparing + Local-parts (e.g., to specific mailbox names), all quoted forms MUST + be treated as equivalent and the sending system SHOULD transmit the + form that uses the minimum quoting possible. 
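The Path production above can be illustrated with a small parser. This is a minimal sketch, not a full implementation of the grammar: the function name `parse_path` is hypothetical, and the code assumes that no quoted local part or address literal containing ":" or "," appears in the source route. It accepts a source route, as the grammar requires, and returns it separately so that the caller can ignore it.

```python
from typing import List, Tuple


def parse_path(path: str) -> Tuple[List[str], str]:
    """Split an SMTP Path into (source-route domains, mailbox).

    Illustrative only: quoted local parts and address literals that
    contain ":" or "," are not handled.
    """
    if len(path) < 2 or path[0] != "<" or path[-1] != ">":
        raise ValueError("path must be enclosed in angle brackets")
    inner = path[1:-1]
    route: List[str] = []
    if inner.startswith("@"):
        # A-d-l ":" Mailbox, the so-called source route.
        adl, sep, inner = inner.partition(":")
        if not sep:
            raise ValueError("source route without ':' separator")
        for at_domain in adl.split(","):
            if len(at_domain) < 2 or at_domain[0] != "@":
                raise ValueError("malformed At-domain: %r" % at_domain)
            route.append(at_domain[1:])
    # Mailbox = Local-part "@" Domain; split on the last "@".
    local, at, domain = inner.rpartition("@")
    if not (local and at and domain):
        raise ValueError("mailbox must be Local-part '@' Domain")
    return route, inner
```

A relay that follows the "SHOULD be ignored" note would use only the second element of the result. The special RCPT form "<Postmaster>" has no domain and would need separate handling before calling this function.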
+ + Systems MUST NOT define mailboxes in such a way as to require the use + in SMTP of non-ASCII characters (octets with the high order bit set + to one) or ASCII "control characters" (decimal value 0-31 and 127). + These characters MUST NOT be used in MAIL or RCPT commands or other + commands that require mailbox names. + + Note that the backslash, "\", is a quote character, which is used to + indicate that the next character is to be used literally (instead of + its normal interpretation). For example, "Joe\,Smith" indicates a + single nine character user field with the comma being the fourth + character of the field. + + + + +Klensin Standards Track [Page 37] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + To promote interoperability and consistent with long-standing + guidance about conservative use of the DNS in naming and applications + (e.g., see section 2.3.1 of the base DNS document, RFC1035 [22]), + characters outside the set of alphas, digits, and hyphen MUST NOT + appear in domain name labels for SMTP clients or servers. In + particular, the underscore character is not permitted. SMTP servers + that receive a command in which invalid character codes have been + employed, and for which there are no other reasons for rejection, + MUST reject that command with a 501 response. + +4.1.3 Address Literals + + Sometimes a host is not known to the domain name system and + communication (and, in particular, communication to report and repair + the error) is blocked. To bypass this barrier a special literal form + of the address is allowed as an alternative to a domain name. For + IPv4 addresses, this form uses four small decimal integers separated + by dots and enclosed by brackets such as [123.255.37.2], which + indicates an (IPv4) Internet Address in sequence-of-octets form. 
For + IPv6 and other forms of addressing that might eventually be + standardized, the form consists of a standardized "tag" that + identifies the address syntax, a colon, and the address itself, in a + format specified as part of the IPv6 standards [17]. + + Specifically: + + IPv4-address-literal = Snum 3("." Snum) + IPv6-address-literal = "IPv6:" IPv6-addr + General-address-literal = Standardized-tag ":" 1*dcontent + Standardized-tag = Ldh-str + ; MUST be specified in a standards-track RFC + ; and registered with IANA + + Snum = 1*3DIGIT ; representing a decimal integer + ; value in the range 0 through 255 + Let-dig = ALPHA / DIGIT + Ldh-str = *( ALPHA / DIGIT / "-" ) Let-dig + + IPv6-addr = IPv6-full / IPv6-comp / IPv6v4-full / IPv6v4-comp + IPv6-hex = 1*4HEXDIG + IPv6-full = IPv6-hex 7(":" IPv6-hex) + IPv6-comp = [IPv6-hex *5(":" IPv6-hex)] "::" [IPv6-hex *5(":" + IPv6-hex)] + ; The "::" represents at least 2 16-bit groups of zeros + ; No more than 6 groups in addition to the "::" may be + ; present + IPv6v4-full = IPv6-hex 5(":" IPv6-hex) ":" IPv4-address-literal + IPv6v4-comp = [IPv6-hex *3(":" IPv6-hex)] "::" + + + +Klensin Standards Track [Page 38] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + [IPv6-hex *3(":" IPv6-hex) ":"] IPv4-address-literal + ; The "::" represents at least 2 16-bit groups of zeros + ; No more than 4 groups in addition to the "::" and + ; IPv4-address-literal may be present + +4.1.4 Order of Commands + + There are restrictions on the order in which these commands may be + used. + + A session that will contain mail transactions MUST first be + initialized by the use of the EHLO command. An SMTP server SHOULD + accept commands for non-mail transactions (e.g., VRFY or EXPN) + without this initialization. + + An EHLO command MAY be issued by a client later in the session. If + it is issued after the session begins, the SMTP server MUST clear all + buffers and reset the state exactly as if a RSET command had been + issued. 
In other words, the sequence of RSET followed immediately by + EHLO is redundant, but not harmful other than in the performance cost + of executing unnecessary commands. + + If the EHLO command is not acceptable to the SMTP server, 501, 500, + or 502 failure replies MUST be returned as appropriate. The SMTP + server MUST stay in the same state after transmitting these replies + that it was in before the EHLO was received. + + The SMTP client MUST, if possible, ensure that the domain parameter + to the EHLO command is a valid principal host name (not a CNAME or MX + name) for its host. If this is not possible (e.g., when the client's + address is dynamically assigned and the client does not have an + obvious name), an address literal SHOULD be substituted for the + domain name and supplemental information provided that will assist in + identifying the client. + + An SMTP server MAY verify that the domain name parameter in the EHLO + command actually corresponds to the IP address of the client. + However, the server MUST NOT refuse to accept a message for this + reason if the verification fails: the information about verification + failure is for logging and tracing only. + + The NOOP, HELP, EXPN, VRFY, and RSET commands can be used at any time + during a session, or without previously initializing a session. SMTP + servers SHOULD process these normally (that is, not return a 503 + code) even if no EHLO command has yet been received; clients SHOULD + open a session with EHLO before sending these commands. + + + + + +Klensin Standards Track [Page 39] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + If these rules are followed, the example in RFC 821 that shows "550 + access denied to you" in response to an EXPN command is incorrect + unless an EHLO command precedes the EXPN or the denial of access is + based on the client's IP address or other authentication or + authorization-determining mechanisms. 
+ + The MAIL command (or the obsolete SEND, SOML, or SAML commands) + begins a mail transaction. Once started, a mail transaction consists + of a transaction beginning command, one or more RCPT commands, and a + DATA command, in that order. A mail transaction may be aborted by + the RSET (or a new EHLO) command. There may be zero or more + transactions in a session. MAIL (or SEND, SOML, or SAML) MUST NOT be + sent if a mail transaction is already open, i.e., it should be sent + only if no mail transaction had been started in the session, or if + the previous one successfully concluded with a successful DATA + command, or if the previous one was aborted with a RSET. + + If the transaction beginning command argument is not acceptable, a + 501 failure reply MUST be returned and the SMTP server MUST stay in + the same state. If the commands in a transaction are out of order to + the degree that they cannot be processed by the server, a 503 failure + reply MUST be returned and the SMTP server MUST stay in the same + state. + + The last command in a session MUST be the QUIT command. The QUIT + command cannot be used at any other time in a session, but SHOULD be + used by the client SMTP to request connection closure, even when no + session opening command was sent and accepted. + +4.1.5 Private-use Commands + + As specified in section 2.2.2, commands starting in "X" may be used + by bilateral agreement between the client (sending) and server + (receiving) SMTP agents. An SMTP server that does not recognize such + a command is expected to reply with "500 Command not recognized". An + extended SMTP server MAY list the feature names associated with these + private commands in the response to the EHLO command. + + Commands sent or accepted by SMTP systems that do not start with "X" + MUST conform to the requirements of section 2.2.2. 
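The ordering rules of section 4.1.4 can be sketched as a small server-side state tracker. This is a hedged illustration under simplifying assumptions: the class `Session` and its method names are hypothetical, command arguments are not parsed, every MAIL and RCPT is assumed to be acceptable, and DATA without any accepted recipient is answered with 503 (a server could reasonably choose another code). Out-of-order commands yield 503 and leave the state unchanged, as the text requires.

```python
class Session:
    """Tracks just enough state to enforce SMTP command ordering."""

    def __init__(self) -> None:
        self.greeted = False   # EHLO or HELO has been accepted
        self.in_mail = False   # a MAIL command opened a transaction
        self.rcpt_count = 0    # RCPT commands accepted so far

    def _reset(self) -> None:
        self.in_mail = False
        self.rcpt_count = 0

    def command(self, verb: str) -> int:
        """Return the reply code for a command verb (arguments ignored)."""
        verb = verb.upper()
        if verb in ("EHLO", "HELO"):
            self._reset()      # a later EHLO clears state like RSET
            self.greeted = True
            return 250
        if verb == "RSET":
            self._reset()
            return 250
        if verb == "NOOP":
            return 250
        if verb == "QUIT":
            return 221
        if verb in ("VRFY", "EXPN", "HELP"):
            return 502         # recognized, but not implemented in this sketch
        if verb == "MAIL":
            if not self.greeted or self.in_mail:
                return 503     # no session yet, or transaction already open
            self.in_mail = True
            return 250
        if verb == "RCPT":
            if not self.in_mail:
                return 503
            self.rcpt_count += 1
            return 250
        if verb == "DATA":
            if not self.in_mail or self.rcpt_count == 0:
                return 503
            return 354
        return 500             # command not recognized

    def end_of_data(self) -> int:
        """Call after <CRLF>.<CRLF>; the transaction is then complete."""
        self._reset()
        return 250
```

Keeping the state in three fields makes the "MUST stay in the same state" rule easy to check: every 503 or 500 branch returns before any field is modified.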
+ +4.2 SMTP Replies + + Replies to SMTP commands serve to ensure the synchronization of + requests and actions in the process of mail transfer and to guarantee + that the SMTP client always knows the state of the SMTP server. + Every command MUST generate exactly one reply. + + + + +Klensin Standards Track [Page 40] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + The details of the command-reply sequence are described in section + 4.3. + + An SMTP reply consists of a three digit number (transmitted as three + numeric characters) followed by some text unless specified otherwise + in this document. The number is for use by automata to determine + what state to enter next; the text is for the human user. The three + digits contain enough encoded information that the SMTP client need + not examine the text and may either discard it or pass it on to the + user, as appropriate. Exceptions are as noted elsewhere in this + document. In particular, the 220, 221, 251, 421, and 551 reply codes + are associated with message text that must be parsed and interpreted + by machines. In the general case, the text may be receiver dependent + and context dependent, so there are likely to be varying texts for + each reply code. A discussion of the theory of reply codes is given + in section 4.2.1. Formally, a reply is defined to be the sequence: a + three-digit code, <SP>, one line of text, and <CRLF>, or a multiline + reply (as defined in section 4.2.1). Since, in violation of this + specification, the text is sometimes not sent, clients which do not + receive it SHOULD be prepared to process the code alone (with or + without a trailing space character). Only the EHLO, EXPN, and HELP + commands are expected to result in multiline replies in normal + circumstances, however, multiline replies are allowed for any + command. 
+ + In ABNF, server responses are: + + Greeting = "220 " Domain [ SP text ] CRLF + Reply-line = Reply-code [ SP text ] CRLF + + where "Greeting" appears only in the 220 response that announces that + the server is opening its part of the connection. + + An SMTP server SHOULD send only the reply codes listed in this + document. An SMTP server SHOULD use the text shown in the examples + whenever appropriate. + + An SMTP client MUST determine its actions only by the reply code, not + by the text (except for the "change of address" 251 and 551 and, if + necessary, 220, 221, and 421 replies); in the general case, any text, + including no text at all (although senders SHOULD NOT send bare + codes), MUST be acceptable. The space (blank) following the reply + code is considered part of the text. Whenever possible, a receiver- + SMTP SHOULD test the first digit (severity indication) of the reply + code. + + + + + + +Klensin Standards Track [Page 41] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + The list of codes that appears below MUST NOT be construed as + permanent. While the addition of new codes should be a rare and + significant activity, with supplemental information in the textual + part of the response being preferred, new codes may be added as the + result of new Standards or Standards-track specifications. + Consequently, a sender-SMTP MUST be prepared to handle codes not + specified in this document and MUST do so by interpreting the first + digit only. + +4.2.1 Reply Code Severities and Theory + + The three digits of the reply each have a special significance. The + first digit denotes whether the response is good, bad or incomplete. + An unsophisticated SMTP client, or one that receives an unexpected + code, will be able to determine its next action (proceed as planned, + redo, retrench, etc.) by examining this first digit. 
An SMTP client + that wants to know approximately what kind of error occurred (e.g., + mail system error, command syntax error) may examine the second + digit. The third digit and any supplemental information that may be + present is reserved for the finest gradation of information. + + There are five values for the first digit of the reply code: + + 1yz Positive Preliminary reply + The command has been accepted, but the requested action is being + held in abeyance, pending confirmation of the information in this + reply. The SMTP client should send another command specifying + whether to continue or abort the action. Note: unextended SMTP + does not have any commands that allow this type of reply, and so + does not have continue or abort commands. + + 2yz Positive Completion reply + The requested action has been successfully completed. A new + request may be initiated. + + 3yz Positive Intermediate reply + The command has been accepted, but the requested action is being + held in abeyance, pending receipt of further information. The + SMTP client should send another command specifying this + information. This reply is used in command sequence groups (i.e., + in DATA). + + 4yz Transient Negative Completion reply + The command was not accepted, and the requested action did not + occur. However, the error condition is temporary and the action + may be requested again. The sender should return to the beginning + of the command sequence (if any). It is difficult to assign a + meaning to "transient" when two different sites (receiver- and + + + +Klensin Standards Track [Page 42] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + sender-SMTP agents) must agree on the interpretation. Each reply + in this category might have a different time value, but the SMTP + client is encouraged to try again. 
A rule of thumb to determine + whether a reply fits into the 4yz or the 5yz category (see below) + is that replies are 4yz if they can be successful if repeated + without any change in command form or in properties of the sender + or receiver (that is, the command is repeated identically and the + receiver does not put up a new implementation.) + + 5yz Permanent Negative Completion reply + The command was not accepted and the requested action did not + occur. The SMTP client is discouraged from repeating the exact + request (in the same sequence). Even some "permanent" error + conditions can be corrected, so the human user may want to direct + the SMTP client to reinitiate the command sequence by direct + action at some point in the future (e.g., after the spelling has + been changed, or the user has altered the account status). + + The second digit encodes responses in specific categories: + + x0z Syntax: These replies refer to syntax errors, syntactically + correct commands that do not fit any functional category, and + unimplemented or superfluous commands. + + x1z Information: These are replies to requests for information, + such as status or help. + + x2z Connections: These are replies referring to the transmission + channel. + + x3z Unspecified. + + x4z Unspecified. + + x5z Mail system: These replies indicate the status of the receiver + mail system vis-a-vis the requested transfer or other mail system + action. + + The third digit gives a finer gradation of meaning in each category + specified by the second digit. The list of replies illustrates this. + Each reply text is recommended rather than mandatory, and may even + change according to the command with which it is associated. On the + other hand, the reply codes must strictly follow the specifications + in this section. Receiver implementations should not invent new + codes for slightly different situations from the ones described here, + but rather adapt codes already defined. 
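The first-digit and second-digit meanings listed above lend themselves to a simple table lookup. The following is a minimal sketch; the names `SEVERITY`, `CATEGORY`, and `classify` are hypothetical, and digits that the text does not define are reported as "unknown" rather than guessed.

```python
from typing import Tuple

# First digit: severity of the reply.
SEVERITY = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}

# Second digit: functional category of the reply.
CATEGORY = {
    "0": "syntax",
    "1": "information",
    "2": "connections",
    "3": "unspecified",
    "4": "unspecified",
    "5": "mail system",
}


def classify(code: int) -> Tuple[str, str]:
    """Return (severity, category) for a three-digit reply code."""
    if not 0 <= code <= 999:
        raise ValueError("reply codes have exactly three digits")
    digits = "%03d" % code
    return (SEVERITY.get(digits[0], "unknown"),
            CATEGORY.get(digits[1], "unknown"))
```

A client that receives an unfamiliar code can still act on the first element alone, which is what the text asks of unsophisticated clients.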
+ + + + + +Klensin Standards Track [Page 43] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + For example, a command such as NOOP, whose successful execution does + not offer the SMTP client any new information, will return a 250 + reply. The reply is 502 when the command requests an unimplemented + non-site-specific action. A refinement of that is the 504 reply for + a command that is implemented, but that requests an unimplemented + parameter. + + The reply text may be longer than a single line; in these cases the + complete text must be marked so the SMTP client knows when it can + stop reading the reply. This requires a special format to indicate a + multiple line reply. + + The format for multiline replies requires that every line, except the + last, begin with the reply code, followed immediately by a hyphen, + "-" (also known as minus), followed by text. The last line will + begin with the reply code, followed immediately by <SP>, optionally + some text, and <CRLF>. As noted above, servers SHOULD send the <SP> + if subsequent text is not sent, but clients MUST be prepared for it + to be omitted. + + For example: + + 123-First line + 123-Second line + 123-234 text beginning with numbers + 123 The last line + + In many cases the SMTP client then simply needs to search for a line + beginning with the reply code followed by <SP> or <CRLF> and ignore + all preceding lines. In a few cases, there is important data for the + client in the reply "text". The client will be able to identify + these cases from the current context. 
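The multiline format described above can be read with a short loop. This is a sketch under stated assumptions: the function name `parse_reply` is hypothetical, the input is a list of already-received lines, and every line is expected to carry the same reply code, as in the example. The final line may be the bare code with no trailing space, since clients must be prepared for that.

```python
from typing import Iterable, List, Tuple


def parse_reply(lines: Iterable[str]) -> Tuple[int, List[str]]:
    """Collect one (possibly multiline) reply as (code, text lines)."""
    code = None
    texts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        head = line[:3]
        if len(head) != 3 or not (head.isascii() and head.isdigit()):
            raise ValueError("reply line lacks a three-digit code")
        if code is None:
            code = head
        elif head != code:
            raise ValueError("reply code changed within one reply")
        sep = line[3:4]
        if sep == "-":
            # Continuation line: more lines follow.
            texts.append(line[4:])
        elif sep in ("", " "):
            # Last line: code followed by <SP> text, or the code alone.
            texts.append(line[4:])
            return int(code), texts
        else:
            raise ValueError("unexpected character after reply code")
    raise ValueError("reply ended before its last line")
```

Applied to the example in the text, the function returns the code 123 together with the four text strings, including the one that itself begins with digits.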
+ +4.2.2 Reply Codes by Function Groups + + 500 Syntax error, command unrecognized + (This may include errors such as command line too long) + 501 Syntax error in parameters or arguments + 502 Command not implemented (see section 4.2.4) + 503 Bad sequence of commands + 504 Command parameter not implemented + + 211 System status, or system help reply + 214 Help message + (Information on how to use the receiver or the meaning of a + particular non-standard command; this reply is useful only + to the human user) + + + + +Klensin Standards Track [Page 44] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + 220 <domain> Service ready + 221 <domain> Service closing transmission channel + 421 <domain> Service not available, closing transmission channel + (This may be a reply to any command if the service knows it + must shut down) + + 250 Requested mail action okay, completed + 251 User not local; will forward to <forward-path> + (See section 3.4) + 252 Cannot VRFY user, but will accept message and attempt + delivery + (See section 3.5.3) + 450 Requested mail action not taken: mailbox unavailable + (e.g., mailbox busy) + 550 Requested action not taken: mailbox unavailable + (e.g., mailbox not found, no access, or command rejected + for policy reasons) + 451 Requested action aborted: error in processing + 551 User not local; please try <forward-path> + (See section 3.4) + 452 Requested action not taken: insufficient system storage + 552 Requested mail action aborted: exceeded storage allocation + 553 Requested action not taken: mailbox name not allowed + (e.g., mailbox syntax incorrect) + 354 Start mail input; end with <CRLF>.<CRLF> + 554 Transaction failed (Or, in the case of a connection-opening + response, "No SMTP service here") + +4.2.3 Reply Codes in Numeric Order + + 211 System status, or system help reply + 214 Help message + (Information on how to use the receiver or the meaning of a + particular non-standard command; this reply is useful only + to the 
human user) + 220 <domain> Service ready + 221 <domain> Service closing transmission channel + 250 Requested mail action okay, completed + 251 User not local; will forward to <forward-path> + (See section 3.4) + 252 Cannot VRFY user, but will accept message and attempt + delivery + (See section 3.5.3) + + 354 Start mail input; end with <CRLF>.<CRLF> + + + + + + +Klensin Standards Track [Page 45] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + 421 <domain> Service not available, closing transmission channel + (This may be a reply to any command if the service knows it + must shut down) + 450 Requested mail action not taken: mailbox unavailable + (e.g., mailbox busy) + 451 Requested action aborted: local error in processing + 452 Requested action not taken: insufficient system storage + 500 Syntax error, command unrecognized + (This may include errors such as command line too long) + 501 Syntax error in parameters or arguments + 502 Command not implemented (see section 4.2.4) + 503 Bad sequence of commands + 504 Command parameter not implemented + 550 Requested action not taken: mailbox unavailable + (e.g., mailbox not found, no access, or command rejected + for policy reasons) + 551 User not local; please try <forward-path> + (See section 3.4) + 552 Requested mail action aborted: exceeded storage allocation + 553 Requested action not taken: mailbox name not allowed + (e.g., mailbox syntax incorrect) + 554 Transaction failed (Or, in the case of a connection-opening + response, "No SMTP service here") + +4.2.4 Reply Code 502 + + Questions have been raised as to when reply code 502 (Command not + implemented) SHOULD be returned in preference to other codes. 502 + SHOULD be used when the command is actually recognized by the SMTP + server, but not implemented. If the command is not recognized, code + 500 SHOULD be returned. Extended SMTP systems MUST NOT list + capabilities in response to EHLO for which they will return 502 (or + 500) replies. 
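The distinction drawn in section 4.2.4 between 500 and 502 can be sketched as follows. The set `STANDARD_VERBS` lists the commands of section 4.1.1; the function name `unsupported_reply` and the idea of passing in a set of implemented verbs are assumptions made for illustration only.

```python
from typing import Optional, Set

# Command verbs defined in section 4.1.1.
STANDARD_VERBS = {
    "EHLO", "HELO", "MAIL", "RCPT", "DATA", "RSET",
    "VRFY", "EXPN", "HELP", "NOOP", "QUIT",
}


def unsupported_reply(verb: str, implemented: Set[str]) -> Optional[int]:
    """Return 502 or 500 for an unsupported verb, or None if supported."""
    verb = verb.upper()
    if verb in implemented:
        return None   # caller goes on to process the command
    if verb in STANDARD_VERBS:
        return 502    # recognized, but not implemented here
    return 500        # not recognized at all
```

For instance, a server that deliberately leaves out EXPN would answer it with 502, while an unknown verb would get 500. Such a server would also omit from its EHLO response any capability for which it returns 502 or 500.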
+ +4.2.5 Reply Codes After DATA and the Subsequent <CRLF>.<CRLF> + + When an SMTP server returns a positive completion status (2yz code) + after the DATA command is completed with <CRLF>.<CRLF>, it accepts + responsibility for: + + - delivering the message (if the recipient mailbox exists), or + + - if attempts to deliver the message fail due to transient + conditions, retrying delivery some reasonable number of times at + intervals as specified in section 4.5.4. + + + + + + +Klensin Standards Track [Page 46] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + - if attempts to deliver the message fail due to permanent + conditions, or if repeated attempts to deliver the message fail + due to transient conditions, returning appropriate notification to + the sender of the original message (using the address in the SMTP + MAIL command). + + When an SMTP server returns a temporary error status (4yz) code after + the DATA command is completed with <CRLF>.<CRLF>, it MUST NOT make + any subsequent attempt to deliver that message. The SMTP client + retains responsibility for delivery of that message and may either + return it to the user or requeue it for a subsequent attempt (see + section 4.5.4.1). + + The user who originated the message SHOULD be able to interpret the + return of a transient failure status (by mail message or otherwise) + as a non-delivery indication, just as a permanent failure would be + interpreted. I.e., if the client SMTP successfully handles these + conditions, the user will not receive such a reply. + + When an SMTP server returns a permanent error status (5yz) code after + the DATA command is completed with <CRLF>.<CRLF>, it MUST NOT make + any subsequent attempt to deliver the message. As with temporary + error status codes, the SMTP client retains responsibility for the + message, but SHOULD NOT again attempt delivery to the same server + without user review and intervention of the message. 
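From the client's side, the division of responsibility in section 4.2.5 reduces to a decision on the first digit of the reply that follows <CRLF>.<CRLF>. The sketch below is illustrative: the function name `after_data_action` and the three action labels are hypothetical, and what "requeue" or "return to sender" means in practice is left to the caller.

```python
def after_data_action(code: int) -> str:
    """Map the reply to end-of-data onto the client's next step."""
    first = code // 100
    if first == 2:
        # The server has accepted responsibility for the message.
        return "done"
    if first == 4:
        # The client keeps the message and may retry later.
        return "requeue"
    if first == 5:
        # The client keeps the message but should not retry the same
        # server without user review.
        return "return_to_sender"
    raise ValueError("unexpected reply to end of data: %d" % code)
```

Only the first digit is inspected, in line with the rule that clients must cope with codes they do not recognize.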
+ +4.3 Sequencing of Commands and Replies + +4.3.1 Sequencing Overview + + The communication between the sender and receiver is an alternating + dialogue, controlled by the sender. As such, the sender issues a + command and the receiver responds with a reply. Unless other + arrangements are negotiated through service extensions, the sender + MUST wait for this response before sending further commands. + + One important reply is the connection greeting. Normally, a receiver + will send a 220 "Service ready" reply when the connection is + completed. The sender SHOULD wait for this greeting message before + sending any commands. + + Note: all the greeting-type replies have the official name (the + fully-qualified primary domain name) of the server host as the first + word following the reply code. Sometimes the host will have no + meaningful name. See 4.1.3 for a discussion of alternatives in these + situations. + + + + + +Klensin Standards Track [Page 47] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + For example, + + 220 ISIF.USC.EDU Service ready + or + 220 mail.foo.com SuperSMTP v 6.1.2 Service ready + or + 220 [10.0.0.1] Clueless host service ready + + The table below lists alternative success and failure replies for + each command. These SHOULD be strictly adhered to: a receiver may + substitute text in the replies, but the meaning and action implied by + the code numbers and by the specific command reply sequence cannot be + altered. + +4.3.2 Command-Reply Sequences + + Each command is listed with its usual possible replies. The prefixes + used before the possible replies are "I" for intermediate, "S" for + success, and "E" for error. Since some servers may generate other + replies under special circumstances, and to allow for future + extension, SMTP clients SHOULD, when possible, interpret only the + first digit of the reply and MUST be prepared to deal with + unrecognized reply codes by interpreting the first digit only. 
+ Unless extended using the mechanisms described in section 2.2, SMTP + servers MUST NOT transmit reply codes to an SMTP client that are + other than three digits or that do not start in a digit between 2 and + 5 inclusive. + + These sequencing rules and, in principle, the codes themselves, can + be extended or modified by SMTP extensions offered by the server and + accepted (requested) by the client. + + In addition to the codes listed below, any SMTP command can return + any of the following codes if the corresponding unusual circumstances + are encountered: + + 500 For the "command line too long" case or if the command name was + not recognized. Note that producing a "command not recognized" + error in response to the required subset of these commands is a + violation of this specification. + + 501 Syntax error in command or arguments. In order to provide for + future extensions, commands that are specified in this document as + not accepting arguments (DATA, RSET, QUIT) SHOULD return a 501 + message if arguments are supplied in the absence of EHLO- + advertised extensions. 
+ + 421 Service shutting down and closing transmission channel + + + +Klensin Standards Track [Page 48] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + Specific sequences are: + + CONNECTION ESTABLISHMENT + S: 220 + E: 554 + EHLO or HELO + S: 250 + E: 504, 550 + MAIL + S: 250 + E: 552, 451, 452, 550, 553, 503 + RCPT + S: 250, 251 (but see section 3.4 for discussion of 251 and 551) + E: 550, 551, 552, 553, 450, 451, 452, 503, 550 + DATA + I: 354 -> data -> S: 250 + E: 552, 554, 451, 452 + E: 451, 554, 503 + RSET + S: 250 + VRFY + S: 250, 251, 252 + E: 550, 551, 553, 502, 504 + EXPN + S: 250, 252 + E: 550, 500, 502, 504 + HELP + S: 211, 214 + E: 502, 504 + NOOP + S: 250 + QUIT + S: 221 + +4.4 Trace Information + + When an SMTP server receives a message for delivery or further + processing, it MUST insert trace ("time stamp" or "Received") + information at the beginning of the message content, as discussed in + section 4.1.1.4. + + This line MUST be structured as follows: + + - The FROM field, which MUST be supplied in an SMTP environment, + SHOULD contain both (1) the name of the source host as presented + in the EHLO command and (2) an address literal containing the IP + address of the source, determined from the TCP connection. + + + + +Klensin Standards Track [Page 49] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + - The ID field MAY contain an "@" as suggested in RFC 822, but this + is not required. + + - The FOR field MAY contain a list of <path> entries when multiple + RCPT commands have been given. This may raise some security + issues and is usually not desirable; see section 7.2. + + An Internet mail program MUST NOT change a Received: line that was + previously added to the message header. SMTP servers MUST prepend + Received lines to messages; they MUST NOT change the order of + existing lines or insert Received lines in any other location. 
+ + As the Internet grows, comparability of Received fields is important + for detecting problems, especially slow relays. SMTP servers that + create Received fields SHOULD use explicit offsets in the dates + (e.g., -0800), rather than time zone names of any type. Local time + (with an offset) is preferred to UT when feasible. This formulation + allows slightly more information about local circumstances to be + specified. If UT is needed, the receiver need merely do some simple + arithmetic to convert the values. Use of UT loses information about + the time zone-location of the server. If it is desired to supply a + time zone name, it SHOULD be included in a comment. + + When the delivery SMTP server makes the "final delivery" of a + message, it inserts a return-path line at the beginning of the mail + data. This use of return-path is required; mail systems MUST support + it. The return-path line preserves the information in the <reverse- + path> from the MAIL command. Here, final delivery means the message + has left the SMTP environment. Normally, this would mean it had been + delivered to the destination user or an associated mail drop, but in + some cases it may be further processed and transmitted by another + mail system. + + It is possible for the mailbox in the return path to be different + from the actual sender's mailbox, for example, if error responses are + to be delivered to a special error handling mailbox rather than to + the message sender. When mailing lists are involved, this + arrangement is common and useful as a means of directing errors to + the list maintainer rather than the message originator. + + The text above implies that the final mail data will begin with a + return path line, followed by one or more time stamp lines. These + lines will be followed by the mail data headers and body [32]. 
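The ordering described above (return path first, then time stamp lines, then the original headers and body) can be sketched as follows. This is a minimal illustration, not a conforming implementation: the helper names `prepend_received` and `prepend_return_path`, the host names and the address are all made up for the example, and no header folding is attempted. The date is produced with Python's standard `email.utils.format_datetime`, which emits an explicit numeric offset for a timezone-aware value.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def prepend_received(message, helo_name, peer_ip, my_name, when):
    """Put a new Received: line in front of the message, leaving the rest untouched."""
    if when.tzinfo is None:
        # Without a zone we could not state an explicit offset such as -0800.
        raise ValueError("timestamp must be timezone-aware")
    stamp = (
        f"Received: from {helo_name} ([{peer_ip}]) by {my_name} "
        f"with ESMTP; {format_datetime(when)}\r\n"
    )
    return stamp + message


def prepend_return_path(message, reverse_path):
    """At final delivery, record the MAIL command's reverse-path ('' means null)."""
    return f"Return-Path: <{reverse_path}>\r\n" + message


# Example: a server at UTC+1 stamps a message, then makes final delivery.
when = datetime(2012, 12, 21, 18, 1, 59, tzinfo=timezone(timedelta(hours=1)))
stamped = prepend_received(
    "Subject: hi\r\n\r\nbody\r\n", "client.example", "192.0.2.1", "mx.example", when
)
delivered = prepend_return_path(stamped, "list-owner@example.org")
```

Existing Received lines are never reordered or edited here; new trace information is only ever added at the front.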
+ + It is sometimes difficult for an SMTP server to determine whether or + not it is making final delivery since forwarding or other operations + may occur after the message is accepted for delivery. Consequently, + + + + +Klensin Standards Track [Page 50] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + any further (forwarding, gateway, or relay) systems MAY remove the + return path and rebuild the MAIL command as needed to ensure that + exactly one such line appears in a delivered message. + + A message-originating SMTP system SHOULD NOT send a message that + already contains a Return-path header. SMTP servers performing a + relay function MUST NOT inspect the message data, and especially not + to the extent needed to determine if Return-path headers are present. + SMTP servers making final delivery MAY remove Return-path headers + before adding their own. + + The primary purpose of the Return-path is to designate the address to + which messages indicating non-delivery or other mail system failures + are to be sent. For this to be unambiguous, exactly one return path + SHOULD be present when the message is delivered. Systems using RFC + 822 syntax with non-SMTP transports SHOULD designate an unambiguous + address, associated with the transport envelope, to which error + reports (e.g., non-delivery messages) should be sent. + + Historical note: Text in RFC 822 that appears to contradict the use + of the Return-path header (or the envelope reverse path address from + the MAIL command) as the destination for error messages is not + applicable on the Internet. The reverse path address (as copied into + the Return-path) MUST be used as the target of any mail containing + delivery error messages. + + In particular: + + - a gateway from SMTP->elsewhere SHOULD insert a return-path header, + unless it is known that the "elsewhere" transport also uses + Internet domain addresses and maintains the envelope sender + address separately. 
+ + - a gateway from elsewhere->SMTP SHOULD delete any return-path + header present in the message, and either copy that information to + the SMTP envelope or combine it with information present in the + envelope of the other transport system to construct the reverse + path argument to the MAIL command in the SMTP envelope. + + The server must give special treatment to cases in which the + processing following the end of mail data indication is only + partially successful. This could happen if, after accepting several + recipients and the mail data, the SMTP server finds that the mail + data could be successfully delivered to some, but not all, of the + recipients. In such cases, the response to the DATA command MUST be + an OK reply. However, the SMTP server MUST compose and send an + "undeliverable mail" notification message to the originator of the + message. + + + +Klensin Standards Track [Page 51] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + A single notification listing all of the failed recipients or + separate notification messages MUST be sent for each failed + recipient. For economy of processing by the sender, the former is + preferred when possible. All undeliverable mail notification + messages are sent using the MAIL command (even if they result from + processing the obsolete SEND, SOML, or SAML commands) and use a null + return path as discussed in section 3.7. + + The time stamp line and the return path line are formally defined as + follows: + +Return-path-line = "Return-Path:" FWS Reverse-path <CRLF> + +Time-stamp-line = "Received:" FWS Stamp <CRLF> + +Stamp = From-domain By-domain Opt-info ";" FWS date-time + + ; where "date-time" is as defined in [32] + ; but the "obs-" forms, especially two-digit + ; years, are prohibited in SMTP and MUST NOT be used. 
+ +From-domain = "FROM" FWS Extended-Domain CFWS + +By-domain = "BY" FWS Extended-Domain CFWS + +Extended-Domain = Domain / + ( Domain FWS "(" TCP-info ")" ) / + ( Address-literal FWS "(" TCP-info ")" ) + +TCP-info = Address-literal / ( Domain FWS Address-literal ) + ; Information derived by server from TCP connection + ; not client EHLO. + +Opt-info = [Via] [With] [ID] [For] + +Via = "VIA" FWS Link CFWS + +With = "WITH" FWS Protocol CFWS + +ID = "ID" FWS String / msg-id CFWS + +For = "FOR" FWS 1*( Path / Mailbox ) CFWS + +Link = "TCP" / Addtl-Link +Addtl-Link = Atom + ; Additional standard names for links are registered with the + ; Internet Assigned Numbers Authority (IANA). "Via" is + ; primarily of value with non-Internet transports. SMTP + + + +Klensin Standards Track [Page 52] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + ; servers SHOULD NOT use unregistered names. +Protocol = "ESMTP" / "SMTP" / Attdl-Protocol +Attdl-Protocol = Atom + ; Additional standard names for protocols are registered with the + ; Internet Assigned Numbers Authority (IANA). SMTP servers + ; SHOULD NOT use unregistered names. + +4.5 Additional Implementation Issues + +4.5.1 Minimum Implementation + + In order to make SMTP workable, the following minimum implementation + is required for all receivers. The following commands MUST be + supported to conform to this specification: + + EHLO + HELO + MAIL + RCPT + DATA + RSET + NOOP + QUIT + VRFY + + Any system that includes an SMTP server supporting mail relaying or + delivery MUST support the reserved mailbox "postmaster" as a case- + insensitive local name. This postmaster address is not strictly + necessary if the server always returns 554 on connection opening (as + described in section 3.1). 
The requirement to accept mail for + postmaster implies that RCPT commands which specify a mailbox for + postmaster at any of the domains for which the SMTP server provides + mail service, as well as the special case of "RCPT TO:<Postmaster>" + (with no domain specification), MUST be supported. + + SMTP systems are expected to make every reasonable effort to accept + mail directed to Postmaster from any other system on the Internet. + In extreme cases --such as to contain a denial of service attack or + other breach of security-- an SMTP server may block mail directed to + Postmaster. However, such arrangements SHOULD be narrowly tailored + so as to avoid blocking messages which are not part of such attacks. + +4.5.2 Transparency + + Without some provision for data transparency, the character sequence + "<CRLF>.<CRLF>" ends the mail text and cannot be sent by the user. + In general, users are not aware of such "forbidden" sequences. To + + + + +Klensin Standards Track [Page 53] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + allow all user composed text to be transmitted transparently, the + following procedures are used: + + - Before sending a line of mail text, the SMTP client checks the + first character of the line. If it is a period, one additional + period is inserted at the beginning of the line. + + - When a line of mail text is received by the SMTP server, it checks + the line. If the line is composed of a single period, it is + treated as the end of mail indicator. If the first character is a + period and there are other characters on the line, the first + character is deleted. + + The mail data may contain any of the 128 ASCII characters. All + characters are to be delivered to the recipient's mailbox, including + spaces, vertical and horizontal tabs, and other control characters. 
+ If the transmission channel provides an 8-bit byte (octet) data + stream, the 7-bit ASCII codes are transmitted right justified in the + octets, with the high order bits cleared to zero. See 3.7 for + special treatment of these conditions in SMTP systems serving a relay + function. + + In some systems it may be necessary to transform the data as it is + received and stored. This may be necessary for hosts that use a + different character set than ASCII as their local character set, that + store data in records rather than strings, or which use special + character sequences as delimiters inside mailboxes. If such + transformations are necessary, they MUST be reversible, especially if + they are applied to mail being relayed. + +4.5.3 Sizes and Timeouts + +4.5.3.1 Size limits and minimums + + There are several objects that have required minimum/maximum sizes. + Every implementation MUST be able to receive objects of at least + these sizes. Objects larger than these sizes SHOULD be avoided when + possible. However, some Internet mail constructs such as encoded + X.400 addresses [16] will often require larger objects: clients MAY + attempt to transmit these, but MUST be prepared for a server to + reject them if they cannot be handled by it. To the maximum extent + possible, implementation techniques which impose no limits on the + length of these objects should be used. + + local-part + The maximum total length of a user name or other local-part is 64 + characters. + + + + +Klensin Standards Track [Page 54] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + domain + The maximum total length of a domain name or number is 255 + characters. + + path + The maximum total length of a reverse-path or forward-path is 256 + characters (including the punctuation and element separators). + + command line + The maximum total length of a command line including the command + word and the <CRLF> is 512 characters. SMTP extensions may be + used to increase this limit. 
+ + reply line + The maximum total length of a reply line including the reply code + and the <CRLF> is 512 characters. More information may be + conveyed through multiple-line replies. + + text line + The maximum total length of a text line including the <CRLF> is + 1000 characters (not counting the leading dot duplicated for + transparency). This number may be increased by the use of SMTP + Service Extensions. + + message content + The maximum total length of a message content (including any + message headers as well as the message body) MUST BE at least 64K + octets. Since the introduction of Internet standards for + multimedia mail [12], message lengths on the Internet have grown + dramatically, and message size restrictions should be avoided if + at all possible. SMTP server systems that must impose + restrictions SHOULD implement the "SIZE" service extension [18], + and SMTP client systems that will send large messages SHOULD + utilize it when possible. + + recipients buffer + The minimum total number of recipients that must be buffered is + 100 recipients. Rejection of messages (for excessive recipients) + with fewer than 100 RCPT commands is a violation of this + specification. The general principle that relaying SMTP servers + MUST NOT, and delivery SMTP servers SHOULD NOT, perform validation + tests on message headers suggests that rejecting a message based + on the total number of recipients shown in header fields is to be + discouraged. A server which imposes a limit on the number of + recipients MUST behave in an orderly fashion, such as to reject + additional addresses over its limit rather than silently + discarding addresses previously accepted. 
A client that needs to + + + + +Klensin Standards Track [Page 55] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + deliver a message containing over 100 RCPT commands SHOULD be + prepared to transmit in 100-recipient "chunks" if the server + declines to accept more than 100 recipients in a single message. + + Errors due to exceeding these limits may be reported by using the + reply codes. Some examples of reply codes are: + + 500 Line too long. + or + 501 Path too long + or + 452 Too many recipients (see below) + or + 552 Too much mail data. + + RFC 821 [30] incorrectly listed the error where an SMTP server + exhausts its implementation limit on the number of RCPT commands + ("too many recipients") as having reply code 552. The correct reply + code for this condition is 452. Clients SHOULD treat a 552 code in + this case as a temporary, rather than permanent, failure so the logic + below works. + + When a conforming SMTP server encounters this condition, it has at + least 100 successful RCPT commands in its recipients buffer. If the + server is able to accept the message, then at least these 100 + addresses will be removed from the SMTP client's queue. When the + client attempts retransmission of those addresses which received 452 + responses, at least 100 of these will be able to fit in the SMTP + server's recipients buffer. Each retransmission attempt which is + able to deliver anything will be able to dispose of at least 100 of + these recipients. + + If an SMTP server has an implementation limit on the number of RCPT + commands and this limit is exhausted, it MUST use a response code of + 452 (but the client SHOULD also be prepared for a 552, as noted + above). If the server has a configured site-policy limitation on the + number of RCPT commands, it MAY instead use a 5XX response code. 
+ This would be most appropriate if the policy limitation was intended + to apply if the total recipient count for a particular message body + were enforced even if that message body was sent in multiple mail + transactions. + +4.5.3.2 Timeouts + + An SMTP client MUST provide a timeout mechanism. It MUST use per- + command timeouts rather than somehow trying to time the entire mail + transaction. Timeouts SHOULD be easily reconfigurable, preferably + without recompiling the SMTP code. To implement this, a timer is set + + + +Klensin Standards Track [Page 56] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + for each SMTP command and for each buffer of the data transfer. The + latter means that the overall timeout is inherently proportional to + the size of the message. + + Based on extensive experience with busy mail-relay hosts, the minimum + per-command timeout values SHOULD be as follows: + + Initial 220 Message: 5 minutes + An SMTP client process needs to distinguish between a failed TCP + connection and a delay in receiving the initial 220 greeting + message. Many SMTP servers accept a TCP connection but delay + delivery of the 220 message until their system load permits more + mail to be processed. + + MAIL Command: 5 minutes + + RCPT Command: 5 minutes + A longer timeout is required if processing of mailing lists and + aliases is not deferred until after the message was accepted. + + DATA Initiation: 2 minutes + This is while awaiting the "354 Start Input" reply to a DATA + command. + + Data Block: 3 minutes + This is while awaiting the completion of each TCP SEND call + transmitting a chunk of data. + + DATA Termination: 10 minutes. + This is while awaiting the "250 OK" reply. When the receiver gets + the final period terminating the message data, it typically + performs processing to deliver the message to a user mailbox. 
A + spurious timeout at this point would be very wasteful and would + typically result in delivery of multiple copies of the message, + since it has been successfully sent and the server has accepted + responsibility for delivery. See section 6.1 for additional + discussion. + + An SMTP server SHOULD have a timeout of at least 5 minutes while it + is awaiting the next command from the sender. + +4.5.4 Retry Strategies + + The common structure of a host SMTP implementation includes user + mailboxes, one or more areas for queuing messages in transit, and one + or more daemon processes for sending and receiving mail. The exact + structure will vary depending on the needs of the users on the host + + + + +Klensin Standards Track [Page 57] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + and the number and size of mailing lists supported by the host. We + describe several optimizations that have proved helpful, particularly + for mailers supporting high traffic levels. + + Any queuing strategy MUST include timeouts on all activities on a + per-command basis. A queuing strategy MUST NOT send error messages + in response to error messages under any circumstances. + +4.5.4.1 Sending Strategy + + The general model for an SMTP client is one or more processes that + periodically attempt to transmit outgoing mail. In a typical system, + the program that composes a message has some method for requesting + immediate attention for a new piece of outgoing mail, while mail that + cannot be transmitted immediately MUST be queued and periodically + retried by the sender. A mail queue entry will include not only the + message itself but also the envelope information. + + The sender MUST delay retrying a particular destination after one + attempt has failed. In general, the retry interval SHOULD be at + least 30 minutes; however, more sophisticated and variable strategies + will be beneficial when the SMTP client can determine the reason for + non-delivery. 
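The retry delay described above can be sketched as a small scheduling helper. The names `RETRY_INTERVAL`, `GIVE_UP_AFTER` and `next_attempt` are hypothetical; the 30-minute value is the minimum interval stated in this section, and the five-day limit is an assumption taken from the "at least 4-5 days" give-up guidance elsewhere in section 4.5.4.1. Both are kept as module-level settings because the retry parameters are meant to be configurable.

```python
from datetime import datetime, timedelta

RETRY_INTERVAL = timedelta(minutes=30)  # assumed setting: minimum delay between attempts
GIVE_UP_AFTER = timedelta(days=5)       # assumed setting: total time before giving up


def next_attempt(queued_at, last_failed_at):
    """Return when to retry, or None once the give-up time would be exceeded."""
    candidate = last_failed_at + RETRY_INTERVAL
    if candidate - queued_at > GIVE_UP_AFTER:
        # Too old: the caller should return the message to its sender instead.
        return None
    return candidate


# Example: a message queued at noon whose first attempt failed five minutes later.
queued = datetime(2001, 4, 1, 12, 0)
retry_at = next_attempt(queued, queued + timedelta(minutes=5))
```

A more elaborate client might vary the interval by failure reason; this sketch uses a fixed interval only to keep the rule visible.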
+ + Retries continue until the message is transmitted or the sender gives + up; the give-up time generally needs to be at least 4-5 days. The + parameters to the retry algorithm MUST be configurable. + + A client SHOULD keep a list of hosts it cannot reach and + corresponding connection timeouts, rather than just retrying queued + mail items. + + Experience suggests that failures are typically transient (the target + system or its connection has crashed), favoring a policy of two + connection attempts in the first hour the message is in the queue, + and then backing off to one every two or three hours. + + The SMTP client can shorten the queuing delay in cooperation with the + SMTP server. For example, if mail is received from a particular + address, it is likely that mail queued for that host can now be sent. + Application of this principle may, in many cases, eliminate the + requirement for an explicit "send queues now" function such as ETRN + [9]. + + The strategy may be further modified as a result of multiple + addresses per host (see below) to optimize delivery time vs. resource + usage. + + + + +Klensin Standards Track [Page 58] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + An SMTP client may have a large queue of messages for each + unavailable destination host. If all of these messages were retried + in every retry cycle, there would be excessive Internet overhead and + the sending system would be blocked for a long period. Note that an + SMTP client can generally determine that a delivery attempt has + failed only after a timeout of several minutes and even a one-minute + timeout per connection will result in a very large delay if retries + are repeated for dozens, or even hundreds, of queued messages to the + same host. + + At the same time, SMTP clients SHOULD use great care in caching + negative responses from servers. 
In an extreme case, if EHLO is + issued multiple times during the same SMTP connection, different + answers may be returned by the server. More significantly, 5yz + responses to the MAIL command MUST NOT be cached. + + When a mail message is to be delivered to multiple recipients, and + the SMTP server to which a copy of the message is to be sent is the + same for multiple recipients, then only one copy of the message + SHOULD be transmitted. That is, the SMTP client SHOULD use the + command sequence: MAIL, RCPT, RCPT,... RCPT, DATA instead of the + sequence: MAIL, RCPT, DATA, ..., MAIL, RCPT, DATA. However, if there + are very many addresses, a limit on the number of RCPT commands per + MAIL command MAY be imposed. Implementation of this efficiency + feature is strongly encouraged. + + Similarly, to achieve timely delivery, the SMTP client MAY support + multiple concurrent outgoing mail transactions. However, some limit + may be appropriate to protect the host from devoting all its + resources to mail. + +4.5.4.2 Receiving Strategy + + The SMTP server SHOULD attempt to keep a pending listen on the SMTP + port at all times. This requires the support of multiple incoming + TCP connections for SMTP. Some limit MAY be imposed but servers that + cannot handle more than one SMTP transaction at a time are not in + conformance with the intent of this specification. + + As discussed above, when the SMTP server receives mail from a + particular host address, it could activate its own SMTP queuing + mechanisms to retry any mail pending for that host address. 
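The single-copy rule from section 4.5.4.1 (MAIL, RCPT, RCPT, ..., RCPT, DATA rather than one transaction per recipient) can be sketched as below. The function `plan_transactions` and its `max_rcpt` parameter are hypothetical; the default of 100 is an assumption based on the recipients buffer minimum in section 4.5.3.1. The sketch only builds the command lines for one destination server and does not talk to the network.

```python
def plan_transactions(sender, recipients, max_rcpt=100):
    """Group recipients for one server into as few mail transactions as possible."""
    transactions = []
    for start in range(0, len(recipients), max_rcpt):
        chunk = recipients[start:start + max_rcpt]
        commands = [f"MAIL FROM:<{sender}>"]
        # One RCPT per recipient, all sharing a single copy of the message.
        commands += [f"RCPT TO:<{rcpt}>" for rcpt in chunk]
        commands.append("DATA")
        transactions.append(commands)
    return transactions
```

With three recipients and `max_rcpt=2`, the result is two transactions: the first carries two RCPT commands and the second carries one.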
+ +4.5.5 Messages with a null reverse-path + + There are several types of notification messages which are required + by existing and proposed standards to be sent with a null reverse + path, namely non-delivery notifications as discussed in section 3.7, + other kinds of Delivery Status Notifications (DSNs) [24], and also + Message Disposition Notifications (MDNs) [10]. All of these kinds of + messages are notifications about a previous message, and they are + sent to the reverse-path of the previous mail message. (If the + delivery of such a notification message fails, that usually indicates + a problem with the mail system of the host to which the notification + message is addressed. For this reason, at some hosts the MTA is set + up to forward such failed notification messages to someone who is + able to fix problems with the mail system, e.g., via the postmaster + alias.) + + All other types of messages (i.e., any message which is not required + by a standards-track RFC to have a null reverse-path) SHOULD be sent + with a valid, non-null reverse-path. + + Implementors of automated email processors should be careful to make + sure that the various kinds of messages with null reverse-path are + handled correctly; in particular, such systems SHOULD NOT reply to + messages with null reverse-path. + +5. Address Resolution and Mail Handling + + Once an SMTP client lexically identifies a domain to which mail will + be delivered for processing (as described in sections 3.6 and 3.7), a + DNS lookup MUST be performed to resolve the domain name [22]. The + names are expected to be fully-qualified domain names (FQDNs): + mechanisms for inferring FQDNs from partial names or local aliases + are outside of this specification and, due to a history of problems, + are generally discouraged. The lookup first attempts to locate an MX + record associated with the name.
If a CNAME record is found instead, + the resulting name is processed as if it were the initial name. If + no MX records are found, but an A RR is found, the A RR is treated as + if it were associated with an implicit MX RR, with a preference of 0, + pointing to that host. If one or more MX RRs are found for a given + name, SMTP systems MUST NOT utilize any A RRs associated with that + name unless they are located using the MX RRs; the "implicit MX" rule + above applies only if there are no MX records present. If MX records + are present, but none of them are usable, this situation MUST be + reported as an error. + + When the lookup succeeds, the mapping can result in a list of + alternative delivery addresses rather than a single address, because + of multiple MX records, multihoming, or both. To provide reliable + mail transmission, the SMTP client MUST be able to try (and retry) + each of the relevant addresses in this list in order, until a + delivery attempt succeeds. However, there MAY also be a configurable + limit on the number of alternate addresses that can be tried. In any + case, the SMTP client SHOULD try at least two addresses. + + Two types of information are used to rank the host addresses: multiple + MX records, and multihomed hosts. + + Multiple MX records contain a preference indication that MUST be used + in sorting (see below). Lower numbers are more preferred than higher + ones. If there are multiple destinations with the same preference + and there is no clear reason to favor one (e.g., by recognition of an + easily-reached address), then the sender-SMTP MUST randomize them to + spread the load across multiple mail exchangers for a specific + organization. + + The destination host (perhaps taken from the preferred MX record) may + be multihomed, in which case the domain name resolver will return a + list of alternative IP addresses.
It is the responsibility of the + domain name resolver interface to have ordered this list by + decreasing preference if necessary, and SMTP MUST try them in the + order presented. + + Although the capability to try multiple alternative addresses is + required, specific installations may want to limit or disable the use + of alternative addresses. The question of whether a sender should + attempt retries using the different addresses of a multihomed host + has been controversial. The main argument for using the multiple + addresses is that it maximizes the probability of timely delivery, + and indeed sometimes the probability of any delivery; the counter- + argument is that it may result in unnecessary resource use. Note + that resource use is also strongly determined by the sending strategy + discussed in section 4.5.4.1. + + If an SMTP server receives a message with a destination for which it + is a designated Mail eXchanger, it MAY relay the message (potentially + after having rewritten the MAIL FROM and/or RCPT TO addresses), make + final delivery of the message, or hand it off using some mechanism + outside the SMTP-provided transport environment. Of course, neither + of the latter require that the list of MX records be examined + further. + + If it determines that it should relay the message without rewriting + the address, it MUST sort the MX records to determine candidates for + delivery. The records are first ordered by preference, with the + lowest-numbered records being most preferred. The relay host MUST + then inspect the list for any of the names or addresses by which it + might be known in mail transactions. If a matching record is found, + all records at that preference level and higher-numbered ones MUST be + discarded from consideration. If there are no records left at that + point, it is an error condition, and the message MUST be returned as + undeliverable. If records do remain, they SHOULD be tried, best + preference first, as described above. 
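The MX handling in section 5 can be sketched in two steps: ordering by preference with random order among equal preferences, and, for a relay, discarding its own preference level and everything numbered higher. The function names `order_mx` and `relay_candidates` are hypothetical, and MX records are assumed to arrive as `(preference, host)` pairs from some resolver that is not shown here.

```python
import random


def order_mx(records, rng=None):
    """Sort (preference, host) pairs, lowest preference first, shuffling ties."""
    rng = rng or random.Random()
    by_pref = {}
    for pref, host in records:
        by_pref.setdefault(pref, []).append(host)
    ordered = []
    for pref in sorted(by_pref):
        hosts = by_pref[pref]
        rng.shuffle(hosts)  # spread load across exchangers of equal preference
        ordered.extend((pref, host) for host in hosts)
    return ordered


def relay_candidates(records, own_names):
    """Drop our own preference level and all higher-numbered records, then order."""
    own = {name.lower().rstrip(".") for name in own_names}
    own_prefs = [pref for pref, host in records if host.lower().rstrip(".") in own]
    if own_prefs:
        cutoff = min(own_prefs)
        records = [(pref, host) for pref, host in records if pref < cutoff]
    if not records:
        # Nothing left to try: the message has to be returned as undeliverable.
        raise LookupError("no usable MX records remain")
    return order_mx(records)
```

Host names are compared case-insensitively and with any trailing dot removed, which is a simplification; a real relay would also compare against the addresses by which it is known.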
+ + + +Klensin Standards Track [Page 61] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +6. Problem Detection and Handling + +6.1 Reliable Delivery and Replies by Email + + When the receiver-SMTP accepts a piece of mail (by sending a "250 OK" + message in response to DATA), it is accepting responsibility for + delivering or relaying the message. It must take this responsibility + seriously. It MUST NOT lose the message for frivolous reasons, such + as because the host later crashes or because of a predictable + resource shortage. + + If there is a delivery failure after acceptance of a message, the + receiver-SMTP MUST formulate and mail a notification message. This + notification MUST be sent using a null ("<>") reverse path in the + envelope. The recipient of this notification MUST be the address + from the envelope return path (or the Return-Path: line). However, + if this address is null ("<>"), the receiver-SMTP MUST NOT send a + notification. Obviously, nothing in this section can or should + prohibit local decisions (i.e., as part of the same system + environment as the receiver-SMTP) to log or otherwise transmit + information about null address events locally if that is desired. If + the address is an explicit source route, it MUST be stripped down to + its final hop. + + For example, suppose that an error notification must be sent for a + message that arrived with: + + MAIL FROM:<@a,@b:user@d> + + The notification message MUST be sent using: + + RCPT TO:<user@d> + + Some delivery failures after the message is accepted by SMTP will be + unavoidable. For example, it may be impossible for the receiving + SMTP server to validate all the delivery addresses in RCPT command(s) + due to a "soft" domain system error, because the target is a mailing + list (see earlier discussion of RCPT), or because the server is + acting as a relay and has no immediate access to the delivering + system. 
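The addressing rules for notifications in section 6.1 can be sketched as a helper that turns the envelope reverse path into the recipient of the notification. The name `notification_recipient` is hypothetical. The source-route handling is a simplification: it assumes the route hosts themselves contain no colon, which is true for the example above but would need more care for other address forms.

```python
def notification_recipient(reverse_path):
    """Return the mailbox to notify, or None when the reverse path is null."""
    path = reverse_path.strip()
    if path.startswith("<") and path.endswith(">"):
        path = path[1:-1]
    if not path:
        # Null reverse path "<>": no notification may be sent.
        return None
    if path.startswith("@") and ":" in path:
        # Explicit source route such as "@a,@b:user@d": keep only the final hop.
        path = path.split(":", 1)[1]
    return path
```

For the example in the text, `notification_recipient("<@a,@b:user@d>")` yields `user@d`, which is then used in `RCPT TO:<user@d>` with a null `MAIL FROM:<>`.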
+ + To avoid receiving duplicate messages as the result of timeouts, a + receiver-SMTP MUST seek to minimize the time required to respond to + the final <CRLF>.<CRLF> end of data indicator. See RFC 1047 [28] for + a discussion of this problem. + + + + + + +Klensin Standards Track [Page 62] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +6.2 Loop Detection + + Simple counting of the number of "Received:" headers in a message has + proven to be an effective, although rarely optimal, method of + detecting loops in mail systems. SMTP servers using this technique + SHOULD use a large rejection threshold, normally at least 100 + Received entries. Whatever mechanisms are used, servers MUST contain + provisions for detecting and stopping trivial loops. + +6.3 Compensating for Irregularities + + Unfortunately, variations, creative interpretations, and outright + violations of Internet mail protocols do occur; some would suggest + that they occur quite frequently. The debate as to whether a well- + behaved SMTP receiver or relay should reject a malformed message, + attempt to pass it on unchanged, or attempt to repair it to increase + the odds of successful delivery (or subsequent reply) began almost + with the dawn of structured network mail and shows no signs of + abating. Advocates of rejection claim that attempted repairs are + rarely completely adequate and that rejection of bad messages is the + only way to get the offending software repaired. Advocates of + "repair" or "deliver no matter what" argue that users prefer that + mail go through it if at all possible and that there are significant + market pressures in that direction. In practice, these market + pressures may be more important to particular vendors than strict + conformance to the standards, regardless of the preference of the + actual developers. + + The problems associated with ill-formed messages were exacerbated by + the introduction of the split-UA mail reading protocols [3, 26, 5, + 21]. 
These protocols have encouraged the use of SMTP as a posting + protocol, and SMTP servers as relay systems for these client hosts + (which are often only intermittently connected to the Internet). + Historically, many of those client machines lacked some of the + mechanisms and information assumed by SMTP (and indeed, by the mail + format protocol [7]). Some could not keep adequate track of time; + others had no concept of time zones; still others could not identify + their own names or addresses; and, of course, none could satisfy the + assumptions that underlay RFC 822's conception of authenticated + addresses. + + In response to these weak SMTP clients, many SMTP systems now + complete messages that are delivered to them in incomplete or + incorrect form. This strategy is generally considered appropriate + when the server can identify or authenticate the client, and there + are prior agreements between them. By contrast, there is at best + great concern about fixes applied by a relay or delivery SMTP server + that has little or no knowledge of the user or client machine. + + + +Klensin Standards Track [Page 63] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + The following changes to a message being processed MAY be applied + when necessary by an originating SMTP server, or one used as the + target of SMTP as an initial posting protocol: + + - Addition of a message-id field when none appears + + - Addition of a date, time or time zone when none appears + + - Correction of addresses to proper FQDN format + + The less information the server has about the client, the less likely + these changes are to be correct and the more caution and conservatism + should be applied when considering whether or not to perform fixes + and how. These changes MUST NOT be applied by an SMTP server that + provides an intermediate relay function. + + In all cases, properly-operating clients supplying correct + information are preferred to corrections by the SMTP server. 
In all + cases, documentation of actions performed by the servers (in trace + fields and/or header comments) is strongly encouraged. + +7. Security Considerations + +7.1 Mail Security and Spoofing + + SMTP mail is inherently insecure in that it is feasible for even + fairly casual users to negotiate directly with receiving and relaying + SMTP servers and create messages that will trick a naive recipient + into believing that they came from somewhere else. Constructing such + a message so that the "spoofed" behavior cannot be detected by an + expert is somewhat more difficult, but not sufficiently so as to be a + deterrent to someone who is determined and knowledgeable. + Consequently, as knowledge of Internet mail increases, so does the + knowledge that SMTP mail inherently cannot be authenticated, or + integrity checks provided, at the transport level. Real mail + security lies only in end-to-end methods involving the message + bodies, such as those which use digital signatures (see [14] and, + e.g., PGP [4] or S/MIME [31]). + + Various protocol extensions and configuration options that provide + authentication at the transport level (e.g., from an SMTP client to + an SMTP server) improve somewhat on the traditional situation + described above. However, unless they are accompanied by careful + handoffs of responsibility in a carefully-designed trust environment, + they remain inherently weaker than end-to-end mechanisms which use + digitally signed messages rather than depending on the integrity of + the transport system. 
+ + + + +Klensin Standards Track [Page 64] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + Efforts to make it more difficult for users to set envelope return + path and header "From" fields to point to valid addresses other than + their own are largely misguided: they frustrate legitimate + applications in which mail is sent by one user on behalf of another + or in which error (or normal) replies should be directed to a special + address. (Systems that provide convenient ways for users to alter + these fields on a per-message basis should attempt to establish a + primary and permanent mailbox address for the user so that Sender + fields within the message data can be generated sensibly.) + + This specification does not further address the authentication issues + associated with SMTP other than to advocate that useful functionality + not be disabled in the hope of providing some small margin of + protection against an ignorant user who is trying to fake mail. + +7.2 "Blind" Copies + + Addresses that do not appear in the message headers may appear in the + RCPT commands to an SMTP server for a number of reasons. The two + most common involve the use of a mailing address as a "list exploder" + (a single address that resolves into multiple addresses) and the + appearance of "blind copies". Especially when more than one RCPT + command is present, and in order to avoid defeating some of the + purpose of these mechanisms, SMTP clients and servers SHOULD NOT copy + the full set of RCPT command arguments into the headers, either as + part of trace headers or as informational or private-extension + headers. Since this rule is often violated in practice, and cannot + be enforced, sending SMTP systems that are aware of "bcc" use MAY + find it helpful to send each blind copy as a separate message + transaction containing only a single RCPT command. 
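The optional practice mentioned above, sending each blind copy as its own transaction with a single RCPT command, could be planned along these lines. This is only a sketch: `plan_transactions` is a hypothetical helper, and it assumes the sending system already knows which envelope recipients are blind copies.

```python
from typing import List


def plan_transactions(visible: List[str], blind: List[str]) -> List[List[str]]:
    """Return one RCPT list per SMTP transaction."""
    plans: List[List[str]] = []
    # Recipients named in the headers can share one transaction.
    if visible:
        plans.append(list(visible))
    # Each blind-copy recipient gets a transaction of its own.
    for address in blind:
        plans.append([address])
    return plans
```

Keeping every blind recipient alone in its transaction means that a receiving server which copies RCPT arguments into headers, contrary to the SHOULD NOT above, has no other recipient's address to expose.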
+ + There is no inherent relationship between either "reverse" (from + MAIL, SAML, etc., commands) or "forward" (RCPT) addresses in the SMTP + transaction ("envelope") and the addresses in the headers. Receiving + systems SHOULD NOT attempt to deduce such relationships and use them + to alter the headers of the message for delivery. The popular + "Apparently-to" header is a violation of this principle as well as a + common source of unintended information disclosure and SHOULD NOT be + used. + +7.3 VRFY, EXPN, and Security + + As discussed in section 3.5, individual sites may want to disable + either or both of VRFY or EXPN for security reasons. As a corollary + to the above, implementations that permit this MUST NOT appear to + have verified addresses that are not, in fact, verified. If a site + + + + + +Klensin Standards Track [Page 65] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + disables these commands for security reasons, the SMTP server MUST + return a 252 response, rather than a code that could be confused with + successful or unsuccessful verification. + + Returning a 250 reply code with the address listed in the VRFY + command after having checked it only for syntax violates this rule. + Of course, an implementation that "supports" VRFY by always returning + 550 whether or not the address is valid is equally not in + conformance. + + Within the last few years, the contents of mailing lists have become + popular as an address information source for so-called "spammers." + The use of EXPN to "harvest" addresses has increased as list + administrators have installed protections against inappropriate uses + of the lists themselves. Implementations SHOULD still provide + support for EXPN, but sites SHOULD carefully evaluate the tradeoffs. + As authentication mechanisms are introduced into SMTP, some sites may + choose to make EXPN available only to authenticated requestors. 
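One way to keep a server from appearing to verify addresses it has not verified is to choose the VRFY reply as sketched below. The names `vrfy_reply` and `lookup` are hypothetical, and the reply texts are illustrative; only the reply codes (250, 252, 550) come from the section above.

```python
from typing import Callable, Optional


def vrfy_reply(argument: str, enabled: bool,
               lookup: Callable[[str], Optional[str]]) -> str:
    """Pick a VRFY reply line.

    lookup is assumed to perform a real mailbox check and to return the
    verified address, or None when the mailbox does not exist.
    """
    if not enabled:
        # Disabled for security reasons: neither confirm nor deny.
        return "252 Cannot VRFY user, but will accept message and attempt delivery"

    verified = lookup(argument)
    if verified is None:
        return "550 No such user here"
    # 250 is returned only after an actual verification, never after a
    # syntax check alone.
    return "250 <" + verified + ">"
```

A server that has switched VRFY off passes `enabled=False` and always answers 252, which cannot be confused with either a successful or an unsuccessful verification.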
+ +7.4 Information Disclosure in Announcements + + There has been an ongoing debate about the tradeoffs between the + debugging advantages of announcing server type and version (and, + sometimes, even server domain name) in the greeting response or in + response to the HELP command and the disadvantages of exposing + information that might be useful in a potential hostile attack. The + utility of the debugging information is beyond doubt. Those who + argue for making it available point out that it is far better to + actually secure an SMTP server rather than hope that trying to + conceal known vulnerabilities by hiding the server's precise identity + will provide more protection. Sites are encouraged to evaluate the + tradeoff with that issue in mind; implementations are strongly + encouraged to minimally provide for making type and version + information available in some way to other network hosts. + +7.5 Information Disclosure in Trace Fields + + In some circumstances, such as when mail originates from within a LAN + whose hosts are not directly on the public Internet, trace + ("Received") fields produced in conformance with this specification + may disclose host names and similar information that would not + normally be available. This ordinarily does not pose a problem, but + sites with special concerns about name disclosure should be aware of + it. Also, the optional FOR clause should be supplied with caution or + not at all when multiple recipients are involved lest it + inadvertently disclose the identities of "blind copy" recipients to + others. + + + + +Klensin Standards Track [Page 66] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +7.6 Information Disclosure in Message Forwarding + + As discussed in section 3.4, use of the 251 or 551 reply codes to + identify the replacement address associated with a mailbox may + inadvertently disclose sensitive information. 
Sites that are + concerned about those issues should ensure that they select and + configure servers appropriately. + +7.7 Scope of Operation of SMTP Servers + + It is a well-established principle that an SMTP server may refuse to + accept mail for any operational or technical reason that makes sense + to the site providing the server. However, cooperation among sites + and installations makes the Internet possible. If sites take + excessive advantage of the right to reject traffic, the ubiquity of + email availability (one of the strengths of the Internet) will be + threatened; considerable care should be taken and balance maintained + if a site decides to be selective about the traffic it will accept + and process. + + In recent years, use of the relay function through arbitrary sites + has been used as part of hostile efforts to hide the actual origins + of mail. Some sites have decided to limit the use of the relay + function to known or identifiable sources, and implementations SHOULD + provide the capability to perform this type of filtering. When mail + is rejected for these or other policy reasons, a 550 code SHOULD be + used in response to EHLO, MAIL, or RCPT as appropriate. + +8. IANA Considerations + + IANA will maintain three registries in support of this specification. + The first consists of SMTP service extensions with the associated + keywords, and, as needed, parameters and verbs. As specified in + section 2.2.2, no entry may be made in this registry that starts in + an "X". Entries may be made only for service extensions (and + associated keywords, parameters, or verbs) that are defined in + standards-track or experimental RFCs specifically approved by the + IESG for this purpose. + + The second registry consists of "tags" that identify forms of domain + literals other than those for IPv4 addresses (specified in RFC 821 + and in this document) and IPv6 addresses (specified in this + document). 
Additional literal types require standardization before + being used; none are anticipated at this time. + + The third, established by RFC 821 and renewed by this specification, + is a registry of link and protocol identifiers to be used with the + "via" and "with" subclauses of the time stamp ("Received: header") + + + +Klensin Standards Track [Page 67] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + described in section 4.4. Link and protocol identifiers in addition + to those specified in this document may be registered only by + standardization or by way of an RFC-documented, IESG-approved, + Experimental protocol extension. + +9. References + + [1] American National Standards Institute (formerly United States of + America Standards Institute), X3.4, 1968, "USA Code for + Information Interchange". ANSI X3.4-1968 has been replaced by + newer versions with slight modifications, but the 1968 version + remains definitive for the Internet. + + [2] Braden, R., "Requirements for Internet hosts - application and + support", STD 3, RFC 1123, October 1989. + + [3] Butler, M., Chase, D., Goldberger, J., Postel, J. and J. + Reynolds, "Post Office Protocol - version 2", RFC 937, February + 1985. + + [4] Callas, J., Donnerhacke, L., Finney, H. and R. Thayer, "OpenPGP + Message Format", RFC 2440, November 1998. + + [5] Crispin, M., "Interactive Mail Access Protocol - Version 2", RFC + 1176, August 1990. + + [6] Crispin, M., "Internet Message Access Protocol - Version 4", RFC + 2060, December 1996. + + [7] Crocker, D., "Standard for the Format of ARPA Internet Text + Messages", RFC 822, August 1982. + + [8] Crocker, D. and P. Overell, Eds., "Augmented BNF for Syntax + Specifications: ABNF", RFC 2234, November 1997. + + [9] De Winter, J., "SMTP Service Extension for Remote Message Queue + Starting", RFC 1985, August 1996. + + [10] Fajman, R., "An Extensible Message Format for Message + Disposition Notifications", RFC 2298, March 1998. 
+ + [11] Freed, N, "Behavior of and Requirements for Internet Firewalls", + RFC 2979, October 2000. + + [12] Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part One: Format of Internet Message Bodies", + RFC 2045, December 1996. + + + + +Klensin Standards Track [Page 68] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + [13] Freed, N., "SMTP Service Extension for Command Pipelining", RFC + 2920, September 2000. + + [14] Galvin, J., Murphy, S., Crocker, S. and N. Freed, "Security + Multiparts for MIME: Multipart/Signed and Multipart/Encrypted", + RFC 1847, October 1995. + + [15] Gellens, R. and J. Klensin, "Message Submission", RFC 2476, + December 1998. + + [16] Kille, S., "Mapping between X.400 and RFC822/MIME", RFC 2156, + January 1998. + + [17] Hinden, R and S. Deering, Eds. "IP Version 6 Addressing + Architecture", RFC 2373, July 1998. + + [18] Klensin, J., Freed, N. and K. Moore, "SMTP Service Extension for + Message Size Declaration", STD 10, RFC 1870, November 1995. + + [19] Klensin, J., Freed, N., Rose, M., Stefferud, E. and D. Crocker, + "SMTP Service Extensions", STD 10, RFC 1869, November 1995. + + [20] Klensin, J., Freed, N., Rose, M., Stefferud, E. and D. Crocker, + "SMTP Service Extension for 8bit-MIMEtransport", RFC 1652, July + 1994. + + [21] Lambert, M., "PCMAIL: A distributed mail system for personal + computers", RFC 1056, July 1988. + + [22] Mockapetris, P., "Domain names - implementation and + specification", STD 13, RFC 1035, November 1987. + + Mockapetris, P., "Domain names - concepts and facilities", STD + 13, RFC 1034, November 1987. + + [23] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part + Three: Message Header Extensions for Non-ASCII Text", RFC 2047, + December 1996. + + [24] Moore, K., "SMTP Service Extension for Delivery Status + Notifications", RFC 1891, January 1996. + + [25] Moore, K., and G. 
Vaudreuil, "An Extensible Message Format for + Delivery Status Notifications", RFC 1894, January 1996. + + [26] Myers, J. and M. Rose, "Post Office Protocol - Version 3", STD + 53, RFC 1939, May 1996. + + + + +Klensin Standards Track [Page 69] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + [27] Partridge, C., "Mail routing and the domain system", RFC 974, + January 1986. + + [28] Partridge, C., "Duplicate messages and SMTP", RFC 1047, February + 1988. + + [29] Postel, J., ed., "Transmission Control Protocol - DARPA Internet + Program Protocol Specification", STD 7, RFC 793, September 1981. + + [30] Postel, J., "Simple Mail Transfer Protocol", RFC 821, August + 1982. + + [31] Ramsdell, B., Ed., "S/MIME Version 3 Message Specification", RFC + 2633, June 1999. + + [32] Resnick, P., Ed., "Internet Message Format", RFC 2822, April + 2001. + + [33] Vaudreuil, G., "SMTP Service Extensions for Transmission of + Large and Binary MIME Messages", RFC 1830, August 1995. + + [34] Vaudreuil, G., "Enhanced Mail System Status Codes", RFC 1893, + January 1996. + +10. Editor's Address + + John C. Klensin + AT&T Laboratories + 99 Bedford St + Boston, MA 02111 USA + + Phone: 617-574-3076 + EMail: klensin@research.att.com + +11. Acknowledgments + + Many people worked long and hard on the many iterations of this + document. There was wide-ranging debate in the IETF DRUMS Working + Group, both on its mailing list and in face to face discussions, + about many technical issues and the role of a revised standard for + Internet mail transport, and many contributors helped form the + wording in this specification. The hundreds of participants in the + many discussions since RFC 821 was produced are too numerous to + mention, but they all helped this document become what it is. + + + + + + + +Klensin Standards Track [Page 70] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +APPENDICES + +A. 
TCP Transport Service + + The TCP connection supports the transmission of 8-bit bytes. The + SMTP data is 7-bit ASCII characters. Each character is transmitted + as an 8-bit byte with the high-order bit cleared to zero. Service + extensions may modify this rule to permit transmission of full 8-bit + data bytes as part of the message body, but not in SMTP commands or + responses. + +B. Generating SMTP Commands from RFC 822 Headers + + Some systems use RFC 822 headers (only) in a mail submission + protocol, or otherwise generate SMTP commands from RFC 822 headers + when such a message is handed to an MTA from a UA. While the MTA-UA + protocol is a private matter, not covered by any Internet Standard, + there are problems with this approach. For example, there have been + repeated problems with proper handling of "bcc" copies and + redistribution lists when information that conceptually belongs to a + mail envelopes is not separated early in processing from header + information (and kept separate). + + It is recommended that the UA provide its initial ("submission + client") MTA with an envelope separate from the message itself. + However, if the envelope is not supplied, SMTP commands SHOULD be + generated as follows: + + 1. Each recipient address from a TO, CC, or BCC header field SHOULD + be copied to a RCPT command (generating multiple message copies if + that is required for queuing or delivery). This includes any + addresses listed in a RFC 822 "group". Any BCC fields SHOULD then + be removed from the headers. Once this process is completed, the + remaining headers SHOULD be checked to verify that at least one + To:, Cc:, or Bcc: header remains. If none do, then a bcc: header + with no additional information SHOULD be inserted as specified in + [32]. + + 2. The return address in the MAIL command SHOULD, if possible, be + derived from the system's identity for the submitting (local) + user, and the "From:" header field otherwise. 
If there is a + system identity available, it SHOULD also be copied to the Sender + header field if it is different from the address in the From + header field. (Any Sender field that was already there SHOULD be + removed.) Systems may provide a way for submitters to override + the envelope return address, but may want to restrict its use to + privileged users. This will not prevent mail forgery, but may + lessen its incidence; see section 7.1. + + + +Klensin Standards Track [Page 71] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + When an MTA is being used in this way, it bears responsibility for + ensuring that the message being transmitted is valid. The mechanisms + for checking that validity, and for handling (or returning) messages + that are not valid at the time of arrival, are part of the MUA-MTA + interface and not covered by this specification. + + A submission protocol based on Standard RFC 822 information alone + MUST NOT be used to gateway a message from a foreign (non-SMTP) mail + system into an SMTP environment. Additional information to construct + an envelope must come from some source in the other environment, + whether supplemental headers or the foreign system's envelope. + + Attempts to gateway messages using only their header "to" and "cc" + fields have repeatedly caused mail loops and other behavior adverse + to the proper functioning of the Internet mail environment. These + problems have been especially common when the message originates from + an Internet mailing list and is distributed into the foreign + environment using envelope information. When these messages are then + processed by a header-only remailer, loops back to the Internet + environment (and the mailing list) are almost inevitable. + +C. Source Routes + + Historically, the <reverse-path> was a reverse source routing list of + hosts and a source mailbox. The first host in the <reverse-path> + SHOULD be the host sending the MAIL command. 
Similarly, the + <forward-path> may be a source routing lists of hosts and a + destination mailbox. However, in general, the <forward-path> SHOULD + contain only a mailbox and domain name, relying on the domain name + system to supply routing information if required. The use of source + routes is deprecated; while servers MUST be prepared to receive and + handle them as discussed in section 3.3 and F.2, clients SHOULD NOT + transmit them and this section was included only to provide context. + + For relay purposes, the forward-path may be a source route of the + form "@ONE,@TWO:JOE@THREE", where ONE, TWO, and THREE MUST BE fully- + qualified domain names. This form is used to emphasize the + distinction between an address and a route. The mailbox is an + absolute address, and the route is information about how to get + there. The two concepts should not be confused. + + If source routes are used, RFC 821 and the text below should be + consulted for the mechanisms for constructing and updating the + forward- and reverse-paths. + + + + + + + +Klensin Standards Track [Page 72] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + The SMTP server transforms the command arguments by moving its own + identifier (its domain name or that of any domain for which it is + acting as a mail exchanger), if it appears, from the forward-path to + the beginning of the reverse-path. + + Notice that the forward-path and reverse-path appear in the SMTP + commands and replies, but not necessarily in the message. That is, + there is no need for these paths and especially this syntax to appear + in the "To:" , "From:", "CC:", etc. fields of the message header. + Conversely, SMTP servers MUST NOT derive final message delivery + information from message header fields. + + When the list of hosts is present, it is a "reverse" source route and + indicates that the mail was relayed through each host on the list + (the first host in the list was the most recent relay). 
This list is + used as a source route to return non-delivery notices to the sender. + As each relay host adds itself to the beginning of the list, it MUST + use its name as known in the transport environment to which it is + relaying the mail rather than that of the transport environment from + which the mail came (if they are different). + +D. Scenarios + + This section presents complete scenarios of several types of SMTP + sessions. In the examples, "C:" indicates what is said by the SMTP + client, and "S:" indicates what is said by the SMTP server. + +D.1 A Typical SMTP Transaction Scenario + + This SMTP example shows mail sent by Smith at host bar.com, to Jones, + Green, and Brown at host foo.com. Here we assume that host bar.com + contacts host foo.com directly. The mail is accepted for Jones and + Brown. Green does not have a mailbox at host foo.com. + + S: 220 foo.com Simple Mail Transfer Service Ready + C: EHLO bar.com + S: 250-foo.com greets bar.com + S: 250-8BITMIME + S: 250-SIZE + S: 250-DSN + S: 250 HELP + C: MAIL FROM:<Smith@bar.com> + S: 250 OK + C: RCPT TO:<Jones@foo.com> + S: 250 OK + C: RCPT TO:<Green@foo.com> + S: 550 No such user here + C: RCPT TO:<Brown@foo.com> + + + +Klensin Standards Track [Page 73] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + S: 250 OK + C: DATA + S: 354 Start mail input; end with <CRLF>.<CRLF> + C: Blah blah blah... + C: ...etc. etc. etc. + C: . 
+ S: 250 OK + C: QUIT + S: 221 foo.com Service closing transmission channel + +D.2 Aborted SMTP Transaction Scenario + + S: 220 foo.com Simple Mail Transfer Service Ready + C: EHLO bar.com + S: 250-foo.com greets bar.com + S: 250-8BITMIME + S: 250-SIZE + S: 250-DSN + S: 250 HELP + C: MAIL FROM:<Smith@bar.com> + S: 250 OK + C: RCPT TO:<Jones@foo.com> + S: 250 OK + C: RCPT TO:<Green@foo.com> + S: 550 No such user here + C: RSET + S: 250 OK + C: QUIT + S: 221 foo.com Service closing transmission channel + +D.3 Relayed Mail Scenario + + Step 1 -- Source Host to Relay Host + + S: 220 foo.com Simple Mail Transfer Service Ready + C: EHLO bar.com + S: 250-foo.com greets bar.com + S: 250-8BITMIME + S: 250-SIZE + S: 250-DSN + S: 250 HELP + C: MAIL FROM:<JQP@bar.com> + S: 250 OK + C: RCPT TO:<@foo.com:Jones@XYZ.COM> + S: 250 OK + C: DATA + S: 354 Start mail input; end with <CRLF>.<CRLF> + C: Date: Thu, 21 May 1998 05:33:29 -0700 + + + +Klensin Standards Track [Page 74] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + C: From: John Q. Public <JQP@bar.com> + C: Subject: The Next Meeting of the Board + C: To: Jones@xyz.com + C: + C: Bill: + C: The next meeting of the board of directors will be + C: on Tuesday. + C: John. + C: . + S: 250 OK + C: QUIT + S: 221 foo.com Service closing transmission channel + + Step 2 -- Relay Host to Destination Host + + S: 220 xyz.com Simple Mail Transfer Service Ready + C: EHLO foo.com + S: 250 xyz.com is on the air + C: MAIL FROM:<@foo.com:JQP@bar.com> + S: 250 OK + C: RCPT TO:<Jones@XYZ.COM> + S: 250 OK + C: DATA + S: 354 Start mail input; end with <CRLF>.<CRLF> + C: Received: from bar.com by foo.com ; Thu, 21 May 1998 + C: 05:33:29 -0700 + C: Date: Thu, 21 May 1998 05:33:22 -0700 + C: From: John Q. Public <JQP@bar.com> + C: Subject: The Next Meeting of the Board + C: To: Jones@xyz.com + C: + C: Bill: + C: The next meeting of the board of directors will be + C: on Tuesday. + C: John. + C: . 
+ S: 250 OK + C: QUIT + S: 221 foo.com Service closing transmission channel + +D.4 Verifying and Sending Scenario + + S: 220 foo.com Simple Mail Transfer Service Ready + C: EHLO bar.com + S: 250-foo.com greets bar.com + S: 250-8BITMIME + S: 250-SIZE + S: 250-DSN + + + +Klensin Standards Track [Page 75] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + + S: 250-VRFY + S: 250 HELP + C: VRFY Crispin + S: 250 Mark Crispin <Admin.MRC@foo.com> + C: SEND FROM:<EAK@bar.com> + S: 250 OK + C: RCPT TO:<Admin.MRC@foo.com> + S: 250 OK + C: DATA + S: 354 Start mail input; end with <CRLF>.<CRLF> + C: Blah blah blah... + C: ...etc. etc. etc. + C: . + S: 250 OK + C: QUIT + S: 221 foo.com Service closing transmission channel + +E. Other Gateway Issues + + In general, gateways between the Internet and other mail systems + SHOULD attempt to preserve any layering semantics across the + boundaries between the two mail systems involved. Gateway- + translation approaches that attempt to take shortcuts by mapping, + (such as envelope information from one system to the message headers + or body of another) have generally proven to be inadequate in + important ways. Systems translating between environments that do not + support both envelopes and headers and Internet mail must be written + with the understanding that some information loss is almost + inevitable. + +F. Deprecated Features of RFC 821 + + A few features of RFC 821 have proven to be problematic and SHOULD + NOT be used in Internet mail. + +F.1 TURN + + This command, described in RFC 821, raises important security issues + since, in the absence of strong authentication of the host requesting + that the client and server switch roles, it can easily be used to + divert mail from its correct destination. Its use is deprecated; + SMTP systems SHOULD NOT use it unless the server can authenticate the + client. 
+ + + + + + + + +Klensin Standards Track [Page 76] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +F.2 Source Routing + + RFC 821 utilized the concept of explicit source routing to get mail + from one host to another via a series of relays. The requirement to + utilize source routes in regular mail traffic was eliminated by the + introduction of the domain name system "MX" record and the last + significant justification for them was eliminated by the + introduction, in RFC 1123, of a clear requirement that addresses + following an "@" must all be fully-qualified domain names. + Consequently, the only remaining justifications for the use of source + routes are support for very old SMTP clients or MUAs and in mail + system debugging. They can, however, still be useful in the latter + circumstance and for routing mail around serious, but temporary, + problems such as problems with the relevant DNS records. + + SMTP servers MUST continue to accept source route syntax as specified + in the main body of this document and in RFC 1123. They MAY, if + necessary, ignore the routes and utilize only the target domain in + the address. If they do utilize the source route, the message MUST + be sent to the first domain shown in the address. In particular, a + server MUST NOT guess at shortcuts within the source route. + + Clients SHOULD NOT utilize explicit source routing except under + unusual circumstances, such as debugging or potentially relaying + around firewall or mail system configuration errors. + +F.3 HELO + + As discussed in sections 3.1 and 4.1.1, EHLO is strongly preferred to + HELO when the server will accept the former. Servers must continue + to accept and process HELO in order to support older clients. + +F.4 #-literals + + RFC 821 provided for specifying an Internet address as a decimal + integer host number prefixed by a pound sign, "#". In practice, that + form has been obsolete since the introduction of TCP/IP. 
It is + deprecated and MUST NOT be used. + +F.5 Dates and Years + + When dates are inserted into messages by SMTP clients or servers + (e.g., in trace fields), four-digit years MUST BE used. Two-digit + years are deprecated; three-digit years were never permitted in the + Internet mail system. + + + + + + +Klensin Standards Track [Page 77] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +F.6 Sending versus Mailing + + In addition to specifying a mechanism for delivering messages to + user's mailboxes, RFC 821 provided additional, optional, commands to + deliver messages directly to the user's terminal screen. These + commands (SEND, SAML, SOML) were rarely implemented, and changes in + workstation technology and the introduction of other protocols may + have rendered them obsolete even where they are implemented. + + Clients SHOULD NOT provide SEND, SAML, or SOML as services. Servers + MAY implement them. If they are implemented by servers, the + implementation model specified in RFC 821 MUST be used and the + command names MUST be published in the response to the EHLO command. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Klensin Standards Track [Page 78] + +RFC 2821 Simple Mail Transfer Protocol April 2001 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2001). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. 
However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Klensin Standards Track [Page 79] + diff --git a/rfc/rfc2822.txt b/rfc/rfc2822.txt @@ -0,0 +1,2859 @@ + + + + + + +Network Working Group P. Resnick, Editor +Request for Comments: 2822 QUALCOMM Incorporated +Obsoletes: 822 April 2001 +Category: Standards Track + + + Internet Message Format + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2001). All Rights Reserved. 
+ +Abstract + + This standard specifies a syntax for text messages that are sent + between computer users, within the framework of "electronic mail" + messages. This standard supersedes the one specified in Request For + Comments (RFC) 822, "Standard for the Format of ARPA Internet Text + Messages", updating it to reflect current practice and incorporating + incremental changes that were specified in other RFCs. + +Table of Contents + + 1. Introduction ............................................... 3 + 1.1. Scope .................................................... 3 + 1.2. Notational conventions ................................... 4 + 1.2.1. Requirements notation .................................. 4 + 1.2.2. Syntactic notation ..................................... 4 + 1.3. Structure of this document ............................... 4 + 2. Lexical Analysis of Messages ............................... 5 + 2.1. General Description ...................................... 5 + 2.1.1. Line Length Limits ..................................... 6 + 2.2. Header Fields ............................................ 7 + 2.2.1. Unstructured Header Field Bodies ....................... 7 + 2.2.2. Structured Header Field Bodies ......................... 7 + 2.2.3. Long Header Fields ..................................... 7 + 2.3. Body ..................................................... 8 + 3. Syntax ..................................................... 9 + 3.1. Introduction ............................................. 9 + 3.2. Lexical Tokens ........................................... 9 + + + +Resnick Standards Track [Page 1] + +RFC 2822 Internet Message Format April 2001 + + + 3.2.1. Primitive Tokens ....................................... 9 + 3.2.2. Quoted characters ......................................10 + 3.2.3. Folding white space and comments .......................11 + 3.2.4. Atom ...................................................12 + 3.2.5. 
Quoted strings .........................................13 + 3.2.6. Miscellaneous tokens ...................................13 + 3.3. Date and Time Specification ..............................14 + 3.4. Address Specification ....................................15 + 3.4.1. Addr-spec specification ................................16 + 3.5 Overall message syntax ....................................17 + 3.6. Field definitions ........................................18 + 3.6.1. The origination date field .............................20 + 3.6.2. Originator fields ......................................21 + 3.6.3. Destination address fields .............................22 + 3.6.4. Identification fields ..................................23 + 3.6.5. Informational fields ...................................26 + 3.6.6. Resent fields ..........................................26 + 3.6.7. Trace fields ...........................................28 + 3.6.8. Optional fields ........................................29 + 4. Obsolete Syntax ............................................29 + 4.1. Miscellaneous obsolete tokens ............................30 + 4.2. Obsolete folding white space .............................31 + 4.3. Obsolete Date and Time ...................................31 + 4.4. Obsolete Addressing ......................................33 + 4.5. Obsolete header fields ...................................33 + 4.5.1. Obsolete origination date field ........................34 + 4.5.2. Obsolete originator fields .............................34 + 4.5.3. Obsolete destination address fields ....................34 + 4.5.4. Obsolete identification fields .........................35 + 4.5.5. Obsolete informational fields ..........................35 + 4.5.6. Obsolete resent fields .................................35 + 4.5.7. Obsolete trace fields ..................................36 + 4.5.8. Obsolete optional fields ...............................36 + 5. 
Security Considerations ....................................36 + 6. Bibliography ...............................................37 + 7. Editor's Address ...........................................38 + 8. Acknowledgements ...........................................39 + Appendix A. Example messages ..................................41 + A.1. Addressing examples ......................................41 + A.1.1. A message from one person to another with simple + addressing .............................................41 + A.1.2. Different types of mailboxes ...........................42 + A.1.3. Group addresses ........................................43 + A.2. Reply messages ...........................................43 + A.3. Resent messages ..........................................44 + A.4. Messages with trace fields ...............................46 + A.5. White space, comments, and other oddities ................47 + A.6. Obsoleted forms ..........................................47 + + + +Resnick Standards Track [Page 2] + +RFC 2822 Internet Message Format April 2001 + + + A.6.1. Obsolete addressing ....................................48 + A.6.2. Obsolete dates .........................................48 + A.6.3. Obsolete white space and comments ......................48 + Appendix B. Differences from earlier standards ................49 + Appendix C. Notices ...........................................50 + Full Copyright Statement ......................................51 + +1. Introduction + +1.1. Scope + + This standard specifies a syntax for text messages that are sent + between computer users, within the framework of "electronic mail" + messages. This standard supersedes the one specified in Request For + Comments (RFC) 822, "Standard for the Format of ARPA Internet Text + Messages" [RFC822], updating it to reflect current practice and + incorporating incremental changes that were specified in other RFCs + [STD3]. 
+ + This standard specifies a syntax only for text messages. In + particular, it makes no provision for the transmission of images, + audio, or other sorts of structured data in electronic mail messages. + There are several extensions published, such as the MIME document + series [RFC2045, RFC2046, RFC2049], which describe mechanisms for the + transmission of such data through electronic mail, either by + extending the syntax provided here or by structuring such messages to + conform to this syntax. Those mechanisms are outside of the scope of + this standard. + + In the context of electronic mail, messages are viewed as having an + envelope and contents. The envelope contains whatever information is + needed to accomplish transmission and delivery. (See [RFC2821] for a + discussion of the envelope.) The contents comprise the object to be + delivered to the recipient. This standard applies only to the format + and some of the semantics of message contents. It contains no + specification of the information in the envelope. + + However, some message systems may use information from the contents + to create the envelope. It is intended that this standard facilitate + the acquisition of such information by programs. + + This specification is intended as a definition of what message + content format is to be passed between systems. Though some message + systems locally store messages in this format (which eliminates the + need for translation between formats) and others use formats that + differ from the one specified in this standard, local storage is + outside of the scope of this standard. + + + + +Resnick Standards Track [Page 3] + +RFC 2822 Internet Message Format April 2001 + + + Note: This standard is not intended to dictate the internal formats + used by sites, the specific message system features that they are + expected to support, or any of the characteristics of user interface + programs that create or read messages. 
In addition, this standard + does not specify an encoding of the characters for either transport + or storage; that is, it does not specify the number of bits used or + how those bits are specifically transferred over the wire or stored + on disk. + +1.2. Notational conventions + +1.2.1. Requirements notation + + This document occasionally uses terms that appear in capital letters. + When the terms "MUST", "SHOULD", "RECOMMENDED", "MUST NOT", "SHOULD + NOT", and "MAY" appear capitalized, they are being used to indicate + particular requirements of this specification. A discussion of the + meanings of these terms appears in [RFC2119]. + +1.2.2. Syntactic notation + + This standard uses the Augmented Backus-Naur Form (ABNF) notation + specified in [RFC2234] for the formal definitions of the syntax of + messages. Characters will be specified either by a decimal value + (e.g., the value %d65 for uppercase A and %d97 for lowercase A) or by + a case-insensitive literal value enclosed in quotation marks (e.g., + "A" for either uppercase or lowercase A). See [RFC2234] for the full + description of the notation. + +1.3. Structure of this document + + This document is divided into several sections. + + This section, section 1, is a short introduction to the document. + + Section 2 lays out the general description of a message and its + constituent parts. This is an overview to help the reader understand + some of the general principles used in the later portions of this + document. Any examples in this section MUST NOT be taken as + specification of the formal syntax of any part of a message. + + Section 3 specifies formal ABNF rules for the structure of each part + of a message (the syntax) and describes the relationship between + those parts and their meaning in the context of a message (the + semantics). 
That is, it describes the actual rules for the structure + of each part of a message (the syntax) as well as a description of + the parts and instructions on how they ought to be interpreted (the + semantics). This includes analysis of the syntax and semantics of + + + +Resnick Standards Track [Page 4] + +RFC 2822 Internet Message Format April 2001 + + + subparts of messages that have specific structure. The syntax + included in section 3 represents messages as they MUST be created. + There are also notes in section 3 to indicate if any of the options + specified in the syntax SHOULD be used over any of the others. + + Both sections 2 and 3 describe messages that are legal to generate + for purposes of this standard. + + Section 4 of this document specifies an "obsolete" syntax. There are + references in section 3 to these obsolete syntactic elements. The + rules of the obsolete syntax are elements that have appeared in + earlier revisions of this standard or have previously been widely + used in Internet messages. As such, these elements MUST be + interpreted by parsers of messages in order to be conformant to this + standard. However, since items in this syntax have been determined + to be non-interoperable or to cause significant problems for + recipients of messages, they MUST NOT be generated by creators of + conformant messages. + + Section 5 details security considerations to take into account when + implementing this standard. + + Section 6 is a bibliography of references in this document. + + Section 7 contains the editor's address. + + Section 8 contains acknowledgements. + + Appendix A lists examples of different sorts of messages. These + examples are not exhaustive of the types of messages that appear on + the Internet, but give a broad overview of certain syntactic forms. + + Appendix B lists the differences between this standard and earlier + standards for Internet messages. + + Appendix C has copyright and intellectual property notices. + +2. 
Lexical Analysis of Messages + +2.1. General Description + + At the most basic level, a message is a series of characters. A + message that is conformant with this standard is comprised of + characters with values in the range 1 through 127 and interpreted as + US-ASCII characters [ASCII]. For brevity, this document sometimes + refers to this range of characters as simply "US-ASCII characters". + + + + + +Resnick Standards Track [Page 5] + +RFC 2822 Internet Message Format April 2001 + + + Note: This standard specifies that messages are made up of characters + in the US-ASCII range of 1 through 127. There are other documents, + specifically the MIME document series [RFC2045, RFC2046, RFC2047, + RFC2048, RFC2049], that extend this standard to allow for values + outside of that range. Discussion of those mechanisms is not within + the scope of this standard. + + Messages are divided into lines of characters. A line is a series of + characters that is delimited with the two characters carriage-return + and line-feed; that is, the carriage return (CR) character (ASCII + value 13) followed immediately by the line feed (LF) character (ASCII + value 10). (The carriage-return/line-feed pair is usually written in + this document as "CRLF".) + + A message consists of header fields (collectively called "the header + of the message") followed, optionally, by a body. The header is a + sequence of lines of characters with special syntax as defined in + this standard. The body is simply a sequence of characters that + follows the header and is separated from the header by an empty line + (i.e., a line with nothing preceding the CRLF). + +2.1.1. Line Length Limits + + There are two limits that this standard places on the number of + characters in a line. Each line of characters MUST be no more than + 998 characters, and SHOULD be no more than 78 characters, excluding + the CRLF. 
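The two limits above can be checked mechanically. The following is a minimal sketch, not part of the RFC text: the function name check_line_lengths and the constants MAX_LINE and SOFT_LINE are hypothetical names chosen for illustration. It assumes the message is available as raw bytes with CRLF line endings, and it counts characters excluding the CRLF, as the paragraph above requires.

```python
# Illustrative only: these names are assumptions, not taken from RFC 2822.
MAX_LINE = 998   # MUST limit per line, excluding the CRLF
SOFT_LINE = 78   # SHOULD limit per line, excluding the CRLF


def check_line_lengths(message: bytes):
    """Return (too_long, over_recommended) as lists of 1-based line numbers.

    Lines are split on CRLF; the CRLF itself is not counted.
    """
    too_long = []
    over_recommended = []
    lines = message.split(b"\r\n")
    # A message that ends in CRLF yields a trailing empty element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_LINE:
            too_long.append(number)          # violates the MUST limit
        elif len(line) > SOFT_LINE:
            over_recommended.append(number)  # legal, but exceeds the SHOULD limit
    return too_long, over_recommended
```

A generator would normally treat the first list as an error and the second as a hint to fold or rewrap; a receiver, by contrast, is advised to tolerate both.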
+ + The 998 character limit is due to limitations in many implementations + which send, receive, or store Internet Message Format messages that + simply cannot handle more than 998 characters on a line. Receiving + implementations would do well to handle an arbitrarily large number + of characters in a line for robustness' sake. However, there are so + many implementations which (in compliance with the transport + requirements of [RFC2821]) do not accept messages containing more + than 1000 characters per line, including the CR and LF, that it is + important for implementations not to create such messages. + + The more conservative 78 character recommendation is to accommodate + the many implementations of user interfaces that display these + messages which may truncate, or disastrously wrap, the display of + more than 78 characters per line, in spite of the fact that such + implementations are non-conformant to the intent of this + specification (and that of [RFC2821] if they actually cause + information to be lost). Again, even though this limitation is put on + messages, it is incumbent upon implementations which display messages + + + + + +Resnick Standards Track [Page 6] + +RFC 2822 Internet Message Format April 2001 + + + to handle an arbitrarily large number of characters in a line + (certainly at least up to the 998 character limit) for the sake of + robustness. + +2.2. Header Fields + + Header fields are lines composed of a field name, followed by a colon + (":"), followed by a field body, and terminated by CRLF. A field + name MUST be composed of printable US-ASCII characters (i.e., + characters that have values between 33 and 126, inclusive), except + colon. A field body may be composed of any US-ASCII characters, + except for CR and LF. However, a field body may contain CRLF when + used in header "folding" and "unfolding" as described in section + 2.2.3. All field bodies MUST conform to the syntax described in + sections 3 and 4 of this standard. + +2.2.1. 
Unstructured Header Field Bodies + + Some field bodies in this standard are defined simply as + "unstructured" (which is specified below as any US-ASCII characters, + except for CR and LF) with no further restrictions. These are + referred to as unstructured field bodies. Semantically, unstructured + field bodies are simply to be treated as a single line of characters + with no further processing (except for header "folding" and + "unfolding" as described in section 2.2.3). + +2.2.2. Structured Header Field Bodies + + Some field bodies in this standard have specific syntactical + structure more restrictive than the unstructured field bodies + described above. These are referred to as "structured" field bodies. + Structured field bodies are sequences of specific lexical tokens as + described in sections 3 and 4 of this standard. Many of these tokens + are allowed (according to their syntax) to be introduced or end with + comments (as described in section 3.2.3) as well as the space (SP, + ASCII value 32) and horizontal tab (HTAB, ASCII value 9) characters + (together known as the white space characters, WSP), and those WSP + characters are subject to header "folding" and "unfolding" as + described in section 2.2.3. Semantic analysis of structured field + bodies is given along with their syntax. + +2.2.3. Long Header Fields + + Each header field is logically a single line of characters comprising + the field name, the colon, and the field body. For convenience + however, and to deal with the 998/78 character limitations per line, + the field body portion of a header field can be split into a multiple + line representation; this is called "folding". The general rule is + + + +Resnick Standards Track [Page 7] + +RFC 2822 Internet Message Format April 2001 + + + that wherever this standard allows for folding white space (not + simply WSP characters), a CRLF may be inserted before any WSP. 
For + example, the header field: + + Subject: This is a test + + can be represented as: + + Subject: This + is a test + + Note: Though structured field bodies are defined in such a way that + folding can take place between many of the lexical tokens (and even + within some of the lexical tokens), folding SHOULD be limited to + placing the CRLF at higher-level syntactic breaks. For instance, if + a field body is defined as comma-separated values, it is recommended + that folding occur after the comma separating the structured items in + preference to other places where the field could be folded, even if + it is allowed elsewhere. + + The process of moving from this folded multiple-line representation + of a header field to its single line representation is called + "unfolding". Unfolding is accomplished by simply removing any CRLF + that is immediately followed by WSP. Each header field should be + treated in its unfolded form for further syntactic and semantic + evaluation. + +2.3. Body + + The body of a message is simply lines of US-ASCII characters. The + only two limitations on the body are as follows: + + - CR and LF MUST only occur together as CRLF; they MUST NOT appear + independently in the body. + + - Lines of characters in the body MUST be limited to 998 characters, + and SHOULD be limited to 78 characters, excluding the CRLF. + + Note: As was stated earlier, there are other standards documents, + specifically the MIME documents [RFC2045, RFC2046, RFC2048, RFC2049] + that extend this standard to allow for different sorts of message + bodies. Again, these mechanisms are beyond the scope of this + document. + + + + + + + + +Resnick Standards Track [Page 8] + +RFC 2822 Internet Message Format April 2001 + + +3. Syntax + +3.1. Introduction + + The syntax as given in this section defines the legal syntax of + Internet messages. Messages that are conformant to this standard + MUST conform to the syntax in this section. 
If there are options in + this section where one option SHOULD be generated, that is indicated + either in the prose or in a comment next to the syntax. + + For the defined expressions, a short description of the syntax and + use is given, followed by the syntax in ABNF, followed by a semantic + analysis. Primitive tokens that are used but otherwise unspecified + come from [RFC2234]. + + In some of the definitions, there will be nonterminals whose names + start with "obs-". These "obs-" elements refer to tokens defined in + the obsolete syntax in section 4. In all cases, these productions + are to be ignored for the purposes of generating legal Internet + messages and MUST NOT be used as part of such a message. However, + when interpreting messages, these tokens MUST be honored as part of + the legal syntax. In this sense, section 3 defines a grammar for + generation of messages, with "obs-" elements that are to be ignored, + while section 4 adds grammar for interpretation of messages. + +3.2. Lexical Tokens + + The following rules are used to define an underlying lexical + analyzer, which feeds tokens to the higher-level parsers. This + section defines the tokens used in structured header field bodies. + + Note: Readers of this standard need to pay special attention to how + these lexical tokens are used in both the lower-level and + higher-level syntax later in the document. Particularly, the white + space tokens and the comment tokens defined in section 3.2.3 get used + in the lower-level tokens defined here, and those lower-level tokens + are in turn used as parts of the higher-level tokens defined later. + Therefore, the white space and comments may be allowed in the + higher-level tokens even though they may not explicitly appear in a + particular definition. + +3.2.1. Primitive Tokens + + The following are primitive tokens referred to elsewhere in this + standard, but not otherwise defined in [RFC2234]. 
Some of them will + not appear anywhere else in the syntax, but they are convenient to + refer to in other parts of this document. + + + + +Resnick Standards Track [Page 9] + +RFC 2822 Internet Message Format April 2001 + + + Note: The "specials" below are just such an example. Though the + specials token does not appear anywhere else in this standard, it is + useful for implementers who use tools that lexically analyze + messages. Each of the characters in specials can be used to indicate + a tokenization point in lexical analysis. + +NO-WS-CTL = %d1-8 / ; US-ASCII control characters + %d11 / ; that do not include the + %d12 / ; carriage return, line feed, + %d14-31 / ; and white space characters + %d127 + +text = %d1-9 / ; Characters excluding CR and LF + %d11 / + %d12 / + %d14-127 / + obs-text + +specials = "(" / ")" / ; Special characters used in + "<" / ">" / ; other parts of the syntax + "[" / "]" / + ":" / ";" / + "@" / "\" / + "," / "." / + DQUOTE + + No special semantics are attached to these tokens. They are simply + single characters. + +3.2.2. Quoted characters + + Some characters are reserved for special interpretation, such as + delimiting lexical tokens. To permit use of these characters as + uninterpreted data, a quoting mechanism is provided. + +quoted-pair = ("\" text) / obs-qp + + Where any quoted-pair appears, it is to be interpreted as the text + character alone. That is to say, the "\" character that appears as + part of a quoted-pair is semantically "invisible". + + Note: The "\" character may appear in a message where it is not part + of a quoted-pair. A "\" character that does not appear in a + quoted-pair is not semantically invisible. The only places in this + standard where quoted-pair currently appears are ccontent, qcontent, + dcontent, no-fold-quote, and no-fold-literal. + + + + + +Resnick Standards Track [Page 10] + +RFC 2822 Internet Message Format April 2001 + + +3.2.3. 
Folding white space and comments + + White space characters, including white space used in folding + (described in section 2.2.3), may appear between many elements in + header field bodies. Also, strings of characters that are treated as + comments may be included in structured field bodies as characters + enclosed in parentheses. The following defines the folding white + space (FWS) and comment constructs. + + Strings of characters enclosed in parentheses are considered comments + so long as they do not appear within a "quoted-string", as defined in + section 3.2.5. Comments may nest. + + There are several places in this standard where comments and FWS may + be freely inserted. To accommodate that syntax, an additional token + for "CFWS" is defined for places where comments and/or FWS can occur. + However, where CFWS occurs in this standard, it MUST NOT be inserted + in such a way that any line of a folded header field is made up + entirely of WSP characters and nothing else. + +FWS = ([*WSP CRLF] 1*WSP) / ; Folding white space + obs-FWS + +ctext = NO-WS-CTL / ; Non white space controls + + %d33-39 / ; The rest of the US-ASCII + %d42-91 / ; characters not including "(", + %d93-126 ; ")", or "\" + +ccontent = ctext / quoted-pair / comment + +comment = "(" *([FWS] ccontent) [FWS] ")" + +CFWS = *([FWS] comment) (([FWS] comment) / FWS) + + Throughout this standard, where FWS (the folding white space token) + appears, it indicates a place where header folding, as discussed in + section 2.2.3, may take place. Wherever header folding appears in a + message (that is, a header field body containing a CRLF followed by + any WSP), header unfolding (removal of the CRLF) is performed before + any further lexical analysis is performed on that header field + according to this standard. That is to say, any CRLF that appears in + FWS is semantically "invisible." + + A comment is normally used in a structured field body to provide some + human readable informational text. 
Since a comment is allowed to + contain FWS, folding is permitted within the comment. Also note that + since quoted-pair is allowed in a comment, the parentheses and + + + +Resnick Standards Track [Page 11] + +RFC 2822 Internet Message Format April 2001 + + + backslash characters may appear in a comment so long as they appear + as a quoted-pair. Semantically, the enclosing parentheses are not + part of the comment; the comment is what is contained between the two + parentheses. As stated earlier, the "\" in any quoted-pair and the + CRLF in any FWS that appears within the comment are semantically + "invisible" and therefore not part of the comment either. + + Runs of FWS, comment or CFWS that occur between lexical tokens in a + structured field header are semantically interpreted as a single + space character. + +3.2.4. Atom + + Several productions in structured header field bodies are simply + strings of certain basic characters. Such productions are called + atoms. + + Some of the structured header field bodies also allow the period + character (".", ASCII value 46) within runs of atext. An additional + "dot-atom" token is defined for those purposes. + +atext = ALPHA / DIGIT / ; Any character except controls, + "!" / "#" / ; SP, and specials. + "$" / "%" / ; Used for atoms + "&" / "'" / + "*" / "+" / + "-" / "/" / + "=" / "?" / + "^" / "_" / + "`" / "{" / + "|" / "}" / + "~" + +atom = [CFWS] 1*atext [CFWS] + +dot-atom = [CFWS] dot-atom-text [CFWS] + +dot-atom-text = 1*atext *("." 1*atext) + + Both atom and dot-atom are interpreted as a single unit, comprised of + the string of characters that make it up. Semantically, the optional + comments and FWS surrounding the rest of the characters are not part + of the atom; the atom is only the run of atext characters in an atom, + or the atext and "." characters in a dot-atom. + + + + + + + +Resnick Standards Track [Page 12] + +RFC 2822 Internet Message Format April 2001 + + +3.2.5. 
Quoted strings + + Strings of characters that include characters other than those + allowed in atoms may be represented in a quoted string format, where + the characters are surrounded by quote (DQUOTE, ASCII value 34) + characters. + +qtext = NO-WS-CTL / ; Non white space controls + + %d33 / ; The rest of the US-ASCII + %d35-91 / ; characters not including "\" + %d93-126 ; or the quote character + +qcontent = qtext / quoted-pair + +quoted-string = [CFWS] + DQUOTE *([FWS] qcontent) [FWS] DQUOTE + [CFWS] + + A quoted-string is treated as a unit. That is, quoted-string is + identical to atom, semantically. Since a quoted-string is allowed to + contain FWS, folding is permitted. Also note that since quoted-pair + is allowed in a quoted-string, the quote and backslash characters may + appear in a quoted-string so long as they appear as a quoted-pair. + + Semantically, neither the optional CFWS outside of the quote + characters nor the quote characters themselves are part of the + quoted-string; the quoted-string is what is contained between the two + quote characters. As stated earlier, the "\" in any quoted-pair and + the CRLF in any FWS/CFWS that appears within the quoted-string are + semantically "invisible" and therefore not part of the quoted-string + either. + +3.2.6. Miscellaneous tokens + + Three additional tokens are defined, word and phrase for combinations + of atoms and/or quoted-strings, and unstructured for use in + unstructured header fields and in some places within structured + header fields. + +word = atom / quoted-string + +phrase = 1*word / obs-phrase + + + + + + + + +Resnick Standards Track [Page 13] + +RFC 2822 Internet Message Format April 2001 + + +utext = NO-WS-CTL / ; Non white space controls + %d33-126 / ; The rest of US-ASCII + obs-utext + +unstructured = *([FWS] utext) [FWS] + +3.3. Date and Time Specification + + Date and time occur in several header fields. This section specifies + the syntax for a full date and time specification. 
Though folding + white space is permitted throughout the date-time specification, it + is RECOMMENDED that a single space be used in each place that FWS + appears (whether it is required or optional); some older + implementations may not interpret other occurrences of folding white + space correctly. + +date-time = [ day-of-week "," ] date FWS time [CFWS] + +day-of-week = ([FWS] day-name) / obs-day-of-week + +day-name = "Mon" / "Tue" / "Wed" / "Thu" / + "Fri" / "Sat" / "Sun" + +date = day month year + +year = 4*DIGIT / obs-year + +month = (FWS month-name FWS) / obs-month + +month-name = "Jan" / "Feb" / "Mar" / "Apr" / + "May" / "Jun" / "Jul" / "Aug" / + "Sep" / "Oct" / "Nov" / "Dec" + +day = ([FWS] 1*2DIGIT) / obs-day + +time = time-of-day FWS zone + +time-of-day = hour ":" minute [ ":" second ] + +hour = 2DIGIT / obs-hour + +minute = 2DIGIT / obs-minute + +second = 2DIGIT / obs-second + +zone = (( "+" / "-" ) 4DIGIT) / obs-zone + + + + + +Resnick Standards Track [Page 14] + +RFC 2822 Internet Message Format April 2001 + + + The day is the numeric day of the month. The year is any numeric + year 1900 or later. + + The time-of-day specifies the number of hours, minutes, and + optionally seconds since midnight of the date indicated. + + The date and time-of-day SHOULD express local time. + + The zone specifies the offset from Coordinated Universal Time (UTC, + formerly referred to as "Greenwich Mean Time") that the date and + time-of-day represent. The "+" or "-" indicates whether the + time-of-day is ahead of (i.e., east of) or behind (i.e., west of) + Universal Time. The first two digits indicate the number of hours + difference from Universal Time, and the last two digits indicate the + number of minutes difference from Universal Time. (Hence, +hhmm + means +(hh * 60 + mm) minutes, and -hhmm means -(hh * 60 + mm) + minutes). The form "+0000" SHOULD be used to indicate a time zone at + Universal Time. 
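The arithmetic for the numeric zone form can be sketched in C as follows. This is only an illustration of the +hhmm / -hhmm rule quoted above, not code from rohrpost: the function name zone_to_minutes and its return convention are assumptions made for this example, and the obsolete alphabetic zones (obs-zone) and surrounding CFWS are deliberately not handled.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Convert a numeric zone such as "+0130" or "-0800" into a signed
 * offset in minutes from Universal Time, following
 * +hhmm = +(hh * 60 + mm) and -hhmm = -(hh * 60 + mm).
 *
 * Returns 0 on success and -1 if the string is not a sign followed
 * by exactly four digits. zone_to_minutes is a hypothetical helper
 * for illustration only.
 */
int
zone_to_minutes(const char *zone, int *minutes)
{
	int i, hh, mm;

	if (zone == NULL || minutes == NULL)
		return -1;
	if (zone[0] != '+' && zone[0] != '-')
		return -1;
	/* A NUL inside the first four positions fails isdigit() and
	 * stops the scan before anything past the string is read. */
	for (i = 1; i <= 4; i++) {
		if (!isdigit((unsigned char)zone[i]))
			return -1;
	}
	if (zone[5] != '\0')
		return -1;

	hh = (zone[1] - '0') * 10 + (zone[2] - '0');
	mm = (zone[3] - '0') * 10 + (zone[4] - '0');
	*minutes = hh * 60 + mm;
	if (zone[0] == '-')
		*minutes = -*minutes;
	return 0;
}
```

With this sketch, "+0130" gives 90 and "-0800" gives -480. Both "+0000" and "-0000" give 0, so a caller that needs to tell the two apart has to inspect the sign character itself.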
Though "-0000" also indicates Universal Time, it is + used to indicate that the time was generated on a system that may be + in a local time zone other than Universal Time and therefore + indicates that the date-time contains no information about the local + time zone. + + A date-time specification MUST be semantically valid. That is, the + day-of-the-week (if included) MUST be the day implied by the date, + the numeric day-of-month MUST be between 1 and the number of days + allowed for the specified month (in the specified year), the + time-of-day MUST be in the range 00:00:00 through 23:59:60 (the + number of seconds allowing for a leap second; see [STD12]), and the + zone MUST be within the range -9959 through +9959. + +3.4. Address Specification + + Addresses occur in several message header fields to indicate senders + and recipients of messages. An address may either be an individual + mailbox, or a group of mailboxes. + +address = mailbox / group + +mailbox = name-addr / addr-spec + +name-addr = [display-name] angle-addr + +angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr + +group = display-name ":" [mailbox-list / CFWS] ";" + [CFWS] + + + + +Resnick Standards Track [Page 15] + +RFC 2822 Internet Message Format April 2001 + + +display-name = phrase + +mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list + +address-list = (address *("," address)) / obs-addr-list + + A mailbox receives mail. It is a conceptual entity which does not + necessarily pertain to file storage. For example, some sites may + choose to print mail on a printer and deliver the output to the + addressee's desk. Normally, a mailbox is comprised of two parts: (1) + an optional display name that indicates the name of the recipient + (which could be a person or a system) that could be displayed to the + user of a mail application, and (2) an addr-spec address enclosed in + angle brackets ("<" and ">"). 
There is also an alternate simple form + of a mailbox where the addr-spec address appears alone, without the + recipient's name or the angle brackets. The Internet addr-spec + address is described in section 3.4.1. + + Note: Some legacy implementations used the simple form where the + addr-spec appears without the angle brackets, but included the name + of the recipient in parentheses as a comment following the addr-spec. + Since the meaning of the information in a comment is unspecified, + implementations SHOULD use the full name-addr form of the mailbox, + instead of the legacy form, to specify the display name associated + with a mailbox. Also, because some legacy implementations interpret + the comment, comments generally SHOULD NOT be used in address fields + to avoid confusing such implementations. + + When it is desirable to treat several mailboxes as a single unit + (i.e., in a distribution list), the group construct can be used. The + group construct allows the sender to indicate a named group of + recipients. This is done by giving a display name for the group, + followed by a colon, followed by a comma separated list of any number + of mailboxes (including zero and one), and ending with a semicolon. + Because the list of mailboxes can be empty, using the group construct + is also a simple way to communicate to recipients that the message + was sent to one or more named sets of recipients, without actually + providing the individual mailbox address for each of those + recipients. + +3.4.1. Addr-spec specification + + An addr-spec is a specific Internet identifier that contains a + locally interpreted string followed by the at-sign character ("@", + ASCII value 64) followed by an Internet domain. The locally + interpreted string is either a quoted-string or a dot-atom. If the + string can be represented as a dot-atom (that is, it contains no + characters other than atext characters or "." 
surrounded by atext + + + +Resnick Standards Track [Page 16] + +RFC 2822 Internet Message Format April 2001 + + + characters), then the dot-atom form SHOULD be used and the + quoted-string form SHOULD NOT be used. Comments and folding white + space SHOULD NOT be used around the "@" in the addr-spec. + +addr-spec = local-part "@" domain + +local-part = dot-atom / quoted-string / obs-local-part + +domain = dot-atom / domain-literal / obs-domain + +domain-literal = [CFWS] "[" *([FWS] dcontent) [FWS] "]" [CFWS] + +dcontent = dtext / quoted-pair + +dtext = NO-WS-CTL / ; Non white space controls + + %d33-90 / ; The rest of the US-ASCII + %d94-126 ; characters not including "[", + ; "]", or "\" + + The domain portion identifies the point to which the mail is + delivered. In the dot-atom form, this is interpreted as an Internet + domain name (either a host name or a mail exchanger name) as + described in [STD3, STD13, STD14]. In the domain-literal form, the + domain is interpreted as the literal Internet address of the + particular host. In both cases, how addressing is used and how + messages are transported to a particular host is covered in the mail + transport document [RFC2821]. These mechanisms are outside of the + scope of this document. + + The local-part portion is a domain dependent string. In addresses, + it is simply interpreted on the particular host as a name of a + particular mailbox. + +3.5 Overall message syntax + + A message consists of header fields, optionally followed by a message + body. Lines in a message MUST be a maximum of 998 characters + excluding the CRLF, but it is RECOMMENDED that lines be limited to 78 + characters excluding the CRLF. (See section 2.1.1 for explanation.) + In a message body, though all of the characters listed in the text + rule MAY be used, the use of US-ASCII control characters (values 1 + through 8, 11, 12, and 14 through 31) is discouraged since their + interpretation by receivers for display is not guaranteed. 
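The two line length limits stated above (998 characters as the hard maximum, 78 as the recommendation, both excluding the CRLF) could be checked with something like the following C sketch. The names check_line_lengths, classify_line and the LINE_* constants are hypothetical and exist only for this example; the sketch assumes the message is held in memory with CRLF line endings and treats a bare CR or bare LF as an ordinary character.

```c
#include <stddef.h>

/* Hypothetical result codes, ordered from best to worst. */
enum {
	LINE_OK = 0,               /* every line is at most 78 characters */
	LINE_OVER_RECOMMENDED = 1, /* some line exceeds 78 but none exceeds 998 */
	LINE_OVER_LIMIT = 2        /* some line exceeds 998 characters */
};

/* Classify a single line length (CRLF not counted). */
static int
classify_line(size_t n)
{
	if (n > 998)
		return LINE_OVER_LIMIT;
	if (n > 78)
		return LINE_OVER_RECOMMENDED;
	return LINE_OK;
}

/*
 * Scan len bytes of msg and return the worst classification found.
 * Lines end at CRLF; a final line without CRLF is checked as well.
 */
int
check_line_lengths(const char *msg, size_t len)
{
	size_t i, cur = 0;
	int c, worst = LINE_OK;

	for (i = 0; i < len; i++) {
		if (msg[i] == '\r' && i + 1 < len && msg[i + 1] == '\n') {
			c = classify_line(cur);
			if (c > worst)
				worst = c;
			cur = 0;
			i++; /* skip the LF of the CRLF pair */
		} else {
			cur++;
		}
	}
	c = classify_line(cur);
	if (c > worst)
		worst = c;
	return worst;
}
```

Returning a graded result rather than a boolean lets a caller warn on the recommended limit while rejecting, or re-folding, only when the hard limit is exceeded.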
+ + + + + + + +Resnick Standards Track [Page 17] + +RFC 2822 Internet Message Format April 2001 + + +message = (fields / obs-fields) + [CRLF body] + +body = *(*998text CRLF) *998text + + The header fields carry most of the semantic information and are + defined in section 3.6. The body is simply a series of lines of text + which are uninterpreted for the purposes of this standard. + +3.6. Field definitions + + The header fields of a message are defined here. All header fields + have the same general syntactic structure: A field name, followed by + a colon, followed by the field body. The specific syntax for each + header field is defined in the subsequent sections. + + Note: In the ABNF syntax for each field in subsequent sections, each + field name is followed by the required colon. However, for brevity + sometimes the colon is not referred to in the textual description of + the syntax. It is, nonetheless, required. + + It is important to note that the header fields are not guaranteed to + be in a particular order. They may appear in any order, and they + have been known to be reordered occasionally when transported over + the Internet. However, for the purposes of this standard, header + fields SHOULD NOT be reordered when a message is transported or + transformed. More importantly, the trace header fields and resent + header fields MUST NOT be reordered, and SHOULD be kept in blocks + prepended to the message. See sections 3.6.6 and 3.6.7 for more + information. + + The only required header fields are the origination date field and + the originator address field(s). All other header fields are + syntactically optional. More information is contained in the table + following this definition. 
+ +fields = *(trace + *(resent-date / + resent-from / + resent-sender / + resent-to / + resent-cc / + resent-bcc / + resent-msg-id)) + *(orig-date / + from / + sender / + reply-to / + + + +Resnick Standards Track [Page 18] + +RFC 2822 Internet Message Format April 2001 + + + to / + cc / + bcc / + message-id / + in-reply-to / + references / + subject / + comments / + keywords / + optional-field) + + The following table indicates limits on the number of times each + field may occur in a message header as well as any special + limitations on the use of those fields. An asterisk next to a value + in the minimum or maximum column indicates that a special restriction + appears in the Notes column. + +Field Min number Max number Notes + +trace 0 unlimited Block prepended - see + 3.6.7 + +resent-date 0* unlimited* One per block, required + if other resent fields + present - see 3.6.6 + +resent-from 0 unlimited* One per block - see + 3.6.6 + +resent-sender 0* unlimited* One per block, MUST + occur with multi-address + resent-from - see 3.6.6 + +resent-to 0 unlimited* One per block - see + 3.6.6 + +resent-cc 0 unlimited* One per block - see + 3.6.6 + +resent-bcc 0 unlimited* One per block - see + 3.6.6 + +resent-msg-id 0 unlimited* One per block - see + 3.6.6 + +orig-date 1 1 + +from 1 1 See sender and 3.6.2 + + + +Resnick Standards Track [Page 19] + +RFC 2822 Internet Message Format April 2001 + + +sender 0* 1 MUST occur with multi- + address from - see 3.6.2 + +reply-to 0 1 + +to 0 1 + +cc 0 1 + +bcc 0 1 + +message-id 0* 1 SHOULD be present - see + 3.6.4 + +in-reply-to 0* 1 SHOULD occur in some + replies - see 3.6.4 + +references 0* 1 SHOULD occur in some + replies - see 3.6.4 + +subject 0 1 + +comments 0 unlimited + +keywords 0 unlimited + +optional-field 0 unlimited + + The exact interpretation of each field is described in subsequent + sections. + +3.6.1. 
The origination date field + + The origination date field consists of the field name "Date" followed + by a date-time specification. + +orig-date = "Date:" date-time CRLF + + The origination date specifies the date and time at which the creator + of the message indicated that the message was complete and ready to + enter the mail delivery system. For instance, this might be the time + that a user pushes the "send" or "submit" button in an application + program. In any case, it is specifically not intended to convey the + time that the message is actually transported, but rather the time at + which the human or other creator of the message has put the message + into its final form, ready for transport. (For example, a portable + computer user who is not connected to a network might queue a message + for delivery. The origination date is intended to contain the date + and time that the user queued the message, not the time when the user + connected to the network to send the message.) + +3.6.2. Originator fields + + The originator fields of a message consist of the from field, the + sender field (when applicable), and optionally the reply-to field. + The from field consists of the field name "From" and a + comma-separated list of one or more mailbox specifications. If the + from field contains more than one mailbox specification in the + mailbox-list, then the sender field, containing the field name + "Sender" and a single mailbox specification, MUST appear in the + message. In either case, an optional reply-to field MAY also be + included, which contains the field name "Reply-To" and a + comma-separated list of one or more addresses. + +from = "From:" mailbox-list CRLF + +sender = "Sender:" mailbox CRLF + +reply-to = "Reply-To:" address-list CRLF + + The originator fields indicate the mailbox(es) of the source of the + message. 
The "From:" field specifies the author(s) of the message, + that is, the mailbox(es) of the person(s) or system(s) responsible + for the writing of the message. The "Sender:" field specifies the + mailbox of the agent responsible for the actual transmission of the + message. For example, if a secretary were to send a message for + another person, the mailbox of the secretary would appear in the + "Sender:" field and the mailbox of the actual author would appear in + the "From:" field. If the originator of the message can be indicated + by a single mailbox and the author and transmitter are identical, the + "Sender:" field SHOULD NOT be used. Otherwise, both fields SHOULD + appear. + + The originator fields also provide the information required when + replying to a message. When the "Reply-To:" field is present, it + indicates the mailbox(es) to which the author of the message suggests + that replies be sent. In the absence of the "Reply-To:" field, + replies SHOULD by default be sent to the mailbox(es) specified in the + "From:" field unless otherwise specified by the person composing the + reply. + + In all cases, the "From:" field SHOULD NOT contain any mailbox that + does not belong to the author(s) of the message. See also section + 3.6.3 for more information on forming the destination addresses for a + reply. + + + +Resnick Standards Track [Page 21] + +RFC 2822 Internet Message Format April 2001 + + +3.6.3. Destination address fields + + The destination fields of a message consist of three possible fields, + each of the same form: The field name, which is either "To", "Cc", or + "Bcc", followed by a comma-separated list of one or more addresses + (either mailbox or group syntax). + +to = "To:" address-list CRLF + +cc = "Cc:" address-list CRLF + +bcc = "Bcc:" (address-list / [CFWS]) CRLF + + The destination fields specify the recipients of the message. 
Each + destination field may have one or more addresses, and each of the + addresses indicate the intended recipients of the message. The only + difference between the three fields is how each is used. + + The "To:" field contains the address(es) of the primary recipient(s) + of the message. + + The "Cc:" field (where the "Cc" means "Carbon Copy" in the sense of + making a copy on a typewriter using carbon paper) contains the + addresses of others who are to receive the message, though the + content of the message may not be directed at them. + + The "Bcc:" field (where the "Bcc" means "Blind Carbon Copy") contains + addresses of recipients of the message whose addresses are not to be + revealed to other recipients of the message. There are three ways in + which the "Bcc:" field is used. In the first case, when a message + containing a "Bcc:" field is prepared to be sent, the "Bcc:" line is + removed even though all of the recipients (including those specified + in the "Bcc:" field) are sent a copy of the message. In the second + case, recipients specified in the "To:" and "Cc:" lines each are sent + a copy of the message with the "Bcc:" line removed as above, but the + recipients on the "Bcc:" line get a separate copy of the message + containing a "Bcc:" line. (When there are multiple recipient + addresses in the "Bcc:" field, some implementations actually send a + separate copy of the message to each recipient with a "Bcc:" + containing only the address of that particular recipient.) Finally, + since a "Bcc:" field may contain no addresses, a "Bcc:" field can be + sent without any addresses indicating to the recipients that blind + copies were sent to someone. Which method to use with "Bcc:" fields + is implementation dependent, but refer to the "Security + Considerations" section of this document for a discussion of each. 
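The first of the three "Bcc:" methods described above (removing the "Bcc:" line before the message is sent) might look roughly like this in C. It is a sketch under stated assumptions, not rohrpost's implementation: strip_bcc and is_bcc_line are hypothetical names, the input is assumed to be only the header block with CRLF line endings, and only the current syntax (field name immediately followed by the colon) is recognized. Field names are matched without regard to case.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Nonzero if the n-byte line at p starts with "Bcc:" in any case. */
static int
is_bcc_line(const char *p, size_t n)
{
	return n >= 4 &&
	    tolower((unsigned char)p[0]) == 'b' &&
	    tolower((unsigned char)p[1]) == 'c' &&
	    tolower((unsigned char)p[2]) == 'c' &&
	    p[3] == ':';
}

/*
 * Copy the NUL-terminated header block hdr to out, leaving out every
 * "Bcc:" field together with its folded continuation lines (lines
 * that begin with SP or HTAB). out must have room for
 * strlen(hdr) + 1 bytes. Returns the number of bytes written, not
 * counting the terminating NUL.
 */
size_t
strip_bcc(const char *hdr, char *out)
{
	const char *p = hdr, *eol;
	size_t n, w = 0;
	int skipping = 0;

	while (*p != '\0') {
		eol = strstr(p, "\r\n");
		n = (eol != NULL) ? (size_t)(eol - p) + 2 : strlen(p);

		/* A continuation line shares the fate of the field it
		 * folds; any other line starts a new field. */
		if (*p != ' ' && *p != '\t')
			skipping = is_bcc_line(p, n);
		if (!skipping) {
			memcpy(out + w, p, n);
			w += n;
		}
		p += n;
	}
	out[w] = '\0';
	return w;
}
```

The second method would keep the original header block for the blind recipients and use the stripped copy for everyone else; which of the methods to apply remains an implementation decision, as the text says.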
+ + + + + + +Resnick Standards Track [Page 22] + +RFC 2822 Internet Message Format April 2001 + + + When a message is a reply to another message, the mailboxes of the + authors of the original message (the mailboxes in the "From:" field) + or mailboxes specified in the "Reply-To:" field (if it exists) MAY + appear in the "To:" field of the reply since these would normally be + the primary recipients of the reply. If a reply is sent to a message + that has destination fields, it is often desirable to send a copy of + the reply to all of the recipients of the message, in addition to the + author. When such a reply is formed, addresses in the "To:" and + "Cc:" fields of the original message MAY appear in the "Cc:" field of + the reply, since these are normally secondary recipients of the + reply. If a "Bcc:" field is present in the original message, + addresses in that field MAY appear in the "Bcc:" field of the reply, + but SHOULD NOT appear in the "To:" or "Cc:" fields. + + Note: Some mail applications have automatic reply commands that + include the destination addresses of the original message in the + destination addresses of the reply. How those reply commands behave + is implementation dependent and is beyond the scope of this document. + In particular, whether or not to include the original destination + addresses when the original message had a "Reply-To:" field is not + addressed here. + +3.6.4. Identification fields + + Though optional, every message SHOULD have a "Message-ID:" field. + Furthermore, reply messages SHOULD have "In-Reply-To:" and + "References:" fields as appropriate, as described below. + + The "Message-ID:" field contains a single unique message identifier. + The "References:" and "In-Reply-To:" field each contain one or more + unique message identifiers, optionally separated by CFWS. + + The message identifier (msg-id) is similar in syntax to an angle-addr + construct without the internal CFWS. 
+ +message-id = "Message-ID:" msg-id CRLF + +in-reply-to = "In-Reply-To:" 1*msg-id CRLF + +references = "References:" 1*msg-id CRLF + +msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS] + +id-left = dot-atom-text / no-fold-quote / obs-id-left + +id-right = dot-atom-text / no-fold-literal / obs-id-right + +no-fold-quote = DQUOTE *(qtext / quoted-pair) DQUOTE + + + +Resnick Standards Track [Page 23] + +RFC 2822 Internet Message Format April 2001 + + +no-fold-literal = "[" *(dtext / quoted-pair) "]" + + The "Message-ID:" field provides a unique message identifier that + refers to a particular version of a particular message. The + uniqueness of the message identifier is guaranteed by the host that + generates it (see below). This message identifier is intended to be + machine readable and not necessarily meaningful to humans. A message + identifier pertains to exactly one instantiation of a particular + message; subsequent revisions to the message each receive new message + identifiers. + + Note: There are many instances when messages are "changed", but those + changes do not constitute a new instantiation of that message, and + therefore the message would not get a new message identifier. For + example, when messages are introduced into the transport system, they + are often prepended with additional header fields such as trace + fields (described in section 3.6.7) and resent fields (described in + section 3.6.6). The addition of such header fields does not change + the identity of the message and therefore the original "Message-ID:" + field is retained. In all cases, it is the meaning that the sender + of the message wishes to convey (i.e., whether this is the same + message or a different message) that determines whether or not the + "Message-ID:" field changes, not any particular syntactic difference + that appears (or does not appear) in the message. + + The "In-Reply-To:" and "References:" fields are used when creating a + reply to a message. 
They hold the message identifier of the original + message and the message identifiers of other messages (for example, + in the case of a reply to a message which was itself a reply). The + "In-Reply-To:" field may be used to identify the message (or + messages) to which the new message is a reply, while the + "References:" field may be used to identify a "thread" of + conversation. + + When creating a reply to a message, the "In-Reply-To:" and + "References:" fields of the resultant message are constructed as + follows: + + The "In-Reply-To:" field will contain the contents of the "Message- + ID:" field of the message to which this one is a reply (the "parent + message"). If there is more than one parent message, then the "In- + Reply-To:" field will contain the contents of all of the parents' + "Message-ID:" fields. If there is no "Message-ID:" field in any of + the parent messages, then the new message will have no "In-Reply-To:" + field. + + + + + + +Resnick Standards Track [Page 24] + +RFC 2822 Internet Message Format April 2001 + + + The "References:" field will contain the contents of the parent's + "References:" field (if any) followed by the contents of the parent's + "Message-ID:" field (if any). If the parent message does not contain + a "References:" field but does have an "In-Reply-To:" field + containing a single message identifier, then the "References:" field + will contain the contents of the parent's "In-Reply-To:" field + followed by the contents of the parent's "Message-ID:" field (if + any). If the parent has none of the "References:", "In-Reply-To:", + or "Message-ID:" fields, then the new message will have no + "References:" field. + + Note: Some implementations parse the "References:" field to display + the "thread of the discussion". These implementations assume that + each new message is a reply to a single parent and hence that they + can walk backwards through the "References:" field to find the parent + of each message listed there. 
Therefore, trying to form a + "References:" field for a reply that has multiple parents is + discouraged and how to do so is not defined in this document. + + The message identifier (msg-id) itself MUST be a globally unique + identifier for a message. The generator of the message identifier + MUST guarantee that the msg-id is unique. There are several + algorithms that can be used to accomplish this. Since the msg-id has + a similar syntax to angle-addr (identical except that comments and + folding white space are not allowed), a good method is to put the + domain name (or a domain literal IP address) of the host on which the + message identifier was created on the right hand side of the "@", and + put a combination of the current absolute date and time along with + some other currently unique (perhaps sequential) identifier available + on the system (for example, a process id number) on the left hand + side. Using a date on the left hand side and a domain name or domain + literal on the right hand side makes it possible to guarantee + uniqueness since no two hosts use the same domain name or IP address + at the same time. Though other algorithms will work, it is + RECOMMENDED that the right hand side contain some domain identifier + (either of the host itself or otherwise) such that the generator of + the message identifier can guarantee the uniqueness of the left hand + side within the scope of that domain. + + Semantically, the angle bracket characters are not part of the + msg-id; the msg-id is what is contained between the two angle bracket + characters. + + + + + + + + + +Resnick Standards Track [Page 25] + +RFC 2822 Internet Message Format April 2001 + + +3.6.5. Informational fields + + The informational fields are all optional. The "Keywords:" field + contains a comma-separated list of one or more words or + quoted-strings. 
The "Subject:" and "Comments:" fields are + unstructured fields as defined in section 2.2.1, and therefore may + contain text or folding white space. + +subject = "Subject:" unstructured CRLF + +comments = "Comments:" unstructured CRLF + +keywords = "Keywords:" phrase *("," phrase) CRLF + + These three fields are intended to have only human-readable content + with information about the message. The "Subject:" field is the most + common and contains a short string identifying the topic of the + message. When used in a reply, the field body MAY start with the + string "Re: " (from the Latin "res", in the matter of) followed by + the contents of the "Subject:" field body of the original message. + If this is done, only one instance of the literal string "Re: " ought + to be used since use of other strings or more than one instance can + lead to undesirable consequences. The "Comments:" field contains any + additional comments on the text of the body of the message. The + "Keywords:" field contains a comma-separated list of important words + and phrases that might be useful for the recipient. + +3.6.6. Resent fields + + Resent fields SHOULD be added to any message that is reintroduced by + a user into the transport system. A separate set of resent fields + SHOULD be added each time this is done. All of the resent fields + corresponding to a particular resending of the message SHOULD be + together. Each new set of resent fields is prepended to the message; + that is, the most recent set of resent fields appear earlier in the + message. No other fields in the message are changed when resent + fields are added. + + Each of the resent fields corresponds to a particular field elsewhere + in the syntax. For instance, the "Resent-Date:" field corresponds to + the "Date:" field and the "Resent-To:" field corresponds to the "To:" + field. In each case, the syntax for the field body is identical to + the syntax given previously for the corresponding field. 
+ + When resent fields are used, the "Resent-From:" and "Resent-Date:" + fields MUST be sent. The "Resent-Message-ID:" field SHOULD be sent. + "Resent-Sender:" SHOULD NOT be used if "Resent-Sender:" would be + identical to "Resent-From:". + + + +Resnick Standards Track [Page 26] + +RFC 2822 Internet Message Format April 2001 + + +resent-date = "Resent-Date:" date-time CRLF + +resent-from = "Resent-From:" mailbox-list CRLF + +resent-sender = "Resent-Sender:" mailbox CRLF + +resent-to = "Resent-To:" address-list CRLF + +resent-cc = "Resent-Cc:" address-list CRLF + +resent-bcc = "Resent-Bcc:" (address-list / [CFWS]) CRLF + +resent-msg-id = "Resent-Message-ID:" msg-id CRLF + + Resent fields are used to identify a message as having been + reintroduced into the transport system by a user. The purpose of + using resent fields is to have the message appear to the final + recipient as if it were sent directly by the original sender, with + all of the original fields remaining the same. Each set of resent + fields correspond to a particular resending event. That is, if a + message is resent multiple times, each set of resent fields gives + identifying information for each individual time. Resent fields are + strictly informational. They MUST NOT be used in the normal + processing of replies or other such automatic actions on messages. + + Note: Reintroducing a message into the transport system and using + resent fields is a different operation from "forwarding". + "Forwarding" has two meanings: One sense of forwarding is that a mail + reading program can be told by a user to forward a copy of a message + to another person, making the forwarded message the body of the new + message. A forwarded message in this sense does not appear to have + come from the original sender, but is an entirely new message from + the forwarder of the message. 
On the other hand, forwarding is also + used to mean when a mail transport program gets a message and + forwards it on to a different destination for final delivery. Resent + header fields are not intended for use with either type of + forwarding. + + The resent originator fields indicate the mailbox of the person(s) or + system(s) that resent the message. As with the regular originator + fields, there are two forms: a simple "Resent-From:" form which + contains the mailbox of the individual doing the resending, and the + more complex form, when one individual (identified in the + "Resent-Sender:" field) resends a message on behalf of one or more + others (identified in the "Resent-From:" field). + + Note: When replying to a resent message, replies behave just as they + would with any other message, using the original "From:", + "Reply-To:", "Message-ID:", and other fields. The resent fields are + only informational and MUST NOT be used in the normal processing of + replies. + + The "Resent-Date:" indicates the date and time at which the resent + message is dispatched by the resender of the message. Like the + "Date:" field, it is not the date and time that the message was + actually transported. + + The "Resent-To:", "Resent-Cc:", and "Resent-Bcc:" fields function + identically to the "To:", "Cc:", and "Bcc:" fields respectively, + except that they indicate the recipients of the resent message, not + the recipients of the original message. + + The "Resent-Message-ID:" field provides a unique identifier for the + resent message. + +3.6.7. Trace fields + + The trace fields are a group of header fields consisting of an + optional "Return-Path:" field, and one or more "Received:" fields. + The "Return-Path:" header field contains a pair of angle brackets + that enclose an optional addr-spec. 
The "Received:" field contains a + (possibly empty) list of name/value pairs followed by a semicolon and + a date-time specification. The first item of the name/value pair is + defined by item-name, and the second item is either an addr-spec, an + atom, a domain, or a msg-id. Further restrictions may be applied to + the syntax of the trace fields by standards that provide for their + use, such as [RFC2821]. + +trace = [return] + 1*received + +return = "Return-Path:" path CRLF + +path = ([CFWS] "<" ([CFWS] / addr-spec) ">" [CFWS]) / + obs-path + +received = "Received:" name-val-list ";" date-time CRLF + +name-val-list = [CFWS] [name-val-pair *(CFWS name-val-pair)] + +name-val-pair = item-name CFWS item-value + +item-name = ALPHA *(["-"] (ALPHA / DIGIT)) + +item-value = 1*angle-addr / addr-spec / + atom / domain / msg-id + + + +Resnick Standards Track [Page 28] + +RFC 2822 Internet Message Format April 2001 + + + A full discussion of the Internet mail use of trace fields is + contained in [RFC2821]. For the purposes of this standard, the trace + fields are strictly informational, and any formal interpretation of + them is outside of the scope of this document. + +3.6.8. Optional fields + + Fields may appear in messages that are otherwise unspecified in this + standard. They MUST conform to the syntax of an optional-field. + This is a field name, made up of the printable US-ASCII characters + except SP and colon, followed by a colon, followed by any text which + conforms to unstructured. + + The field names of any optional-field MUST NOT be identical to any + field name specified elsewhere in this standard. + +optional-field = field-name ":" unstructured CRLF + +field-name = 1*ftext + +ftext = %d33-57 / ; Any character except + %d59-126 ; controls, SP, and + ; ":". + + For the purposes of this standard, any optional field is + uninterpreted. + +4. 
Obsolete Syntax + + Earlier versions of this standard allowed for different (usually more + liberal) syntax than is allowed in this version. Also, there have + been syntactic elements used in messages on the Internet whose + interpretations have never been documented. Though some of these + syntactic forms MUST NOT be generated according to the grammar in + section 3, they MUST be accepted and parsed by a conformant receiver. + This section documents many of these syntactic elements. Taking the + grammar in section 3 and adding the definitions presented in this + section will result in the grammar to use for interpretation of + messages. + + Note: This section identifies syntactic forms that any implementation + MUST reasonably interpret. However, there are certainly Internet + messages which do not conform to even the additional syntax given in + this section. The fact that a particular form does not appear in any + section of this document is not justification for computer programs + to crash or for malformed data to be irretrievably lost by any + implementation. To repeat an example, though this document requires + lines in messages to be no longer than 998 characters, silently + discarding the 999th and subsequent characters in a line without + warning would still be bad behavior for an implementation. It is up + to the implementation to deal with messages robustly. + + One important difference between the obsolete (interpreting) and the + current (generating) syntax is that in structured header field bodies + (i.e., between the colon and the CRLF of any structured header + field), white space characters, including folding white space, and + comments can be freely inserted between any syntactic tokens. This + allows many complex forms that have proven difficult for some + implementations to parse. 
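As a rough illustration of why freely inserted comments are awkward for a parser, the sketch below removes comments from a structured header field body before further tokenizing. It assumes the comment and quoted-pair syntax of section 3.2 (parentheses that nest, and a backslash that makes the next character literal). The name strip_comments is hypothetical and is not part of the standard or of rohrpost. It also simply drops comments rather than treating them as white space, which a real parser may need to do.

```python
def strip_comments(body: str) -> str:
    """Remove (possibly nested) comments from a structured field body.

    Assumption: comments are parenthesised, may nest, and a backslash
    quotes the following character. Quoted strings are left untouched.
    """
    out = []
    depth = 0          # current comment nesting depth
    in_quotes = False  # inside a quoted-string (tracked only outside comments)
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            # quoted-pair: keep both characters unless inside a comment
            if depth == 0:
                out.append(body[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    # The "From:" body from the example in Appendix A.5.
    print(strip_comments(
        r"Pete(A wonderful \) chap) <pete(his account)@silly.test(his host)>"))
```

Tracking the nesting depth rather than searching for the first ")" is what lets the nested comment in the "Cc:" field of Appendix A.5 be removed as a whole.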
+ + Another key difference between the obsolete and the current syntax is + that the rule in section 3.2.3 regarding lines composed entirely of + white space in comments and folding white space does not apply. See + the discussion of folding white space in section 4.2 below. + + Finally, certain characters that were formerly allowed in messages + appear in this section. The NUL character (ASCII value 0) was once + allowed, but is no longer allowed, for compatibility reasons. CR and LF were + allowed to appear in messages other than as CRLF; this use is also + shown here. + + Other differences in syntax and semantics are noted in the following + sections. + +4.1. Miscellaneous obsolete tokens + + These syntactic elements are used elsewhere in the obsolete syntax or + in the main syntax. The obs-char and obs-qp elements each add ASCII + value 0. Bare CR and bare LF are added to obs-text and obs-utext. + The period character is added to obs-phrase. The obs-phrase-list + provides for "empty" elements in a comma-separated list of phrases. + + Note: The "period" (or "full stop") character (".") in obs-phrase is + not a form that was allowed in earlier versions of this or any other + standard. Neither period nor any other character from specials was + allowed in phrase because it introduced a parsing difficulty + distinguishing between phrases and portions of an addr-spec (see + section 4.4). It appears here because the period character is + currently used in many messages in the display-name portion of + addresses, especially for initials in names, and therefore must be + interpreted properly. In the future, period may appear in the + regular syntax of phrase. + +obs-qp = "\" (%d0-127) + +obs-text = *LF *CR *(obs-char *LF *CR) + +obs-char = %d0-9 / %d11 / ; %d0-127 except CR and + %d12 / %d14-127 ; LF + +obs-utext = obs-text + +obs-phrase = word *(word / "." 
/ CFWS) + +obs-phrase-list = phrase / 1*([phrase] [CFWS] "," [CFWS]) [phrase] + + Bare CR and bare LF appear in messages with two different meanings. + In many cases, bare CR or bare LF are used improperly instead of CRLF + to indicate line separators. In other cases, bare CR and bare LF are + used simply as ASCII control characters with their traditional ASCII + meanings. + +4.2. Obsolete folding white space + + In the obsolete syntax, any amount of folding white space MAY be + inserted where the obs-FWS rule is allowed. This creates the + possibility of having two consecutive "folds" in a line, and + therefore the possibility that a line which makes up a folded header + field could be composed entirely of white space. + + obs-FWS = 1*WSP *(CRLF 1*WSP) + +4.3. Obsolete Date and Time + + The syntax for the obsolete date format allows a 2 digit year in the + date field and allows for a list of alphabetic time zone + specifications that were used in earlier versions of this standard. + It also permits comments and folding white space between many of the + tokens. 
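As a rough illustration of how a reader of messages might handle the two digit years and alphabetic zones that the obsolete date syntax admits, the sketch below applies the interpretation rules this section goes on to give: two digit years 00 through 49 add 2000, other two digit years and all three digit years add 1900, the named US zones map to fixed offsets, and military or unknown zones are treated as "-0000". The names OBS_ZONES, interpret_year, and zone_to_offset are hypothetical and are not part of the standard or of rohrpost.

```python
import re

# Named zones and their numeric equivalents, as listed in this section.
OBS_ZONES = {
    "UT": "+0000", "GMT": "+0000",
    "EDT": "-0400", "EST": "-0500",
    "CDT": "-0500", "CST": "-0600",
    "MDT": "-0600", "MST": "-0700",
    "PDT": "-0700", "PST": "-0800",
}


def interpret_year(digits: str) -> int:
    """Map a year token (two or more digits) to a full year."""
    if not digits.isdigit() or len(digits) < 2:
        raise ValueError("year must consist of two or more digits")
    value = int(digits)
    if len(digits) == 2:
        return value + 2000 if value <= 49 else value + 1900
    if len(digits) == 3:
        return value + 1900
    return value


def zone_to_offset(zone: str) -> str:
    """Map a zone token to a numeric offset string.

    Numeric zones pass through unchanged. Military single-letter zones
    and any unknown alphabetic zone fall back to "-0000".
    """
    if re.fullmatch(r"[+-]\d{4}", zone):
        return zone
    return OBS_ZONES.get(zone.upper(), "-0000")


if __name__ == "__main__":
    # The obsolete date from Appendix A.6.2: "21 Nov 97 09:55:06 GMT"
    print(interpret_year("97"), zone_to_offset("GMT"))
```

Falling back to "-0000" rather than guessing keeps the sketch in line with the SHOULD in this section: the offset is reported as unknown unless out-of-band information says otherwise.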
+ +obs-day-of-week = [CFWS] day-name [CFWS] + +obs-year = [CFWS] 2*DIGIT [CFWS] + +obs-month = CFWS month-name CFWS + +obs-day = [CFWS] 1*2DIGIT [CFWS] + +obs-hour = [CFWS] 2DIGIT [CFWS] + +obs-minute = [CFWS] 2DIGIT [CFWS] + +obs-second = [CFWS] 2DIGIT [CFWS] + +obs-zone = "UT" / "GMT" / ; Universal Time + + + +Resnick Standards Track [Page 31] + +RFC 2822 Internet Message Format April 2001 + + + ; North American UT + ; offsets + "EST" / "EDT" / ; Eastern: - 5/ - 4 + "CST" / "CDT" / ; Central: - 6/ - 5 + "MST" / "MDT" / ; Mountain: - 7/ - 6 + "PST" / "PDT" / ; Pacific: - 8/ - 7 + + %d65-73 / ; Military zones - "A" + %d75-90 / ; through "I" and "K" + %d97-105 / ; through "Z", both + %d107-122 ; upper and lower case + + Where a two or three digit year occurs in a date, the year is to be + interpreted as follows: If a two digit year is encountered whose + value is between 00 and 49, the year is interpreted by adding 2000, + ending up with a value between 2000 and 2049. If a two digit year is + encountered with a value between 50 and 99, or any three digit year + is encountered, the year is interpreted by adding 1900. + + In the obsolete time zone, "UT" and "GMT" are indications of + "Universal Time" and "Greenwich Mean Time" respectively and are both + semantically identical to "+0000". + + The remaining three character zones are the US time zones. The first + letter, "E", "C", "M", or "P" stands for "Eastern", "Central", + "Mountain" and "Pacific". The second letter is either "S" for + "Standard" time, or "D" for "Daylight" (or summer) time. 
Their + interpretations are as follows: + + EDT is semantically equivalent to -0400 + EST is semantically equivalent to -0500 + CDT is semantically equivalent to -0500 + CST is semantically equivalent to -0600 + MDT is semantically equivalent to -0600 + MST is semantically equivalent to -0700 + PDT is semantically equivalent to -0700 + PST is semantically equivalent to -0800 + + The 1 character military time zones were defined in a non-standard + way in [RFC822] and are therefore unpredictable in their meaning. + The original definitions of the military zones "A" through "I" are + equivalent to "+0100" through "+0900" respectively; "K", "L", and "M" + are equivalent to "+1000", "+1100", and "+1200" respectively; "N" + through "Y" are equivalent to "-0100" through "-1200" respectively; + and "Z" is equivalent to "+0000". However, because of the error in + [RFC822], they SHOULD all be considered equivalent to "-0000" unless + there is out-of-band information confirming their meaning. + + + + +Resnick Standards Track [Page 32] + +RFC 2822 Internet Message Format April 2001 + + + Other multi-character (usually between 3 and 5) alphabetic time zones + have been used in Internet messages. Any such time zone whose + meaning is not known SHOULD be considered equivalent to "-0000" + unless there is out-of-band information confirming their meaning. + +4.4. Obsolete Addressing + + There are three primary differences in addressing. First, mailbox + addresses were allowed to have a route portion before the addr-spec + when enclosed in "<" and ">". The route is simply a comma-separated + list of domain names, each preceded by "@", and the list terminated + by a colon. Second, CFWS were allowed between the period-separated + elements of local-part and domain (i.e., dot-atom was not used). In + addition, local-part is allowed to contain quoted-string in addition + to just atom. Finally, mailbox-list and address-list were allowed to + have "null" members. 
That is, there could be two or more commas in + such a list with nothing in between them. + +obs-angle-addr = [CFWS] "<" [obs-route] addr-spec ">" [CFWS] + +obs-route = [CFWS] obs-domain-list ":" [CFWS] + +obs-domain-list = "@" domain *(*(CFWS / "," ) [CFWS] "@" domain) + +obs-local-part = word *("." word) + +obs-domain = atom *("." atom) + +obs-mbox-list = 1*([mailbox] [CFWS] "," [CFWS]) [mailbox] + +obs-addr-list = 1*([address] [CFWS] "," [CFWS]) [address] + + When interpreting addresses, the route portion SHOULD be ignored. + +4.5. Obsolete header fields + + Syntactically, the primary difference in the obsolete field syntax is + that it allows multiple occurrences of any of the fields and they may + occur in any order. Also, any amount of white space is allowed + before the ":" at the end of the field name. + +obs-fields = *(obs-return / + obs-received / + obs-orig-date / + obs-from / + obs-sender / + obs-reply-to / + obs-to / + + + +Resnick Standards Track [Page 33] + +RFC 2822 Internet Message Format April 2001 + + + obs-cc / + obs-bcc / + obs-message-id / + obs-in-reply-to / + obs-references / + obs-subject / + obs-comments / + obs-keywords / + obs-resent-date / + obs-resent-from / + obs-resent-send / + obs-resent-rply / + obs-resent-to / + obs-resent-cc / + obs-resent-bcc / + obs-resent-mid / + obs-optional) + + Except for destination address fields (described in section 4.5.3), + the interpretation of multiple occurrences of fields is unspecified. + Also, the interpretation of trace fields and resent fields which do + not occur in blocks prepended to the message is unspecified as well. + Unless otherwise noted in the following sections, interpretation of + other fields is identical to the interpretation of their non-obsolete + counterparts in section 3. + +4.5.1. Obsolete origination date field + +obs-orig-date = "Date" *WSP ":" date-time CRLF + +4.5.2. 
Obsolete originator fields + +obs-from = "From" *WSP ":" mailbox-list CRLF + +obs-sender = "Sender" *WSP ":" mailbox CRLF + +obs-reply-to = "Reply-To" *WSP ":" mailbox-list CRLF + +4.5.3. Obsolete destination address fields + +obs-to = "To" *WSP ":" address-list CRLF + +obs-cc = "Cc" *WSP ":" address-list CRLF + +obs-bcc = "Bcc" *WSP ":" (address-list / [CFWS]) CRLF + + + + + + +Resnick Standards Track [Page 34] + +RFC 2822 Internet Message Format April 2001 + + + When multiple occurrences of destination address fields occur in a + message, they SHOULD be treated as if the address-list in the first + occurrence of the field is combined with the address lists of the + subsequent occurrences by adding a comma and concatenating. + +4.5.4. Obsolete identification fields + + The obsolete "In-Reply-To:" and "References:" fields differ from the + current syntax in that they allow phrase (words or quoted strings) to + appear. The obsolete forms of the left and right sides of msg-id + allow interspersed CFWS, making them syntactically identical to + local-part and domain respectively. + +obs-message-id = "Message-ID" *WSP ":" msg-id CRLF + +obs-in-reply-to = "In-Reply-To" *WSP ":" *(phrase / msg-id) CRLF + +obs-references = "References" *WSP ":" *(phrase / msg-id) CRLF + +obs-id-left = local-part + +obs-id-right = domain + + For purposes of interpretation, the phrases in the "In-Reply-To:" and + "References:" fields are ignored. + + Semantically, none of the optional CFWS surrounding the local-part + and the domain are part of the obs-id-left and obs-id-right + respectively. + +4.5.5. Obsolete informational fields + +obs-subject = "Subject" *WSP ":" unstructured CRLF + +obs-comments = "Comments" *WSP ":" unstructured CRLF + +obs-keywords = "Keywords" *WSP ":" obs-phrase-list CRLF + +4.5.6. 
Obsolete resent fields + + The obsolete syntax adds a "Resent-Reply-To:" field, which consists + of the field name, the optional comments and folding white space, the + colon, and a comma-separated list of addresses. + +obs-resent-from = "Resent-From" *WSP ":" mailbox-list CRLF + +obs-resent-send = "Resent-Sender" *WSP ":" mailbox CRLF + +obs-resent-date = "Resent-Date" *WSP ":" date-time CRLF + +obs-resent-to = "Resent-To" *WSP ":" address-list CRLF + +obs-resent-cc = "Resent-Cc" *WSP ":" address-list CRLF + +obs-resent-bcc = "Resent-Bcc" *WSP ":" + (address-list / [CFWS]) CRLF + +obs-resent-mid = "Resent-Message-ID" *WSP ":" msg-id CRLF + +obs-resent-rply = "Resent-Reply-To" *WSP ":" address-list CRLF + + As with other resent fields, the "Resent-Reply-To:" field is to be + treated as trace information only. + +4.5.7. Obsolete trace fields + + The obs-return and obs-received are again given here as template + definitions, just as return and received are in section 3. Their + full syntax is given in [RFC2821]. + +obs-return = "Return-Path" *WSP ":" path CRLF + +obs-received = "Received" *WSP ":" name-val-list CRLF + +obs-path = obs-angle-addr + +4.5.8. Obsolete optional fields + +obs-optional = field-name *WSP ":" unstructured CRLF + +5. Security Considerations + + Care needs to be taken when displaying messages on a terminal or + terminal emulator. Powerful terminals may act on escape sequences + and other combinations of ASCII control characters with a variety of + consequences. They can remap the keyboard or permit other + modifications to the terminal which could lead to denial of service + or even damaged data. They can trigger (sometimes programmable) + answerback messages which can allow a message to cause commands to be + issued on the recipient's behalf. They can also affect the operation + of terminal-attached devices such as printers. 
Message viewers may + wish to strip potentially dangerous terminal escape sequences from + the message prior to display. However, other escape sequences appear + in messages for useful purposes (cf. [RFC2045, RFC2046, RFC2047, + RFC2048, RFC2049, ISO2022]) and therefore should not be stripped + indiscriminately. + + + +Resnick Standards Track [Page 36] + +RFC 2822 Internet Message Format April 2001 + + + Transmission of non-text objects in messages raises additional + security issues. These issues are discussed in [RFC2045, RFC2046, + RFC2047, RFC2048, RFC2049]. + + Many implementations use the "Bcc:" (blind carbon copy) field + described in section 3.6.3 to facilitate sending messages to + recipients without revealing the addresses of one or more of the + addressees to the other recipients. Mishandling this use of "Bcc:" + has implications for confidential information that might be revealed, + which could eventually lead to security problems through knowledge of + even the existence of a particular mail address. For example, if + using the first method described in section 3.6.3, where the "Bcc:" + line is removed from the message, blind recipients have no explicit + indication that they have been sent a blind copy, except insofar as + their address does not appear in the message header. Because of + this, one of the blind addressees could potentially send a reply to + all of the shown recipients and accidentally reveal that the message + went to the blind recipient. When the second method from section + 3.6.3 is used, the blind recipient's address appears in the "Bcc:" + field of a separate copy of the message. If the "Bcc:" field sent + contains all of the blind addressees, all of the "Bcc:" recipients + will be seen by each "Bcc:" recipient. 
Even if a separate message is + sent to each "Bcc:" recipient with only the individual's address, + implementations still need to be careful to process replies to the + message as per section 3.6.3 so as not to accidentally reveal the + blind recipient to other recipients. + +6. Bibliography + + [ASCII] American National Standards Institute (ANSI), Coded + Character Set - 7-Bit American National Standard Code for + Information Interchange, ANSI X3.4, 1986. + + [ISO2022] International Organization for Standardization (ISO), + Information processing - ISO 7-bit and 8-bit coded + character sets - Code extension techniques, Third edition + - 1986-05-01, ISO 2022, 1986. + + [RFC822] Crocker, D., "Standard for the Format of ARPA Internet + Text Messages", RFC 822, August 1982. + + [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part One: Format of Internet Message + Bodies", RFC 2045, November 1996. + + [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part Two: Media Types", RFC 2046, + November 1996. + + [RFC2047] Moore, K., "Multipurpose Internet Mail Extensions (MIME) + Part Three: Message Header Extensions for Non-ASCII Text", + RFC 2047, November 1996. + + [RFC2048] Freed, N., Klensin, J. and J. Postel, "Multipurpose + Internet Mail Extensions (MIME) Part Four: Registration + Procedures", RFC 2048, November 1996. + + [RFC2049] Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part Five: Conformance Criteria and + Examples", RFC 2049, November 1996. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC2234] Crocker, D., Editor, and P. Overell, "Augmented BNF for + Syntax Specifications: ABNF", RFC 2234, November 1997. 
+ + [RFC2821] Klensin, J., Editor, "Simple Mail Transfer Protocol", RFC + 2821, March 2001. + + [STD3] Braden, R., "Host Requirements", STD 3, RFC 1122 and RFC + 1123, October 1989. + + [STD12] Mills, D., "Network Time Protocol", STD 12, RFC 1119, + September 1989. + + [STD13] Mockapetris, P., "Domain Name System", STD 13, RFC 1034 + and RFC 1035, November 1987. + + [STD14] Partridge, C., "Mail Routing and the Domain System", STD + 14, RFC 974, January 1986. + +7. Editor's Address + + Peter W. Resnick + QUALCOMM Incorporated + 5775 Morehouse Drive + San Diego, CA 92121-1714 + USA + + Phone: +1 858 651 4478 + Fax: +1 858 651 1102 + EMail: presnick@qualcomm.com + + + + + + + +Resnick Standards Track [Page 38] + +RFC 2822 Internet Message Format April 2001 + + +8. Acknowledgements + + Many people contributed to this document. They included folks who + participated in the Detailed Revision and Update of Messaging + Standards (DRUMS) Working Group of the Internet Engineering Task + Force (IETF), the chair of DRUMS, the Area Directors of the IETF, and + people who simply sent their comments in via e-mail. The editor is + deeply indebted to them all and thanks them sincerely. The below + list includes everyone who sent e-mail concerning this document. + Hopefully, everyone who contributed is named here: + + Matti Aarnio Barry Finkel Larry Masinter + Tanaka Akira Erik Forsberg Denis McKeon + Russ Allbery Chuck Foster William P McQuillan + Eric Allman Paul Fox Alexey Melnikov + Harald Tveit Alvestrand Klaus M. Frank Perry E. Metzger + Ran Atkinson Ned Freed Steven Miller + Jos Backus Jochen Friedrich Keith Moore + Bruce Balden Randall C. Gellens John Gardiner Myers + Dave Barr Sukvinder Singh Gill Chris Newman + Alan Barrett Tim Goodwin John W. Noerenberg + John Beck Philip Guenther Eric Norman + J. Robert von Behren Tony Hansen Mike O'Dell + Jos den Bekker John Hawkinson Larry Osterman + D. J. 
Bernstein Philip Hazel Paul Overell + James Berriman Kai Henningsen Jacob Palme + Norbert Bollow Robert Herriot Michael A. Patton + Raj Bose Paul Hethmon Uzi Paz + Antony Bowesman Jim Hill Michael A. Quinlan + Scott Bradner Paul E. Hoffman Eric S. Raymond + Randy Bush Steve Hole Sam Roberts + Tom Byrer Kari Hurtta Hugh Sasse + Bruce Campbell Marco S. Hyman Bart Schaefer + Larry Campbell Ofer Inbar Tom Scola + W. J. Carpenter Olle Jarnefors Wolfgang Segmuller + Michael Chapman Kevin Johnson Nick Shelness + Richard Clayton Sudish Joseph John Stanley + Maurizio Codogno Maynard Kang Einar Stefferud + Jim Conklin Prabhat Keni Jeff Stephenson + R. Kelley Cook John C. Klensin Bernard Stern + Steve Coya Graham Klyne Peter Sylvester + Mark Crispin Brad Knowles Mark Symons + Dave Crocker Shuhei Kobayashi Eric Thomas + Matt Curtin Peter Koch Lee Thompson + Michael D'Errico Dan Kohn Karel De Vriendt + Cyrus Daboo Christian Kuhtz Matthew Wall + Jutta Degener Anand Kumria Rolf Weber + Mark Delany Steen Larsen Brent B. Welch + + + +Resnick Standards Track [Page 39] + +RFC 2822 Internet Message Format April 2001 + + + Steve Dorner Eliot Lear Dan Wing + Harold A. Driscoll Barry Leiba Jack De Winter + Michael Elkins Jay Levitt Gregory J. Woodhouse + Robert Elz Lars-Johan Liman Greg A. Woods + Johnny Eriksson Charles Lindsey Kazu Yamamoto + Erik E. Fair Pete Loshin Alain Zahm + Roger Fajman Simon Lyall Jamie Zawinski + Patrik Faltstrom Bill Manning Timothy S. Zurcher + Claus Andre Farber John Martin + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 40] + +RFC 2822 Internet Message Format April 2001 + + +Appendix A. Example messages + + This section presents a selection of messages. 
These are intended to + assist in the implementation of this standard, but should not be + taken as normative; that is to say, although the examples in this + section were carefully reviewed, if there happens to be a conflict + between these examples and the syntax described in sections 3 and 4 + of this document, the syntax in those sections is to be taken as + correct. + + Messages are delimited in this section between lines of "----". The + "----" lines are not part of the message itself. + +A.1. Addressing examples + + The following are examples of messages that might be sent between two + individuals. + +A.1.1. A message from one person to another with simple addressing + + This could be called a canonical message. It has a single author, + John Doe, a single recipient, Mary Smith, a subject, the date, a + message identifier, and a textual message in the body. + +---- +From: John Doe <jdoe@machine.example> +To: Mary Smith <mary@example.net> +Subject: Saying Hello +Date: Fri, 21 Nov 1997 09:55:06 -0600 +Message-ID: <1234@local.machine.example> + +This is a message just to say hello. +So, "Hello". +---- + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 41] + +RFC 2822 Internet Message Format April 2001 + + + If John's secretary Michael actually sent the message, though John + was the author and replies to this message should go back to him, the + sender field would be used: + +---- +From: John Doe <jdoe@machine.example> +Sender: Michael Jones <mjones@machine.example> +To: Mary Smith <mary@example.net> +Subject: Saying Hello +Date: Fri, 21 Nov 1997 09:55:06 -0600 +Message-ID: <1234@local.machine.example> + +This is a message just to say hello. +So, "Hello". +---- + +A.1.2. Different types of mailboxes + + This message includes multiple addresses in the destination fields + and also uses several different forms of addresses. + +---- +From: "Joe Q. Public" <john.q.public@example.com> +To: Mary Smith <mary@x.test>, jdoe@example.org, Who? 
<one@y.test> +Cc: <boss@nil.test>, "Giant; \"Big\" Box" <sysservices@example.net> +Date: Tue, 1 Jul 2003 10:52:37 +0200 +Message-ID: <5678.21-Nov-1997@example.com> + +Hi everyone. +---- + + Note that the display names for Joe Q. Public and Giant; "Big" Box + needed to be enclosed in double-quotes because the former contains + the period and the latter contains both semicolon and double-quote + characters (the double-quote characters appearing as quoted-pair + construct). Conversely, the display name for Who? could appear + without them because the question mark is legal in an atom. Notice + also that jdoe@example.org and boss@nil.test have no display names + associated with them at all, and jdoe@example.org uses the simpler + address form without the angle brackets. + + + + + + + + + + + +Resnick Standards Track [Page 42] + +RFC 2822 Internet Message Format April 2001 + + +A.1.3. Group addresses + +---- +From: Pete <pete@silly.example> +To: A Group:Chris Jones <c@a.test>,joe@where.test,John <jdoe@one.test>; +Cc: Undisclosed recipients:; +Date: Thu, 13 Feb 1969 23:32:54 -0330 +Message-ID: <testabcd.1234@silly.example> + +Testing. +---- + + In this message, the "To:" field has a single group recipient named A + Group which contains 3 addresses, and a "Cc:" field with an empty + group recipient named Undisclosed recipients. + +A.2. Reply messages + + The following is a series of three messages that make up a + conversation thread between John and Mary. John firsts sends a + message to Mary, Mary then replies to John's message, and then John + replies to Mary's reply message. + + Note especially the "Message-ID:", "References:", and "In-Reply-To:" + fields in each message. + +---- +From: John Doe <jdoe@machine.example> +To: Mary Smith <mary@example.net> +Subject: Saying Hello +Date: Fri, 21 Nov 1997 09:55:06 -0600 +Message-ID: <1234@local.machine.example> + +This is a message just to say hello. +So, "Hello". 
+---- + + + + + + + + + + + + + + + +Resnick Standards Track [Page 43] + +RFC 2822 Internet Message Format April 2001 + + + When sending replies, the Subject field is often retained, though + prepended with "Re: " as described in section 3.6.5. + +---- +From: Mary Smith <mary@example.net> +To: John Doe <jdoe@machine.example> +Reply-To: "Mary Smith: Personal Account" <smith@home.example> +Subject: Re: Saying Hello +Date: Fri, 21 Nov 1997 10:01:10 -0600 +Message-ID: <3456@example.net> +In-Reply-To: <1234@local.machine.example> +References: <1234@local.machine.example> + +This is a reply to your hello. +---- + + Note the "Reply-To:" field in the above message. When John replies + to Mary's message above, the reply should go to the address in the + "Reply-To:" field instead of the address in the "From:" field. + +---- +To: "Mary Smith: Personal Account" <smith@home.example> +From: John Doe <jdoe@machine.example> +Subject: Re: Saying Hello +Date: Fri, 21 Nov 1997 11:00:00 -0600 +Message-ID: <abcd.1234@local.machine.tld> +In-Reply-To: <3456@example.net> +References: <1234@local.machine.example> <3456@example.net> + +This is a reply to your reply. +---- + +A.3. Resent messages + + Start with the message that has been used as an example several + times: + +---- +From: John Doe <jdoe@machine.example> +To: Mary Smith <mary@example.net> +Subject: Saying Hello +Date: Fri, 21 Nov 1997 09:55:06 -0600 +Message-ID: <1234@local.machine.example> + +This is a message just to say hello. +So, "Hello". 
+---- + + + + +Resnick Standards Track [Page 44] + +RFC 2822 Internet Message Format April 2001 + + + Say that Mary, upon receiving this message, wishes to send a copy of + the message to Jane such that (a) the message would appear to have + come straight from John; (b) if Jane replies to the message, the + reply should go back to John; and (c) all of the original + information, like the date the message was originally sent to Mary, + the message identifier, and the original addressee, is preserved. In + this case, resent fields are prepended to the message: + +---- +Resent-From: Mary Smith <mary@example.net> +Resent-To: Jane Brown <j-brown@other.example> +Resent-Date: Mon, 24 Nov 1997 14:22:01 -0800 +Resent-Message-ID: <78910@example.net> +From: John Doe <jdoe@machine.example> +To: Mary Smith <mary@example.net> +Subject: Saying Hello +Date: Fri, 21 Nov 1997 09:55:06 -0600 +Message-ID: <1234@local.machine.example> + +This is a message just to say hello. +So, "Hello". +---- + + If Jane, in turn, wished to resend this message to another person, + she would prepend her own set of resent header fields to the above + and send that. + + + + + + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 45] + +RFC 2822 Internet Message Format April 2001 + + +A.4. Messages with trace fields + + As messages are sent through the transport system as described in + [RFC2821], trace fields are prepended to the message. The following + is an example of what those trace fields might look like. Note that + there is some folding white space in the first one since these lines + can be long. 
+ +---- +Received: from x.y.test + by example.net + via TCP + with ESMTP + id ABC12345 + for <mary@example.net>; 21 Nov 1997 10:05:43 -0600 +Received: from machine.example by x.y.test; 21 Nov 1997 10:01:22 -0600 +From: John Doe <jdoe@machine.example> +To: Mary Smith <mary@example.net> +Subject: Saying Hello +Date: Fri, 21 Nov 1997 09:55:06 -0600 +Message-ID: <1234@local.machine.example> + +This is a message just to say hello. +So, "Hello". +---- + + + + + + + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 46] + +RFC 2822 Internet Message Format April 2001 + + +A.5. White space, comments, and other oddities + + White space, including folding white space, and comments can be + inserted between many of the tokens of fields. Taking the example + from A.1.3, white space and comments can be inserted into all of the + fields. + +---- +From: Pete(A wonderful \) chap) <pete(his account)@silly.test(his host)> +To:A Group(Some people) + :Chris Jones <c@(Chris's host.)public.example>, + joe@example.org, + John <jdoe@one.test> (my dear friend); (the end of the group) +Cc:(Empty list)(start)Undisclosed recipients :(nobody(that I know)) ; +Date: Thu, + 13 + Feb + 1969 + 23:32 + -0330 (Newfoundland Time) +Message-ID: <testabcd.1234@silly.test> + +Testing. +---- + + The above example is aesthetically displeasing, but perfectly legal. 
+ Note particularly (1) the comments in the "From:" field (including + one that has a ")" character appearing as part of a quoted-pair); (2) + the white space absent after the ":" in the "To:" field as well as + the comment and folding white space after the group name, the special + character (".") in the comment in Chris Jones's address, and the + folding white space before and after "joe@example.org,"; (3) the + multiple and nested comments in the "Cc:" field as well as the + comment immediately following the ":" after "Cc"; (4) the folding + white space (but no comments except at the end) and the missing + seconds in the time of the date field; and (5) the white space before + (but not within) the identifier in the "Message-ID:" field. + +A.6. Obsoleted forms + + The following are examples of obsolete (that is, the "MUST NOT + generate") syntactic elements described in section 4 of this + document. + + + + + + + + +Resnick Standards Track [Page 47] + +RFC 2822 Internet Message Format April 2001 + + +A.6.1. Obsolete addressing + + Note in the below example the lack of quotes around Joe Q. Public, + the route that appears in the address for Mary Smith, the two commas + that appear in the "To:" field, and the spaces that appear around the + "." in the jdoe address. + +---- +From: Joe Q. Public <john.q.public@example.com> +To: Mary Smith <@machine.tld:mary@example.net>, , jdoe@test . example +Date: Tue, 1 Jul 2003 10:52:37 +0200 +Message-ID: <5678.21-Nov-1997@example.com> + +Hi everyone. +---- + +A.6.2. Obsolete dates + + The following message uses an obsolete date format, including a non- + numeric time zone and a two digit year. Note that although the + day-of-week is missing, that is not specific to the obsolete syntax; + it is optional in the current syntax as well. 
+ +---- +From: John Doe <jdoe@machine.example> +To: Mary Smith <mary@example.net> +Subject: Saying Hello +Date: 21 Nov 97 09:55:06 GMT +Message-ID: <1234@local.machine.example> + +This is a message just to say hello. +So, "Hello". +---- + +A.6.3. Obsolete white space and comments + + White space and comments can appear between many more elements than + in the current syntax. Also, folding lines that are made up entirely + of white space are legal. + + + + + + + + + + + + +Resnick Standards Track [Page 48] + +RFC 2822 Internet Message Format April 2001 + + +---- +From : John Doe <jdoe@machine(comment). example> +To : Mary Smith +__ + <mary@example.net> +Subject : Saying Hello +Date : Fri, 21 Nov 1997 09(comment): 55 : 06 -0600 +Message-ID : <1234 @ local(blah) .machine .example> + +This is a message just to say hello. +So, "Hello". +---- + + Note especially the second line of the "To:" field. It starts with + two space characters. (Note that "__" represent blank spaces.) + Therefore, it is considered part of the folding as described in + section 4.2. Also, the comments and white space throughout + addresses, dates, and message identifiers are all part of the + obsolete syntax. + +Appendix B. Differences from earlier standards + + This appendix contains a list of changes that have been made in the + Internet Message Format from earlier standards, specifically [RFC822] + and [STD3]. Items marked with an asterisk (*) below are items which + appear in section 4 of this document and therefore can no longer be + generated. + + 1. Period allowed in obsolete form of phrase. + 2. ABNF moved out of document to [RFC2234]. + 3. Four or more digits allowed for year. + 4. Header field ordering (and lack thereof) made explicit. + 5. Encrypted header field removed. + 6. Received syntax loosened to allow any token/value pair. + 7. Specifically allow and give meaning to "-0000" time zone. + 8. Folding white space is not allowed between every token. + 9. 
Requirement for destinations removed. + 10. Forwarding and resending redefined. + 11. Extension header fields no longer specifically called out. + 12. ASCII 0 (null) removed.* + 13. Folding continuation lines cannot contain only white space.* + 14. Free insertion of comments not allowed in date.* + 15. Non-numeric time zones not allowed.* + 16. Two digit years not allowed.* + 17. Three digit years interpreted, but not allowed for generation. + 18. Routes in addresses not allowed.* + 19. CFWS within local-parts and domains not allowed.* + 20. Empty members of address lists not allowed.* + + + +Resnick Standards Track [Page 49] + +RFC 2822 Internet Message Format April 2001 + + + 21. Folding white space between field name and colon not allowed.* + 22. Comments between field name and colon not allowed. + 23. Tightened syntax of in-reply-to and references.* + 24. CFWS within msg-id not allowed.* + 25. Tightened semantics of resent fields as informational only. + 26. Resent-Reply-To not allowed.* + 27. No multiple occurrences of fields (except resent and received).* + 28. Free CR and LF not allowed.* + 29. Routes in return path not allowed.* + 30. Line length limits specified. + 31. Bcc more clearly specified. + +Appendix C. Notices + + Intellectual Property + + The IETF takes no position regarding the validity or scope of any + intellectual property or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; neither does it represent that it + has made any effort to identify any such rights. Information on the + IETF's procedures with respect to rights in standards-track and + standards-related documentation can be found in BCP-11. 
Copies of + claims of rights made available for publication and any assurances of + licenses to be made available, or the result of an attempt made to + obtain a general license or permission for the use of such + proprietary rights by implementors or users of this specification can + be obtained from the IETF Secretariat. + + + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 50] + +RFC 2822 Internet Message Format April 2001 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2001). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
+ +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 51] + diff --git a/rfc/rfc3501.txt b/rfc/rfc3501.txt @@ -0,0 +1,6051 @@ + + + + + + +Network Working Group M. Crispin +Request for Comments: 3501 University of Washington +Obsoletes: 2060 March 2003 +Category: Standards Track + + + INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4rev1 + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2003). All Rights Reserved. + +Abstract + + The Internet Message Access Protocol, Version 4rev1 (IMAP4rev1) + allows a client to access and manipulate electronic mail messages on + a server. IMAP4rev1 permits manipulation of mailboxes (remote + message folders) in a way that is functionally equivalent to local + folders. IMAP4rev1 also provides the capability for an offline + client to resynchronize with the server. + + IMAP4rev1 includes operations for creating, deleting, and renaming + mailboxes, checking for new messages, permanently removing messages, + setting and clearing flags, RFC 2822 and RFC 2045 parsing, searching, + and selective fetching of message attributes, texts, and portions + thereof. Messages in IMAP4rev1 are accessed by the use of numbers. + These numbers are either message sequence numbers or unique + identifiers. + + IMAP4rev1 supports a single server. A mechanism for accessing + configuration information to support multiple IMAP4rev1 servers is + discussed in RFC 2244. 
+ + IMAP4rev1 does not specify a means of posting mail; this function is + handled by a mail transfer protocol such as RFC 2821. + + + + + + + + +Crispin Standards Track [Page 1] + +RFC 3501 IMAPv4 March 2003 + + +Table of Contents + + IMAP4rev1 Protocol Specification ................................ 4 + 1. How to Read This Document ............................... 4 + 1.1. Organization of This Document ........................... 4 + 1.2. Conventions Used in This Document ....................... 4 + 1.3. Special Notes to Implementors ........................... 5 + 2. Protocol Overview ....................................... 6 + 2.1. Link Level .............................................. 6 + 2.2. Commands and Responses .................................. 6 + 2.2.1. Client Protocol Sender and Server Protocol Receiver ..... 6 + 2.2.2. Server Protocol Sender and Client Protocol Receiver ..... 7 + 2.3. Message Attributes ...................................... 8 + 2.3.1. Message Numbers ......................................... 8 + 2.3.1.1. Unique Identifier (UID) Message Attribute ....... 8 + 2.3.1.2. Message Sequence Number Message Attribute ....... 10 + 2.3.2. Flags Message Attribute ................................. 11 + 2.3.3. Internal Date Message Attribute ......................... 12 + 2.3.4. [RFC-2822] Size Message Attribute ....................... 12 + 2.3.5. Envelope Structure Message Attribute .................... 12 + 2.3.6. Body Structure Message Attribute ........................ 12 + 2.4. Message Texts ........................................... 13 + 3. State and Flow Diagram .................................. 13 + 3.1. Not Authenticated State ................................. 13 + 3.2. Authenticated State ..................................... 13 + 3.3. Selected State .......................................... 13 + 3.4. Logout State ............................................ 14 + 4. Data Formats ............................................ 16 + 4.1. 
Atom .................................................... 16 + 4.2. Number .................................................. 16 + 4.3. String .................................................. 16 + 4.3.1. 8-bit and Binary Strings ................................ 17 + 4.4. Parenthesized List ...................................... 17 + 4.5. NIL ..................................................... 17 + 5. Operational Considerations .............................. 18 + 5.1. Mailbox Naming .......................................... 18 + 5.1.1. Mailbox Hierarchy Naming ................................ 19 + 5.1.2. Mailbox Namespace Naming Convention ..................... 19 + 5.1.3. Mailbox International Naming Convention ................. 19 + 5.2. Mailbox Size and Message Status Updates ................. 21 + 5.3. Response when no Command in Progress .................... 21 + 5.4. Autologout Timer ........................................ 22 + 5.5. Multiple Commands in Progress ........................... 22 + 6. Client Commands ........................................ 23 + 6.1. Client Commands - Any State ............................ 24 + 6.1.1. CAPABILITY Command ..................................... 24 + 6.1.2. NOOP Command ........................................... 25 + 6.1.3. LOGOUT Command ......................................... 26 + + + +Crispin Standards Track [Page 2] + +RFC 3501 IMAPv4 March 2003 + + + 6.2. Client Commands - Not Authenticated State .............. 26 + 6.2.1. STARTTLS Command ....................................... 27 + 6.2.2. AUTHENTICATE Command ................................... 28 + 6.2.3. LOGIN Command .......................................... 30 + 6.3. Client Commands - Authenticated State .................. 31 + 6.3.1. SELECT Command ......................................... 32 + 6.3.2. EXAMINE Command ........................................ 34 + 6.3.3. CREATE Command ......................................... 34 + 6.3.4. 
DELETE Command ......................................... 35 + 6.3.5. RENAME Command ......................................... 37 + 6.3.6. SUBSCRIBE Command ...................................... 39 + 6.3.7. UNSUBSCRIBE Command .................................... 39 + 6.3.8. LIST Command ........................................... 40 + 6.3.9. LSUB Command ........................................... 43 + 6.3.10. STATUS Command ......................................... 44 + 6.3.11. APPEND Command ......................................... 46 + 6.4. Client Commands - Selected State ....................... 47 + 6.4.1. CHECK Command .......................................... 47 + 6.4.2. CLOSE Command .......................................... 48 + 6.4.3. EXPUNGE Command ........................................ 49 + 6.4.4. SEARCH Command ......................................... 49 + 6.4.5. FETCH Command .......................................... 54 + 6.4.6. STORE Command .......................................... 58 + 6.4.7. COPY Command ........................................... 59 + 6.4.8. UID Command ............................................ 60 + 6.5. Client Commands - Experimental/Expansion ............... 62 + 6.5.1. X<atom> Command ........................................ 62 + 7. Server Responses ....................................... 62 + 7.1. Server Responses - Status Responses .................... 63 + 7.1.1. OK Response ............................................ 65 + 7.1.2. NO Response ............................................ 66 + 7.1.3. BAD Response ........................................... 66 + 7.1.4. PREAUTH Response ....................................... 67 + 7.1.5. BYE Response ........................................... 67 + 7.2. Server Responses - Server and Mailbox Status ........... 68 + 7.2.1. CAPABILITY Response .................................... 68 + 7.2.2. LIST Response .......................................... 69 + 7.2.3. 
LSUB Response .......................................... 70 + 7.2.4 STATUS Response ........................................ 70 + 7.2.5. SEARCH Response ........................................ 71 + 7.2.6. FLAGS Response ......................................... 71 + 7.3. Server Responses - Mailbox Size ........................ 71 + 7.3.1. EXISTS Response ........................................ 71 + 7.3.2. RECENT Response ........................................ 72 + 7.4. Server Responses - Message Status ...................... 72 + 7.4.1. EXPUNGE Response ....................................... 72 + 7.4.2. FETCH Response ......................................... 73 + 7.5. Server Responses - Command Continuation Request ........ 79 + + + +Crispin Standards Track [Page 3] + +RFC 3501 IMAPv4 March 2003 + + + 8. Sample IMAP4rev1 connection ............................ 80 + 9. Formal Syntax .......................................... 81 + 10. Author's Note .......................................... 92 + 11. Security Considerations ................................ 92 + 11.1. STARTTLS Security Considerations ....................... 92 + 11.2. Other Security Considerations .......................... 93 + 12. IANA Considerations .................................... 94 + Appendices ..................................................... 95 + A. References ............................................. 95 + B. Changes from RFC 2060 .................................. 97 + C. Key Word Index ......................................... 103 + Author's Address ............................................... 107 + Full Copyright Statement ....................................... 108 + +IMAP4rev1 Protocol Specification + +1. How to Read This Document + +1.1. Organization of This Document + + This document is written from the point of view of the implementor of + an IMAP4rev1 client or server. 
Beyond the protocol overview in + section 2, it is not optimized for someone trying to understand the + operation of the protocol. The material in sections 3 through 5 + provides the general context and definitions with which IMAP4rev1 + operates. + + Sections 6, 7, and 9 describe the IMAP commands, responses, and + syntax, respectively. The relationships among these are such that it + is almost impossible to understand any of them separately. In + particular, do not attempt to deduce command syntax from the command + section alone; instead refer to the Formal Syntax section. + +1.2. Conventions Used in This Document + + "Conventions" are basic principles or procedures. Document + conventions are noted in this section. + + In examples, "C:" and "S:" indicate lines sent by the client and + server respectively. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "MAY", and "OPTIONAL" in this document are to + be interpreted as described in [KEYWORDS]. + + The word "can" (not "may") is used to refer to a possible + circumstance or situation, as opposed to an optional facility of the + protocol. + + + +Crispin Standards Track [Page 4] + +RFC 3501 IMAPv4 March 2003 + + + "User" is used to refer to a human user, whereas "client" refers to + the software being run by the user. + + "Connection" refers to the entire sequence of client/server + interaction from the initial establishment of the network connection + until its termination. + + "Session" refers to the sequence of client/server interaction from + the time that a mailbox is selected (SELECT or EXAMINE command) until + the time that selection ends (SELECT or EXAMINE of another mailbox, + CLOSE command, or connection termination). + + Characters are 7-bit US-ASCII unless otherwise specified. Other + character sets are indicated using a "CHARSET", as described in + [MIME-IMT] and defined in [CHARSET]. 
CHARSETs have important + additional semantics in addition to defining character set; refer to + these documents for more detail. + + There are several protocol conventions in IMAP. These refer to + aspects of the specification which are not strictly part of the IMAP + protocol, but reflect generally-accepted practice. Implementations + need to be aware of these conventions, and avoid conflicts whether or + not they implement the convention. For example, "&" may not be used + as a hierarchy delimiter since it conflicts with the Mailbox + International Naming Convention, and other uses of "&" in mailbox + names are impacted as well. + +1.3. Special Notes to Implementors + + Implementors of the IMAP protocol are strongly encouraged to read the + IMAP implementation recommendations document [IMAP-IMPLEMENTATION] in + conjunction with this document, to help understand the intricacies of + this protocol and how best to build an interoperable product. + + IMAP4rev1 is designed to be upwards compatible from the [IMAP2] and + unpublished IMAP2bis protocols. IMAP4rev1 is largely compatible with + the IMAP4 protocol described in RFC 1730; the exception being in + certain facilities added in RFC 1730 that proved problematic and were + subsequently removed. In the course of the evolution of IMAP4rev1, + some aspects in the earlier protocols have become obsolete. Obsolete + commands, responses, and data formats which an IMAP4rev1 + implementation can encounter when used with an earlier implementation + are described in [IMAP-OBSOLETE]. + + Other compatibility issues with IMAP2bis, the most common variant of + the earlier protocol, are discussed in [IMAP-COMPAT]. A full + discussion of compatibility issues with rare (and presumed extinct) + + + + +Crispin Standards Track [Page 5] + +RFC 3501 IMAPv4 March 2003 + + + variants of [IMAP2] is in [IMAP-HISTORICAL]; this document is + primarily of historical interest. 
+ + IMAP was originally developed for the older [RFC-822] standard, and + as a consequence several fetch items in IMAP incorporate "RFC822" in + their name. With the exception of RFC822.SIZE, there are more modern + replacements; for example, the modern version of RFC822.HEADER is + BODY.PEEK[HEADER]. In all cases, "RFC822" should be interpreted as a + reference to the updated [RFC-2822] standard. + +2. Protocol Overview + +2.1. Link Level + + The IMAP4rev1 protocol assumes a reliable data stream such as that + provided by TCP. When TCP is used, an IMAP4rev1 server listens on + port 143. + +2.2. Commands and Responses + + An IMAP4rev1 connection consists of the establishment of a + client/server network connection, an initial greeting from the + server, and client/server interactions. These client/server + interactions consist of a client command, server data, and a server + completion result response. + + All interactions transmitted by client and server are in the form of + lines, that is, strings that end with a CRLF. The protocol receiver + of an IMAP4rev1 client or server is either reading a line, or is + reading a sequence of octets with a known count followed by a line. + +2.2.1. Client Protocol Sender and Server Protocol Receiver + + The client command begins an operation. Each client command is + prefixed with an identifier (typically a short alphanumeric string, + e.g., A0001, A0002, etc.) called a "tag". A different tag is + generated by the client for each command. + + Clients MUST follow the syntax outlined in this specification + strictly. It is a syntax error to send a command with missing or + extraneous spaces or arguments. + + There are two cases in which a line from the client does not + represent a complete command. In one case, a command argument is + quoted with an octet count (see the description of literal in String + under Data Formats); in the other case, the command arguments require + server feedback (see the AUTHENTICATE command). 
In either case, the + + + + +Crispin Standards Track [Page 6] + +RFC 3501 IMAPv4 March 2003 + + + server sends a command continuation request response if it is ready + for the octets (if appropriate) and the remainder of the command. + This response is prefixed with the token "+". + + Note: If instead, the server detected an error in the + command, it sends a BAD completion response with a tag + matching the command (as described below) to reject the + command and prevent the client from sending any more of the + command. + + It is also possible for the server to send a completion + response for some other command (if multiple commands are + in progress), or untagged data. In either case, the + command continuation request is still pending; the client + takes the appropriate action for the response, and reads + another response from the server. In all cases, the client + MUST send a complete command (including receiving all + command continuation request responses and command + continuations for the command) before initiating a new + command. + + The protocol receiver of an IMAP4rev1 server reads a command line + from the client, parses the command and its arguments, and transmits + server data and a server command completion result response. + +2.2.2. Server Protocol Sender and Client Protocol Receiver + + Data transmitted by the server to the client and status responses + that do not indicate command completion are prefixed with the token + "*", and are called untagged responses. + + Server data MAY be sent as a result of a client command, or MAY be + sent unilaterally by the server. There is no syntactic difference + between server data that resulted from a specific command and server + data that were sent unilaterally. + + The server completion result response indicates the success or + failure of the operation. It is tagged with the same tag as the + client command which began the operation. 
Thus, if more than one + command is in progress, the tag in a server completion response + identifies the command to which the response applies. There are + three possible server completion responses: OK (indicating success), + NO (indicating failure), or BAD (indicating a protocol error such as + unrecognized command or command syntax error). + + Servers SHOULD enforce the syntax outlined in this specification + strictly. Any client command with a protocol syntax error, including + (but not limited to) missing or extraneous spaces or arguments, + + + +Crispin Standards Track [Page 7] + +RFC 3501 IMAPv4 March 2003 + + + SHOULD be rejected, and the client given a BAD server completion + response. + + The protocol receiver of an IMAP4rev1 client reads a response line + from the server. It then takes action on the response based upon the + first token of the response, which can be a tag, a "*", or a "+". + + A client MUST be prepared to accept any server response at all times. + This includes server data that was not requested. Server data SHOULD + be recorded, so that the client can reference its recorded copy + rather than sending a command to the server to request the data. In + the case of certain server data, the data MUST be recorded. + + This topic is discussed in greater detail in the Server Responses + section. + +2.3. Message Attributes + + In addition to message text, each message has several attributes + associated with it. These attributes can be retrieved individually + or in conjunction with other attributes or message texts. + +2.3.1. Message Numbers + + Messages in IMAP4rev1 are accessed by one of two numbers; the unique + identifier or the message sequence number. + + +2.3.1.1. 
Unique Identifier (UID) Message Attribute + + A 32-bit value assigned to each message, which when used with the + unique identifier validity value (see below) forms a 64-bit value + that MUST NOT refer to any other message in the mailbox or any + subsequent mailbox with the same name forever. Unique identifiers + are assigned in a strictly ascending fashion in the mailbox; as each + message is added to the mailbox it is assigned a higher UID than the + message(s) which were added previously. Unlike message sequence + numbers, unique identifiers are not necessarily contiguous. + + The unique identifier of a message MUST NOT change during the + session, and SHOULD NOT change between sessions. Any change of + unique identifiers between sessions MUST be detectable using the + UIDVALIDITY mechanism discussed below. Persistent unique identifiers + are required for a client to resynchronize its state from a previous + session with the server (e.g., disconnected or offline access + clients); this is discussed further in [IMAP-DISC]. + + + + + +Crispin Standards Track [Page 8] + +RFC 3501 IMAPv4 March 2003 + + + Associated with every mailbox are two values which aid in unique + identifier handling: the next unique identifier value and the unique + identifier validity value. + + The next unique identifier value is the predicted value that will be + assigned to a new message in the mailbox. Unless the unique + identifier validity also changes (see below), the next unique + identifier value MUST have the following two characteristics. First, + the next unique identifier value MUST NOT change unless new messages + are added to the mailbox; and second, the next unique identifier + value MUST change whenever new messages are added to the mailbox, + even if those new messages are subsequently expunged. 
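The UID rules above can be sketched from the client's side. This is a minimal illustration, not part of the RFC: the class name `MailboxUidState`, the helper names, and the choice to put UIDVALIDITY in the high 32 bits of the combined key are all assumptions made for the example. The text only says that the pair forms a 64-bit value and that the next unique identifier value changes exactly when messages are added.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MailboxUidState:
    """Hypothetical client-side record of a mailbox's UID bookkeeping."""
    uidvalidity: int  # unique identifier validity value
    uidnext: int      # next unique identifier value


def message_key(uidvalidity: int, uid: int) -> int:
    """Combine the two 32-bit values into a single 64-bit key.

    Placing UIDVALIDITY in the high half is this sketch's choice.
    """
    for name, value in (("uidvalidity", uidvalidity), ("uid", uid)):
        if not 0 <= value < 2**32:
            raise ValueError(f"{name} must fit in 32 bits")
    return (uidvalidity << 32) | uid


def new_messages_arrived(cached: MailboxUidState,
                         current: MailboxUidState) -> bool:
    """Report whether messages were added since `cached` was recorded.

    The comparison is only meaningful while UIDVALIDITY is unchanged,
    so a changed validity value is reported as an error instead.
    """
    if cached.uidvalidity != current.uidvalidity:
        raise ValueError("UIDVALIDITY changed; cached UIDs are not comparable")
    # The next UID value changes if and only if messages were added,
    # even when those messages have since been expunged.
    return current.uidnext != cached.uidnext
```

A client could store one `MailboxUidState` per mailbox at the end of a session and compare it with the values reported at the next selection before deciding whether to fetch anything.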
+ + Note: The next unique identifier value is intended to + provide a means for a client to determine whether any + messages have been delivered to the mailbox since the + previous time it checked this value. It is not intended to + provide any guarantee that any message will have this + unique identifier. A client can only assume, at the time + that it obtains the next unique identifier value, that + messages arriving after that time will have a UID greater + than or equal to that value. + + The unique identifier validity value is sent in a UIDVALIDITY + response code in an OK untagged response at mailbox selection time. + If unique identifiers from an earlier session fail to persist in this + session, the unique identifier validity value MUST be greater than + the one used in the earlier session. + + Note: Ideally, unique identifiers SHOULD persist at all + times. Although this specification recognizes that failure + to persist can be unavoidable in certain server + environments, it STRONGLY ENCOURAGES message store + implementation techniques that avoid this problem. For + example: + + 1) Unique identifiers MUST be strictly ascending in the + mailbox at all times. If the physical message store is + re-ordered by a non-IMAP agent, this requires that the + unique identifiers in the mailbox be regenerated, since + the former unique identifiers are no longer strictly + ascending as a result of the re-ordering. + + 2) If the message store has no mechanism to store unique + identifiers, it must regenerate unique identifiers at + each session, and each session must have a unique + UIDVALIDITY value. + + + + +Crispin Standards Track [Page 9] + +RFC 3501 IMAPv4 March 2003 + + + 3) If the mailbox is deleted and a new mailbox with the + same name is created at a later date, the server must + either keep track of unique identifiers from the + previous instance of the mailbox, or it must assign a + new UIDVALIDITY value to the new instance of the + mailbox. 
A good UIDVALIDITY value to use in this case + is a 32-bit representation of the creation date/time of + the mailbox. It is alright to use a constant such as + 1, but only if it guaranteed that unique identifiers + will never be reused, even in the case of a mailbox + being deleted (or renamed) and a new mailbox by the + same name created at some future time. + + 4) The combination of mailbox name, UIDVALIDITY, and UID + must refer to a single immutable message on that server + forever. In particular, the internal date, [RFC-2822] + size, envelope, body structure, and message texts + (RFC822, RFC822.HEADER, RFC822.TEXT, and all BODY[...] + fetch data items) must never change. This does not + include message numbers, nor does it include attributes + that can be set by a STORE command (e.g., FLAGS). + + +2.3.1.2. Message Sequence Number Message Attribute + + A relative position from 1 to the number of messages in the mailbox. + This position MUST be ordered by ascending unique identifier. As + each new message is added, it is assigned a message sequence number + that is 1 higher than the number of messages in the mailbox before + that new message was added. + + Message sequence numbers can be reassigned during the session. For + example, when a message is permanently removed (expunged) from the + mailbox, the message sequence number for all subsequent messages is + decremented. The number of messages in the mailbox is also + decremented. Similarly, a new message can be assigned a message + sequence number that was once held by some other message prior to an + expunge. + + In addition to accessing messages by relative position in the + mailbox, message sequence numbers can be used in mathematical + calculations. For example, if an untagged "11 EXISTS" is received, + and previously an untagged "8 EXISTS" was received, three new + messages have arrived with message sequence numbers of 9, 10, and 11. 
+ Another example, if message 287 in a 523 message mailbox has UID + 12345, there are exactly 286 messages which have lesser UIDs and 236 + messages which have greater UIDs. + + + + +Crispin Standards Track [Page 10] + +RFC 3501 IMAPv4 March 2003 + + +2.3.2. Flags Message Attribute + + A list of zero or more named tokens associated with the message. A + flag is set by its addition to this list, and is cleared by its + removal. There are two types of flags in IMAP4rev1. A flag of + either type can be permanent or session-only. + + A system flag is a flag name that is pre-defined in this + specification. All system flags begin with "\". Certain system + flags (\Deleted and \Seen) have special semantics described + elsewhere. The currently-defined system flags are: + + \Seen + Message has been read + + \Answered + Message has been answered + + \Flagged + Message is "flagged" for urgent/special attention + + \Deleted + Message is "deleted" for removal by later EXPUNGE + + \Draft + Message has not completed composition (marked as a draft). + + \Recent + Message is "recently" arrived in this mailbox. This session + is the first session to have been notified about this + message; if the session is read-write, subsequent sessions + will not see \Recent set for this message. This flag can not + be altered by the client. + + If it is not possible to determine whether or not this + session is the first session to be notified about a message, + then that message SHOULD be considered recent. + + If multiple connections have the same mailbox selected + simultaneously, it is undefined which of these connections + will see newly-arrived messages with \Recent set and which + will see it without \Recent set. + + A keyword is defined by the server implementation. Keywords do not + begin with "\". Servers MAY permit the client to define new keywords + in the mailbox (see the description of the PERMANENTFLAGS response + code for more information). 
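The flag model described in section 2.3.2 (a list of named tokens, set by addition and cleared by removal, with system flags starting with "\" and keywords not) can be sketched as follows. The function names and the keyword `project-x` in the usage note are hypothetical, and flag names are compared exactly as spelled in the list above, which is a simplification made for this example.

```python
RECENT = "\\Recent"

# The system flags currently defined in section 2.3.2.
SYSTEM_FLAGS = frozenset(
    {"\\Seen", "\\Answered", "\\Flagged", "\\Deleted", "\\Draft", RECENT}
)


def classify_flag(flag: str) -> str:
    """Return "system" for names starting with a backslash, else "keyword"."""
    return "system" if flag.startswith("\\") else "keyword"


def store_flags(current, add=(), remove=()):
    """Return the flag set after adding and removing the given flags.

    A flag is set by adding it to the list and cleared by removing it.
    \\Recent cannot be altered by the client, so any attempt to add or
    remove it is rejected here.
    """
    if RECENT in set(add) | set(remove):
        raise ValueError("\\Recent cannot be altered by the client")
    return (frozenset(current) | frozenset(add)) - frozenset(remove)
```

For instance, `store_flags({"\\Seen"}, add=["project-x"])` yields a set holding both the system flag and the keyword, while passing `"\\Recent"` in either argument raises an error.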
+ + + + +Crispin Standards Track [Page 11] + +RFC 3501 IMAPv4 March 2003 + + + A flag can be permanent or session-only on a per-flag basis. + Permanent flags are those which the client can add or remove from the + message flags permanently; that is, concurrent and subsequent + sessions will see any change in permanent flags. Changes to session + flags are valid only in that session. + + Note: The \Recent system flag is a special case of a + session flag. \Recent can not be used as an argument in a + STORE or APPEND command, and thus can not be changed at + all. + +2.3.3. Internal Date Message Attribute + + The internal date and time of the message on the server. This + is not the date and time in the [RFC-2822] header, but rather a + date and time which reflects when the message was received. In + the case of messages delivered via [SMTP], this SHOULD be the + date and time of final delivery of the message as defined by + [SMTP]. In the case of messages delivered by the IMAP4rev1 COPY + command, this SHOULD be the internal date and time of the source + message. In the case of messages delivered by the IMAP4rev1 + APPEND command, this SHOULD be the date and time as specified in + the APPEND command description. All other cases are + implementation defined. + +2.3.4. [RFC-2822] Size Message Attribute + + The number of octets in the message, as expressed in [RFC-2822] + format. + +2.3.5. Envelope Structure Message Attribute + + A parsed representation of the [RFC-2822] header of the message. + Note that the IMAP Envelope structure is not the same as an + [SMTP] envelope. + +2.3.6. Body Structure Message Attribute + + A parsed representation of the [MIME-IMB] body structure + information of the message. + + + + + + + + + + + +Crispin Standards Track [Page 12] + +RFC 3501 IMAPv4 March 2003 + + +2.4. Message Texts + + In addition to being able to fetch the full [RFC-2822] text of a + message, IMAP4rev1 permits the fetching of portions of the full + message text. 
Specifically, it is possible to fetch the
+   [RFC-2822] message header, [RFC-2822] message body, a [MIME-IMB]
+   body part, or a [MIME-IMB] header.
+
+3. State and Flow Diagram
+
+   Once the connection between client and server is established, an
+   IMAP4rev1 connection is in one of four states. The initial
+   state is identified in the server greeting. Most commands are
+   only valid in certain states. It is a protocol error for the
+   client to attempt a command while the connection is in an
+   inappropriate state, and the server will respond with a BAD or
+   NO (depending upon server implementation) command completion
+   result.
+
+3.1. Not Authenticated State
+
+   In the not authenticated state, the client MUST supply
+   authentication credentials before most commands will be
+   permitted. This state is entered when a connection starts
+   unless the connection has been pre-authenticated.
+
+3.2. Authenticated State
+
+   In the authenticated state, the client is authenticated and MUST
+   select a mailbox to access before commands that affect messages
+   will be permitted. This state is entered when a
+   pre-authenticated connection starts, when acceptable
+   authentication credentials have been provided, after an error in
+   selecting a mailbox, or after a successful CLOSE command.
+
+3.3. Selected State
+
+   In a selected state, a mailbox has been selected to access.
+   This state is entered when a mailbox has been successfully
+   selected.
+
+3.4. Logout State
+
+   In the logout state, the connection is being terminated. This
+   state can be entered as a result of a client request (via the
+   LOGOUT command) or by unilateral action on the part of either
+   the client or server.
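The four states and the ways of entering them, as described in sections 3.1 through 3.4, can be sketched as a small transition table. The event labels ("auth_ok", "select_ok", and so on) are hypothetical names chosen for this sketch; the specification itself only names the commands and greetings that cause each transition.

```python
from enum import Enum


class State(Enum):
    NOT_AUTHENTICATED = "not authenticated"
    AUTHENTICATED = "authenticated"
    SELECTED = "selected"
    LOGOUT = "logout"


# Hypothetical event labels mapped to the state changes in the text.
TRANSITIONS = {
    (State.NOT_AUTHENTICATED, "auth_ok"): State.AUTHENTICATED,
    (State.AUTHENTICATED, "select_ok"): State.SELECTED,
    (State.SELECTED, "select_ok"): State.SELECTED,
    (State.SELECTED, "select_failed"): State.AUTHENTICATED,
    (State.SELECTED, "close_ok"): State.AUTHENTICATED,
}


def initial_state(greeting):
    """Map the server greeting (OK, PREAUTH or BYE) to the first state."""
    return {
        "OK": State.NOT_AUTHENTICATED,
        "PREAUTH": State.AUTHENTICATED,
        "BYE": State.LOGOUT,
    }[greeting.upper()]


def next_state(state, event):
    """Return the state after an event; unknown events change nothing."""
    if event == "logout":  # LOGOUT command or unilateral close
        return State.LOGOUT
    return TRANSITIONS.get((state, event), state)
```

Keeping unknown events as no-ops mirrors the rule that a command issued in an inappropriate state is rejected without moving the connection to another state.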
+
+   If the client requests the logout state, the server MUST send an
+   untagged BYE response and a tagged OK response to the LOGOUT
+   command before the server closes the connection; and the client
+   MUST read the tagged OK response to the LOGOUT command before
+   the client closes the connection.
+
+   A server MUST NOT unilaterally close the connection without
+   sending an untagged BYE response that contains the reason for
+   having done so. A client SHOULD NOT unilaterally close the
+   connection, and instead SHOULD issue a LOGOUT command. If the
+   server detects that the client has unilaterally closed the
+   connection, the server MAY omit the untagged BYE response and
+   simply close its connection.
+
+                   +----------------------+
+                   |connection established|
+                   +----------------------+
+                              ||
+                              \/
+            +--------------------------------------+
+            |          server greeting             |
+            +--------------------------------------+
+                      || (1)       || (2)        || (3)
+                      \/           ||            ||
+            +-----------------+    ||            ||
+            |Not Authenticated|    ||            ||
+            +-----------------+    ||            ||
+             || (7)   || (4)       ||            ||
+             ||       \/           \/            ||
+             ||     +----------------+           ||
+             ||     | Authenticated  |<=++       ||
+             ||     +----------------+  ||       ||
+             ||       || (7)   || (5)   || (6)   ||
+             ||       ||       \/       ||       ||
+             ||       ||    +--------+  ||       ||
+             ||       ||    |Selected|==++       ||
+             ||       ||    +--------+           ||
+             ||       ||       || (7)            ||
+             \/       \/       \/                \/
+            +--------------------------------------+
+            |               Logout                 |
+            +--------------------------------------+
+                              ||
+                              \/
+                +-------------------------------+
+                |both sides close the connection|
+                +-------------------------------+
+
+         (1) connection without pre-authentication (OK greeting)
+         (2) pre-authenticated connection (PREAUTH greeting)
+         (3) rejected connection (BYE greeting)
+         (4) successful LOGIN or AUTHENTICATE command
+         (5) successful SELECT or EXAMINE command
+         (6) CLOSE command, or failed SELECT or EXAMINE command
+         (7) LOGOUT command, server shutdown, or connection closed
+
+4. Data Formats
+
+   IMAP4rev1 uses textual commands and responses. Data in
+   IMAP4rev1 can be in one of several forms: atom, number, string,
+   parenthesized list, or NIL. Note that a particular data item
+   may take more than one form; for example, a data item defined as
+   using "astring" syntax may be either an atom or a string.
+
+4.1. Atom
+
+   An atom consists of one or more non-special characters.
+
+4.2. Number
+
+   A number consists of one or more digit characters, and
+   represents a numeric value.
+
+4.3. String
+
+   A string is in one of two forms: either literal or quoted
+   string. The literal form is the general form of string. The
+   quoted string form is an alternative that avoids the overhead of
+   processing a literal at the cost of limitations of characters
+   which may be used.
+
+   A literal is a sequence of zero or more octets (including CR and
+   LF), prefix-quoted with an octet count in the form of an open
+   brace ("{"), the number of octets, close brace ("}"), and CRLF.
+   In the case of literals transmitted from server to client, the
+   CRLF is immediately followed by the octet data. In the case of
+   literals transmitted from client to server, the client MUST wait
+   to receive a command continuation request (described later in
+   this document) before sending the octet data (and the remainder
+   of the command).
+
+   A quoted string is a sequence of zero or more 7-bit characters,
+   excluding CR and LF, with double quote (<">) characters at each
+   end.
+
+   The empty string is represented as either "" (a quoted string
+   with zero characters between double quotes) or as {0} followed
+   by CRLF (a literal with an octet count of 0).
+
+      Note: Even if the octet count is 0, a client transmitting a
+      literal MUST wait to receive a command continuation request.
+
+4.3.1. 8-bit and Binary Strings
+
+   8-bit textual and binary mail is supported through the use of a
+   [MIME-IMB] content transfer encoding. IMAP4rev1 implementations MAY
+   transmit 8-bit or multi-octet characters in literals, but SHOULD do
+   so only when the [CHARSET] is identified.
+
+   Although a BINARY body encoding is defined, unencoded binary strings
+   are not permitted. A "binary string" is any string with NUL
+   characters. Implementations MUST encode binary data into a textual
+   form, such as BASE64, before transmitting the data. A string with an
+   excessive amount of CTL characters MAY also be considered to be
+   binary.
+
+4.4. Parenthesized List
+
+   Data structures are represented as a "parenthesized list"; a sequence
+   of data items, delimited by space, and bounded at each end by
+   parentheses. A parenthesized list can contain other parenthesized
+   lists, using multiple levels of parentheses to indicate nesting.
+
+   The empty list is represented as () -- a parenthesized list with no
+   members.
+
+4.5. NIL
+
+   The special form "NIL" represents the non-existence of a particular
+   data item that is represented as a string or parenthesized list, as
+   distinct from the empty string "" or the empty parenthesized list ().
+
+      Note: NIL is never used for any data item which takes the
+      form of an atom. For example, a mailbox name of "NIL" is a
+      mailbox named NIL as opposed to a non-existent mailbox
+      name. This is because mailbox uses "astring" syntax which
+      is an atom or a string. Conversely, an addr-name of NIL is
+      a non-existent personal name, because addr-name uses
+      "nstring" syntax which is NIL or a string, but never an
+      atom.
+
+5. Operational Considerations
+
+   The following rules are listed here to ensure that all IMAP4rev1
+   implementations interoperate properly.
+
+5.1. Mailbox Naming
+
+   Mailbox names are 7-bit.
Client implementations MUST NOT attempt to
+   create 8-bit mailbox names, and SHOULD interpret any 8-bit mailbox
+   names returned by LIST or LSUB as UTF-8. Server implementations
+   SHOULD prohibit the creation of 8-bit mailbox names, and SHOULD NOT
+   return 8-bit mailbox names in LIST or LSUB. See section 5.1.3 for
+   more information on how to represent non-ASCII mailbox names.
+
+      Note: 8-bit mailbox names were undefined in earlier
+      versions of this protocol. Some sites used a local 8-bit
+      character set to represent non-ASCII mailbox names. Such
+      usage is not interoperable, and is now formally deprecated.
+
+   The case-insensitive mailbox name INBOX is a special name reserved to
+   mean "the primary mailbox for this user on this server". The
+   interpretation of all other names is implementation-dependent.
+
+   In particular, this specification takes no position on case
+   sensitivity in non-INBOX mailbox names. Some server implementations
+   are fully case-sensitive; others preserve case of a newly-created
+   name but otherwise are case-insensitive; and yet others coerce names
+   to a particular case. Client implementations MUST interact with any
+   of these. If a server implementation interprets non-INBOX mailbox
+   names as case-insensitive, it MUST treat names using the
+   international naming convention specially as described in section
+   5.1.3.
+
+   There are certain client considerations when creating a new mailbox
+   name:
+
+   1) Any character which is one of the atom-specials (see the Formal
+      Syntax) will require that the mailbox name be represented as a
+      quoted string or literal.
+
+   2) CTL and other non-graphic characters are difficult to represent
+      in a user interface and are best avoided.
+
+   3) Although the list-wildcard characters ("%" and "*") are valid
+      in a mailbox name, it is difficult to use such mailbox names
+      with the LIST and LSUB commands due to the conflict with
+      wildcard interpretation.
+
+   4) Usually, a character (determined by the server implementation)
+      is reserved to delimit levels of hierarchy.
+
+   5) Two characters, "#" and "&", have meanings by convention, and
+      should be avoided except when used in that convention.
+
+5.1.1. Mailbox Hierarchy Naming
+
+   If it is desired to export hierarchical mailbox names, mailbox names
+   MUST be left-to-right hierarchical using a single character to
+   separate levels of hierarchy. The same hierarchy separator character
+   is used for all levels of hierarchy within a single name.
+
+5.1.2. Mailbox Namespace Naming Convention
+
+   By convention, the first hierarchical element of any mailbox name
+   which begins with "#" identifies the "namespace" of the remainder of
+   the name. This makes it possible to disambiguate between different
+   types of mailbox stores, each of which have their own namespaces.
+
+      For example, implementations which offer access to USENET
+      newsgroups MAY use the "#news" namespace to partition the
+      USENET newsgroup namespace from that of other mailboxes.
+      Thus, the comp.mail.misc newsgroup would have a mailbox
+      name of "#news.comp.mail.misc", and the name
+      "comp.mail.misc" can refer to a different object (e.g., a
+      user's private mailbox).
+
+5.1.3. Mailbox International Naming Convention
+
+   By convention, international mailbox names in IMAP4rev1 are specified
+   using a modified version of the UTF-7 encoding described in [UTF-7].
+   Modified UTF-7 may also be usable in servers that implement an
+   earlier version of this protocol.
+
+   In modified UTF-7, printable US-ASCII characters, except for "&",
+   represent themselves; that is, characters with octet values 0x20-0x25
+   and 0x27-0x7e. The character "&" (0x26) is represented by the
+   two-octet sequence "&-".
+
+   All other characters (octet values 0x00-0x1f and 0x7f-0xff) are
+   represented in modified BASE64, with a further modification from
+   [UTF-7] that "," is used instead of "/". Modified BASE64 MUST NOT be
+   used to represent any printing US-ASCII character which can represent
+   itself.
+
+   "&" is used to shift to modified BASE64 and "-" to shift back to
+   US-ASCII. There is no implicit shift from BASE64 to US-ASCII, and
+   null shifts ("-&" while in BASE64; note that "&-" while in US-ASCII
+   means "&") are not permitted. However, all names start in US-ASCII,
+   and MUST end in US-ASCII; that is, a name that ends with a non-ASCII
+   ISO-10646 character MUST end with a "-".
+
+   The purpose of these modifications is to correct the following
+   problems with UTF-7:
+
+   1) UTF-7 uses the "+" character for shifting; this conflicts with
+      the common use of "+" in mailbox names, in particular USENET
+      newsgroup names.
+
+   2) UTF-7's encoding is BASE64 which uses the "/" character; this
+      conflicts with the use of "/" as a popular hierarchy delimiter.
+
+   3) UTF-7 prohibits the unencoded usage of "\"; this conflicts with
+      the use of "\" as a popular hierarchy delimiter.
+
+   4) UTF-7 prohibits the unencoded usage of "~"; this conflicts with
+      the use of "~" in some servers as a home directory indicator.
+
+   5) UTF-7 permits multiple alternate forms to represent the same
+      string; in particular, printable US-ASCII characters can be
+      represented in encoded form.
+
+      Although modified UTF-7 is a convention, it establishes certain
+      requirements on server handling of any mailbox name with an
+      embedded "&" character. In particular, server implementations
+      MUST preserve the exact form of the modified BASE64 portion of a
+      modified UTF-7 name and treat that text as case-sensitive, even if
+      names are otherwise case-insensitive or case-folded.
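The shifting rules above can be sketched as a pair of Python helpers. This is a minimal illustration under stated assumptions: the function names are hypothetical, the shifted runs are taken to be UTF-16 big-endian text in unpadded BASE64 as in [UTF-7], and the decoder does not reject superfluous shifts or encoded printable characters, which a strict implementation would.

```python
import base64


def _encode_run(run):
    # Assumption from [UTF-7]: UTF-16BE octets, BASE64 without "=" padding.
    raw = run.encode("utf-16-be")
    b64 = base64.b64encode(raw).decode("ascii").rstrip("=")
    return "&" + b64.replace("/", ",") + "-"


def encode_mutf7(name):
    """Encode a mailbox name with modified UTF-7 (hypothetical helper)."""
    out, pending = [], []
    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            if pending:                      # close the current shifted run
                out.append(_encode_run("".join(pending)))
                pending = []
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)               # collect into one shifted run
    if pending:
        out.append(_encode_run("".join(pending)))
    return "".join(out)


def decode_mutf7(encoded):
    """Decode modified UTF-7; raises ValueError if a shift is not closed."""
    out, i = [], 0
    while i < len(encoded):
        if encoded[i] != "&":
            out.append(encoded[i])
            i += 1
            continue
        end = encoded.index("-", i)          # explicit shift back is required
        chunk = encoded[i + 1:end]
        if chunk == "":
            out.append("&")                  # "&-" stands for a literal "&"
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)     # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)
```

Collecting adjacent non-ASCII characters into a single run is what avoids the superfluous shifts that the convention forbids.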
+
+      Server implementations SHOULD verify that any mailbox name with an
+      embedded "&" character, used as an argument to CREATE, is: in the
+      correctly modified UTF-7 syntax, has no superfluous shifts, and
+      has no encoding in modified BASE64 of any printing US-ASCII
+      character which can represent itself. However, client
+      implementations MUST NOT depend upon the server doing this, and
+      SHOULD NOT attempt to create a mailbox name with an embedded "&"
+      character unless it complies with the modified UTF-7 syntax.
+
+      Server implementations which export a mail store that does not
+      follow the modified UTF-7 convention MUST convert to modified
+      UTF-7 any mailbox name that contains either non-ASCII characters
+      or the "&" character.
+
+         For example, here is a mailbox name which mixes English,
+         Chinese, and Japanese text:
+         ~peter/mail/&U,BTFw-/&ZeVnLIqe-
+
+         For example, the string "&Jjo!" is not a valid mailbox
+         name because it does not contain a shift to US-ASCII
+         before the "!". The correct form is "&Jjo-!". The
+         string "&U,BTFw-&ZeVnLIqe-" is not permitted because it
+         contains a superfluous shift. The correct form is
+         "&U,BTF2XlZyyKng-".
+
+5.2. Mailbox Size and Message Status Updates
+
+   At any time, a server can send data that the client did not request.
+   Sometimes, such behavior is REQUIRED. For example, agents other than
+   the server MAY add messages to the mailbox (e.g., new message
+   delivery), change the flags of the messages in the mailbox (e.g.,
+   simultaneous access to the same mailbox by multiple agents), or even
+   remove messages from the mailbox. A server MUST send mailbox size
+   updates automatically if a mailbox size change is observed during the
+   processing of a command. A server SHOULD send message flag updates
+   automatically, without requiring the client to request such updates
+   explicitly.
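One way a client might record unsolicited mailbox size updates is sketched below. The class and its regular expression are hypothetical and deliberately narrow: only untagged EXISTS and EXPUNGE lines are handled, EXISTS is taken as the new message count, and each EXPUNGE is taken to remove exactly one message, as described in the EXPUNGE response section of the specification.

```python
import re

# Matches untagged "* <n> EXISTS" and "* <n> EXPUNGE" lines only.
_SIZE_UPDATE = re.compile(r"^\* (\d+) (EXISTS|EXPUNGE)\s*$", re.IGNORECASE)


class MailboxSize:
    """Track the message count of the selected mailbox (illustrative)."""

    def __init__(self, exists=0):
        self.exists = exists

    def feed(self, line):
        """Update the count from one server line; ignore everything else."""
        match = _SIZE_UPDATE.match(line.rstrip("\r\n"))
        if match is None:
            return self.exists
        number, kind = int(match.group(1)), match.group(2).upper()
        if kind == "EXISTS":
            self.exists = number     # EXISTS reports the new total
        else:
            self.exists -= 1         # EXPUNGE removes a single message
        return self.exists
```

Feeding every received line through such a tracker, rather than only the responses to SELECT, is what lets the client notice changes reported during unrelated commands such as NOOP.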
+
+   Special rules exist for server notification of a client about the
+   removal of messages to prevent synchronization errors; see the
+   description of the EXPUNGE response for more detail. In particular,
+   it is NOT permitted to send an EXISTS response that would reduce the
+   number of messages in the mailbox; only the EXPUNGE response can do
+   this.
+
+   Regardless of what implementation decisions a client makes on
+   remembering data from the server, a client implementation MUST record
+   mailbox size updates. It MUST NOT assume that any command after the
+   initial mailbox selection will return the size of the mailbox.
+
+5.3. Response when no Command in Progress
+
+   Server implementations are permitted to send an untagged response
+   (except for EXPUNGE) while there is no command in progress. Server
+   implementations that send such responses MUST deal with flow control
+   considerations. Specifically, they MUST either (1) verify that the
+   size of the data does not exceed the underlying transport's available
+   window size, or (2) use non-blocking writes.
+
+5.4. Autologout Timer
+
+   If a server has an inactivity autologout timer, the duration of that
+   timer MUST be at least 30 minutes. The receipt of ANY command from
+   the client during that interval SHOULD suffice to reset the
+   autologout timer.
+
+5.5. Multiple Commands in Progress
+
+   The client MAY send another command without waiting for the
+   completion result response of a command, subject to ambiguity rules
+   (see below) and flow control constraints on the underlying data
+   stream. Similarly, a server MAY begin processing another command
+   before processing the current command to completion, subject to
+   ambiguity rules. However, any command continuation request responses
+   and command continuations MUST be negotiated before any subsequent
+   command is initiated.
+
+   The exception is if an ambiguity would result because of a command
+   that would affect the results of other commands. Clients MUST NOT
+   send multiple commands without waiting if an ambiguity would result.
+   If the server detects a possible ambiguity, it MUST execute commands
+   to completion in the order given by the client.
+
+   The most obvious example of ambiguity is when a command would affect
+   the results of another command, e.g., a FETCH of a message's flags
+   and a STORE of that same message's flags.
+
+   A non-obvious ambiguity occurs with commands that permit an untagged
+   EXPUNGE response (commands other than FETCH, STORE, and SEARCH),
+   since an untagged EXPUNGE response can invalidate sequence numbers in
+   a subsequent command. This is not a problem for FETCH, STORE, or
+   SEARCH commands because servers are prohibited from sending EXPUNGE
+   responses while any of those commands are in progress. Therefore, if
+   the client sends any command other than FETCH, STORE, or SEARCH, it
+   MUST wait for the completion result response before sending a command
+   with message sequence numbers.
+
+      Note: UID FETCH, UID STORE, and UID SEARCH are different
+      commands from FETCH, STORE, and SEARCH. If the client
+      sends a UID command, it must wait for a completion result
+      response before sending a command with message sequence
+      numbers.
+
+   For example, the following non-waiting command sequences are invalid:
+
+      FETCH + NOOP + STORE
+      STORE + COPY + FETCH
+      COPY + COPY
+      CHECK + FETCH
+
+   The following are examples of valid non-waiting command sequences:
+
+      FETCH + STORE + SEARCH + CHECK
+      STORE + COPY + EXPUNGE
+
+      UID SEARCH + UID SEARCH may be valid or invalid as a non-waiting
+      command sequence, depending upon whether or not the second UID
+      SEARCH contains message sequence numbers.
+
+6. Client Commands
+
+   IMAP4rev1 commands are described in this section.
Commands are
+   organized by the state in which the command is permitted. Commands
+   which are permitted in multiple states are listed in the minimum
+   permitted state (for example, commands valid in authenticated and
+   selected state are listed in the authenticated state commands).
+
+   Command arguments, identified by "Arguments:" in the command
+   descriptions below, are described by function, not by syntax. The
+   precise syntax of command arguments is described in the Formal Syntax
+   section.
+
+   Some commands cause specific server responses to be returned; these
+   are identified by "Responses:" in the command descriptions below.
+   See the response descriptions in the Responses section for
+   information on these responses, and the Formal Syntax section for the
+   precise syntax of these responses. It is possible for server data to
+   be transmitted as a result of any command. Thus, commands that do
+   not specifically require server data specify "no specific responses
+   for this command" instead of "none".
+
+   The "Result:" in the command description refers to the possible
+   tagged status responses to a command, and any special interpretation
+   of these status responses.
+
+   The state of a connection is only changed by successful commands
+   which are documented as changing state. A rejected command (BAD
+   response) never changes the state of the connection or of the
+   selected mailbox. A failed command (NO response) generally does not
+   change the state of the connection or of the selected mailbox; the
+   exception being the SELECT and EXAMINE commands.
+
+6.1. Client Commands - Any State
+
+   The following commands are valid in any state: CAPABILITY, NOOP, and
+   LOGOUT.
+
+6.1.1. CAPABILITY Command
+
+   Arguments:  none
+
+   Responses:  REQUIRED untagged response: CAPABILITY
+
+   Result:     OK - capability completed
+               BAD - command unknown or arguments invalid
+
+      The CAPABILITY command requests a listing of capabilities that the
+      server supports. The server MUST send a single untagged
+      CAPABILITY response with "IMAP4rev1" as one of the listed
+      capabilities before the (tagged) OK response.
+
+      A capability name which begins with "AUTH=" indicates that the
+      server supports that particular authentication mechanism. All
+      such names are, by definition, part of this specification. For
+      example, the authorization capability for an experimental
+      "blurdybloop" authenticator would be "AUTH=XBLURDYBLOOP" and not
+      "XAUTH=BLURDYBLOOP" or "XAUTH=XBLURDYBLOOP".
+
+      Other capability names refer to extensions, revisions, or
+      amendments to this specification. See the documentation of the
+      CAPABILITY response for additional information. No capabilities,
+      beyond the base IMAP4rev1 set defined in this specification, are
+      enabled without explicit client action to invoke the capability.
+
+      Client and server implementations MUST implement the STARTTLS,
+      LOGINDISABLED, and AUTH=PLAIN (described in [IMAP-TLS])
+      capabilities. See the Security Considerations section for
+      important information.
+
+      See the section entitled "Client Commands -
+      Experimental/Expansion" for information about the form of site or
+      implementation-specific capabilities.
+
+   Example:    C: abcd CAPABILITY
+               S: * CAPABILITY IMAP4rev1 STARTTLS AUTH=GSSAPI
+               LOGINDISABLED
+               S: abcd OK CAPABILITY completed
+               C: efgh STARTTLS
+               S: efgh OK STARTTLS completed
+               <TLS negotiation, further commands are under [TLS] layer>
+               C: ijkl CAPABILITY
+               S: * CAPABILITY IMAP4rev1 AUTH=GSSAPI AUTH=PLAIN
+               S: ijkl OK CAPABILITY completed
+
+6.1.2. NOOP Command
+
+   Arguments:  none
+
+   Responses:  no specific responses for this command (but see below)
+
+   Result:     OK - noop completed
+               BAD - command unknown or arguments invalid
+
+      The NOOP command always succeeds. It does nothing.
+
+      Since any command can return a status update as untagged data, the
+      NOOP command can be used as a periodic poll for new messages or
+      message status updates during a period of inactivity (this is the
+      preferred method to do this). The NOOP command can also be used
+      to reset any inactivity autologout timer on the server.
+
+   Example:    C: a002 NOOP
+               S: a002 OK NOOP completed
+                  . . .
+               C: a047 NOOP
+               S: * 22 EXPUNGE
+               S: * 23 EXISTS
+               S: * 3 RECENT
+               S: * 14 FETCH (FLAGS (\Seen \Deleted))
+               S: a047 OK NOOP completed
+
+6.1.3. LOGOUT Command
+
+   Arguments:  none
+
+   Responses:  REQUIRED untagged response: BYE
+
+   Result:     OK - logout completed
+               BAD - command unknown or arguments invalid
+
+      The LOGOUT command informs the server that the client is done with
+      the connection. The server MUST send a BYE untagged response
+      before the (tagged) OK response, and then close the network
+      connection.
+
+   Example:    C: A023 LOGOUT
+               S: * BYE IMAP4rev1 Server logging out
+               S: A023 OK LOGOUT completed
+               (Server and client then close the connection)
+
+6.2. Client Commands - Not Authenticated State
+
+   In the not authenticated state, the AUTHENTICATE or LOGIN command
+   establishes authentication and enters the authenticated state. The
+   AUTHENTICATE command provides a general mechanism for a variety of
+   authentication techniques, privacy protection, and integrity
+   checking; whereas the LOGIN command uses a traditional user name and
+   plaintext password pair and has no means of establishing privacy
+   protection or integrity checking.
+
+   The STARTTLS command is an alternate form of establishing session
+   privacy protection and integrity checking, but does not establish
+   authentication or enter the authenticated state.
+
+   Server implementations MAY allow access to certain mailboxes without
+   establishing authentication. This can be done by means of the
+   ANONYMOUS [SASL] authenticator described in [ANONYMOUS]. An older
+   convention is a LOGIN command using the userid "anonymous"; in this
+   case, a password is required although the server may choose to accept
+   any password. The restrictions placed on anonymous users are
+   implementation-dependent.
+
+   Once authenticated (including as anonymous), it is not possible to
+   re-enter not authenticated state.
+
+   In addition to the universal commands (CAPABILITY, NOOP, and LOGOUT),
+   the following commands are valid in the not authenticated state:
+   STARTTLS, AUTHENTICATE and LOGIN. See the Security Considerations
+   section for important information about these commands.
+
+6.2.1. STARTTLS Command
+
+   Arguments:  none
+
+   Responses:  no specific response for this command
+
+   Result:     OK - starttls completed, begin TLS negotiation
+               BAD - command unknown or arguments invalid
+
+      A [TLS] negotiation begins immediately after the CRLF at the end
+      of the tagged OK response from the server. Once a client issues a
+      STARTTLS command, it MUST NOT issue further commands until a
+      server response is seen and the [TLS] negotiation is complete.
+
+      The server remains in the non-authenticated state, even if client
+      credentials are supplied during the [TLS] negotiation. This does
+      not preclude an authentication mechanism such as EXTERNAL (defined
+      in [SASL]) from using client identity determined by the [TLS]
+      negotiation.
+
+      Once [TLS] has been started, the client MUST discard cached
+      information about server capabilities and SHOULD re-issue the
+      CAPABILITY command.
This is necessary to protect against man-in-
+      the-middle attacks which alter the capabilities list prior to
+      STARTTLS. The server MAY advertise different capabilities after
+      STARTTLS.
+
+   Example:    C: a001 CAPABILITY
+               S: * CAPABILITY IMAP4rev1 STARTTLS LOGINDISABLED
+               S: a001 OK CAPABILITY completed
+               C: a002 STARTTLS
+               S: a002 OK Begin TLS negotiation now
+               <TLS negotiation, further commands are under [TLS] layer>
+               C: a003 CAPABILITY
+               S: * CAPABILITY IMAP4rev1 AUTH=PLAIN
+               S: a003 OK CAPABILITY completed
+               C: a004 LOGIN joe password
+               S: a004 OK LOGIN completed
+
+6.2.2. AUTHENTICATE Command
+
+   Arguments:  authentication mechanism name
+
+   Responses:  continuation data can be requested
+
+   Result:     OK - authenticate completed, now in authenticated state
+               NO - authenticate failure: unsupported authentication
+                    mechanism, credentials rejected
+               BAD - command unknown or arguments invalid,
+                    authentication exchange cancelled
+
+      The AUTHENTICATE command indicates a [SASL] authentication
+      mechanism to the server. If the server supports the requested
+      authentication mechanism, it performs an authentication protocol
+      exchange to authenticate and identify the client. It MAY also
+      negotiate an OPTIONAL security layer for subsequent protocol
+      interactions. If the requested authentication mechanism is not
+      supported, the server SHOULD reject the AUTHENTICATE command by
+      sending a tagged NO response.
+
+      The AUTHENTICATE command does not support the optional "initial
+      response" feature of [SASL]. Section 5.1 of [SASL] specifies how
+      to handle an authentication mechanism which uses an initial
+      response.
+
+      The service name specified by this protocol's profile of [SASL] is
+      "imap".
+
+      The authentication protocol exchange consists of a series of
+      server challenges and client responses that are specific to the
+      authentication mechanism.
A server challenge consists of a
+      command continuation request response with the "+" token followed
+      by a BASE64 encoded string. The client response consists of a
+      single line consisting of a BASE64 encoded string. If the client
+      wishes to cancel an authentication exchange, it issues a line
+      consisting of a single "*". If the server receives such a
+      response, it MUST reject the AUTHENTICATE command by sending a
+      tagged BAD response.
+
+      If a security layer is negotiated through the [SASL]
+      authentication exchange, it takes effect immediately following the
+      CRLF that concludes the authentication exchange for the client,
+      and the CRLF of the tagged OK response for the server.
+
+      While client and server implementations MUST implement the
+      AUTHENTICATE command itself, it is not required to implement any
+      authentication mechanisms other than the PLAIN mechanism described
+      in [IMAP-TLS]. Also, an authentication mechanism is not required
+      to support any security layers.
+
+         Note: a server implementation MUST implement a
+         configuration in which it does NOT permit any plaintext
+         password mechanisms, unless either the STARTTLS command
+         has been negotiated or some other mechanism that
+         protects the session from password snooping has been
+         provided. Server sites SHOULD NOT use any configuration
+         which permits a plaintext password mechanism without
+         such a protection mechanism against password snooping.
+         Client and server implementations SHOULD implement
+         additional [SASL] mechanisms that do not use plaintext
+         passwords, such as the GSSAPI mechanism described in [SASL]
+         and/or the [DIGEST-MD5] mechanism.
+
+      Servers and clients can support multiple authentication
+      mechanisms. The server SHOULD list its supported authentication
+      mechanisms in the response to the CAPABILITY command so that the
+      client knows which authentication mechanisms to use.
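A client could pick its authentication approach from the untagged CAPABILITY response roughly as sketched below. The function names and the preference order ("GSSAPI" before "PLAIN") are assumptions made for illustration; the only rules taken from the specification are that "AUTH=" names advertise mechanisms and that LOGIN must not be sent when LOGINDISABLED is advertised.

```python
def parse_capability(line):
    """Return (capabilities, mechanisms) from an untagged CAPABILITY line."""
    parts = line.strip().split()
    if len(parts) < 2 or parts[0] != "*" or parts[1].upper() != "CAPABILITY":
        raise ValueError("not an untagged CAPABILITY response")
    # Capability names are compared without regard to case.
    caps = {part.upper() for part in parts[2:]}
    mechanisms = sorted(cap[5:] for cap in caps if cap.startswith("AUTH="))
    return caps, mechanisms


def choose_auth(line, preferred=("GSSAPI", "PLAIN")):
    """Pick ("AUTHENTICATE", mech), ("LOGIN", None) or (None, None).

    `preferred` is a hypothetical client-side preference order.
    """
    caps, mechanisms = parse_capability(line)
    for mechanism in preferred:
        if mechanism in mechanisms:
            return ("AUTHENTICATE", mechanism)
    if "LOGINDISABLED" not in caps:
        return ("LOGIN", None)      # last resort, only if permitted
    return (None, None)             # e.g. STARTTLS is needed first
```

With the pre-TLS response from the STARTTLS example, `choose_auth` yields `(None, None)`, which a client would read as a cue to negotiate TLS and ask for capabilities again.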
+
+      A server MAY include a CAPABILITY response code in the tagged OK
+      response of a successful AUTHENTICATE command in order to send
+      capabilities automatically. It is unnecessary for a client to
+      send a separate CAPABILITY command if it recognizes these
+      automatic capabilities. This should only be done if a security
+      layer was not negotiated by the AUTHENTICATE command, because the
+      tagged OK response as part of an AUTHENTICATE command is not
+      protected by encryption/integrity checking. [SASL] requires the
+      client to re-issue a CAPABILITY command in this case.
+
+      If an AUTHENTICATE command fails with a NO response, the client
+      MAY try another authentication mechanism by issuing another
+      AUTHENTICATE command. It MAY also attempt to authenticate by
+      using the LOGIN command (see section 6.2.3 for more detail). In
+      other words, the client MAY request authentication types in
+      decreasing order of preference, with the LOGIN command as a last
+      resort.
+
+      The authorization identity passed from the client to the server
+      during the authentication exchange is interpreted by the server as
+      the user name whose privileges the client is requesting.
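The framing of the challenge and response lines described for AUTHENTICATE can be sketched as follows. The helper names are hypothetical, and the mechanism-specific content of each challenge and response is left entirely to the caller; only the "+" token, the BASE64 wrapping and the "*" cancel line come from the text above.

```python
import base64

# A line consisting of a single "*" cancels the exchange.
CANCEL_LINE = "*\r\n"


def decode_challenge(line):
    """Return the raw octets of a server challenge line ("+ <BASE64>")."""
    text = line.rstrip("\r\n")
    if not text.startswith("+"):
        raise ValueError("not a command continuation request")
    return base64.b64decode(text[1:].strip())  # empty challenge -> b""


def encode_response(data):
    """Wrap mechanism-specific octets as a single BASE64 response line."""
    return base64.b64encode(data).decode("ascii") + "\r\n"
```

An empty challenge decodes to zero octets and an empty response is just CRLF, which matches the bare "S: +" and "C:" lines in the GSSAPI example.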
+
+   Example:    S: * OK IMAP4rev1 Server
+               C: A001 AUTHENTICATE GSSAPI
+               S: +
+               C: YIIB+wYJKoZIhvcSAQICAQBuggHqMIIB5qADAgEFoQMCAQ6iBw
+                  MFACAAAACjggEmYYIBIjCCAR6gAwIBBaESGxB1Lndhc2hpbmd0
+                  b24uZWR1oi0wK6ADAgEDoSQwIhsEaW1hcBsac2hpdmFtcy5jYW
+                  Mud2FzaGluZ3Rvbi5lZHWjgdMwgdCgAwIBAaEDAgEDooHDBIHA
+                  cS1GSa5b+fXnPZNmXB9SjL8Ollj2SKyb+3S0iXMljen/jNkpJX
+                  AleKTz6BQPzj8duz8EtoOuNfKgweViyn/9B9bccy1uuAE2HI0y
+                  C/PHXNNU9ZrBziJ8Lm0tTNc98kUpjXnHZhsMcz5Mx2GR6dGknb
+                  I0iaGcRerMUsWOuBmKKKRmVMMdR9T3EZdpqsBd7jZCNMWotjhi
+                  vd5zovQlFqQ2Wjc2+y46vKP/iXxWIuQJuDiisyXF0Y8+5GTpAL
+                  pHDc1/pIGmMIGjoAMCAQGigZsEgZg2on5mSuxoDHEA1w9bcW9n
+                  FdFxDKpdrQhVGVRDIzcCMCTzvUboqb5KjY1NJKJsfjRQiBYBdE
+                  NKfzK+g5DlV8nrw81uOcP8NOQCLR5XkoMHC0Dr/80ziQzbNqhx
+                  O6652Npft0LQwJvenwDI13YxpwOdMXzkWZN/XrEqOWp6GCgXTB
+                  vCyLWLlWnbaUkZdEYbKHBPjd8t/1x5Yg==
+               S: + YGgGCSqGSIb3EgECAgIAb1kwV6ADAgEFoQMCAQ+iSzBJoAMC
+                  AQGiQgRAtHTEuOP2BXb9sBYFR4SJlDZxmg39IxmRBOhXRKdDA0
+                  uHTCOT9Bq3OsUTXUlk0CsFLoa8j+gvGDlgHuqzWHPSQg==
+               C:
+               S: + YDMGCSqGSIb3EgECAgIBAAD/////6jcyG4GE3KkTzBeBiVHe
+                  ceP2CWY0SR0fAQAgAAQEBAQ=
+               C: YDMGCSqGSIb3EgECAgIBAAD/////3LQBHXTpFfZgrejpLlLImP
+                  wkhbfa2QteAQAgAG1yYwE=
+               S: A001 OK GSSAPI authentication successful
+
+      Note: The line breaks within server challenges and client
+      responses are for editorial clarity and are not in real
+      authenticators.
+
+6.2.3. LOGIN Command
+
+   Arguments:  user name
+               password
+
+   Responses:  no specific responses for this command
+
+   Result:     OK - login completed, now in authenticated state
+               NO - login failure: user name or password rejected
+               BAD - command unknown or arguments invalid
+
+      The LOGIN command identifies the client to the server and carries
+      the plaintext password authenticating this user.
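As a rough sketch of how the string forms from section 4.3 apply to the LOGIN arguments, the helpers below build a LOGIN line using quoted strings. The function names are hypothetical. The backslash escaping of `"` and `\` inside a quoted string follows the Formal Syntax section of the specification; values that need a literal are rejected here rather than sent, because a client literal requires waiting for a command continuation request.

```python
def imap_quote(value):
    """Return value as a quoted string, or raise if a literal is needed."""
    for ch in value:
        # Quoted strings are 7-bit and exclude CR and LF; NUL is never valid.
        if ch in "\r\n\x00" or ord(ch) > 0x7F:
            raise ValueError("cannot be sent as a quoted string")
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def literal_prefix(data):
    """Return the "{<octet count>}" CRLF prefix that introduces a literal."""
    return b"{%d}\r\n" % len(data)


def login_command(tag, user, password):
    """Build a LOGIN command line with both arguments quoted."""
    return "%s LOGIN %s %s\r\n" % (tag, imap_quote(user), imap_quote(password))
```

For example, `login_command("a001", "SMITH", "SESAME")` produces the same command as the LOGIN example in this section, except that both arguments are sent as quoted strings instead of atoms.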
+ + + + + + +Crispin Standards Track [Page 30] + +RFC 3501 IMAPv4 March 2003 + + + A server MAY include a CAPABILITY response code in the tagged OK + response to a successful LOGIN command in order to send + capabilities automatically. It is unnecessary for a client to + send a separate CAPABILITY command if it recognizes these + automatic capabilities. + + Example: C: a001 LOGIN SMITH SESAME + S: a001 OK LOGIN completed + + Note: Use of the LOGIN command over an insecure network + (such as the Internet) is a security risk, because anyone + monitoring network traffic can obtain plaintext passwords. + The LOGIN command SHOULD NOT be used except as a last + resort, and it is recommended that client implementations + have a means to disable any automatic use of the LOGIN + command. + + Unless either the STARTTLS command has been negotiated or + some other mechanism that protects the session from + password snooping has been provided, a server + implementation MUST implement a configuration in which it + advertises the LOGINDISABLED capability and does NOT permit + the LOGIN command. Server sites SHOULD NOT use any + configuration which permits the LOGIN command without such + a protection mechanism against password snooping. A client + implementation MUST NOT send a LOGIN command if the + LOGINDISABLED capability is advertised. + +6.3. Client Commands - Authenticated State + + In the authenticated state, commands that manipulate mailboxes as + atomic entities are permitted. Of these commands, the SELECT and + EXAMINE commands will select a mailbox for access and enter the + selected state. + + In addition to the universal commands (CAPABILITY, NOOP, and LOGOUT), + the following commands are valid in the authenticated state: SELECT, + EXAMINE, CREATE, DELETE, RENAME, SUBSCRIBE, UNSUBSCRIBE, LIST, LSUB, + STATUS, and APPEND. + + + + + + + + + + + + +Crispin Standards Track [Page 31] + +RFC 3501 IMAPv4 March 2003 + + +6.3.1. 
SELECT Command + + Arguments: mailbox name + + Responses: REQUIRED untagged responses: FLAGS, EXISTS, RECENT + REQUIRED OK untagged responses: UNSEEN, PERMANENTFLAGS, + UIDNEXT, UIDVALIDITY + + Result: OK - select completed, now in selected state + NO - select failure, now in authenticated state: no + such mailbox, can't access mailbox + BAD - command unknown or arguments invalid + + The SELECT command selects a mailbox so that messages in the + mailbox can be accessed. Before returning an OK to the client, + the server MUST send the following untagged data to the client. + Note that earlier versions of this protocol only required the + FLAGS, EXISTS, and RECENT untagged data; consequently, client + implementations SHOULD implement default behavior for missing data + as discussed with the individual item. + + FLAGS Defined flags in the mailbox. See the description + of the FLAGS response for more detail. + + <n> EXISTS The number of messages in the mailbox. See the + description of the EXISTS response for more detail. + + <n> RECENT The number of messages with the \Recent flag set. + See the description of the RECENT response for more + detail. + + OK [UNSEEN <n>] + The message sequence number of the first unseen + message in the mailbox. If this is missing, the + client can not make any assumptions about the first + unseen message in the mailbox, and needs to issue a + SEARCH command if it wants to find it. + + OK [PERMANENTFLAGS (<list of flags>)] + A list of message flags that the client can change + permanently. If this is missing, the client should + assume that all flags can be changed permanently. + + OK [UIDNEXT <n>] + The next unique identifier value. Refer to section + 2.3.1.1 for more information. If this is missing, + the client can not make any assumptions about the + next unique identifier value. + + + +Crispin Standards Track [Page 32] + +RFC 3501 IMAPv4 March 2003 + + + OK [UIDVALIDITY <n>] + The unique identifier validity value. 
Refer to + section 2.3.1.1 for more information. If this is + missing, the server does not support unique + identifiers. + + Only one mailbox can be selected at a time in a connection; + simultaneous access to multiple mailboxes requires multiple + connections. The SELECT command automatically deselects any + currently selected mailbox before attempting the new selection. + Consequently, if a mailbox is selected and a SELECT command that + fails is attempted, no mailbox is selected. + + If the client is permitted to modify the mailbox, the server + SHOULD prefix the text of the tagged OK response with the + "[READ-WRITE]" response code. + + If the client is not permitted to modify the mailbox but is + permitted read access, the mailbox is selected as read-only, and + the server MUST prefix the text of the tagged OK response to + SELECT with the "[READ-ONLY]" response code. Read-only access + through SELECT differs from the EXAMINE command in that certain + read-only mailboxes MAY permit the change of permanent state on a + per-user (as opposed to global) basis. Netnews messages marked in + a server-based .newsrc file are an example of such per-user + permanent state that can be modified with read-only mailboxes. + + Example: C: A142 SELECT INBOX + S: * 172 EXISTS + S: * 1 RECENT + S: * OK [UNSEEN 12] Message 12 is first unseen + S: * OK [UIDVALIDITY 3857529045] UIDs valid + S: * OK [UIDNEXT 4392] Predicted next UID + S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft) + S: * OK [PERMANENTFLAGS (\Deleted \Seen \*)] Limited + S: A142 OK [READ-WRITE] SELECT completed + + + + + + + + + + + + + + + +Crispin Standards Track [Page 33] + +RFC 3501 IMAPv4 March 2003 + + +6.3.2. 
EXAMINE Command + + Arguments: mailbox name + + Responses: REQUIRED untagged responses: FLAGS, EXISTS, RECENT + REQUIRED OK untagged responses: UNSEEN, PERMANENTFLAGS, + UIDNEXT, UIDVALIDITY + + Result: OK - examine completed, now in selected state + NO - examine failure, now in authenticated state: no + such mailbox, can't access mailbox + BAD - command unknown or arguments invalid + + The EXAMINE command is identical to SELECT and returns the same + output; however, the selected mailbox is identified as read-only. + No changes to the permanent state of the mailbox, including + per-user state, are permitted; in particular, EXAMINE MUST NOT + cause messages to lose the \Recent flag. + + The text of the tagged OK response to the EXAMINE command MUST + begin with the "[READ-ONLY]" response code. + + Example: C: A932 EXAMINE blurdybloop + S: * 17 EXISTS + S: * 2 RECENT + S: * OK [UNSEEN 8] Message 8 is first unseen + S: * OK [UIDVALIDITY 3857529045] UIDs valid + S: * OK [UIDNEXT 4392] Predicted next UID + S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft) + S: * OK [PERMANENTFLAGS ()] No permanent flags permitted + S: A932 OK [READ-ONLY] EXAMINE completed + + +6.3.3. CREATE Command + + Arguments: mailbox name + + Responses: no specific responses for this command + + Result: OK - create completed + NO - create failure: can't create mailbox with that name + BAD - command unknown or arguments invalid + + The CREATE command creates a mailbox with the given name. An OK + response is returned only if a new mailbox with that name has been + created. It is an error to attempt to create INBOX or a mailbox + with a name that refers to an extant mailbox. Any error in + creation will return a tagged NO response. 
+ + + +Crispin Standards Track [Page 34] + +RFC 3501 IMAPv4 March 2003 + + + If the mailbox name is suffixed with the server's hierarchy + separator character (as returned from the server by a LIST + command), this is a declaration that the client intends to create + mailbox names under this name in the hierarchy. Server + implementations that do not require this declaration MUST ignore + the declaration. In any case, the name created is without the + trailing hierarchy delimiter. + + If the server's hierarchy separator character appears elsewhere in + the name, the server SHOULD create any superior hierarchical names + that are needed for the CREATE command to be successfully + completed. In other words, an attempt to create "foo/bar/zap" on + a server in which "/" is the hierarchy separator character SHOULD + create foo/ and foo/bar/ if they do not already exist. + + If a new mailbox is created with the same name as a mailbox which + was deleted, its unique identifiers MUST be greater than any + unique identifiers used in the previous incarnation of the mailbox + UNLESS the new incarnation has a different unique identifier + validity value. See the description of the UID command for more + detail. + + Example: C: A003 CREATE owatagusiam/ + S: A003 OK CREATE completed + C: A004 CREATE owatagusiam/blurdybloop + S: A004 OK CREATE completed + + Note: The interpretation of this example depends on whether + "/" was returned as the hierarchy separator from LIST. If + "/" is the hierarchy separator, a new level of hierarchy + named "owatagusiam" with a member called "blurdybloop" is + created. Otherwise, two mailboxes at the same hierarchy + level are created. + + +6.3.4. 
DELETE Command + + Arguments: mailbox name + + Responses: no specific responses for this command + + Result: OK - delete completed + NO - delete failure: can't delete mailbox with that name + BAD - command unknown or arguments invalid + + + + + + + +Crispin Standards Track [Page 35] + +RFC 3501 IMAPv4 March 2003 + + + The DELETE command permanently removes the mailbox with the given + name. A tagged OK response is returned only if the mailbox has + been deleted. It is an error to attempt to delete INBOX or a + mailbox name that does not exist. + + The DELETE command MUST NOT remove inferior hierarchical names. + For example, if a mailbox "foo" has an inferior "foo.bar" + (assuming "." is the hierarchy delimiter character), removing + "foo" MUST NOT remove "foo.bar". It is an error to attempt to + delete a name that has inferior hierarchical names and also has + the \Noselect mailbox name attribute (see the description of the + LIST response for more details). + + It is permitted to delete a name that has inferior hierarchical + names and does not have the \Noselect mailbox name attribute. In + this case, all messages in that mailbox are removed, and the name + will acquire the \Noselect mailbox name attribute. + + The value of the highest-used unique identifier of the deleted + mailbox MUST be preserved so that a new mailbox created with the + same name will not reuse the identifiers of the former + incarnation, UNLESS the new incarnation has a different unique + identifier validity value. See the description of the UID command + for more detail. 
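The DELETE rules above can be modeled in a few lines. This is only a sketch under stated assumptions: mailboxes are held in a hypothetical dict mapping each name to its set of name attributes, message storage is not modeled, and the function name is illustrative.

```python
NOSELECT = "\\Noselect"

def delete_mailbox(mailboxes, name, delimiter="/"):
    """Apply DELETE to `mailboxes` (name -> set of attributes).

    Returns "OK" or "NO"; `mailboxes` is changed only on "OK".
    """
    # Deleting INBOX or a name that does not exist is an error.
    if name.upper() == "INBOX" or name not in mailboxes:
        return "NO"
    has_inferiors = any(other.startswith(name + delimiter) for other in mailboxes)
    if has_inferiors:
        if NOSELECT in mailboxes[name]:
            # A \Noselect name with inferior names cannot be deleted.
            return "NO"
        # Messages would be removed here; the name stays and becomes \Noselect.
        mailboxes[name] = mailboxes[name] | {NOSELECT}
        return "OK"
    # No inferior names: the name itself goes away. Inferiors are never removed.
    del mailboxes[name]
    return "OK"
```

Starting from blurdybloop, a \Noselect foo, and foo/bar, deleting foo is refused until foo/bar has been deleted.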
+ + Examples: C: A682 LIST "" * + S: * LIST () "/" blurdybloop + S: * LIST (\Noselect) "/" foo + S: * LIST () "/" foo/bar + S: A682 OK LIST completed + C: A683 DELETE blurdybloop + S: A683 OK DELETE completed + C: A684 DELETE foo + S: A684 NO Name "foo" has inferior hierarchical names + C: A685 DELETE foo/bar + S: A685 OK DELETE Completed + C: A686 LIST "" * + S: * LIST (\Noselect) "/" foo + S: A686 OK LIST completed + C: A687 DELETE foo + S: A687 OK DELETE Completed + + C: A82 LIST "" * + S: * LIST () "." blurdybloop + S: * LIST () "." foo + S: * LIST () "." foo.bar + S: A82 OK LIST completed + C: A83 DELETE blurdybloop + S: A83 OK DELETE completed + C: A84 DELETE foo + S: A84 OK DELETE Completed + C: A85 LIST "" * + S: * LIST () "." foo.bar + S: A85 OK LIST completed + C: A86 LIST "" % + S: * LIST (\Noselect) "." foo + S: A86 OK LIST completed + + +6.3.5. RENAME Command + + Arguments: existing mailbox name + new mailbox name + + Responses: no specific responses for this command + + Result: OK - rename completed + NO - rename failure: can't rename mailbox with that name, + can't rename to mailbox with that name + BAD - command unknown or arguments invalid + + The RENAME command changes the name of a mailbox. A tagged OK + response is returned only if the mailbox has been renamed. It is + an error to attempt to rename from a mailbox name that does not + exist or to a mailbox name that already exists. Any error in + renaming will return a tagged NO response. + + If the name has inferior hierarchical names, then the inferior + hierarchical names MUST also be renamed. For example, a rename of + "foo" to "zap" will rename "foo/bar" (assuming "/" is the + hierarchy delimiter character) to "zap/bar". 
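The renaming of inferior hierarchical names can be sketched as below. The function name and the use of a plain set of names are assumptions made for illustration; the special behavior of renaming INBOX is deliberately not modeled.

```python
def rename_mailbox(names, old, new, delimiter="/"):
    """Return the set of mailbox names after RENAME old new.

    Returns None where a server would answer with a tagged NO
    (source name missing, or destination name already present).
    """
    if old not in names or new in names:
        return None
    prefix = old + delimiter
    result = set()
    for name in names:
        if name == old:
            result.add(new)
        elif name.startswith(prefix):
            # Inferior names keep their tail under the new name.
            result.add(new + delimiter + name[len(prefix):])
        else:
            # Unrelated names, including ones like "food", are untouched.
            result.add(name)
    return result
```

Matching on the name plus the delimiter, rather than on the bare name, keeps a mailbox such as "food" from being treated as an inferior of "foo".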
+ + If the server's hierarchy separator character appears in the name, + the server SHOULD create any superior hierarchical names that are + needed for the RENAME command to complete successfully. In other + words, an attempt to rename "foo/bar/zap" to baz/rag/zowie on a + server in which "/" is the hierarchy separator character SHOULD + create baz/ and baz/rag/ if they do not already exist. + + + + + +Crispin Standards Track [Page 37] + +RFC 3501 IMAPv4 March 2003 + + + The value of the highest-used unique identifier of the old mailbox + name MUST be preserved so that a new mailbox created with the same + name will not reuse the identifiers of the former incarnation, + UNLESS the new incarnation has a different unique identifier + validity value. See the description of the UID command for more + detail. + + Renaming INBOX is permitted, and has special behavior. It moves + all messages in INBOX to a new mailbox with the given name, + leaving INBOX empty. If the server implementation supports + inferior hierarchical names of INBOX, these are unaffected by a + rename of INBOX. + + Examples: C: A682 LIST "" * + S: * LIST () "/" blurdybloop + S: * LIST (\Noselect) "/" foo + S: * LIST () "/" foo/bar + S: A682 OK LIST completed + C: A683 RENAME blurdybloop sarasoop + S: A683 OK RENAME completed + C: A684 RENAME foo zowie + S: A684 OK RENAME Completed + C: A685 LIST "" * + S: * LIST () "/" sarasoop + S: * LIST (\Noselect) "/" zowie + S: * LIST () "/" zowie/bar + S: A685 OK LIST completed + + C: Z432 LIST "" * + S: * LIST () "." INBOX + S: * LIST () "." INBOX.bar + S: Z432 OK LIST completed + C: Z433 RENAME INBOX old-mail + S: Z433 OK RENAME completed + C: Z434 LIST "" * + S: * LIST () "." INBOX + S: * LIST () "." INBOX.bar + S: * LIST () "." old-mail + S: Z434 OK LIST completed + + + + + + + + + + + + +Crispin Standards Track [Page 38] + +RFC 3501 IMAPv4 March 2003 + + +6.3.6. 
SUBSCRIBE Command + + Arguments: mailbox + + Responses: no specific responses for this command + + Result: OK - subscribe completed + NO - subscribe failure: can't subscribe to that name + BAD - command unknown or arguments invalid + + The SUBSCRIBE command adds the specified mailbox name to the + server's set of "active" or "subscribed" mailboxes as returned by + the LSUB command. This command returns a tagged OK response only + if the subscription is successful. + + A server MAY validate the mailbox argument to SUBSCRIBE to verify + that it exists. However, it MUST NOT unilaterally remove an + existing mailbox name from the subscription list even if a mailbox + by that name no longer exists. + + Note: This requirement is because a server site can + choose to routinely remove a mailbox with a well-known + name (e.g., "system-alerts") after its contents expire, + with the intention of recreating it when new contents + are appropriate. + + + Example: C: A002 SUBSCRIBE #news.comp.mail.mime + S: A002 OK SUBSCRIBE completed + + +6.3.7. UNSUBSCRIBE Command + + Arguments: mailbox name + + Responses: no specific responses for this command + + Result: OK - unsubscribe completed + NO - unsubscribe failure: can't unsubscribe that name + BAD - command unknown or arguments invalid + + The UNSUBSCRIBE command removes the specified mailbox name from + the server's set of "active" or "subscribed" mailboxes as returned + by the LSUB command. This command returns a tagged OK response + only if the unsubscription is successful. + + Example: C: A002 UNSUBSCRIBE #news.comp.mail.mime + S: A002 OK UNSUBSCRIBE completed + + + +Crispin Standards Track [Page 39] + +RFC 3501 IMAPv4 March 2003 + + +6.3.8. 
LIST Command + + Arguments: reference name + mailbox name with possible wildcards + + Responses: untagged responses: LIST + + Result: OK - list completed + NO - list failure: can't list that reference or name + BAD - command unknown or arguments invalid + + The LIST command returns a subset of names from the complete set + of all names available to the client. Zero or more untagged LIST + replies are returned, containing the name attributes, hierarchy + delimiter, and name; see the description of the LIST reply for + more detail. + + The LIST command SHOULD return its data quickly, without undue + delay. For example, it SHOULD NOT go to excess trouble to + calculate the \Marked or \Unmarked status or perform other + processing; if each name requires 1 second of processing, then a + list of 1200 names would take 20 minutes! + + An empty ("" string) reference name argument indicates that the + mailbox name is interpreted as by SELECT. The returned mailbox + names MUST match the supplied mailbox name pattern. A non-empty + reference name argument is the name of a mailbox or a level of + mailbox hierarchy, and indicates the context in which the mailbox + name is interpreted. + + An empty ("" string) mailbox name argument is a special request to + return the hierarchy delimiter and the root name of the name given + in the reference. The value returned as the root MAY be the empty + string if the reference is non-rooted or is an empty string. In + all cases, a hierarchy delimiter (or NIL if there is no hierarchy) + is returned. This permits a client to get the hierarchy delimiter + (or find out that the mailbox names are flat) even when no + mailboxes by that name currently exist. + + The reference and mailbox name arguments are interpreted into a + canonical form that represents an unambiguous left-to-right + hierarchy. The returned mailbox names will be in the interpreted + form. 
+ + + + + + + + +Crispin Standards Track [Page 40] + +RFC 3501 IMAPv4 March 2003 + + + Note: The interpretation of the reference argument is + implementation-defined. It depends upon whether the + server implementation has a concept of the "current + working directory" and leading "break out characters", + which override the current working directory. + + For example, on a server which exports a UNIX or NT + filesystem, the reference argument contains the current + working directory, and the mailbox name argument would + contain the name as interpreted in the current working + directory. + + If a server implementation has no concept of break out + characters, the canonical form is normally the reference + name appended with the mailbox name. Note that if the + server implements the namespace convention (section + 5.1.2), "#" is a break out character and must be treated + as such. + + If the reference argument is not a level of mailbox + hierarchy (that is, it is a \NoInferiors name), and/or + the reference argument does not end with the hierarchy + delimiter, it is implementation-dependent how this is + interpreted. For example, a reference of "foo/bar" and + mailbox name of "rag/baz" could be interpreted as + "foo/bar/rag/baz", "foo/barrag/baz", or "foo/rag/baz". + A client SHOULD NOT use such a reference argument except + at the explicit request of the user. A hierarchical + browser MUST NOT make any assumptions about server + interpretation of the reference unless the reference is + a level of mailbox hierarchy AND ends with the hierarchy + delimiter. + + Any part of the reference argument that is included in the + interpreted form SHOULD prefix the interpreted form. It SHOULD + also be in the same form as the reference name argument. This + rule permits the client to determine if the returned mailbox name + is in the context of the reference argument, or if something about + the mailbox argument overrode the reference argument. 
Without + this rule, the client would have to have knowledge of the server's + naming semantics including what characters are "breakouts" that + override a naming context. + + + + + + + + + +Crispin Standards Track [Page 41] + +RFC 3501 IMAPv4 March 2003 + + + For example, here are some examples of how references + and mailbox names might be interpreted on a UNIX-based + server: + + Reference Mailbox Name Interpretation + ------------ ------------ -------------- + ~smith/Mail/ foo.* ~smith/Mail/foo.* + archive/ % archive/% + #news. comp.mail.* #news.comp.mail.* + ~smith/Mail/ /usr/doc/foo /usr/doc/foo + archive/ ~fred/Mail/* ~fred/Mail/* + + The first three examples demonstrate interpretations in + the context of the reference argument. Note that + "~smith/Mail" SHOULD NOT be transformed into something + like "/u2/users/smith/Mail", or it would be impossible + for the client to determine that the interpretation was + in the context of the reference. + + The character "*" is a wildcard, and matches zero or more + characters at this position. The character "%" is similar to "*", + but it does not match a hierarchy delimiter. If the "%" wildcard + is the last character of a mailbox name argument, matching levels + of hierarchy are also returned. If these levels of hierarchy are + not also selectable mailboxes, they are returned with the + \Noselect mailbox name attribute (see the description of the LIST + response for more details). + + Server implementations are permitted to "hide" otherwise + accessible mailboxes from the wildcard characters, by preventing + certain characters or names from matching a wildcard in certain + situations. For example, a UNIX-based server might restrict the + interpretation of "*" so that an initial "/" character does not + match. 
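The two wildcards can be illustrated by translating a pattern into a regular expression. This is a sketch only: the helper names are hypothetical, it works on an already interpreted pattern, and it does not model servers that hide names from wildcards or the reporting of \Noselect hierarchy levels.

```python
import re

def list_pattern_to_regex(pattern, delimiter="/"):
    """Translate a LIST mailbox pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            # "*" matches zero or more characters, delimiters included.
            parts.append(".*")
        elif ch == "%":
            if delimiter is None:
                # Flat namespace: "%" behaves like "*".
                parts.append(".*")
            else:
                # "%" matches zero or more characters except the delimiter.
                parts.append("[^" + re.escape(delimiter) + "]*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def list_matches(pattern, name, delimiter="/"):
    """Return True if `name` matches the whole of `pattern`."""
    return list_pattern_to_regex(pattern, delimiter).fullmatch(name) is not None
```

With "/" as the delimiter, the pattern "~/Mail/%" matches "~/Mail/foo" but not "~/Mail/foo/bar", while "~/Mail/*" matches both.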
+ + The special name INBOX is included in the output from LIST, if + INBOX is supported by this server for this user and if the + uppercase string "INBOX" matches the interpreted reference and + mailbox name arguments with wildcards as described above. The + criteria for omitting INBOX is whether SELECT INBOX will return + failure; it is not relevant whether the user's real INBOX resides + on this or some other server. + + Example: C: A101 LIST "" "" + S: * LIST (\Noselect) "/" "" + S: A101 OK LIST Completed + C: A102 LIST #news.comp.mail.misc "" + S: * LIST (\Noselect) "." #news. + S: A102 OK LIST Completed + C: A103 LIST /usr/staff/jones "" + S: * LIST (\Noselect) "/" / + S: A103 OK LIST Completed + C: A202 LIST ~/Mail/ % + S: * LIST (\Noselect) "/" ~/Mail/foo + S: * LIST () "/" ~/Mail/meetings + S: A202 OK LIST completed + + +6.3.9. LSUB Command + + Arguments: reference name + mailbox name with possible wildcards + + Responses: untagged responses: LSUB + + Result: OK - lsub completed + NO - lsub failure: can't list that reference or name + BAD - command unknown or arguments invalid + + The LSUB command returns a subset of names from the set of names + that the user has declared as being "active" or "subscribed". + Zero or more untagged LSUB replies are returned. The arguments to + LSUB are in the same form as those for LIST. + + The returned untagged LSUB response MAY contain different mailbox + flags from a LIST untagged response. If this should happen, the + flags in the untagged LIST are considered more authoritative. + + A special situation occurs when using LSUB with the % wildcard. + Consider what happens if "foo/bar" (with a hierarchy delimiter of + "/") is subscribed but "foo" is not. A "%" wildcard to LSUB must + return foo, not foo/bar, in the LSUB response, and it MUST be + flagged with the \Noselect attribute. 
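The "%" case for LSUB can be sketched as follows. This is a simplified model, not the specified algorithm: it handles only a pattern consisting of a literal prefix followed by a single trailing "%", and the function name and the dict it returns are assumptions made for illustration.

```python
def lsub_percent(subscribed, prefix="", delimiter="/"):
    """Names reported by LSUB for the pattern `prefix` + "%".

    Returns a dict mapping each reported name to its set of attributes.
    """
    result = {}
    for name in sorted(subscribed):
        if not name.startswith(prefix) or name == prefix:
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        level = prefix + head
        if not sep:
            # The subscribed name sits directly at this level.
            result[level] = set()
        elif level not in subscribed:
            # Only a deeper name is subscribed: report the level itself,
            # flagged \Noselect, rather than the deeper name.
            result.setdefault(level, {"\\Noselect"})
    return result
```

With only "foo/bar" subscribed, the result contains "foo" flagged \Noselect and does not contain "foo/bar".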
+ + The server MUST NOT unilaterally remove an existing mailbox name + from the subscription list even if a mailbox by that name no + longer exists. + + + + + + + +Crispin Standards Track [Page 43] + +RFC 3501 IMAPv4 March 2003 + + + Example: C: A002 LSUB "#news." "comp.mail.*" + S: * LSUB () "." #news.comp.mail.mime + S: * LSUB () "." #news.comp.mail.misc + S: A002 OK LSUB completed + C: A003 LSUB "#news." "comp.%" + S: * LSUB (\NoSelect) "." #news.comp.mail + S: A003 OK LSUB completed + + +6.3.10. STATUS Command + + Arguments: mailbox name + status data item names + + Responses: untagged responses: STATUS + + Result: OK - status completed + NO - status failure: no status for that name + BAD - command unknown or arguments invalid + + The STATUS command requests the status of the indicated mailbox. + It does not change the currently selected mailbox, nor does it + affect the state of any messages in the queried mailbox (in + particular, STATUS MUST NOT cause messages to lose the \Recent + flag). + + The STATUS command provides an alternative to opening a second + IMAP4rev1 connection and doing an EXAMINE command on a mailbox to + query that mailbox's status without deselecting the current + mailbox in the first IMAP4rev1 connection. + + Unlike the LIST command, the STATUS command is not guaranteed to + be fast in its response. Under certain circumstances, it can be + quite slow. In some implementations, the server is obliged to + open the mailbox read-only internally to obtain certain status + information. Also unlike the LIST command, the STATUS command + does not accept wildcards. + + Note: The STATUS command is intended to access the + status of mailboxes other than the currently selected + mailbox. Because the STATUS command can cause the + mailbox to be opened internally, and because this + information is available by other means on the selected + mailbox, the STATUS command SHOULD NOT be used on the + currently selected mailbox. 
+ + + + + + +Crispin Standards Track [Page 44] + +RFC 3501 IMAPv4 March 2003 + + + The STATUS command MUST NOT be used as a "check for new + messages in the selected mailbox" operation (refer to + sections 7, 7.3.1, and 7.3.2 for more information about + the proper method for new message checking). + + Because the STATUS command is not guaranteed to be fast + in its results, clients SHOULD NOT expect to be able to + issue many consecutive STATUS commands and obtain + reasonable performance. + + The currently defined status data items that can be requested are: + + MESSAGES + The number of messages in the mailbox. + + RECENT + The number of messages with the \Recent flag set. + + UIDNEXT + The next unique identifier value of the mailbox. Refer to + section 2.3.1.1 for more information. + + UIDVALIDITY + The unique identifier validity value of the mailbox. Refer to + section 2.3.1.1 for more information. + + UNSEEN + The number of messages which do not have the \Seen flag set. + + + Example: C: A042 STATUS blurdybloop (UIDNEXT MESSAGES) + S: * STATUS blurdybloop (MESSAGES 231 UIDNEXT 44292) + S: A042 OK STATUS completed + + + + + + + + + + + + + + + + + + +Crispin Standards Track [Page 45] + +RFC 3501 IMAPv4 March 2003 + + +6.3.11. APPEND Command + + Arguments: mailbox name + OPTIONAL flag parenthesized list + OPTIONAL date/time string + message literal + + Responses: no specific responses for this command + + Result: OK - append completed + NO - append error: can't append to that mailbox, error + in flags or date/time or message text + BAD - command unknown or arguments invalid + + The APPEND command appends the literal argument as a new message + to the end of the specified destination mailbox. This argument + SHOULD be in the format of an [RFC-2822] message. 8-bit + characters are permitted in the message. 
A server implementation + that is unable to preserve 8-bit data properly MUST be able to + reversibly convert 8-bit APPEND data to 7-bit using a [MIME-IMB] + content transfer encoding. + + Note: There MAY be exceptions, e.g., draft messages, in + which required [RFC-2822] header lines are omitted in + the message literal argument to APPEND. The full + implications of doing so MUST be understood and + carefully weighed. + + If a flag parenthesized list is specified, the flags SHOULD be set + in the resulting message; otherwise, the flag list of the + resulting message is set to empty by default. In either case, the + Recent flag is also set. + + If a date-time is specified, the internal date SHOULD be set in + the resulting message; otherwise, the internal date of the + resulting message is set to the current date and time by default. + + If the append is unsuccessful for any reason, the mailbox MUST be + restored to its state before the APPEND attempt; no partial + appending is permitted. + + If the destination mailbox does not exist, a server MUST return an + error, and MUST NOT automatically create the mailbox. Unless it + is certain that the destination mailbox can not be created, the + server MUST send the response code "[TRYCREATE]" as the prefix of + the text of the tagged NO response. This gives a hint to the + client that it can attempt a CREATE command and retry the APPEND + if the CREATE is successful. + + + +Crispin Standards Track [Page 46] + +RFC 3501 IMAPv4 March 2003 + + + If the mailbox is currently selected, the normal new message + actions SHOULD occur. Specifically, the server SHOULD notify the + client immediately via an untagged EXISTS response. If the server + does not do so, the client MAY issue a NOOP command (or failing + that, a CHECK command) after one or more APPEND commands. 
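On the client side, an APPEND needs the octet count of the message literal, and a failed APPEND may carry the TRYCREATE hint. The helpers below are a sketch with hypothetical names; they assume the mailbox name and flags need no quoting, and that the client waits for the server's "+" continuation request before sending the literal.

```python
def build_append(tag, mailbox, message, flags=()):
    """Return (command_line, literal) as bytes for an APPEND exchange.

    `message` is the message text as bytes; line endings are normalized
    to CRLF before the octets are counted.
    """
    literal = message.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    parts = [tag, "APPEND", mailbox]
    if flags:
        parts.append("(" + " ".join(flags) + ")")
    # The number in braces is the size of the literal in octets.
    parts.append("{%d}" % len(literal))
    return (" ".join(parts) + "\r\n").encode("ascii"), literal

def suggests_create(tagged_no_text):
    """True if the text of a tagged NO starts with the [TRYCREATE] code."""
    return tagged_no_text.lstrip().upper().startswith("[TRYCREATE]")
```

If `suggests_create` returns True, the client can issue a CREATE for the destination mailbox and retry the APPEND once the CREATE succeeds.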
+ + Example: C: A003 APPEND saved-messages (\Seen) {310} + S: + Ready for literal data + C: Date: Mon, 7 Feb 1994 21:52:25 -0800 (PST) + C: From: Fred Foobar <foobar@Blurdybloop.COM> + C: Subject: afternoon meeting + C: To: mooch@owatagu.siam.edu + C: Message-Id: <B27397-0100000@Blurdybloop.COM> + C: MIME-Version: 1.0 + C: Content-Type: TEXT/PLAIN; CHARSET=US-ASCII + C: + C: Hello Joe, do you think we can meet at 3:30 tomorrow? + C: + S: A003 OK APPEND completed + + Note: The APPEND command is not used for message delivery, + because it does not provide a mechanism to transfer [SMTP] + envelope information. + +6.4. Client Commands - Selected State + + In the selected state, commands that manipulate messages in a mailbox + are permitted. + + In addition to the universal commands (CAPABILITY, NOOP, and LOGOUT), + and the authenticated state commands (SELECT, EXAMINE, CREATE, + DELETE, RENAME, SUBSCRIBE, UNSUBSCRIBE, LIST, LSUB, STATUS, and + APPEND), the following commands are valid in the selected state: + CHECK, CLOSE, EXPUNGE, SEARCH, FETCH, STORE, COPY, and UID. + +6.4.1. CHECK Command + + Arguments: none + + Responses: no specific responses for this command + + Result: OK - check completed + BAD - command unknown or arguments invalid + + The CHECK command requests a checkpoint of the currently selected + mailbox. A checkpoint refers to any implementation-dependent + housekeeping associated with the mailbox (e.g., resolving the + server's in-memory state of the mailbox with the state on its + disk) that is not normally executed as part of each command. A + checkpoint MAY take a non-instantaneous amount of real time to + complete. If a server implementation has no such housekeeping + considerations, CHECK is equivalent to NOOP. + + There is no guarantee that an EXISTS untagged response will happen + as a result of CHECK. NOOP, not CHECK, SHOULD be used for new + message polling. 
+ + Example: C: FXXZ CHECK + S: FXXZ OK CHECK Completed + + +6.4.2. CLOSE Command + + Arguments: none + + Responses: no specific responses for this command + + Result: OK - close completed, now in authenticated state + BAD - command unknown or arguments invalid + + The CLOSE command permanently removes all messages that have the + \Deleted flag set from the currently selected mailbox, and returns + to the authenticated state from the selected state. No untagged + EXPUNGE responses are sent. + + No messages are removed, and no error is given, if the mailbox is + selected by an EXAMINE command or is otherwise selected read-only. + + Even if a mailbox is selected, a SELECT, EXAMINE, or LOGOUT + command MAY be issued without previously issuing a CLOSE command. + The SELECT, EXAMINE, and LOGOUT commands implicitly close the + currently selected mailbox without doing an expunge. However, + when many messages are deleted, a CLOSE-LOGOUT or CLOSE-SELECT + sequence is considerably faster than an EXPUNGE-LOGOUT or + EXPUNGE-SELECT because no untagged EXPUNGE responses (which the + client would probably ignore) are sent. + + Example: C: A341 CLOSE + S: A341 OK CLOSE completed + + + + + + + + + + +Crispin Standards Track [Page 48] + +RFC 3501 IMAPv4 March 2003 + + +6.4.3. EXPUNGE Command + + Arguments: none + + Responses: untagged responses: EXPUNGE + + Result: OK - expunge completed + NO - expunge failure: can't expunge (e.g., permission + denied) + BAD - command unknown or arguments invalid + + The EXPUNGE command permanently removes all messages that have the + \Deleted flag set from the currently selected mailbox. Before + returning an OK to the client, an untagged EXPUNGE response is + sent for each message that is removed. + + Example: C: A202 EXPUNGE + S: * 3 EXPUNGE + S: * 3 EXPUNGE + S: * 5 EXPUNGE + S: * 8 EXPUNGE + S: A202 OK EXPUNGE completed + + Note: In this example, messages 3, 4, 7, and 11 had the + \Deleted flag set. 
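The numbers in the EXPUNGE example follow from renumbering: each untagged EXPUNGE removes one message and immediately lowers the sequence number of every later message by one. The sketch below assumes the server removes messages in ascending order, as in the example; other orders are possible, and the function names are hypothetical.

```python
def expunge_responses(deleted):
    """Sequence numbers carried by the untagged EXPUNGE responses.

    `deleted` holds the original sequence numbers of the messages that
    had the \\Deleted flag set; removal in ascending order is assumed.
    """
    responses = []
    for removed_so_far, seq in enumerate(sorted(deleted)):
        # Every earlier removal shifted this message down by one.
        responses.append(seq - removed_so_far)
    return responses

def apply_expunges(messages, responses):
    """Client-side view: drop one entry per EXPUNGE response, in order."""
    remaining = list(messages)
    for seq in responses:
        del remaining[seq - 1]  # sequence numbers start at 1
    return remaining
```

For the flagged messages 3, 4, 7, and 11 this yields 3, 3, 5, 8, matching the responses shown in the example.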
See the description of the EXPUNGE + response for further explanation. + + +6.4.4. SEARCH Command + + Arguments: OPTIONAL [CHARSET] specification + searching criteria (one or more) + + Responses: REQUIRED untagged response: SEARCH + + Result: OK - search completed + NO - search error: can't search that [CHARSET] or + criteria + BAD - command unknown or arguments invalid + + The SEARCH command searches the mailbox for messages that match + the given searching criteria. Searching criteria consist of one + or more search keys. The untagged SEARCH response from the server + contains a listing of message sequence numbers corresponding to + those messages that match the searching criteria. + + + + + + +Crispin Standards Track [Page 49] + +RFC 3501 IMAPv4 March 2003 + + + When multiple keys are specified, the result is the intersection + (AND function) of all the messages that match those keys. For + example, the criteria DELETED FROM "SMITH" SINCE 1-Feb-1994 refers + to all deleted messages from Smith that were placed in the mailbox + since February 1, 1994. A search key can also be a parenthesized + list of one or more search keys (e.g., for use with the OR and NOT + keys). + + Server implementations MAY exclude [MIME-IMB] body parts with + terminal content media types other than TEXT and MESSAGE from + consideration in SEARCH matching. + + The OPTIONAL [CHARSET] specification consists of the word + "CHARSET" followed by a registered [CHARSET]. It indicates the + [CHARSET] of the strings that appear in the search criteria. + [MIME-IMB] content transfer encodings, and [MIME-HDRS] strings in + [RFC-2822]/[MIME-IMB] headers, MUST be decoded before comparing + text in a [CHARSET] other than US-ASCII. US-ASCII MUST be + supported; other [CHARSET]s MAY be supported. + + If the server does not support the specified [CHARSET], it MUST + return a tagged NO response (not a BAD). 
This response SHOULD + contain the BADCHARSET response code, which MAY list the + [CHARSET]s supported by the server. + + In all search keys that use strings, a message matches the key if + the string is a substring of the field. The matching is + case-insensitive. + + The defined search keys are as follows. Refer to the Formal + Syntax section for the precise syntactic definitions of the + arguments. + + <sequence set> + Messages with message sequence numbers corresponding to the + specified message sequence number set. + + ALL + All messages in the mailbox; the default initial key for + ANDing. + + ANSWERED + Messages with the \Answered flag set. + + + + + + + + +Crispin Standards Track [Page 50] + +RFC 3501 IMAPv4 March 2003 + + + BCC <string> + Messages that contain the specified string in the envelope + structure's BCC field. + + BEFORE <date> + Messages whose internal date (disregarding time and timezone) + is earlier than the specified date. + + BODY <string> + Messages that contain the specified string in the body of the + message. + + CC <string> + Messages that contain the specified string in the envelope + structure's CC field. + + DELETED + Messages with the \Deleted flag set. + + DRAFT + Messages with the \Draft flag set. + + FLAGGED + Messages with the \Flagged flag set. + + FROM <string> + Messages that contain the specified string in the envelope + structure's FROM field. + + HEADER <field-name> <string> + Messages that have a header with the specified field-name (as + defined in [RFC-2822]) and that contains the specified string + in the text of the header (what comes after the colon). If the + string to search is zero-length, this matches all messages that + have a header line with the specified field-name regardless of + the contents. + + KEYWORD <flag> + Messages with the specified keyword flag set. + + LARGER <n> + Messages with an [RFC-2822] size larger than the specified + number of octets. 
+ + NEW + Messages that have the \Recent flag set but not the \Seen flag. + This is functionally equivalent to "(RECENT UNSEEN)". + + + + +Crispin Standards Track [Page 51] + +RFC 3501 IMAPv4 March 2003 + + + NOT <search-key> + Messages that do not match the specified search key. + + OLD + Messages that do not have the \Recent flag set. This is + functionally equivalent to "NOT RECENT" (as opposed to "NOT + NEW"). + + ON <date> + Messages whose internal date (disregarding time and timezone) + is within the specified date. + + OR <search-key1> <search-key2> + Messages that match either search key. + + RECENT + Messages that have the \Recent flag set. + + SEEN + Messages that have the \Seen flag set. + + SENTBEFORE <date> + Messages whose [RFC-2822] Date: header (disregarding time and + timezone) is earlier than the specified date. + + SENTON <date> + Messages whose [RFC-2822] Date: header (disregarding time and + timezone) is within the specified date. + + SENTSINCE <date> + Messages whose [RFC-2822] Date: header (disregarding time and + timezone) is within or later than the specified date. + + SINCE <date> + Messages whose internal date (disregarding time and timezone) + is within or later than the specified date. + + SMALLER <n> + Messages with an [RFC-2822] size smaller than the specified + number of octets. + + + + + + + + + + + +Crispin Standards Track [Page 52] + +RFC 3501 IMAPv4 March 2003 + + + SUBJECT <string> + Messages that contain the specified string in the envelope + structure's SUBJECT field. + + TEXT <string> + Messages that contain the specified string in the header or + body of the message. + + TO <string> + Messages that contain the specified string in the envelope + structure's TO field. + + UID <sequence set> + Messages with unique identifiers corresponding to the specified + unique identifier set. Sequence set ranges are permitted. + + UNANSWERED + Messages that do not have the \Answered flag set. 
+ + UNDELETED + Messages that do not have the \Deleted flag set. + + UNDRAFT + Messages that do not have the \Draft flag set. + + UNFLAGGED + Messages that do not have the \Flagged flag set. + + UNKEYWORD <flag> + Messages that do not have the specified keyword flag set. + + UNSEEN + Messages that do not have the \Seen flag set. + + + + + + + + + + + + + + + + + + +Crispin Standards Track [Page 53] + +RFC 3501 IMAPv4 March 2003 + + + Example: C: A282 SEARCH FLAGGED SINCE 1-Feb-1994 NOT FROM "Smith" + S: * SEARCH 2 84 882 + S: A282 OK SEARCH completed + C: A283 SEARCH TEXT "string not in mailbox" + S: * SEARCH + S: A283 OK SEARCH completed + C: A284 SEARCH CHARSET UTF-8 TEXT {6} + C: XXXXXX + S: * SEARCH 43 + S: A284 OK SEARCH completed + + Note: Since this document is restricted to 7-bit ASCII + text, it is not possible to show actual UTF-8 data. The + "XXXXXX" is a placeholder for what would be 6 octets of + 8-bit data in an actual transaction. + + +6.4.5. FETCH Command + + Arguments: sequence set + message data item names or macro + + Responses: untagged responses: FETCH + + Result: OK - fetch completed + NO - fetch error: can't fetch that data + BAD - command unknown or arguments invalid + + The FETCH command retrieves data associated with a message in the + mailbox. The data items to be fetched can be either a single atom + or a parenthesized list. + + Most data items, identified in the formal syntax under the + msg-att-static rule, are static and MUST NOT change for any + particular message. Other data items, identified in the formal + syntax under the msg-att-dynamic rule, MAY change, either as a + result of a STORE command or due to external events. + + For example, if a client receives an ENVELOPE for a + message when it already knows the envelope, it can + safely ignore the newly transmitted envelope. + + There are three macros which specify commonly-used sets of data + items, and can be used instead of data items. 
A macro must be + used by itself, and not in conjunction with other macros or data + items. + + + + + +Crispin Standards Track [Page 54] + +RFC 3501 IMAPv4 March 2003 + + + ALL + Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE ENVELOPE) + + FAST + Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE) + + FULL + Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE ENVELOPE + BODY) + + The currently defined data items that can be fetched are: + + BODY + Non-extensible form of BODYSTRUCTURE. + + BODY[<section>]<<partial>> + The text of a particular body section. The section + specification is a set of zero or more part specifiers + delimited by periods. A part specifier is either a part number + or one of the following: HEADER, HEADER.FIELDS, + HEADER.FIELDS.NOT, MIME, and TEXT. An empty section + specification refers to the entire message, including the + header. + + Every message has at least one part number. Non-[MIME-IMB] + messages, and non-multipart [MIME-IMB] messages with no + encapsulated message, only have a part 1. + + Multipart messages are assigned consecutive part numbers, as + they occur in the message. If a particular part is of type + message or multipart, its parts MUST be indicated by a period + followed by the part number within that nested multipart part. + + A part of type MESSAGE/RFC822 also has nested part numbers, + referring to parts of the MESSAGE part's body. + + The HEADER, HEADER.FIELDS, HEADER.FIELDS.NOT, and TEXT part + specifiers can be the sole part specifier or can be prefixed by + one or more numeric part specifiers, provided that the numeric + part specifier refers to a part of type MESSAGE/RFC822. The + MIME part specifier MUST be prefixed by one or more numeric + part specifiers. + + The HEADER, HEADER.FIELDS, and HEADER.FIELDS.NOT part + specifiers refer to the [RFC-2822] header of the message or of + an encapsulated [MIME-IMT] MESSAGE/RFC822 message. 
+ HEADER.FIELDS and HEADER.FIELDS.NOT are followed by a list of + field-name (as defined in [RFC-2822]) names, and return a + + + +Crispin Standards Track [Page 55] + +RFC 3501 IMAPv4 March 2003 + + + subset of the header. The subset returned by HEADER.FIELDS + contains only those header fields with a field-name that + matches one of the names in the list; similarly, the subset + returned by HEADER.FIELDS.NOT contains only the header fields + with a non-matching field-name. The field-matching is + case-insensitive but otherwise exact. Subsetting does not + exclude the [RFC-2822] delimiting blank line between the header + and the body; the blank line is included in all header fetches, + except in the case of a message which has no body and no blank + line. + + The MIME part specifier refers to the [MIME-IMB] header for + this part. + + The TEXT part specifier refers to the text body of the message, + omitting the [RFC-2822] header. + + Here is an example of a complex message with some of its + part specifiers: + + HEADER ([RFC-2822] header of the message) + TEXT ([RFC-2822] text body of the message) MULTIPART/MIXED + 1 TEXT/PLAIN + 2 APPLICATION/OCTET-STREAM + 3 MESSAGE/RFC822 + 3.HEADER ([RFC-2822] header of the message) + 3.TEXT ([RFC-2822] text body of the message) MULTIPART/MIXED + 3.1 TEXT/PLAIN + 3.2 APPLICATION/OCTET-STREAM + 4 MULTIPART/MIXED + 4.1 IMAGE/GIF + 4.1.MIME ([MIME-IMB] header for the IMAGE/GIF) + 4.2 MESSAGE/RFC822 + 4.2.HEADER ([RFC-2822] header of the message) + 4.2.TEXT ([RFC-2822] text body of the message) MULTIPART/MIXED + 4.2.1 TEXT/PLAIN + 4.2.2 MULTIPART/ALTERNATIVE + 4.2.2.1 TEXT/PLAIN + 4.2.2.2 TEXT/RICHTEXT + + + It is possible to fetch a substring of the designated text. + This is done by appending an open angle bracket ("<"), the + octet position of the first desired octet, a period, the + maximum number of octets desired, and a close angle bracket + (">") to the part specifier. 
If the starting octet is beyond + the end of the text, an empty string is returned. + + + + +Crispin Standards Track [Page 56] + +RFC 3501 IMAPv4 March 2003 + + + Any partial fetch that attempts to read beyond the end of the + text is truncated as appropriate. A partial fetch that starts + at octet 0 is returned as a partial fetch, even if this + truncation happened. + + Note: This means that BODY[]<0.2048> of a 1500-octet message + will return BODY[]<0> with a literal of size 1500, not + BODY[]. + + Note: A substring fetch of a HEADER.FIELDS or + HEADER.FIELDS.NOT part specifier is calculated after + subsetting the header. + + The \Seen flag is implicitly set; if this causes the flags to + change, they SHOULD be included as part of the FETCH responses. + + BODY.PEEK[<section>]<<partial>> + An alternate form of BODY[<section>] that does not implicitly + set the \Seen flag. + + BODYSTRUCTURE + The [MIME-IMB] body structure of the message. This is computed + by the server by parsing the [MIME-IMB] header fields in the + [RFC-2822] header and [MIME-IMB] headers. + + ENVELOPE + The envelope structure of the message. This is computed by the + server by parsing the [RFC-2822] header into the component + parts, defaulting various fields as necessary. + + FLAGS + The flags that are set for this message. + + INTERNALDATE + The internal date of the message. + + RFC822 + Functionally equivalent to BODY[], differing in the syntax of + the resulting untagged FETCH data (RFC822 is returned). + + RFC822.HEADER + Functionally equivalent to BODY.PEEK[HEADER], differing in the + syntax of the resulting untagged FETCH data (RFC822.HEADER is + returned). + + RFC822.SIZE + The [RFC-2822] size of the message. + + + + +Crispin Standards Track [Page 57] + +RFC 3501 IMAPv4 March 2003 + + + RFC822.TEXT + Functionally equivalent to BODY[TEXT], differing in the syntax + of the resulting untagged FETCH data (RFC822.TEXT is returned). + + UID + The unique identifier for the message. 
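
The partial-fetch rule described under BODY[<section>]<<partial>> above can be sketched as follows. This is a minimal illustration, not part of the specification: the function name `partial_fetch` and its return shape are hypothetical, and the sketch assumes the section text is already available as a byte string.

```python
from typing import Tuple


def partial_fetch(text: bytes, start: int, count: int) -> Tuple[str, bytes]:
    """Model a BODY[...]<start.count> request against `text`.

    Returns the origin suffix that tags the reply (e.g. "<0>") and the
    octets that would be sent as the literal.
    """
    # Slicing truncates a read that runs past the end of the text and
    # yields an empty string when the start is beyond the end.
    chunk = text[start:start + count]
    # The reply stays a partial fetch (it carries <start>) even when
    # truncation means the whole text was returned.
    return "<%d>" % start, chunk


message = b"x" * 1500
origin, literal = partial_fetch(message, 0, 2048)
print(origin, len(literal))
```

For the 1500-octet message in the Note above, the sketch yields the origin `<0>` with a 1500-octet literal, rather than an untagged BODY[].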
+ + + Example: C: A654 FETCH 2:4 (FLAGS BODY[HEADER.FIELDS (DATE FROM)]) + S: * 2 FETCH .... + S: * 3 FETCH .... + S: * 4 FETCH .... + S: A654 OK FETCH completed + + +6.4.6. STORE Command + + Arguments: sequence set + message data item name + value for message data item + + Responses: untagged responses: FETCH + + Result: OK - store completed + NO - store error: can't store that data + BAD - command unknown or arguments invalid + + The STORE command alters data associated with a message in the + mailbox. Normally, STORE will return the updated value of the + data with an untagged FETCH response. A suffix of ".SILENT" in + the data item name prevents the untagged FETCH, and the server + SHOULD assume that the client has determined the updated value + itself or does not care about the updated value. + + Note: Regardless of whether or not the ".SILENT" suffix + was used, the server SHOULD send an untagged FETCH + response if a change to a message's flags from an + external source is observed. The intent is that the + status of the flags is determinate without a race + condition. + + + + + + + + + + + +Crispin Standards Track [Page 58] + +RFC 3501 IMAPv4 March 2003 + + + The currently defined data items that can be stored are: + + FLAGS <flag list> + Replace the flags for the message (other than \Recent) with the + argument. The new value of the flags is returned as if a FETCH + of those flags was done. + + FLAGS.SILENT <flag list> + Equivalent to FLAGS, but without returning a new value. + + +FLAGS <flag list> + Add the argument to the flags for the message. The new value + of the flags is returned as if a FETCH of those flags was done. + + +FLAGS.SILENT <flag list> + Equivalent to +FLAGS, but without returning a new value. + + -FLAGS <flag list> + Remove the argument from the flags for the message. The new + value of the flags is returned as if a FETCH of those flags was + done. + + -FLAGS.SILENT <flag list> + Equivalent to -FLAGS, but without returning a new value. 
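
The six data items above differ only in the set operation applied and in whether a new value is reported. The sketch below models that on a plain set of flag strings. The name `apply_store` is hypothetical, flags are compared case-sensitively for brevity, and dropping \Recent from the argument is an assumption made here because the client cannot alter that flag.

```python
from typing import Iterable, List, Optional, Set, Tuple

RECENT = "\\Recent"


def apply_store(current: Set[str], item: str,
                flag_list: Iterable[str]) -> Tuple[Set[str], Optional[List[str]]]:
    """Return (new flags, flags to report in an untagged FETCH or None)."""
    name = item.upper()
    silent = name.endswith(".SILENT")
    if silent:
        name = name[:-len(".SILENT")]
    # Assumption: \Recent in the argument is ignored rather than rejected.
    args = set(flag_list) - {RECENT}
    if name == "FLAGS":
        # Replace everything except \Recent, which is left as it was.
        new = args | (current & {RECENT})
    elif name == "+FLAGS":
        new = current | args
    elif name == "-FLAGS":
        new = current - args
    else:
        raise ValueError("unknown STORE data item: %r" % item)
    # The .SILENT forms suppress the report of the new value.
    return new, (None if silent else sorted(new))


flags, report = apply_store({"\\Seen"}, "+FLAGS", ["\\Deleted"])
print(report)
```

A server would still send an untagged FETCH for changes made by an external source, whether or not ".SILENT" was used; the sketch does not model that case.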
+ + + Example: C: A003 STORE 2:4 +FLAGS (\Deleted) + S: * 2 FETCH (FLAGS (\Deleted \Seen)) + S: * 3 FETCH (FLAGS (\Deleted)) + S: * 4 FETCH (FLAGS (\Deleted \Flagged \Seen)) + S: A003 OK STORE completed + + +6.4.7. COPY Command + + Arguments: sequence set + mailbox name + + Responses: no specific responses for this command + + Result: OK - copy completed + NO - copy error: can't copy those messages or to that + name + BAD - command unknown or arguments invalid + + + + + + + +Crispin Standards Track [Page 59] + +RFC 3501 IMAPv4 March 2003 + + + The COPY command copies the specified message(s) to the end of the + specified destination mailbox. The flags and internal date of the + message(s) SHOULD be preserved, and the Recent flag SHOULD be set, + in the copy. + + If the destination mailbox does not exist, a server SHOULD return + an error. It SHOULD NOT automatically create the mailbox. Unless + it is certain that the destination mailbox can not be created, the + server MUST send the response code "[TRYCREATE]" as the prefix of + the text of the tagged NO response. This gives a hint to the + client that it can attempt a CREATE command and retry the COPY if + the CREATE is successful. + + If the COPY command is unsuccessful for any reason, server + implementations MUST restore the destination mailbox to its state + before the COPY attempt. + + Example: C: A003 COPY 2:4 MEETING + S: A003 OK COPY completed + + +6.4.8. UID Command + + Arguments: command name + command arguments + + Responses: untagged responses: FETCH, SEARCH + + Result: OK - UID command completed + NO - UID command error + BAD - command unknown or arguments invalid + + The UID command has two forms. In the first form, it takes as its + arguments a COPY, FETCH, or STORE command with arguments + appropriate for the associated command. However, the numbers in + the sequence set argument are unique identifiers instead of + message sequence numbers. 
Sequence set ranges are permitted, but + there is no guarantee that unique identifiers will be contiguous. + + A non-existent unique identifier is ignored without any error + message generated. Thus, it is possible for a UID FETCH command + to return an OK without any data or a UID COPY or UID STORE to + return an OK without performing any operations. + + In the second form, the UID command takes a SEARCH command with + SEARCH command arguments. The interpretation of the arguments is + the same as with SEARCH; however, the numbers returned in a SEARCH + response for a UID SEARCH command are unique identifiers instead + + + +Crispin Standards Track [Page 60] + +RFC 3501 IMAPv4 March 2003 + + + of message sequence numbers. For example, the command UID SEARCH + 1:100 UID 443:557 returns the unique identifiers corresponding to + the intersection of two sequence sets, the message sequence number + range 1:100 and the UID range 443:557. + + Note: in the above example, the UID range 443:557 + appears. The same comment about a non-existent unique + identifier being ignored without any error message also + applies here. Hence, even if neither UID 443 nor 557 + exists, this range is valid and would include an existing + UID 495. + + Also note that a UID range of 559:* always includes the + UID of the last message in the mailbox, even if 559 is + higher than any assigned UID value. This is because the + contents of a range are independent of the order of the + range endpoints. Thus, any UID range with * as one of + the endpoints indicates at least one message (the + message with the highest numbered UID), unless the + mailbox is empty. + + The number after the "*" in an untagged FETCH response is always a + message sequence number, not a unique identifier, even for a UID + command response. 
However, server implementations MUST implicitly + include the UID message data item as part of any FETCH response + caused by a UID command, regardless of whether a UID was specified + as a message data item to the FETCH. + + + Note: The rule about including the UID message data item as part + of a FETCH response primarily applies to the UID FETCH and UID + STORE commands, including a UID FETCH command that does not + include UID as a message data item. Although it is unlikely that + the other UID commands will cause an untagged FETCH, this rule + applies to these commands as well. + + Example: C: A999 UID FETCH 4827313:4828442 FLAGS + S: * 23 FETCH (FLAGS (\Seen) UID 4827313) + S: * 24 FETCH (FLAGS (\Seen) UID 4827943) + S: * 25 FETCH (FLAGS (\Seen) UID 4828442) + S: A999 OK UID FETCH completed + + + + + + + + + + +Crispin Standards Track [Page 61] + +RFC 3501 IMAPv4 March 2003 + + +6.5. Client Commands - Experimental/Expansion + + +6.5.1. X<atom> Command + + Arguments: implementation defined + + Responses: implementation defined + + Result: OK - command completed + NO - failure + BAD - command unknown or arguments invalid + + Any command prefixed with an X is an experimental command. + Commands which are not part of this specification, a standard or + standards-track revision of this specification, or an + IESG-approved experimental protocol, MUST use the X prefix. + + Any added untagged responses issued by an experimental command + MUST also be prefixed with an X. Server implementations MUST NOT + send any such untagged responses, unless the client requested it + by issuing the associated experimental command. + + Example: C: a441 CAPABILITY + S: * CAPABILITY IMAP4rev1 XPIG-LATIN + S: a441 OK CAPABILITY completed + C: A442 XPIG-LATIN + S: * XPIG-LATIN ow-nay eaking-spay ig-pay atin-lay + S: A442 OK XPIG-LATIN ompleted-cay + +7. Server Responses + + Server responses are in three forms: status responses, server data, + and command continuation request. 
The information contained in a + server response, identified by "Contents:" in the response + descriptions below, is described by function, not by syntax. The + precise syntax of server responses is described in the Formal Syntax + section. + + The client MUST be prepared to accept any response at all times. + + Status responses can be tagged or untagged. Tagged status responses + indicate the completion result (OK, NO, or BAD status) of a client + command, and have a tag matching the command. + + Some status responses, and all server data, are untagged. An + untagged response is indicated by the token "*" instead of a tag. + Untagged status responses indicate server greeting, or server status + + + +Crispin Standards Track [Page 62] + +RFC 3501 IMAPv4 March 2003 + + + that does not indicate the completion of a command (for example, an + impending system shutdown alert). For historical reasons, untagged + server data responses are also called "unsolicited data", although + strictly speaking, only unilateral server data is truly + "unsolicited". + + Certain server data MUST be recorded by the client when it is + received; this is noted in the description of that data. Such data + conveys critical information which affects the interpretation of all + subsequent commands and responses (e.g., updates reflecting the + creation or destruction of messages). + + Other server data SHOULD be recorded for later reference; if the + client does not need to record the data, or if recording the data has + no obvious purpose (e.g., a SEARCH response when no SEARCH command is + in progress), the data SHOULD be ignored. + + An example of unilateral untagged server data occurs when the IMAP + connection is in the selected state. In the selected state, the + server checks the mailbox for new messages as part of command + execution. Normally, this is part of the execution of every command; + hence, a NOOP command suffices to check for new messages. 
If new + messages are found, the server sends untagged EXISTS and RECENT + responses reflecting the new size of the mailbox. Server + implementations that offer multiple simultaneous access to the same + mailbox SHOULD also send appropriate unilateral untagged FETCH and + EXPUNGE responses if another agent changes the state of any message + flags or expunges any messages. + + Command continuation request responses use the token "+" instead of a + tag. These responses are sent by the server to indicate acceptance + of an incomplete client command and readiness for the remainder of + the command. + +7.1. Server Responses - Status Responses + + Status responses are OK, NO, BAD, PREAUTH and BYE. OK, NO, and BAD + can be tagged or untagged. PREAUTH and BYE are always untagged. + + Status responses MAY include an OPTIONAL "response code". A response + code consists of data inside square brackets in the form of an atom, + possibly followed by a space and arguments. The response code + contains additional information or status codes for client software + beyond the OK/NO/BAD condition, and are defined when there is a + specific action that a client can take based upon the additional + information. + + + + + +Crispin Standards Track [Page 63] + +RFC 3501 IMAPv4 March 2003 + + + The currently defined response codes are: + + ALERT + + The human-readable text contains a special alert that MUST be + presented to the user in a fashion that calls the user's + attention to the message. + + BADCHARSET + + Optionally followed by a parenthesized list of charsets. A + SEARCH failed because the given charset is not supported by + this implementation. If the optional list of charsets is + given, this lists the charsets that are supported by this + implementation. + + CAPABILITY + + Followed by a list of capabilities. This can appear in the + initial OK or PREAUTH response to transmit an initial + capabilities list. 
This makes it unnecessary for a client to + send a separate CAPABILITY command if it recognizes this + response. + + PARSE + + The human-readable text represents an error in parsing the + [RFC-2822] header or [MIME-IMB] headers of a message in the + mailbox. + + PERMANENTFLAGS + + Followed by a parenthesized list of flags, indicates which of + the known flags the client can change permanently. Any flags + that are in the FLAGS untagged response, but not the + PERMANENTFLAGS list, can not be set permanently. If the client + attempts to STORE a flag that is not in the PERMANENTFLAGS + list, the server will either ignore the change or store the + state change for the remainder of the current session only. + The PERMANENTFLAGS list can also include the special flag \*, + which indicates that it is possible to create new keywords by + attempting to store those flags in the mailbox. + + + + + + + + + +Crispin Standards Track [Page 64] + +RFC 3501 IMAPv4 March 2003 + + + READ-ONLY + + The mailbox is selected read-only, or its access while selected + has changed from read-write to read-only. + + READ-WRITE + + The mailbox is selected read-write, or its access while + selected has changed from read-only to read-write. + + TRYCREATE + + An APPEND or COPY attempt is failing because the target mailbox + does not exist (as opposed to some other reason). This is a + hint to the client that the operation can succeed if the + mailbox is first created by the CREATE command. + + UIDNEXT + + Followed by a decimal number, indicates the next unique + identifier value. Refer to section 2.3.1.1 for more + information. + + UIDVALIDITY + + Followed by a decimal number, indicates the unique identifier + validity value. Refer to section 2.3.1.1 for more information. + + UNSEEN + + Followed by a decimal number, indicates the number of the first + message without the \Seen flag set. 
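
A client typically separates the optional response code from the human-readable text before acting on either. The following is a rough sketch of that split. The function name `split_response_code` is hypothetical, the input is assumed to be the text that follows the OK/NO/BAD/PREAUTH/BYE keyword, and the UIDNEXT value in the usage line is illustrative only.

```python
import re
from typing import Optional, Tuple

# "[" atom, optionally a space and arguments, "]", then the text.
_RESP_CODE = re.compile(r"\[([^\]\s]+)(?: ([^\]]*))?\] ?(.*)")


def split_response_code(text: str) -> Tuple[Optional[str], Optional[str], str]:
    """Return (code, arguments, human-readable text)."""
    m = _RESP_CODE.fullmatch(text)
    if m is None:
        # No response code present: everything is human-readable text.
        return None, None, text
    # Code names are upper-cased here so callers can compare them directly.
    return m.group(1).upper(), m.group(2), m.group(3)


print(split_response_code("[ALERT] System shutdown in 10 minutes"))
print(split_response_code("[UIDNEXT 4392] Predicted next UID"))
```

A code the client does not recognize is still split off by the sketch, so a caller can ignore it and keep the human-readable text.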
+ + Additional response codes defined by particular client or server + implementations SHOULD be prefixed with an "X" until they are + added to a revision of this protocol. Client implementations + SHOULD ignore response codes that they do not recognize. + +7.1.1. OK Response + + Contents: OPTIONAL response code + human-readable text + + The OK response indicates an information message from the server. + When tagged, it indicates successful completion of the associated + command. The human-readable text MAY be presented to the user as + an information message. The untagged form indicates an + + + + +Crispin Standards Track [Page 65] + +RFC 3501 IMAPv4 March 2003 + + + information-only message; the nature of the information MAY be + indicated by a response code. + + The untagged form is also used as one of three possible greetings + at connection startup. It indicates that the connection is not + yet authenticated and that a LOGIN command is needed. + + Example: S: * OK IMAP4rev1 server ready + C: A001 LOGIN fred blurdybloop + S: * OK [ALERT] System shutdown in 10 minutes + S: A001 OK LOGIN Completed + + +7.1.2. NO Response + + Contents: OPTIONAL response code + human-readable text + + The NO response indicates an operational error message from the + server. When tagged, it indicates unsuccessful completion of the + associated command. The untagged form indicates a warning; the + command can still complete successfully. The human-readable text + describes the condition. + + Example: C: A222 COPY 1:2 owatagusiam + S: * NO Disk is 98% full, please delete unnecessary data + S: A222 OK COPY completed + C: A223 COPY 3:200 blurdybloop + S: * NO Disk is 98% full, please delete unnecessary data + S: * NO Disk is 99% full, please delete unnecessary data + S: A223 NO COPY failed: disk is full + + +7.1.3. BAD Response + + Contents: OPTIONAL response code + human-readable text + + The BAD response indicates an error message from the server. 
When + tagged, it reports a protocol-level error in the client's command; + the tag indicates the command that caused the error. The untagged + form indicates a protocol-level error for which the associated + command can not be determined; it can also indicate an internal + server failure. The human-readable text describes the condition. + + + + + + + +Crispin Standards Track [Page 66] + +RFC 3501 IMAPv4 March 2003 + + + Example: C: ...very long command line... + S: * BAD Command line too long + C: ...empty line... + S: * BAD Empty command line + C: A443 EXPUNGE + S: * BAD Disk crash, attempting salvage to a new disk! + S: * OK Salvage successful, no data lost + S: A443 OK Expunge completed + + +7.1.4. PREAUTH Response + + Contents: OPTIONAL response code + human-readable text + + The PREAUTH response is always untagged, and is one of three + possible greetings at connection startup. It indicates that the + connection has already been authenticated by external means; thus + no LOGIN command is needed. + + Example: S: * PREAUTH IMAP4rev1 server logged in as Smith + + +7.1.5. BYE Response + + Contents: OPTIONAL response code + human-readable text + + The BYE response is always untagged, and indicates that the server + is about to close the connection. The human-readable text MAY be + displayed to the user in a status report by the client. The BYE + response is sent under one of four conditions: + + 1) as part of a normal logout sequence. The server will close + the connection after sending the tagged OK response to the + LOGOUT command. + + 2) as a panic shutdown announcement. The server closes the + connection immediately. + + 3) as an announcement of an inactivity autologout. The server + closes the connection immediately. + + 4) as one of three possible greetings at connection startup, + indicating that the server is not willing to accept a + connection from this client. The server closes the + connection immediately. 
+ + + + +Crispin Standards Track [Page 67] + +RFC 3501 IMAPv4 March 2003 + + + The difference between a BYE that occurs as part of a normal + LOGOUT sequence (the first case) and a BYE that occurs because of + a failure (the other three cases) is that the connection closes + immediately in the failure case. In all cases the client SHOULD + continue to read response data from the server until the + connection is closed; this will ensure that any pending untagged + or completion responses are read and processed. + + Example: S: * BYE Autologout; idle for too long + +7.2. Server Responses - Server and Mailbox Status + + These responses are always untagged. This is how server and mailbox + status data are transmitted from the server to the client. Many of + these responses typically result from a command with the same name. + +7.2.1. CAPABILITY Response + + Contents: capability listing + + The CAPABILITY response occurs as a result of a CAPABILITY + command. The capability listing contains a space-separated + listing of capability names that the server supports. The + capability listing MUST include the atom "IMAP4rev1". + + In addition, client and server implementations MUST implement the + STARTTLS, LOGINDISABLED, and AUTH=PLAIN (described in [IMAP-TLS]) + capabilities. See the Security Considerations section for + important information. + + A capability name which begins with "AUTH=" indicates that the + server supports that particular authentication mechanism. + + The LOGINDISABLED capability indicates that the LOGIN command is + disabled, and that the server will respond with a tagged NO + response to any attempt to use the LOGIN command even if the user + name and password are valid. An IMAP client MUST NOT issue the + LOGIN command if the server advertises the LOGINDISABLED + capability. + + Other capability names indicate that the server supports an + extension, revision, or amendment to the IMAP4rev1 protocol. 
+ Server responses MUST conform to this document until the client + issues a command that uses the associated capability. + + Capability names MUST either begin with "X" or be standard or + standards-track IMAP4rev1 extensions, revisions, or amendments + registered with IANA. A server MUST NOT offer unregistered or + + + +Crispin Standards Track [Page 68] + +RFC 3501 IMAPv4 March 2003 + + + non-standard capability names, unless such names are prefixed with + an "X". + + Client implementations SHOULD NOT require any capability name + other than "IMAP4rev1", and MUST ignore any unknown capability + names. + + A server MAY send capabilities automatically, by using the + CAPABILITY response code in the initial PREAUTH or OK responses, + and by sending an updated CAPABILITY response code in the tagged + OK response as part of a successful authentication. It is + unnecessary for a client to send a separate CAPABILITY command if + it recognizes these automatic capabilities. + + Example: S: * CAPABILITY IMAP4rev1 STARTTLS AUTH=GSSAPI XPIG-LATIN + + +7.2.2. LIST Response + + Contents: name attributes + hierarchy delimiter + name + + The LIST response occurs as a result of a LIST command. It + returns a single name that matches the LIST specification. There + can be multiple LIST responses for a single LIST command. + + Four name attributes are defined: + + \Noinferiors + It is not possible for any child levels of hierarchy to exist + under this name; no child levels exist now and none can be + created in the future. + + \Noselect + It is not possible to use this name as a selectable mailbox. + + \Marked + The mailbox has been marked "interesting" by the server; the + mailbox probably contains messages that have been added since + the last time the mailbox was selected. + + \Unmarked + The mailbox does not contain any additional messages since the + last time the mailbox was selected. 
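   The three parts of a LIST response (name attributes, hierarchy
   delimiter, and name) can be split apart as sketched below. This is
   a simplified illustration: the function name is hypothetical, and
   the sketch assumes the mailbox name arrives as an atom or a quoted
   string on the same line, so names sent as literals are not handled.

```python
import re

# Attributes in parentheses, then a quoted one-character delimiter or NIL,
# then the mailbox name (the rest of the line).
_LIST_RE = re.compile(r'^\* LIST \(([^)]*)\) (?:"(\\?.)"|NIL) (.+)$',
                      re.IGNORECASE)


def parse_list_response(line):
    """Split an untagged LIST response into its three parts."""
    match = _LIST_RE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError("not a LIST response this sketch understands")
    delimiter = match.group(2)  # None when the server sent NIL
    if delimiter is not None and len(delimiter) == 2:
        delimiter = delimiter[1]  # drop the escaping backslash
    name = match.group(3)
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        name = re.sub(r"\\(.)", r"\1", name[1:-1])  # unescape quoted name
    return {
        "attributes": match.group(1).split(),
        "delimiter": delimiter,
        "name": name,
    }
```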
+ + + + + + +Crispin Standards Track [Page 69] + +RFC 3501 IMAPv4 March 2003 + + + If it is not feasible for the server to determine whether or not + the mailbox is "interesting", or if the name is a \Noselect name, + the server SHOULD NOT send either \Marked or \Unmarked. + + The hierarchy delimiter is a character used to delimit levels of + hierarchy in a mailbox name. A client can use it to create child + mailboxes, and to search higher or lower levels of naming + hierarchy. All children of a top-level hierarchy node MUST use + the same separator character. A NIL hierarchy delimiter means + that no hierarchy exists; the name is a "flat" name. + + The name represents an unambiguous left-to-right hierarchy, and + MUST be valid for use as a reference in LIST and LSUB commands. + Unless \Noselect is indicated, the name MUST also be valid as an + argument for commands, such as SELECT, that accept mailbox names. + + Example: S: * LIST (\Noselect) "/" ~/Mail/foo + + +7.2.3. LSUB Response + + Contents: name attributes + hierarchy delimiter + name + + The LSUB response occurs as a result of an LSUB command. It + returns a single name that matches the LSUB specification. There + can be multiple LSUB responses for a single LSUB command. The + data is identical in format to the LIST response. + + Example: S: * LSUB () "." #news.comp.mail.misc + + +7.2.4 STATUS Response + + Contents: name + status parenthesized list + + The STATUS response occurs as a result of an STATUS command. It + returns the mailbox name that matches the STATUS specification and + the requested mailbox status information. + + Example: S: * STATUS blurdybloop (MESSAGES 231 UIDNEXT 44292) + + + + + + + + +Crispin Standards Track [Page 70] + +RFC 3501 IMAPv4 March 2003 + + +7.2.5. SEARCH Response + + Contents: zero or more numbers + + The SEARCH response occurs as a result of a SEARCH or UID SEARCH + command. The number(s) refer to those messages that match the + search criteria. 
For SEARCH, these are message sequence numbers; + for UID SEARCH, these are unique identifiers. Each number is + delimited by a space. + + Example: S: * SEARCH 2 3 6 + + +7.2.6. FLAGS Response + + Contents: flag parenthesized list + + The FLAGS response occurs as a result of a SELECT or EXAMINE + command. The flag parenthesized list identifies the flags (at a + minimum, the system-defined flags) that are applicable for this + mailbox. Flags other than the system flags can also exist, + depending on server implementation. + + The update from the FLAGS response MUST be recorded by the client. + + Example: S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft) + + +7.3. Server Responses - Mailbox Size + + These responses are always untagged. This is how changes in the size + of the mailbox are transmitted from the server to the client. + Immediately following the "*" token is a number that represents a + message count. + +7.3.1. EXISTS Response + + Contents: none + + The EXISTS response reports the number of messages in the mailbox. + This response occurs as a result of a SELECT or EXAMINE command, + and if the size of the mailbox changes (e.g., new messages). + + The update from the EXISTS response MUST be recorded by the + client. + + Example: S: * 23 EXISTS + + + + +Crispin Standards Track [Page 71] + +RFC 3501 IMAPv4 March 2003 + + +7.3.2. RECENT Response + + Contents: none + + The RECENT response reports the number of messages with the + \Recent flag set. This response occurs as a result of a SELECT or + EXAMINE command, and if the size of the mailbox changes (e.g., new + messages). + + Note: It is not guaranteed that the message sequence + numbers of recent messages will be a contiguous range of + the highest n messages in the mailbox (where n is the + value reported by the RECENT response). 
Examples of + situations in which this is not the case are: multiple + clients having the same mailbox open (the first session + to be notified will see it as recent, others will + probably see it as non-recent), and when the mailbox is + re-ordered by a non-IMAP agent. + + The only reliable way to identify recent messages is to + look at message flags to see which have the \Recent flag + set, or to do a SEARCH RECENT. + + The update from the RECENT response MUST be recorded by the + client. + + Example: S: * 5 RECENT + + +7.4. Server Responses - Message Status + + These responses are always untagged. This is how message data are + transmitted from the server to the client, often as a result of a + command with the same name. Immediately following the "*" token is a + number that represents a message sequence number. + +7.4.1. EXPUNGE Response + + Contents: none + + The EXPUNGE response reports that the specified message sequence + number has been permanently removed from the mailbox. The message + sequence number for each successive message in the mailbox is + immediately decremented by 1, and this decrement is reflected in + message sequence numbers in subsequent responses (including other + untagged EXPUNGE responses). + + + + + +Crispin Standards Track [Page 72] + +RFC 3501 IMAPv4 March 2003 + + + The EXPUNGE response also decrements the number of messages in the + mailbox; it is not necessary to send an EXISTS response with the + new value. + + As a result of the immediate decrement rule, message sequence + numbers that appear in a set of successive EXPUNGE responses + depend upon whether the messages are removed starting from lower + numbers to higher numbers, or from higher numbers to lower + numbers. 
For example, if the last 5 messages in a 9-message + mailbox are expunged, a "lower to higher" server will send five + untagged EXPUNGE responses for message sequence number 5, whereas + a "higher to lower server" will send successive untagged EXPUNGE + responses for message sequence numbers 9, 8, 7, 6, and 5. + + An EXPUNGE response MUST NOT be sent when no command is in + progress, nor while responding to a FETCH, STORE, or SEARCH + command. This rule is necessary to prevent a loss of + synchronization of message sequence numbers between client and + server. A command is not "in progress" until the complete command + has been received; in particular, a command is not "in progress" + during the negotiation of command continuation. + + Note: UID FETCH, UID STORE, and UID SEARCH are different + commands from FETCH, STORE, and SEARCH. An EXPUNGE + response MAY be sent during a UID command. + + The update from the EXPUNGE response MUST be recorded by the + client. + + Example: S: * 44 EXPUNGE + + +7.4.2. FETCH Response + + Contents: message data + + The FETCH response returns data about a message to the client. + The data are pairs of data item names and their values in + parentheses. This response occurs as the result of a FETCH or + STORE command, as well as by unilateral server decision (e.g., + flag updates). + + The current data items are: + + BODY + A form of BODYSTRUCTURE without extension data. + + + + + +Crispin Standards Track [Page 73] + +RFC 3501 IMAPv4 March 2003 + + + BODY[<section>]<<origin octet>> + A string expressing the body contents of the specified section. + The string SHOULD be interpreted by the client according to the + content transfer encoding, body type, and subtype. + + If the origin octet is specified, this string is a substring of + the entire body contents, starting at that origin octet. This + means that BODY[]<0> MAY be truncated, but BODY[] is NEVER + truncated. 
+ + Note: The origin octet facility MUST NOT be used by a server + in a FETCH response unless the client specifically requested + it by means of a FETCH of a BODY[<section>]<<partial>> data + item. + + 8-bit textual data is permitted if a [CHARSET] identifier is + part of the body parameter parenthesized list for this section. + Note that headers (part specifiers HEADER or MIME, or the + header portion of a MESSAGE/RFC822 part), MUST be 7-bit; 8-bit + characters are not permitted in headers. Note also that the + [RFC-2822] delimiting blank line between the header and the + body is not affected by header line subsetting; the blank line + is always included as part of header data, except in the case + of a message which has no body and no blank line. + + Non-textual data such as binary data MUST be transfer encoded + into a textual form, such as BASE64, prior to being sent to the + client. To derive the original binary data, the client MUST + decode the transfer encoded string. + + BODYSTRUCTURE + A parenthesized list that describes the [MIME-IMB] body + structure of a message. This is computed by the server by + parsing the [MIME-IMB] header fields, defaulting various fields + as necessary. + + For example, a simple text message of 48 lines and 2279 octets + can have a body structure of: ("TEXT" "PLAIN" ("CHARSET" + "US-ASCII") NIL NIL "7BIT" 2279 48) + + Multiple parts are indicated by parenthesis nesting. Instead + of a body type as the first element of the parenthesized list, + there is a sequence of one or more nested body structures. The + second element of the parenthesized list is the multipart + subtype (mixed, digest, parallel, alternative, etc.). 
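         The nesting rules above map naturally onto a small recursive
         parser. The following is a minimal sketch, assuming the whole
         structure is available as one string containing only
         parenthesized lists, quoted strings, numbers, NIL, and atoms
         (literals are not handled). All function names are
         hypothetical.

```python
def parse_parenthesized(text):
    """Parse one value: lists become Python lists, NIL becomes None."""
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos] == " ":
            pos += 1

    def parse_value():
        nonlocal pos
        skip_spaces()
        if pos >= len(text):
            raise ValueError("unexpected end of input")
        ch = text[pos]
        if ch == "(":
            pos += 1
            items = []
            while True:
                skip_spaces()
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                if text[pos] == ")":
                    pos += 1
                    return items
                items.append(parse_value())
        if ch == '"':
            pos += 1
            chars = []
            while pos < len(text) and text[pos] != '"':
                if text[pos] == "\\":  # escaped quote or backslash
                    pos += 1
                    if pos >= len(text):
                        raise ValueError("dangling escape")
                chars.append(text[pos])
                pos += 1
            if pos >= len(text):
                raise ValueError("unterminated quoted string")
            pos += 1  # consume the closing quote
            return "".join(chars)
        start = pos
        while pos < len(text) and text[pos] not in ' ()"':
            pos += 1
        if pos == start:
            raise ValueError("unexpected character: " + ch)
        atom = text[start:pos]
        if atom.upper() == "NIL":
            return None
        return int(atom) if atom.isdigit() else atom

    value = parse_value()
    if text[pos:].strip():
        raise ValueError("trailing data after the structure")
    return value


def is_multipart(body):
    """A multipart body starts with nested body structures, not a type."""
    return isinstance(body[0], list)


def multipart_subtype(body):
    """The subtype is the first element after the nested body parts."""
    return next(item for item in body if not isinstance(item, list))
```

         Representing NIL as None keeps it distinct from an empty
         string, which the protocol treats as a different value.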
+ + + + + + +Crispin Standards Track [Page 74] + +RFC 3501 IMAPv4 March 2003 + + + For example, a two part message consisting of a text and a + BASE64-encoded text attachment can have a body structure of: + (("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 + 23)("TEXT" "PLAIN" ("CHARSET" "US-ASCII" "NAME" "cc.diff") + "<960723163407.20117h@cac.washington.edu>" "Compiler diff" + "BASE64" 4554 73) "MIXED") + + Extension data follows the multipart subtype. Extension data + is never returned with the BODY fetch, but can be returned with + a BODYSTRUCTURE fetch. Extension data, if present, MUST be in + the defined order. The extension data of a multipart body part + are in the following order: + + body parameter parenthesized list + A parenthesized list of attribute/value pairs [e.g., ("foo" + "bar" "baz" "rag") where "bar" is the value of "foo", and + "rag" is the value of "baz"] as defined in [MIME-IMB]. + + body disposition + A parenthesized list, consisting of a disposition type + string, followed by a parenthesized list of disposition + attribute/value pairs as defined in [DISPOSITION]. + + body language + A string or parenthesized list giving the body language + value as defined in [LANGUAGE-TAGS]. + + body location + A string list giving the body content URI as defined in + [LOCATION]. + + Any following extension data are not yet defined in this + version of the protocol. Such extension data can consist of + zero or more NILs, strings, numbers, or potentially nested + parenthesized lists of such data. Client implementations that + do a BODYSTRUCTURE fetch MUST be prepared to accept such + extension data. Server implementations MUST NOT send such + extension data until it has been defined by a revision of this + protocol. + + The basic fields of a non-multipart body part are in the + following order: + + body type + A string giving the content media type name as defined in + [MIME-IMB]. 
+ + + + + +Crispin Standards Track [Page 75] + +RFC 3501 IMAPv4 March 2003 + + + body subtype + A string giving the content subtype name as defined in + [MIME-IMB]. + + body parameter parenthesized list + A parenthesized list of attribute/value pairs [e.g., ("foo" + "bar" "baz" "rag") where "bar" is the value of "foo" and + "rag" is the value of "baz"] as defined in [MIME-IMB]. + + body id + A string giving the content id as defined in [MIME-IMB]. + + body description + A string giving the content description as defined in + [MIME-IMB]. + + body encoding + A string giving the content transfer encoding as defined in + [MIME-IMB]. + + body size + A number giving the size of the body in octets. Note that + this size is the size in its transfer encoding and not the + resulting size after any decoding. + + A body type of type MESSAGE and subtype RFC822 contains, + immediately after the basic fields, the envelope structure, + body structure, and size in text lines of the encapsulated + message. + + A body type of type TEXT contains, immediately after the basic + fields, the size of the body in text lines. Note that this + size is the size in its content transfer encoding and not the + resulting size after any decoding. + + Extension data follows the basic fields and the type-specific + fields listed above. Extension data is never returned with the + BODY fetch, but can be returned with a BODYSTRUCTURE fetch. + Extension data, if present, MUST be in the defined order. + + The extension data of a non-multipart body part are in the + following order: + + body MD5 + A string giving the body MD5 value as defined in [MD5]. + + + + + + +Crispin Standards Track [Page 76] + +RFC 3501 IMAPv4 March 2003 + + + body disposition + A parenthesized list with the same content and function as + the body disposition for a multipart body part. + + body language + A string or parenthesized list giving the body language + value as defined in [LANGUAGE-TAGS]. 
+ + body location + A string list giving the body content URI as defined in + [LOCATION]. + + Any following extension data are not yet defined in this + version of the protocol, and would be as described above under + multipart extension data. + + ENVELOPE + A parenthesized list that describes the envelope structure of a + message. This is computed by the server by parsing the + [RFC-2822] header into the component parts, defaulting various + fields as necessary. + + The fields of the envelope structure are in the following + order: date, subject, from, sender, reply-to, to, cc, bcc, + in-reply-to, and message-id. The date, subject, in-reply-to, + and message-id fields are strings. The from, sender, reply-to, + to, cc, and bcc fields are parenthesized lists of address + structures. + + An address structure is a parenthesized list that describes an + electronic mail address. The fields of an address structure + are in the following order: personal name, [SMTP] + at-domain-list (source route), mailbox name, and host name. + + [RFC-2822] group syntax is indicated by a special form of + address structure in which the host name field is NIL. If the + mailbox name field is also NIL, this is an end of group marker + (semi-colon in RFC 822 syntax). If the mailbox name field is + non-NIL, this is a start of group marker, and the mailbox name + field holds the group name phrase. + + If the Date, Subject, In-Reply-To, and Message-ID header lines + are absent in the [RFC-2822] header, the corresponding member + of the envelope is NIL; if these header lines are present but + empty the corresponding member of the envelope is the empty + string. + + + + + +Crispin Standards Track [Page 77] + +RFC 3501 IMAPv4 March 2003 + + + Note: some servers may return a NIL envelope member in the + "present but empty" case. Clients SHOULD treat NIL and + empty string as identical. + + Note: [RFC-2822] requires that all messages have a valid + Date header. 
Therefore, the date member in the envelope can + not be NIL or the empty string. + + Note: [RFC-2822] requires that the In-Reply-To and + Message-ID headers, if present, have non-empty content. + Therefore, the in-reply-to and message-id members in the + envelope can not be the empty string. + + If the From, To, cc, and bcc header lines are absent in the + [RFC-2822] header, or are present but empty, the corresponding + member of the envelope is NIL. + + If the Sender or Reply-To lines are absent in the [RFC-2822] + header, or are present but empty, the server sets the + corresponding member of the envelope to be the same value as + the from member (the client is not expected to know to do + this). + + Note: [RFC-2822] requires that all messages have a valid + From header. Therefore, the from, sender, and reply-to + members in the envelope can not be NIL. + + FLAGS + A parenthesized list of flags that are set for this message. + + INTERNALDATE + A string representing the internal date of the message. + + RFC822 + Equivalent to BODY[]. + + RFC822.HEADER + Equivalent to BODY[HEADER]. Note that this did not result in + \Seen being set, because RFC822.HEADER response data occurs as + a result of a FETCH of RFC822.HEADER. BODY[HEADER] response + data occurs as a result of a FETCH of BODY[HEADER] (which sets + \Seen) or BODY.PEEK[HEADER] (which does not set \Seen). + + RFC822.SIZE + A number expressing the [RFC-2822] size of the message. + + + + + + +Crispin Standards Track [Page 78] + +RFC 3501 IMAPv4 March 2003 + + + RFC822.TEXT + Equivalent to BODY[TEXT]. + + UID + A number expressing the unique identifier of the message. + + + Example: S: * 23 FETCH (FLAGS (\Seen) RFC822.SIZE 44827) + + +7.5. Server Responses - Command Continuation Request + + The command continuation request response is indicated by a "+" token + instead of a tag. This form of response indicates that the server is + ready to accept the continuation of a command from the client. 
The + remainder of this response is a line of text. + + This response is used in the AUTHENTICATE command to transmit server + data to the client, and request additional client data. This + response is also used if an argument to any command is a literal. + + The client is not permitted to send the octets of the literal unless + the server indicates that it is expected. This permits the server to + process commands and reject errors on a line-by-line basis. The + remainder of the command, including the CRLF that terminates a + command, follows the octets of the literal. If there are any + additional command arguments, the literal octets are followed by a + space and those arguments. + + Example: C: A001 LOGIN {11} + S: + Ready for additional command text + C: FRED FOOBAR {7} + S: + Ready for additional command text + C: fat man + S: A001 OK LOGIN completed + C: A044 BLURDYBLOOP {102856} + S: A044 BAD No such command as "BLURDYBLOOP" + + + + + + + + + + + + + + +Crispin Standards Track [Page 79] + +RFC 3501 IMAPv4 March 2003 + + +8. Sample IMAP4rev1 connection + + The following is a transcript of an IMAP4rev1 connection. A long + line in this sample is broken for editorial clarity. 
+ +S: * OK IMAP4rev1 Service Ready +C: a001 login mrc secret +S: a001 OK LOGIN completed +C: a002 select inbox +S: * 18 EXISTS +S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft) +S: * 2 RECENT +S: * OK [UNSEEN 17] Message 17 is the first unseen message +S: * OK [UIDVALIDITY 3857529045] UIDs valid +S: a002 OK [READ-WRITE] SELECT completed +C: a003 fetch 12 full +S: * 12 FETCH (FLAGS (\Seen) INTERNALDATE "17-Jul-1996 02:44:25 -0700" + RFC822.SIZE 4286 ENVELOPE ("Wed, 17 Jul 1996 02:23:25 -0700 (PDT)" + "IMAP4rev1 WG mtg summary and minutes" + (("Terry Gray" NIL "gray" "cac.washington.edu")) + (("Terry Gray" NIL "gray" "cac.washington.edu")) + (("Terry Gray" NIL "gray" "cac.washington.edu")) + ((NIL NIL "imap" "cac.washington.edu")) + ((NIL NIL "minutes" "CNRI.Reston.VA.US") + ("John Klensin" NIL "KLENSIN" "MIT.EDU")) NIL NIL + "<B27397-0100000@cac.washington.edu>") + BODY ("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 3028 + 92)) +S: a003 OK FETCH completed +C: a004 fetch 12 body[header] +S: * 12 FETCH (BODY[HEADER] {342} +S: Date: Wed, 17 Jul 1996 02:23:25 -0700 (PDT) +S: From: Terry Gray <gray@cac.washington.edu> +S: Subject: IMAP4rev1 WG mtg summary and minutes +S: To: imap@cac.washington.edu +S: cc: minutes@CNRI.Reston.VA.US, John Klensin <KLENSIN@MIT.EDU> +S: Message-Id: <B27397-0100000@cac.washington.edu> +S: MIME-Version: 1.0 +S: Content-Type: TEXT/PLAIN; CHARSET=US-ASCII +S: +S: ) +S: a004 OK FETCH completed +C: a005 store 12 +flags \deleted +S: * 12 FETCH (FLAGS (\Seen \Deleted)) +S: a005 OK +FLAGS completed +C: a006 logout +S: * BYE IMAP4rev1 server terminating connection +S: a006 OK LOGOUT completed + + + +Crispin Standards Track [Page 80] + +RFC 3501 IMAPv4 March 2003 + + +9. Formal Syntax + + The following syntax specification uses the Augmented Backus-Naur + Form (ABNF) notation as specified in [ABNF]. 
+ + In the case of alternative or optional rules in which a later rule + overlaps an earlier rule, the rule which is listed earlier MUST take + priority. For example, "\Seen" when parsed as a flag is the \Seen + flag name and not a flag-extension, even though "\Seen" can be parsed + as a flag-extension. Some, but not all, instances of this rule are + noted below. + + Note: [ABNF] rules MUST be followed strictly; in + particular: + + (1) Except as noted otherwise, all alphabetic characters + are case-insensitive. The use of upper or lower case + characters to define token strings is for editorial clarity + only. Implementations MUST accept these strings in a + case-insensitive fashion. + + (2) In all cases, SP refers to exactly one space. It is + NOT permitted to substitute TAB, insert additional spaces, + or otherwise treat SP as being equivalent to LWSP. + + (3) The ASCII NUL character, %x00, MUST NOT be used at any + time. + +address = "(" addr-name SP addr-adl SP addr-mailbox SP + addr-host ")" + +addr-adl = nstring + ; Holds route from [RFC-2822] route-addr if + ; non-NIL + +addr-host = nstring + ; NIL indicates [RFC-2822] group syntax. + ; Otherwise, holds [RFC-2822] domain name + +addr-mailbox = nstring + ; NIL indicates end of [RFC-2822] group; if + ; non-NIL and addr-host is NIL, holds + ; [RFC-2822] group name. 
+ ; Otherwise, holds [RFC-2822] local-part + ; after removing [RFC-2822] quoting + + + + + + +Crispin Standards Track [Page 81] + +RFC 3501 IMAPv4 March 2003 + + +addr-name = nstring + ; If non-NIL, holds phrase from [RFC-2822] + ; mailbox after removing [RFC-2822] quoting + +append = "APPEND" SP mailbox [SP flag-list] [SP date-time] SP + literal + +astring = 1*ASTRING-CHAR / string + +ASTRING-CHAR = ATOM-CHAR / resp-specials + +atom = 1*ATOM-CHAR + +ATOM-CHAR = <any CHAR except atom-specials> + +atom-specials = "(" / ")" / "{" / SP / CTL / list-wildcards / + quoted-specials / resp-specials + +authenticate = "AUTHENTICATE" SP auth-type *(CRLF base64) + +auth-type = atom + ; Defined by [SASL] + +base64 = *(4base64-char) [base64-terminal] + +base64-char = ALPHA / DIGIT / "+" / "/" + ; Case-sensitive + +base64-terminal = (2base64-char "==") / (3base64-char "=") + +body = "(" (body-type-1part / body-type-mpart) ")" + +body-extension = nstring / number / + "(" body-extension *(SP body-extension) ")" + ; Future expansion. Client implementations + ; MUST accept body-extension fields. Server + ; implementations MUST NOT generate + ; body-extension fields except as defined by + ; future standard or standards-track + ; revisions of this specification. 
+ +body-ext-1part = body-fld-md5 [SP body-fld-dsp [SP body-fld-lang + [SP body-fld-loc *(SP body-extension)]]] + ; MUST NOT be returned on non-extensible + ; "BODY" fetch + + + + + + +Crispin Standards Track [Page 82] + +RFC 3501 IMAPv4 March 2003 + + +body-ext-mpart = body-fld-param [SP body-fld-dsp [SP body-fld-lang + [SP body-fld-loc *(SP body-extension)]]] + ; MUST NOT be returned on non-extensible + ; "BODY" fetch + +body-fields = body-fld-param SP body-fld-id SP body-fld-desc SP + body-fld-enc SP body-fld-octets + +body-fld-desc = nstring + +body-fld-dsp = "(" string SP body-fld-param ")" / nil + +body-fld-enc = (DQUOTE ("7BIT" / "8BIT" / "BINARY" / "BASE64"/ + "QUOTED-PRINTABLE") DQUOTE) / string + +body-fld-id = nstring + +body-fld-lang = nstring / "(" string *(SP string) ")" + +body-fld-loc = nstring + +body-fld-lines = number + +body-fld-md5 = nstring + +body-fld-octets = number + +body-fld-param = "(" string SP string *(SP string SP string) ")" / nil + +body-type-1part = (body-type-basic / body-type-msg / body-type-text) + [SP body-ext-1part] + +body-type-basic = media-basic SP body-fields + ; MESSAGE subtype MUST NOT be "RFC822" + +body-type-mpart = 1*body SP media-subtype + [SP body-ext-mpart] + +body-type-msg = media-message SP body-fields SP envelope + SP body SP body-fld-lines + +body-type-text = media-text SP body-fields SP body-fld-lines + +capability = ("AUTH=" auth-type) / atom + ; New capabilities MUST begin with "X" or be + ; registered with IANA as standard or + ; standards-track + + + + +Crispin Standards Track [Page 83] + +RFC 3501 IMAPv4 March 2003 + + +capability-data = "CAPABILITY" *(SP capability) SP "IMAP4rev1" + *(SP capability) + ; Servers MUST implement the STARTTLS, AUTH=PLAIN, + ; and LOGINDISABLED capabilities + ; Servers which offer RFC 1730 compatibility MUST + ; list "IMAP4" as the first capability. 
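; The capability-data rule can be exercised with a short sketch. It
; assumes an untagged CAPABILITY response on a single line; the
; function name and the keys of the returned dictionary are
; illustrative choices, not protocol elements.

```python
def parse_capabilities(line):
    """Extract capability names from an untagged CAPABILITY response."""
    tokens = line.rstrip("\r\n").split(" ")
    if len(tokens) < 3 or tokens[0] != "*" or tokens[1].upper() != "CAPABILITY":
        raise ValueError("not an untagged CAPABILITY response")
    # Capability names are case-insensitive, so compare in upper case.
    names = {token.upper() for token in tokens[2:]}
    if "IMAP4REV1" not in names:
        raise ValueError("the listing must include IMAP4rev1")
    return {
        "names": names,
        # Names starting with "AUTH=" advertise authentication mechanisms.
        "auth": sorted(n[len("AUTH="):] for n in names if n.startswith("AUTH=")),
        # A client must not send LOGIN when LOGINDISABLED is advertised.
        "login_allowed": "LOGINDISABLED" not in names,
    }
```

; Unknown names such as XPIG-LATIN are simply kept in the set, which
; matches the requirement that clients ignore capabilities they do not
; recognize.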
+ +CHAR8 = %x01-ff + ; any OCTET except NUL, %x00 + +command = tag SP (command-any / command-auth / command-nonauth / + command-select) CRLF + ; Modal based on state + +command-any = "CAPABILITY" / "LOGOUT" / "NOOP" / x-command + ; Valid in all states + +command-auth = append / create / delete / examine / list / lsub / + rename / select / status / subscribe / unsubscribe + ; Valid only in Authenticated or Selected state + +command-nonauth = login / authenticate / "STARTTLS" + ; Valid only when in Not Authenticated state + +command-select = "CHECK" / "CLOSE" / "EXPUNGE" / copy / fetch / store / + uid / search + ; Valid only when in Selected state + +continue-req = "+" SP (resp-text / base64) CRLF + +copy = "COPY" SP sequence-set SP mailbox + +create = "CREATE" SP mailbox + ; Use of INBOX gives a NO error + +date = date-text / DQUOTE date-text DQUOTE + +date-day = 1*2DIGIT + ; Day of month + +date-day-fixed = (SP DIGIT) / 2DIGIT + ; Fixed-format version of date-day + +date-month = "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun" / + "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec" + +date-text = date-day "-" date-month "-" date-year + + + + +Crispin Standards Track [Page 84] + +RFC 3501 IMAPv4 March 2003 + + +date-year = 4DIGIT + +date-time = DQUOTE date-day-fixed "-" date-month "-" date-year + SP time SP zone DQUOTE + +delete = "DELETE" SP mailbox + ; Use of INBOX gives a NO error + +digit-nz = %x31-39 + ; 1-9 + +envelope = "(" env-date SP env-subject SP env-from SP + env-sender SP env-reply-to SP env-to SP env-cc SP + env-bcc SP env-in-reply-to SP env-message-id ")" + +env-bcc = "(" 1*address ")" / nil + +env-cc = "(" 1*address ")" / nil + +env-date = nstring + +env-from = "(" 1*address ")" / nil + +env-in-reply-to = nstring + +env-message-id = nstring + +env-reply-to = "(" 1*address ")" / nil + +env-sender = "(" 1*address ")" / nil + +env-subject = nstring + +env-to = "(" 1*address ")" / nil + +examine = "EXAMINE" SP mailbox + +fetch = "FETCH" SP sequence-set SP ("ALL" 
/ "FULL" / "FAST" / + fetch-att / "(" fetch-att *(SP fetch-att) ")") + +fetch-att = "ENVELOPE" / "FLAGS" / "INTERNALDATE" / + "RFC822" [".HEADER" / ".SIZE" / ".TEXT"] / + "BODY" ["STRUCTURE"] / "UID" / + "BODY" section ["<" number "." nz-number ">"] / + "BODY.PEEK" section ["<" number "." nz-number ">"] + + + + + + +Crispin Standards Track [Page 85] + +RFC 3501 IMAPv4 March 2003 + + +flag = "\Answered" / "\Flagged" / "\Deleted" / + "\Seen" / "\Draft" / flag-keyword / flag-extension + ; Does not include "\Recent" + +flag-extension = "\" atom + ; Future expansion. Client implementations + ; MUST accept flag-extension flags. Server + ; implementations MUST NOT generate + ; flag-extension flags except as defined by + ; future standard or standards-track + ; revisions of this specification. + +flag-fetch = flag / "\Recent" + +flag-keyword = atom + +flag-list = "(" [flag *(SP flag)] ")" + +flag-perm = flag / "\*" + +greeting = "*" SP (resp-cond-auth / resp-cond-bye) CRLF + +header-fld-name = astring + +header-list = "(" header-fld-name *(SP header-fld-name) ")" + +list = "LIST" SP mailbox SP list-mailbox + +list-mailbox = 1*list-char / string + +list-char = ATOM-CHAR / list-wildcards / resp-specials + +list-wildcards = "%" / "*" + +literal = "{" number "}" CRLF *CHAR8 + ; Number represents the number of CHAR8s + +login = "LOGIN" SP userid SP password + +lsub = "LSUB" SP mailbox SP list-mailbox + + + + + + + + + + + +Crispin Standards Track [Page 86] + +RFC 3501 IMAPv4 March 2003 + + +mailbox = "INBOX" / astring + ; INBOX is case-insensitive. All case variants of + ; INBOX (e.g., "iNbOx") MUST be interpreted as INBOX + ; not as an astring. An astring which consists of + ; the case-insensitive sequence "I" "N" "B" "O" "X" + ; is considered to be INBOX and not an astring. + ; Refer to section 5.1 for further + ; semantic details of mailbox names. 
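; The INBOX rule above can be sketched as a normalization step applied
; before mailbox names are compared. The function names are
; hypothetical, and the sketch assumes that every name other than
; INBOX itself is compared exactly as sent.

```python
def canonical_mailbox(name):
    """Fold every case variant of INBOX to "INBOX"; keep other names."""
    return "INBOX" if name.upper() == "INBOX" else name


def same_mailbox(left, right):
    """Compare two mailbox names after INBOX normalization."""
    return canonical_mailbox(left) == canonical_mailbox(right)
```

; Only the exact five-letter name is folded; a longer name that merely
; starts with those letters is left untouched.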
+ +mailbox-data = "FLAGS" SP flag-list / "LIST" SP mailbox-list / + "LSUB" SP mailbox-list / "SEARCH" *(SP nz-number) / + "STATUS" SP mailbox SP "(" [status-att-list] ")" / + number SP "EXISTS" / number SP "RECENT" + +mailbox-list = "(" [mbx-list-flags] ")" SP + (DQUOTE QUOTED-CHAR DQUOTE / nil) SP mailbox + +mbx-list-flags = *(mbx-list-oflag SP) mbx-list-sflag + *(SP mbx-list-oflag) / + mbx-list-oflag *(SP mbx-list-oflag) + +mbx-list-oflag = "\Noinferiors" / flag-extension + ; Other flags; multiple possible per LIST response + +mbx-list-sflag = "\Noselect" / "\Marked" / "\Unmarked" + ; Selectability flags; only one per LIST response + +media-basic = ((DQUOTE ("APPLICATION" / "AUDIO" / "IMAGE" / + "MESSAGE" / "VIDEO") DQUOTE) / string) SP + media-subtype + ; Defined in [MIME-IMT] + +media-message = DQUOTE "MESSAGE" DQUOTE SP DQUOTE "RFC822" DQUOTE + ; Defined in [MIME-IMT] + +media-subtype = string + ; Defined in [MIME-IMT] + +media-text = DQUOTE "TEXT" DQUOTE SP media-subtype + ; Defined in [MIME-IMT] + +message-data = nz-number SP ("EXPUNGE" / ("FETCH" SP msg-att)) + +msg-att = "(" (msg-att-dynamic / msg-att-static) + *(SP (msg-att-dynamic / msg-att-static)) ")" + +msg-att-dynamic = "FLAGS" SP "(" [flag-fetch *(SP flag-fetch)] ")" + ; MAY change for a message + + + +Crispin Standards Track [Page 87] + +RFC 3501 IMAPv4 March 2003 + + +msg-att-static = "ENVELOPE" SP envelope / "INTERNALDATE" SP date-time / + "RFC822" [".HEADER" / ".TEXT"] SP nstring / + "RFC822.SIZE" SP number / + "BODY" ["STRUCTURE"] SP body / + "BODY" section ["<" number ">"] SP nstring / + "UID" SP uniqueid + ; MUST NOT change for a message + +nil = "NIL" + +nstring = string / nil + +number = 1*DIGIT + ; Unsigned 32-bit integer + ; (0 <= n < 4,294,967,296) + +nz-number = digit-nz *DIGIT + ; Non-zero unsigned 32-bit integer + ; (0 < n < 4,294,967,296) + +password = astring + +quoted = DQUOTE *QUOTED-CHAR DQUOTE + +QUOTED-CHAR = <any TEXT-CHAR except quoted-specials> / + "\" quoted-specials + 
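; The quoted and QUOTED-CHAR rules amount to a small escaping scheme
; in which only the double quote and the backslash are escaped. The
; following sketch shows one way to encode and decode such strings.
; The function names are hypothetical, and values that cannot be sent
; as a quoted string (CR, LF, NUL, or 8-bit characters) are rejected
; here rather than converted to literals.

```python
def quote(value):
    """Encode a string in quoted form, escaping '"' and '\\'."""
    for ch in value:
        if ch in "\r\n" or not 0 < ord(ch) < 0x80:
            raise ValueError("value must be sent as a literal instead")
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def unquote(token):
    """Decode a quoted string, validating its escapes."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("not a quoted string")
    chars = []
    i, end = 1, len(token) - 1
    while i < end:
        ch = token[i]
        if ch == "\\":
            i += 1
            # Only the two quoted-specials may follow a backslash.
            if i >= end or token[i] not in '"\\':
                raise ValueError("invalid escape sequence")
            ch = token[i]
        elif ch == '"':
            raise ValueError("unescaped double quote")
        chars.append(ch)
        i += 1
    return "".join(chars)
```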
+quoted-specials = DQUOTE / "\" + +rename = "RENAME" SP mailbox SP mailbox + ; Use of INBOX as a destination gives a NO error + +response = *(continue-req / response-data) response-done + +response-data = "*" SP (resp-cond-state / resp-cond-bye / + mailbox-data / message-data / capability-data) CRLF + +response-done = response-tagged / response-fatal + +response-fatal = "*" SP resp-cond-bye CRLF + ; Server closes connection immediately + +response-tagged = tag SP resp-cond-state CRLF + +resp-cond-auth = ("OK" / "PREAUTH") SP resp-text + ; Authentication condition + +resp-cond-bye = "BYE" SP resp-text + +resp-cond-state = ("OK" / "NO" / "BAD") SP resp-text + ; Status condition + +resp-specials = "]" + +resp-text = ["[" resp-text-code "]" SP] text + +resp-text-code = "ALERT" / + "BADCHARSET" [SP "(" astring *(SP astring) ")" ] / + capability-data / "PARSE" / + "PERMANENTFLAGS" SP "(" + [flag-perm *(SP flag-perm)] ")" / + "READ-ONLY" / "READ-WRITE" / "TRYCREATE" / + "UIDNEXT" SP nz-number / "UIDVALIDITY" SP nz-number / + "UNSEEN" SP nz-number / + atom [SP 1*<any TEXT-CHAR except "]">] + +search = "SEARCH" [SP "CHARSET" SP astring] 1*(SP search-key) + ; CHARSET argument MUST be registered with IANA + +search-key = "ALL" / "ANSWERED" / "BCC" SP astring / + "BEFORE" SP date / "BODY" SP astring / + "CC" SP astring / "DELETED" / "FLAGGED" / + "FROM" SP astring / "KEYWORD" SP flag-keyword / + "NEW" / "OLD" / "ON" SP date / "RECENT" / "SEEN" / + "SINCE" SP date / "SUBJECT" SP astring / + "TEXT" SP astring / "TO" SP astring / + "UNANSWERED" / "UNDELETED" / "UNFLAGGED" / + "UNKEYWORD" SP flag-keyword / "UNSEEN" / + ; Above this line were in [IMAP2] + "DRAFT" / "HEADER" SP header-fld-name SP astring / + "LARGER" SP number / "NOT" SP search-key / + "OR" SP search-key SP search-key / + "SENTBEFORE" SP date / "SENTON" SP date / + "SENTSINCE" SP date / "SMALLER" SP number / + "UID" SP sequence-set / 
"UNDRAFT" / sequence-set / + "(" search-key *(SP search-key) ")" + +section = "[" [section-spec] "]" + +section-msgtext = "HEADER" / "HEADER.FIELDS" [".NOT"] SP header-list / + "TEXT" + ; top-level or MESSAGE/RFC822 part + +section-part = nz-number *("." nz-number) + ; body part nesting + + + +Crispin Standards Track [Page 89] + +RFC 3501 IMAPv4 March 2003 + + +section-spec = section-msgtext / (section-part ["." section-text]) + +section-text = section-msgtext / "MIME" + ; text other than actual body part (headers, etc.) + +select = "SELECT" SP mailbox + +seq-number = nz-number / "*" + ; message sequence number (COPY, FETCH, STORE + ; commands) or unique identifier (UID COPY, + ; UID FETCH, UID STORE commands). + ; * represents the largest number in use. In + ; the case of message sequence numbers, it is + ; the number of messages in a non-empty mailbox. + ; In the case of unique identifiers, it is the + ; unique identifier of the last message in the + ; mailbox or, if the mailbox is empty, the + ; mailbox's current UIDNEXT value. + ; The server should respond with a tagged BAD + ; response to a command that uses a message + ; sequence number greater than the number of + ; messages in the selected mailbox. This + ; includes "*" if the selected mailbox is empty. + +seq-range = seq-number ":" seq-number + ; two seq-number values and all values between + ; these two regardless of order. + ; Example: 2:4 and 4:2 are equivalent and indicate + ; values 2, 3, and 4. + ; Example: a unique identifier sequence range of + ; 3291:* includes the UID of the last message in + ; the mailbox, even if that value is less than 3291. + +sequence-set = (seq-number / seq-range) *("," sequence-set) + ; set of seq-number values, regardless of order. + ; Servers MAY coalesce overlaps and/or execute the + ; sequence in any order. 
+ ; Example: a message sequence number set of + ; 2,4:7,9,12:* for a mailbox with 15 messages is + ; equivalent to 2,4,5,6,7,9,12,13,14,15 + ; Example: a message sequence number set of *:4,5:7 + ; for a mailbox with 10 messages is equivalent to + ; 10,9,8,7,6,5,4,5,6,7 and MAY be reordered and + ; overlap coalesced to be 4,5,6,7,8,9,10. + +status = "STATUS" SP mailbox SP + "(" status-att *(SP status-att) ")" + +status-att = "MESSAGES" / "RECENT" / "UIDNEXT" / "UIDVALIDITY" / + "UNSEEN" + +status-att-list = status-att SP number *(SP status-att SP number) + +store = "STORE" SP sequence-set SP store-att-flags + +store-att-flags = (["+" / "-"] "FLAGS" [".SILENT"]) SP + (flag-list / (flag *(SP flag))) + +string = quoted / literal + +subscribe = "SUBSCRIBE" SP mailbox + +tag = 1*<any ASTRING-CHAR except "+"> + +text = 1*TEXT-CHAR + +TEXT-CHAR = <any CHAR except CR and LF> + +time = 2DIGIT ":" 2DIGIT ":" 2DIGIT + ; Hours minutes seconds + +uid = "UID" SP (copy / fetch / search / store) + ; Unique identifiers used instead of message + ; sequence numbers + +uniqueid = nz-number + ; Strictly ascending + +unsubscribe = "UNSUBSCRIBE" SP mailbox + +userid = astring + +x-command = "X" atom <experimental command arguments> + +zone = ("+" / "-") 4DIGIT + ; Signed four-digit value of hhmm representing + ; hours and minutes east of Greenwich (that is, + ; the amount that the given time differs from + ; Universal Time). Subtracting the timezone + ; from the given time will give the UT form. + ; The Universal Time zone is "+0000". + + +10. Author's Note + + This document is a revision or rewrite of earlier documents, and + supersedes the protocol specification in those documents: RFC 2060, + RFC 1730, unpublished IMAP2bis.TXT document, RFC 1176, and RFC 1064. + +11. 
Security Considerations + + IMAP4rev1 protocol transactions, including electronic mail data, are + sent in the clear over the network unless protection from snooping is + negotiated. This can be accomplished either by the use of STARTTLS, + negotiated privacy protection in the AUTHENTICATE command, or some + other protection mechanism. + +11.1. STARTTLS Security Considerations + + The specification of the STARTTLS command and LOGINDISABLED + capability in this document replaces that in [IMAP-TLS]. [IMAP-TLS] + remains normative for the PLAIN [SASL] authenticator. + + IMAP client and server implementations MUST implement the + TLS_RSA_WITH_RC4_128_MD5 [TLS] cipher suite, and SHOULD implement the + TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [TLS] cipher suite. This is + important as it assures that any two compliant implementations can be + configured to interoperate. All other cipher suites are OPTIONAL. + Note that this is a change from section 2.1 of [IMAP-TLS]. + + During the [TLS] negotiation, the client MUST check its understanding + of the server hostname against the server's identity as presented in + the server Certificate message, in order to prevent man-in-the-middle + attacks. If the match fails, the client SHOULD either ask for + explicit user confirmation, or terminate the connection and indicate + that the server's identity is suspect. Matching is performed + according to these rules: + + The client MUST use the server hostname it used to open the + connection as the value to compare against the server name + as expressed in the server certificate. The client MUST + NOT use any form of the server hostname derived from an + insecure remote source (e.g., insecure DNS lookup). CNAME + canonicalization is not done. + + If a subjectAltName extension of type dNSName is present in + the certificate, it SHOULD be used as the source of the + server's identity. + + Matching is case-insensitive. 
+ + A "*" wildcard character MAY be used as the left-most name + component in the certificate. For example, *.example.com + would match a.example.com, foo.example.com, etc. but would + not match example.com. + + If the certificate contains multiple names (e.g., more than + one dNSName field), then a match with any one of the fields + is considered acceptable. + + Both the client and server MUST check the result of the STARTTLS + command and subsequent [TLS] negotiation to see whether acceptable + authentication or privacy was achieved. + +11.2. Other Security Considerations + + A server error message for an AUTHENTICATE command which fails due to + invalid credentials SHOULD NOT detail why the credentials are + invalid. + + Use of the LOGIN command sends passwords in the clear. This can be + avoided by using the AUTHENTICATE command with a [SASL] mechanism + that does not use plaintext passwords, or by first negotiating + encryption via STARTTLS or some other protection mechanism. + + A server implementation MUST implement a configuration that, at the + time of authentication, requires: + (1) The STARTTLS command has been negotiated. + OR + (2) Some other mechanism that protects the session from password + snooping has been provided. + OR + (3) The following measures are in place: + (a) The LOGINDISABLED capability is advertised, and [SASL] + mechanisms (such as PLAIN) using plaintext passwords are NOT + advertised in the CAPABILITY list. + AND + (b) The LOGIN command returns an error even if the password is + correct. + AND + (c) The AUTHENTICATE command returns an error with all [SASL] + mechanisms that use plaintext passwords, even if the password + is correct. + + A server error message for a failing LOGIN command SHOULD NOT specify + that the user name, as opposed to the password, is invalid. 
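The certificate name matching rules listed in section 11.1 (case-insensitive comparison, a "*" wildcard as the left-most name component, and acceptance of any one matching name) can be sketched as follows. This is a minimal sketch and not a replacement for the verification done by a TLS library. The function names are hypothetical, and the sketch assumes that the wildcard stands for exactly one name component, which the text above does not state explicitly.

```python
def name_matches(hostname: str, pattern: str) -> bool:
    """Compare one certificate name against the hostname used to connect."""
    # Matching is case-insensitive.
    host_labels = hostname.lower().split(".")
    pat_labels = pattern.lower().split(".")
    if len(pat_labels) > 1 and pat_labels[0] == "*":
        # Assumption: "*" replaces exactly one non-empty left-most component,
        # so *.example.com matches a.example.com but not example.com.
        return (
            len(host_labels) == len(pat_labels)
            and host_labels[0] != ""
            and host_labels[1:] == pat_labels[1:]
        )
    return host_labels == pat_labels


def hostname_matches(hostname: str, cert_names: list[str]) -> bool:
    """A match with any one of the certificate names is acceptable."""
    return any(name_matches(hostname, name) for name in cert_names)
```

The hostname passed in should be the one the client used to open the connection, not a name obtained from an insecure lookup. If hostname_matches returns False, the client would either ask the user for explicit confirmation or terminate the connection.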
+ + A server SHOULD have mechanisms in place to limit or delay failed + AUTHENTICATE/LOGIN attempts. + + + +Crispin Standards Track [Page 93] + +RFC 3501 IMAPv4 March 2003 + + + Additional security considerations are discussed in the section + discussing the AUTHENTICATE and LOGIN commands. + +12. IANA Considerations + + IMAP4 capabilities are registered by publishing a standards track or + IESG approved experimental RFC. The registry is currently located + at: + + http://www.iana.org/assignments/imap4-capabilities + + As this specification revises the STARTTLS and LOGINDISABLED + extensions previously defined in [IMAP-TLS], the registry will be + updated accordingly. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Crispin Standards Track [Page 94] + +RFC 3501 IMAPv4 March 2003 + + +Appendices + +A. Normative References + + The following documents contain definitions or specifications that + are necessary to understand this document properly: + [ABNF] Crocker, D. and P. Overell, "Augmented BNF for + Syntax Specifications: ABNF", RFC 2234, + November 1997. + + [ANONYMOUS] Newman, C., "Anonymous SASL Mechanism", RFC + 2245, November 1997. + + [CHARSET] Freed, N. and J. Postel, "IANA Character Set + Registration Procedures", RFC 2978, October + 2000. + + [DIGEST-MD5] Leach, P. and C. Newman, "Using Digest + Authentication as a SASL Mechanism", RFC 2831, + May 2000. + + [DISPOSITION] Troost, R., Dorner, S. and K. Moore, + "Communicating Presentation Information in + Internet Messages: The Content-Disposition + Header", RFC 2183, August 1997. + + [IMAP-TLS] Newman, C., "Using TLS with IMAP, POP3 and + ACAP", RFC 2595, June 1999. + + [KEYWORDS] Bradner, S., "Key words for use in RFCs to + Indicate Requirement Levels", BCP 14, RFC 2119, + March 1997. + + [LANGUAGE-TAGS] Alvestrand, H., "Tags for the Identification of + Languages", BCP 47, RFC 3066, January 2001. + + [LOCATION] Palme, J., Hopmann, A. and N. 
Shelness, "MIME + Encapsulation of Aggregate Documents, such as + HTML (MHTML)", RFC 2557, March 1999. + + [MD5] Myers, J. and M. Rose, "The Content-MD5 Header + Field", RFC 1864, October 1995. + + + + + + + + + +Crispin Standards Track [Page 95] + +RFC 3501 IMAPv4 March 2003 + + + [MIME-HDRS] Moore, K., "MIME (Multipurpose Internet Mail + Extensions) Part Three: Message Header + Extensions for Non-ASCII Text", RFC 2047, + November 1996. + + [MIME-IMB] Freed, N. and N. Borenstein, "MIME + (Multipurpose Internet Mail Extensions) Part + One: Format of Internet Message Bodies", RFC + 2045, November 1996. + + [MIME-IMT] Freed, N. and N. Borenstein, "MIME + (Multipurpose Internet Mail Extensions) Part + Two: Media Types", RFC 2046, November 1996. + + [RFC-2822] Resnick, P., "Internet Message Format", RFC + 2822, April 2001. + + [SASL] Myers, J., "Simple Authentication and Security + Layer (SASL)", RFC 2222, October 1997. + + [TLS] Dierks, T. and C. Allen, "The TLS Protocol + Version 1.0", RFC 2246, January 1999. + + [UTF-7] Goldsmith, D. and M. Davis, "UTF-7: A Mail-Safe + Transformation Format of Unicode", RFC 2152, + May 1997. + + The following documents describe quality-of-implementation issues + that should be carefully considered when implementing this protocol: + + [IMAP-IMPLEMENTATION] Leiba, B., "IMAP Implementation + Recommendations", RFC 2683, September 1999. + + [IMAP-MULTIACCESS] Gahrns, M., "IMAP4 Multi-Accessed Mailbox + Practice", RFC 2180, July 1997. + +A.1 Informative References + + The following documents describe related protocols: + + [IMAP-DISC] Austein, R., "Synchronization Operations for + Disconnected IMAP4 Clients", Work in Progress. + + [IMAP-MODEL] Crispin, M., "Distributed Electronic Mail + Models in IMAP4", RFC 1733, December 1994. + + + + + + +Crispin Standards Track [Page 96] + +RFC 3501 IMAPv4 March 2003 + + + [ACAP] Newman, C. and J. Myers, "ACAP -- Application + Configuration Access Protocol", RFC 2244, + November 1997. 
+ + [SMTP] Klensin, J., "Simple Mail Transfer Protocol", + STD 10, RFC 2821, April 2001. + + The following documents are historical or describe historical aspects + of this protocol: + + [IMAP-COMPAT] Crispin, M., "IMAP4 Compatibility with + IMAP2bis", RFC 2061, December 1996. + + [IMAP-HISTORICAL] Crispin, M., "IMAP4 Compatibility with IMAP2 + and IMAP2bis", RFC 1732, December 1994. + + [IMAP-OBSOLETE] Crispin, M., "Internet Message Access Protocol + - Obsolete Syntax", RFC 2062, December 1996. + + [IMAP2] Crispin, M., "Interactive Mail Access Protocol + - Version 2", RFC 1176, August 1990. + + [RFC-822] Crocker, D., "Standard for the Format of ARPA + Internet Text Messages", STD 11, RFC 822, + August 1982. + + [RFC-821] Postel, J., "Simple Mail Transfer Protocol", + STD 10, RFC 821, August 1982. + +B. Changes from RFC 2060 + + 1) Clarify description of unique identifiers and their semantics. + + 2) Fix the SELECT description to clarify that UIDVALIDITY is required + in the SELECT and EXAMINE responses. + + 3) Added an example of a failing search. + + 4) Correct store-att-flags: "#flag" should be "1#flag". + + 5) Made search and section rules clearer. + + 6) Correct the STORE example. + + 7) Correct "BASE645" misspelling. + + 8) Remove extraneous close parenthesis in example of two-part message + with text and BASE64 attachment. + + + +Crispin Standards Track [Page 97] + +RFC 3501 IMAPv4 March 2003 + + + 9) Remove obsolete "MAILBOX" response from mailbox-data. + + 10) A spurious "<" in the rule for mailbox-data was removed. + + 11) Add CRLF to continue-req. + + 12) Specifically exclude "]" from the atom in resp-text-code. + + 13) Clarify that clients and servers should adhere strictly to the + protocol syntax. + + 14) Emphasize in 5.2 that EXISTS can not be used to shrink a mailbox. + + 15) Add NEWNAME to resp-text-code. + + 16) Clarify that the empty string, not NIL, is used as arguments to + LIST. 
+ + 17) Clarify that NIL can be returned as a hierarchy delimiter for the + empty string mailbox name argument if the mailbox namespace is flat. + + 18) Clarify that addr-mailbox and addr-name have RFC-2822 quoting + removed. + + 19) Update UTF-7 reference. + + 20) Fix example in 6.3.11. + + 21) Clarify that non-existent UIDs are ignored. + + 22) Update DISPOSITION reference. + + 23) Expand state diagram. + + 24) Clarify that partial fetch responses are only returned in + response to a partial fetch command. + + 25) Add UIDNEXT response code. Correct UIDVALIDITY definition + reference. + + 26) Further clarification of "can" vs. "MAY". + + 27) Reference RFC-2119. + + 28) Clarify that superfluous shifts are not permitted in modified + UTF-7. + + 29) Clarify that there are no implicit shifts in modified UTF-7. + + + +Crispin Standards Track [Page 98] + +RFC 3501 IMAPv4 March 2003 + + + 30) Clarify that "INBOX" in a mailbox name is always INBOX, even if + it is given as a string. + + 31) Add missing open parenthesis in media-basic grammar rule. + + 32) Correct attribute syntax in mailbox-data. + + 33) Add UIDNEXT to EXAMINE responses. + + 34) Clarify UNSEEN, PERMANENTFLAGS, UIDVALIDITY, and UIDNEXT + responses in SELECT and EXAMINE. They are required now, but weren't + in older versions. + + 35) Update references with RFC numbers. + + 36) Flush text-mime2. + + 37) Clarify that modified UTF-7 names must be case-sensitive and that + violating the convention should be avoided. + + 38) Correct UID FETCH example. + + 39) Clarify UID FETCH, UID STORE, and UID SEARCH vs. untagged EXPUNGE + responses. + + 40) Clarify the use of the word "convention". + + 41) Clarify that a command is not "in progress" until it has been + fully received (specifically, that a command is not "in progress" + during command continuation negotiation). + + 42) Clarify envelope defaulting. + + 43) Clarify that SP means one and only one space character. + + 44) Forbid silly states in LIST response. 
+ + 45) Clarify that the ENVELOPE, INTERNALDATE, RFC822*, BODY*, and UID + for a message is static. + + 46) Add BADCHARSET response code. + + 47) Update formal syntax to [ABNF] conventions. + + 48) Clarify trailing hierarchy delimiter in CREATE semantics. + + 49) Clarify that the "blank line" is the [RFC-2822] delimiting blank + line. + + + +Crispin Standards Track [Page 99] + +RFC 3501 IMAPv4 March 2003 + + + 50) Clarify that RENAME should also create hierarchy as needed for + the command to complete. + + 51) Fix body-ext-mpart to not require language if disposition + present. + + 52) Clarify the RFC822.HEADER response. + + 53) Correct missing space after charset astring in search. + + 54) Correct missing quote for BADCHARSET in resp-text-code. + + 55) Clarify that ALL, FAST, and FULL preclude any other data items + appearing. + + 56) Clarify semantics of reference argument in LIST. + + 57) Clarify that a null string for SEARCH HEADER X-FOO means any + message with a header line with a field-name of X-FOO regardless of + the text of the header. + + 58) Specifically reserve 8-bit mailbox names for future use as UTF-8. + + 59) It is not an error for the client to store a flag that is not in + the PERMANENTFLAGS list; however, the server will either ignore the + change or make the change in the session only. + + 60) Correct/clarify the text regarding superfluous shifts. + + 61) Correct typographic errors in the "Changes" section. + + 62) Clarify that STATUS must not be used to check for new messages in + the selected mailbox + + 63) Clarify LSUB behavior with "%" wildcard. + + 64) Change AUTHORIZATION to AUTHENTICATE in section 7.5. + + 65) Clarify description of multipart body type. + + 66) Clarify that STORE FLAGS does not affect \Recent. + + 67) Change "west" to "east" in description of timezone. + + 68) Clarify that commands which break command pipelining must wait + for a completion result response. + + 69) Clarify that EXAMINE does not affect \Recent. 
+ + + +Crispin Standards Track [Page 100] + +RFC 3501 IMAPv4 March 2003 + + + 70) Make description of MIME structure consistent. + + 71) Clarify that date searches disregard the time and timezone of the + INTERNALDATE or Date: header. In other words, "ON 13-APR-2000" means + messages with an INTERNALDATE text which starts with "13-APR-2000", + even if timezone differential from the local timezone is sufficient + to move that INTERNALDATE into the previous or next day. + + 72) Clarify that the header fetches don't add a blank line if one + isn't in the [RFC-2822] message. + + 73) Clarify (in discussion of UIDs) that messages are immutable. + + 74) Add an example of CHARSET searching. + + 75) Clarify in SEARCH that keywords are a type of flag. + + 76) Clarify the mandatory nature of the SELECT data responses. + + 77) Add optional CAPABILITY response code in the initial OK or + PREAUTH. + + 78) Add note that server can send an untagged CAPABILITY command as + part of the responses to AUTHENTICATE and LOGIN. + + 79) Remove statement about it being unnecessary to issue a CAPABILITY + command more than once in a connection. That statement is no longer + true. + + 80) Clarify that untagged EXPUNGE decrements the number of messages + in the mailbox. + + 81) Fix definition of "body" (concatenation has tighter binding than + alternation). + + 82) Add a new "Special Notes to Implementors" section with reference + to [IMAP-IMPLEMENTATION]. + + 83) Clarify that an untagged CAPABILITY response to an AUTHENTICATE + command should only be done if a security layer was not negotiated. + + 84) Change the definition of atom to exclude "]". Update astring to + include "]" for compatibility with the past. Remove resp-text-atom. + + 85) Remove NEWNAME. It can't work because mailbox names can be + literals and can include "]". Functionality can be addressed via + referrals. 
+ + + + +Crispin Standards Track [Page 101] + +RFC 3501 IMAPv4 March 2003 + + + 86) Move modified UTF-7 rationale in order to have more logical + paragraph flow. + + 87) Clarify UID uniqueness guarantees with the use of MUST. + + 88) Note that clients should read response data until the connection + is closed instead of immediately closing on a BYE. + + 89) Change RFC-822 references to RFC-2822. + + 90) Clarify that RFC-2822 should be followed instead of RFC-822. + + 91) Change recommendation of optional automatic capabilities in LOGIN + and AUTHENTICATE to use the CAPABILITY response code in the tagged + OK. This is more interoperable than an unsolicited untagged + CAPABILITY response. + + 92) STARTTLS and AUTH=PLAIN are mandatory to implement; add + recommendations for other [SASL] mechanisms. + + 93) Clarify that a "connection" (as opposed to "server" or "command") + is in one of the four states. + + 94) Clarify that a failed or rejected command does not change state. + + 95) Split references between normative and informative. + + 96) Discuss authentication failure issues in security section. + + 97) Clarify that a data item is not necessarily of only one data + type. + + 98) Clarify that sequence ranges are independent of order. + + 99) Change an example to clarify that superfluous shifts in + Modified-UTF7 can not be fixed just by omitting the shift. The + entire string must be recalculated. + + 100) Change Envelope Structure definition since [RFC-2822] uses + "envelope" to refer to the [SMTP] envelope and not the envelope data + that appears in the [RFC-2822] header. + + 101) Expand on RFC822.HEADER response data vs. BODY[HEADER]. + + 102) Clarify Logout state semantics, change ASCII art. + + 103) Security changes to comply with IESG requirements. + + + + +Crispin Standards Track [Page 102] + +RFC 3501 IMAPv4 March 2003 + + + 104) Add definition for body URI. + + 105) Break sequence range definition into three rules, with rewritten + descriptions for each. 
+ + 106) Move STARTTLS and LOGINDISABLED here from [IMAP-TLS]. + + 107) Add IANA Considerations section. + + 108) Clarify valid client assumptions for new message UIDs vs. + UIDNEXT. + + 109) Clarify that changes to permanentflags affect concurrent + sessions as well as subsequent sessions. + + 110) Clarify that authenticated state can be entered by the CLOSE + command. + + 111) Emphasize that SELECT and EXAMINE are the exceptions to the rule + that a failing command does not change state. + + 112) Clarify that newly-appended messages have the Recent flag set. + + 113) Clarify that newly-copied messages SHOULD have the Recent flag + set. + + 114) Clarify that UID commands always return the UID in FETCH + responses. + +C. Key Word Index + + +FLAGS <flag list> (store command data item) ............... 59 + +FLAGS.SILENT <flag list> (store command data item) ........ 59 + -FLAGS <flag list> (store command data item) ............... 59 + -FLAGS.SILENT <flag list> (store command data item) ........ 59 + ALERT (response code) ...................................... 64 + ALL (fetch item) ........................................... 55 + ALL (search key) ........................................... 50 + ANSWERED (search key) ...................................... 50 + APPEND (command) ........................................... 45 + AUTHENTICATE (command) ..................................... 27 + BAD (response) ............................................. 66 + BADCHARSET (response code) ................................. 64 + BCC <string> (search key) .................................. 51 + BEFORE <date> (search key) ................................. 51 + BODY (fetch item) .......................................... 55 + BODY (fetch result) ........................................ 73 + BODY <string> (search key) ................................. 
51 + + + +Crispin Standards Track [Page 103] + +RFC 3501 IMAPv4 March 2003 + + + BODY.PEEK[<section>]<<partial>> (fetch item) ............... 57 + BODYSTRUCTURE (fetch item) ................................. 57 + BODYSTRUCTURE (fetch result) ............................... 74 + BODY[<section>]<<origin octet>> (fetch result) ............. 74 + BODY[<section>]<<partial>> (fetch item) .................... 55 + BYE (response) ............................................. 67 + Body Structure (message attribute) ......................... 12 + CAPABILITY (command) ....................................... 24 + CAPABILITY (response code) ................................. 64 + CAPABILITY (response) ...................................... 68 + CC <string> (search key) ................................... 51 + CHECK (command) ............................................ 47 + CLOSE (command) ............................................ 48 + COPY (command) ............................................. 59 + CREATE (command) ........................................... 34 + DELETE (command) ........................................... 35 + DELETED (search key) ....................................... 51 + DRAFT (search key) ......................................... 51 + ENVELOPE (fetch item) ...................................... 57 + ENVELOPE (fetch result) .................................... 77 + EXAMINE (command) .......................................... 33 + EXISTS (response) .......................................... 71 + EXPUNGE (command) .......................................... 48 + EXPUNGE (response) ......................................... 72 + Envelope Structure (message attribute) ..................... 12 + FAST (fetch item) .......................................... 55 + FETCH (command) ............................................ 54 + FETCH (response) ........................................... 73 + FLAGGED (search key) ....................................... 
51 + FLAGS (fetch item) ......................................... 57 + FLAGS (fetch result) ....................................... 78 + FLAGS (response) ........................................... 71 + FLAGS <flag list> (store command data item) ................ 59 + FLAGS.SILENT <flag list> (store command data item) ......... 59 + FROM <string> (search key) ................................. 51 + FULL (fetch item) .......................................... 55 + Flags (message attribute) .................................. 11 + HEADER (part specifier) .................................... 55 + HEADER <field-name> <string> (search key) .................. 51 + HEADER.FIELDS <header-list> (part specifier) ............... 55 + HEADER.FIELDS.NOT <header-list> (part specifier) ........... 55 + INTERNALDATE (fetch item) .................................. 57 + INTERNALDATE (fetch result) ................................ 78 + Internal Date (message attribute) .......................... 12 + KEYWORD <flag> (search key) ................................ 51 + Keyword (type of flag) ..................................... 11 + LARGER <n> (search key) .................................... 51 + LIST (command) ............................................. 40 + + + +Crispin Standards Track [Page 104] + +RFC 3501 IMAPv4 March 2003 + + + LIST (response) ............................................ 69 + LOGIN (command) ............................................ 30 + LOGOUT (command) ........................................... 25 + LSUB (command) ............................................. 43 + LSUB (response) ............................................ 70 + MAY (specification requirement term) ....................... 4 + MESSAGES (status item) ..................................... 45 + MIME (part specifier) ...................................... 56 + MUST (specification requirement term) ...................... 4 + MUST NOT (specification requirement term) .................. 
4 + Message Sequence Number (message attribute) ................ 10 + NEW (search key) ........................................... 51 + NO (response) .............................................. 66 + NOOP (command) ............................................. 25 + NOT <search-key> (search key) .............................. 52 + OK (response) .............................................. 65 + OLD (search key) ........................................... 52 + ON <date> (search key) ..................................... 52 + OPTIONAL (specification requirement term) .................. 4 + OR <search-key1> <search-key2> (search key) ................ 52 + PARSE (response code) ...................................... 64 + PERMANENTFLAGS (response code) ............................. 64 + PREAUTH (response) ......................................... 67 + Permanent Flag (class of flag) ............................. 12 + READ-ONLY (response code) .................................. 65 + READ-WRITE (response code) ................................. 65 + RECENT (response) .......................................... 72 + RECENT (search key) ........................................ 52 + RECENT (status item) ....................................... 45 + RENAME (command) ........................................... 37 + REQUIRED (specification requirement term) .................. 4 + RFC822 (fetch item) ........................................ 57 + RFC822 (fetch result) ...................................... 78 + RFC822.HEADER (fetch item) ................................. 57 + RFC822.HEADER (fetch result) ............................... 78 + RFC822.SIZE (fetch item) ................................... 57 + RFC822.SIZE (fetch result) ................................. 78 + RFC822.TEXT (fetch item) ................................... 58 + RFC822.TEXT (fetch result) ................................. 79 + SEARCH (command) ........................................... 
49 + SEARCH (response) .......................................... 71 + SEEN (search key) .......................................... 52 + SELECT (command) ........................................... 31 + SENTBEFORE <date> (search key) ............................. 52 + SENTON <date> (search key) ................................. 52 + SENTSINCE <date> (search key) .............................. 52 + SHOULD (specification requirement term) .................... 4 + SHOULD NOT (specification requirement term) ................ 4 + + + +Crispin Standards Track [Page 105] + +RFC 3501 IMAPv4 March 2003 + + + SINCE <date> (search key) .................................. 52 + SMALLER <n> (search key) ................................... 52 + STARTTLS (command) ......................................... 27 + STATUS (command) ........................................... 44 + STATUS (response) .......................................... 70 + STORE (command) ............................................ 58 + SUBJECT <string> (search key) .............................. 53 + SUBSCRIBE (command) ........................................ 38 + Session Flag (class of flag) ............................... 12 + System Flag (type of flag) ................................. 11 + TEXT (part specifier) ...................................... 56 + TEXT <string> (search key) ................................. 53 + TO <string> (search key) ................................... 53 + TRYCREATE (response code) .................................. 65 + UID (command) .............................................. 60 + UID (fetch item) ........................................... 58 + UID (fetch result) ......................................... 79 + UID <sequence set> (search key) ............................ 53 + UIDNEXT (response code) .................................... 65 + UIDNEXT (status item) ...................................... 45 + UIDVALIDITY (response code) ................................ 
65 + UIDVALIDITY (status item) .................................. 45 + UNANSWERED (search key) .................................... 53 + UNDELETED (search key) ..................................... 53 + UNDRAFT (search key) ....................................... 53 + UNFLAGGED (search key) ..................................... 53 + UNKEYWORD <flag> (search key) .............................. 53 + UNSEEN (response code) ..................................... 65 + UNSEEN (search key) ........................................ 53 + UNSEEN (status item) ....................................... 45 + UNSUBSCRIBE (command) ...................................... 39 + Unique Identifier (UID) (message attribute) ................ 8 + X<atom> (command) .......................................... 62 + [RFC-2822] Size (message attribute) ........................ 12 + \Answered (system flag) .................................... 11 + \Deleted (system flag) ..................................... 11 + \Draft (system flag) ....................................... 11 + \Flagged (system flag) ..................................... 11 + \Marked (mailbox name attribute) ........................... 69 + \Noinferiors (mailbox name attribute) ...................... 69 + \Noselect (mailbox name attribute) ......................... 69 + \Recent (system flag) ...................................... 11 + \Seen (system flag) ........................................ 11 + \Unmarked (mailbox name attribute) ......................... 69 + + + + + + + +Crispin Standards Track [Page 106] + +RFC 3501 IMAPv4 March 2003 + + +Author's Address + + Mark R. 
Crispin + Networks and Distributed Computing + University of Washington + 4545 15th Avenue NE + Seattle, WA 98105-4527 + + Phone: (206) 543-5762 + + EMail: MRC@CAC.Washington.EDU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Crispin Standards Track [Page 107] + +RFC 3501 IMAPv4 March 2003 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2003). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. v This + document and the information contained herein is provided on an "AS + IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK + FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT + LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL + NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY + OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. 
+ + + + + + + + + + + + + + + + + + + + +Crispin Standards Track [Page 108] + diff --git a/rfc/rfc4616.txt b/rfc/rfc4616.txt @@ -0,0 +1,619 @@ + + + + + + +Network Working Group K. Zeilenga, Ed. +Request for Comments: 4616 OpenLDAP Foundation +Updates: 2595 August 2006 +Category: Standards Track + + + The PLAIN Simple Authentication and Security Layer (SASL) Mechanism + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2006). + +Abstract + + This document defines a simple clear-text user/password Simple + Authentication and Security Layer (SASL) mechanism called the PLAIN + mechanism. The PLAIN mechanism is intended to be used, in + combination with data confidentiality services provided by a lower + layer, in protocols that lack a simple password authentication + command. + + + + + + + + + + + + + + + + + + + + + + + +Zeilenga Standards Track [Page 1] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + +1. Introduction + + Clear-text, multiple-use passwords are simple, interoperate with + almost all existing operating system authentication databases, and + are useful for a smooth transition to a more secure password-based + authentication mechanism. The drawback is that they are unacceptable + for use over network connections where data confidentiality is not + ensured. + + This document defines the PLAIN Simple Authentication and Security + Layer ([SASL]) mechanism for use in protocols with no clear-text + login command (e.g., [ACAP] or [SMTP-AUTH]). This document updates + RFC 2595, replacing Section 6. Changes since RFC 2595 are detailed + in Appendix A. 
+ + The name associated with this mechanism is "PLAIN". + + The PLAIN SASL mechanism does not provide a security layer. + + The PLAIN mechanism should not be used without adequate data security + protection as this mechanism affords no integrity or confidentiality + protections itself. The mechanism is intended to be used with data + security protections provided by application-layer protocol, + generally through its use of Transport Layer Security ([TLS]) + services. + + By default, implementations SHOULD advertise and make use of the + PLAIN mechanism only when adequate data security services are in + place. Specifications for IETF protocols that indicate that this + mechanism is an applicable authentication mechanism MUST mandate that + implementations support an strong data security service, such as TLS. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [Keywords]. + +2. PLAIN SASL Mechanism + + The mechanism consists of a single message, a string of [UTF-8] + encoded [Unicode] characters, from the client to the server. The + client presents the authorization identity (identity to act as), + followed by a NUL (U+0000) character, followed by the authentication + identity (identity whose password will be used), followed by a NUL + (U+0000) character, followed by the clear-text password. As with + other SASL mechanisms, the client does not provide an authorization + identity when it wishes the server to derive an identity from the + credentials and use that as the authorization identity. + + + + +Zeilenga Standards Track [Page 2] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + + The formal grammar for the client message using Augmented BNF [ABNF] + follows. 
+ + message = [authzid] UTF8NUL authcid UTF8NUL passwd + authcid = 1*SAFE ; MUST accept up to 255 octets + authzid = 1*SAFE ; MUST accept up to 255 octets + passwd = 1*SAFE ; MUST accept up to 255 octets + UTF8NUL = %x00 ; UTF-8 encoded NUL character + + SAFE = UTF1 / UTF2 / UTF3 / UTF4 + ;; any UTF-8 encoded Unicode character except NUL + + UTF1 = %x01-7F ;; except NUL + UTF2 = %xC2-DF UTF0 + UTF3 = %xE0 %xA0-BF UTF0 / %xE1-EC 2(UTF0) / + %xED %x80-9F UTF0 / %xEE-EF 2(UTF0) + UTF4 = %xF0 %x90-BF 2(UTF0) / %xF1-F3 3(UTF0) / + %xF4 %x80-8F 2(UTF0) + UTF0 = %x80-BF + + The authorization identity (authzid), authentication identity + (authcid), password (passwd), and NUL character deliminators SHALL be + transferred as [UTF-8] encoded strings of [Unicode] characters. As + the NUL (U+0000) character is used as a deliminator, the NUL (U+0000) + character MUST NOT appear in authzid, authcid, or passwd productions. + + The form of the authzid production is specific to the application- + level protocol's SASL profile [SASL]. The authcid and passwd + productions are form-free. Use of non-visible characters or + characters that a user may be unable to enter on some keyboards is + discouraged. + + Servers MUST be capable of accepting authzid, authcid, and passwd + productions up to and including 255 octets. It is noted that the + UTF-8 encoding of a Unicode character may be as long as 4 octets. + + Upon receipt of the message, the server will verify the presented (in + the message) authentication identity (authcid) and password (passwd) + with the system authentication database, and it will verify that the + authentication credentials permit the client to act as the (presented + or derived) authorization identity (authzid). If both steps succeed, + the user is authenticated. + + The presented authentication identity and password strings, as well + as the database authentication identity and password strings, are to + be prepared before being used in the verification process. 
The + [SASLPrep] profile of the [StringPrep] algorithm is the RECOMMENDED + preparation algorithm. The SASLprep preparation algorithm is + + + +Zeilenga Standards Track [Page 3] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + + recommended to improve the likelihood that comparisons behave in an + expected manner. The SASLprep preparation algorithm is not mandatory + so as to allow the server to employ other preparation algorithms + (including none) when appropriate. For instance, use of a different + preparation algorithm may be necessary for the server to interoperate + with an external system. + + When preparing the presented strings using [SASLPrep], the presented + strings are to be treated as "query" strings (Section 7 of + [StringPrep]) and hence unassigned code points are allowed to appear + in their prepared output. When preparing the database strings using + [SASLPrep], the database strings are to be treated as "stored" + strings (Section 7 of [StringPrep]) and hence unassigned code points + are prohibited from appearing in their prepared output. + + Regardless of the preparation algorithm used, if the output of a + non-invertible function (e.g., hash) of the expected string is + stored, the string MUST be prepared before input to that function. + + Regardless of the preparation algorithm used, if preparation fails or + results in an empty string, verification SHALL fail. + + When no authorization identity is provided, the server derives an + authorization identity from the prepared representation of the + provided authentication identity string. This ensures that the + derivation of different representations of the authentication + identity produces the same authorization identity. + + The server MAY use the credentials to initialize any new + authentication database, such as one suitable for [CRAM-MD5] or + [DIGEST-MD5]. + +3. 
Pseudo-Code + + This section provides pseudo-code illustrating the verification + process (using hashed passwords and the SASLprep preparation + function) discussed above. This section is not definitive. + + boolean Verify(string authzid, string authcid, string passwd) { + string pAuthcid = SASLprep(authcid, true); # prepare authcid + string pPasswd = SASLprep(passwd, true); # prepare passwd + if (pAuthcid == NULL || pPasswd == NULL) { + return false; # preparation failed + } + if (pAuthcid == "" || pPasswd == "") { + return false; # empty prepared string + } + + + + +Zeilenga Standards Track [Page 4] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + + storedHash = FetchPasswordHash(pAuthcid); + if (storedHash == NULL || storedHash == "") { + return false; # error or unknown authcid + } + + if (!Compare(storedHash, Hash(pPasswd))) { + return false; # incorrect password + } + + if (authzid == NULL ) { + authzid = DeriveAuthzid(pAuthcid); + if (authzid == NULL || authzid == "") { + return false; # could not derive authzid + } + } + + if (!Authorize(pAuthcid, authzid)) { + return false; # not authorized + } + + return true; + } + + The second parameter of the SASLprep function, when true, indicates + that unassigned code points are allowed in the input. When the + SASLprep function is called to prepare the password prior to + computing the stored hash, the second parameter would be false. + + The second parameter provided to the Authorize function is not + prepared by this code. The application-level SASL profile should be + consulted to determine what, if any, preparation is necessary. + + Note that the DeriveAuthzid and Authorize functions (whether + implemented as one function or two, whether designed in a manner in + which these functions or whether the mechanism implementation can be + reused elsewhere) require knowledge and understanding of mechanism + and the application-level protocol specification and/or + implementation details to implement. 
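The message layout from the Section 2 grammar ([authzid] NUL authcid NUL passwd, all UTF-8) can be illustrated with a minimal Python sketch. The function names build_plain_message and parse_plain_message are hypothetical and are not part of RFC 4616. The sketch performs no SASLprep preparation and does not enforce the 255-octet guidance; it only shows the framing.

```python
from typing import Tuple


def build_plain_message(authcid: str, passwd: str, authzid: str = "") -> bytes:
    """Hypothetical helper: encode [authzid] NUL authcid NUL passwd as UTF-8."""
    for name, value in (("authzid", authzid), ("authcid", authcid), ("passwd", passwd)):
        # NUL is the delimiter, so it must not appear inside any field.
        if "\x00" in value:
            raise ValueError(name + " must not contain NUL")
    # The grammar requires at least one character in authcid and passwd.
    if not authcid or not passwd:
        raise ValueError("authcid and passwd must be non-empty")
    return b"\x00".join(s.encode("utf-8") for s in (authzid, authcid, passwd))


def parse_plain_message(message: bytes) -> Tuple[str, str, str]:
    """Hypothetical helper: split a client message into (authzid, authcid, passwd)."""
    parts = message.split(b"\x00")
    if len(parts) != 3:
        raise ValueError("expected exactly two NUL delimiters")
    # Strict UTF-8 decoding raises UnicodeDecodeError on malformed input.
    authzid, authcid, passwd = (p.decode("utf-8") for p in parts)
    if not authcid or not passwd:
        raise ValueError("authcid and passwd must be non-empty")
    return authzid, authcid, passwd
```

An empty authzid in the returned tuple corresponds to the case where the client asks the server to derive the authorization identity from the credentials.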
+ + Note that the Authorize function outcome is clearly dependent on + details of the local authorization model and policy. Both functions + may be dependent on other factors as well. + + + + + + + + + +Zeilenga Standards Track [Page 5] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + +4. Examples + + This section provides examples of PLAIN authentication exchanges. + The examples are intended to help the readers understand the above + text. The examples are not definitive. + + "C:" and "S:" indicate lines sent by the client and server, + respectively. "<NUL>" represents a single NUL (U+0000) character. + The Application Configuration Access Protocol ([ACAP]) is used in the + examples. + + The first example shows how the PLAIN mechanism might be used for + user authentication. + + S: * ACAP (SASL "CRAM-MD5") (STARTTLS) + C: a001 STARTTLS + S: a001 OK "Begin TLS negotiation now" + <TLS negotiation, further commands are under TLS layer> + S: * ACAP (SASL "CRAM-MD5" "PLAIN") + C: a002 AUTHENTICATE "PLAIN" + S: + "" + C: {21} + C: <NUL>tim<NUL>tanstaaftanstaaf + S: a002 OK "Authenticated" + + The second example shows how the PLAIN mechanism might be used to + attempt to assume the identity of another user. In this example, the + server rejects the request. Also, this example makes use of the + protocol optional initial response capability to eliminate a round- + trip. + + S: * ACAP (SASL "CRAM-MD5") (STARTTLS) + C: a001 STARTTLS + S: a001 OK "Begin TLS negotiation now" + <TLS negotiation, further commands are under TLS layer> + S: * ACAP (SASL "CRAM-MD5" "PLAIN") + C: a002 AUTHENTICATE "PLAIN" {20+} + C: Ursel<NUL>Kurt<NUL>xipj3plmq + S: a002 NO "Not authorized to requested authorization identity" + +5. Security Considerations + + As the PLAIN mechanism itself provided no integrity or + confidentiality protections, it should not be used without adequate + external data security protection, such as TLS services provided by + many application-layer protocols. 
By default, implementations SHOULD + NOT advertise and SHOULD NOT make use of the PLAIN mechanism unless + adequate data security services are in place. + + + +Zeilenga Standards Track [Page 6] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + + When the PLAIN mechanism is used, the server gains the ability to + impersonate the user to all services with the same password + regardless of any encryption provided by TLS or other confidentiality + protection mechanisms. Whereas many other authentication mechanisms + have similar weaknesses, stronger SASL mechanisms address this issue. + Clients are encouraged to have an operational mode where all + mechanisms that are likely to reveal the user's password to the + server are disabled. + + General [SASL] security considerations apply to this mechanism. + + Unicode, [UTF-8], and [StringPrep] security considerations also + apply. + +6. IANA Considerations + + The SASL Mechanism registry [IANA-SASL] entry for the PLAIN mechanism + has been updated by the IANA to reflect that this document now + provides its technical specification. + + To: iana@iana.org + Subject: Updated Registration of SASL mechanism PLAIN + + SASL mechanism name: PLAIN + Security considerations: See RFC 4616. + Published specification (optional, recommended): RFC 4616 + Person & email address to contact for further information: + Kurt Zeilenga <kurt@openldap.org> + IETF SASL WG <ietf-sasl@imc.org> + Intended usage: COMMON + Author/Change controller: IESG <iesg@ietf.org> + Note: Updates existing entry for PLAIN + +7. Acknowledgements + + This document is a revision of RFC 2595 by Chris Newman. Portions of + the grammar defined in Section 2 were borrowed from [UTF-8] by + Francois Yergeau. + + This document is a product of the IETF Simple Authentication and + Security Layer (SASL) Working Group. + + + + + + + + + + +Zeilenga Standards Track [Page 7] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + +8. Normative References + + [ABNF] Crocker, D., Ed. 
and P. Overell, "Augmented BNF for + Syntax Specifications: ABNF", RFC 4234, October 2005. + + [Keywords] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [SASL] Melnikov, A., Ed., and K. Zeilenga, Ed., "Simple + Authentication and Security Layer (SASL)", RFC 4422, + June 2006. + + [SASLPrep] Zeilenga, K., "SASLprep: Stringprep Profile for User + Names and Passwords", RFC 4013, February 2005. + + [StringPrep] Hoffman, P. and M. Blanchet, "Preparation of + Internationalized Strings ("stringprep")", RFC 3454, + December 2002. + + [Unicode] The Unicode Consortium, "The Unicode Standard, Version + 3.2.0" is defined by "The Unicode Standard, Version + 3.0" (Reading, MA, Addison-Wesley, 2000. ISBN 0-201- + 61633-5), as amended by the "Unicode Standard Annex + #27: Unicode 3.1" + (http://www.unicode.org/reports/tr27/) and by the + "Unicode Standard Annex #28: Unicode 3.2" + (http://www.unicode.org/reports/tr28/). + + [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO + 10646", STD 63, RFC 3629, November 2003. + + [TLS] Dierks, T. and E. Rescorla, "The Transport Layer + Security (TLS) Protocol Version 1.1", RFC 4346, April + 2006. + +9. Informative References + + [ACAP] Newman, C. and J. Myers, "ACAP -- Application + Configuration Access Protocol", RFC 2244, November + 1997. + + [CRAM-MD5] Nerenberg, L., Ed., "The CRAM-MD5 SASL Mechanism", Work + in Progress, June 2006. + + [DIGEST-MD5] Melnikov, A., Ed., "Using Digest Authentication as a + SASL Mechanism", Work in Progress, June 2006. + + + + + +Zeilenga Standards Track [Page 8] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + + [IANA-SASL] IANA, "SIMPLE AUTHENTICATION AND SECURITY LAYER (SASL) + MECHANISMS", + <http://www.iana.org/assignments/sasl-mechanisms>. + + [SMTP-AUTH] Myers, J., "SMTP Service Extension for Authentication", + RFC 2554, March 1999. 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Zeilenga Standards Track [Page 9] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + +Appendix A. Changes since RFC 2595 + + This appendix is non-normative. + + This document replaces Section 6 of RFC 2595. + + The specification details how the server is to compare client- + provided character strings with stored character strings. + + The ABNF grammar was updated. In particular, the grammar now allows + LINE FEED (U+000A) and CARRIAGE RETURN (U+000D) characters in the + authzid, authcid, passwd productions. However, whether these control + characters may be used depends on the string preparation rules + applicable to the production. For passwd and authcid productions, + control characters are prohibited. For authzid, one must consult the + application-level SASL profile. This change allows PLAIN to carry + all possible authorization identity strings allowed in SASL. + + Pseudo-code was added. + + The example section was expanded to illustrate more features of the + PLAIN mechanism. + +Editor's Address + + Kurt D. Zeilenga + OpenLDAP Foundation + + EMail: Kurt@OpenLDAP.org + + + + + + + + + + + + + + + + + + + + + + +Zeilenga Standards Track [Page 10] + +RFC 4616 The PLAIN SASL Mechanism August 2006 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2006). + + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors + retain all their rights. 
+ + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET + ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, + INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE + INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. + + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary + rights that may cover technology that may be required to implement + this standard. Please address the information to the IETF at + ietf-ipr@ietf.org. + +Acknowledgement + + Funding for the RFC Editor function is provided by the IETF + Administrative Support Activity (IASA). + + + + + + + +Zeilenga Standards Track [Page 11] + diff --git a/rfc/rfc5256.txt b/rfc/rfc5256.txt @@ -0,0 +1,1067 @@ + + + + + + +Network Working Group M. 
Crispin +Request for Comments: 5256 Panda Programming +Category: Standards Track K. Murchison + Carnegie Mellon University + June 2008 + + + Internet Message Access Protocol - SORT and THREAD Extensions + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + This document describes the base-level server-based sorting and + threading extensions to the IMAP protocol. These extensions provide + substantial performance improvements for IMAP clients that offer + sorted and threaded views. + +1. Introduction + + The SORT and THREAD extensions to the [IMAP] protocol provide a means + of server-based sorting and threading of messages, without requiring + that the client download the necessary data to do so itself. This is + particularly useful for online clients as described in [IMAP-MODELS]. + + A server that supports the base-level SORT extension indicates this + with a capability name which starts with "SORT". Future, upwards- + compatible extensions to the SORT extension will all start with + "SORT", indicating support for this base level. + + A server that supports the THREAD extension indicates this with one + or more capability names consisting of "THREAD=" followed by a + supported threading algorithm name as described in this document. + This provides for future upwards-compatible extensions. + + A server that implements the SORT and/or THREAD extensions MUST + collate strings in accordance with the requirements of I18NLEVEL=1, + as described in [IMAP-I18N], and SHOULD implement and advertise the + I18NLEVEL=1 extension. Alternatively, a server MAY implement + I18NLEVEL=2 (or higher) and comply with the rules of that level. 
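As a rough illustration of the capability naming rules above, a client might inspect an already-parsed capability list as follows. This is a sketch only: the function name sort_thread_support is hypothetical, and the input is assumed to be a list of capability name strings taken from a CAPABILITY response.

```python
def sort_thread_support(capabilities):
    """Hypothetical helper: return (supports_sort, thread_algorithms)."""
    # Capability names are compared case-insensitively here.
    caps = [c.upper() for c in capabilities]
    # Any capability name starting with "SORT" indicates base-level SORT.
    supports_sort = any(c.startswith("SORT") for c in caps)
    # Each "THREAD=<algorithm>" capability names one supported algorithm.
    prefix = "THREAD="
    thread_algorithms = [c[len(prefix):] for c in caps if c.startswith(prefix)]
    return supports_sort, thread_algorithms
```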
+ + + + + +Crispin & Murchison Standards Track [Page 1] + +RFC 5256 IMAP Sort June 2008 + + + Discussion: The SORT and THREAD extensions predate [IMAP-I18N] by + several years. At the time of this writing, all known server + implementations of SORT and THREAD comply with the rules of + I18NLEVEL=1, but do not necessarily advertise it. As discussed in + [IMAP-I18N] section 4.5, all server implementations should + eventually be updated to comply with the I18NLEVEL=2 extension. + + Historical note: The REFERENCES threading algorithm is based on the + [THREADING] algorithm written and used in "Netscape Mail and News" + versions 2.0 through 3.0. + +2. Terminology + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [KEYWORDS]. + + The word "can" (not "may") is used to refer to a possible + circumstance or situation, as opposed to an optional facility of the + protocol. + + "User" is used to refer to a human user, whereas "client" refers to + the software being run by the user. + + In examples, "C:" and "S:" indicate lines sent by the client and + server, respectively. + +2.1. Base Subject + + Subject sorting and threading use the "base subject", which has + specific subject artifacts removed. Due to the complexity of these + artifacts, the formal syntax for the subject extraction rules is + ambiguous. The following procedure is followed to determine the + "base subject", using the [ABNF] formal syntax rules described in + section 5: + + (1) Convert any RFC 2047 encoded-words in the subject to [UTF-8] + as described in "Internationalization Considerations". + Convert all tabs and continuations to space. Convert all + multiple spaces to a single space. + + (2) Remove all trailing text of the subject that matches the + subj-trailer ABNF; repeat until no more matches are possible. 
+ + (3) Remove all prefix text of the subject that matches the subj- + leader ABNF. + + + + + +Crispin & Murchison Standards Track [Page 2] + +RFC 5256 IMAP Sort June 2008 + + + (4) If there is prefix text of the subject that matches the subj- + blob ABNF, and removing that prefix leaves a non-empty subj- + base, then remove the prefix text. + + (5) Repeat (3) and (4) until no matches remain. + + Note: It is possible to defer step (2) until step (6), but this + requires checking for subj-trailer in step (4). + + (6) If the resulting text begins with the subj-fwd-hdr ABNF and + ends with the subj-fwd-trl ABNF, remove the subj-fwd-hdr and + subj-fwd-trl and repeat from step (2). + + (7) The resulting text is the "base subject" used in the SORT. + + All servers and disconnected (as described in [IMAP-MODELS]) clients + MUST use exactly this algorithm to determine the "base subject". + Otherwise, there is potential for a user to get inconsistent results + based on whether they are running in connected or disconnected mode. + +2.2. Sent Date + + As used in this document, the term "sent date" refers to the date and + time from the Date: header, adjusted by time zone to normalize to + UTC. For example, "31 Dec 2000 16:01:33 -0800" is equivalent to the + UTC date and time of "1 Jan 2001 00:01:33 +0000". + + If the time zone is invalid, the date and time SHOULD be treated as + UTC. If the time is also invalid, the time SHOULD be treated as + 00:00:00. If there is no valid date or time, the date and time + SHOULD be treated as 00:00:00 on the earliest possible date. + + This differs from the date-related criteria in the SEARCH command + (described in [IMAP] section 6.4.4), which use just the date and not + the time, and are not adjusted by time zone. + + If the sent date cannot be determined (a Date: header is missing or + cannot be parsed), the INTERNALDATE for that message is used as the + sent date. 
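The sent date rules in section 2.2 can be approximated in Python as below. This is a simplified sketch, not a conforming implementation: the function name sent_date is hypothetical, internal_date is assumed to be a timezone-aware datetime, and a Date: header that fails to parse as a whole falls straight back to INTERNALDATE, without the partial recovery (invalid zone treated as UTC, invalid time treated as 00:00:00) that the text above recommends.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def sent_date(date_header, internal_date):
    """Hypothetical helper: return the sent date normalized to UTC."""
    if date_header:
        try:
            parsed = parsedate_to_datetime(date_header)
        except (TypeError, ValueError):
            # Older Python versions raise TypeError, newer ones ValueError.
            parsed = None
        if parsed is not None:
            if parsed.tzinfo is None:
                # No usable zone information: treat the value as UTC.
                parsed = parsed.replace(tzinfo=timezone.utc)
            return parsed.astimezone(timezone.utc)
    # Missing or unparseable Date: header, so fall back to INTERNALDATE.
    return internal_date.astimezone(timezone.utc)
```

With the example from the text, "31 Dec 2000 16:01:33 -0800" normalizes to 1 Jan 2001 00:01:33 UTC.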
+ + When comparing two sent dates that match exactly, the order in which + the two messages appear in the mailbox (that is, by sequence number) + is used as a tie-breaker to determine the order. + + + + + + + + +Crispin & Murchison Standards Track [Page 3] + +RFC 5256 IMAP Sort June 2008 + + +3. Additional Commands + + These commands are extensions to the [IMAP] base protocol. + + The section headings are intended to correspond with where they would + be located in the main document if they were part of the base + specification. + +BASE.6.4.SORT. SORT Command + + Arguments: sort program + charset specification + searching criteria (one or more) + + Data: untagged responses: SORT + + Result: OK - sort completed + NO - sort error: can't sort that charset or + criteria + BAD - command unknown or arguments invalid + + The SORT command is a variant of SEARCH with sorting semantics for + the results. There are two arguments before the searching + criteria argument: a parenthesized list of sort criteria, and the + searching charset. + + The charset argument is mandatory (unlike SEARCH) and indicates + the [CHARSET] of the strings that appear in the searching + criteria. The US-ASCII and [UTF-8] charsets MUST be implemented. + All other charsets are optional. + + There is also a UID SORT command that returns unique identifiers + instead of message sequence numbers. Note that there are separate + searching criteria for message sequence numbers and UIDs; thus, + the arguments to UID SORT are interpreted the same as in SORT. + This is analogous to the behavior of UID SEARCH, as opposed to UID + COPY, UID FETCH, or UID STORE. + + The SORT command first searches the mailbox for messages that + match the given searching criteria using the charset argument for + the interpretation of strings in the searching criteria. It then + returns the matching messages in an untagged SORT response, sorted + according to one or more sort criteria. + + Sorting is in ascending order. 
Earlier dates sort before later + dates; smaller sizes sort before larger sizes; and strings are + sorted according to ascending values established by their + collation algorithm (see "Internationalization Considerations"). + + + +Crispin & Murchison Standards Track [Page 4] + +RFC 5256 IMAP Sort June 2008 + + + If two or more messages exactly match according to the sorting + criteria, these messages are sorted according to the order in + which they appear in the mailbox. In other words, there is an + implicit sort criterion of "sequence number". + + When multiple sort criteria are specified, the result is sorted in + the priority order that the criteria appear. For example, + (SUBJECT DATE) will sort messages in order by their base subject + text; and for messages with the same base subject text, it will + sort by their sent date. + + Untagged EXPUNGE responses are not permitted while the server is + responding to a SORT command, but are permitted during a UID SORT + command. + + The defined sort criteria are as follows. Refer to the Formal + Syntax section for the precise syntactic definitions of the + arguments. If the associated RFC-822 header for a particular + criterion is absent, it is treated as the empty string. The empty + string always collates before non-empty strings. + + ARRIVAL + Internal date and time of the message. This differs from the + ON criteria in SEARCH, which uses just the internal date. + + CC + [IMAP] addr-mailbox of the first "cc" address. + + DATE + Sent date and time, as described in section 2.2. + + FROM + [IMAP] addr-mailbox of the first "From" address. + + REVERSE + Followed by another sort criterion, has the effect of that + criterion but in reverse (descending) order. + Note: REVERSE only reverses a single criterion, and does not + affect the implicit "sequence number" sort criterion if all + other criteria are identical. Consequently, a sort of + REVERSE SUBJECT is not the same as a reverse ordering of a + SUBJECT sort. 
This can be avoided by use of additional + criteria, e.g., SUBJECT DATE vs. REVERSE SUBJECT REVERSE + DATE. In general, however, it's better (and faster, if the + client has a "reverse current ordering" command) to reverse + the results in the client instead of issuing a new SORT. + + + + + +Crispin & Murchison Standards Track [Page 5] + +RFC 5256 IMAP Sort June 2008 + + + SIZE + Size of the message in octets. + + SUBJECT + Base subject text. + + TO + [IMAP] addr-mailbox of the first "To" address. + + Example: C: A282 SORT (SUBJECT) UTF-8 SINCE 1-Feb-1994 + S: * SORT 2 84 882 + S: A282 OK SORT completed + C: A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL + S: * SORT 5 3 4 1 2 + S: A283 OK SORT completed + C: A284 SORT (SUBJECT) US-ASCII TEXT "not in mailbox" + S: * SORT + S: A284 OK SORT completed + +BASE.6.4.THREAD. THREAD Command + +Arguments: threading algorithm + charset specification + searching criteria (one or more) + +Data: untagged responses: THREAD + +Result: OK - thread completed + NO - thread error: can't thread that charset or + criteria + BAD - command unknown or arguments invalid + + The THREAD command is a variant of SEARCH with threading semantics + for the results. Thread has two arguments before the searching + criteria argument: a threading algorithm and the searching + charset. + + The charset argument is mandatory (unlike SEARCH) and indicates + the [CHARSET] of the strings that appear in the searching + criteria. The US-ASCII and [UTF-8] charsets MUST be implemented. + All other charsets are optional. + + There is also a UID THREAD command that returns unique identifiers + instead of message sequence numbers. Note that there are separate + searching criteria for message sequence numbers and UIDs; thus the + arguments to UID THREAD are interpreted the same as in THREAD. + This is analogous to the behavior of UID SEARCH, as opposed to UID + COPY, UID FETCH, or UID STORE. 
+ + + +Crispin & Murchison Standards Track [Page 6] + +RFC 5256 IMAP Sort June 2008 + + + The THREAD command first searches the mailbox for messages that + match the given searching criteria using the charset argument for + the interpretation of strings in the searching criteria. It then + returns the matching messages in an untagged THREAD response, + threaded according to the specified threading algorithm. + + All collation is in ascending order. Earlier dates collate before + later dates and strings are collated according to ascending values + established by their collation algorithm (see + "Internationalization Considerations"). + + Untagged EXPUNGE responses are not permitted while the server is + responding to a THREAD command, but are permitted during a UID + THREAD command. + + The defined threading algorithms are as follows: + + ORDEREDSUBJECT + + The ORDEREDSUBJECT threading algorithm is also referred to as + "poor man's threading". The searched messages are sorted by + base subject and then by the sent date. The messages are then + split into separate threads, with each thread containing + messages with the same base subject text. Finally, the threads + are sorted by the sent date of the first message in the thread. + + The top level or "root" in ORDEREDSUBJECT threading contains + the first message of every thread. All messages in the root + are siblings of each other. The second message of a thread is + the child of the first message, and subsequent messages of the + thread are siblings of the second message and hence children of + the message at the root. Hence, there are no grandchildren in + ORDEREDSUBJECT threading. + + Children in ORDEREDSUBJECT threading do not have descendents. + Client implementations SHOULD treat descendents of a child in a + server response as being siblings of that child. 
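Editorial sketch (not part of the RFC text): the ORDEREDSUBJECT description above can be illustrated roughly as follows. It assumes the base subject of every message has already been extracted, and it uses `str.casefold()` as a simplified stand-in for the i;unicode-casemap collation. The tuple layout and the names `ordered_subject` and `format_thread` are hypothetical.

```python
from itertools import groupby


def ordered_subject(messages):
    """Group messages into ORDEREDSUBJECT-style threads.

    messages: iterable of (seq, base_subject, sent_date) tuples, where
    sent_date is any mutually comparable value (assumed layout).
    """
    def subject_key(m):
        # Simplified stand-in for i;unicode-casemap collation.
        return m[1].casefold()

    # Sort by base subject, then sent date, with the sequence number
    # as the tie-breaker.
    ordered = sorted(messages, key=lambda m: (subject_key(m), m[2], m[0]))
    # Split into threads of messages sharing a base subject.
    threads = [list(group) for _, group in groupby(ordered, key=subject_key)]
    # Sort the threads by the sent date of their first message.
    threads.sort(key=lambda t: (t[0][2], t[0][0]))
    return threads


def format_thread(thread):
    """Render one thread: the second and later messages are all
    children of the first, so there are no grandchildren."""
    nums = [str(m[0]) for m in thread]
    if len(nums) <= 2:
        return "(" + " ".join(nums) + ")"
    return "(" + nums[0] + " " + "".join("(%s)" % n for n in nums[1:]) + ")"
```

For example, `"".join(format_thread(t) for t in ordered_subject(msgs))` gives the data portion of an untagged THREAD response in the shape used by the RFC's own examples, such as `(194 195)` for a two-message thread.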
+ + REFERENCES + + The REFERENCES threading algorithm threads the searched + messages by grouping them together in parent/child + relationships based on which messages are replies to others. + The parent/child relationships are built using two methods: + reconstructing a message's ancestry using the references + contained within it; and checking the original (not base) + subject of a message to see if it is a reply to (or forward of) + another message. + + + +Crispin & Murchison Standards Track [Page 7] + +RFC 5256 IMAP Sort June 2008 + + + Note: "Message ID" in the following description refers to a + normalized form of the msg-id in [RFC2822]. The actual text + in RFC 2822 may use quoting, resulting in multiple ways of + expressing the same Message ID. Implementations of the + REFERENCES threading algorithm MUST normalize any msg-id in + order to avoid false non-matches due to differences in + quoting. + + For example, the msg-id + <"01KF8JCEOCBS0045PS"@xxx.yyy.com> + and the msg-id + <01KF8JCEOCBS0045PS@xxx.yyy.com> + MUST be interpreted as being the same Message ID. + + The references used for reconstructing a message's ancestry are + found using the following rules: + + If a message contains a References header line, then use the + Message IDs in the References header line as the references. + + If a message does not contain a References header line, or + the References header line does not contain any valid + Message IDs, then use the first (if any) valid Message ID + found in the In-Reply-To header line as the only reference + (parent) for this message. + + Note: Although [RFC2822] permits multiple Message IDs in + the In-Reply-To header, in actual practice this + discipline has not been followed. For example, + In-Reply-To headers have been observed with message + addresses after the Message ID, and there are no good + heuristics for software to determine the difference. + This is not a problem with the References header, + however. 
+ + If a message does not contain an In-Reply-To header line, or + the In-Reply-To header line does not contain a valid Message + ID, then the message does not have any references (NIL). + + A message is considered to be a reply or forward if the base + subject extraction rules, applied to the original subject, + remove any of the following: a subj-refwd, a "(fwd)" subj- + trailer, or a subj-fwd-hdr and subj-fwd-trl. + + The REFERENCES algorithm is significantly more complex than + ORDEREDSUBJECT and consists of six main steps. These steps are + outlined in detail below. + + + + +Crispin & Murchison Standards Track [Page 8] + +RFC 5256 IMAP Sort June 2008 + + + (1) For each searched message: + + (A) Using the Message IDs in the message's references, link + the corresponding messages (those whose Message-ID + header line contains the given reference Message ID) + together as parent/child. Make the first reference the + parent of the second (and the second a child of the + first), the second the parent of the third (and the + third a child of the second), etc. The following rules + govern the creation of these links: + + If a message does not contain a Message-ID header + line, or the Message-ID header line does not + contain a valid Message ID, then assign a unique + Message ID to this message. + + If two or more messages have the same Message ID, + then only use that Message ID in the first (lowest + sequence number) message, and assign a unique + Message ID to each of the subsequent messages with + a duplicate of that Message ID. + + If no message can be found with a given Message ID, + create a dummy message with this ID. Use this + dummy message for all subsequent references to this + ID. + + If a message already has a parent, don't change the + existing link. This is done because the References + header line may have been truncated by a Mail User + Agent (MUA). 
As a result, there is no guarantee + that the messages corresponding to adjacent Message + IDs in the References header line are parent and + child. + + Do not create a parent/child link if creating that + link would introduce a loop. For example, before + making message A the parent of B, make sure that A + is not a descendent of B. + + Note: Message ID comparisons are case-sensitive. + + (B) Create a parent/child link between the last reference + (or NIL if there are no references) and the current + message. If the current message already has a parent, + it is probably the result of a truncated References + header line, so break the current parent/child link + before creating the new correct one. As in step 1.A, + + + +Crispin & Murchison Standards Track [Page 9] + +RFC 5256 IMAP Sort June 2008 + + + do not create the parent/child link if creating that + link would introduce a loop. Note that if this message + has no references, it will now have no parent. + + Note: The parent/child links created in steps 1.A + and 1.B MUST be kept consistent with one another at + ALL times. + + (2) Gather together all of the messages that have no parents + and make them all children (siblings of one another) of a + dummy parent (the "root"). These messages constitute the + first (head) message of the threads created thus far. + + (3) Prune dummy messages from the thread tree. Traverse each + thread under the root, and for each message: + + If it is a dummy message with NO children, delete it. + + If it is a dummy message with children, delete it, but + promote its children to the current level. In other + words, splice them in with the dummy's siblings. + + Do not promote the children if doing so would make them + children of the root, unless there is only one child. + + (4) Sort the messages under the root (top-level siblings only) + by sent date as described in section 2.2. 
In the case of a + dummy message, sort its children by sent date and then use + the first child for the top-level sort. + + (5) Gather together messages under the root that have the same + base subject text. + + (A) Create a table for associating base subjects with + messages, called the subject table. + + (B) Populate the subject table with one message per each + base subject. For each child of the root: + + (i) Find the subject of this thread, by using the + base subject from either the current message or + its first child if the current message is a + dummy. This is the thread subject. + + (ii) If the thread subject is empty, skip this + message. + + + + + +Crispin & Murchison Standards Track [Page 10] + +RFC 5256 IMAP Sort June 2008 + + + (iii) Look up the message associated with the thread + subject in the subject table. + + (iv) If there is no message in the subject table with + the thread subject, add the current message and + the thread subject to the subject table. + + Otherwise, if the message in the subject table is + not a dummy, AND either of the following criteria + are true: + + The current message is a dummy, OR + + The message in the subject table is a reply + or forward and the current message is not. + + then replace the message in the subject table + with the current message. + + (C) Merge threads with the same thread subject. For each + child of the root: + + (i) Find the message's thread subject as in step + 5.B.i above. + + (ii) If the thread subject is empty, skip this + message. + + (iii) Lookup the message associated with this thread + subject in the subject table. + + (iv) If the message in the subject table is the + current message, skip this message. 
+ + Otherwise, merge the current message with the one in + the subject table using the following rules: + + If both messages are dummies, append the current + message's children to the children of the message + in the subject table (the children of both messages + become siblings), and then delete the current + message. + + If the message in the subject table is a dummy and + the current message is not, make the current + message a child of the message in the subject table + (a sibling of its children). + + + + +Crispin & Murchison Standards Track [Page 11] + +RFC 5256 IMAP Sort June 2008 + + + If the current message is a reply or forward and + the message in the subject table is not, make the + current message a child of the message in the + subject table (a sibling of its children). + + Otherwise, create a new dummy message and make both + the current message and the message in the subject + table children of the dummy. Then replace the + message in the subject table with the dummy + message. + + Note: Subject comparisons are case-insensitive, + as described under "Internationalization + Considerations". + + (6) Traverse the messages under the root and sort each set of + siblings by sent date as described in section 2.2. + Traverse the messages in such a way that the "youngest" set + of siblings are sorted first, and the "oldest" set of + siblings are sorted last (grandchildren are sorted before + children, etc). In the case of a dummy message (which can + only occur with top-level siblings), use its first child + for sorting. 
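Editorial sketch (not part of the RFC text): steps (1) and (2) of the REFERENCES algorithm above could look roughly like this. It is deliberately partial: steps (3) to (6) (pruning, sorting, subject merging) are omitted, and Message IDs are assumed to be normalized already. The `Node` class, the input tuple layout, and the `<unique-N@invalid>` placeholder IDs are hypothetical. Registering all real messages before walking any references is a simplification of the per-message wording in the RFC.

```python
class Node:
    def __init__(self, msgid, seq=None):
        self.msgid = msgid
        self.seq = seq        # None marks a dummy message
        self.parent = None
        self.children = []


def would_loop(parent, child):
    """True if making `parent` the parent of `child` creates a loop,
    that is, if `parent` is `child` or a descendant of it."""
    node = parent
    while node is not None:
        if node is child:
            return True
        node = node.parent
    return False


def link(parent, child):
    """Link a parentless `child` under `parent`, refusing loops."""
    if would_loop(parent, child):
        return
    child.parent = parent
    parent.children.append(child)


def unlink(child):
    if child.parent is not None:
        child.parent.children.remove(child)
        child.parent = None


def build_links(messages):
    """messages: list of (seq, msgid, refs) in sequence-number order.
    Returns the parentless nodes, i.e. the children of the root."""
    table, pending = {}, []
    for seq, msgid, refs in messages:
        # A missing or duplicate Message ID gets a unique replacement.
        if not msgid or msgid in table:
            msgid = "<unique-%d@invalid>" % seq
        table[msgid] = Node(msgid, seq)
        pending.append((table[msgid], refs))
    for node, refs in pending:
        prev = None
        for ref in refs:
            ref_node = table.get(ref)   # dict lookup is case-sensitive
            if ref_node is None:
                ref_node = table[ref] = Node(ref)   # dummy message
            # Step 1.A: never change an existing parent link.
            if prev is not None and ref_node.parent is None:
                link(prev, ref_node)
            prev = ref_node
        # Step 1.B: the last reference becomes the parent.
        unlink(node)
        if prev is not None:
            link(prev, node)
    # Step 2: everything without a parent hangs off the root.
    return [n for n in table.values() if n.parent is None]
```

The result is a forest that may still contain dummy nodes; a complete implementation would go on to prune them and sort siblings by sent date as the remaining steps describe.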
+ + Example: C: A283 THREAD ORDEREDSUBJECT UTF-8 SINCE 5-MAR-2000 + S: * THREAD (166)(167)(168)(169)(172)(170)(171) + (173)(174 (175)(176)(178)(181)(180))(179)(177 + (183)(182)(188)(184)(185)(186)(187)(189))(190) + (191)(192)(193)(194 195)(196 (197)(198))(199) + (200 202)(201)(203)(204)(205)(206 207)(208) + S: A283 OK THREAD completed + C: A284 THREAD ORDEREDSUBJECT US-ASCII TEXT "gewp" + S: * THREAD + S: A284 OK THREAD completed + C: A285 THREAD REFERENCES UTF-8 SINCE 5-MAR-2000 + S: * THREAD (166)(167)(168)(169)(172)((170)(179)) + (171)(173)((174)(175)(176)(178)(181)(180)) + ((177)(183)(182)(188 (184)(189))(185 186)(187)) + (190)(191)(192)(193)((194)(195 196))(197 198) + (199)(200 202)(201)(203)(204)(205 206 207)(208) + S: A285 OK THREAD completed + + Note: The line breaks in the first and third server + responses are for editorial clarity and do not appear in + real THREAD responses. + + + + + + +Crispin & Murchison Standards Track [Page 12] + +RFC 5256 IMAP Sort June 2008 + + +4. Additional Responses + + These responses are extensions to the [IMAP] base protocol. + + The section headings of these responses are intended to correspond + with where they would be located in the main document. + +BASE.7.2.SORT. SORT Response + + Data: zero or more numbers + + The SORT response occurs as a result of a SORT or UID SORT + command. The number(s) refer to those messages that match the + search criteria. For SORT, these are message sequence numbers; + for UID SORT, these are unique identifiers. Each number is + delimited by a space. + + Example: S: * SORT 2 3 6 + +BASE.7.2.THREAD. THREAD Response + + Data: zero or more threads + + The THREAD response occurs as a result of a THREAD or UID THREAD + command. It contains zero or more threads. A thread consists of + a parenthesized list of thread members. + + Thread members consist of zero or more message numbers, delimited + by spaces, indicating successive parent and child. 
This continues + until the thread splits into multiple sub-threads, at which point, + the thread nests into multiple sub-threads with the first member + of each sub-thread being siblings at this level. There is no + limit to the nesting of threads. + + The messages numbers refer to those messages that match the search + criteria. For THREAD, these are message sequence numbers; for UID + THREAD, these are unique identifiers. + + Example: S: * THREAD (2)(3 6 (4 23)(44 7 96)) + + The first thread consists only of message 2. The second thread + consists of the messages 3 (parent) and 6 (child), after which it + splits into two sub-threads; the first of which contains messages + 4 (child of 6, sibling of 44) and 23 (child of 4), and the second + of which contains messages 44 (child of 6, sibling of 4), 7 (child + of 44), and 96 (child of 7). Since some later messages are + parents of earlier messages, the messages were probably moved from + some other mailbox at different times. + + + +Crispin & Murchison Standards Track [Page 13] + +RFC 5256 IMAP Sort June 2008 + + + -- 2 + + -- 3 + \-- 6 + |-- 4 + | \-- 23 + | + \-- 44 + \-- 7 + \-- 96 + + Example: S: * THREAD ((3)(5)) + + In this example, 3 and 5 are siblings of a parent that does not + match the search criteria (and/or does not exist in the mailbox); + however they are members of the same thread. + +5. Formal Syntax of SORT and THREAD Commands and Responses + + The following syntax specification uses the Augmented Backus-Naur + Form (ABNF) notation as specified in [ABNF]. It also uses [ABNF] + rules defined in [IMAP]. 
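Editorial sketch (not part of the RFC text): the THREAD response data described in section 4, and formalized by the thread-list grammar in this section, can be read with a small recursive parser along these lines. It assumes the input is only the data after `* THREAD`, does no error recovery (malformed input raises an exception), and returns nested `(number, children)` tuples. The function name and the tuple shape are hypothetical.

```python
import re

TOKEN = re.compile(r"\(|\)|\d+")


def parse_threads(data):
    """Parse THREAD response data into a list of (number, children)
    trees. A parent that is absent from the response is None."""
    tokens = TOKEN.findall(data)
    pos = 0

    def parse_list():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        # Leading numbers form a chain of successive parent and child.
        numbers = []
        while tokens[pos].isdigit():
            numbers.append(int(tokens[pos]))
            pos += 1
        # Nested lists are sibling sub-threads under the last number.
        branches = []
        while tokens[pos] == "(":
            branches.append(parse_list())
        if tokens[pos] != ")":
            raise ValueError("expected ')'")
        pos += 1
        if not numbers:
            return (None, branches)
        node = (numbers[-1], branches)
        for n in reversed(numbers[:-1]):
            node = (n, [node])
        return node

    threads = []
    while pos < len(tokens):
        threads.append(parse_list())
    return threads
```

Applied to the RFC's example `(2)(3 6 (4 23)(44 7 96))`, this yields one tree for message 2 and one rooted at 3 whose child 6 has the two children 4 and 44, matching the diagram in section 4.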
+ +sort = ["UID" SP] "SORT" SP sort-criteria SP search-criteria + +sort-criteria = "(" sort-criterion *(SP sort-criterion) ")" + +sort-criterion = ["REVERSE" SP] sort-key + +sort-key = "ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" / + "SUBJECT" / "TO" + +thread = ["UID" SP] "THREAD" SP thread-alg SP search-criteria + +thread-alg = "ORDEREDSUBJECT" / "REFERENCES" / thread-alg-ext + +thread-alg-ext = atom + ; New algorithms MUST be registered with IANA + +search-criteria = charset 1*(SP search-key) + +charset = atom / quoted + ; CHARSET values MUST be registered with IANA + +sort-data = "SORT" *(SP nz-number) + +thread-data = "THREAD" [SP 1*thread-list] + + + + +Crispin & Murchison Standards Track [Page 14] + +RFC 5256 IMAP Sort June 2008 + + +thread-list = "(" (thread-members / thread-nested) ")" + +thread-members = nz-number *(SP nz-number) [SP thread-nested] + +thread-nested = 2*thread-list + + The following syntax describes base subject extraction rules (2)-(6): + +subject = *subj-leader [subj-middle] *subj-trailer + +subj-refwd = ("re" / ("fw" ["d"])) *WSP [subj-blob] ":" + +subj-blob = "[" *BLOBCHAR "]" *WSP + +subj-fwd = subj-fwd-hdr subject subj-fwd-trl + +subj-fwd-hdr = "[fwd:" + +subj-fwd-trl = "]" + +subj-leader = (*subj-blob subj-refwd) / WSP + +subj-middle = *subj-blob (subj-base / subj-fwd) + ; last subj-blob is subj-base if subj-base would + ; otherwise be empty + +subj-trailer = "(fwd)" / WSP + +subj-base = NONWSP *(*WSP NONWSP) + ; can be a subj-blob + +BLOBCHAR = %x01-5a / %x5c / %x5e-ff + ; any CHAR8 except '[' and ']'. + ; SHOULD comply with [UTF-8] + +NONWSP = %x01-08 / %x0a-1f / %x21-ff + ; any CHAR8 other than WSP. + ; SHOULD comply with [UTF-8] + +6. Security Considerations + + The SORT and THREAD extensions do not raise any security + considerations that are not present in the base [IMAP] protocol, and + these issues are discussed in [IMAP]. 
Nevertheless, it is important + to remember that [IMAP] protocol transactions, including message + data, are sent in the clear over the network unless protection from + snooping is negotiated, either by the use of STARTTLS, privacy + protection in AUTHENTICATE, or some other protection mechanism. + + + +Crispin & Murchison Standards Track [Page 15] + +RFC 5256 IMAP Sort June 2008 + + + Although not a security consideration, it is important to recognize + that sorting by REFERENCES can lead to misleading threading trees. + For example, a message with false References: header data will cause + a thread to be incorporated into another thread. + + The process of extracting the base subject may lead to incorrect + collation if the extracted data was significant text as opposed to a + subject artifact. + +7. Internationalization Considerations + + As stated in the introduction, the rules of I18NLEVEL=1 as described + in [IMAP-I18N] MUST be followed; that is, the SORT and THREAD + extensions MUST collate strings according to the i;unicode-casemap + collation described in [UNICASEMAP]. Servers SHOULD also advertise + the I18NLEVEL=1 extension. Alternatively, a server MAY implement + I18NLEVEL=2 (or higher) and comply with the rules of that level. + + As discussed in [IMAP-I18N] section 4.5, all server implementations + should eventually be updated to support the [IMAP-I18N] I18NLEVEL=2 + extension. + + Translations of the "re" or "fw"/"fwd" tokens are not specified for + removal in the base subject extraction process. An attempt to add + such translated tokens would result in a geometrically complex, and + ultimately unimplementable, task. + + Instead, note that [RFC2822] section 3.6.5 recommends that "re:" + (from the Latin "res", meaning "in the matter of") be used to + identify a reply. 
Although it is evident that, from the multiple + forms of token to identify a forwarded message, there is considerable + variation found in the wild, the variations are (still) manageable. + Consequently, it is suggested that "re:" and one of the variations of + the tokens for a forward supported by the base subject extraction + rules be adopted for Internet mail messages, since doing so makes it + a simple display-time task to localize the token language for the + user. + +8. IANA Considerations + + [IMAP] capabilities are registered by publishing a standards track or + IESG-approved experimental RFC. This document constitutes + registration of the SORT and THREAD capabilities in the [IMAP] + capabilities registry. + + + + + + + +Crispin & Murchison Standards Track [Page 16] + +RFC 5256 IMAP Sort June 2008 + + + This document creates a new [IMAP] threading algorithms registry, + which registers threading algorithms by publishing a standards track + or IESG-approved experimental RFC. This document constitutes + registration of the ORDEREDSUBJECT and REFERENCES algorithms in that + registry. + +9. Normative References + + [ABNF] Crocker, D., Ed., and P. Overell, "Augmented BNF for + Syntax Specifications: ABNF", STD 68, RFC 5234, January + 2008. + + [CHARSET] Freed, N. and J. Postel, "IANA Charset Registration + Procedures", BCP 19, RFC 2978, October 2000. + + [IMAP] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - + VERSION 4rev1", RFC 3501, March 2003. + + [IMAP-I18N] Newman, C., Gulbrandsen, A., and A. Melnikov, "Internet + Message Access Protocol Internationalization", RFC + 5255, June 2008. + + [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC2822] Resnick, P., Ed., "Internet Message Format", RFC 2822, + April 2001. + + [UNICASEMAP] Crispin, M., "i;unicode-casemap - Simple Unicode + Collation Algorithm", RFC 5051, October 2007. 
+ + [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO + 10646", STD 63, RFC 3629, November 2003. + +10. Informative References + + [IMAP-MODELS] Crispin, M., "Distributed Electronic Mail Models in + IMAP4", RFC 1733, December 1994. + + [THREADING] Zawinski, J. "Message Threading", + http://www.jwz.org/doc/threading.html, 1997-2002. + + + + + + + + + + +Crispin & Murchison Standards Track [Page 17] + +RFC 5256 IMAP Sort June 2008 + + +Authors' Addresses + + Mark R. Crispin + Panda Programming + 6158 NE Lariat Loop + Bainbridge Island, WA 98110-2098 + + Phone: +1 (206) 842-2385 + EMail: IMAP+SORT+THREAD@Lingling.Panda.COM + + + Kenneth Murchison + Carnegie Mellon University + 5000 Forbes Avenue + Cyert Hall 285 + Pittsburgh, PA 15213 + + Phone: +1 (412) 268-2638 + EMail: murch@andrew.cmu.edu + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Crispin & Murchison Standards Track [Page 18] + +RFC 5256 IMAP Sort June 2008 + + +Full Copyright Statement + + Copyright (C) The IETF Trust (2008). + + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors + retain all their rights. + + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND + THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS + OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF + THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
+ +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. + + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary + rights that may cover technology that may be required to implement + this standard. Please address the information to the IETF at + ietf-ipr@ietf.org. + + + + + + + + + + + + +Crispin & Murchison Standards Track [Page 19] + diff --git a/rfc/rfc5322.txt b/rfc/rfc5322.txt @@ -0,0 +1,3195 @@ + + + + + + +Network Working Group P. Resnick, Ed. +Request for Comments: 5322 Qualcomm Incorporated +Obsoletes: 2822 October 2008 +Updates: 4021 +Category: Standards Track + + + Internet Message Format + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. 
+ +Abstract + + This document specifies the Internet Message Format (IMF), a syntax + for text messages that are sent between computer users, within the + framework of "electronic mail" messages. This specification is a + revision of Request For Comments (RFC) 2822, which itself superseded + Request For Comments (RFC) 822, "Standard for the Format of ARPA + Internet Text Messages", updating it to reflect current practice and + incorporating incremental changes that were specified in other RFCs. + + + + + + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 1] + +RFC 5322 Internet Message Format October 2008 + + +Table of Contents + + 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 1.2. Notational Conventions . . . . . . . . . . . . . . . . . . 5 + 1.2.1. Requirements Notation . . . . . . . . . . . . . . . . 5 + 1.2.2. Syntactic Notation . . . . . . . . . . . . . . . . . . 5 + 1.2.3. Structure of This Document . . . . . . . . . . . . . . 5 + 2. Lexical Analysis of Messages . . . . . . . . . . . . . . . . . 6 + 2.1. General Description . . . . . . . . . . . . . . . . . . . 6 + 2.1.1. Line Length Limits . . . . . . . . . . . . . . . . . . 7 + 2.2. Header Fields . . . . . . . . . . . . . . . . . . . . . . 8 + 2.2.1. Unstructured Header Field Bodies . . . . . . . . . . . 8 + 2.2.2. Structured Header Field Bodies . . . . . . . . . . . . 8 + 2.2.3. Long Header Fields . . . . . . . . . . . . . . . . . . 8 + 2.3. Body . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 + 3. Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 + 3.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 10 + 3.2. Lexical Tokens . . . . . . . . . . . . . . . . . . . . . . 10 + 3.2.1. Quoted characters . . . . . . . . . . . . . . . . . . 10 + 3.2.2. Folding White Space and Comments . . . . . . . . . . . 11 + 3.2.3. Atom . . . . . . . . . . . . . . . . . . . . . . . . 
. 12 + 3.2.4. Quoted Strings . . . . . . . . . . . . . . . . . . . . 13 + 3.2.5. Miscellaneous Tokens . . . . . . . . . . . . . . . . . 14 + 3.3. Date and Time Specification . . . . . . . . . . . . . . . 14 + 3.4. Address Specification . . . . . . . . . . . . . . . . . . 16 + 3.4.1. Addr-Spec Specification . . . . . . . . . . . . . . . 17 + 3.5. Overall Message Syntax . . . . . . . . . . . . . . . . . . 18 + 3.6. Field Definitions . . . . . . . . . . . . . . . . . . . . 19 + 3.6.1. The Origination Date Field . . . . . . . . . . . . . . 22 + 3.6.2. Originator Fields . . . . . . . . . . . . . . . . . . 22 + 3.6.3. Destination Address Fields . . . . . . . . . . . . . . 23 + 3.6.4. Identification Fields . . . . . . . . . . . . . . . . 25 + 3.6.5. Informational Fields . . . . . . . . . . . . . . . . . 27 + 3.6.6. Resent Fields . . . . . . . . . . . . . . . . . . . . 28 + 3.6.7. Trace Fields . . . . . . . . . . . . . . . . . . . . . 30 + 3.6.8. Optional Fields . . . . . . . . . . . . . . . . . . . 30 + 4. Obsolete Syntax . . . . . . . . . . . . . . . . . . . . . . . 31 + 4.1. Miscellaneous Obsolete Tokens . . . . . . . . . . . . . . 32 + 4.2. Obsolete Folding White Space . . . . . . . . . . . . . . . 33 + 4.3. Obsolete Date and Time . . . . . . . . . . . . . . . . . . 33 + 4.4. Obsolete Addressing . . . . . . . . . . . . . . . . . . . 35 + 4.5. Obsolete Header Fields . . . . . . . . . . . . . . . . . . 35 + 4.5.1. Obsolete Origination Date Field . . . . . . . . . . . 36 + 4.5.2. Obsolete Originator Fields . . . . . . . . . . . . . . 36 + 4.5.3. Obsolete Destination Address Fields . . . . . . . . . 37 + 4.5.4. Obsolete Identification Fields . . . . . . . . . . . . 37 + 4.5.5. Obsolete Informational Fields . . . . . . . . . . . . 37 + + + +Resnick Standards Track [Page 2] + +RFC 5322 Internet Message Format October 2008 + + + 4.5.6. Obsolete Resent Fields . . . . . . . . . . . . . . . . 38 + 4.5.7. Obsolete Trace Fields . . . . . . . . . . . . . . . . 38 + 4.5.8. 
Obsolete optional fields . . . . . . . . . . . . . . . 38 + 5. Security Considerations . . . . . . . . . . . . . . . . . . . 38 + 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39 + Appendix A. Example Messages . . . . . . . . . . . . . . . . . 43 + Appendix A.1. Addressing Examples . . . . . . . . . . . . . . . 44 + Appendix A.1.1. A Message from One Person to Another with + Simple Addressing . . . . . . . . . . . . . . . . 44 + Appendix A.1.2. Different Types of Mailboxes . . . . . . . . . . . 45 + Appendix A.1.3. Group Addresses . . . . . . . . . . . . . . . . . 45 + Appendix A.2. Reply Messages . . . . . . . . . . . . . . . . . . 46 + Appendix A.3. Resent Messages . . . . . . . . . . . . . . . . . 47 + Appendix A.4. Messages with Trace Fields . . . . . . . . . . . . 48 + Appendix A.5. White Space, Comments, and Other Oddities . . . . 49 + Appendix A.6. Obsoleted Forms . . . . . . . . . . . . . . . . . 50 + Appendix A.6.1. Obsolete Addressing . . . . . . . . . . . . . . . 50 + Appendix A.6.2. Obsolete Dates . . . . . . . . . . . . . . . . . . 50 + Appendix A.6.3. Obsolete White Space and Comments . . . . . . . . 51 + Appendix B. Differences from Earlier Specifications . . . . . 52 + Appendix C. Acknowledgements . . . . . . . . . . . . . . . . . 53 + 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 55 + 7.1. Normative References . . . . . . . . . . . . . . . . . . . 55 + 7.2. Informative References . . . . . . . . . . . . . . . . . . 55 + + + + + + + + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 3] + +RFC 5322 Internet Message Format October 2008 + + +1. Introduction + +1.1. Scope + + This document specifies the Internet Message Format (IMF), a syntax + for text messages that are sent between computer users, within the + framework of "electronic mail" messages. 
This specification is an + update to [RFC2822], which itself superseded [RFC0822], updating it + to reflect current practice and incorporating incremental changes + that were specified in other RFCs such as [RFC1123]. + + This document specifies a syntax only for text messages. In + particular, it makes no provision for the transmission of images, + audio, or other sorts of structured data in electronic mail messages. + There are several extensions published, such as the MIME document + series ([RFC2045], [RFC2046], [RFC2049]), which describe mechanisms + for the transmission of such data through electronic mail, either by + extending the syntax provided here or by structuring such messages to + conform to this syntax. Those mechanisms are outside of the scope of + this specification. + + In the context of electronic mail, messages are viewed as having an + envelope and contents. The envelope contains whatever information is + needed to accomplish transmission and delivery. (See [RFC5321] for a + discussion of the envelope.) The contents comprise the object to be + delivered to the recipient. This specification applies only to the + format and some of the semantics of message contents. It contains no + specification of the information in the envelope. + + However, some message systems may use information from the contents + to create the envelope. It is intended that this specification + facilitate the acquisition of such information by programs. + + This specification is intended as a definition of what message + content format is to be passed between systems. Though some message + systems locally store messages in this format (which eliminates the + need for translation between formats) and others use formats that + differ from the one specified in this specification, local storage is + outside of the scope of this specification. 
+ + Note: This specification is not intended to dictate the internal + formats used by sites, the specific message system features that + they are expected to support, or any of the characteristics of + user interface programs that create or read messages. In + addition, this document does not specify an encoding of the + characters for either transport or storage; that is, it does not + specify the number of bits used or how those bits are specifically + transferred over the wire or stored on disk. + + + +Resnick Standards Track [Page 4] + +RFC 5322 Internet Message Format October 2008 + + +1.2. Notational Conventions + +1.2.1. Requirements Notation + + This document occasionally uses terms that appear in capital letters. + When the terms "MUST", "SHOULD", "RECOMMENDED", "MUST NOT", "SHOULD + NOT", and "MAY" appear capitalized, they are being used to indicate + particular requirements of this specification. A discussion of the + meanings of these terms appears in [RFC2119]. + +1.2.2. Syntactic Notation + + This specification uses the Augmented Backus-Naur Form (ABNF) + [RFC5234] notation for the formal definitions of the syntax of + messages. Characters will be specified either by a decimal value + (e.g., the value %d65 for uppercase A and %d97 for lowercase A) or by + a case-insensitive literal value enclosed in quotation marks (e.g., + "A" for either uppercase or lowercase A). + +1.2.3. Structure of This Document + + This document is divided into several sections. + + This section, section 1, is a short introduction to the document. + + Section 2 lays out the general description of a message and its + constituent parts. This is an overview to help the reader understand + some of the general principles used in the later portions of this + document. Any examples in this section MUST NOT be taken as + specification of the formal syntax of any part of a message. 
+ + Section 3 specifies formal ABNF rules for the structure of each part + of a message (the syntax) and describes the relationship between + those parts and their meaning in the context of a message (the + semantics). That is, it lays out the actual rules for the structure + of each part of a message (the syntax) as well as a description of + the parts and instructions for their interpretation (the semantics). + This includes analysis of the syntax and semantics of subparts of + messages that have specific structure. The syntax included in + section 3 represents messages as they MUST be created. There are + also notes in section 3 to indicate if any of the options specified + in the syntax SHOULD be used over any of the others. + + Both sections 2 and 3 describe messages that are legal to generate + for purposes of this specification. + + + + + + +Resnick Standards Track [Page 5] + +RFC 5322 Internet Message Format October 2008 + + + Section 4 of this document specifies an "obsolete" syntax. There are + references in section 3 to these obsolete syntactic elements. The + rules of the obsolete syntax are elements that have appeared in + earlier versions of this specification or have previously been widely + used in Internet messages. As such, these elements MUST be + interpreted by parsers of messages in order to be conformant to this + specification. However, since items in this syntax have been + determined to be non-interoperable or to cause significant problems + for recipients of messages, they MUST NOT be generated by creators of + conformant messages. + + Section 5 details security considerations to take into account when + implementing this specification. + + Appendix A lists examples of different sorts of messages. These + examples are not exhaustive of the types of messages that appear on + the Internet, but give a broad overview of certain syntactic forms. 
+ + Appendix B lists the differences between this specification and + earlier specifications for Internet messages. + + Appendix C contains acknowledgements. + +2. Lexical Analysis of Messages + +2.1. General Description + + At the most basic level, a message is a series of characters. A + message that is conformant with this specification is composed of + characters with values in the range of 1 through 127 and interpreted + as US-ASCII [ANSI.X3-4.1986] characters. For brevity, this document + sometimes refers to this range of characters as simply "US-ASCII + characters". + + Note: This document specifies that messages are made up of + characters in the US-ASCII range of 1 through 127. There are + other documents, specifically the MIME document series ([RFC2045], + [RFC2046], [RFC2047], [RFC2049], [RFC4288], [RFC4289]), that + extend this specification to allow for values outside of that + range. Discussion of those mechanisms is not within the scope of + this specification. + + Messages are divided into lines of characters. A line is a series of + characters that is delimited with the two characters carriage-return + and line-feed; that is, the carriage return (CR) character (ASCII + value 13) followed immediately by the line feed (LF) character (ASCII + value 10). (The carriage return/line feed pair is usually written in + this document as "CRLF".) + + + +Resnick Standards Track [Page 6] + +RFC 5322 Internet Message Format October 2008 + + + A message consists of header fields (collectively called "the header + section of the message") followed, optionally, by a body. The header + section is a sequence of lines of characters with special syntax as + defined in this specification. The body is simply a sequence of + characters that follows the header section and is separated from the + header section by an empty line (i.e., a line with nothing preceding + the CRLF). 
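The split between header section and body described above can be sketched as follows. This is a minimal illustration, not part of the specification: the function name `split_message` is hypothetical, and the input is assumed to start with at least one header field and to use CRLF line endings throughout.

```python
from typing import Optional, Tuple

CRLF = "\r\n"


def split_message(raw: str) -> Tuple[str, Optional[str]]:
    """Split a message into (header_section, body).

    The body is None when no empty line follows the header section.
    Assumes the message begins with at least one header field.
    """
    # An empty line is a CRLF directly after the CRLF that ends
    # the last header field, i.e. two CRLFs in a row.
    separator = CRLF + CRLF
    index = raw.find(separator)
    if index == -1:
        return raw, None
    # Keep the CRLF that terminates the last header field.
    header_section = raw[: index + len(CRLF)]
    body = raw[index + len(separator):]
    return header_section, body
```

Only the first empty line is treated as the separator; any later empty lines belong to the body.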
+ + Note: Common parlance and earlier versions of this specification + use the term "header" to either refer to the entire header section + or to refer to an individual header field. To avoid ambiguity, + this document does not use the terms "header" or "headers" in + isolation, but instead always uses "header field" to refer to the + individual field and "header section" to refer to the entire + collection. + +2.1.1. Line Length Limits + + There are two limits that this specification places on the number of + characters in a line. Each line of characters MUST be no more than + 998 characters, and SHOULD be no more than 78 characters, excluding + the CRLF. + + The 998 character limit is due to limitations in many implementations + that send, receive, or store IMF messages which simply cannot handle + more than 998 characters on a line. Receiving implementations would + do well to handle an arbitrarily large number of characters in a line + for robustness sake. However, there are so many implementations that + (in compliance with the transport requirements of [RFC5321]) do not + accept messages containing more than 1000 characters including the CR + and LF per line, it is important for implementations not to create + such messages. + + The more conservative 78 character recommendation is to accommodate + the many implementations of user interfaces that display these + messages which may truncate, or disastrously wrap, the display of + more than 78 characters per line, in spite of the fact that such + implementations are non-conformant to the intent of this + specification (and that of [RFC5321] if they actually cause + information to be lost). Again, even though this limitation is put + on messages, it is incumbent upon implementations that display + messages to handle an arbitrarily large number of characters in a + line (certainly at least up to the 998 character limit) for the sake + of robustness. 
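The two limits above (998 as the MUST limit, 78 as the SHOULD limit, both excluding the CRLF) can be checked with a short sketch like the one below. The names `check_line_lengths`, `MAX_HARD`, and `MAX_SOFT` are hypothetical, and the input is assumed to use CRLF line endings.

```python
MAX_HARD = 998  # MUST limit, excluding the CRLF
MAX_SOFT = 78   # SHOULD limit, excluding the CRLF


def check_line_lengths(raw: str):
    """Return (too_long, longish) as lists of 1-based line numbers.

    too_long: lines exceeding the 998 character MUST limit.
    longish:  lines exceeding only the 78 character SHOULD limit.
    """
    too_long, longish = [], []
    # Splitting on CRLF yields each line without its terminator.
    for number, line in enumerate(raw.split("\r\n"), start=1):
        if len(line) > MAX_HARD:
            too_long.append(number)
        elif len(line) > MAX_SOFT:
            longish.append(number)
    return too_long, longish
```

A generator would refuse to emit anything listed in `too_long`, while a receiver would, per the text above, do well to accept such lines anyway.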
+ + + + + + + +Resnick Standards Track [Page 7] + +RFC 5322 Internet Message Format October 2008 + + +2.2. Header Fields + + Header fields are lines beginning with a field name, followed by a + colon (":"), followed by a field body, and terminated by CRLF. A + field name MUST be composed of printable US-ASCII characters (i.e., + characters that have values between 33 and 126, inclusive), except + colon. A field body may be composed of printable US-ASCII characters + as well as the space (SP, ASCII value 32) and horizontal tab (HTAB, + ASCII value 9) characters (together known as the white space + characters, WSP). A field body MUST NOT include CR and LF except + when used in "folding" and "unfolding", as described in section + 2.2.3. All field bodies MUST conform to the syntax described in + sections 3 and 4 of this specification. + +2.2.1. Unstructured Header Field Bodies + + Some field bodies in this specification are defined simply as + "unstructured" (which is specified in section 3.2.5 as any printable + US-ASCII characters plus white space characters) with no further + restrictions. These are referred to as unstructured field bodies. + Semantically, unstructured field bodies are simply to be treated as a + single line of characters with no further processing (except for + "folding" and "unfolding" as described in section 2.2.3). + +2.2.2. Structured Header Field Bodies + + Some field bodies in this specification have a syntax that is more + restrictive than the unstructured field bodies described above. + These are referred to as "structured" field bodies. Structured field + bodies are sequences of specific lexical tokens as described in + sections 3 and 4 of this specification. Many of these tokens are + allowed (according to their syntax) to be introduced or end with + comments (as described in section 3.2.2) as well as the white space + characters, and those white space characters are subject to "folding" + and "unfolding" as described in section 2.2.3. 
Semantic analysis of + structured field bodies is given along with their syntax. + +2.2.3. Long Header Fields + + Each header field is logically a single line of characters comprising + the field name, the colon, and the field body. For convenience + however, and to deal with the 998/78 character limitations per line, + the field body portion of a header field can be split into a + multiple-line representation; this is called "folding". The general + rule is that wherever this specification allows for folding white + space (not simply WSP characters), a CRLF may be inserted before any + WSP. + + + + +Resnick Standards Track [Page 8] + +RFC 5322 Internet Message Format October 2008 + + + For example, the header field: + + Subject: This is a test + + can be represented as: + + Subject: This + is a test + + Note: Though structured field bodies are defined in such a way + that folding can take place between many of the lexical tokens + (and even within some of the lexical tokens), folding SHOULD be + limited to placing the CRLF at higher-level syntactic breaks. For + instance, if a field body is defined as comma-separated values, it + is recommended that folding occur after the comma separating the + structured items in preference to other places where the field + could be folded, even if it is allowed elsewhere. + + The process of moving from this folded multiple-line representation + of a header field to its single line representation is called + "unfolding". Unfolding is accomplished by simply removing any CRLF + that is immediately followed by WSP. Each header field should be + treated in its unfolded form for further syntactic and semantic + evaluation. An unfolded header field has no length restriction and + therefore may be indeterminately long. + +2.3. Body + + The body of a message is simply lines of US-ASCII characters. 
The + only two limitations on the body are as follows: + + o CR and LF MUST only occur together as CRLF; they MUST NOT appear + independently in the body. + o Lines of characters in the body MUST be limited to 998 characters, + and SHOULD be limited to 78 characters, excluding the CRLF. + + Note: As was stated earlier, there are other documents, + specifically the MIME documents ([RFC2045], [RFC2046], [RFC2049], + [RFC4288], [RFC4289]), that extend (and limit) this specification + to allow for different sorts of message bodies. Again, these + mechanisms are beyond the scope of this document. + + + + + + + + + + +Resnick Standards Track [Page 9] + +RFC 5322 Internet Message Format October 2008 + + +3. Syntax + +3.1. Introduction + + The syntax as given in this section defines the legal syntax of + Internet messages. Messages that are conformant to this + specification MUST conform to the syntax in this section. If there + are options in this section where one option SHOULD be generated, + that is indicated either in the prose or in a comment next to the + syntax. + + For the defined expressions, a short description of the syntax and + use is given, followed by the syntax in ABNF, followed by a semantic + analysis. The following primitive tokens that are used but otherwise + unspecified are taken from the "Core Rules" of [RFC5234], Appendix + B.1: CR, LF, CRLF, HTAB, SP, WSP, DQUOTE, DIGIT, ALPHA, and VCHAR. + + In some of the definitions, there will be non-terminals whose names + start with "obs-". These "obs-" elements refer to tokens defined in + the obsolete syntax in section 4. In all cases, these productions + are to be ignored for the purposes of generating legal Internet + messages and MUST NOT be used as part of such a message. However, + when interpreting messages, these tokens MUST be honored as part of + the legal syntax. 
In this sense, section 3 defines a grammar for the + generation of messages, with "obs-" elements that are to be ignored, + while section 4 adds grammar for the interpretation of messages. + +3.2. Lexical Tokens + + The following rules are used to define an underlying lexical + analyzer, which feeds tokens to the higher-level parsers. This + section defines the tokens used in structured header field bodies. + + Note: Readers of this specification need to pay special attention + to how these lexical tokens are used in both the lower-level and + higher-level syntax later in the document. Particularly, the + white space tokens and the comment tokens defined in section 3.2.2 + get used in the lower-level tokens defined here, and those lower- + level tokens are in turn used as parts of the higher-level tokens + defined later. Therefore, white space and comments may be allowed + in the higher-level tokens even though they may not explicitly + appear in a particular definition. + +3.2.1. Quoted characters + + Some characters are reserved for special interpretation, such as + delimiting lexical tokens. To permit use of these characters as + uninterpreted data, a quoting mechanism is provided. + + + +Resnick Standards Track [Page 10] + +RFC 5322 Internet Message Format October 2008 + + + quoted-pair = ("\" (VCHAR / WSP)) / obs-qp + + Where any quoted-pair appears, it is to be interpreted as the + character alone. That is to say, the "\" character that appears as + part of a quoted-pair is semantically "invisible". + + Note: The "\" character may appear in a message where it is not + part of a quoted-pair. A "\" character that does not appear in a + quoted-pair is not semantically invisible. The only places in + this specification where quoted-pair currently appears are + ccontent, qcontent, and in obs-dtext in section 4. + +3.2.2. 
Folding White Space and Comments + + White space characters, including white space used in folding + (described in section 2.2.3), may appear between many elements in + header field bodies. Also, strings of characters that are treated as + comments may be included in structured field bodies as characters + enclosed in parentheses. The following defines the folding white + space (FWS) and comment constructs. + + Strings of characters enclosed in parentheses are considered comments + so long as they do not appear within a "quoted-string", as defined in + section 3.2.4. Comments may nest. + + There are several places in this specification where comments and FWS + may be freely inserted. To accommodate that syntax, an additional + token for "CFWS" is defined for places where comments and/or FWS can + occur. However, where CFWS occurs in this specification, it MUST NOT + be inserted in such a way that any line of a folded header field is + made up entirely of WSP characters and nothing else. + + FWS = ([*WSP CRLF] 1*WSP) / obs-FWS + ; Folding white space + + ctext = %d33-39 / ; Printable US-ASCII + %d42-91 / ; characters not including + %d93-126 / ; "(", ")", or "\" + obs-ctext + + ccontent = ctext / quoted-pair / comment + + comment = "(" *([FWS] ccontent) [FWS] ")" + + CFWS = (1*([FWS] comment) [FWS]) / FWS + + + + + + +Resnick Standards Track [Page 11] + +RFC 5322 Internet Message Format October 2008 + + + Throughout this specification, where FWS (the folding white space + token) appears, it indicates a place where folding, as discussed in + section 2.2.3, may take place. Wherever folding appears in a message + (that is, a header field body containing a CRLF followed by any WSP), + unfolding (removal of the CRLF) is performed before any further + semantic analysis is performed on that header field according to this + specification. That is to say, any CRLF that appears in FWS is + semantically "invisible". 
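Unfolding as described above (removal of any CRLF that is immediately followed by WSP) can be sketched in a few lines. The function name `unfold` is hypothetical, and the sketch handles only the non-obsolete folding form.

```python
import re

# A CRLF that is immediately followed by WSP (space or horizontal tab).
# The lookahead leaves the WSP itself in place.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(field: str) -> str:
    """Return the single-line form of a possibly folded header field."""
    return _FOLD.sub("", field)
```

For the example in section 2.2.3, `unfold("Subject: This\r\n is a test\r\n")` gives back `"Subject: This is a test\r\n"`; the final CRLF is kept because no WSP follows it.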
+ + A comment is normally used in a structured field body to provide some + human-readable informational text. Since a comment is allowed to + contain FWS, folding is permitted within the comment. Also note that + since quoted-pair is allowed in a comment, the parentheses and + backslash characters may appear in a comment, so long as they appear + as a quoted-pair. Semantically, the enclosing parentheses are not + part of the comment; the comment is what is contained between the two + parentheses. As stated earlier, the "\" in any quoted-pair and the + CRLF in any FWS that appears within the comment are semantically + "invisible" and therefore not part of the comment either. + + Runs of FWS, comment, or CFWS that occur between lexical tokens in a + structured header field are semantically interpreted as a single + space character. + +3.2.3. Atom + + Several productions in structured header field bodies are simply + strings of certain basic characters. Such productions are called + atoms. + + Some of the structured header field bodies also allow the period + character (".", ASCII value 46) within runs of atext. An additional + "dot-atom" token is defined for those purposes. + + Note: The "specials" token does not appear anywhere else in this + specification. It is simply the visible (i.e., non-control, non- + white space) characters that do not appear in atext. It is + provided only because it is useful for implementers who use tools + that lexically analyze messages. Each of the characters in + specials can be used to indicate a tokenization point in lexical + analysis. + + + + + + + + + + +Resnick Standards Track [Page 12] + +RFC 5322 Internet Message Format October 2008 + + + atext = ALPHA / DIGIT / ; Printable US-ASCII + "!" / "#" / ; characters not including + "$" / "%" / ; specials. Used for atoms. + "&" / "'" / + "*" / "+" / + "-" / "/" / + "=" / "?" 
/ + "^" / "_" / + "`" / "{" / + "|" / "}" / + "~" + + atom = [CFWS] 1*atext [CFWS] + + dot-atom-text = 1*atext *("." 1*atext) + + dot-atom = [CFWS] dot-atom-text [CFWS] + + specials = "(" / ")" / ; Special characters that do + "<" / ">" / ; not appear in atext + "[" / "]" / + ":" / ";" / + "@" / "\" / + "," / "." / + DQUOTE + + Both atom and dot-atom are interpreted as a single unit, comprising + the string of characters that make it up. Semantically, the optional + comments and FWS surrounding the rest of the characters are not part + of the atom; the atom is only the run of atext characters in an atom, + or the atext and "." characters in a dot-atom. + +3.2.4. Quoted Strings + + Strings of characters that include characters other than those + allowed in atoms can be represented in a quoted string format, where + the characters are surrounded by quote (DQUOTE, ASCII value 34) + characters. + + + + + + + + + + + + + +Resnick Standards Track [Page 13] + +RFC 5322 Internet Message Format October 2008 + + + qtext = %d33 / ; Printable US-ASCII + %d35-91 / ; characters not including + %d93-126 / ; "\" or the quote character + obs-qtext + + qcontent = qtext / quoted-pair + + quoted-string = [CFWS] + DQUOTE *([FWS] qcontent) [FWS] DQUOTE + [CFWS] + + A quoted-string is treated as a unit. That is, quoted-string is + identical to atom, semantically. Since a quoted-string is allowed to + contain FWS, folding is permitted. Also note that since quoted-pair + is allowed in a quoted-string, the quote and backslash characters may + appear in a quoted-string so long as they appear as a quoted-pair. + + Semantically, neither the optional CFWS outside of the quote + characters nor the quote characters themselves are part of the + quoted-string; the quoted-string is what is contained between the two + quote characters. 
As stated earlier, the "\" in any quoted-pair and + the CRLF in any FWS/CFWS that appears within the quoted-string are + semantically "invisible" and therefore not part of the quoted-string + either. + +3.2.5. Miscellaneous Tokens + + Three additional tokens are defined: word and phrase for combinations + of atoms and/or quoted-strings, and unstructured for use in + unstructured header fields and in some places within structured + header fields. + + word = atom / quoted-string + + phrase = 1*word / obs-phrase + + unstructured = (*([FWS] VCHAR) *WSP) / obs-unstruct + +3.3. Date and Time Specification + + Date and time values occur in several header fields. This section + specifies the syntax for a full date and time specification. Though + folding white space is permitted throughout the date-time + specification, it is RECOMMENDED that a single space be used in each + place that FWS appears (whether it is required or optional); some + older implementations will not interpret longer sequences of folding + white space correctly. + + + + +Resnick Standards Track [Page 14] + +RFC 5322 Internet Message Format October 2008 + + + date-time = [ day-of-week "," ] date time [CFWS] + + day-of-week = ([FWS] day-name) / obs-day-of-week + + day-name = "Mon" / "Tue" / "Wed" / "Thu" / + "Fri" / "Sat" / "Sun" + + date = day month year + + day = ([FWS] 1*2DIGIT FWS) / obs-day + + month = "Jan" / "Feb" / "Mar" / "Apr" / + "May" / "Jun" / "Jul" / "Aug" / + "Sep" / "Oct" / "Nov" / "Dec" + + year = (FWS 4*DIGIT FWS) / obs-year + + time = time-of-day zone + + time-of-day = hour ":" minute [ ":" second ] + + hour = 2DIGIT / obs-hour + + minute = 2DIGIT / obs-minute + + second = 2DIGIT / obs-second + + zone = (FWS ( "+" / "-" ) 4DIGIT) / obs-zone + + The day is the numeric day of the month. The year is any numeric + year 1900 or later. + + The time-of-day specifies the number of hours, minutes, and + optionally seconds since midnight of the date indicated. 
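A value matching the date-time grammar above can be read with an existing parser rather than a hand-written one. The sketch below assumes Python's standard `email.utils.parsedate_to_datetime`; that library is not part of this specification and may accept forms the strict grammar does not. The wrapper name `parse_date_field` is hypothetical.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_date_field(value: str) -> datetime:
    """Parse the body of a "Date:" field into a datetime.

    The numeric zone (sign plus four digits) becomes the UTC offset
    of the returned object.
    """
    return parsedate_to_datetime(value.strip())
```

For instance, parsing `"Fri, 21 Nov 1997 09:55:06 -0600"` yields 21 November 1997, 09:55:06, with a UTC offset of minus six hours.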
+ + The date and time-of-day SHOULD express local time. + + The zone specifies the offset from Coordinated Universal Time (UTC, + formerly referred to as "Greenwich Mean Time") that the date and + time-of-day represent. The "+" or "-" indicates whether the time-of- + day is ahead of (i.e., east of) or behind (i.e., west of) Universal + Time. The first two digits indicate the number of hours difference + from Universal Time, and the last two digits indicate the number of + additional minutes difference from Universal Time. (Hence, +hhmm + means +(hh * 60 + mm) minutes, and -hhmm means -(hh * 60 + mm) + minutes). The form "+0000" SHOULD be used to indicate a time zone at + Universal Time. Though "-0000" also indicates Universal Time, it is + + + + +Resnick Standards Track [Page 15] + +RFC 5322 Internet Message Format October 2008 + + + used to indicate that the time was generated on a system that may be + in a local time zone other than Universal Time and that the date-time + contains no information about the local time zone. + + A date-time specification MUST be semantically valid. That is, the + day-of-week (if included) MUST be the day implied by the date, the + numeric day-of-month MUST be between 1 and the number of days allowed + for the specified month (in the specified year), the time-of-day MUST + be in the range 00:00:00 through 23:59:60 (the number of seconds + allowing for a leap second; see [RFC1305]), and the last two digits + of the zone MUST be within the range 00 through 59. + +3.4. Address Specification + + Addresses occur in several message header fields to indicate senders + and recipients of messages. An address may either be an individual + mailbox, or a group of mailboxes. 
+ + address = mailbox / group + + mailbox = name-addr / addr-spec + + name-addr = [display-name] angle-addr + + angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / + obs-angle-addr + + group = display-name ":" [group-list] ";" [CFWS] + + display-name = phrase + + mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list + + address-list = (address *("," address)) / obs-addr-list + + group-list = mailbox-list / CFWS / obs-group-list + + A mailbox receives mail. It is a conceptual entity that does not + necessarily pertain to file storage. For example, some sites may + choose to print mail on a printer and deliver the output to the + addressee's desk. + + Normally, a mailbox is composed of two parts: (1) an optional display + name that indicates the name of the recipient (which can be a person + or a system) that could be displayed to the user of a mail + application, and (2) an addr-spec address enclosed in angle brackets + + + + + +Resnick Standards Track [Page 16] + +RFC 5322 Internet Message Format October 2008 + + + ("<" and ">"). There is an alternate simple form of a mailbox where + the addr-spec address appears alone, without the recipient's name or + the angle brackets. The Internet addr-spec address is described in + section 3.4.1. + + Note: Some legacy implementations used the simple form where the + addr-spec appears without the angle brackets, but included the + name of the recipient in parentheses as a comment following the + addr-spec. Since the meaning of the information in a comment is + unspecified, implementations SHOULD use the full name-addr form of + the mailbox, instead of the legacy form, to specify the display + name associated with a mailbox. Also, because some legacy + implementations interpret the comment, comments generally SHOULD + NOT be used in address fields to avoid confusing such + implementations. + + When it is desirable to treat several mailboxes as a single unit + (i.e., in a distribution list), the group construct can be used. 
The + group construct allows the sender to indicate a named group of + recipients. This is done by giving a display name for the group, + followed by a colon, followed by a comma-separated list of any number + of mailboxes (including zero and one), and ending with a semicolon. + Because the list of mailboxes can be empty, using the group construct + is also a simple way to communicate to recipients that the message + was sent to one or more named sets of recipients, without actually + providing the individual mailbox address for any of those recipients. + +3.4.1. Addr-Spec Specification + + An addr-spec is a specific Internet identifier that contains a + locally interpreted string followed by the at-sign character ("@", + ASCII value 64) followed by an Internet domain. The locally + interpreted string is either a quoted-string or a dot-atom. If the + string can be represented as a dot-atom (that is, it contains no + characters other than atext characters or "." surrounded by atext + characters), then the dot-atom form SHOULD be used and the quoted- + string form SHOULD NOT be used. Comments and folding white space + SHOULD NOT be used around the "@" in the addr-spec. + + Note: A liberal syntax for the domain portion of addr-spec is + given here. However, the domain portion contains addressing + information specified by and used in other protocols (e.g., + [RFC1034], [RFC1035], [RFC1123], [RFC5321]). It is therefore + incumbent upon implementations to conform to the syntax of + addresses for the context in which they are used. 
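The preference for the dot-atom form of the local part over the quoted-string form can be sketched as below. The character class follows the atext rule in section 3.2.3. The function name `format_local_part` is hypothetical, and the sketch does not attempt to reject control characters, which a quoted-string may not contain outside the obsolete syntax.

```python
import re

# atext per section 3.2.3: ALPHA / DIGIT / listed punctuation.
_ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"
# dot-atom-text = 1*atext *("." 1*atext)
_DOT_ATOM_TEXT = re.compile(rf"{_ATEXT}+(?:\.{_ATEXT}+)*")


def format_local_part(local: str) -> str:
    """Return local as a dot-atom when possible, else as a quoted-string."""
    if _DOT_ATOM_TEXT.fullmatch(local):
        return local
    # Inside a quoted-string, backslash and DQUOTE must be
    # written as quoted-pairs.
    escaped = local.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```

A local part such as `john.doe` is returned unchanged, while `john..doe` or `john doe` is wrapped in quotes because neither matches dot-atom-text.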
+ + + + + + +Resnick Standards Track [Page 17] + +RFC 5322 Internet Message Format October 2008 + + + addr-spec = local-part "@" domain + + local-part = dot-atom / quoted-string / obs-local-part + + domain = dot-atom / domain-literal / obs-domain + + domain-literal = [CFWS] "[" *([FWS] dtext) [FWS] "]" [CFWS] + + dtext = %d33-90 / ; Printable US-ASCII + %d94-126 / ; characters not including + obs-dtext ; "[", "]", or "\" + + The domain portion identifies the point to which the mail is + delivered. In the dot-atom form, this is interpreted as an Internet + domain name (either a host name or a mail exchanger name) as + described in [RFC1034], [RFC1035], and [RFC1123]. In the domain- + literal form, the domain is interpreted as the literal Internet + address of the particular host. In both cases, how addressing is + used and how messages are transported to a particular host is covered + in separate documents, such as [RFC5321]. These mechanisms are + outside of the scope of this document. + + The local-part portion is a domain-dependent string. In addresses, + it is simply interpreted on the particular host as a name of a + particular mailbox. + +3.5. Overall Message Syntax + + A message consists of header fields, optionally followed by a message + body. Lines in a message MUST be a maximum of 998 characters + excluding the CRLF, but it is RECOMMENDED that lines be limited to 78 + characters excluding the CRLF. (See section 2.1.1 for explanation.) + In a message body, though all of the characters listed in the text + rule MAY be used, the use of US-ASCII control characters (values 1 + through 8, 11, 12, and 14 through 31) is discouraged since their + interpretation by receivers for display is not guaranteed. 
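The discouraged control characters listed above (values 1 through 8, 11, 12, and 14 through 31) can be located in a body with a sketch like this one. The names `DISCOURAGED` and `discouraged_controls` are hypothetical.

```python
# Values 1 through 8, 11, 12, and 14 through 31.
# HTAB (9), LF (10), and CR (13) are deliberately absent.
DISCOURAGED = frozenset([*range(1, 9), 11, 12, *range(14, 32)])


def discouraged_controls(body: str):
    """Return (offset, code) pairs for each discouraged control character."""
    return [
        (offset, ord(char))
        for offset, char in enumerate(body)
        if ord(char) in DISCOURAGED
    ]
```

These characters are permitted by the text rule, so a result from this function is a warning about display behaviour rather than a syntax error.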
+ + message = (fields / obs-fields) + [CRLF body] + + body = (*(*998text CRLF) *998text) / obs-body + + text = %d1-9 / ; Characters excluding CR + %d11 / ; and LF + %d12 / + %d14-127 + + + + + +Resnick Standards Track [Page 18] + +RFC 5322 Internet Message Format October 2008 + + + The header fields carry most of the semantic information and are + defined in section 3.6. The body is simply a series of lines of text + that are uninterpreted for the purposes of this specification. + +3.6. Field Definitions + + The header fields of a message are defined here. All header fields + have the same general syntactic structure: a field name, followed by + a colon, followed by the field body. The specific syntax for each + header field is defined in the subsequent sections. + + Note: In the ABNF syntax for each field in subsequent sections, + each field name is followed by the required colon. However, for + brevity, sometimes the colon is not referred to in the textual + description of the syntax. It is, nonetheless, required. + + It is important to note that the header fields are not guaranteed to + be in a particular order. They may appear in any order, and they + have been known to be reordered occasionally when transported over + the Internet. However, for the purposes of this specification, + header fields SHOULD NOT be reordered when a message is transported + or transformed. More importantly, the trace header fields and resent + header fields MUST NOT be reordered, and SHOULD be kept in blocks + prepended to the message. See sections 3.6.6 and 3.6.7 for more + information. + + The only required header fields are the origination date field and + the originator address field(s). All other header fields are + syntactically optional. More information is contained in the table + following this definition. 
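The requirement that the origination date field and the originator address field be present can be checked with a sketch like the following. The function name `missing_required_fields` is hypothetical. Field names are compared without regard to case because quoted literals in ABNF are case-insensitive (section 1.2.2).

```python
def missing_required_fields(header_section: str):
    """Return the required field names ("date", "from") that are absent."""
    names = set()
    for line in header_section.split("\r\n"):
        # Lines starting with WSP continue a folded field; skip them,
        # along with the empty string left after the final CRLF.
        if not line or line[0] in " \t":
            continue
        name, colon, _ = line.partition(":")
        if colon:
            names.add(name.strip().lower())
    return [field for field in ("date", "from") if field not in names]
```

This checks presence only; the per-field occurrence limits are a separate matter.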
+ + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 19] + +RFC 5322 Internet Message Format October 2008 + + + fields = *(trace + *optional-field / + *(resent-date / + resent-from / + resent-sender / + resent-to / + resent-cc / + resent-bcc / + resent-msg-id)) + *(orig-date / + from / + sender / + reply-to / + to / + cc / + bcc / + message-id / + in-reply-to / + references / + subject / + comments / + keywords / + optional-field) + + The following table indicates limits on the number of times each + field may occur in the header section of a message as well as any + special limitations on the use of those fields. An asterisk ("*") + next to a value in the minimum or maximum column indicates that a + special restriction appears in the Notes column. + + + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 20] + +RFC 5322 Internet Message Format October 2008 + + + +----------------+--------+------------+----------------------------+ + | Field | Min | Max number | Notes | + | | number | | | + +----------------+--------+------------+----------------------------+ + | trace | 0 | unlimited | Block prepended - see | + | | | | 3.6.7 | + | resent-date | 0* | unlimited* | One per block, required if | + | | | | other resent fields are | + | | | | present - see 3.6.6 | + | resent-from | 0 | unlimited* | One per block - see 3.6.6 | + | resent-sender | 0* | unlimited* | One per block, MUST occur | + | | | | with multi-address | + | | | | resent-from - see 3.6.6 | + | resent-to | 0 | unlimited* | One per block - see 3.6.6 | + | resent-cc | 0 | unlimited* | One per block - see 3.6.6 | + | resent-bcc | 0 | unlimited* | One per block - see 3.6.6 | + | resent-msg-id | 0 | unlimited* | One per block - see 3.6.6 | + | orig-date | 1 | 1 | | + | from | 1 | 1 | See sender and 3.6.2 | + | sender | 0* | 1 | MUST occur with | + | | | | multi-address from - see | + | | | | 3.6.2 | + | reply-to | 0 | 1 | | + | to | 0 | 1 | | + | cc | 0 | 1 | | + | bcc | 0 
| 1 | | + | message-id | 0* | 1 | SHOULD be present - see | + | | | | 3.6.4 | + | in-reply-to | 0* | 1 | SHOULD occur in some | + | | | | replies - see 3.6.4 | + | references | 0* | 1 | SHOULD occur in some | + | | | | replies - see 3.6.4 | + | subject | 0 | 1 | | + | comments | 0 | unlimited | | + | keywords | 0 | unlimited | | + | optional-field | 0 | unlimited | | + +----------------+--------+------------+----------------------------+ + + The exact interpretation of each field is described in subsequent + sections. + + + + + + + + + + + +Resnick Standards Track [Page 21] + +RFC 5322 Internet Message Format October 2008 + + +3.6.1. The Origination Date Field + + The origination date field consists of the field name "Date" followed + by a date-time specification. + + orig-date = "Date:" date-time CRLF + + The origination date specifies the date and time at which the creator + of the message indicated that the message was complete and ready to + enter the mail delivery system. For instance, this might be the time + that a user pushes the "send" or "submit" button in an application + program. In any case, it is specifically not intended to convey the + time that the message is actually transported, but rather the time at + which the human or other creator of the message has put the message + into its final form, ready for transport. (For example, a portable + computer user who is not connected to a network might queue a message + for delivery. The origination date is intended to contain the date + and time that the user queued the message, not the time when the user + connected to the network to send the message.) + +3.6.2. Originator Fields + + The originator fields of a message consist of the from field, the + sender field (when applicable), and optionally the reply-to field. + The from field consists of the field name "From" and a comma- + separated list of one or more mailbox specifications. 
If the from + field contains more than one mailbox specification in the mailbox- + list, then the sender field, containing the field name "Sender" and a + single mailbox specification, MUST appear in the message. In either + case, an optional reply-to field MAY also be included, which contains + the field name "Reply-To" and a comma-separated list of one or more + addresses. + + from = "From:" mailbox-list CRLF + + sender = "Sender:" mailbox CRLF + + reply-to = "Reply-To:" address-list CRLF + + The originator fields indicate the mailbox(es) of the source of the + message. The "From:" field specifies the author(s) of the message, + that is, the mailbox(es) of the person(s) or system(s) responsible + for the writing of the message. The "Sender:" field specifies the + mailbox of the agent responsible for the actual transmission of the + message. For example, if a secretary were to send a message for + another person, the mailbox of the secretary would appear in the + "Sender:" field and the mailbox of the actual author would appear in + the "From:" field. If the originator of the message can be indicated + + + +Resnick Standards Track [Page 22] + +RFC 5322 Internet Message Format October 2008 + + + by a single mailbox and the author and transmitter are identical, the + "Sender:" field SHOULD NOT be used. Otherwise, both fields SHOULD + appear. + + Note: The transmitter information is always present. The absence + of the "Sender:" field is sometimes mistakenly taken to mean that + the agent responsible for transmission of the message has not been + specified. This absence merely means that the transmitter is + identical to the author and is therefore not redundantly placed + into the "Sender:" field. + + The originator fields also provide the information required when + replying to a message. When the "Reply-To:" field is present, it + indicates the address(es) to which the author of the message suggests + that replies be sent. 
In the absence of the "Reply-To:" field, + replies SHOULD by default be sent to the mailbox(es) specified in the + "From:" field unless otherwise specified by the person composing the + reply. + + In all cases, the "From:" field SHOULD NOT contain any mailbox that + does not belong to the author(s) of the message. See also section + 3.6.3 for more information on forming the destination addresses for a + reply. + +3.6.3. Destination Address Fields + + The destination fields of a message consist of three possible fields, + each of the same form: the field name, which is either "To", "Cc", or + "Bcc", followed by a comma-separated list of one or more addresses + (either mailbox or group syntax). + + to = "To:" address-list CRLF + + cc = "Cc:" address-list CRLF + + bcc = "Bcc:" [address-list / CFWS] CRLF + + The destination fields specify the recipients of the message. Each + destination field may have one or more addresses, and the addresses + indicate the intended recipients of the message. The only difference + between the three fields is how each is used. + + The "To:" field contains the address(es) of the primary recipient(s) + of the message. + + + + + + + +Resnick Standards Track [Page 23] + +RFC 5322 Internet Message Format October 2008 + + + The "Cc:" field (where the "Cc" means "Carbon Copy" in the sense of + making a copy on a typewriter using carbon paper) contains the + addresses of others who are to receive the message, though the + content of the message may not be directed at them. + + The "Bcc:" field (where the "Bcc" means "Blind Carbon Copy") contains + addresses of recipients of the message whose addresses are not to be + revealed to other recipients of the message. There are three ways in + which the "Bcc:" field is used. In the first case, when a message + containing a "Bcc:" field is prepared to be sent, the "Bcc:" line is + removed even though all of the recipients (including those specified + in the "Bcc:" field) are sent a copy of the message. 
In the second + case, recipients specified in the "To:" and "Cc:" lines each are sent + a copy of the message with the "Bcc:" line removed as above, but the + recipients on the "Bcc:" line get a separate copy of the message + containing a "Bcc:" line. (When there are multiple recipient + addresses in the "Bcc:" field, some implementations actually send a + separate copy of the message to each recipient with a "Bcc:" + containing only the address of that particular recipient.) Finally, + since a "Bcc:" field may contain no addresses, a "Bcc:" field can be + sent without any addresses indicating to the recipients that blind + copies were sent to someone. Which method to use with "Bcc:" fields + is implementation dependent, but refer to the "Security + Considerations" section of this document for a discussion of each. + + When a message is a reply to another message, the mailboxes of the + authors of the original message (the mailboxes in the "From:" field) + or mailboxes specified in the "Reply-To:" field (if it exists) MAY + appear in the "To:" field of the reply since these would normally be + the primary recipients of the reply. If a reply is sent to a message + that has destination fields, it is often desirable to send a copy of + the reply to all of the recipients of the message, in addition to the + author. When such a reply is formed, addresses in the "To:" and + "Cc:" fields of the original message MAY appear in the "Cc:" field of + the reply, since these are normally secondary recipients of the + reply. If a "Bcc:" field is present in the original message, + addresses in that field MAY appear in the "Bcc:" field of the reply, + but they SHOULD NOT appear in the "To:" or "Cc:" fields. + + Note: Some mail applications have automatic reply commands that + include the destination addresses of the original message in the + destination addresses of the reply. How those reply commands + behave is implementation dependent and is beyond the scope of this + document. 
In particular, whether or not to include the original + destination addresses when the original message had a "Reply-To:" + field is not addressed here. + + + + + +Resnick Standards Track [Page 24] + +RFC 5322 Internet Message Format October 2008 + + +3.6.4. Identification Fields + + Though listed as optional in the table in section 3.6, every message + SHOULD have a "Message-ID:" field. Furthermore, reply messages + SHOULD have "In-Reply-To:" and "References:" fields as appropriate + and as described below. + + The "Message-ID:" field contains a single unique message identifier. + The "References:" and "In-Reply-To:" fields each contain one or more + unique message identifiers, optionally separated by CFWS. + + The message identifier (msg-id) syntax is a limited version of the + addr-spec construct enclosed in the angle bracket characters, "<" and + ">". Unlike addr-spec, this syntax only permits the dot-atom-text + form on the left-hand side of the "@" and does not have internal CFWS + anywhere in the message identifier. + + Note: As with addr-spec, a liberal syntax is given for the right- + hand side of the "@" in a msg-id. However, later in this section, + the use of a domain for the right-hand side of the "@" is + RECOMMENDED. Again, the syntax of domain constructs is specified + by and used in other protocols (e.g., [RFC1034], [RFC1035], + [RFC1123], [RFC5321]). It is therefore incumbent upon + implementations to conform to the syntax of addresses for the + context in which they are used. + + message-id = "Message-ID:" msg-id CRLF + + in-reply-to = "In-Reply-To:" 1*msg-id CRLF + + references = "References:" 1*msg-id CRLF + + msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS] + + id-left = dot-atom-text / obs-id-left + + id-right = dot-atom-text / no-fold-literal / obs-id-right + + no-fold-literal = "[" *dtext "]" + + The "Message-ID:" field provides a unique message identifier that + refers to a particular version of a particular message. 
The + uniqueness of the message identifier is guaranteed by the host that + generates it (see below). This message identifier is intended to be + machine readable and not necessarily meaningful to humans. A message + identifier pertains to exactly one version of a particular message; + subsequent revisions to the message each receive new message + identifiers. + + + +Resnick Standards Track [Page 25] + +RFC 5322 Internet Message Format October 2008 + + + Note: There are many instances when messages are "changed", but + those changes do not constitute a new instantiation of that + message, and therefore the message would not get a new message + identifier. For example, when messages are introduced into the + transport system, they are often prepended with additional header + fields such as trace fields (described in section 3.6.7) and + resent fields (described in section 3.6.6). The addition of such + header fields does not change the identity of the message and + therefore the original "Message-ID:" field is retained. In all + cases, it is the meaning that the sender of the message wishes to + convey (i.e., whether this is the same message or a different + message) that determines whether or not the "Message-ID:" field + changes, not any particular syntactic difference that appears (or + does not appear) in the message. + + The "In-Reply-To:" and "References:" fields are used when creating a + reply to a message. They hold the message identifier of the original + message and the message identifiers of other messages (for example, + in the case of a reply to a message that was itself a reply). The + "In-Reply-To:" field may be used to identify the message (or + messages) to which the new message is a reply, while the + "References:" field may be used to identify a "thread" of + conversation. 
+ + When creating a reply to a message, the "In-Reply-To:" and + "References:" fields of the resultant message are constructed as + follows: + + The "In-Reply-To:" field will contain the contents of the + "Message-ID:" field of the message to which this one is a reply (the + "parent message"). If there is more than one parent message, then + the "In-Reply-To:" field will contain the contents of all of the + parents' "Message-ID:" fields. If there is no "Message-ID:" field in + any of the parent messages, then the new message will have no "In- + Reply-To:" field. + + The "References:" field will contain the contents of the parent's + "References:" field (if any) followed by the contents of the parent's + "Message-ID:" field (if any). If the parent message does not contain + a "References:" field but does have an "In-Reply-To:" field + containing a single message identifier, then the "References:" field + will contain the contents of the parent's "In-Reply-To:" field + followed by the contents of the parent's "Message-ID:" field (if + any). If the parent has none of the "References:", "In-Reply-To:", + or "Message-ID:" fields, then the new message will have no + "References:" field. + + + + + +Resnick Standards Track [Page 26] + +RFC 5322 Internet Message Format October 2008 + + + Note: Some implementations parse the "References:" field to + display the "thread of the discussion". These implementations + assume that each new message is a reply to a single parent and + hence that they can walk backwards through the "References:" field + to find the parent of each message listed there. Therefore, + trying to form a "References:" field for a reply that has multiple + parents is discouraged; how to do so is not defined in this + document. + + The message identifier (msg-id) itself MUST be a globally unique + identifier for a message. The generator of the message identifier + MUST guarantee that the msg-id is unique. 
There are several + algorithms that can be used to accomplish this. Since the msg-id has + a similar syntax to addr-spec (identical except that quoted strings, + comments, and folding white space are not allowed), a good method is + to put the domain name (or a domain literal IP address) of the host + on which the message identifier was created on the right-hand side of + the "@" (since domain names and IP addresses are normally unique), + and put a combination of the current absolute date and time along + with some other currently unique (perhaps sequential) identifier + available on the system (for example, a process id number) on the + left-hand side. Though other algorithms will work, it is RECOMMENDED + that the right-hand side contain some domain identifier (either of + the host itself or otherwise) such that the generator of the message + identifier can guarantee the uniqueness of the left-hand side within + the scope of that domain. + + Semantically, the angle bracket characters are not part of the + msg-id; the msg-id is what is contained between the two angle bracket + characters. + +3.6.5. Informational Fields + + The informational fields are all optional. The "Subject:" and + "Comments:" fields are unstructured fields as defined in section + 2.2.1, and therefore may contain text or folding white space. The + "Keywords:" field contains a comma-separated list of one or more + words or quoted-strings. + + subject = "Subject:" unstructured CRLF + + comments = "Comments:" unstructured CRLF + + keywords = "Keywords:" phrase *("," phrase) CRLF + + These three fields are intended to have only human-readable content + with information about the message. The "Subject:" field is the most + common and contains a short string identifying the topic of the + + + +Resnick Standards Track [Page 27] + +RFC 5322 Internet Message Format October 2008 + + + message. 
When used in a reply, the field body MAY start with the + string "Re: " (an abbreviation of the Latin "in re", meaning "in the + matter of") followed by the contents of the "Subject:" field body of + the original message. If this is done, only one instance of the + literal string "Re: " ought to be used since use of other strings or + more than one instance can lead to undesirable consequences. The + "Comments:" field contains any additional comments on the text of the + body of the message. The "Keywords:" field contains a comma- + separated list of important words and phrases that might be useful + for the recipient. + +3.6.6. Resent Fields + + Resent fields SHOULD be added to any message that is reintroduced by + a user into the transport system. A separate set of resent fields + SHOULD be added each time this is done. All of the resent fields + corresponding to a particular resending of the message SHOULD be + grouped together. Each new set of resent fields is prepended to the + message; that is, the most recent set of resent fields appears + earlier in the message. No other fields in the message are changed + when resent fields are added. + + Each of the resent fields corresponds to a particular field elsewhere + in the syntax. For instance, the "Resent-Date:" field corresponds to + the "Date:" field and the "Resent-To:" field corresponds to the "To:" + field. In each case, the syntax for the field body is identical to + the syntax given previously for the corresponding field. + + When resent fields are used, the "Resent-From:" and "Resent-Date:" + fields MUST be sent. The "Resent-Message-ID:" field SHOULD be sent. + "Resent-Sender:" SHOULD NOT be used if "Resent-Sender:" would be + identical to "Resent-From:". 
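The resent-field rules above (a block is prepended, "Resent-Date:" and "Resent-From:" are mandatory, "Resent-Sender:" is dropped when it would duplicate "Resent-From:") can be sketched as follows. This is only an illustration: the function name `prepend_resent_block`, its parameters, and the representation of headers as a list of (name, value) pairs are assumptions made for this example and do not come from the specification.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def prepend_resent_block(headers, resent_from, resent_to,
                         resent_sender=None, resent_msg_id=None, when=None):
    """Return a new header list with one resent block placed in front.

    `headers` is a list of (name, value) pairs in message order
    (a hypothetical representation); it is not modified.
    """
    when = when or datetime.now(timezone.utc)
    # Resent-Date and Resent-From MUST be present in every resent block.
    block = [
        ("Resent-Date", format_datetime(when)),
        ("Resent-From", resent_from),
    ]
    # Resent-Sender SHOULD NOT be used if it would equal Resent-From.
    if resent_sender and resent_sender != resent_from:
        block.append(("Resent-Sender", resent_sender))
    block.append(("Resent-To", resent_to))
    # Resent-Message-ID SHOULD be sent; the caller supplies it here.
    if resent_msg_id:
        block.append(("Resent-Message-ID", resent_msg_id))
    # The newest block goes first; the original fields stay untouched.
    return block + list(headers)
```

Calling the function again on its own result models a second resend: the newer block then sits in front of the older one, as the text requires.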
+ + resent-date = "Resent-Date:" date-time CRLF + + resent-from = "Resent-From:" mailbox-list CRLF + + resent-sender = "Resent-Sender:" mailbox CRLF + + resent-to = "Resent-To:" address-list CRLF + + resent-cc = "Resent-Cc:" address-list CRLF + + resent-bcc = "Resent-Bcc:" [address-list / CFWS] CRLF + + resent-msg-id = "Resent-Message-ID:" msg-id CRLF + + + + + +Resnick Standards Track [Page 28] + +RFC 5322 Internet Message Format October 2008 + + + Resent fields are used to identify a message as having been + reintroduced into the transport system by a user. The purpose of + using resent fields is to have the message appear to the final + recipient as if it were sent directly by the original sender, with + all of the original fields remaining the same. Each set of resent + fields corresponds to a particular resending event. That is, if a + message is resent multiple times, each set of resent fields gives + identifying information for each individual time. Resent fields are + strictly informational. They MUST NOT be used in the normal + processing of replies or other such automatic actions on messages. + + Note: Reintroducing a message into the transport system and using + resent fields is a different operation from "forwarding". + "Forwarding" has two meanings: One sense of forwarding is that a + mail reading program can be told by a user to forward a copy of a + message to another person, making the forwarded message the body + of the new message. A forwarded message in this sense does not + appear to have come from the original sender, but is an entirely + new message from the forwarder of the message. Forwarding may + also mean that a mail transport program gets a message and + forwards it on to a different destination for final delivery. + Resent header fields are not intended for use with either type of + forwarding. + + The resent originator fields indicate the mailbox of the person(s) or + system(s) that resent the message. 
As with the regular originator + fields, there are two forms: a simple "Resent-From:" form, which + contains the mailbox of the individual doing the resending, and the + more complex form, when one individual (identified in the "Resent- + Sender:" field) resends a message on behalf of one or more others + (identified in the "Resent-From:" field). + + Note: When replying to a resent message, replies behave just as + they would with any other message, using the original "From:", + "Reply-To:", "Message-ID:", and other fields. The resent fields + are only informational and MUST NOT be used in the normal + processing of replies. + + The "Resent-Date:" indicates the date and time at which the resent + message is dispatched by the resender of the message. Like the + "Date:" field, it is not the date and time that the message was + actually transported. + + The "Resent-To:", "Resent-Cc:", and "Resent-Bcc:" fields function + identically to the "To:", "Cc:", and "Bcc:" fields, respectively, + except that they indicate the recipients of the resent message, not + the recipients of the original message. + + + + +Resnick Standards Track [Page 29] + +RFC 5322 Internet Message Format October 2008 + + + The "Resent-Message-ID:" field provides a unique identifier for the + resent message. + +3.6.7. Trace Fields + + The trace fields are a group of header fields consisting of an + optional "Return-Path:" field, and one or more "Received:" fields. + The "Return-Path:" header field contains a pair of angle brackets + that enclose an optional addr-spec. The "Received:" field contains a + (possibly empty) list of tokens followed by a semicolon and a date- + time specification. Each token must be a word, angle-addr, addr- + spec, or a domain. Further restrictions are applied to the syntax of + the trace fields by specifications that provide for their use, such + as [RFC5321]. 
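As a rough illustration of the "Received:" layout just described (a token list, a semicolon, then a date-time), the field body can be split at its last semicolon. The helper name `split_received` is an assumption for this sketch, and splitting the token list on white space is a deliberate simplification: it does not handle comments, quoted strings, or the further restrictions that [RFC5321] places on these fields.

```python
from email.utils import parsedate_to_datetime


def split_received(body):
    """Split a Received: field body into (tokens, datetime).

    Simplified sketch: the date-time is taken to be everything after
    the last ';', and tokens are separated by plain white space.
    """
    tokens, sep, stamp = body.rpartition(";")
    if not sep:
        raise ValueError("Received: body has no ';' before the date-time")
    return tokens.split(), parsedate_to_datetime(stamp.strip())
```

For example, `split_received("from a.example by b.example; Fri, 21 Dec 2012 18:01:59 +0100")` yields the four tokens and a timezone-aware `datetime` with a one-hour offset.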
+ + trace = [return] + 1*received + + return = "Return-Path:" path CRLF + + path = angle-addr / ([CFWS] "<" [CFWS] ">" [CFWS]) + + received = "Received:" *received-token ";" date-time CRLF + + received-token = word / angle-addr / addr-spec / domain + + A full discussion of the Internet mail use of trace fields is + contained in [RFC5321]. For the purposes of this specification, the + trace fields are strictly informational, and any formal + interpretation of them is outside of the scope of this document. + +3.6.8. Optional Fields + + Fields may appear in messages that are otherwise unspecified in this + document. They MUST conform to the syntax of an optional-field. + This is a field name, made up of the printable US-ASCII characters + except SP and colon, followed by a colon, followed by any text that + conforms to the unstructured syntax. + + The field names of any optional field MUST NOT be identical to any + field name specified elsewhere in this document. + + + + + + + + + + +Resnick Standards Track [Page 30] + +RFC 5322 Internet Message Format October 2008 + + + optional-field = field-name ":" unstructured CRLF + + field-name = 1*ftext + + ftext = %d33-57 / ; Printable US-ASCII + %d59-126 ; characters not including + ; ":". + + For the purposes of this specification, any optional field is + uninterpreted. + +4. Obsolete Syntax + + Earlier versions of this specification allowed for different (usually + more liberal) syntax than is allowed in this version. Also, there + have been syntactic elements used in messages on the Internet whose + interpretations have never been documented. Though these syntactic + forms MUST NOT be generated according to the grammar in section 3, + they MUST be accepted and parsed by a conformant receiver. This + section documents many of these syntactic elements. Taking the + grammar in section 3 and adding the definitions presented in this + section will result in the grammar to use for the interpretation of + messages. 
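The field-name rule of section 3.6.8 (one or more ftext characters, that is, printable US-ASCII 33 through 126 with the colon, 58, excluded) is small enough to check directly. The function name `is_field_name` is hypothetical; the sketch only tests the character set and does not check whether the name collides with a field defined elsewhere in the document.

```python
def is_field_name(name):
    """Return True if `name` matches field-name = 1*ftext.

    ftext is %d33-57 / %d59-126: printable US-ASCII without SP and ':'.
    """
    return bool(name) and all(
        33 <= ord(ch) <= 126 and ch != ":" for ch in name
    )
```

With this check, `is_field_name("X-Mailer")` is true, while names containing a space, a colon, or nothing at all are rejected.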
+ + Note: This section identifies syntactic forms that any + implementation MUST reasonably interpret. However, there are + certainly Internet messages that do not conform to even the + additional syntax given in this section. The fact that a + particular form does not appear in any section of this document is + not justification for computer programs to crash or for malformed + data to be irretrievably lost by any implementation. It is up to + the implementation to deal with messages robustly. + + One important difference between the obsolete (interpreting) and the + current (generating) syntax is that in structured header field bodies + (i.e., between the colon and the CRLF of any structured header + field), white space characters, including folding white space, and + comments could be freely inserted between any syntactic tokens. This + allowed many complex forms that have proven difficult for some + implementations to parse. + + Another key difference between the obsolete and the current syntax is + that the rule in section 3.2.2 regarding lines composed entirely of + white space in comments and folding white space does not apply. See + the discussion of folding white space in section 4.2 below. + + Finally, certain characters that were formerly allowed in messages + appear in this section. The NUL character (ASCII value 0) was once + + + +Resnick Standards Track [Page 31] + +RFC 5322 Internet Message Format October 2008 + + + allowed, but is no longer for compatibility reasons. Similarly, US- + ASCII control characters other than CR, LF, SP, and HTAB (ASCII + values 1 through 8, 11, 12, 14 through 31, and 127) were allowed to + appear in header field bodies. CR and LF were allowed to appear in + messages other than as CRLF; this use is also shown here. + + Other differences in syntax and semantics are noted in the following + sections. + +4.1. 
Miscellaneous Obsolete Tokens + + These syntactic elements are used elsewhere in the obsolete syntax or + in the main syntax. Bare CR, bare LF, and NUL are added to obs-qp, + obs-body, and obs-unstruct. US-ASCII control characters are added to + obs-qp, obs-unstruct, obs-ctext, and obs-qtext. The period character + is added to obs-phrase. The obs-phrase-list provides for a + (potentially empty) comma-separated list of phrases that may include + "null" elements. That is, there could be two or more commas in such + a list with nothing in between them, or commas at the beginning or + end of the list. + + Note: The "period" (or "full stop") character (".") in obs-phrase + is not a form that was allowed in earlier versions of this or any + other specification. Period (nor any other character from + specials) was not allowed in phrase because it introduced a + parsing difficulty distinguishing between phrases and portions of + an addr-spec (see section 4.4). It appears here because the + period character is currently used in many messages in the + display-name portion of addresses, especially for initials in + names, and therefore must be interpreted properly. + + obs-NO-WS-CTL = %d1-8 / ; US-ASCII control + %d11 / ; characters that do not + %d12 / ; include the carriage + %d14-31 / ; return, line feed, and + %d127 ; white space characters + + obs-ctext = obs-NO-WS-CTL + + obs-qtext = obs-NO-WS-CTL + + obs-utext = %d0 / obs-NO-WS-CTL / VCHAR + + obs-qp = "\" (%d0 / obs-NO-WS-CTL / LF / CR) + + obs-body = *((*LF *CR *((%d0 / text) *LF *CR)) / CRLF) + + obs-unstruct = *((*LF *CR *(obs-utext *LF *CR)) / FWS) + + + +Resnick Standards Track [Page 32] + +RFC 5322 Internet Message Format October 2008 + + + obs-phrase = word *(word / "." / CFWS) + + obs-phrase-list = [phrase / CFWS] *("," [phrase / CFWS]) + + Bare CR and bare LF appear in messages with two different meanings. + In many cases, bare CR or bare LF are used improperly instead of CRLF + to indicate line separators. 
In other cases, bare CR and bare LF are + used simply as US-ASCII control characters with their traditional + ASCII meanings. + +4.2. Obsolete Folding White Space + + In the obsolete syntax, any amount of folding white space MAY be + inserted where the obs-FWS rule is allowed. This creates the + possibility of having two consecutive "folds" in a line, and + therefore the possibility that a line which makes up a folded header + field could be composed entirely of white space. + + obs-FWS = 1*WSP *(CRLF 1*WSP) + +4.3. Obsolete Date and Time + + The syntax for the obsolete date format allows a 2 digit year in the + date field and allows for a list of alphabetic time zone specifiers + that were used in earlier versions of this specification. It also + permits comments and folding white space between many of the tokens. + + obs-day-of-week = [CFWS] day-name [CFWS] + + obs-day = [CFWS] 1*2DIGIT [CFWS] + + obs-year = [CFWS] 2*DIGIT [CFWS] + + obs-hour = [CFWS] 2DIGIT [CFWS] + + obs-minute = [CFWS] 2DIGIT [CFWS] + + obs-second = [CFWS] 2DIGIT [CFWS] + + obs-zone = "UT" / "GMT" / ; Universal Time + ; North American UT + ; offsets + "EST" / "EDT" / ; Eastern: - 5/ - 4 + "CST" / "CDT" / ; Central: - 6/ - 5 + "MST" / "MDT" / ; Mountain: - 7/ - 6 + "PST" / "PDT" / ; Pacific: - 8/ - 7 + ; + + + + +Resnick Standards Track [Page 33] + +RFC 5322 Internet Message Format October 2008 + + + %d65-73 / ; Military zones - "A" + %d75-90 / ; through "I" and "K" + %d97-105 / ; through "Z", both + %d107-122 ; upper and lower case + + Where a two or three digit year occurs in a date, the year is to be + interpreted as follows: If a two digit year is encountered whose + value is between 00 and 49, the year is interpreted by adding 2000, + ending up with a value between 2000 and 2049. If a two digit year is + encountered with a value between 50 and 99, or any three digit year + is encountered, the year is interpreted by adding 1900. 
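The year rule above is simple arithmetic and can be written out as a short function. The name `interpret_obs_year` and the choice to take the year as a digit string (so that "05" and "005" remain distinguishable) are assumptions of this sketch.

```python
def interpret_obs_year(digits):
    """Map the digits of an obs-year to a full year.

    Two digits 00-49 -> add 2000; two digits 50-99 -> add 1900;
    any three digits -> add 1900; four or more digits are used as is.
    """
    if not (digits.isascii() and digits.isdigit()) or len(digits) < 2:
        raise ValueError("obs-year needs at least two ASCII digits")
    value = int(digits)
    if len(digits) == 2:
        return value + 2000 if value <= 49 else value + 1900
    if len(digits) == 3:
        return value + 1900
    return value
```

So "12" is read as 2012, "99" as 1999, and the three digit form "101" as 2001.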
+ + In the obsolete time zone, "UT" and "GMT" are indications of + "Universal Time" and "Greenwich Mean Time", respectively, and are + both semantically identical to "+0000". + + The remaining three character zones are the US time zones. The first + letter, "E", "C", "M", or "P" stands for "Eastern", "Central", + "Mountain", and "Pacific". The second letter is either "S" for + "Standard" time, or "D" for "Daylight Savings" (or summer) time. + Their interpretations are as follows: + + EDT is semantically equivalent to -0400 + EST is semantically equivalent to -0500 + CDT is semantically equivalent to -0500 + CST is semantically equivalent to -0600 + MDT is semantically equivalent to -0600 + MST is semantically equivalent to -0700 + PDT is semantically equivalent to -0700 + PST is semantically equivalent to -0800 + + The 1 character military time zones were defined in a non-standard + way in [RFC0822] and are therefore unpredictable in their meaning. + The original definitions of the military zones "A" through "I" are + equivalent to "+0100" through "+0900", respectively; "K", "L", and + "M" are equivalent to "+1000", "+1100", and "+1200", respectively; + "N" through "Y" are equivalent to "-0100" through "-1200", + respectively; and "Z" is equivalent to "+0000". However, because of + the error in [RFC0822], they SHOULD all be considered equivalent to + "-0000" unless there is out-of-band information confirming their + meaning. + + Other multi-character (usually between 3 and 5) alphabetic time zones + have been used in Internet messages. Any such time zone whose + meaning is not known SHOULD be considered equivalent to "-0000" + unless there is out-of-band information confirming their meaning. + + + + + +Resnick Standards Track [Page 34] + +RFC 5322 Internet Message Format October 2008 + + +4.4. Obsolete Addressing + + There are four primary differences in addressing. 
First, mailbox + addresses were allowed to have a route portion before the addr-spec + when enclosed in "<" and ">". The route is simply a comma-separated + list of domain names, each preceded by "@", and the list terminated + by a colon. Second, CFWS were allowed between the period-separated + elements of local-part and domain (i.e., dot-atom was not used). In + addition, local-part is allowed to contain quoted-string in addition + to just atom. Third, mailbox-list and address-list were allowed to + have "null" members. That is, there could be two or more commas in + such a list with nothing in between them, or commas at the beginning + or end of the list. Finally, US-ASCII control characters and quoted- + pairs were allowed in domain literals and are added here. + + obs-angle-addr = [CFWS] "<" obs-route addr-spec ">" [CFWS] + + obs-route = obs-domain-list ":" + + obs-domain-list = *(CFWS / ",") "@" domain + *("," [CFWS] ["@" domain]) + + obs-mbox-list = *([CFWS] ",") mailbox *("," [mailbox / CFWS]) + + obs-addr-list = *([CFWS] ",") address *("," [address / CFWS]) + + obs-group-list = 1*([CFWS] ",") [CFWS] + + obs-local-part = word *("." word) + + obs-domain = atom *("." atom) + + obs-dtext = obs-NO-WS-CTL / quoted-pair + + When interpreting addresses, the route portion SHOULD be ignored. + +4.5. Obsolete Header Fields + + Syntactically, the primary difference in the obsolete field syntax is + that it allows multiple occurrences of any of the fields and they may + occur in any order. Also, any amount of white space is allowed + before the ":" at the end of the field name. 
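A receiver that wants to tolerate the white space permitted before the colon in the obsolete syntax can strip it when splitting a field into name and body. The following is a minimal sketch under that reading; the name `split_field` is hypothetical, and the input is assumed to be one already unfolded header field without its trailing CRLF.

```python
def split_field(line):
    """Split one unfolded header field into (name, body).

    Accepts the obsolete form in which SP or HTAB may appear between
    the field name and the ':'.
    """
    name, sep, body = line.partition(":")
    if not sep:
        raise ValueError("header field has no ':'")
    name = name.rstrip(" \t")  # drop obsolete white space before ':'
    if not name:
        raise ValueError("header field has an empty name")
    return name, body
```

Both `"Subject: hello"` and the obsolete `"Subject : hello"` give the name `"Subject"`; the body is returned unchanged, including its leading space, so that later parsing sees the original text.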
+ + + + + + + + + +Resnick Standards Track [Page 35] + +RFC 5322 Internet Message Format October 2008 + + + obs-fields = *(obs-return / + obs-received / + obs-orig-date / + obs-from / + obs-sender / + obs-reply-to / + obs-to / + obs-cc / + obs-bcc / + obs-message-id / + obs-in-reply-to / + obs-references / + obs-subject / + obs-comments / + obs-keywords / + obs-resent-date / + obs-resent-from / + obs-resent-send / + obs-resent-rply / + obs-resent-to / + obs-resent-cc / + obs-resent-bcc / + obs-resent-mid / + obs-optional) + + Except for destination address fields (described in section 4.5.3), + the interpretation of multiple occurrences of fields is unspecified. + Also, the interpretation of trace fields and resent fields that do + not occur in blocks prepended to the message is unspecified as well. + Unless otherwise noted in the following sections, interpretation of + other fields is identical to the interpretation of their non-obsolete + counterparts in section 3. + +4.5.1. Obsolete Origination Date Field + + obs-orig-date = "Date" *WSP ":" date-time CRLF + +4.5.2. Obsolete Originator Fields + + obs-from = "From" *WSP ":" mailbox-list CRLF + + obs-sender = "Sender" *WSP ":" mailbox CRLF + + obs-reply-to = "Reply-To" *WSP ":" address-list CRLF + + + + + + + +Resnick Standards Track [Page 36] + +RFC 5322 Internet Message Format October 2008 + + +4.5.3. Obsolete Destination Address Fields + + obs-to = "To" *WSP ":" address-list CRLF + + obs-cc = "Cc" *WSP ":" address-list CRLF + + obs-bcc = "Bcc" *WSP ":" + (address-list / (*([CFWS] ",") [CFWS])) CRLF + + When multiple occurrences of destination address fields occur in a + message, they SHOULD be treated as if the address list in the first + occurrence of the field is combined with the address lists of the + subsequent occurrences by adding a comma and concatenating. + +4.5.4. 
Obsolete Identification Fields + + The obsolete "In-Reply-To:" and "References:" fields differ from the + current syntax in that they allow phrase (words or quoted strings) to + appear. The obsolete forms of the left and right sides of msg-id + allow interspersed CFWS, making them syntactically identical to + local-part and domain, respectively. + + obs-message-id = "Message-ID" *WSP ":" msg-id CRLF + + obs-in-reply-to = "In-Reply-To" *WSP ":" *(phrase / msg-id) CRLF + + obs-references = "References" *WSP ":" *(phrase / msg-id) CRLF + + obs-id-left = local-part + + obs-id-right = domain + + For purposes of interpretation, the phrases in the "In-Reply-To:" and + "References:" fields are ignored. + + Semantically, none of the optional CFWS in the local-part and the + domain is part of the obs-id-left and obs-id-right, respectively. + +4.5.5. Obsolete Informational Fields + + obs-subject = "Subject" *WSP ":" unstructured CRLF + + obs-comments = "Comments" *WSP ":" unstructured CRLF + + obs-keywords = "Keywords" *WSP ":" obs-phrase-list CRLF + + + + + + +Resnick Standards Track [Page 37] + +RFC 5322 Internet Message Format October 2008 + + +4.5.6. Obsolete Resent Fields + + The obsolete syntax adds a "Resent-Reply-To:" field, which consists + of the field name, the optional comments and folding white space, the + colon, and a comma separated list of addresses. 
+ + obs-resent-from = "Resent-From" *WSP ":" mailbox-list CRLF + + obs-resent-send = "Resent-Sender" *WSP ":" mailbox CRLF + + obs-resent-date = "Resent-Date" *WSP ":" date-time CRLF + + obs-resent-to = "Resent-To" *WSP ":" address-list CRLF + + obs-resent-cc = "Resent-Cc" *WSP ":" address-list CRLF + + obs-resent-bcc = "Resent-Bcc" *WSP ":" + (address-list / (*([CFWS] ",") [CFWS])) CRLF + + obs-resent-mid = "Resent-Message-ID" *WSP ":" msg-id CRLF + + obs-resent-rply = "Resent-Reply-To" *WSP ":" address-list CRLF + + As with other resent fields, the "Resent-Reply-To:" field is to be + treated as trace information only. + +4.5.7. Obsolete Trace Fields + + The obs-return and obs-received are again given here as template + definitions, just as return and received are in section 3. Their + full syntax is given in [RFC5321]. + + obs-return = "Return-Path" *WSP ":" path CRLF + + obs-received = "Received" *WSP ":" *received-token CRLF + +4.5.8. Obsolete optional fields + + obs-optional = field-name *WSP ":" unstructured CRLF + +5. Security Considerations + + Care needs to be taken when displaying messages on a terminal or + terminal emulator. Powerful terminals may act on escape sequences + and other combinations of US-ASCII control characters with a variety + of consequences. They can remap the keyboard or permit other + modifications to the terminal that could lead to denial of service or + even damaged data. They can trigger (sometimes programmable) + + + +Resnick Standards Track [Page 38] + +RFC 5322 Internet Message Format October 2008 + + + answerback messages that can allow a message to cause commands to be + issued on the recipient's behalf. They can also affect the operation + of terminal attached devices such as printers. Message viewers may + wish to strip potentially dangerous terminal escape sequences from + the message prior to display. However, other escape sequences appear + in messages for useful purposes (cf. 
[ISO.2022.1994], [RFC2045], + [RFC2046], [RFC2047], [RFC2049], [RFC4288], [RFC4289]) and therefore + should not be stripped indiscriminately. + + Transmission of non-text objects in messages raises additional + security issues. These issues are discussed in [RFC2045], [RFC2046], + [RFC2047], [RFC2049], [RFC4288], and [RFC4289]. + + Many implementations use the "Bcc:" (blind carbon copy) field, + described in section 3.6.3, to facilitate sending messages to + recipients without revealing the addresses of one or more of the + addressees to the other recipients. Mishandling this use of "Bcc:" + may disclose confidential information that could eventually lead to + security problems through knowledge of even the existence of a + particular mail address. For example, if using the first method + described in section 3.6.3, where the "Bcc:" line is removed from the + message, blind recipients have no explicit indication that they have + been sent a blind copy, except insofar as their address does not + appear in the header section of a message. Because of this, one of + the blind addressees could potentially send a reply to all of the + shown recipients and accidentally reveal that the message went to the + blind recipient. When the second method from section 3.6.3 is used, + the blind recipient's address appears in the "Bcc:" field of a + separate copy of the message. If the "Bcc:" field sent contains all + of the blind addressees, all of the "Bcc:" recipients will be seen by + each "Bcc:" recipient. Even if a separate message is sent to each + "Bcc:" recipient with only the individual's address, implementations + still need to be careful to process replies to the message as per + section 3.6.3 so as not to accidentally reveal the blind recipient to + other recipients. + +6. IANA Considerations + + This document updates the registrations that appeared in [RFC4021] + that referred to the definitions in [RFC2822]. 
IANA has updated the + Permanent Message Header Field Repository with the following header + fields, in accordance with the procedures set out in [RFC3864]. + + Header field name: Date + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.1) + + + +Resnick Standards Track [Page 39] + +RFC 5322 Internet Message Format October 2008 + + + Header field name: From + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.2) + + Header field name: Sender + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.2) + + Header field name: Reply-To + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.2) + + Header field name: To + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.3) + + Header field name: Cc + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.3) + + Header field name: Bcc + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.3) + + Header field name: Message-ID + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.4) + + Header field name: In-Reply-To + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.4) + + + + +Resnick Standards Track [Page 40] + +RFC 5322 Internet Message Format October 2008 + + + Header field name: References + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification 
document(s): This document (section 3.6.4) + + Header field name: Subject + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.5) + + Header field name: Comments + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.5) + + Header field name: Keywords + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.5) + + Header field name: Resent-Date + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.6) + + Header field name: Resent-From + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.6) + + Header field name: Resent-Sender + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.6) + + Header field name: Resent-To + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.6) + + + + +Resnick Standards Track [Page 41] + +RFC 5322 Internet Message Format October 2008 + + + Header field name: Resent-Cc + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.6) + + Header field name: Resent-Bcc + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.6) + + Header field name: Resent-Reply-To + Applicable protocol: Mail + Status: obsolete + Author/Change controller: IETF + Specification document(s): This document (section 4.5.6) + + Header field name: Resent-Message-ID + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + 
Specification document(s): This document (section 3.6.6) + + Header field name: Return-Path + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.7) + + Header field name: Received + Applicable protocol: Mail + Status: standard + Author/Change controller: IETF + Specification document(s): This document (section 3.6.7) + Related information: [RFC5321] + + + + + + + + + + + + + + + +Resnick Standards Track [Page 42] + +RFC 5322 Internet Message Format October 2008 + + +Appendix A. Example Messages + + This section presents a selection of messages. These are intended to + assist in the implementation of this specification, but should not be + taken as normative; that is to say, although the examples in this + section were carefully reviewed, if there happens to be a conflict + between these examples and the syntax described in sections 3 and 4 + of this document, the syntax in those sections is to be taken as + correct. + + In the text version of this document, messages in this section are + delimited between lines of "----". The "----" lines are not part of + the message itself. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 43] + +RFC 5322 Internet Message Format October 2008 + + +Appendix A.1. Addressing Examples + + The following are examples of messages that might be sent between two + individuals. + +Appendix A.1.1. A Message from One Person to Another with Simple + Addressing + + This could be called a canonical message. It has a single author, + John Doe, a single recipient, Mary Smith, a subject, the date, a + message identifier, and a textual message in the body. + + ---- + From: John Doe <jdoe@machine.example> + To: Mary Smith <mary@example.net> + Subject: Saying Hello + Date: Fri, 21 Nov 1997 09:55:06 -0600 + Message-ID: <1234@local.machine.example> + + This is a message just to say hello. + So, "Hello". 
+ ---- + + If John's secretary Michael actually sent the message, even though + John was the author and replies to this message should go back to + him, the sender field would be used: + + ---- + From: John Doe <jdoe@machine.example> + Sender: Michael Jones <mjones@machine.example> + To: Mary Smith <mary@example.net> + Subject: Saying Hello + Date: Fri, 21 Nov 1997 09:55:06 -0600 + Message-ID: <1234@local.machine.example> + + This is a message just to say hello. + So, "Hello". + ---- + + + + + + + + + + + + + +Resnick Standards Track [Page 44] + +RFC 5322 Internet Message Format October 2008 + + +Appendix A.1.2. Different Types of Mailboxes + + This message includes multiple addresses in the destination fields + and also uses several different forms of addresses. + + ---- + From: "Joe Q. Public" <john.q.public@example.com> + To: Mary Smith <mary@x.test>, jdoe@example.org, Who? <one@y.test> + Cc: <boss@nil.test>, "Giant; \"Big\" Box" <sysservices@example.net> + Date: Tue, 1 Jul 2003 10:52:37 +0200 + Message-ID: <5678.21-Nov-1997@example.com> + + Hi everyone. + ---- + + Note that the display names for Joe Q. Public and Giant; "Big" Box + needed to be enclosed in double-quotes because the former contains + the period and the latter contains both semicolon and double-quote + characters (the double-quote characters appearing as quoted-pair + constructs). Conversely, the display name for Who? could appear + without them because the question mark is legal in an atom. Notice + also that jdoe@example.org and boss@nil.test have no display names + associated with them at all, and jdoe@example.org uses the simpler + address form without the angle brackets. + +Appendix A.1.3. Group Addresses + + ---- + From: Pete <pete@silly.example> + To: A Group:Ed Jones <c@a.test>,joe@where.test,John <jdoe@one.test>; + Cc: Undisclosed recipients:; + Date: Thu, 13 Feb 1969 23:32:54 -0330 + Message-ID: <testabcd.1234@silly.example> + + Testing. 
+ ---- + + In this message, the "To:" field has a single group recipient named + "A Group", which contains 3 addresses, and a "Cc:" field with an + empty group recipient named Undisclosed recipients. + + + + + + + + + + + +Resnick Standards Track [Page 45] + +RFC 5322 Internet Message Format October 2008 + + +Appendix A.2. Reply Messages + + The following is a series of three messages that make up a + conversation thread between John and Mary. John first sends a + message to Mary, Mary then replies to John's message, and then John + replies to Mary's reply message. + + Note especially the "Message-ID:", "References:", and "In-Reply-To:" + fields in each message. + + ---- + From: John Doe <jdoe@machine.example> + To: Mary Smith <mary@example.net> + Subject: Saying Hello + Date: Fri, 21 Nov 1997 09:55:06 -0600 + Message-ID: <1234@local.machine.example> + + This is a message just to say hello. + So, "Hello". + ---- + + When sending replies, the Subject field is often retained, though + prepended with "Re: " as described in section 3.6.5. + + ---- + From: Mary Smith <mary@example.net> + To: John Doe <jdoe@machine.example> + Reply-To: "Mary Smith: Personal Account" <smith@home.example> + Subject: Re: Saying Hello + Date: Fri, 21 Nov 1997 10:01:10 -0600 + Message-ID: <3456@example.net> + In-Reply-To: <1234@local.machine.example> + References: <1234@local.machine.example> + + This is a reply to your hello. + ---- + + Note the "Reply-To:" field in the above message. When John replies + to Mary's message above, the reply should go to the address in the + "Reply-To:" field instead of the address in the "From:" field. 
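Editorial sketch (not part of the RFC text): the reply conventions described above can be illustrated with a small Python function. It assumes header fields are held in a plain dict keyed by field name. The names reply_headers, my_address, and my_message_id are hypothetical. The "References:" handling follows the rule in section 3.6.4: the parent's "References:" if present, otherwise a single-identifier "In-Reply-To:", followed by the parent's "Message-ID:".

```python
def reply_headers(original, my_address, my_message_id):
    """Build the header fields of a reply to `original` (dict: name -> value).

    Hypothetical helper for illustration only.
    """
    parent_id = original["Message-ID"]

    # Start the thread from the parent's References, falling back to an
    # In-Reply-To that carries exactly one message identifier.
    refs = original.get("References", "")
    if not refs:
        in_reply_to = original.get("In-Reply-To", "")
        if len(in_reply_to.split()) == 1:
            refs = in_reply_to

    # Prepend "Re: " only once.
    subject = original.get("Subject", "")
    if not subject.lower().startswith("re:"):
        subject = "Re: " + subject

    return {
        "From": my_address,
        # Replies go to Reply-To when present, otherwise to From.
        "To": original.get("Reply-To") or original["From"],
        "Subject": subject,
        "Message-ID": my_message_id,
        "In-Reply-To": parent_id,
        "References": (refs + " " + parent_id).strip(),
    }
```

Applied to Mary's reply above, the sketch yields the "To:", "In-Reply-To:", and "References:" values that appear in John's reply, the third message of this appendix.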
+ + + + + + + + + + + +Resnick Standards Track [Page 46] + +RFC 5322 Internet Message Format October 2008 + + + ---- + To: "Mary Smith: Personal Account" <smith@home.example> + From: John Doe <jdoe@machine.example> + Subject: Re: Saying Hello + Date: Fri, 21 Nov 1997 11:00:00 -0600 + Message-ID: <abcd.1234@local.machine.test> + In-Reply-To: <3456@example.net> + References: <1234@local.machine.example> <3456@example.net> + + This is a reply to your reply. + ---- + +Appendix A.3. Resent Messages + + Start with the message that has been used as an example several + times: + + ---- + From: John Doe <jdoe@machine.example> + To: Mary Smith <mary@example.net> + Subject: Saying Hello + Date: Fri, 21 Nov 1997 09:55:06 -0600 + Message-ID: <1234@local.machine.example> + + This is a message just to say hello. + So, "Hello". + ---- + + Say that Mary, upon receiving this message, wishes to send a copy of + the message to Jane such that (a) the message would appear to have + come straight from John; (b) if Jane replies to the message, the + reply should go back to John; and (c) all of the original + information, like the date the message was originally sent to Mary, + the message identifier, and the original addressee, is preserved. In + this case, resent fields are prepended to the message: + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 47] + +RFC 5322 Internet Message Format October 2008 + + + ---- + Resent-From: Mary Smith <mary@example.net> + Resent-To: Jane Brown <j-brown@other.example> + Resent-Date: Mon, 24 Nov 1997 14:22:01 -0800 + Resent-Message-ID: <78910@example.net> + From: John Doe <jdoe@machine.example> + To: Mary Smith <mary@example.net> + Subject: Saying Hello + Date: Fri, 21 Nov 1997 09:55:06 -0600 + Message-ID: <1234@local.machine.example> + + This is a message just to say hello. + So, "Hello". 
+ ---- + + If Jane, in turn, wished to resend this message to another person, + she would prepend her own set of resent header fields to the above + and send that. (Note that for brevity, trace fields are not shown.) + +Appendix A.4. Messages with Trace Fields + + As messages are sent through the transport system as described in + [RFC5321], trace fields are prepended to the message. The following + is an example of what those trace fields might look like. Note that + there is some folding white space in the first one since these lines + can be long. + + ---- + Received: from x.y.test + by example.net + via TCP + with ESMTP + id ABC12345 + for <mary@example.net>; 21 Nov 1997 10:05:43 -0600 + Received: from node.example by x.y.test; 21 Nov 1997 10:01:22 -0600 + From: John Doe <jdoe@node.example> + To: Mary Smith <mary@example.net> + Subject: Saying Hello + Date: Fri, 21 Nov 1997 09:55:06 -0600 + Message-ID: <1234@local.node.example> + + This is a message just to say hello. + So, "Hello". + ---- + + + + + + + +Resnick Standards Track [Page 48] + +RFC 5322 Internet Message Format October 2008 + + +Appendix A.5. White Space, Comments, and Other Oddities + + White space, including folding white space, and comments can be + inserted between many of the tokens of fields. Taking the example + from A.1.3, white space and comments can be inserted into all of the + fields. + + ---- + From: Pete(A nice \) chap) <pete(his account)@silly.test(his host)> + To:A Group(Some people) + :Chris Jones <c@(Chris's host.)public.example>, + joe@example.org, + John <jdoe@one.test> (my dear friend); (the end of the group) + Cc:(Empty list)(start)Hidden recipients :(nobody(that I know)) ; + Date: Thu, + 13 + Feb + 1969 + 23:32 + -0330 (Newfoundland Time) + Message-ID: <testabcd.1234@silly.test> + + Testing. + ---- + + The above example is aesthetically displeasing, but perfectly legal. 
+ Note particularly (1) the comments in the "From:" field (including + one that has a ")" character appearing as part of a quoted-pair); (2) + the white space absent after the ":" in the "To:" field as well as + the comment and folding white space after the group name, the special + character (".") in the comment in Chris Jones's address, and the + folding white space before and after "joe@example.org,"; (3) the + multiple and nested comments in the "Cc:" field as well as the + comment immediately following the ":" after "Cc"; (4) the folding + white space (but no comments except at the end) and the missing + seconds in the time of the date field; and (5) the white space before + (but not within) the identifier in the "Message-ID:" field. + + + + + + + + + + + + + + +Resnick Standards Track [Page 49] + +RFC 5322 Internet Message Format October 2008 + + +Appendix A.6. Obsoleted Forms + + The following are examples of obsolete (that is, the "MUST NOT + generate") syntactic elements described in section 4 of this + document. + +Appendix A.6.1. Obsolete Addressing + + Note in the example below the lack of quotes around Joe Q. Public, + the route that appears in the address for Mary Smith, the two commas + that appear in the "To:" field, and the spaces that appear around the + "." in the jdoe address. + + ---- + From: Joe Q. Public <john.q.public@example.com> + To: Mary Smith <@node.test:mary@example.net>, , jdoe@test . example + Date: Tue, 1 Jul 2003 10:52:37 +0200 + Message-ID: <5678.21-Nov-1997@example.com> + + Hi everyone. + ---- + +Appendix A.6.2. Obsolete Dates + + The following message uses an obsolete date format, including a non- + numeric time zone and a two digit year. Note that although the day- + of-week is missing, that is not specific to the obsolete syntax; it + is optional in the current syntax as well. 
+ + ---- + From: John Doe <jdoe@machine.example> + To: Mary Smith <mary@example.net> + Subject: Saying Hello + Date: 21 Nov 97 09:55:06 GMT + Message-ID: <1234@local.machine.example> + + This is a message just to say hello. + So, "Hello". + ---- + + + + + + + + + + + + +Resnick Standards Track [Page 50] + +RFC 5322 Internet Message Format October 2008 + + +Appendix A.6.3. Obsolete White Space and Comments + + White space and comments can appear between many more elements than + in the current syntax. Also, folding lines that are made up entirely + of white space are legal. + + ---- + From : John Doe <jdoe@machine(comment). example> + To : Mary Smith + __ + <mary@example.net> + Subject : Saying Hello + Date : Fri, 21 Nov 1997 09(comment): 55 : 06 -0600 + Message-ID : <1234 @ local(blah) .machine .example> + + This is a message just to say hello. + So, "Hello". + ---- + + Note especially the second line of the "To:" field. It starts with + two space characters. (Note that "__" represent blank spaces.) + Therefore, it is considered part of the folding, as described in + section 4.2. Also, the comments and white space throughout + addresses, dates, and message identifiers are all part of the + obsolete syntax. + + + + + + + + + + + + + + + + + + + + + + + + + + +Resnick Standards Track [Page 51] + +RFC 5322 Internet Message Format October 2008 + + +Appendix B. Differences from Earlier Specifications + + This appendix contains a list of changes that have been made in the + Internet Message Format from earlier specifications, specifically + [RFC0822], [RFC1123], and [RFC2822]. Items marked with an asterisk + (*) below are items which appear in section 4 of this document and + therefore can no longer be generated. + + The following are the changes made from [RFC0822] and [RFC1123] to + [RFC2822] that remain in this document: + + 1. Period allowed in obsolete form of phrase. + 2. ABNF moved out of document, now in [RFC5234]. + 3. Four or more digits allowed for year. + 4. 
Header field ordering (and lack thereof) made explicit. + 5. Encrypted header field removed. + 6. Specifically allow and give meaning to "-0000" time zone. + 7. Folding white space is not allowed between every token. + 8. Requirement for destinations removed. + 9. Forwarding and resending redefined. + 10. Extension header fields no longer specifically called out. + 11. ASCII 0 (null) removed.* + 12. Folding continuation lines cannot contain only white space.* + 13. Free insertion of comments not allowed in date.* + 14. Non-numeric time zones not allowed.* + 15. Two digit years not allowed.* + 16. Three digit years interpreted, but not allowed for generation.* + 17. Routes in addresses not allowed.* + 18. CFWS within local-parts and domains not allowed.* + 19. Empty members of address lists not allowed.* + 20. Folding white space between field name and colon not allowed.* + 21. Comments between field name and colon not allowed. + 22. Tightened syntax of in-reply-to and references.* + 23. CFWS within msg-id not allowed.* + 24. Tightened semantics of resent fields as informational only. + 25. Resent-Reply-To not allowed.* + 26. No multiple occurrences of fields (except resent and received).* + 27. Free CR and LF not allowed.* + 28. Line length limits specified. + 29. Bcc more clearly specified. + + + + + + + + + + + +Resnick Standards Track [Page 52] + +RFC 5322 Internet Message Format October 2008 + + + The following are changes from [RFC2822]. + 1. Assorted typographical/grammatical errors fixed and + clarifications made. + 2. Changed "standard" to "document" or "specification" throughout. + 3. Made distinction between "header field" and "header section". + 4. Removed NO-WS-CTL from ctext, qtext, dtext, and unstructured.* + 5. Moved discussion of specials to the "Atom" section. Moved text + to "Overall message syntax" section. + 6. Simplified CFWS syntax. + 7. Fixed unstructured syntax. + 8. 
Changed date and time syntax to deal with white space in + obsolete date syntax. + 9. Removed quoted-pair from domain literals and message + identifiers.* + 10. Clarified that other specifications limit domain syntax. + 11. Simplified "Bcc:" and "Resent-Bcc:" syntax. + 12. Allowed optional-field to appear within trace information. + 13. Removed no-fold-quote from msg-id. Clarified syntax + limitations. + 14. Generalized "Received:" syntax to fix bugs and move definition + out of this document. + 15. Simplified obs-qp. Fixed and simplified obs-utext (which now + only appears in the obsolete syntax). Removed obs-text and obs- + char, adding obs-body. + 16. Fixed obsolete date syntax to allow for more (or less) comments + and white space. + 17. Fixed all obsolete list syntax (obs-domain-list, obs-mbox-list, + obs-addr-list, obs-phrase-list, and the newly added obs-group- + list). + 18. Fixed obs-reply-to syntax. + 19. Fixed obs-bcc and obs-resent-bcc to allow empty lists. + 20. Removed obs-path. + +Appendix C. Acknowledgements + + Many people contributed to this document. They included folks who + participated in the Detailed Revision and Update of Messaging + Standards (DRUMS) Working Group of the Internet Engineering Task + Force (IETF), the chair of DRUMS, the Area Directors of the IETF, and + people who simply sent their comments in via email. The editor is + deeply indebted to them all and thanks them sincerely. The below + list includes everyone who sent email concerning both this document + and [RFC2822]. 
Hopefully, everyone who contributed is named here: + + +--------------------+----------------------+---------------------+ + | Matti Aarnio | Tanaka Akira | Russ Allbery | + | Eric Allman | Harald Alvestrand | Ran Atkinson | + | Jos Backus | Bruce Balden | Dave Barr | + + + +Resnick Standards Track [Page 53] + +RFC 5322 Internet Message Format October 2008 + + + | Alan Barrett | John Beck | J Robert von Behren | + | Jos den Bekker | D J Bernstein | James Berriman | + | Oliver Block | Norbert Bollow | Raj Bose | + | Antony Bowesman | Scott Bradner | Randy Bush | + | Tom Byrer | Bruce Campbell | Larry Campbell | + | W J Carpenter | Michael Chapman | Richard Clayton | + | Maurizio Codogno | Jim Conklin | R Kelley Cook | + | Nathan Coulter | Steve Coya | Mark Crispin | + | Dave Crocker | Matt Curtin | Michael D'Errico | + | Cyrus Daboo | Michael D Dean | Jutta Degener | + | Mark Delany | Steve Dorner | Harold A Driscoll | + | Michael Elkins | Frank Ellerman | Robert Elz | + | Johnny Eriksson | Erik E Fair | Roger Fajman | + | Patrik Faltstrom | Claus Andre Faerber | Barry Finkel | + | Erik Forsberg | Chuck Foster | Paul Fox | + | Klaus M Frank | Ned Freed | Jochen Friedrich | + | Randall C Gellens | Sukvinder Singh Gill | Tim Goodwin | + | Philip Guenther | Arnt Gulbrandsen | Eric A Hall | + | Tony Hansen | John Hawkinson | Philip Hazel | + | Kai Henningsen | Robert Herriot | Paul Hethmon | + | Jim Hill | Alfred Hoenes | Paul E Hoffman | + | Steve Hole | Kari Hurtta | Marco S Hyman | + | Ofer Inbar | Olle Jarnefors | Kevin Johnson | + | Sudish Joseph | Maynard Kang | Prabhat Keni | + | John C Klensin | Graham Klyne | Brad Knowles | + | Shuhei Kobayashi | Peter Koch | Dan Kohn | + | Christian Kuhtz | Anand Kumria | Steen Larsen | + | Eliot Lear | Barry Leiba | Jay Levitt | + | Bruce Lilly | Lars-Johan Liman | Charles Lindsey | + | Pete Loshin | Simon Lyall | Bill Manning | + | John Martin | Mark Martinec | Larry Masinter | + | Denis McKeon | William P McQuillan | Alexey 
Melnikov | + | Perry E Metzger | Steven Miller | S Moonesamy | + | Keith Moore | John Gardiner Myers | Chris Newman | + | John W Noerenberg | Eric Norman | Mike O'Dell | + | Larry Osterman | Paul Overell | Jacob Palme | + | Michael A Patton | Uzi Paz | Michael A Quinlan | + | Robert Rapplean | Eric S Raymond | Sam Roberts | + | Hugh Sasse | Bart Schaefer | Tom Scola | + | Wolfgang Segmuller | Nick Shelness | John Stanley | + | Einar Stefferud | Jeff Stephenson | Bernard Stern | + | Peter Sylvester | Mark Symons | Eric Thomas | + | Lee Thompson | Karel De Vriendt | Matthew Wall | + | Rolf Weber | Brent B Welch | Dan Wing | + | Jack De Winter | Gregory J Woodhouse | Greg A Woods | + | Kazu Yamamoto | Alain Zahm | Jamie Zawinski | + | Timothy S Zurcher | | | + +--------------------+----------------------+---------------------+ + + + +Resnick Standards Track [Page 54] + +RFC 5322 Internet Message Format October 2008 + + +7. References + +7.1. Normative References + + [ANSI.X3-4.1986] American National Standards Institute, "Coded + Character Set - 7-bit American Standard Code for + Information Interchange", ANSI X3.4, 1986. + + [RFC1034] Mockapetris, P., "Domain names - concepts and + facilities", STD 13, RFC 1034, November 1987. + + [RFC1035] Mockapetris, P., "Domain names - implementation and + specification", STD 13, RFC 1035, November 1987. + + [RFC1123] Braden, R., "Requirements for Internet Hosts - + Application and Support", STD 3, RFC 1123, + October 1989. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for + Syntax Specifications: ABNF", STD 68, RFC 5234, + January 2008. + +7.2. Informative References + + [RFC0822] Crocker, D., "Standard for the format of ARPA + Internet text messages", STD 11, RFC 822, + August 1982. + + [RFC1305] Mills, D., "Network Time Protocol (Version 3) + Specification, Implementation", RFC 1305, + March 1992. 
+ + [ISO.2022.1994] International Organization for Standardization, + "Information technology - Character code structure + and extension techniques", ISO Standard 2022, 1994. + + [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet + Mail Extensions (MIME) Part One: Format of Internet + Message Bodies", RFC 2045, November 1996. + + [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet + Mail Extensions (MIME) Part Two: Media Types", + RFC 2046, November 1996. + + + + + +Resnick Standards Track [Page 55] + +RFC 5322 Internet Message Format October 2008 + + + [RFC2047] Moore, K., "MIME (Multipurpose Internet Mail + Extensions) Part Three: Message Header Extensions + for Non-ASCII Text", RFC 2047, November 1996. + + [RFC2049] Freed, N. and N. Borenstein, "Multipurpose Internet + Mail Extensions (MIME) Part Five: Conformance + Criteria and Examples", RFC 2049, November 1996. + + [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, + April 2001. + + [RFC3864] Klyne, G., Nottingham, M., and J. Mogul, + "Registration Procedures for Message Header + Fields", BCP 90, RFC 3864, September 2004. + + [RFC4021] Klyne, G. and J. Palme, "Registration of Mail and + MIME Header Fields", RFC 4021, March 2005. + + [RFC4288] Freed, N. and J. Klensin, "Media Type + Specifications and Registration Procedures", + BCP 13, RFC 4288, December 2005. + + [RFC4289] Freed, N. and J. Klensin, "Multipurpose Internet + Mail Extensions (MIME) Part Four: Registration + Procedures", BCP 13, RFC 4289, December 2005. + + [RFC5321] Klensin, J., "Simple Mail Transfer Protocol", + RFC 5321, October 2008. + +Author's Address + + Peter W. 
Resnick (editor) + Qualcomm Incorporated + 5775 Morehouse Drive + San Diego, CA 92121-1714 + US + + Phone: +1 858 651 4478 + EMail: presnick@qualcomm.com + URI: http://www.qualcomm.com/~presnick/ + + + + + + + + + + + +Resnick Standards Track [Page 56] + +RFC 5322 Internet Message Format October 2008 + + +Full Copyright Statement + + Copyright (C) The IETF Trust (2008). + + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors + retain all their rights. + + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND + THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS + OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF + THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. 
+ + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary + rights that may cover technology that may be required to implement + this standard. Please address the information to the IETF at + ietf-ipr@ietf.org. + + + + + + + + + + + + +Resnick Standards Track [Page 57] + diff --git a/rfc/rfc5804.txt b/rfc/rfc5804.txt @@ -0,0 +1,2747 @@ + + + + + + +Internet Engineering Task Force (IETF) A. Melnikov, Ed. +Request for Comments: 5804 Isode Limited +Category: Standards Track T. Martin +ISSN: 2070-1721 BeThereBeSquare, Inc. + July 2010 + + + A Protocol for Remotely Managing Sieve Scripts + +Abstract + + Sieve scripts allow users to filter incoming email. Message stores + are commonly sealed servers so users cannot log into them, yet users + must be able to update their scripts on them. This document + describes a protocol "ManageSieve" for securely managing Sieve + scripts on a remote server. This protocol allows a user to have + multiple scripts, and also alerts a user to syntactically flawed + scripts. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc5804. + +Copyright Notice + + Copyright (c) 2010 IETF Trust and the persons identified as the + document authors. All rights reserved. 
+ + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + + + + +Melnikov & Martin Standards Track [Page 1] + +RFC 5804 ManageSieve July 2010 + + +Table of Contents + + 1. Introduction ....................................................3 + 1.1. Commands and Responses .....................................3 + 1.2. Syntax .....................................................3 + 1.3. Response Codes .............................................3 + 1.4. Active Script ..............................................6 + 1.5. Quotas .....................................................6 + 1.6. Script Names ...............................................6 + 1.7. Capabilities ...............................................7 + 1.8. Transport ..................................................9 + 1.9. Conventions Used in This Document .........................10 + 2. Commands .......................................................10 + 2.1. AUTHENTICATE Command ......................................11 + 2.1.1. Use of SASL PLAIN Mechanism over TLS ...............16 + 2.2. STARTTLS Command ..........................................16 + 2.2.1. Server Identity Check ..............................17 + 2.3. LOGOUT Command ............................................20 + 2.4. CAPABILITY Command ........................................20 + 2.5. HAVESPACE Command .........................................20 + 2.6. PUTSCRIPT Command .........................................21 + 2.7. 
LISTSCRIPTS Command .......................................23 + 2.8. SETACTIVE Command .........................................24 + 2.9. GETSCRIPT Command .........................................25 + 2.10. DELETESCRIPT Command .....................................25 + 2.11. RENAMESCRIPT Command .....................................26 + 2.12. CHECKSCRIPT Command ......................................27 + 2.13. NOOP Command .............................................28 + 2.14. Recommended Extensions ...................................28 + 2.14.1. UNAUTHENTICATE Command ............................28 + 3. Sieve URL Scheme ...............................................29 + 4. Formal Syntax ..................................................31 + 5. Security Considerations ........................................37 + 6. IANA Considerations ............................................38 + 6.1. ManageSieve Capability Registration Template ..............39 + 6.2. Registration of Initial ManageSieve Capabilities ..........39 + 6.3. ManageSieve Response Code Registration Template ...........41 + 6.4. Registration of Initial ManageSieve Response Codes ........41 + 7. Internationalization Considerations ............................46 + 8. Acknowledgements ...............................................46 + 9. References .....................................................47 + 9.1. Normative References ......................................47 + 9.2. Informative References ....................................48 + + + + + + + + +Melnikov & Martin Standards Track [Page 2] + +RFC 5804 ManageSieve July 2010 + + +1. Introduction + +1.1. Commands and Responses + + A ManageSieve connection consists of the establishment of a client/ + server network connection, an initial greeting from the server, and + client/server interactions. These client/server interactions consist + of a client command, server data, and a server completion result + response. 
+ + All interactions transmitted by client and server are in the form of + lines, that is, strings that end with a CRLF. The protocol receiver + of a ManageSieve client or server is either reading a line or reading + a sequence of octets with a known count followed by a line. + +1.2. Syntax + + ManageSieve is a line-oriented protocol much like [IMAP] or [ACAP], + which runs over TCP. There are three data types: atoms, numbers and + strings. Strings may be quoted or literal. See [ACAP] for detailed + descriptions of these types. + + Each command consists of an atom (the command name) followed by zero + or more strings and numbers terminated by CRLF. + + All client queries are replied to with either an OK, NO, or BYE + response. Each response may be followed by a response code (see + Section 1.3) and by a string consisting of human-readable text in the + local language (as returned by the LANGUAGE capability; see + Section 1.7), encoded in UTF-8 [UTF-8]. The contents of the string + SHOULD be shown to the user, and implementations MUST NOT attempt to + parse the message for meaning. + + The BYE response SHOULD be used if the server wishes to close the + connection. A server may wish to do this because the client was idle + for too long or there were too many failed authentication attempts. + This response can be issued at any time and should be immediately + followed by a server hang-up of the connection. If a server has an + inactivity timeout resulting in client autologout, it MUST be no less + than 30 minutes after successful authentication. The inactivity + timeout MAY be less before authentication. + +1.3. Response Codes + + An OK, NO, or BYE response from the server MAY contain a response + code to describe the event in a more detailed machine-parsable + fashion. A response code consists of data inside parentheses in the + form of an atom, possibly followed by a space and arguments. 
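Editorial sketch (not part of RFC 5804): the completion-line shape described above, an OK/NO/BYE status, an optional parenthesized response code, and an optional human-readable string, could be split apart roughly as follows. The names RESPONSE_RE and parse_response are hypothetical. The sketch assumes the human-readable text arrives as a quoted string rather than a literal, and that response code arguments contain no ")" character.

```python
import re

# Hypothetical helper, not defined by RFC 5804.
# Assumptions: text is a quoted string (no literals); code arguments
# contain no ")" character.
RESPONSE_RE = re.compile(
    r'^(?P<status>OK|NO|BYE)'
    r'(?: \((?P<code>[^ )]+)(?: (?P<args>[^)]*))?\))?'
    r'(?: "(?P<text>(?:[^"\\]|\\.)*)")?$',
    re.IGNORECASE,
)


def parse_response(line):
    """Split one completion line into status, response code, args, text."""
    m = RESPONSE_RE.match(line.rstrip("\r\n"))
    if m is None:
        raise ValueError("not a completion response: %r" % line)
    text = m.group("text")
    if text is not None:
        # Undo backslash escapes inside the quoted string.
        text = re.sub(r'\\(.)', r'\1', text)
    return {
        "status": m.group("status").upper(),
        "code": m.group("code"),   # machine-readable part, may be None
        "args": m.group("args"),   # raw argument text, may be None
        "text": text,              # for display only, never parsed
    }
```

A client built this way would branch only on "status" and "code" and would pass "text" to the user unchanged, in line with the rule that the human-readable string is not to be parsed for meaning.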
+ + + +Melnikov & Martin Standards Track [Page 3] + +RFC 5804 ManageSieve July 2010 + + + Response codes are defined when there is a specific action that a + client can take based upon the additional information. In order to + support future extension, the response code is represented as a + slash-separated (Solidus, %x2F) hierarchy with each level of + hierarchy representing increasing detail about the error. Response + codes MUST NOT start with the Solidus character. Clients MUST + tolerate additional hierarchical response code detail that they don't + understand. For example, if the client supports the "QUOTA" response + code, but doesn't understand the "QUOTA/MAXSCRIPTS" response code, it + should treat "QUOTA/MAXSCRIPTS" as "QUOTA". + + Client implementations MUST tolerate (ignore) response codes that + they do not recognize. + + The currently defined response codes are the following: + + AUTH-TOO-WEAK + + This response code is returned in the NO or BYE response from an + AUTHENTICATE command. It indicates that site security policy forbids + the use of the requested mechanism for the specified authentication + identity. + + ENCRYPT-NEEDED + + This response code is returned in the NO or BYE response from an + AUTHENTICATE command. It indicates that site security policy + requires the use of a strong encryption mechanism for the specified + authentication identity and mechanism. + + QUOTA + + If this response code is returned in the NO/BYE response, it means + that the command would have placed the user above the site-defined + quota constraints. If this response code is returned in the OK + response, it can mean that the user's storage is near its quota, or + it can mean that the account exceeded its quota but that the + condition is being allowed by the server (the server supports + so-called soft quotas). 
The QUOTA response code has two more + detailed variants: "QUOTA/MAXSCRIPTS" (the maximum number of per-user + scripts) and "QUOTA/MAXSIZE" (the maximum script size). + + REFERRAL + + This response code may be returned with a BYE result from any + command, and includes a mandatory parameter that indicates what + server to access to manage this user's Sieve scripts. The server + will be specified by a Sieve URL (see Section 3). The scriptname + + + +Melnikov & Martin Standards Track [Page 4] + +RFC 5804 ManageSieve July 2010 + + + portion of the URL MUST NOT be specified. The client should + authenticate to the specified server and use it for all further + commands in the current session. + + SASL + + This response code can occur in the OK response to a successful + AUTHENTICATE command and includes the optional final server response + data from the server as specified by [SASL]. + + TRANSITION-NEEDED + + This response code occurs in a NO response of an AUTHENTICATE + command. It indicates that the user name is valid, but the entry in + the authentication database needs to be updated in order to permit + authentication with the specified mechanism. This is typically done + by establishing a secure channel using TLS, verifying server identity + as specified in Section 2.2.1, and finally authenticating once using + the [PLAIN] authentication mechanism. The selected mechanism SHOULD + then work for authentications in subsequent sessions. + + This condition can happen if a user has an entry in a system + authentication database such as Unix /etc/passwd, but does not have + credentials suitable for use by the specified mechanism. + + TRYLATER + + A command failed due to a temporary server failure. The client MAY + continue using local information and try the command later. This + response code only makes sense when returned in a NO/BYE response. 
+ + ACTIVE + + A command failed because it is not allowed on the active script, for + example, DELETESCRIPT on the active script. This response code only + makes sense when returned in a NO/BYE response. + + NONEXISTENT + + A command failed because the referenced script name doesn't exist. + This response code only makes sense when returned in a NO/BYE + response. + + ALREADYEXISTS + + A command failed because the referenced script name already exists. + This response code only makes sense when returned in a NO/BYE + response. + + + +Melnikov & Martin Standards Track [Page 5] + +RFC 5804 ManageSieve July 2010 + + + TAG + + This response code name is followed by a string specified in the + command. See Section 2.13 for a possible use case. + + WARNINGS + + This response code MAY be returned by the server in the OK response + (but it might be returned with the NO/BYE response as well) and + signals the client that even though the script is syntactically + valid, it might contain errors not intended by the script writer. + This response code is typically returned in response to PUTSCRIPT + and/or CHECKSCRIPT commands. A client seeing such response code + SHOULD present the returned warning text to the user. + +1.4. Active Script + + A user may have multiple Sieve scripts on the server, yet only one + script may be used for filtering of incoming messages. This is the + active script. Users may have zero or one active script and MUST use + the SETACTIVE command described below for changing the active script + or disabling Sieve processing. For example, users may have an + everyday script they normally use and a special script they use when + they go on vacation. Users can change which script is being used + without having to download and upload a script stored somewhere else. + +1.5. Quotas + + Servers SHOULD impose quotas to prevent malicious users from + overflowing available storage. 
If a command would place a user over + a quota setting, servers that impose such quotas MUST reply with a NO + response containing the QUOTA response code. Client implementations + MUST be able to handle commands failing because of quota + restrictions. + +1.6. Script Names + + A Sieve script name is a sequence of Unicode characters encoded in + UTF-8 [UTF-8]. A script name MUST comply with Net-Unicode Definition + (Section 2 of [NET-UNICODE]), with the additional restriction of + prohibiting the following Unicode characters: + + o 0000-001F; [CONTROL CHARACTERS] + + o 007F; DELETE + + o 0080-009F; [CONTROL CHARACTERS] + + + + +Melnikov & Martin Standards Track [Page 6] + +RFC 5804 ManageSieve July 2010 + + + o 2028; LINE SEPARATOR + + o 2029; PARAGRAPH SEPARATOR + + Sieve script names MUST be at least one octet (and hence Unicode + character) long. Zero octets script name has a special meaning (see + Section 2.8). Servers MUST allow names of up to 128 Unicode + characters in length (which can take up to 512 bytes when encoded in + UTF-8, not counting the terminating NUL), and MAY allow longer names. + A server that receives a script name longer than its internal limit + MUST reject the corresponding operation, in particular it MUST NOT + truncate the script name. + +1.7. Capabilities + + Server capabilities are sent automatically by the server upon a + client connection, or after successful STARTTLS and AUTHENTICATE + (which establishes a Simple Authentication and Security Layer (SASL)) + commands. Capabilities may change immediately after a successfully + completed STARTTLS command, and/or immediately after a successfully + completed AUTHENTICATE command, and/or after a successfully completed + UNAUTHENTICATE command (see Section 2.14.1). Capabilities MUST + remain static at all other times. + + Clients MAY request the capabilities at a later time by issuing the + CAPABILITY command described later. 
The capabilities consist of a + series of lines each with one or two strings. The first string is + the name of the capability, which is case-insensitive. The second + optional string is the value associated with that capability. Order + of capabilities is arbitrary, but each capability name can appear at + most once. + + The following capabilities are defined in this document: + + IMPLEMENTATION - Name of implementation and version. This capability + MUST always be returned by the server. + + SASL - List of SASL mechanisms supported by the server, each + separated by a space. This list can be empty if and only if STARTTLS + is also advertised. This means that the client must negotiate TLS + encryption with STARTTLS first, at which point the SASL capability + will list a non-empty list of SASL mechanisms. + + SIEVE - List of space-separated Sieve extensions (as listed in Sieve + "require" action [SIEVE]) supported by the Sieve engine. This + capability MUST always be returned by the server. + + + + + +Melnikov & Martin Standards Track [Page 7] + +RFC 5804 ManageSieve July 2010 + + + STARTTLS - If TLS [TLS] is supported by this implementation. Before + advertising this capability a server MUST verify to the best of its + ability that TLS can be successfully negotiated by a client with + common cipher suites. Specifically, a server should verify that a + server certificate has been installed and that the TLS subsystem has + successfully initialized. This capability SHOULD NOT be advertised + once STARTTLS or AUTHENTICATE command completes successfully. Client + and server implementations MUST implement the STARTTLS extension. + + MAXREDIRECTS - Specifies the limit on the number of Sieve "redirect" + actions a script can perform during a single evaluation. Note that + this is different from the total number of "redirect" actions a + script can contain. The value is a non-negative number represented + as a ManageSieve string. 
+ + NOTIFY - A space-separated list of URI schema parts for supported + notification methods. This capability MUST be specified if the Sieve + implementation supports the "enotify" extension [NOTIFY]. + + LANGUAGE - The language (<Language-Tag> from [RFC5646]) currently + used for human-readable error messages. If this capability is not + returned, the "i-default" [RFC2277] language is assumed. Note that + the current language MAY be per-user configurable (i.e., it MAY + change after authentication). + + OWNER - The canonical name of the logged-in user (SASL "authorization + identity") encoded in UTF-8. This capability MUST NOT be returned in + unauthenticated state and SHOULD be returned once the AUTHENTICATE + command succeeds. + + VERSION - This capability MUST be returned by servers compliant with + this document or its successor. For servers compliant with this + document, the capability value is the string "1.0". Lack of this + capability means that the server predates this specification and thus + doesn't support the following commands: RENAMESCRIPT, CHECKSCRIPT, + and NOOP. + + Section 2.14 defines some additional ManageSieve extensions and their + respective capabilities. + + A server implementation MUST return SIEVE, IMPLEMENTATION, and + VERSION capabilities. + + A client implementation MUST ignore any listed capabilities that it + does not understand. 
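Editorial sketch (not part of RFC 5804): the capability rules above (one or two strings per line, case-insensitive names, unknown names ignored) could be handled by a client along these lines. The function name parse_capabilities is hypothetical, and the sketch assumes every capability string is sent as a quoted string rather than a literal.

```python
import re

# Matches one quoted string and captures its (still escaped) contents.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_capabilities(lines):
    """Collect capability lines into a dict keyed by upper-cased name.

    Hypothetical helper. Assumes quoted strings only (no literals).
    Reading stops at the OK/NO/BYE completion line.
    """
    caps = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if line.upper().startswith(("OK", "NO", "BYE")):
            break
        strings = [re.sub(r'\\(.)', r'\1', s) for s in _QUOTED.findall(line)]
        if not strings:
            continue
        # Names are case-insensitive, so normalize before storing.
        name = strings[0].upper()
        # The second string is optional (for example STARTTLS has none).
        caps[name] = strings[1] if len(strings) > 1 else None
    return caps
```

Because the client only looks up the names it knows (for example caps.get("SASL", "").split() for the mechanism list), capabilities it does not understand are ignored without any extra code.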
+ + + + + + +Melnikov & Martin Standards Track [Page 8] + +RFC 5804 ManageSieve July 2010 + + + Example: + + S: "IMPlemENTATION" "Example1 ManageSieved v001" + S: "SASl" "DIGEST-MD5 GSSAPI" + S: "SIeVE" "fileinto vacation" + S: "StaRTTLS" + S: "NOTIFY" "xmpp mailto" + S: "MAXREdIRECTS" "5" + S: "VERSION" "1.0" + S: OK + + After successful authentication, this might look like this: + + Example: + + S: "IMPlemENTATION" "Example1 ManageSieved v001" + S: "SASl" "DIGEST-MD5 GSSAPI" + S: "SIeVE" "fileinto vacation" + S: "NOTIFY" "xmpp mailto" + S: "OWNER" "alexey@example.com" + S: "MAXREdIRECTS" "5" + S: "VERSION" "1.0" + S: OK + +1.8. Transport + + The ManageSieve protocol assumes a reliable data stream such as that + provided by TCP. When TCP is used, a ManageSieve server typically + listens on port 4190. + + Before opening the TCP connection, the ManageSieve client first MUST + resolve the Domain Name System (DNS) hostname associated with the + receiving entity and determine the appropriate TCP port for + communication with the receiving entity. The process is as follows: + + 1. Attempt to resolve the hostname using a [DNS-SRV] Service of + "sieve" and a Proto of "tcp" for the target domain (e.g., + "example.net"), resulting in resource records such as + "_sieve._tcp.example.net.". The result of the SRV lookup, if + successful, will be one or more combinations of a port and + hostname; the ManageSieve client MUST resolve the returned + hostnames to IPv4/IPv6 addresses according to returned SRV record + weight. IP addresses from the first successfully resolved + hostname (with the corresponding port number returned by SRV + lookup) are used to connect to the server. If connection using + one of the IP addresses fails, the next resolved IP address is + + + + + +Melnikov & Martin Standards Track [Page 9] + +RFC 5804 ManageSieve July 2010 + + + used to connect. 
If connection to all resolved IP addresses + fails, then the resolution/connect is repeated for the next + hostname returned by SRV lookup. + + 2. If the SRV lookup fails, the fallback SHOULD be a normal IPv4 or + IPv6 address record resolution to determine the IP address, where + the port used is the default ManageSieve port of 4190. + +1.9. Conventions Used in This Document + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [KEYWORDS]. + + In examples, "C:" and "S:" indicate lines sent by the client and + server respectively. Line breaks that do not start a new "C:" or + "S:" exist for editorial reasons. + + Examples of authentication in this document are using DIGEST-MD5 + [DIGEST-MD5] and GSSAPI [GSSAPI] SASL mechanisms. + +2. Commands + + This section and its subsections describe valid ManageSieve commands. + Upon initial connection to the server, the client's session is in + non-authenticated state. Prior to successful authentication, only + the AUTHENTICATE, CAPABILITY, STARTTLS, LOGOUT, and NOOP (see Section + 2.13) commands are valid. ManageSieve extensions MAY define other + commands that are valid in non-authenticated state. Servers MUST + reject all other commands with a NO response. Clients may pipeline + commands (send more than one command at a time without waiting for + completion of the first command). However, a group of commands sent + together MUST NOT have an AUTHENTICATE (*), a STARTTLS, or a + HAVESPACE command anywhere but the last command in the list. + + (*) - The only exception to this rule is when the AUTHENTICATE + command contains an initial response for a SASL mechanism that allows + clients to send data first, the mechanism is known to complete in one + round trip, and the mechanism doesn't negotiate a SASL security + layer. 
Two examples of such SASL mechanisms are PLAIN [PLAIN] and + EXTERNAL [SASL]. + + + + + + + + + + +Melnikov & Martin Standards Track [Page 10] + +RFC 5804 ManageSieve July 2010 + + +2.1. AUTHENTICATE Command + + Arguments: String - mechanism + String - initial data (optional) + + The AUTHENTICATE command indicates a SASL [SASL] authentication + mechanism to the server. If the server supports the requested + authentication mechanism, it performs an authentication protocol + exchange to identify and authenticate the user. Optionally, it also + negotiates a security layer for subsequent protocol interactions. If + the requested authentication mechanism is not supported, the server + rejects the AUTHENTICATE command by sending the NO response. + + The authentication protocol exchange consists of a series of server + challenges and client responses that are specific to the selected + authentication mechanism. A server challenge consists of a string + (quoted or literal) followed by a CRLF. The contents of the string + is a base-64 encoding [BASE64] of the SASL data. A client response + consists of a string (quoted or literal) with the base-64 encoding of + the SASL data followed by a CRLF. If the client wishes to cancel the + authentication exchange, it issues a string containing a single "*". + If the server receives such a response, it MUST reject the + AUTHENTICATE command by sending a NO reply. + + Note that an empty challenge/response is sent as an empty string. If + the mechanism dictates that the final response is sent by the server, + this data MAY be placed within the data portion of the SASL response + code to save a round trip. + + The optional initial-response argument to the AUTHENTICATE command is + used to save a round trip when using authentication mechanisms that + are defined to send no data in the initial challenge. 
When the + initial-response argument is used with such a mechanism, the initial + empty challenge is not sent to the client and the server uses the + data in the initial-response argument as if it were sent in response + to the empty challenge. If the initial-response argument to the + AUTHENTICATE command is used with a mechanism that sends data in the + initial challenge, the server MUST reject the AUTHENTICATE command by + sending the NO response. + + The service name specified by this protocol's profile of SASL is + "sieve". + + Reauthentication is not supported by ManageSieve protocol's profile + of SASL. That is, after a successfully completed AUTHENTICATE + command, no more AUTHENTICATE commands may be issued in the same + session. After a successful AUTHENTICATE command completes, a server + MUST reject any further AUTHENTICATE commands with a NO reply. + + + +Melnikov & Martin Standards Track [Page 11] + +RFC 5804 ManageSieve July 2010 + + + However, note that a server may implement the UNAUTHENTICATE + extension described in Section 2.14.1. + + If a security layer is negotiated through the SASL authentication + exchange, it takes effect immediately following the CRLF that + concludes the successful authentication exchange for the client, and + the CRLF of the OK response for the server. + + When a security layer takes effect, the ManageSieve protocol is reset + to the initial state (the state in ManageSieve after a client has + connected to the server). The server MUST discard any knowledge + obtained from the client that was not obtained from the SASL (or TLS) + negotiation itself. Likewise, the client MUST discard any knowledge + obtained from the server, such as the list of ManageSieve extensions, + that was not obtained from the SASL (and/or TLS) negotiation itself. + (Note that a client MAY compare the advertised SASL mechanisms before + and after authentication in order to detect an active down- + negotiation attack. See below.) 
+ + Once a SASL security layer is established, the server MUST re-issue + the capability results, followed by an OK response. This is + necessary to protect against man-in-the-middle attacks that alter the + capabilities list prior to SASL negotiation. The capability results + MUST include all SASL mechanisms the server was capable of + negotiating with that client. This is done in order to allow the + client to detect an active down-negotiation attack. If a user- + oriented client detects such a down-negotiation attack, it SHOULD + either notify the user (it MAY give the user the opportunity to + continue with the ManageSieve session in this case) or close the + transport connection and indicate that a down-negotiation attack + might be in progress. If an automated client detects a down- + negotiation attack, it SHOULD return or log an error indicating that + a possible attack might be in progress and/or SHOULD close the + transport connection. + + When both [TLS] and SASL security layers are in effect, the TLS + encoding MUST be applied (when sending data) after the SASL encoding. + + Server implementations SHOULD support SASL proxy authentication so + that an administrator can administer a user's scripts. Proxy + authentication is when a user authenticates as herself/himself but + requests the server to act (authorize) as another user. + + The authorization identity generated by this [SASL] exchange is a + "simple username" (in the sense defined in [SASLprep]), and both + client and server MUST use the [SASLprep] profile of the [StringPrep] + algorithm to prepare these names for transmission or comparison. If + preparation of the authorization identity fails or results in an + + + +Melnikov & Martin Standards Track [Page 12] + +RFC 5804 ManageSieve July 2010 + + + empty string (unless it was transmitted as the empty string), the + server MUST fail the authentication. 
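Editorial sketch (not part of RFC 5804): the down-negotiation check described above amounts to comparing the SASL capability value seen before authentication with the one re-issued under the security layer. The function name hidden_mechanisms is hypothetical. A non-empty result is only a hint that mechanisms may have been stripped from the earlier list, since the specification also allows capabilities to change legitimately at these points.

```python
def hidden_mechanisms(sasl_before, sasl_after):
    """Return mechanisms advertised only after the security layer.

    Hypothetical helper. Both arguments are the space-separated value of
    the SASL capability (or None when the capability was absent).
    """
    before = {m.upper() for m in (sasl_before or "").split()}
    after = {m.upper() for m in (sasl_after or "").split()}
    # Mechanisms the server could negotiate but that were not visible
    # in the unprotected capability list.
    return sorted(after - before)
```

A user-oriented client could show the returned names in its warning, while an automated client could log them before closing the connection.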
+ + If an AUTHENTICATE command fails with a NO response, the client MAY + try another authentication mechanism by issuing another AUTHENTICATE + command. In other words, the client may request authentication types + in decreasing order of preference. + + Note that a failed (NO) response to the AUTHENTICATE command may + contain one of the following response codes: AUTH-TOO-WEAK, ENCRYPT- + NEEDED, or TRANSITION-NEEDED. See Section 1.3 for detailed + description of the relevant conditions. + + To ensure interoperability, both client and server implementations of + the ManageSieve protocol MUST implement the SCRAM-SHA-1 [SCRAM] SASL + mechanism, as well as [PLAIN] over [TLS]. + + Note: use of PLAIN over TLS reflects current use of PLAIN over TLS in + other email-related protocols; however, a longer-term goal is to + migrate email-related protocols from using PLAIN over TLS to SCRAM- + SHA-1 mechanism. + + Examples (Note that long lines are folded for readability and are not + part of protocol exchange): + + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "SASL" "DIGEST-MD5 GSSAPI" + S: "SIEVE" "fileinto vacation" + S: "STARTTLS" + S: "VERSION" "1.0" + S: OK + C: Authenticate "DIGEST-MD5" + S: "cmVhbG09ImVsd29vZC5pbm5vc29mdC5leGFtcGxlLmNvbSIsbm9uY2U9Ik + 9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgiLGFsZ29yaXRobT1tZDUtc2Vz + cyxjaGFyc2V0PXV0Zi04" + C: "Y2hhcnNldD11dGYtOCx1c2VybmFtZT0iY2hyaXMiLHJlYWxtPSJlbHdvb2 + QuaW5ub3NvZnQuZXhhbXBsZS5jb20iLG5vbmNlPSJPQTZNRzl0RVFHbTJo + aCIsbmM9MDAwMDAwMDEsY25vbmNlPSJPQTZNSFhoNlZxVHJSayIsZGlnZX + N0LXVyaT0ic2lldmUvZWx3b29kLmlubm9zb2Z0LmV4YW1wbGUuY29tIixy + ZXNwb25zZT1kMzg4ZGFkOTBkNGJiZDc2MGExNTIzMjFmMjE0M2FmNyxxb3 + A9YXV0aA==" + S: OK (SASL "cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZ + mZmZA==") + + + + + + + + +Melnikov & Martin Standards Track [Page 13] + +RFC 5804 ManageSieve July 2010 + + + A slightly different variant of the same authentication exchange is: + + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "SASL" 
"DIGEST-MD5 GSSAPI" + S: "SIEVE" "fileinto vacation" + S: "VERSION" "1.0" + S: "STARTTLS" + S: OK + C: Authenticate "DIGEST-MD5" + S: {136} + S: cmVhbG09ImVsd29vZC5pbm5vc29mdC5leGFtcGxlLmNvbSIsbm9uY2U9Ik + 9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgiLGFsZ29yaXRobT1tZDUtc2Vz + cyxjaGFyc2V0PXV0Zi04 + C: {300+} + C: Y2hhcnNldD11dGYtOCx1c2VybmFtZT0iY2hyaXMiLHJlYWxtPSJlbHdvb2 + QuaW5ub3NvZnQuZXhhbXBsZS5jb20iLG5vbmNlPSJPQTZNRzl0RVFHbTJo + aCIsbmM9MDAwMDAwMDEsY25vbmNlPSJPQTZNSFhoNlZxVHJSayIsZGlnZX + N0LXVyaT0ic2lldmUvZWx3b29kLmlubm9zb2Z0LmV4YW1wbGUuY29tIixy + ZXNwb25zZT1kMzg4ZGFkOTBkNGJiZDc2MGExNTIzMjFmMjE0M2FmNyxxb3 + A9YXV0aA== + S: {56} + S: cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA== + C: "" + S: OK + + + + + + + + + + + + + + + + + + + + + + + + + + + +Melnikov & Martin Standards Track [Page 14] + +RFC 5804 ManageSieve July 2010 + + + Another example demonstrating use of SASL PLAIN mechanism under TLS + follows. This example also demonstrate use of SASL "initial + response" (the second parameter to the Authenticate command): + + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "VERSION" "1.0" + S: "SASL" "" + S: "SIEVE" "fileinto vacation" + S: "STARTTLS" + S: OK + C: STARTTLS + S: OK + <TLS negotiation, further commands are under TLS layer> + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "VERSION" "1.0" + S: "SASL" "PLAIN" + S: "SIEVE" "fileinto vacation" + S: OK + C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xu" + S: NO + C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xz" + S: NO + C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xy" + S: BYE "Too many failed authentication attempts" + <Server closes connection> + + + + + + + + + + + + + + + + + + + + + + + + + + +Melnikov & Martin Standards Track [Page 15] + +RFC 5804 ManageSieve July 2010 + + + The following example demonstrates use of SASL "initial response". 
+   It also demonstrates that an empty response can be sent as a literal
+   and that negotiating a SASL security layer results in the server
+   re-issuing server capabilities:
+
+       C: AUTHENTICATE "GSSAPI" {1488+}
+       C: YIIE[...1480 octets here ...]dA==
+       S: {208}
+       S: YIGZBgkqhkiG9xIBAgICAG+BiTCBhqADAgEFoQMCAQ+iejB4oAMCARKic
+          [...114 octets here ...]
+          /yzpAy9p+Y0LanLskOTvMc0MnjgAa4YEr3eJ6
+       C: {0+}
+       C:
+       S: {44}
+       S: BQQF/wAMAAwAAAAAYRGFAo6W0vIHti8i1UXODgEAEAA=
+       C: {44+}
+       C: BQQE/wAMAAwAAAAAIsT1iv9UkZApw471iXt6cwEAAAE=
+       S: OK
+       <Further commands/responses are under SASL security layer>
+       S: "IMPLEMENTATION" "Example1 ManageSieved v001"
+       S: "VERSION" "1.0"
+       S: "SASL" "PLAIN DIGEST-MD5 GSSAPI"
+       S: "SIEVE" "fileinto vacation"
+       S: "LANGUAGE" "ru"
+       S: "MAXREDIRECTS" "3"
+       S: ok
+
+2.1.1.  Use of SASL PLAIN Mechanism over TLS
+
+   This section is normative for ManageSieve client implementations that
+   support SASL [PLAIN] over [TLS].
+
+   If a ManageSieve client is willing to use SASL PLAIN over TLS to
+   authenticate to the ManageSieve server, the client MUST verify the
+   server identity (see Section 2.2.1).  If the server identity can't be
+   verified (e.g., the server has not provided any certificate, or if
+   the certificate verification fails), the client MUST NOT attempt to
+   authenticate using the SASL PLAIN mechanism.
+
+2.2.  STARTTLS Command
+
+   Support for STARTTLS command in servers is optional.  Its
+   availability is advertised with "STARTTLS" capability as described in
+   Section 1.7.
+
+   The STARTTLS command requests commencement of a TLS [TLS]
+   negotiation.  The negotiation begins immediately after the CRLF in
+   the OK response.  After a client issues a STARTTLS command, it MUST
+   NOT issue further commands until a server response is seen and the
+   TLS negotiation is complete.
+
+   The STARTTLS command is only valid in non-authenticated state.
The
+   server remains in non-authenticated state, even if client credentials
+   are supplied during the TLS negotiation.  The SASL [SASL] EXTERNAL
+   mechanism MAY be used to authenticate once TLS client credentials are
+   successfully exchanged, but servers supporting the STARTTLS command
+   are not required to support the EXTERNAL mechanism.
+
+   After the TLS layer is established, the server MUST re-issue the
+   capability results, followed by an OK response.  This is necessary to
+   protect against man-in-the-middle attacks that alter the capabilities
+   list prior to STARTTLS.  This capability result MUST NOT include the
+   STARTTLS capability.
+
+   The client MUST discard cached capability information and replace it
+   with the new information.  The server MAY advertise different
+   capabilities after STARTTLS.
+
+   Example:
+
+       C: StartTls
+       S: oK
+       <TLS negotiation, further commands are under TLS layer>
+       S: "IMPLEMENTATION" "Example1 ManageSieved v001"
+       S: "SASL" "PLAIN DIGEST-MD5 GSSAPI"
+       S: "SIEVE" "fileinto vacation"
+       S: "VERSION" "1.0"
+       S: "LANGUAGE" "fr"
+       S: ok
+
+2.2.1.  Server Identity Check
+
+   During the TLS negotiation, the ManageSieve client MUST check its
+   understanding of the server hostname/IP address against the server's
+   identity as presented in the server Certificate message, in order to
+   prevent man-in-the-middle attacks.  In this section, the client's
+   understanding of the server's identity is called the "reference
+   identity".
+
+   Checking is performed according to the following rules:
+
+   o  If the reference identity is a hostname:
+
+      1.  If a subjectAltName extension of the SRVName [X509-SRV],
+          dNSName [X509] (in that order of preference) type is present
+          in the server's certificate, then it SHOULD be used as the
+          source of the server's identity.
Matching is performed as
+          described in Section 2.2.1.1, with the exception that no
+          wildcard matching is allowed for SRVName type.  If the
+          certificate contains multiple names (e.g., more than one
+          dNSName field), then a match with any one of the fields is
+          considered acceptable.
+
+      2.  The client MAY use other types of subjectAltName for
+          performing comparison.
+
+      3.  The server's identity MAY also be verified by comparing the
+          reference identity to the Common Name (CN) [RFC4519] value in
+          the leaf Relative Distinguished Name (RDN) of the subjectName
+          field of the server's certificate.  This comparison is
+          performed using the rules for comparison of DNS names in
+          Section 2.2.1.1, below.  Although the use of the Common Name
+          value is existing practice, it is deprecated, and
+          Certification Authorities are encouraged to provide
+          subjectAltName values instead.  Note that the TLS
+          implementation may represent DNs in certificates according to
+          X.500 or other conventions.  For example, some X.500
+          implementations order the RDNs in a DN using a left-to-right
+          (most significant to least significant) convention instead of
+          LDAP's right-to-left convention.
+
+   o  When the reference identity is an IP address, the iPAddress
+      subjectAltName SHOULD be used by the client for comparison.  The
+      comparison is performed as described in Section 2.2.1.2.
+
+   If the server identity check fails, user-oriented clients SHOULD
+   either notify the user (clients MAY give the user the opportunity to
+   continue with the ManageSieve session in this case) or close the
+   transport connection and indicate that the server's identity is
+   suspect.  Automated clients SHOULD return or log an error indicating
+   that the server's identity is suspect and/or SHOULD close the
+   transport connection.  Automated clients MAY provide a configuration
+   setting that disables this check, but MUST provide a setting that
+   enables it.
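The hostname rules above can be sketched in a few lines. This sketch makes several assumptions that are not in the RFC: the certificate names are already extracted into plain strings, SRVName handling and the internationalized-name (ToASCII) conversion are left out, and the Common Name is consulted only when no dNSName is present. The wildcard handling follows the DNS name comparison rule that Section 2.2.1.1 defines; both function names are hypothetical.

```python
def dns_name_matches(presented, reference):
    """Compare a certificate name with the reference identity.

    The comparison is case-insensitive; '*' is honoured only when it is
    the entire left-most label and matches exactly one label.
    """
    p = presented.lower().split(".")
    r = reference.lower().split(".")
    if len(p) != len(r) or "" in r:
        return False
    if p[0] == "*":
        p, r = p[1:], r[1:]
        if not p:  # a bare "*" is not accepted in this sketch
            return False
    return p == r


def server_identity_matches(reference, dns_names=(), common_name=None):
    """Prefer dNSName values; fall back to the CN only if none are present."""
    if dns_names:
        # A match with any one of several names is acceptable.
        return any(dns_name_matches(name, reference) for name in dns_names)
    if common_name is not None:
        return dns_name_matches(common_name, reference)
    return False
```

With these rules `*.example.com` matches `a.example.com`, but neither `example.com` nor `a.b.example.com`.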
+
+   Beyond the server identity check described in this section, clients
+   should be prepared to do further checking to ensure that the server
+   is authorized to provide the service it is requested to provide.  The
+   client may need to make use of local policy information in making
+   this determination.
+
+2.2.1.1.  Comparison of DNS Names
+
+   If the reference identity is an internationalized domain name,
+   conforming implementations MUST convert it to the ASCII Compatible
+   Encoding (ACE) format as specified in Section 4 of RFC 3490 [RFC3490]
+   before comparison with subjectAltName values of type dNSName.
+   Specifically, conforming implementations MUST perform the conversion
+   operation specified in Section 4 of [RFC3490] as follows:
+
+   o  in step 1, the domain name SHALL be considered a "stored string";
+
+   o  in step 3, set the flag called "UseSTD3ASCIIRules";
+
+   o  in step 4, process each label with the "ToASCII" operation; and
+
+   o  in step 5, change all label separators to U+002E (full stop).
+
+   After performing the "to-ASCII" conversion, the DNS labels and names
+   MUST be compared for equality according to the rules specified in
+   Section 3 of [RFC3490]; i.e., once all label separators are replaced
+   with U+002E (dot) they are compared in the case-insensitive manner.
+
+   The '*' (ASCII 42) wildcard character is allowed in subjectAltName
+   values of type dNSName, and then only as the left-most (least
+   significant) DNS label in that value.  This wildcard matches any
+   left-most DNS label in the server name.  That is, the subject
+   *.example.com matches the server names a.example.com and
+   b.example.com, but does not match example.com or a.b.example.com.
+
+2.2.1.2.  Comparison of IP Addresses
+
+   When the reference identity is an IP address, the identity MUST be
+   converted to the "network byte order" octet string representation
+   [RFC791][RFC2460].
For IP Version 4, as specified in RFC 791, the
+   octet string will contain exactly four octets.  For IP Version 6, as
+   specified in RFC 2460, the octet string will contain exactly sixteen
+   octets.  This octet string is then compared against subjectAltName
+   values of type iPAddress.  A match occurs if the reference identity
+   octet string and value octet strings are identical.
+
+2.2.1.3.  Comparison of Other subjectName Types
+
+   Client implementations MAY support matching against subjectAltName
+   values of other types as described in other documents.
+
+2.3.  LOGOUT Command
+
+   The client sends the LOGOUT command when it is finished with a
+   connection and wishes to terminate it.  The server MUST reply with an
+   OK response.  The server MUST ignore commands issued by the client
+   after the LOGOUT command.
+
+   The client SHOULD wait for the OK response before closing the
+   connection.  This avoids the TCP connection going into the TIME_WAIT
+   state on the server.  In order to avoid going into the TIME_WAIT TCP
+   state, the server MAY wait for a short while for the client to close
+   the TCP connection first.  Whether or not the server waits for the
+   client to close the connection, it MUST then close the connection
+   itself.
+
+   Example:
+
+       C: Logout
+       S: Ok
+       <connection is terminated>
+
+2.4.  CAPABILITY Command
+
+   The CAPABILITY command requests the server capabilities as described
+   earlier in this document.  It has no parameters.
+
+   Example:
+
+       C: CAPABILITY
+       S: "IMPLEMENTATION" "Example1 ManageSieved v001"
+       S: "VERSION" "1.0"
+       S: "SASL" "PLAIN SCRAM-SHA-1 GSSAPI"
+       S: "SIEVE" "fileinto vacation"
+       S: "STARTTLS"
+       S: OK
+
+2.5.  HAVESPACE Command
+
+   Arguments:  String - name
+               Number - script size
+
+   The HAVESPACE command is used to query the server for available
+   space.  Clients specify the name they wish to save the script as and
+   its size in octets.
Both parameters can be used by the server to see
+   if the script with the specified name and size is within a user's
+   quota(s).  For example, the server MAY use the script name to check
+   if a script would be replaced or a new one would be created.  Servers
+   respond with a NO if storing a script with that name and size would
+   fail or OK otherwise.  Clients SHOULD issue this command before
+   attempting to place a script on the server.
+
+   Note that the OK response from the HAVESPACE command does not
+   constitute a guarantee of success as server disk space conditions
+   could change between the client issuing the HAVESPACE and the client
+   issuing the PUTSCRIPT commands.  A QUOTA response code (see
+   Section 1.3) remains a possible (albeit unlikely) response to a
+   subsequent PUTSCRIPT with the same name and size.
+
+   Example:
+
+       C: HAVESPACE "myscript" 999999
+       S: NO (QUOTA/MAXSIZE) "Quota exceeded"
+
+       C: HAVESPACE "foobar" 435
+       S: OK
+
+2.6.  PUTSCRIPT Command
+
+   Arguments:  String - Script name
+               String - Script content
+
+   The PUTSCRIPT command is used by the client to submit a Sieve script
+   to the server.
+
+   If the script already exists, upon success the old script will be
+   overwritten.  The old script MUST NOT be overwritten if PUTSCRIPT
+   fails in any way.  A script of zero length SHOULD be disallowed.
+
+   This command places the script on the server.  It does not affect
+   whether the script is processed on incoming mail, unless it replaces
+   the script that is already active.  The SETACTIVE command is used to
+   mark a script as active.
+
+   When submitting large scripts, clients SHOULD use the HAVESPACE
+   command beforehand to query if the server is willing to accept a
+   script of that size.
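As a rough illustration of the HAVESPACE-then-PUTSCRIPT pattern, the sketch below builds the two command lines as bytes. The helper names are hypothetical. It assumes the script name is short and free of CR, LF, and NUL so that it can travel as a quoted string, and it always sends the script body as a client-to-server literal (`{N+}`), where N is the size in octets of the UTF-8 encoding.

```python
def quote(name):
    """Encode a short string as a quoted string, escaping backslash and quote."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def literal_c2s(text):
    """Encode text as a client-to-server literal: {<octets>+} CRLF <data>."""
    data = text.encode("utf-8")
    return b"{%d+}\r\n" % len(data) + data


def havespace_command(name, script):
    """Ask whether a script of this name and size (in octets) would fit."""
    size = len(script.encode("utf-8"))
    return ("HAVESPACE %s %d\r\n" % (quote(name), size)).encode("utf-8")


def putscript_command(name, script):
    """Build PUTSCRIPT with the script sent as a literal, then the final CRLF."""
    head = ("PUTSCRIPT %s " % quote(name)).encode("utf-8")
    return head + literal_c2s(script) + b"\r\n"
```

For a two-line script of 31 octets this yields a `{31+}` literal. The size is counted after UTF-8 encoding, not in characters.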
+
+   The server MUST check the submitted script for validity, which
+   includes checking that the script complies with the Sieve grammar
+   [SIEVE] and that all Sieve extensions mentioned in the script's
+   "require" statement(s) are supported by the Sieve interpreter.  (Note
+   that if the Sieve interpreter supports the Sieve "ihave" extension
+   [I-HAVE], any unrecognized/unsupported extension mentioned in the
+   "ihave" test MUST NOT cause the validation failure.)  Other checks
+   such as validating the supplied command arguments for each command
+   MAY be performed.  Essentially, the performed validation SHOULD be
+   the same as performed when compiling the script for execution.
+   Implementations that use a binary representation to store compiled
+   scripts can extend the validation to a full compilation, in order to
+   avoid validating uploaded scripts multiple times.
+
+   If the script fails the validation, the server MUST reply with a NO
+   response.  Any script that fails the validity test MUST NOT be stored
+   on the server.  The message given with a NO response MUST be human
+   readable and SHOULD contain a specific error message giving the line
+   number of the first error.  Implementors should strive to produce
+   helpful error messages similar to those given by programming language
+   compilers.  Client implementations should note that this may be a
+   multiline literal string with more than one error message separated
+   by CRLFs.  The human-readable message is in the language returned in
+   the latest LANGUAGE capability (or in "i-default"; see Section 1.7),
+   encoded in UTF-8 [UTF-8].
+
+   An OK response MAY contain the WARNINGS response code.  In such a
+   case the human-readable message that follows the OK response SHOULD
+   contain a specific warning message (or messages) giving the line
+   number(s) in the script that might contain errors not intended by the
+   script writer.
The human-readable message is in the language
+   returned in the latest LANGUAGE capability (or in "i-default"; see
+   Section 1.7), encoded in UTF-8 [UTF-8].  A client seeing such a
+   response code SHOULD present the message to the user.
+
+   Examples:
+
+       C: Putscript "foo" {31+}
+       C: #comment
+       C: InvalidSieveCommand
+       C:
+       S: NO "line 2: Syntax error"
+
+       C: Putscript "mysievescript" {110+}
+       C: require ["fileinto"];
+       C:
+       C: if envelope :contains "to" "tmartin+sent" {
+       C:   fileinto "INBOX.sent";
+       C: }
+       S: OK
+
+       C: Putscript "myforwards" {190+}
+       C: redirect "111@example.net";
+       C:
+       C: if size :under 10k {
+       C:     redirect "mobile@cell.example.com";
+       C: }
+       C:
+       C: if envelope :contains "to" "tmartin+lists" {
+       C:     redirect "lists@groups.example.com";
+       C: }
+       S: OK (WARNINGS) "line 8: server redirect action
+               limit is 2, this redirect might be ignored"
+
+2.7.  LISTSCRIPTS Command
+
+   This command lists the scripts the user has on the server.  Upon
+   success, a list of CRLF-separated script names (each represented as a
+   quoted or literal string) is returned followed by an OK response.  If
+   there exists an active script, the atom ACTIVE is appended to the
+   corresponding script name.  The atom ACTIVE MUST NOT appear on more
+   than one response line.
+
+   Example:
+
+       C: Listscripts
+       S: "summer_script"
+       S: "vacation_script"
+       S: {13}
+       S: clever"script
+       S: "main_script" ACTIVE
+       S: OK
+
+       C: listscripts
+       S: "summer_script"
+       S: "main_script" active
+       S: OK
+
+2.8.  SETACTIVE Command
+
+   Arguments:  String - script name
+
+   This command sets a script active.  If the script name is the empty
+   string (i.e., ""), then any active script is disabled.
Disabling an
+   active script when there is no script active is not an error and MUST
+   result in an OK reply.
+
+   If the script does not exist on the server, then the server MUST
+   reply with a NO response.  Such a reply SHOULD contain the
+   NONEXISTENT response code.
+
+   Examples:
+
+       C: Setactive "vacationscript"
+       S: Ok
+
+       C: Setactive ""
+       S: Ok
+
+       C: Setactive "baz"
+       S: No (NONEXISTENT) "There is no script by that name"
+
+       C: Setactive "baz"
+       S: No (NONEXISTENT) {31}
+       S: There is no script by that name
+
+2.9.  GETSCRIPT Command
+
+   Arguments:  String - script name
+
+   This command gets the contents of the specified script.  If the
+   script does not exist, the server MUST reply with a NO response.
+   Such a reply SHOULD contain the NONEXISTENT response code.
+
+   Upon success, a string with the contents of the script is returned
+   followed by an OK response.
+
+   Example:
+
+       C: Getscript "myscript"
+       S: {54}
+       S: #this is my wonderful script
+       S: reject "I reject all";
+       S:
+       S: OK
+
+2.10.  DELETESCRIPT Command
+
+   Arguments:  String - script name
+
+   This command is used to delete a user's Sieve script.  Servers MUST
+   reply with a NO response if the script does not exist.  Such
+   responses SHOULD include the NONEXISTENT response code.
+
+   The server MUST NOT allow the client to delete an active script, so
+   the server MUST reply with a NO response if attempted.  Such a
+   response SHOULD contain the ACTIVE response code.  If a client wishes
+   to delete an active script, it should use the SETACTIVE command to
+   disable the script first.
+
+   Example:
+
+       C: Deletescript "foo"
+       S: Ok
+
+       C: Deletescript "baz"
+       S: No (ACTIVE) "You may not delete an active script"
+
+2.11.
RENAMESCRIPT Command
+
+   Arguments:  String - Old Script name
+               String - New Script name
+
+   This command is used to rename a user's Sieve script.  Servers MUST
+   reply with a NO response if the old script does not exist (in which
+   case the NONEXISTENT response code SHOULD be included), or a script
+   with the new name already exists (in which case the ALREADYEXISTS
+   response code SHOULD be included).  Renaming the active script is
+   allowed; the renamed script remains active.
+
+   Example:
+
+       C: Renamescript "foo" "bar"
+       S: Ok
+
+       C: Renamescript "baz" "bar"
+       S: No "bar already exists"
+
+   If the server doesn't support the RENAMESCRIPT command, the client
+   can emulate it by performing the following steps:
+
+   1.  List available scripts with LISTSCRIPTS.  If the script with the
+       new script name exists, then the client should ask the user
+       whether to abort the operation, to replace the script (by issuing
+       the DELETESCRIPT <newname> after that), or to choose a different
+       name.
+
+   2.  Download the old script with GETSCRIPT <oldname>.
+
+   3.  Upload the old script with the new name: PUTSCRIPT <newname>.
+
+   4.  If the old script was active (as reported by LISTSCRIPTS in step
+       1), then make the new script active: SETACTIVE <newname>.
+
+   5.  Delete the old script: DELETESCRIPT <oldname>.
+
+   Note that these steps don't describe how to handle various other
+   error conditions (for example, NO response containing QUOTA response
+   code in step 3).  Error handling is left as an exercise for the
+   reader.
+
+2.12.  CHECKSCRIPT Command
+
+   Arguments:  String - Script content
+
+   The CHECKSCRIPT command is used by the client to verify Sieve script
+   validity without storing the script on the server.
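The five RENAMESCRIPT emulation steps listed in Section 2.11 can be sketched against a hypothetical client object. Nothing here is a real library API: `list_scripts`, `get_script`, `put_script`, `set_active`, and `delete_script` are assumed wrappers around the corresponding commands, and `InMemoryClient` is only a stand-in so the sketch can run without a server. The user prompt from step 1 is reduced to a `replace_existing` flag, and error handling beyond that stays out of scope, as in the original steps.

```python
class ScriptExistsError(Exception):
    """The new name is already taken and replacement was not requested."""


def emulate_rename(client, old_name, new_name, replace_existing=False):
    scripts = client.list_scripts()           # step 1: {name: is_active}
    if old_name not in scripts:
        raise KeyError(old_name)
    if new_name in scripts:
        if not replace_existing:
            raise ScriptExistsError(new_name)
        client.delete_script(new_name)        # DELETESCRIPT <newname>
    content = client.get_script(old_name)     # step 2: GETSCRIPT <oldname>
    client.put_script(new_name, content)      # step 3: PUTSCRIPT <newname>
    if scripts[old_name]:                     # step 4: keep it active
        client.set_active(new_name)
    client.delete_script(old_name)            # step 5: DELETESCRIPT <oldname>


class InMemoryClient:
    """Hypothetical stand-in for a ManageSieve connection (demo only)."""

    def __init__(self, scripts, active=None):
        self.scripts = dict(scripts)
        self.active = active

    def list_scripts(self):
        return {name: name == self.active for name in self.scripts}

    def get_script(self, name):
        return self.scripts[name]

    def put_script(self, name, content):
        self.scripts[name] = content

    def set_active(self, name):
        self.active = name

    def delete_script(self, name):
        if name == self.active:               # mirrors the ACTIVE refusal
            raise RuntimeError("ACTIVE")
        del self.scripts[name]
```

Activating the new script in step 4 before deleting the old one in step 5 matters: the old script is no longer active by then, so the server's refusal to delete an active script does not get in the way.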
+
+   The server MUST check the submitted script for syntactic validity,
+   which includes checking that all Sieve extensions mentioned in Sieve
+   script "require" statement(s) are supported by the Sieve interpreter.
+   (Note that if the Sieve interpreter supports the Sieve "ihave"
+   extension [I-HAVE], any unrecognized/unsupported extension mentioned
+   in the "ihave" test MUST NOT cause the syntactic validation failure.)
+   If the script fails this test, the server MUST reply with a NO
+   response.  The message given with a NO response MUST be human
+   readable and SHOULD contain a specific error message giving the line
+   number of the first error.  Implementors should strive to produce
+   helpful error messages similar to those given by programming language
+   compilers.  Client implementations should note that this may be a
+   multiline literal string with more than one error message separated
+   by CRLFs.  The human-readable message is in the language returned in
+   the latest LANGUAGE capability (or in "i-default"; see Section 1.7),
+   encoded in UTF-8 [UTF-8].
+
+   Examples:
+
+       C: CheckScript {31+}
+       C: #comment
+       C: InvalidSieveCommand
+       C:
+       S: NO "line 2: Syntax error"
+
+   A ManageSieve server supporting this command MUST NOT check if the
+   script will put the current user over its quota limit.
+
+   An OK response MAY contain the WARNINGS response code.  In such a
+   case, the human-readable message that follows the OK response SHOULD
+   contain a specific warning message (or messages) giving the line
+   number(s) in the script that might contain errors not intended by the
+   script writer.  The human-readable message is in the language
+   returned in the latest LANGUAGE capability (or in "i-default"; see
+   Section 1.7), encoded in UTF-8 [UTF-8].  A client seeing such a
+   response code SHOULD present the message to the user.
+
+2.13.
NOOP Command
+
+   Arguments:  String - tag to echo back (optional)
+
+   The NOOP command does nothing, beyond returning a response to the
+   client.  It may be used by clients for protocol re-synchronization or
+   to reset any inactivity auto-logout timer on the server.
+
+   The response to the NOOP command is always OK, followed by the TAG
+   response code together with the supplied string.  If no string was
+   supplied in the NOOP command, the TAG response code MUST NOT be
+   included.
+
+   Examples:
+
+       C: NOOP
+       S: OK "NOOP completed"
+
+       C: NOOP "STARTTLS-SYNC-42"
+       S: OK (TAG {16}
+       S: STARTTLS-SYNC-42) "Done"
+
+2.14.  Recommended Extensions
+
+   The UNAUTHENTICATE extension (advertised as the "UNAUTHENTICATE"
+   capability with no parameters) defines a new UNAUTHENTICATE command,
+   which allows a client to return the server to non-authenticated
+   state.  Support for this extension is RECOMMENDED.
+
+2.14.1.  UNAUTHENTICATE Command
+
+   The UNAUTHENTICATE command returns the server to the
+   non-authenticated state.  It doesn't affect any previously
+   established TLS [TLS] or SASL (Section 2.1) security layer.
+
+   The UNAUTHENTICATE command is only valid in authenticated state.  If
+   issued in a wrong state, the server MUST reject it with a NO
+   response.
+
+   The UNAUTHENTICATE command has no parameters.
+
+   When issued in the authenticated state, the UNAUTHENTICATE command
+   MUST NOT fail (i.e., it must never return anything other than OK or
+   BYE).
+
+3.  Sieve URL Scheme
+
+   URI scheme name: sieve
+
+   Status: permanent
+
+   URI scheme syntax: Described using ABNF [ABNF].  Some ABNF
+      productions not defined below are from [URI-GEN].
+
+   sieveurl = sieveurl-server / sieveurl-list-scripts /
+              sieveurl-script
+
+   sieveurl-server = "sieve://" authority
+
+   sieveurl-list-scripts = "sieve://" authority ["/"]
+
+   sieveurl-script = "sieve://" authority "/"
+                     [owner "/"] scriptname
+
+   authority = <defined in [URI-GEN]>
+
+   owner         = *ochar
+                   ;; %-encoded version of [SASL] authorization
+                   ;; identity (script owner) or "userid".
+                   ;;
+                   ;; Empty owner is used to reference
+                   ;; global scripts.
+                   ;;
+                   ;; Note that ASCII characters such as " ", ";",
+                   ;; "&", "=", "/" and "?" must be %-encoded
+                   ;; as per rule specified in [URI-GEN].
+
+   scriptname    = 1*ochar
+                   ;; %-encoded version of UTF-8 representation
+                   ;; of the script name.
+                   ;; Note that ASCII characters such as " ", ";",
+                   ;; "&", "=", "/" and "?" must be %-encoded
+                   ;; as per rule specified in [URI-GEN].
+
+   ochar         = unreserved / pct-encoded / sub-delims-sh /
+                   ":" / "@"
+                   ;; Same as [URI-GEN] 'pchar',
+                   ;; but without ";", "&" and "=".
+
+   unreserved = <defined in [URI-GEN]>
+
+   pct-encoded = <defined in [URI-GEN]>
+
+   sub-delims-sh = "!" / "$" / "'" / "(" / ")" /
+                   "*" / "+" / ","
+                   ;; Same as [URI-GEN] sub-delims,
+                   ;; but without ";", "&" and "=".
+
+   URI scheme semantics:
+
+      A Sieve URL identifies a Sieve server or a Sieve script on a Sieve
+      server.  The latter form is associated with the application/sieve
+      MIME type defined in [SIEVE].  There is no MIME type associated
+      with the former form of Sieve URI.
+
+      The server form is used in the REFERRAL response code (see Section
+      1.3) in order to designate another server where the client should
+      perform its operations.
+
+      The script form allows the client to retrieve (GETSCRIPT), update
+      (PUTSCRIPT), delete (DELETESCRIPT), or activate (SETACTIVE) the
+      named script; however, the most typical action would be to
+      retrieve the script.
If the script name is empty (omitted), the
+      URI requests that the client list available scripts using the
+      LISTSCRIPTS command.
+
+   Encoding considerations:
+
+      The script name and/or the owner, if present, is in UTF-8.
+      Non-US-ASCII UTF-8 octets MUST be percent-encoded as described in
+      [URI-GEN].  US-ASCII characters such as " " (space), ";", "&",
+      "=", "/" and "?" MUST be %-encoded as described in [URI-GEN].
+      Note that "&" and "?" are in this list in order to allow for
+      future extensions.
+
+      Note that the empty owner (e.g., sieve://example.com//script) is
+      different from the missing owner (e.g.,
+      sieve://example.com/script) and is reserved for referencing global
+      scripts.
+
+      The user name (in the "authority" part), if present, is in UTF-8.
+      Non-US-ASCII UTF-8 octets MUST be percent-encoded as described in
+      [URI-GEN].
+
+   Applications/protocols that use this URI scheme name:
+      ManageSieve [RFC5804] clients and servers.  Clients that can store
+      user preferences in protocols such as [LDAP] or [ACAP].
+
+   Interoperability considerations: None.
+
+   Security considerations:
+   The <scriptname> part of a ManageSieve URL might potentially disclose
+   some confidential information about the author of the script or,
+   depending on a ManageSieve implementation, about configuration of the
+   mail system.  The latter might be used to prepare for a more complex
+   attack on the mail system.
+
+   Clients resolving ManageSieve URLs that wish to achieve data
+   confidentiality and/or integrity SHOULD use the STARTTLS command (if
+   supported by the server) before starting authentication, or use a
+   SASL mechanism, such as GSSAPI, that provides a confidentiality
+   security layer.
+
+   Contact: Alexey Melnikov <alexey.melnikov@isode.com>
+
+   Author/Change controller: IESG.
+
+   References: This document and RFC 5228 [SIEVE].
+
+4.
Formal Syntax
+
+   The following syntax specification uses the Augmented Backus-Naur
+   Form (ABNF) notation as specified in [ABNF].  This uses the ABNF core
+   rules as specified in Appendix A of the ABNF specification [ABNF].
+   "UTF8-2", "UTF8-3", and "UTF8-4" non-terminals are defined in [UTF-8].
+
+   Except as noted otherwise, all alphabetic characters are
+   case-insensitive.  The use of upper- or lowercase characters to define
+   token strings is for editorial clarity only.  Implementations MUST
+   accept these strings in a case-insensitive fashion.
+
+   SAFE-CHAR             = %x01-09 / %x0B-0C / %x0E-21 / %x23-5B /
+                           %x5D-7F
+                           ;; any TEXT-CHAR except QUOTED-SPECIALS
+
+   QUOTED-CHAR           = SAFE-UTF8-CHAR / "\" QUOTED-SPECIALS
+
+   QUOTED-SPECIALS       = DQUOTE / "\"
+
+   SAFE-UTF8-CHAR        = SAFE-CHAR / UTF8-2 / UTF8-3 / UTF8-4
+                           ;; <UTF8-2>, <UTF8-3>, and <UTF8-4>
+                           ;; are defined in [UTF-8].
+
+   ATOM-CHAR             = "!" / %x23-27 / %x2A-5B / %x5D-7A / %x7C-7E
+                           ;; Any CHAR except ATOM-SPECIALS
+
+   ATOM-SPECIALS         = "(" / ")" / "{" / SP / CTL / QUOTED-SPECIALS
+
+   NZDIGIT               = %x31-39
+                           ;; 1-9
+
+   atom                  = 1*1024ATOM-CHAR
+
+   iana-token            = atom
+                           ;; MUST be registered with IANA
+
+   auth-type             = DQUOTE auth-type-name DQUOTE
+
+   auth-type-name        = iana-token
+                           ;; as defined in SASL [SASL]
+
+   command               = (command-any / command-auth /
+                           command-nonauth) CRLF
+                           ;; Modal based on state
+
+   command-any           = command-capability / command-logout /
+                           command-noop
+                           ;; Valid in all states
+
+   command-auth          = command-getscript / command-setactive /
+                           command-listscripts / command-deletescript /
+                           command-putscript / command-checkscript /
+                           command-havespace /
+                           command-renamescript /
+                           command-unauthenticate
+                           ;; Valid only in Authenticated state
+
+   command-nonauth       = command-authenticate / command-starttls
+                           ;; Valid only when in Non-Authenticated
+                           ;; state
+
+   command-authenticate  = "AUTHENTICATE" SP auth-type [SP string]
+                           *(CRLF string)
+
+
command-capability    = "CAPABILITY"
+
+   command-deletescript  = "DELETESCRIPT" SP sieve-name
+
+   command-getscript     = "GETSCRIPT" SP sieve-name
+
+   command-havespace     = "HAVESPACE" SP sieve-name SP number
+
+   command-listscripts   = "LISTSCRIPTS"
+
+   command-noop          = "NOOP" [SP string]
+
+   command-logout        = "LOGOUT"
+
+   command-putscript     = "PUTSCRIPT" SP sieve-name SP sieve-script
+
+   command-checkscript   = "CHECKSCRIPT" SP sieve-script
+
+   sieve-script          = string
+
+   command-renamescript  = "RENAMESCRIPT" SP old-sieve-name SP
+                           new-sieve-name
+
+   old-sieve-name        = sieve-name
+
+   new-sieve-name        = sieve-name
+
+   command-setactive     = "SETACTIVE" SP active-sieve-name
+
+   command-starttls      = "STARTTLS"
+
+   command-unauthenticate= "UNAUTHENTICATE"
+
+   extend-token          = atom
+                           ;; MUST be defined by a Standards Track or
+                           ;; IESG-approved experimental protocol
+                           ;; extension
+
+   extension-data        = extension-item *(SP extension-item)
+
+   extension-item        = extend-token / string / number /
+                           "(" [extension-data] ")"
+
+   literal-c2s           = "{" number "+}" CRLF *OCTET
+                           ;; The number represents the number of
+                           ;; octets.
+                           ;; This type of literal can only be sent
+                           ;; from the client to the server.
+
+   literal-s2c           = "{" number "}" CRLF *OCTET
+                           ;; Almost identical to literal-c2s,
+                           ;; but with no '+' character.
+                           ;; The number represents the number of
+                           ;; octets.
+                           ;; This type of literal can only be sent
+                           ;; from the server to the client.
+
+   number                = (NZDIGIT *DIGIT) / "0"
+                           ;; A 32-bit unsigned number
+                           ;; with no extra leading zeros.
+                           ;; (0 <= n < 4,294,967,296)
+
+   number-str            = string
+                           ;; <number> encoded as a <string>.
+ + quoted = DQUOTE *1024QUOTED-CHAR DQUOTE + ;; limited to 1024 octets between the <">s + + resp-code = "AUTH-TOO-WEAK" / "ENCRYPT-NEEDED" / "QUOTA" + ["/" ("MAXSCRIPTS" / "MAXSIZE")] / + resp-code-sasl / + resp-code-referral / + "TRANSITION-NEEDED" / "TRYLATER" / + "ACTIVE" / "NONEXISTENT" / + "ALREADYEXISTS" / "WARNINGS" / + "TAG" SP string / + resp-code-ext + + resp-code-referral = "REFERRAL" SP sieveurl + + resp-code-sasl = "SASL" SP string + + resp-code-name = iana-token + ;; The response code name is hierarchical, + ;; separated by '/'. + ;; The response code name MUST NOT start + ;; with '/'. + + resp-code-ext = resp-code-name [SP extension-data] + ;; unknown response codes MUST be tolerated + ;; by the client. + + response = response-authenticate / + response-logout / + response-getscript / + response-setactive / + response-listscripts / + response-deletescript / + response-putscript / + response-checkscript / + response-capability / + response-havespace / + response-starttls / + response-renamescript / + response-noop / + + + +Melnikov & Martin Standards Track [Page 34] + +RFC 5804 ManageSieve July 2010 + + + response-unauthenticate + + response-authenticate = *(string CRLF) + ((response-ok [response-capability]) / + response-nobye) + ;; <response-capability> is REQUIRED if a + ;; SASL security layer was negotiated and + ;; MUST be omitted otherwise. + + response-capability = *(single-capability) response-oknobye + + single-capability = capability-name [SP string] CRLF + + capability-name = string + + ;; Note that literal-s2c is allowed. 
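As an illustration of the quoted rule above, a client could pick between the quoted form and a literal along the following lines. This is a sketch under stated assumptions: the function name encode_string is hypothetical, the 1024-octet limit is applied to the escaped text between the double quotes, and the fallback is the client-to-server literal form "{" number "+}" CRLF *OCTET.

```python
def encode_string(text: str) -> bytes:
    """Encode text as a ManageSieve <string> for sending to the server.

    Uses the quoted form when the grammar allows it and falls back to a
    client-to-server literal otherwise.
    """
    raw = text.encode("utf-8")

    # QUOTED-SPECIALS (backslash and double quote) must be backslash-escaped.
    # Backslashes are escaped first so the escapes added for quotes stay intact.
    escaped = raw.replace(b"\\", b"\\\\").replace(b'"', b'\\"')

    # SAFE-CHAR excludes NUL, LF, and CR, so those octets cannot be quoted.
    has_unquotable = any(octet in raw for octet in (0x00, 0x0A, 0x0D))

    if has_unquotable or len(escaped) > 1024:
        # Literal form: the count is the number of raw octets that follow.
        return b"{%d+}\r\n" % len(raw) + raw
    return b'"' + escaped + b'"'
```

A script name such as vacation stays in the quoted form, while a multi-line Sieve script is sent as a literal because it contains CR and LF.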
+ + initial-capabilities = DQUOTE "IMPLEMENTATION" DQUOTE SP string / + DQUOTE "SASL" DQUOTE SP sasl-mechs / + DQUOTE "SIEVE" DQUOTE SP sieve-extensions / + DQUOTE "MAXREDIRECTS" DQUOTE SP number-str / + DQUOTE "NOTIFY" DQUOTE SP notify-mechs / + DQUOTE "STARTTLS" DQUOTE / + DQUOTE "LANGUAGE" DQUOTE SP language / + DQUOTE "VERSION" DQUOTE SP version / + DQUOTE "OWNER" DQUOTE SP string + ;; Each capability conforms to + ;; the syntax for single-capability. + ;; Also, note that the capability name + ;; can be returned as either literal-s2c + ;; or quoted, even though only "quoted" + ;; string is shown above. + + version = ( DQUOTE "1.0" DQUOTE ) / version-ext + + version-ext = DQUOTE ver-major "." ver-minor DQUOTE + ; Future versions specified in updates + ; to this document. An increment to + ; the ver-major means a backward-incompatible + ; change to the protocol, e.g., "3.5" (ver-major "3") + ; is not backward-compatible with any "2.X" version. + ; Any version "Z.W" MUST be backward compatible + ; with any version "Z.Q", where Q < W. + ; For example, version "2.4" is backward compatible + ; with version "2.0", "2.1", "2.2", and "2.3". + + ver-major = number + + + + +Melnikov & Martin Standards Track [Page 35] + +RFC 5804 ManageSieve July 2010 + + + ver-minor = number + + sasl-mechs = string + ; Space-separated list of SASL mechanisms, + ; each SASL mechanism name complies with rules + ; specified in [SASL]. + ; Can be empty. + + sieve-extensions = string + ; Space-separated list of supported SIEVE extensions. + ; Can be empty. + + language = string + ; Contains <Language-Tag> from [RFC5646]. + + + notify-mechs = string + ; Space-separated list of URI schema parts + ; for supported notification [NOTIFY] methods. + ; MUST NOT be empty. 
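The compatibility rule in the version-ext comments above (same ver-major, and a ver-minor that is not lower) can be checked as sketched below. The helper names parse_version and is_backward_compatible are hypothetical, and the input is assumed to be the capability value with the surrounding double quotes already removed.

```python
import re
from typing import Tuple

# ver-major "." ver-minor, where both parts follow the <number> rule
# (no extra leading zeros).
_VERSION_RE = re.compile(r"([1-9][0-9]*|0)\.([1-9][0-9]*|0)")


def parse_version(value: str) -> Tuple[int, int]:
    """Split an unquoted VERSION value such as "1.0" into (major, minor)."""
    match = _VERSION_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid version: {value!r}")
    return int(match.group(1)), int(match.group(2))


def is_backward_compatible(server: str, wanted: str) -> bool:
    """Return True if a server at version `server` can serve a client
    written against version `wanted`."""
    server_major, server_minor = parse_version(server)
    wanted_major, wanted_minor = parse_version(wanted)
    # A different major number signals a backward-incompatible change.
    return server_major == wanted_major and server_minor >= wanted_minor
```

With this check, a server advertising "2.4" is accepted by a client written against "2.0", while a server advertising "3.5" is not.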
+ + response-deletescript = response-oknobye + + response-getscript = (sieve-script CRLF response-ok) / + response-nobye + + response-havespace = response-oknobye + + response-listscripts = *(sieve-name [SP "ACTIVE"] CRLF) + response-oknobye + ;; ACTIVE may only occur with one sieve-name + + response-logout = response-oknobye + + response-unauthenticate= response-oknobye + ;; "NO" response can only be returned when + ;; the command is issued in a wrong state + ;; or has a wrong number of parameters + + response-ok = "OK" [SP "(" resp-code ")"] + [SP string] CRLF + ;; The string contains human-readable text + ;; encoded as UTF-8. + + response-nobye = ("NO" / "BYE") [SP "(" resp-code ")"] + [SP string] CRLF + ;; The string contains human-readable text + ;; encoded as UTF-8. + + + +Melnikov & Martin Standards Track [Page 36] + +RFC 5804 ManageSieve July 2010 + + + response-oknobye = response-ok / response-nobye + + response-noop = response-ok + + response-putscript = response-oknobye + + response-checkscript = response-oknobye + + response-renamescript = response-oknobye + + response-setactive = response-oknobye + + response-starttls = (response-ok response-capability) / + response-nobye + + sieve-name = string + ;; See Section 1.6 for the full list of + ;; prohibited characters. + ;; Empty string is not allowed. + + active-sieve-name = string + ;; See Section 1.6 for the full list of + ;; prohibited characters. + ;; This is similar to <sieve-name>, but + ;; empty string is allowed and has a special + ;; meaning. + + string = quoted / literal-c2s / literal-s2c + ;; literal-c2s is only allowed when sent + ;; from the client to the server. + ;; literal-s2c is only allowed when sent + ;; from the server to the client. + ;; quoted is allowed in either direction. + +5. Security Considerations + + The AUTHENTICATE command uses SASL [SASL] to provide authentication + and authorization services. Integrity and privacy services can be + provided by [SASL] and/or [TLS]. 
When a SASL mechanism is used, the + security considerations for that mechanism apply. + + This protocol's transactions are susceptible to passive observers or + man-in-the-middle attacks that alter the data, unless the optional + encryption and integrity services of the SASL (via the AUTHENTICATE + command) and/or [TLS] (via the STARTTLS command) are enabled, or an + external security mechanism is used for protection. It may be useful + to allow configuration of both clients and servers to refuse to + transfer sensitive information in the absence of strong encryption. + + + +Melnikov & Martin Standards Track [Page 37] + +RFC 5804 ManageSieve July 2010 + + + If an implementation supports SASL mechanisms that are vulnerable to + passive eavesdropping attacks (such as [PLAIN]), then the + implementation MUST support at least one configuration where these + SASL mechanisms are not advertised or used without the presence of an + external security layer such as [TLS]. + + Some response codes returned on failed AUTHENTICATE command may + disclose whether or not the username is valid (e.g., TRANSITION- + NEEDED), so server implementations SHOULD provide the ability to + disable these features (or make them not conditional on a per-user + basis) for sites concerned about such disclosure. In the case of + ENCRYPT-NEEDED, if it is applied to all identities then no extra + information is disclosed, but if it is applied on a per-user basis it + can disclose information. + + A compromised or malicious server can use the TRANSITION-NEEDED + response code to force the client that is configured to use a + mechanism that does not disclose the user's password to the server + (e.g., Kerberos), to send the bare password to the server. Clients + SHOULD have the ability to disable the password transition feature, + or disclose that risk to the user and offer the user an option of how + to proceed. + +6. 
IANA Considerations + + IANA has reserved TCP port number 4190 for use with the ManageSieve + protocol described in this document. + + IANA has registered the "sieve" URI scheme defined in Section 3 of + this document. + + IANA has registered "sieve" in the "GSSAPI/Kerberos/SASL Service + Names" registry. + + IANA has created a new registry for ManageSieve capabilities. The + registration template for ManageSieve capabilities is specified in + Section 6.1. ManageSieve protocol capabilities MUST be specified in + a Standards-Track or IESG-approved Experimental RFC. + + IANA has created a new registry for ManageSieve response codes. The + registration template for ManageSieve response codes is specified in + Section 6.3. ManageSieve protocol response codes MUST be specified + in a Standards-Track or IESG-approved Experimental RFC. + + + + + + + + +Melnikov & Martin Standards Track [Page 38] + +RFC 5804 ManageSieve July 2010 + + +6.1. ManageSieve Capability Registration Template + + To: iana@iana.org + Subject: ManageSieve Capability Registration + + Please register the following ManageSieve capability: + + Capability name: + Description: + Relevant publications: + Person & email address to contact for further information: + Author/Change controller: + +6.2. Registration of Initial ManageSieve Capabilities + + To: iana@iana.org + Subject: ManageSieve Capability Registration + + Please register the following ManageSieve capabilities: + + Capability name: IMPLEMENTATION + Description: Its value contains the name of the server + implementation and its version. + Relevant publications: this RFC, Section 1.7. + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Capability name: SASL + Description: Its value contains a space-separated list of SASL + mechanisms supported by the server. + Relevant publications: this RFC, Sections 1.7 and 2.1. 
+ Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Capability name: SIEVE + Description: Its value contains a space-separated list of supported + SIEVE extensions. + Relevant publications: this RFC, Section 1.7. Also [SIEVE]. + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + + + + + + + +Melnikov & Martin Standards Track [Page 39] + +RFC 5804 ManageSieve July 2010 + + + Capability name: STARTTLS + Description: This capability is returned if the server supports TLS + (STARTTLS command). + Relevant publications: this RFC, Sections 1.7 and 2.2. + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Capability name: NOTIFY + Description: This capability is returned if the server supports the + 'enotify' [NOTIFY] Sieve extension. + Relevant publications: this RFC, Section 1.7. + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Capability name: MAXREDIRECTS + Description: This capability returns the limit on the number of + Sieve "redirect" actions a script can perform during a + single evaluation. The value is a non-negative number + represented as a ManageSieve string. + Relevant publications: this RFC, Section 1.7. + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Capability name: LANGUAGE + Description: The language (<Language-Tag> from [RFC5646]) currently + used for human-readable error messages. + Relevant publications: this RFC, Section 1.7. + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. 
+ + Capability name: OWNER + Description: Its value contains the UTF-8-encoded name of the + currently logged-in user ("authorization identity" + according to RFC 4422). + Relevant publications: this RFC, Section 1.7. + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + + + + + + + + +Melnikov & Martin Standards Track [Page 40] + +RFC 5804 ManageSieve July 2010 + + + Capability name: VERSION + Description: This capability is returned if the server is compliant + with RFC 5804; i.e., that it supports RENAMESCRIPT, + CHECKSCRIPT, and NOOP commands. + Relevant publications: this RFC, Sections 2.11, 2.12, and 2.13. + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + +6.3. ManageSieve Response Code Registration Template + + To: iana@iana.org + Subject: ManageSieve Response Code Registration + + Please register the following ManageSieve response code: + + Response Code: + Arguments (use ABNF to specify syntax, or the word NONE if none + can be specified): + Purpose: + Published Specification(s): + Person & email address to contact for further information: + Author/Change controller: + +6.4. Registration of Initial ManageSieve Response Codes + + To: iana@iana.org + Subject: ManageSieve Response Code Registration + + Please register the following ManageSieve response codes: + + Response Code: AUTH-TOO-WEAK + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: This response code is returned in the NO response from + an AUTHENTICATE command. It indicates that site + security policy forbids the use of the requested + mechanism for the specified authentication identity. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. 
+ + + + + + + + + +Melnikov & Martin Standards Track [Page 41] + +RFC 5804 ManageSieve July 2010 + + + Response Code: ENCRYPT-NEEDED + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: This response code is returned in the NO response from + an AUTHENTICATE command. It indicates that site + security policy requires the use of a strong + encryption mechanism for the specified authentication + identity and mechanism. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Response Code: QUOTA + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: If this response code is returned in the NO/BYE + response, it means that the command would have placed + the user above the site-defined quota constraints. If + this response code is returned in the OK response, it + can mean that the user is near its quota or that the + user exceeded its quota, but the server supports soft + quotas. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Response Code: QUOTA/MAXSCRIPTS + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: If this response code is returned in the NO/BYE + response, it means that the command would have placed + the user above the site-defined limit on the number of + Sieve scripts. If this response code is returned in + the OK response, it can mean that the user is near its + quota or that the user exceeded its quota, but the + server supports soft quotas. This response code is a + more specific version of the QUOTA response code. 
+ Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + + + + + + + +Melnikov & Martin Standards Track [Page 42] + +RFC 5804 ManageSieve July 2010 + + + Response Code: QUOTA/MAXSIZE + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: If this response code is returned in the NO/BYE + response, it means that the command would have placed + the user above the site-defined maximum script size. + If this response code is returned in the OK response, + it can mean that the user is near its quota or that + the user exceeded its quota, but the server supports + soft quotas. This response code is a more specific + version of the QUOTA response code. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Response Code: REFERRAL + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): <sieveurl> + Purpose: This response code may be returned with a BYE result + from any command, and includes a mandatory parameter + that indicates what server to access to manage this + user's Sieve scripts. The server will be specified by + a Sieve URL (see Section 3). The scriptname portion + of the URL MUST NOT be specified. The client should + authenticate to the specified server and use it for + all further commands in the current session. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. 
+ + Response Code: SASL + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): <string> + Purpose: This response code can occur in the OK response to a + successful AUTHENTICATE command and includes the + optional final server response data from the server as + specified by [SASL]. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + + + + + + + +Melnikov & Martin Standards Track [Page 43] + +RFC 5804 ManageSieve July 2010 + + + Response Code: TRANSITION-NEEDED + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: This response code occurs in a NO response of an + AUTHENTICATE command. It indicates that the user name + is valid, but the entry in the authentication database + needs to be updated in order to permit authentication + with the specified mechanism. This is typically done + by establishing a secure channel using TLS, followed + by authenticating once using the [PLAIN] + authentication mechanism. The selected mechanism + SHOULD then work for authentications in subsequent + sessions. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Response Code: TRYLATER + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: A command failed due to a temporary server failure. + The client MAY continue using local information and + try the command later. This response code only makes + sense when returned in a NO/BYE response. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. 
+ + Response Code: ACTIVE + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: A command failed because it is not allowed on the + active script, for example, DELETESCRIPT on the active + script. This response code only makes sense when + returned in a NO/BYE response. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + + + + + + + + + +Melnikov & Martin Standards Track [Page 44] + +RFC 5804 ManageSieve July 2010 + + + Response Code: NONEXISTENT + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: A command failed because the referenced script name + doesn't exist. This response code only makes sense + when returned in a NO/BYE response. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Response Code: ALREADYEXISTS + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: A command failed because the referenced script name + already exists. This response code only makes sense + when returned in a NO/BYE response. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Response Code: WARNINGS + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): NONE + Purpose: This response code MAY be returned by the server in + the OK response (but it might be returned with the NO/ + BYE response as well) and signals the client that even + though the script is syntactically valid, it might + contain errors not intended by the script writer. 
+ Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + Response Code: TAG + Arguments (use ABNF to specify syntax, or the word NONE if none can + be specified): string + Purpose: This response code name is followed by a string + specified in the command that caused this response. + It is typically used for client state synchronization. + Published Specification(s): [RFC5804] + Person & email address to contact for further information: + Alexey Melnikov <alexey.melnikov@isode.com> + Author/Change controller: IESG. + + + + + + +Melnikov & Martin Standards Track [Page 45] + +RFC 5804 ManageSieve July 2010 + + +7. Internationalization Considerations + + The LANGUAGE capability (see Section 1.7) allows a client to discover + the current language used in all human-readable responses that might + be returned at the end of any OK/NO/BYE response. Human-readable + text in OK responses typically doesn't need to be shown to the user, + unless it is returned in response to a PUTSCRIPT or CHECKSCRIPT + command that also contains the WARNINGS response code (Section 1.3). + Human-readable text from NO/BYE responses is intended to be shown to the + user, unless the client can automatically handle failure of the + command that caused such a response. Clients SHOULD use response + codes (Section 1.3) for automatic error handling. Response codes MAY + also be used by the client to present error messages in a language + understood by the user, for example, if the LANGUAGE capability + doesn't return a language understood by the user. + + Note that the human-readable text from OK (WARNINGS) or NO/BYE + responses for PUTSCRIPT/CHECKSCRIPT commands is intended for advanced + users that understand Sieve language. 
Such advanced users are often + sophisticated enough to be able to handle whatever language the + server is using, even if it is not their preferred language, and will + want to see error/warning text no matter what language the server + puts it in. + + A client that generates Sieve script automatically, for example, if + the script is generated without user intervention or from a UI that + presents an abstract list of conditions and corresponding actions, + SHOULD NOT present warning/error messages to the user, because the + user might not even be aware that the client is using Sieve + underneath. However, if the client has a debugging mode, such + warnings/errors SHOULD be available in the debugging mode. + + Note that this document doesn't provide a way to modify the currently + used language. It is expected that a future extension will address + that. + +8. Acknowledgements + + Thanks to Simon Josefsson, Larry Greenfield, Allen Johnson, Chris + Newman, Lyndon Nerenberg, Tim Showalter, Sarah Robeson, Walter Wong, + Barry Leiba, Arnt Gulbrandsen, Stephan Bosch, Ken Murchison, Phil + Pennock, Ned Freed, Jeffrey Hutzelman, Mark E. Mallett, Dilyan + Palauzov, Dave Cridland, Aaron Stone, Robert Burrell Donkin, Patrick + Ben Koetter, Bjoern Hoehrmann, Martin Duerst, Pasi Eronen, Magnus + Westerlund, Tim Polk, and Julien Coloos for help with this document. + Special thank you to Phil Pennock for providing text for the NOOP + command, as well as finding various bugs in the document. + + + + +Melnikov & Martin Standards Track [Page 46] + +RFC 5804 ManageSieve July 2010 + + +9. References + +9.1. Normative References + + [ABNF] Crocker, D. and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", STD 68, RFC 5234, January 2008. + + [ACAP] Newman, C. and J. Myers, "ACAP -- Application + Configuration Access Protocol", RFC 2244, November + 1997. + + [BASE64] Josefsson, S., "The Base16, Base32, and Base64 Data + Encodings", RFC 4648, October 2006. 
+ + [DNS-SRV] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR + for specifying the location of services (DNS SRV)", + RFC 2782, February 2000. + + [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [NET-UNICODE] Klensin, J. and M. Padlipsky, "Unicode Format for + Network Interchange", RFC 5198, March 2008. + + [NOTIFY] Melnikov, A., Leiba, B., Segmuller, W., and T. Martin, + "Sieve Email Filtering: Extension for Notifications", + RFC 5435, January 2009. + + [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and + Languages", BCP 18, RFC 2277, January 1998. + + [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version + 6 (IPv6) Specification", RFC 2460, December 1998. + + [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, + "Internationalizing Domain Names in Applications + (IDNA)", RFC 3490, March 2003. + + [RFC4519] Sciberras, A., "Lightweight Directory Access Protocol + (LDAP): Schema for User Applications", RFC 4519, June + 2006. + + [RFC5646] Phillips, A. and M. Davis, "Tags for Identifying + Languages", BCP 47, RFC 5646, September 2009. + + [RFC791] Postel, J., "Internet Protocol", STD 5, RFC 791, + September 1981. + + + + +Melnikov & Martin Standards Track [Page 47] + +RFC 5804 ManageSieve July 2010 + + + [SASL] Melnikov, A. and K. Zeilenga, "Simple Authentication + and Security Layer (SASL)", RFC 4422, June 2006. + + [SASLprep] Zeilenga, K., "SASLprep: Stringprep Profile for User + Names and Passwords", RFC 4013, February 2005. + + [SCRAM] Menon-Sen, A., Melnikov, A., Newman, C., and N. + Williams, "Salted Challenge Response Authentication + Mechanism (SCRAM) SASL and GSS-API Mechanisms", RFC + 5802, July 2010. + + [SIEVE] Guenther, P. and T. Showalter, "Sieve: An Email + Filtering Language", RFC 5228, January 2008. + + [StringPrep] Hoffman, P. and M. Blanchet, "Preparation of + Internationalized Strings ("stringprep")", RFC 3454, + December 2002. + + [TLS] Dierks, T. 
and E. Rescorla, "The Transport Layer + Security (TLS) Protocol Version 1.2", RFC 5246, August + 2008. + + [URI-GEN] Berners-Lee, T., Fielding, R., and L. Masinter, + "Uniform Resource Identifier (URI): Generic Syntax", + STD 66, RFC 3986, January 2005. + + [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO + 10646", STD 63, RFC 3629, November 2003. + + [X509] Cooper, D., Santesson, S., Farrell, S., Boeyen, S., + Housley, R., and W. Polk, "Internet X.509 Public Key + Infrastructure Certificate and Certificate Revocation + List (CRL) Profile", RFC 5280, May 2008. + + [X509-SRV] Santesson, S., "Internet X.509 Public Key + Infrastructure Subject Alternative Name for Expression + of Service Name", RFC 4985, August 2007. + +9.2. Informative References + + [DIGEST-MD5] Leach, P. and C. Newman, "Using Digest Authentication + as a SASL Mechanism", RFC 2831, May 2000. + + [GSSAPI] Melnikov, A., "The Kerberos V5 ("GSSAPI") Simple + Authentication and Security Layer (SASL) Mechanism", + RFC 4752, November 2006. + + + + + +Melnikov & Martin Standards Track [Page 48] + +RFC 5804 ManageSieve July 2010 + + + [I-HAVE] Freed, N., "Sieve Email Filtering: Ihave Extension", + RFC 5463, March 2009. + + [IMAP] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - + VERSION 4rev1", RFC 3501, March 2003. + + [LDAP] Zeilenga, K., "Lightweight Directory Access Protocol + (LDAP): Technical Specification Road Map", RFC 4510, + June 2006. + + [PLAIN] Zeilenga, K., "The PLAIN Simple Authentication and + Security Layer (SASL) Mechanism", RFC 4616, August + 2006. + +Authors' Addresses + + Alexey Melnikov (editor) + Isode Limited + 5 Castle Business Village + 36 Station Road + Hampton, Middlesex TW12 2BX + UK + + EMail: Alexey.Melnikov@isode.com + + + Tim Martin + BeThereBeSquare, Inc. + 672 Haight st. 
+ San Francisco, CA 94117 + USA + + Phone: +1 510 260-4175 + EMail: timmartin@alumni.cmu.edu + + + + + + + + + + + + + + + + + +Melnikov & Martin Standards Track [Page 49] + diff --git a/rfc/rfc822.txt b/rfc/rfc822.txt @@ -0,0 +1,2901 @@ + + + + + + + RFC # 822 + + Obsoletes: RFC #733 (NIC #41952) + + + + + + + + + + + + + STANDARD FOR THE FORMAT OF + + ARPA INTERNET TEXT MESSAGES + + + + + + + August 13, 1982 + + + + + + + Revised by + + David H. Crocker + + + Dept. of Electrical Engineering + University of Delaware, Newark, DE 19711 + Network: DCrocker @ UDel-Relay + + + + + + + + + + + + + + + + Standard for ARPA Internet Text Messages + + + TABLE OF CONTENTS + + + PREFACE .................................................... ii + + 1. INTRODUCTION ........................................... 1 + + 1.1. Scope ............................................ 1 + 1.2. Communication Framework .......................... 2 + + 2. NOTATIONAL CONVENTIONS ................................. 3 + + 3. LEXICAL ANALYSIS OF MESSAGES ........................... 5 + + 3.1. General Description .............................. 5 + 3.2. Header Field Definitions ......................... 9 + 3.3. Lexical Tokens ................................... 10 + 3.4. Clarifications ................................... 11 + + 4. MESSAGE SPECIFICATION .................................. 17 + + 4.1. Syntax ........................................... 17 + 4.2. Forwarding ....................................... 19 + 4.3. Trace Fields ..................................... 20 + 4.4. Originator Fields ................................ 21 + 4.5. Receiver Fields .................................. 23 + 4.6. Reference Fields ................................. 23 + 4.7. Other Fields ..................................... 24 + + 5. DATE AND TIME SPECIFICATION ............................ 26 + + 5.1. Syntax ........................................... 26 + 5.2. Semantics ........................................ 26 + + 6. 
ADDRESS SPECIFICATION .................................. 27 + + 6.1. Syntax ........................................... 27 + 6.2. Semantics ........................................ 27 + 6.3. Reserved Address ................................. 33 + + 7. BIBLIOGRAPHY ........................................... 34 + + + APPENDIX + + A. EXAMPLES ............................................... 36 + B. SIMPLE FIELD PARSING ................................... 40 + C. DIFFERENCES FROM RFC #733 .............................. 41 + D. ALPHABETICAL LISTING OF SYNTAX RULES ................... 44 + + + August 13, 1982 - i - RFC #822 + + + + + Standard for ARPA Internet Text Messages + + + PREFACE + + + By 1977, the Arpanet employed several informal standards for + the text messages (mail) sent among its host computers. It was + felt necessary to codify these practices and provide for those + features that seemed imminent. The result of that effort was + Request for Comments (RFC) #733, "Standard for the Format of ARPA + Network Text Message", by Crocker, Vittal, Pogran, and Henderson. + The specification attempted to avoid major changes in existing + software, while permitting several new features. + + This document revises the specifications in RFC #733, in + order to serve the needs of the larger and more complex ARPA + Internet. Some of RFC #733's features failed to gain adequate + acceptance. In order to simplify the standard and the software + that follows it, these features have been removed. A different + addressing scheme is used, to handle the case of inter-network + mail; and the concept of re-transmission has been introduced. + + This specification is intended for use in the ARPA Internet. + However, an attempt has been made to free it of any dependence on + that environment, so that it can be applied to other network text + message systems. 
+ + The specification of RFC #733 took place over the course of + one year, using the ARPANET mail environment, itself, to provide + an on-going forum for discussing the capabilities to be included. + More than twenty individuals, from across the country, partici- + pated in the original discussion. The development of this + revised specification has, similarly, utilized network mail-based + group discussion. Both specification efforts greatly benefited + from the comments and ideas of the participants. + + The syntax of the standard, in RFC #733, was originally + specified in the Backus-Naur Form (BNF) meta-language. Ken L. + Harrenstien, of SRI International, was responsible for re-coding + the BNF into an augmented BNF that makes the representation + smaller and easier to understand. + + + + + + + + + + + + + August 13, 1982 - ii - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 1. INTRODUCTION + + 1.1. SCOPE + + This standard specifies a syntax for text messages that are + sent among computer users, within the framework of "electronic + mail". The standard supersedes the one specified in ARPANET + Request for Comments #733, "Standard for the Format of ARPA Net- + work Text Messages". + + In this context, messages are viewed as having an envelope + and contents. The envelope contains whatever information is + needed to accomplish transmission and delivery. The contents + compose the object to be delivered to the recipient. This stan- + dard applies only to the format and some of the semantics of mes- + sage contents. It contains no specification of the information + in the envelope. + + However, some message systems may use information from the + contents to create the envelope. It is intended that this stan- + dard facilitate the acquisition of such information by programs. + + Some message systems may store messages in formats that + differ from the one specified in this standard. 
This specifica- + tion is intended strictly as a definition of what message content + format is to be passed BETWEEN hosts. + + Note: This standard is NOT intended to dictate the internal for- + mats used by sites, the specific message system features + that they are expected to support, or any of the charac- + teristics of user interface programs that create or read + messages. + + A distinction should be made between what the specification + REQUIRES and what it ALLOWS. Messages can be made complex and + rich with formally-structured components of information or can be + kept small and simple, with a minimum of such information. Also, + the standard simplifies the interpretation of differing visual + formats in messages; only the visual aspect of a message is + affected and not the interpretation of information within it. + Implementors may choose to retain such visual distinctions. + + The formal definition is divided into four levels. The bot- + tom level describes the meta-notation used in this document. The + second level describes basic lexical analyzers that feed tokens + to higher-level parsers. Next is an overall specification for + messages; it permits distinguishing individual fields. Finally, + there is definition of the contents of several structured fields. + + + + August 13, 1982 - 1 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 1.2. COMMUNICATION FRAMEWORK + + Messages consist of lines of text. No special provisions + are made for encoding drawings, facsimile, speech, or structured + text. No significant consideration has been given to questions + of data compression or to transmission and storage efficiency, + and the standard tends to be free with the number of bits con- + sumed. For example, field names are specified as free text, + rather than special terse codes. + + A general "memo" framework is used. 
That is, a message con- + sists of some information in a rigid format, followed by the main + part of the message, with a format that is not specified in this + document. The syntax of several fields of the rigidly-formatted + ("headers") section is defined in this specification; some of + these fields must be included in all messages. + + The syntax that distinguishes between header fields is + specified separately from the internal syntax for particular + fields. This separation is intended to allow simple parsers to + operate on the general structure of messages, without concern for + the detailed structure of individual header fields. Appendix B + is provided to facilitate construction of these parsers. + + In addition to the fields specified in this document, it is + expected that other fields will gain common use. As necessary, + the specifications for these "extension-fields" will be published + through the same mechanism used to publish this document. Users + may also wish to extend the set of fields that they use + privately. Such "user-defined fields" are permitted. + + The framework severely constrains document tone and appear- + ance and is primarily useful for most intra-organization communi- + cations and well-structured inter-organization communication. + It also can be used for some types of inter-process communica- + tion, such as simple file transfer and remote job entry. A more + robust framework might allow for multi-font, multi-color, multi- + dimension encoding of information. A less robust one, as is + present in most single-machine message systems, would more + severely constrain the ability to add fields and the decision to + include specific fields. In contrast with paper-based communica- + tion, it is interesting to note that the RECEIVER of a message + can exercise an extraordinary amount of control over the + message's appearance. 
The amount of actual control available to + message receivers is contingent upon the capabilities of their + individual message systems. + + + + + + August 13, 1982 - 2 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 2. NOTATIONAL CONVENTIONS + + This specification uses an augmented Backus-Naur Form (BNF) + notation. The differences from standard BNF involve naming rules + and indicating repetition and "local" alternatives. + + 2.1. RULE NAMING + + Angle brackets ("<", ">") are not used, in general. The + name of a rule is simply the name itself, rather than "<name>". + Quotation-marks enclose literal text (which may be upper and/or + lower case). Certain basic rules are in uppercase, such as + SPACE, TAB, CRLF, DIGIT, ALPHA, etc. Angle brackets are used in + rule definitions, and in the rest of this document, whenever + their presence will facilitate discerning the use of rule names. + + 2.2. RULE1 / RULE2: ALTERNATIVES + + Elements separated by slash ("/") are alternatives. There- + fore "foo / bar" will accept foo or bar. + + 2.3. (RULE1 RULE2): LOCAL ALTERNATIVES + + Elements enclosed in parentheses are treated as a single + element. Thus, "(elem (foo / bar) elem)" allows the token + sequences "elem foo elem" and "elem bar elem". + + 2.4. *RULE: REPETITION + + The character "*" preceding an element indicates repetition. + The full form is: + + <l>*<m>element + + indicating at least <l> and at most <m> occurrences of element. + Default values are 0 and infinity so that "*(element)" allows any + number, including zero; "1*element" requires at least one; and + "1*2element" allows one or two. + + 2.5. [RULE]: OPTIONAL + + Square brackets enclose optional elements; "[foo bar]" is + equivalent to "*1(foo bar)". + + 2.6. NRULE: SPECIFIC REPETITION + + "<n>(element)" is equivalent to "<n>*<n>(element)"; that is, + exactly <n> occurrences of (element). Thus 2DIGIT is a 2-digit + number, and 3ALPHA is a string of three alphabetic characters. 
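As an illustration of the repetition notation in sections 2.4 through 2.6, the following Python sketch turns a prefixed element into explicit bounds. The function name repetition_bounds and the returned tuple layout are assumptions made for this example and are not part of the specification; None stands in for the "infinity" default.

```python
import re


def repetition_bounds(rule):
    """Return (minimum, maximum, element) for an element that may carry a
    repetition prefix. maximum is None when there is no upper limit."""
    # Section 2.5: "[foo bar]" is equivalent to "*1(foo bar)".
    if rule.startswith("[") and rule.endswith("]"):
        return (0, 1, "(" + rule[1:-1] + ")")
    # Section 2.4: "<l>*<m>element", with defaults 0 and infinity.
    match = re.fullmatch(r"(\d*)\*(\d*)(.+)", rule)
    if match:
        low = int(match.group(1)) if match.group(1) else 0
        high = int(match.group(2)) if match.group(2) else None
        return (low, high, match.group(3))
    # Section 2.6: "<n>element" means exactly <n> occurrences.
    match = re.fullmatch(r"(\d+)(.+)", rule)
    if match:
        count = int(match.group(1))
        return (count, count, match.group(2))
    # No prefix at all: the element occurs exactly once.
    return (1, 1, rule)
```

For instance, repetition_bounds("1*2element") yields the bounds one and two, and repetition_bounds("2DIGIT") yields exactly two occurrences of DIGIT. The sketch only reads the prefix; it does not parse the element itself.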
+ + + August 13, 1982 - 3 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 2.7. #RULE: LISTS + + A construct "#" is defined, similar to "*", as follows: + + <l>#<m>element + + indicating at least <l> and at most <m> elements, each separated + by one or more commas (","). This makes the usual form of lists + very easy; a rule such as '(element *("," element))' can be shown + as "1#element". Wherever this construct is used, null elements + are allowed, but do not contribute to the count of elements + present. That is, "(element),,(element)" is permitted, but + counts as only two elements. Therefore, where at least one ele- + ment is required, at least one non-null element must be present. + Default values are 0 and infinity so that "#(element)" allows any + number, including zero; "1#element" requires at least one; and + "1#2element" allows one or two. + + 2.8. ; COMMENTS + + A semi-colon, set off some distance to the right of rule + text, starts a comment that continues to the end of line. This + is a simple way of including useful notes in parallel with the + specifications. + + + + + + + + + + + + + + + + + + + + + + + + + + + + August 13, 1982 - 4 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 3. LEXICAL ANALYSIS OF MESSAGES + + 3.1. GENERAL DESCRIPTION + + A message consists of header fields and, optionally, a body. + The body is simply a sequence of lines containing ASCII charac- + ters. It is separated from the headers by a null line (i.e., a + line with nothing preceding the CRLF). + + 3.1.1. LONG HEADER FIELDS + + Each header field can be viewed as a single, logical line of + ASCII characters, comprising a field-name and a field-body. + For convenience, the field-body portion of this conceptual + entity can be split into a multiple-line representation; this + is called "folding". 
The general rule is that wherever there + may be linear-white-space (NOT simply LWSP-chars), a CRLF + immediately followed by AT LEAST one LWSP-char may instead be + inserted. Thus, the single line + + To: "Joe & J. Harvey" <ddd @Org>, JJV @ BBN + + can be represented as: + + To: "Joe & J. Harvey" <ddd @ Org>, + JJV@BBN + + and + + To: "Joe & J. Harvey" + <ddd@ Org>, JJV + @BBN + + and + + To: "Joe & + J. Harvey" <ddd @ Org>, JJV @ BBN + + The process of moving from this folded multiple-line + representation of a header field to its single line represen- + tation is called "unfolding". Unfolding is accomplished by + regarding CRLF immediately followed by a LWSP-char as + equivalent to the LWSP-char. + + Note: While the standard permits folding wherever linear- + white-space is permitted, it is recommended that struc- + tured fields, such as those containing addresses, limit + folding to higher-level syntactic breaks. For address + fields, it is recommended that such folding occur + + + August 13, 1982 - 5 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + between addresses, after the separating comma. + + 3.1.2. STRUCTURE OF HEADER FIELDS + + Once a field has been unfolded, it may be viewed as being com- + posed of a field-name followed by a colon (":"), followed by a + field-body, and terminated by a carriage-return/line-feed. + The field-name must be composed of printable ASCII characters + (i.e., characters that have values between 33. and 126., + decimal, except colon). The field-body may be composed of any + ASCII characters, except CR or LF. (While CR and/or LF may be + present in the actual text, they are removed by the action of + unfolding the field.) + + Certain field-bodies of headers may be interpreted according + to an internal syntax that some systems may wish to parse. + These fields are called "structured fields". Examples + include fields containing dates and addresses. 
Other fields, + such as "Subject" and "Comments", are regarded simply as + strings of text. + + Note: Any field which has a field-body that is defined as + other than simply <text> is to be treated as a struc- + tured field. + + Field-names, unstructured field bodies and structured + field bodies each are scanned by their own, independent + "lexical" analyzers. + + 3.1.3. UNSTRUCTURED FIELD BODIES + + For some fields, such as "Subject" and "Comments", no struc- + turing is assumed, and they are treated simply as <text>s, as + in the message body. Rules of folding apply to these fields, + so that such field bodies which occupy several lines must + therefore have the second and successive lines indented by at + least one LWSP-char. + + 3.1.4. STRUCTURED FIELD BODIES + + To aid in the creation and reading of structured fields, the + free insertion of linear-white-space (which permits folding + by inclusion of CRLFs) is allowed between lexical tokens. + Rather than obscuring the syntax specifications for these + structured fields with explicit syntax for this linear-white- + space, the existence of another "lexical" analyzer is assumed. + This analyzer does not apply for unstructured field bodies + that are simply strings of text, as described above. The + analyzer provides an interpretation of the unfolded text + + + August 13, 1982 - 6 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + composing the body of the field as a sequence of lexical sym- + bols. + + These symbols are: + + - individual special characters + - quoted-strings + - domain-literals + - comments + - atoms + + The first four of these symbols are self-delimiting. Atoms + are not; they are delimited by the self-delimiting symbols and + by linear-white-space. For the purposes of regenerating + sequences of atoms and quoted-strings, exactly one SPACE is + assumed to exist, and should be used, between them. 
(Also, in + the "Clarifications" section on "White Space", below, note the + rules about treatment of multiple contiguous LWSP-chars.) + + So, for example, the folded body of an address field + + ":sysmail"@ Some-Group. Some-Org, + Muhammed.(I am the greatest) Ali @(the)Vegas.WBA + + is analyzed into the following lexical symbols and types: + + :sysmail quoted string + @ special + Some-Group atom + . special + Some-Org atom + , special + Muhammed atom + . special + (I am the greatest) comment + Ali atom + @ special + (the) comment + Vegas atom + . special + WBA atom + + The canonical representations for the data in these addresses + are the following strings: + + ":sysmail"@Some-Group.Some-Org + + and + + Muhammed.Ali@Vegas.WBA + + Note: For purposes of display, and when passing such struc- + tured information to other systems, such as mail proto- + col services, there must be NO linear-white-space + between <word>s that are separated by period (".") or + at-sign ("@") and exactly one SPACE between all other + <word>s. Also, headers should be in a folded form. + + 3.2. HEADER FIELD DEFINITIONS + + These rules show a field meta-syntax, without regard for the + particular type or internal syntax. Their purpose is to permit + detection of fields; also, they present to higher-level parsers + an image of each field as fitting on one line. 
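A minimal Python sketch of how a parser might produce that one-line image of a field, assuming the raw header text uses CRLF line endings. The names unfold and split_field are hypothetical helpers chosen for this example; they come neither from the specification nor from this repository.

```python
import re

# A CRLF that is immediately followed by SPACE or HTAB marks a fold.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(raw):
    """Drop each folding CRLF, keeping the LWSP-char that follows it."""
    return _FOLD.sub("", raw)


def split_field(line):
    """Split one unfolded field into (field-name, field-body)."""
    if line.endswith("\r\n"):
        line = line[:-2]  # the terminating CRLF is not part of the body
    name, colon, body = line.partition(":")
    # field-name: one or more CHARs, excluding CTLs, SPACE and ":".
    if not colon or not name or any(not 33 <= ord(c) <= 126 for c in name):
        raise ValueError("not a valid header field: %r" % line)
    return name, body
```

Applied to the folded "To" example from section 3.1.1, split_field(unfold(raw)) gives the name "To" and the whole address list as a single string. The field-body is returned uninterpreted, which matches the separation between field detection and the internal syntax of particular fields.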
+ + field = field-name ":" [ field-body ] CRLF + + field-name = 1*<any CHAR, excluding CTLs, SPACE, and ":"> + + field-body = field-body-contents + [CRLF LWSP-char field-body] + + field-body-contents = + <the ASCII characters making up the field-body, as + defined in the following sections, and consisting + of combinations of atom, quoted-string, and + specials tokens, or else consisting of texts> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + August 13, 1982 - 9 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 3.3. LEXICAL TOKENS + + The following rules are used to define an underlying lexical + analyzer, which feeds tokens to higher level parsers. See the + ANSI references, in the Bibliography. + + ; ( Octal, Decimal.) + CHAR = <any ASCII character> ; ( 0-177, 0.-127.) + ALPHA = <any ASCII alphabetic character> + ; (101-132, 65.- 90.) + ; (141-172, 97.-122.) + DIGIT = <any ASCII decimal digit> ; ( 60- 71, 48.- 57.) + CTL = <any ASCII control ; ( 0- 37, 0.- 31.) + character and DEL> ; ( 177, 127.) + CR = <ASCII CR, carriage return> ; ( 15, 13.) + LF = <ASCII LF, linefeed> ; ( 12, 10.) + SPACE = <ASCII SP, space> ; ( 40, 32.) + HTAB = <ASCII HT, horizontal-tab> ; ( 11, 9.) + <"> = <ASCII quote mark> ; ( 42, 34.) + CRLF = CR LF + + LWSP-char = SPACE / HTAB ; semantics = SPACE + + linear-white-space = 1*([CRLF] LWSP-char) ; semantics = SPACE + ; CRLF => folding + + specials = "(" / ")" / "<" / ">" / "@" ; Must be in quoted- + / "," / ";" / ":" / "\" / <"> ; string, to use + / "." / "[" / "]" ; within a word. + + delimiters = specials / linear-white-space / comment + + text = <any CHAR, including bare ; => atoms, specials, + CR & bare LF, but NOT ; comments and + including CRLF> ; quoted-strings are + ; NOT recognized. + + atom = 1*<any CHAR except specials, SPACE and CTLs> + + quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or + ; quoted chars. 
+ + qtext = <any CHAR excepting <">, ; => may be folded + "\" & CR, and including + linear-white-space> + + domain-literal = "[" *(dtext / quoted-pair) "]" + + + + + August 13, 1982 - 10 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + dtext = <any CHAR excluding "[", ; => may be folded + "]", "\" & CR, & including + linear-white-space> + + comment = "(" *(ctext / quoted-pair / comment) ")" + + ctext = <any CHAR excluding "(", ; => may be folded + ")", "\" & CR, & including + linear-white-space> + + quoted-pair = "\" CHAR ; may quote any char + + phrase = 1*word ; Sequence of words + + word = atom / quoted-string + + + 3.4. CLARIFICATIONS + + 3.4.1. QUOTING + + Some characters are reserved for special interpretation, such + as delimiting lexical tokens. To permit use of these charac- + ters as uninterpreted data, a quoting mechanism is provided. + To quote a character, precede it with a backslash ("\"). + + This mechanism is not fully general. Characters may be quoted + only within a subset of the lexical constructs. In particu- + lar, quoting is limited to use within: + + - quoted-string + - domain-literal + - comment + + Within these constructs, quoting is REQUIRED for CR and "\" + and for the character(s) that delimit the token (e.g., "(" and + ")" for a comment). However, quoting is PERMITTED for any + character. + + Note: In particular, quoting is NOT permitted within atoms. + For example when the local-part of an addr-spec must + contain a special character, a quoted string must be + used. Therefore, a specification such as: + + Full\ Name@Domain + + is not legal and must be specified as: + + "Full Name"@Domain + + + August 13, 1982 - 11 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 3.4.2. WHITE SPACE + + Note: In structured field bodies, multiple linear space ASCII + characters (namely HTABs and SPACEs) are treated as + single spaces and may freely surround any symbol. 
In + all header fields, the only place in which at least one + LWSP-char is REQUIRED is at the beginning of continua- + tion lines in a folded field. + + When passing text to processes that do not interpret text + according to this standard (e.g., mail protocol servers), then + NO linear-white-space characters should occur between a period + (".") or at-sign ("@") and a <word>. Exactly ONE SPACE should + be used in place of arbitrary linear-white-space and comment + sequences. + + Note: Within systems conforming to this standard, wherever a + member of the list of delimiters is allowed, LWSP-chars + may also occur before and/or after it. + + Writers of mail-sending (i.e., header-generating) programs + should realize that there is no network-wide definition of the + effect of ASCII HT (horizontal-tab) characters on the appear- + ance of text at another network host; therefore, the use of + tabs in message headers, though permitted, is discouraged. + + 3.4.3. COMMENTS + + A comment is a set of ASCII characters, which is enclosed in + matching parentheses and which is not within a quoted-string. + The comment construct permits message originators to add text + which will be useful for human readers, but which will be + ignored by the formal semantics. Comments should be retained + while the message is subject to interpretation according to + this standard. However, comments must NOT be included in + other cases, such as during protocol exchanges with mail + servers. + + Comments nest, so that if an unquoted left parenthesis occurs + in a comment string, there must also be a matching right + parenthesis. When a comment acts as the delimiter between a + sequence of two lexical symbols, such as two atoms, it is lex- + ically equivalent with a single SPACE, for the purposes of + regenerating the sequence, such as when passing the sequence + onto a mail protocol server. Comments are detected as such + only within field-bodies of structured fields. 
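The nesting and quoting rules for comments can be sketched in Python as follows. This is an illustrative simplification, not a full lexical analyzer: it assumes an already unfolded structured field body, it does not treat domain-literals specially, and the name strip_comments is hypothetical. Each top-level comment is replaced by one SPACE, following the equivalence described above.

```python
def strip_comments(body):
    """Replace every comment in an unfolded structured field body
    with a single SPACE, honoring nesting and quoted-pairs."""
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string (only possible at depth 0)
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body) and (depth > 0 or in_quotes):
            # quoted-pair: the next character is data, never a delimiter
            if depth == 0:
                out.append(body[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes:
            out.append(ch)       # parentheses in a quoted-string are data
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                out.append(" ")  # a whole comment counts as one SPACE
        elif depth == 0:
            out.append(ch)
        i += 1
    if depth != 0:
        raise ValueError("unbalanced parentheses in comment")
    return "".join(out)
```

The sketch leaves the surrounding white space untouched, so a caller that needs the canonical form would still have to collapse runs of LWSP-chars and remove the spaces around "." and "@".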
+ + If a comment is to be "folded" onto multiple lines, then the + syntax for folding must be adhered to. (See the "Lexical + + + August 13, 1982 - 12 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + Analysis of Messages" section on "Folding Long Header Fields" + above, and the section on "Case Independence" below.) Note + that the official semantics therefore do not "see" any + unquoted CRLFs that are in comments, although particular pars- + ing programs may wish to note their presence. For these pro- + grams, it would be reasonable to interpret a "CRLF LWSP-char" + as being a CRLF that is part of the comment; i.e., the CRLF is + kept and the LWSP-char is discarded. Quoted CRLFs (i.e., a + backslash followed by a CR followed by a LF) still must be + followed by at least one LWSP-char. + + 3.4.4. DELIMITING AND QUOTING CHARACTERS + + The quote character (backslash) and characters that delimit + syntactic units are not, generally, to be taken as data that + are part of the delimited or quoted unit(s). In particular, + the quotation-marks that define a quoted-string, the + parentheses that define a comment and the backslash that + quotes a following character are NOT part of the quoted- + string, comment or quoted character. A quotation-mark that is + to be part of a quoted-string, a parenthesis that is to be + part of a comment and a backslash that is to be part of either + must each be preceded by the quote-character backslash ("\"). + Note that the syntax allows any character to be quoted within + a quoted-string or comment; however only certain characters + MUST be quoted to be included as data. These characters are + the ones that are not part of the alternate text group (i.e., + ctext or qtext). + + The one exception to this rule is that a single SPACE is + assumed to exist between contiguous words in a phrase, and + this interpretation is independent of the actual number of + LWSP-chars that the creator places between the words. 
To + include more than one SPACE, the creator must make the LWSP- + chars be part of a quoted-string. + + Quotation marks that delimit a quoted string and backslashes + that quote the following character should NOT accompany the + quoted-string when the string is passed to processes that do + not interpret data according to this specification (e.g., mail + protocol servers). + + 3.4.5. QUOTED-STRINGS + + Where permitted (i.e., in words in structured fields) quoted- + strings are treated as a single symbol. That is, a quoted- + string is equivalent to an atom, syntactically. If a quoted- + string is to be "folded" onto multiple lines, then the syntax + for folding must be adhered to. (See the "Lexical Analysis of + + + August 13, 1982 - 13 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + Messages" section on "Folding Long Header Fields" above, and + the section on "Case Independence" below.) Therefore, the + official semantics do not "see" any bare CRLFs that are in + quoted-strings; however particular parsing programs may wish + to note their presence. For such programs, it would be rea- + sonable to interpret a "CRLF LWSP-char" as being a CRLF which + is part of the quoted-string; i.e., the CRLF is kept and the + LWSP-char is discarded. Quoted CRLFs (i.e., a backslash fol- + lowed by a CR followed by a LF) are also subject to rules of + folding, but the presence of the quoting character (backslash) + explicitly indicates that the CRLF is data to the quoted + string. Stripping off the first following LWSP-char is also + appropriate when parsing quoted CRLFs. + + 3.4.6. BRACKETING CHARACTERS + + There is one type of bracket which must occur in matched pairs + and may have pairs nested within each other: + + o Parentheses ("(" and ")") are used to indicate com- + ments. 
+ + There are three types of brackets which must occur in matched + pairs, and which may NOT be nested: + + o Colon/semi-colon (":" and ";") are used in address + specifications to indicate that the included list of + addresses is to be treated as a group. + + o Angle brackets ("<" and ">") are generally used to + indicate the presence of one machine-usable refer- + ence (e.g., delimiting mailboxes), possibly including + source-routing to the machine. + + o Square brackets ("[" and "]") are used to indicate the + presence of a domain-literal, which the appropriate + name-domain is to use directly, bypassing normal + name-resolution mechanisms. + + 3.4.7. CASE INDEPENDENCE + + Except as noted, alphabetic strings may be represented in any + combination of upper and lower case. The only syntactic units + which require preservation of case information are: + + - text + - qtext + - dtext + - ctext + - quoted-pair + - local-part, except "Postmaster" + + When matching any other syntactic unit, case is to be ignored. + For example, the field-names "From", "FROM", "from", and even + "FroM" are semantically equal and should all be treated ident- + ically. + + When generating these units, any mix of upper and lower case + alphabetic characters may be used. The case shown in this + specification is suggested for message-creating processes. + + Note: The reserved local-part address unit, "Postmaster", is + an exception. When the value "Postmaster" is being + interpreted, it must be accepted in any mixture of + case, including "POSTMASTER", and "postmaster". + + 3.4.8. FOLDING LONG HEADER FIELDS + + Each header field may be represented on exactly one line con- + sisting of the name of the field and its body, and terminated + by a CRLF; this is what the parser sees. 
For readability, the + field-body portion of long header fields may be "folded" onto + multiple lines of the actual field. "Long" is commonly inter- + preted to mean greater than 65 or 72 characters. The former + length serves as a limit, when the message is to be viewed on + most simple terminals which use simple display software; how- + ever, the limit is not imposed by this standard. + + Note: Some display software can selectively fold lines, + to suit the display terminal. In such cases, sender- + provided folding can interfere with the display + software. + + 3.4.9. BACKSPACE CHARACTERS + + ASCII BS characters (Backspace, decimal 8) may be included in + texts and quoted-strings to effect overstriking. However, any + use of backspaces which effects an overstrike to the left of + the beginning of the text or quoted-string is prohibited. + + 3.4.10. NETWORK-SPECIFIC TRANSFORMATIONS + + During transmission through heterogeneous networks, it may be + necessary to force data to conform to a network's local con- + ventions. For example, it may be required that a CR be fol- + lowed either by LF, making a CRLF, or by <null>, if the CR is + to stand alone. Such transformations are reversed when the + message exits that network. + + When crossing network boundaries, the message should be + treated as passing through two modules. It will enter the + first module containing whatever network-specific transforma- + tions were necessary to permit migration through the + "current" network. It then passes through the modules: + + o Transformation Reversal + + The "current" network's idiosyncracies are removed and + the message is returned to the canonical form speci- + fied in this standard. + + o Transformation + + The "next" network's local idiosyncracies are imposed + on the message. 
+ + ------------------ + From ==> | Remove Net-A | + Net-A | idiosyncracies | + ------------------ + || + \/ + Conformance + with standard + || + \/ + ------------------ + | Impose Net-B | ==> To + | idiosyncracies | Net-B + ------------------ + + + + + + + + + + + + August 13, 1982 - 16 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 4. MESSAGE SPECIFICATION + + 4.1. SYNTAX + + Note: Due to an artifact of the notational conventions, the syn- + tax indicates that, when present, some fields, must be in + a particular order. Header fields are NOT required to + occur in any particular order, except that the message + body must occur AFTER the headers. It is recommended + that, if present, headers be sent in the order "Return- + Path", "Received", "Date", "From", "Subject", "Sender", + "To", "cc", etc. + + This specification permits multiple occurrences of most + fields. Except as noted, their interpretation is not + specified here, and their use is discouraged. + + The following syntax for the bodies of various fields should + be thought of as describing each field body as a single long + string (or line). The "Lexical Analysis of Message" section on + "Long Header Fields", above, indicates how such long strings can + be represented on more than one line in the actual transmitted + message. 
+ + message = fields *( CRLF *text ) ; Everything after + ; first null line + ; is message body + + fields = dates ; Creation time, + source ; author id & one + 1*destination ; address required + *optional-field ; others optional + + source = [ trace ] ; net traversals + originator ; original mail + [ resent ] ; forwarded + + trace = return ; path to sender + 1*received ; receipt tags + + return = "Return-path" ":" route-addr ; return address + + received = "Received" ":" ; one per relay + ["from" domain] ; sending host + ["by" domain] ; receiving host + ["via" atom] ; physical path + *("with" atom) ; link/mail protocol + ["id" msg-id] ; receiver msg id + ["for" addr-spec] ; initial form + ";" date-time ; time received + + originator = authentic ; authenticated addr + [ "Reply-To" ":" 1#address] + + authentic = "From" ":" mailbox ; Single author + / ( "Sender" ":" mailbox ; Actual submitter + "From" ":" 1#mailbox) ; Multiple authors + ; or not sender + + resent = resent-authentic + [ "Resent-Reply-To" ":" 1#address] + + resent-authentic = + "Resent-From" ":" mailbox + / ( "Resent-Sender" ":" mailbox + "Resent-From" ":" 1#mailbox ) + + dates = orig-date ; Original + [ resent-date ] ; Forwarded + + orig-date = "Date" ":" date-time + + resent-date = "Resent-Date" ":" date-time + + destination = "To" ":" 1#address ; Primary + / "Resent-To" ":" 1#address + / "cc" ":" 1#address ; Secondary + / "Resent-cc" ":" 1#address + / "bcc" ":" #address ; Blind carbon + / "Resent-bcc" ":" #address + + optional-field = + "Message-ID" ":" msg-id + / "Resent-Message-ID" ":" msg-id + / "In-Reply-To" ":" *(phrase / msg-id) + / "References" ":" *(phrase / msg-id) + / "Keywords" ":" #phrase + / "Subject" ":" *text + / "Comments" ":" *text + / "Encrypted" ":" 1#2word + / extension-field ; To be defined + / user-defined-field ; May be pre-empted + + msg-id = "<" addr-spec ">" ; Unique message 
id + + + + + + + August 13, 1982 - 18 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + extension-field = + <Any field which is defined in a document + published as a formal extension to this + specification; none will have names beginning + with the string "X-"> + + user-defined-field = + <Any field which has not been defined + in this specification or published as an + extension to this specification; names for + such fields must be unique and may be + pre-empted by published extensions> + + 4.2. FORWARDING + + Some systems permit mail recipients to forward a message, + retaining the original headers, by adding some new fields. This + standard supports such a service, through the "Resent-" prefix to + field names. + + Whenever the string "Resent-" begins a field name, the field + has the same semantics as a field whose name does not have the + prefix. However, the message is assumed to have been forwarded + by an original recipient who attached the "Resent-" field. This + new field is treated as being more recent than the equivalent, + original field. For example, the "Resent-From", indicates the + person that forwarded the message, whereas the "From" field indi- + cates the original author. + + Use of such precedence information depends upon partici- + pants' communication needs. For example, this standard does not + dictate when a "Resent-From:" address should receive replies, in + lieu of sending them to the "From:" address. + + Note: In general, the "Resent-" fields should be treated as con- + taining a set of information that is independent of the + set of original fields. Information for one set should + not automatically be taken from the other. The interpre- + tation of multiple "Resent-" fields, of the same type, is + undefined. 
+ + In the remainder of this specification, occurrence of legal + "Resent-" fields are treated identically with the occurrence of + + + + + + + + + August 13, 1982 - 19 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + fields whose names do not contain this prefix. + + 4.3. TRACE FIELDS + + Trace information is used to provide an audit trail of mes- + sage handling. In addition, it indicates a route back to the + sender of the message. + + The list of known "via" and "with" values are registered + with the Network Information Center, SRI International, Menlo + Park, California. + + 4.3.1. RETURN-PATH + + This field is added by the final transport system that + delivers the message to its recipient. The field is intended + to contain definitive information about the address and route + back to the message's originator. + + Note: The "Reply-To" field is added by the originator and + serves to direct replies, whereas the "Return-Path" + field is used to identify a path back to the origina- + tor. + + While the syntax indicates that a route specification is + optional, every attempt should be made to provide that infor- + mation in this field. + + 4.3.2. RECEIVED + + A copy of this field is added by each transport service that + relays the message. The information in the field can be quite + useful for tracing transport problems. + + The names of the sending and receiving hosts and time-of- + receipt may be specified. The "via" parameter may be used, to + indicate what physical mechanism the message was sent over, + such as Arpanet or Phonenet, and the "with" parameter may be + used to indicate the mail-, or connection-, level protocol + that was used, such as the SMTP mail protocol, or X.25 tran- + sport protocol. + + Note: Several "with" parameters may be included, to fully + specify the set of protocols that were used. 
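The clauses of a "Received" field, as given by the "received" rule in the syntax, can be picked apart with a small tokenizer. The Python sketch below is illustrative only: `parse_received` is a hypothetical name, and the code assumes the field body contains no comments or quoted strings and that the last ";" introduces the date-time.

```python
KEYWORDS = ("from", "by", "via", "with", "id", "for")


def parse_received(body):
    """Parse the body of a Received field into a dict (simplified)."""
    # Everything after the last ";" is taken to be the date-time.
    clauses, _, stamp = body.rpartition(";")
    tokens = clauses.split()
    result = {"with": [], "date-time": stamp.strip()}
    i = 0
    while i + 1 < len(tokens):
        key = tokens[i].lower()
        if key in KEYWORDS:
            value = tokens[i + 1]
            if key == "with":
                # Several "with" parameters may be included.
                result["with"].append(value)
            else:
                result[key] = value
            i += 2
        else:
            i += 1
    return result
```

The "with" values are collected in a list because the syntax allows that clause to repeat; every other clause appears at most once.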
+ + Some transport services queue mail; the internal message iden- + tifier that is assigned to the message may be noted, using the + "id" parameter. When the sending host uses a destination + address specification that the receiving host reinterprets, by + + + August 13, 1982 - 20 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + expansion or transformation, the receiving host may wish to + record the original specification, using the "for" parameter. + For example, when a copy of mail is sent to the member of a + distribution list, this parameter may be used to record the + original address that was used to specify the list. + + 4.4. ORIGINATOR FIELDS + + The standard allows only a subset of the combinations possi- + ble with the From, Sender, Reply-To, Resent-From, Resent-Sender, + and Resent-Reply-To fields. The limitation is intentional. + + 4.4.1. FROM / RESENT-FROM + + This field contains the identity of the person(s) who wished + this message to be sent. The message-creation process should + default this field to be a single, authenticated machine + address, indicating the AGENT (person, system or process) + entering the message. If this is not done, the "Sender" field + MUST be present. If the "From" field IS defaulted this way, + the "Sender" field is optional and is redundant with the + "From" field. In all cases, addresses in the "From" field + must be machine-usable (addr-specs) and may not contain named + lists (groups). + + 4.4.2. SENDER / RESENT-SENDER + + This field contains the authenticated identity of the AGENT + (person, system or process) that sends the message. It is + intended for use when the sender is not the author of the mes- + sage, or to indicate who among a group of authors actually + sent the message. If the contents of the "Sender" field would + be completely redundant with the "From" field, then the + "Sender" field need not be present and its use is discouraged + (though still legal). 
In particular, the "Sender" field MUST + be present if it is NOT the same as the "From" Field. + + The Sender mailbox specification includes a word sequence + which must correspond to a specific agent (i.e., a human user + or a computer program) rather than a standard address. This + indicates the expectation that the field will identify the + single AGENT (person, system, or process) responsible for + sending the mail and not simply include the name of a mailbox + from which the mail was sent. For example in the case of a + shared login name, the name, by itself, would not be adequate. + The local-part address unit, which refers to this agent, is + expected to be a computer system term, and not (for example) a + generalized person reference which can be used outside the + network text message context. + + + August 13, 1982 - 21 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + Since the critical function served by the "Sender" field is + identification of the agent responsible for sending mail and + since computer programs cannot be held accountable for their + behavior, it is strongly recommended that when a computer pro- + gram generates a message, the HUMAN who is responsible for + that program be referenced as part of the "Sender" field mail- + box specification. + + 4.4.3. REPLY-TO / RESENT-REPLY-TO + + This field provides a general mechanism for indicating any + mailbox(es) to which responses are to be sent. Three typical + uses for this feature can be distinguished. In the first + case, the author(s) may not have regular machine-based mail- + boxes and therefore wish(es) to indicate an alternate machine + address. In the second case, an author may wish additional + persons to be made aware of, or responsible for, replies. 
A
+ somewhat different use may be of some help to "text message
+ teleconferencing" groups equipped with automatic distribution
+ services: include the address of that service in the "Reply-
+ To" field of all messages submitted to the teleconference;
+ then participants can "reply" to conference submissions to
+ guarantee the correct distribution of any submission of their
+ own.
+
+ Note: The "Return-Path" field is added by the mail transport
+       service, at the time of final delivery. It is intended
+       to identify a path back to the originator of the mes-
+       sage. The "Reply-To" field is added by the message
+       originator and is intended to direct replies.
+
+ 4.4.4. AUTOMATIC USE OF FROM / SENDER / REPLY-TO
+
+ For systems which automatically generate address lists for
+ replies to messages, the following recommendations are made:
+
+   o The "Sender" field mailbox should be sent notices of
+     any problems in transport or delivery of the original
+     messages. If there is no "Sender" field, then the
+     "From" field mailbox should be used.
+
+   o The "Sender" field mailbox should NEVER be used
+     automatically, in a recipient's reply message.
+
+   o If the "Reply-To" field exists, then the reply should
+     go to the addresses indicated in that field and not to
+     the address(es) indicated in the "From" field.
+
+   o If there is a "From" field, but no "Reply-To" field,
+     the reply should be sent to the address(es) indicated
+     in the "From" field.
+
+ Sometimes, a recipient may actually wish to communicate with
+ the person that initiated the message transfer. In such
+ cases, it is reasonable to use the "Sender" address.
+
+ This recommendation is intended only for automated use of
+ originator-fields and is not intended to suggest that replies
+ may not also be sent to other recipients of messages.
It is + up to the respective mail-handling programs to decide what + additional facilities will be provided. + + Examples are provided in Appendix A. + + 4.5. RECEIVER FIELDS + + 4.5.1. TO / RESENT-TO + + This field contains the identity of the primary recipients of + the message. + + 4.5.2. CC / RESENT-CC + + This field contains the identity of the secondary (informa- + tional) recipients of the message. + + 4.5.3. BCC / RESENT-BCC + + This field contains the identity of additional recipients of + the message. The contents of this field are not included in + copies of the message sent to the primary and secondary reci- + pients. Some systems may choose to include the text of the + "Bcc" field only in the author(s)'s copy, while others may + also include it in the text sent to all those indicated in the + "Bcc" list. + + 4.6. REFERENCE FIELDS + + 4.6.1. MESSAGE-ID / RESENT-MESSAGE-ID + + This field contains a unique identifier (the local-part + address unit) which refers to THIS version of THIS message. + The uniqueness of the message identifier is guaranteed by the + host which generates it. This identifier is intended to be + machine readable and not necessarily meaningful to humans. A + message identifier pertains to exactly one instantiation of a + particular message; subsequent revisions to the message should + + + August 13, 1982 - 23 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + each receive new message identifiers. + + 4.6.2. IN-REPLY-TO + + The contents of this field identify previous correspon- + dence which this message answers. Note that if message iden- + tifiers are used in this field, they must use the msg-id + specification format. + + 4.6.3. REFERENCES + + The contents of this field identify other correspondence + which this message references. Note that if message identif- + iers are used, they must use the msg-id specification format. + + 4.6.4. KEYWORDS + + This field contains keywords or phrases, separated by + commas. 
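The "In-Reply-To" and "References" fields described in 4.6.2 and 4.6.3 may mix free-form phrases with message identifiers in the msg-id format ("<" addr-spec ">"). A rough Python sketch for pulling out just the identifiers is shown below. The name `extract_msg_ids` is hypothetical, and the pattern is a simplification that assumes the addr-spec contains no whitespace, comments, or quoted strings.

```python
import re

# "<" local-part "@" domain ">", with no spaces or nested angle brackets.
MSG_ID = re.compile(r"<([^<>\s@]+@[^<>\s@]+)>")


def extract_msg_ids(body):
    """Return the addr-spec part of every msg-id found in a field body."""
    return MSG_ID.findall(body)
```

Phrases such as "George's message" are skipped, so a mail client could use the result to look up the referenced messages while ignoring the human-readable text.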
+ + 4.7. OTHER FIELDS + + 4.7.1. SUBJECT + + This is intended to provide a summary, or indicate the + nature, of the message. + + 4.7.2. COMMENTS + + Permits adding text comments onto the message without + disturbing the contents of the message's body. + + 4.7.3. ENCRYPTED + + Sometimes, data encryption is used to increase the + privacy of message contents. If the body of a message has + been encrypted, to keep its contents private, the "Encrypted" + field can be used to note the fact and to indicate the nature + of the encryption. The first <word> parameter indicates the + software used to encrypt the body, and the second, optional + <word> is intended to aid the recipient in selecting the + proper decryption key. This code word may be viewed as an + index to a table of keys held by the recipient. + + Note: Unfortunately, headers must contain envelope, as well + as contents, information. Consequently, it is neces- + sary that they remain unencrypted, so that mail tran- + sport services may access them. Since names, + addresses, and "Subject" field contents may contain + + + August 13, 1982 - 24 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + sensitive information, this requirement limits total + message privacy. + + Names of encryption software are registered with the Net- + work Information Center, SRI International, Menlo Park, Cali- + fornia. + + 4.7.4. EXTENSION-FIELD + + A limited number of common fields have been defined in + this document. As network mail requirements dictate, addi- + tional fields may be standardized. To provide user-defined + fields with a measure of safety, in name selection, such + extension-fields will never have names that begin with the + string "X-". + + Names of Extension-fields are registered with the Network + Information Center, SRI International, Menlo Park, California. + + 4.7.5. USER-DEFINED-FIELD + + Individual users of network mail are free to define and + use additional header fields. 
Such fields must have names
+ which are not already used in the current specification or in
+ any definitions of extension-fields, and the overall syntax of
+ these user-defined-fields must conform to this specification's
+ rules for delimiting and folding fields. Due to the
+ extension-field publishing process, the name of a user-
+ defined-field may be pre-empted.
+
+ Note: The prefatory string "X-" will never be used in the
+       names of Extension-fields. This provides user-defined
+       fields with a protected set of names.
+
+ 5. DATE AND TIME SPECIFICATION
+
+ 5.1. SYNTAX
+
+ date-time   =  [ day "," ] date time        ; dd mm yy
+                                             ;  hh:mm:ss zzz
+
+ day         =  "Mon"  / "Tue" /  "Wed"  / "Thu"
+             /  "Fri"  / "Sat" /  "Sun"
+
+ date        =  1*2DIGIT month 2DIGIT        ; day month year
+                                             ;  e.g. 20 Jun 82
+
+ month       =  "Jan"  /  "Feb" /  "Mar"  /  "Apr"
+             /  "May"  /  "Jun" /  "Jul"  /  "Aug"
+             /  "Sep"  /  "Oct" /  "Nov"  /  "Dec"
+
+ time        =  hour zone                    ; ANSI and Military
+
+ hour        =  2DIGIT ":" 2DIGIT [":" 2DIGIT]
+                                             ; 00:00:00 - 23:59:59
+
+ zone        =  "UT"  / "GMT"                ; Universal Time
+                                             ; North American : UT
+             /  "EST" / "EDT"                ;  Eastern:  - 5/ - 4
+             /  "CST" / "CDT"                ;  Central:  - 6/ - 5
+             /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
+             /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
+             /  1ALPHA                       ; Military: Z = UT;
+                                             ;  A:-1; (J not used)
+                                             ;  M:-12; N:+1; Y:+12
+             / ( ("+" / "-") 4DIGIT )        ; Local differential
+                                             ;  hours+min. (HHMM)
+
+ 5.2. SEMANTICS
+
+ If included, day-of-week must be the day implied by the date
+ specification.
+
+ Time zone may be indicated in several ways. "UT" is Univer-
+ sal Time (formerly called "Greenwich Mean Time"); "GMT" is per-
+ mitted as a reference to Universal Time. The military standard
+ uses a single character for each zone. "Z" is Universal Time.
+ "A" indicates one hour earlier, and "M" indicates 12 hours ear-
+ lier; "N" is one hour later, and "Y" is 12 hours later. The
+ letter "J" is not used.
The other remaining two forms are taken + from ANSI standard X3.51-1975. One allows explicit indication of + the amount of offset from UT; the other uses common 3-character + strings for indicating time zones in North America. + + + August 13, 1982 - 26 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 6. ADDRESS SPECIFICATION + + 6.1. SYNTAX + + address = mailbox ; one addressee + / group ; named list + + group = phrase ":" [#mailbox] ";" + + mailbox = addr-spec ; simple address + / phrase route-addr ; name & addr-spec + + route-addr = "<" [route] addr-spec ">" + + route = 1#("@" domain) ":" ; path-relative + + addr-spec = local-part "@" domain ; global address + + local-part = word *("." word) ; uninterpreted + ; case-preserved + + domain = sub-domain *("." sub-domain) + + sub-domain = domain-ref / domain-literal + + domain-ref = atom ; symbolic reference + + 6.2. SEMANTICS + + A mailbox receives mail. It is a conceptual entity which + does not necessarily pertain to file storage. For example, some + sites may choose to print mail on their line printer and deliver + the output to the addressee's desk. + + A mailbox specification comprises a person, system or pro- + cess name reference, a domain-dependent string, and a name-domain + reference. The name reference is optional and is usually used to + indicate the human name of a recipient. The name-domain refer- + ence specifies a sequence of sub-domains. The domain-dependent + string is uninterpreted, except by the final sub-domain; the rest + of the mail service merely transmits it as a literal string. + + 6.2.1. DOMAINS + + A name-domain is a set of registered (mail) names. A name- + domain specification resolves to a subordinate name-domain + specification or to a terminal domain-dependent string. + Hence, domain specification is extensible, permitting any + number of registration levels. 
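The three parts of a mailbox specification named above (the optional name reference, the domain-dependent string, and the name-domain reference) can be separated with a few string operations. The following Python sketch is a simplification under stated assumptions: `split_mailbox` is a hypothetical helper, and it does not handle comments, routes, groups, or domain-literals.

```python
def split_mailbox(mailbox):
    """Return (phrase, local_part, sub_domains) for a simple mailbox."""
    text = mailbox.strip()
    phrase = ""
    # "phrase <addr-spec>" form: the addr-spec sits inside the brackets.
    if text.endswith(">") and "<" in text:
        phrase, _, rest = text.rpartition("<")
        phrase = phrase.strip()
        text = rest[:-1]
    # The last "@" separates the local-part from the domain.
    local_part, _, domain = text.rpartition("@")
    # A domain is a sequence of sub-domains separated by periods.
    return phrase, local_part, domain.split(".")
```

The local-part is returned as one uninterpreted string, since only the final sub-domain is expected to make sense of it.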
+ + + August 13, 1982 - 27 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + Name-domains model a global, logical, hierarchical addressing + scheme. The model is logical, in that an address specifica- + tion is related to name registration and is not necessarily + tied to transmission path. The model's hierarchy is a + directed graph, called an in-tree, such that there is a single + path from the root of the tree to any node in the hierarchy. + If more than one path actually exists, they are considered to + be different addresses. + + The root node is common to all addresses; consequently, it is + not referenced. Its children constitute "top-level" name- + domains. Usually, a service has access to its own full domain + specification and to the names of all top-level name-domains. + + The "top" of the domain addressing hierarchy -- a child of the + root -- is indicated by the right-most field, in a domain + specification. Its child is specified to the left, its child + to the left, and so on. + + Some groups provide formal registration services; these con- + stitute name-domains that are independent logically of + specific machines. In addition, networks and machines impli- + citly compose name-domains, since their membership usually is + registered in name tables. + + In the case of formal registration, an organization implements + a (distributed) data base which provides an address-to-route + mapping service for addresses of the form: + + person@registry.organization + + Note that "organization" is a logical entity, separate from + any particular communication network. + + A mechanism for accessing "organization" is universally avail- + able. That mechanism, in turn, seeks an instantiation of the + registry; its location is not indicated in the address specif- + ication. It is assumed that the system which operates under + the name "organization" knows how to find a subordinate regis- + try. 
The registry will then use the "person" string to deter-
+ mine where to send the mail specification.
+
+ The latter, network-oriented case permits simple, direct,
+ attachment-related address specification, such as:
+
+     user@host.network
+
+ Once the network is accessed, it is expected that a message
+ will go directly to the host and that the host will resolve
+ the user name, placing the message in the user's mailbox.
+
+ 6.2.2. ABBREVIATED DOMAIN SPECIFICATION
+
+ Since any number of levels is possible within the domain
+ hierarchy, specification of a fully qualified address can
+ become inconvenient. This standard permits abbreviated domain
+ specification, in a special case:
+
+     For the address of the sender, call the left-most
+     sub-domain Level N. In a header address, if all of
+     the sub-domains above (i.e., to the right of) Level N
+     are the same as those of the sender, then they do not
+     have to appear in the specification. Otherwise, the
+     address must be fully qualified.
+
+     This feature is subject to approval by local sub-
+     domains. Individual sub-domains may require their
+     member systems, which originate mail, to provide full
+     domain specification only. When permitted, abbrevia-
+     tions may be present only while the message stays
+     within the sub-domain of the sender.
+
+     Use of this mechanism requires the sender's sub-domain
+     to reserve the names of all top-level domains, so that
+     full specifications can be distinguished from abbrevi-
+     ated specifications.
+
+ For example, if a sender's address is:
+
+     sender@registry-A.registry-1.organization-X
+
+ and one recipient's address is:
+
+     recipient@registry-B.registry-1.organization-X
+
+ and another's is:
+
+     recipient@registry-C.registry-2.organization-X
+
+ then ".registry-1.organization-X" need not be specified in the
+ message, but "registry-C.registry-2" DOES have to be
+ specified.
That is, the first two addresses may be abbrevi- + ated, but the third address must be fully specified. + + When a message crosses a domain boundary, all addresses must + be specified in the full format, ending with the top-level + name-domain in the right-most field. It is the responsibility + of mail forwarding services to ensure that addresses conform + + + August 13, 1982 - 29 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + with this requirement. In the case of abbreviated addresses, + the relaying service must make the necessary expansions. It + should be noted that it often is difficult for such a service + to locate all occurrences of address abbreviations. For exam- + ple, it will not be possible to find such abbreviations within + the body of the message. The "Return-Path" field can aid + recipients in recovering from these errors. + + Note: When passing any portion of an addr-spec onto a process + which does not interpret data according to this stan- + dard (e.g., mail protocol servers). There must be NO + LWSP-chars preceding or following the at-sign or any + delimiting period ("."), such as shown in the above + examples, and only ONE SPACE between contiguous + <word>s. + + 6.2.3. DOMAIN TERMS + + A domain-ref must be THE official name of a registry, network, + or host. It is a symbolic reference, within a name sub- + domain. At times, it is necessary to bypass standard mechan- + isms for resolving such references, using more primitive + information, such as a network host address rather than its + associated host name. + + To permit such references, this standard provides the domain- + literal construct. Its contents must conform with the needs + of the sub-domain in which it is interpreted. + + Domain-literals which refer to domains within the ARPA Inter- + net specify 32-bit Internet addresses, in four 8-bit fields + noted in decimal, as described in Request for Comments #820, + "Assigned Numbers." 
For example: + + [10.0.3.19] + + Note: THE USE OF DOMAIN-LITERALS IS STRONGLY DISCOURAGED. It + is permitted only as a means of bypassing temporary + system limitations, such as name tables which are not + complete. + + The names of "top-level" domains, and the names of domains + under in the ARPA Internet, are registered with the Network + Information Center, SRI International, Menlo Park, California. + + 6.2.4. DOMAIN-DEPENDENT LOCAL STRING + + The local-part of an addr-spec in a mailbox specification + (i.e., the host's name for the mailbox) is understood to be + + + August 13, 1982 - 30 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + whatever the receiving mail protocol server allows. For exam- + ple, some systems do not understand mailbox references of the + form "P. D. Q. Bach", but others do. + + This specification treats periods (".") as lexical separators. + Hence, their presence in local-parts which are not quoted- + strings, is detected. However, such occurrences carry NO + semantics. That is, if a local-part has periods within it, an + address parser will divide the local-part into several tokens, + but the sequence of tokens will be treated as one uninter- + preted unit. The sequence will be re-assembled, when the + address is passed outside of the system such as to a mail pro- + tocol service. + + For example, the address: + + First.Last@Registry.Org + + is legal and does not require the local-part to be surrounded + with quotation-marks. (However, "First Last" DOES require + quoting.) The local-part of the address, when passed outside + of the mail system, within the Registry.Org domain, is + "First.Last", again without quotation marks. + + 6.2.5. BALANCING LOCAL-PART AND DOMAIN + + In some cases, the boundary between local-part and domain can + be flexible. The local-part may be a simple string, which is + used for the final determination of the recipient's mailbox. 
+ All other levels of reference are, therefore, part of the + domain. + + For some systems, in the case of abbreviated reference to the + local and subordinate sub-domains, it may be possible to + specify only one reference within the domain part and place + the other, subordinate name-domain references within the + local-part. This would appear as: + + mailbox.sub1.sub2@this-domain + + Such a specification would be acceptable to address parsers + which conform to RFC #733, but do not support this newer + Internet standard. While contrary to the intent of this stan- + dard, the form is legal. + + Also, some sub-domains have a specification syntax which does + not conform to this standard. For example: + + sub-net.mailbox@sub-domain.domain + + + August 13, 1982 - 31 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + uses a different parsing sequence for local-part than for + domain. + + Note: As a rule, the domain specification should contain + fields which are encoded according to the syntax of + this standard and which contain generally-standardized + information. The local-part specification should con- + tain only that portion of the address which deviates + from the form or intention of the domain field. + + 6.2.6. MULTIPLE MAILBOXES + + An individual may have several mailboxes and wish to receive + mail at whatever mailbox is convenient for the sender to + access. This standard does not provide a means of specifying + "any member of" a list of mailboxes. + + A set of individuals may wish to receive mail as a single unit + (i.e., a distribution list). The <group> construct permits + specification of such a list. Recipient mailboxes are speci- + fied within the bracketed part (":" - ";"). A copy of the + transmitted message is to be sent to each mailbox listed. + This standard does not permit recursive specification of + groups within groups. + + While a list must be named, it is not required that the con- + tents of the list be included. 
In this case, the <address> + serves only as an indication of group distribution and would + appear in the form: + + name:; + + Some mail services may provide a group-list distribution + facility, accepting a single mailbox reference, expanding it + to the full distribution list, and relaying the mail to the + list's members. This standard provides no additional syntax + for indicating such a service. Using the <group> address + alternative, while listing one mailbox in it, can mean either + that the mailbox reference will be expanded to a list or that + there is a group with one member. + + 6.2.7. EXPLICIT PATH SPECIFICATION + + At times, a message originator may wish to indicate the + transmission path that a message should follow. This is + called source routing. The normal addressing scheme, used in + an addr-spec, is carefully separated from such information; + the <route> portion of a route-addr is provided for such occa- + sions. It specifies the sequence of hosts and/or transmission + + + August 13, 1982 - 32 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + services that are to be traversed. Both domain-refs and + domain-literals may be used. + + Note: The use of source routing is discouraged. Unless the + sender has special need of path restriction, the choice + of transmission route should be left to the mail tran- + sport service. + + 6.3. RESERVED ADDRESS + + It often is necessary to send mail to a site, without know- + ing any of its valid addresses. For example, there may be mail + system dysfunctions, or a user may wish to find out a person's + correct address, at that site. + + This standard specifies a single, reserved mailbox address + (local-part) which is to be valid at each site. Mail sent to + that address is to be routed to a person responsible for the + site's mail system or to a person with responsibility for general + site operation. 
The name of the reserved local-part address is: + + Postmaster + + so that "Postmaster@domain" is required to be valid. + + Note: This reserved local-part must be matched without sensi- + tivity to alphabetic case, so that "POSTMASTER", "postmas- + ter", and even "poStmASteR" is to be accepted. + + + + + + + + + + + + + + + + + + + + + + + + August 13, 1982 - 33 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + 7. BIBLIOGRAPHY + + + ANSI. "USA Standard Code for Information Interchange," X3.4. + American National Standards Institute: New York (1968). Also + in: Feinler, E. and J. Postel, eds., "ARPANET Protocol Hand- + book", NIC 7104. + + ANSI. "Representations of Universal Time, Local Time Differen- + tials, and United States Time Zone References for Information + Interchange," X3.51-1975. American National Standards Insti- + tute: New York (1975). + + Bemer, R.W., "Time and the Computer." In: Interface Age (Feb. + 1979). + + Bennett, C.J. "JNT Mail Protocol". Joint Network Team, Ruther- + ford and Appleton Laboratory: Didcot, England. + + Bhushan, A.K., Pogran, K.T., Tomlinson, R.S., and White, J.E. + "Standardizing Network Mail Headers," ARPANET Request for + Comments No. 561, Network Information Center No. 18516; SRI + International: Menlo Park (September 1973). + + Birrell, A.D., Levin, R., Needham, R.M., and Schroeder, M.D. + "Grapevine: An Exercise in Distributed Computing," Communica- + tions of the ACM 25, 4 (April 1982), 260-274. + + Crocker, D.H., Vittal, J.J., Pogran, K.T., Henderson, D.A. + "Standard for the Format of ARPA Network Text Message," + ARPANET Request for Comments No. 733, Network Information + Center No. 41952. SRI International: Menlo Park (November + 1977). + + Feinler, E.J. and Postel, J.B. ARPANET Protocol Handbook, Net- + work Information Center No. 7104 (NTIS AD A003890). SRI + International: Menlo Park (April 1976). + + Harary, F. "Graph Theory". Addison-Wesley: Reading, Mass. + (1969). + + Levin, R. 
and Schroeder, M. "Transport of Electronic Messages + through a Network," TeleInformatics 79, pp. 29-33. North + Holland (1979). Also as Xerox Palo Alto Research Center + Technical Report CSL-79-4. + + Myer, T.H. and Henderson, D.A. "Message Transmission Protocol," + ARPANET Request for Comments, No. 680, Network Information + Center No. 32116. SRI International: Menlo Park (1975). + + + August 13, 1982 - 34 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + NBS. "Specification of Message Format for Computer Based Message + Systems, Recommended Federal Information Processing Standard." + National Bureau of Standards: Gaithersburg, Maryland + (October 1981). + + NIC. Internet Protocol Transition Workbook. Network Information + Center, SRI-International, Menlo Park, California (March + 1982). + + Oppen, D.C. and Dalal, Y.K. "The Clearinghouse: A Decentralized + Agent for Locating Named Objects in a Distributed Environ- + ment," OPD-T8103. Xerox Office Products Division: Palo Alto, + CA. (October 1981). + + Postel, J.B. "Assigned Numbers," ARPANET Request for Comments, + No. 820. SRI International: Menlo Park (August 1982). + + Postel, J.B. "Simple Mail Transfer Protocol," ARPANET Request + for Comments, No. 821. SRI International: Menlo Park (August + 1982). + + Shoch, J.F. "Internetwork naming, addressing and routing," in + Proc. 17th IEEE Computer Society International Conference, pp. + 72-79, Sept. 1978, IEEE Cat. No. 78 CH 1388-8C. + + Su, Z. and Postel, J. "The Domain Naming Convention for Internet + User Applications," ARPANET Request for Comments, No. 819. + SRI International: Menlo Park (August 1982). + + + + + + + + + + + + + + + + + + + + + + + + August 13, 1982 - 35 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + APPENDIX + + + A. EXAMPLES + + A.1. ADDRESSES + + A.1.1. Alfred Neuman <Neuman@BBN-TENEXA> + + A.1.2. 
Neuman@BBN-TENEXA + + These two "Alfred Neuman" examples have identical seman- + tics, as far as the operation of the local host's mail sending + (distribution) program (also sometimes called its "mailer") + and the remote host's mail protocol server are concerned. In + the first example, the "Alfred Neuman" is ignored by the + mailer, as "Neuman@BBN-TENEXA" completely specifies the reci- + pient. The second example contains no superfluous informa- + tion, and, again, "Neuman@BBN-TENEXA" is the intended reci- + pient. + + Note: When the message crosses name-domain boundaries, then + these specifications must be changed, so as to indicate + the remainder of the hierarchy, starting with the top + level. + + A.1.3. "George, Ted" <Shared@Group.Arpanet> + + This form might be used to indicate that a single mailbox + is shared by several users. The quoted string is ignored by + the originating host's mailer, because "Shared@Group.Arpanet" + completely specifies the destination mailbox. + + A.1.4. Wilt . (the Stilt) Chamberlain@NBA.US + + The "(the Stilt)" is a comment, which is NOT included in + the destination mailbox address handed to the originating + system's mailer. The local-part of the address is the string + "Wilt.Chamberlain", with NO space between the first and second + words. + + A.1.5. Address Lists + + Gourmets: Pompous Person <WhoZiWhatZit@Cordon-Bleu>, + Childs@WGBH.Boston, Galloping Gourmet@ + ANT.Down-Under (Australian National Television), + Cheapie@Discount-Liquors;, + Cruisers: Port@Portugal, Jones@SEA;, + Another@Somewhere.SomeOrg + + + August 13, 1982 - 36 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + This group list example points out the use of comments and the + mixing of addresses and groups. + + A.2. ORIGINATOR ITEMS + + A.2.1. Author-sent + + George Jones logs into his host as "Jones". He sends + mail himself. + + From: Jones@Group.Org + + or + + From: George Jones <Jones@Group.Org> + + A.2.2. 
Secretary-sent + + George Jones logs in as Jones on his host. His secre- + tary, who logs in as Secy, sends mail for him. Replies to the + mail should go to George. + + From: George Jones <Jones@Group> + Sender: Secy@Other-Group + + A.2.3. Secretary-sent, for user of shared directory + + George Jones' secretary sends mail for George. Replies + should go to George. + + From: George Jones<Shared@Group.Org> + Sender: Secy@Other-Group + + Note that there need not be a space between "Jones" and the + "<", but adding a space enhances readability (as is the case + in other examples). + + A.2.4. Committee activity, with one author + + George is a member of a committee. He wishes to have any + replies to his message go to all committee members. + + From: George Jones <Jones@Host.Net> + Sender: Jones@Host + Reply-To: The Committee: Jones@Host.Net, + Smith@Other.Org, + Doe@Somewhere-Else; + + Note that if George had not included himself in the + + + August 13, 1982 - 37 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + enumeration of The Committee, he would not have gotten an + implicit reply; the presence of the "Reply-to" field SUPER- + SEDES the sending of a reply to the person named in the "From" + field. + + A.2.5. Secretary acting as full agent of author + + George Jones asks his secretary (Secy@Host) to send a + message for him in his capacity as Group. He wants his secre- + tary to handle all replies. + + From: George Jones <Group@Host> + Sender: Secy@Host + Reply-To: Secy@Host + + A.2.6. Agent for user without online mailbox + + A friend of George's, Sarah, is visiting. George's + secretary sends some mail to a friend of Sarah in computer- + land. Replies should go to George, whose mailbox is Jones at + Registry. + + From: Sarah Friendly <Secy@Registry> + Sender: Secy-Name <Secy@Registry> + Reply-To: Jones@Registry. + + A.2.7.
Agent for member of a committee + + George's secretary sends out a message which was authored + jointly by all the members of a committee. Note that the name + of the committee cannot be specified, since <group> names are + not permitted in the From field. + + From: Jones@Host, + Smith@Other-Host, + Doe@Somewhere-Else + Sender: Secy@SHost + + + + + + + + + + + + + + + August 13, 1982 - 38 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + A.3. COMPLETE HEADERS + + A.3.1. Minimum required + + Date: 26 Aug 76 1429 EDT Date: 26 Aug 76 1429 EDT + From: Jones@Registry.Org or From: Jones@Registry.Org + Bcc: To: Smith@Registry.Org + + Note that the "Bcc" field may be empty, while the "To" field + is required to have at least one address. + + A.3.2. Using some of the additional fields + + Date: 26 Aug 76 1430 EDT + From: George Jones<Group@Host> + Sender: Secy@SHOST + To: "Al Neuman"@Mad-Host, + Sam.Irving@Other-Host + Message-ID: <some.string@SHOST> + + A.3.3. About as complex as you're going to get + + Date : 27 Aug 76 0932 PDT + From : Ken Davis <KDavis@This-Host.This-net> + Subject : Re: The Syntax in the RFC + Sender : KSecy@Other-Host + Reply-To : Sam.Irving@Reg.Organization + To : George Jones <Group@Some-Reg.An-Org>, + Al.Neuman@MAD.Publisher + cc : Important folk: + Tom Softwood <Balsa@Tree.Root>, + "Sam Irving"@Other-Host;, + Standard Distribution: + /main/davis/people/standard@Other-Host, + "<Jones>standard.dist.3"@Tops-20-Host>; + Comment : Sam is away on business. He asked me to handle + his mail for him. He'll be able to provide a + more accurate explanation when he returns + next week. + In-Reply-To: <some.string@DBM.Group>, George's message + X-Special-action: This is a sample of user-defined field- + names. 
There could also be a field-name + "Special-action", but its name might later be + preempted + Message-ID: <4231.629.XYzi-What@Other-Host> + + + + + + + August 13, 1982 - 39 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + B. SIMPLE FIELD PARSING + + Some mail-reading software systems may wish to perform only + minimal processing, ignoring the internal syntax of structured + field-bodies and treating them the same as unstructured-field- + bodies. Such software will need only to distinguish: + + o Header fields from the message body, + + o Beginnings of fields from lines which continue fields, + + o Field-names from field-contents. + + The abbreviated set of syntactic rules which follows will + suffice for this purpose. It describes a limited view of mes- + sages and is a subset of the syntactic rules provided in the main + part of this specification. One small exception is that the con- + tents of field-bodies consist only of text: + + B.1. SYNTAX + + + message = *field *(CRLF *text) + + field = field-name ":" [field-body] CRLF + + field-name = 1*<any CHAR, excluding CTLs, SPACE, and ":"> + + field-body = *text [CRLF LWSP-char field-body] + + + B.2. SEMANTICS + + Headers occur before the message body and are terminated by + a null line (i.e., two contiguous CRLFs). + + A line which continues a header field begins with a SPACE or + HTAB character, while a line beginning a field starts with a + printable character which is not a colon. + + A field-name consists of one or more printable characters + (excluding colon, space, and control-characters). A field-name + MUST be contained on one line. Upper and lower case are not dis- + tinguished when comparing field-names. + + + + + + + + August 13, 1982 - 40 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + C. 
DIFFERENCES FROM RFC #733 + + The following summarizes the differences between this stan- + dard and the one specified in Arpanet Request for Comments #733, + "Standard for the Format of ARPA Network Text Messages". The + differences are listed in the order of their occurrence in the + current specification. + + C.1. FIELD DEFINITIONS + + C.1.1. FIELD NAMES + + These now must be a sequence of printable characters. They + may not contain any LWSP-chars. + + C.2. LEXICAL TOKENS + + C.2.1. SPECIALS + + The characters period ("."), left-square bracket ("["), and + right-square bracket ("]") have been added. For presentation + purposes, and when passing a specification to a system that + does not conform to this standard, periods are to be contigu- + ous with their surrounding lexical tokens. No linear-white- + space is permitted between them. The presence of one LWSP- + char between other tokens is still directed. + + C.2.2. ATOM + + Atoms may not contain SPACE. + + C.2.3. SPECIAL TEXT + + ctext and qtext have had backslash ("\") added to the list of + prohibited characters. + + C.2.4. DOMAINS + + The lexical tokens <domain-literal> and <dtext> have been + added. + + C.3. MESSAGE SPECIFICATION + + C.3.1. TRACE + + The "Return-path:" and "Received:" fields have been specified. + + + + + + August 13, 1982 - 41 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + C.3.2. FROM + + The "From" field must contain machine-usable addresses (addr- + spec). Multiple addresses may be specified, but named-lists + (groups) may not. + + C.3.3. RESENT + + The meta-construct of prefacing field names with the string + "Resent-" has been added, to indicate that a message has been + forwarded by an intermediate recipient. + + C.3.4. DESTINATION + + A message must contain at least one destination address field. + "To" and "CC" are required to contain at least one address. + + C.3.5. 
IN-REPLY-TO + + The field-body is no longer a comma-separated list, although a + sequence is still permitted. + + C.3.6. REFERENCE + + The field-body is no longer a comma-separated list, although a + sequence is still permitted. + + C.3.7. ENCRYPTED + + A field has been specified that permits senders to indicate + that the body of a message has been encrypted. + + C.3.8. EXTENSION-FIELD + + Extension fields are prohibited from beginning with the char- + acters "X-". + + C.4. DATE AND TIME SPECIFICATION + + C.4.1. SIMPLIFICATION + + Fewer optional forms are permitted and the list of three- + letter time zones has been shortened. + + C.5. ADDRESS SPECIFICATION + + + + + + + August 13, 1982 - 42 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + C.5.1. ADDRESS + + The use of quoted-string, and the ":"-atom-":" construct, have + been removed. An address now is either a single mailbox + reference or is a named list of addresses. The latter indi- + cates a group distribution. + + C.5.2. GROUPS + + Group lists are now required to have a name. Group lists + may not be nested. + + C.5.3. MAILBOX + + A mailbox specification may indicate a person's name, as + before. Such a named list no longer may specify multiple + mailboxes and may not be nested. + + C.5.4. ROUTE ADDRESSING + + Addresses now are taken to be absolute, global specifications, + independent of transmission paths. The <route> construct has + been provided, to permit explicit specification of transmis- + sion path. RFC #733's use of multiple at-signs ("@") was + intended as a general syntax for indicating routing and/or + hierarchical addressing. The current standard separates these + specifications and only one at-sign is permitted. + + C.5.5. AT-SIGN + + The string " at " no longer is used as an address delimiter. + Only at-sign ("@") serves the function. + + C.5.6. DOMAINS + + Hierarchical, logical name-domains have been added. + + C.6.
RESERVED ADDRESS + + The local-part "Postmaster" has been reserved, so that users can + be guaranteed at least one valid address at a site. + + + + + + + + + + + August 13, 1982 - 43 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + D. ALPHABETICAL LISTING OF SYNTAX RULES + + address = mailbox ; one addressee + / group ; named list + addr-spec = local-part "@" domain ; global address + ALPHA = <any ASCII alphabetic character> + ; (101-132, 65.- 90.) + ; (141-172, 97.-122.) + atom = 1*<any CHAR except specials, SPACE and CTLs> + authentic = "From" ":" mailbox ; Single author + / ( "Sender" ":" mailbox ; Actual submittor + "From" ":" 1#mailbox) ; Multiple authors + ; or not sender + CHAR = <any ASCII character> ; ( 0-177, 0.-127.) + comment = "(" *(ctext / quoted-pair / comment) ")" + CR = <ASCII CR, carriage return> ; ( 15, 13.) + CRLF = CR LF + ctext = <any CHAR excluding "(", ; => may be folded + ")", "\" & CR, & including + linear-white-space> + CTL = <any ASCII control ; ( 0- 37, 0.- 31.) + character and DEL> ; ( 177, 127.) + date = 1*2DIGIT month 2DIGIT ; day month year + ; e.g. 20 Jun 82 + dates = orig-date ; Original + [ resent-date ] ; Forwarded + date-time = [ day "," ] date time ; dd mm yy + ; hh:mm:ss zzz + day = "Mon" / "Tue" / "Wed" / "Thu" + / "Fri" / "Sat" / "Sun" + delimiters = specials / linear-white-space / comment + destination = "To" ":" 1#address ; Primary + / "Resent-To" ":" 1#address + / "cc" ":" 1#address ; Secondary + / "Resent-cc" ":" 1#address + / "bcc" ":" #address ; Blind carbon + / "Resent-bcc" ":" #address + DIGIT = <any ASCII decimal digit> ; ( 60- 71, 48.- 57.) + domain = sub-domain *("." 
sub-domain) + domain-literal = "[" *(dtext / quoted-pair) "]" + domain-ref = atom ; symbolic reference + dtext = <any CHAR excluding "[", ; => may be folded + "]", "\" & CR, & including + linear-white-space> + extension-field = + <Any field which is defined in a document + published as a formal extension to this + specification; none will have names beginning + with the string "X-"> + + + August 13, 1982 - 44 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + field = field-name ":" [ field-body ] CRLF + fields = dates ; Creation time, + source ; author id & one + 1*destination ; address required + *optional-field ; others optional + field-body = field-body-contents + [CRLF LWSP-char field-body] + field-body-contents = + <the ASCII characters making up the field-body, as + defined in the following sections, and consisting + of combinations of atom, quoted-string, and + specials tokens, or else consisting of texts> + field-name = 1*<any CHAR, excluding CTLs, SPACE, and ":"> + group = phrase ":" [#mailbox] ";" + hour = 2DIGIT ":" 2DIGIT [":" 2DIGIT] + ; 00:00:00 - 23:59:59 + HTAB = <ASCII HT, horizontal-tab> ; ( 11, 9.) + LF = <ASCII LF, linefeed> ; ( 12, 10.) + linear-white-space = 1*([CRLF] LWSP-char) ; semantics = SPACE + ; CRLF => folding + local-part = word *("." 
word) ; uninterpreted + ; case-preserved + LWSP-char = SPACE / HTAB ; semantics = SPACE + mailbox = addr-spec ; simple address + / phrase route-addr ; name & addr-spec + message = fields *( CRLF *text ) ; Everything after + ; first null line + ; is message body + month = "Jan" / "Feb" / "Mar" / "Apr" + / "May" / "Jun" / "Jul" / "Aug" + / "Sep" / "Oct" / "Nov" / "Dec" + msg-id = "<" addr-spec ">" ; Unique message id + optional-field = + / "Message-ID" ":" msg-id + / "Resent-Message-ID" ":" msg-id + / "In-Reply-To" ":" *(phrase / msg-id) + / "References" ":" *(phrase / msg-id) + / "Keywords" ":" #phrase + / "Subject" ":" *text + / "Comments" ":" *text + / "Encrypted" ":" 1#2word + / extension-field ; To be defined + / user-defined-field ; May be pre-empted + orig-date = "Date" ":" date-time + originator = authentic ; authenticated addr + [ "Reply-To" ":" 1#address] ) + phrase = 1*word ; Sequence of words + + + + + August 13, 1982 - 45 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + qtext = <any CHAR excepting <">, ; => may be folded + "\" & CR, and including + linear-white-space> + quoted-pair = "\" CHAR ; may quote any char + quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or + ; quoted chars. 
+ received = "Received" ":" ; one per relay + ["from" domain] ; sending host + ["by" domain] ; receiving host + ["via" atom] ; physical path + *("with" atom) ; link/mail protocol + ["id" msg-id] ; receiver msg id + ["for" addr-spec] ; initial form + ";" date-time ; time received + + resent = resent-authentic + [ "Resent-Reply-To" ":" 1#address] ) + resent-authentic = + = "Resent-From" ":" mailbox + / ( "Resent-Sender" ":" mailbox + "Resent-From" ":" 1#mailbox ) + resent-date = "Resent-Date" ":" date-time + return = "Return-path" ":" route-addr ; return address + route = 1#("@" domain) ":" ; path-relative + route-addr = "<" [route] addr-spec ">" + source = [ trace ] ; net traversals + originator ; original mail + [ resent ] ; forwarded + SPACE = <ASCII SP, space> ; ( 40, 32.) + specials = "(" / ")" / "<" / ">" / "@" ; Must be in quoted- + / "," / ";" / ":" / "\" / <"> ; string, to use + / "." / "[" / "]" ; within a word. + sub-domain = domain-ref / domain-literal + text = <any CHAR, including bare ; => atoms, specials, + CR & bare LF, but NOT ; comments and + including CRLF> ; quoted-strings are + ; NOT recognized. + time = hour zone ; ANSI and Military + trace = return ; path to sender + 1*received ; receipt tags + user-defined-field = + <Any field which has not been defined + in this specification or published as an + extension to this specification; names for + such fields must be unique and may be + pre-empted by published extensions> + word = atom / quoted-string + + + + + August 13, 1982 - 46 - RFC #822 + + + + Standard for ARPA Internet Text Messages + + + zone = "UT" / "GMT" ; Universal Time + ; North American : UT + / "EST" / "EDT" ; Eastern: - 5/ - 4 + / "CST" / "CDT" ; Central: - 6/ - 5 + / "MST" / "MDT" ; Mountain: - 7/ - 6 + / "PST" / "PDT" ; Pacific: - 8/ - 7 + / 1ALPHA ; Military: Z = UT; + <"> = <ASCII quote mark> ; ( 42, 34.) 
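As a rough, non-normative illustration of the rules listed above, the following Python sketch applies the simple field parsing of Appendix B (header and body separated by a null line, continuation lines starting with SPACE or HTAB, field-name ending at the first colon, case-insensitive field-names) and turns the comments of the "zone" rule into a lookup table. The names split_message, get_field and zone_offset_hours are hypothetical and do not come from RFC #822. Stripping whitespace before the colon is an assumption made so that headers written like those in A.3.3 ("Date : ...") are accepted. Military zones other than "Z" are left out because the listing above gives an offset only for "Z".

```python
# Sketch only: every name below is illustrative, not defined by RFC #822.

# Offsets from UT in hours, copied from the comments of the "zone" rule.
ZONE_OFFSETS = {
    "UT": 0, "GMT": 0, "Z": 0,
    "EST": -5, "EDT": -4,
    "CST": -6, "CDT": -5,
    "MST": -7, "MDT": -6,
    "PST": -8, "PDT": -7,
}


def zone_offset_hours(zone):
    """Return the offset from UT in hours, or None for an unlisted zone."""
    return ZONE_OFFSETS.get(zone.upper())


def split_message(raw):
    """Split a message into (fields, body) using the rules of Appendix B.

    fields is a list of (field-name, field-body) pairs in order of appearance.
    """
    # Headers are terminated by a null line, i.e. two contiguous CRLFs.
    head, _, body = raw.partition("\r\n\r\n")
    fields = []
    for line in head.split("\r\n"):
        if not line:
            continue
        if line[0] in " \t" and fields:
            # A line beginning with SPACE or HTAB continues the previous
            # field; dropping the CRLF unfolds it.
            name, value = fields[-1]
            fields[-1] = (name, value + line)
            continue
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError("not a header field: %r" % line)
        # Assumption: tolerate whitespace before the colon, as in A.3.3.
        fields.append((name.rstrip(" \t"), value.strip(" \t")))
    return fields, body


def get_field(fields, name):
    """Return the first field-body whose field-name matches, ignoring case."""
    for field_name, value in fields:
        if field_name.lower() == name.lower():
            return value
    return None
```

Given the headers of example A.3.1 followed by a null line and a body, split_message returns the fields in order, and get_field(fields, "from") finds the "From" field regardless of case.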
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + August 13, 1982 - 47 - RFC #822 + diff --git a/rfc/sieve/rfc3028.txt b/rfc/sieve/rfc3028.txt @@ -0,0 +1,2019 @@ + + + + + + +Network Working Group T. Showalter +Request for Comments: 3028 Mirapoint, Inc. +Category: Standards Track January 2001 + + + Sieve: A Mail Filtering Language + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2001). All Rights Reserved. + +Abstract + + This document describes a language for filtering e-mail messages at + time of final delivery. It is designed to be implementable on either + a mail client or mail server. It is meant to be extensible, simple, + and independent of access protocol, mail architecture, and operating + system. It is suitable for running on a mail server where users may + not be allowed to execute arbitrary programs, such as on black box + Internet Message Access Protocol (IMAP) servers, as it has no + variables, loops, or ability to shell out to external programs. + +Table of Contents + + 1. Introduction ........................................... 3 + 1.1. Conventions Used in This Document ..................... 4 + 1.2. Example mail messages ................................. 4 + 2. Design ................................................. 5 + 2.1. Form of the Language .................................. 5 + 2.2. Whitespace ............................................ 5 + 2.3. Comments .............................................. 6 + 2.4. Literal Data .......................................... 6 + 2.4.1. 
Numbers ............................................... 6 + 2.4.2. Strings ............................................... 7 + 2.4.2.1. String Lists .......................................... 7 + 2.4.2.2. Headers ............................................... 8 + 2.4.2.3. Addresses ............................................. 8 + 2.4.2.4. MIME Parts ............................................ 9 + 2.5. Tests ................................................. 9 + 2.5.1. Test Lists ............................................ 9 + + + +Showalter Standards Track [Page 1] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + 2.6. Arguments ............................................. 9 + 2.6.1. Positional Arguments .................................. 9 + 2.6.2. Tagged Arguments ...................................... 10 + 2.6.3. Optional Arguments .................................... 10 + 2.6.4. Types of Arguments .................................... 10 + 2.7. String Comparison ..................................... 11 + 2.7.1. Match Type ............................................ 11 + 2.7.2. Comparisons Across Character Sets ..................... 12 + 2.7.3. Comparators ........................................... 12 + 2.7.4. Comparisons Against Addresses ......................... 13 + 2.8. Blocks ................................................ 14 + 2.9. Commands .............................................. 14 + 2.10. Evaluation ............................................ 15 + 2.10.1. Action Interaction .................................... 15 + 2.10.2. Implicit Keep ......................................... 15 + 2.10.3. Message Uniqueness in a Mailbox ....................... 15 + 2.10.4. Limits on Numbers of Actions .......................... 16 + 2.10.5. Extensions and Optional Features ...................... 16 + 2.10.6. Errors ................................................ 17 + 2.10.7. Limits on Execution ................................... 17 + 3. 
Control Commands ....................................... 17 + 3.1. Control Structure If .................................. 18 + 3.2. Control Structure Require ............................. 19 + 3.3. Control Structure Stop ................................ 19 + 4. Action Commands ........................................ 19 + 4.1. Action reject ......................................... 20 + 4.2. Action fileinto ....................................... 20 + 4.3. Action redirect ....................................... 21 + 4.4. Action keep ........................................... 21 + 4.5. Action discard ........................................ 22 + 5. Test Commands .......................................... 22 + 5.1. Test address .......................................... 23 + 5.2. Test allof ............................................ 23 + 5.3. Test anyof ............................................ 24 + 5.4. Test envelope ......................................... 24 + 5.5. Test exists ........................................... 25 + 5.6. Test false ............................................ 25 + 5.7. Test header ........................................... 25 + 5.8. Test not .............................................. 26 + 5.9. Test size ............................................. 26 + 5.10. Test true ............................................. 26 + 6. Extensibility .......................................... 26 + 6.1. Capability String ..................................... 27 + 6.2. IANA Considerations ................................... 28 + 6.2.1. Template for Capability Registrations ................. 28 + 6.2.2. Initial Capability Registrations ...................... 28 + 6.3. Capability Transport .................................. 29 + 7. Transmission ........................................... 29 + + + +Showalter Standards Track [Page 2] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + 8. 
Parsing ................................................ 30 + 8.1. Lexical Tokens ........................................ 30 + 8.2. Grammar ............................................... 31 + 9. Extended Example ....................................... 32 + 10. Security Considerations ................................ 34 + 11. Acknowledgments ........................................ 34 + 12. Author's Address ....................................... 34 + 13. References ............................................. 34 + 14. Full Copyright Statement ............................... 36 + +1. Introduction + + This memo documents a language that can be used to create filters for + electronic mail. It is not tied to any particular operating system or + mail architecture. It requires the use of [IMAIL]-compliant + messages, but should otherwise generalize to many systems. + + The language is powerful enough to be useful but limited in order to + allow for a safe server-side filtering system. The intention is to + make it impossible for users to do anything more complex (and + dangerous) than write simple mail filters, along with facilitating + the use of GUIs for filter creation and manipulation. The language is + not Turing-complete: it provides no way to write a loop or a function + and variables are not provided. + + Scripts written in Sieve are executed during final delivery, when the + message is moved to the user-accessible mailbox. In systems where + the MTA does final delivery, such as traditional Unix mail, it is + reasonable to sort when the MTA deposits mail into the user's + mailbox. + + There are a number of reasons to use a filtering system. Mail + traffic for most users has been increasing due to increased usage of + e-mail, the emergence of unsolicited email as a form of advertising, + and increased usage of mailing lists. 
+ + Experience at Carnegie Mellon has shown that if a filtering system is + made available to users, many will make use of it in order to file + messages from specific users or mailing lists. However, many others + did not make use of the Andrew system's FLAMES filtering language + [FLAMES] due to difficulty in setting it up. + + Because of the expectation that users will make use of filtering if + it is offered and easy to use, this language has been made simple + enough to allow many users to make use of it, but rich enough that it + can be used productively. However, it is expected that GUI-based + editors will be the preferred way of editing filters for a large + number of users. + + + +Showalter Standards Track [Page 3] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +1.1. Conventions Used in This Document + + In the sections of this document that discuss the requirements of + various keywords and operators, the following conventions have been + adopted. + + The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and + "MAY" in this document are to be interpreted as defined in + [KEYWORDS]. + + Each section on a command (test, action, or control structure) has a + line labeled "Syntax:". This line describes the syntax of the + command, including its name and its arguments. Required arguments + are listed inside angle brackets ("<" and ">"). Optional arguments + are listed inside square brackets ("[" and "]"). Each argument is + followed by its type, so "<key: string>" represents an argument + called "key" that is a string. Literal strings are represented with + double-quoted strings. Alternatives are separated with slashes, and + parentheses are used for grouping, similar to [ABNF]. + + In the "Syntax" line, there are three special pieces of syntax that + are frequently repeated, MATCH-TYPE, COMPARATOR, and ADDRESS-PART. + These are discussed in sections 2.7.1, 2.7.3, and 2.7.4, + respectively.
+ + The formal grammar for these commands is given in section 8.2 and is the + authoritative reference on how to construct commands, but the formal + grammar does not specify the order, semantics, number or types of + arguments to commands, nor the legal command names. The intent is to + allow for extension without changing the grammar. + +1.2. Example mail messages + + The following mail messages will be used throughout this document in + examples. + + Message A + ----------------------------------------------------------- + Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST) + From: coyote@desert.example.org + To: roadrunner@acme.example.com + Subject: I have a present for you + + Look, I'm sorry about the whole anvil thing, and I really + didn't mean to try and drop it on you from the top of the + cliff. I want to try to make it up to you. I've got some + great birdseed over here at my place--top of the line + + + + +Showalter Standards Track [Page 4] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + stuff--and if you come by, I'll have it all wrapped up + for you. I'm really sorry for all the problems I've caused + for you over the years, but I know we can work this out. + -- + Wile E. Coyote "Super Genius" coyote@desert.example.org + ----------------------------------------------------------- + + Message B + ----------------------------------------------------------- + From: youcouldberich!@reply-by-postal-mail.invalid + Sender: b1ff@de.res.example.com + To: rube@landru.example.edu + Date: Mon, 31 Mar 1997 18:26:10 -0800 + Subject: $$$ YOU, TOO, CAN BE A MILLIONAIRE! $$$ + + YOU MAY HAVE ALREADY WON TEN MILLION DOLLARS, BUT I DOUBT + IT! SO JUST POST THIS TO SIX HUNDRED NEWSGROUPS! IT WILL + GUARANTEE THAT YOU GET AT LEAST FIVE RESPONSES WITH MONEY! + MONEY! MONEY! COLD HARD CASH! YOU WILL RECEIVE OVER + $20,000 IN LESS THAN TWO MONTHS! AND IT'S LEGAL!!!!!!!!!
+ !!!!!!!!!!!!!!!!!!111111111!!!!!!!11111111111!!1 JUST + SEND $5 IN SMALL, UNMARKED BILLS TO THE ADDRESSES BELOW! + ----------------------------------------------------------- + +2. Design + +2.1. Form of the Language + + The language consists of a set of commands. Each command consists of + a set of tokens delimited by whitespace. The command identifier is + the first token and it is followed by zero or more argument tokens. + Arguments may be literal data, tags, blocks of commands, or test + commands. + + The language is represented in UTF-8, as specified in [UTF-8]. + + Tokens in the ASCII range are considered case-insensitive. + +2.2. Whitespace + + Whitespace is used to separate tokens. Whitespace is made up of + tabs, newlines (CRLF, never just CR or LF), and the space character. + The amount of whitespace used is not significant. + + + + + + + + +Showalter Standards Track [Page 5] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +2.3. Comments + + Two types of comments are offered. Comments are semantically + equivalent to whitespace and can be used anyplace that whitespace is + (with one exception in multi-line strings, as described in the + grammar). + + Hash comments begin with a "#" character that is not contained within + a string and continue until the next CRLF. + + Example: if size :over 100K { # this is a comment + discard; + } + + Bracketed comments begin with the token "/*" and end with "*/" outside + of a string. Bracketed comments may span multiple lines. Bracketed + comments do not nest. + + Example: if size :over 100K { /* this is a comment + this is still a comment */ discard /* this is a comment + */ ; + } + +2.4. Literal Data + + Literal data means data that is not executed, merely evaluated "as + is", to be used as arguments to commands. Literal data is limited to + numbers and strings. + +2.4.1. Numbers + + Numbers are given as ordinary decimal numbers. 
However, those + numbers that have a tendency to be fairly large, such as message + sizes, MAY have a "K", "M", or "G" appended to indicate a multiple of + a power of two. To be comparable with the power-of-two-based + versions of SI units that computers frequently use, K specifies + kibi-, or 1,024 (2^10) times the value of the number; M specifies + mebi-, or 1,048,576 (2^20) times the value of the number; and G + specifies gibi-, or 1,073,741,824 (2^30) times the value of the + number [BINARY-SI]. + + Implementations MUST provide 31 bits of magnitude in numbers, but MAY + provide more. + + Only positive integers are permitted by this specification. + + + + + + +Showalter Standards Track [Page 6] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +2.4.2. Strings + + Scripts involve large numbers of strings as they are used for pattern + matching, addresses, textual bodies, etc. Typically, short quoted + strings suffice for most uses, but a more convenient form is provided + for longer strings such as bodies of messages. + + A quoted string starts and ends with a single double quote (the <"> + character, ASCII 34). A backslash ("\", ASCII 92) inside of a quoted + string is followed by either another backslash or a double quote. + This two-character sequence represents a single backslash or double- + quote within the string, respectively. + + No other characters should be escaped with a single backslash. + + An undefined escape sequence (such as "\a" in a context where "a" has + no special meaning) is interpreted as if there were no backslash (in + this case, "\a" is just "a"). + + Non-printing characters such as tabs, CR and LF, and control + characters are permitted in quoted strings. Quoted strings MAY span + multiple lines. NUL (ASCII 0) is not allowed in strings. + + For entering larger amounts of text, such as an email message, a + multi-line form is allowed.
It starts with the keyword "text:", + followed by a CRLF, and ends with the sequence of a CRLF, a single + period, and another CRLF. In order to allow the message to contain + lines with a single-dot, lines are dot-stuffed. That is, when + composing a message body, an extra `.' is added before each line + which begins with a `.'. When the server interprets the script, + these extra dots are removed. Note that a line that begins with a + dot followed by a non-dot character is not interpreted dot-stuffed; + that is, ".foo" is interpreted as ".foo". However, because this is + potentially ambiguous, scripts SHOULD be properly dot-stuffed so such + lines do not appear. + + Note that a hashed comment or whitespace may occur in between the + "text:" and the CRLF, but not within the string itself. Bracketed + comments are not allowed here. + +2.4.2.1. String Lists + + When matching patterns, it is frequently convenient to match against + groups of strings instead of single strings. For this reason, a list + of strings is allowed in many tests, implying that if the test is + true using any one of the strings, then the test is true. + Implementations are encouraged to use short-circuit evaluation in + these cases. + + + +Showalter Standards Track [Page 7] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + For instance, the test `header :contains ["To", "Cc"] + ["me@example.com", "me00@landru.example.edu"]' is true if either the + To header or Cc header of the input message contains either of the + e-mail addresses "me@example.com" or "me00@landru.example.edu". + + Conversely, in any case where a list of strings is appropriate, a + single string is allowed without being a member of a list: it is + equivalent to a list with a single member. This means that the test + `exists "To"' is equivalent to the test `exists ["To"]'. + +2.4.2.2. Headers + + Headers are a subset of strings. 
In the Internet Message + Specification [IMAIL] [RFC1123], each header line is allowed to have + whitespace nearly anywhere in the line, including after the field + name and before the subsequent colon. Extra spaces between the + header name and the ":" in a header field are ignored. + + A header name never contains a colon. The "From" header refers to a + line beginning "From:" (or "From :", etc.). No header will match + the string "From:" due to the trailing colon. + + Folding of long header lines (as described in [IMAIL] 3.4.8) is + removed prior to interpretation of the data. The folding syntax (the + CRLF that ends a line plus any leading whitespace at the beginning of + the next line that indicates folding) are interpreted as if they were + a single space. + +2.4.2.3. Addresses + + A number of commands call for email addresses, which are also a + subset of strings. When these addresses are used in outbound + contexts, addresses must be compliant with [IMAIL], but are further + constrained. Using the symbols defined in [IMAIL], section 6.1, the + syntax of an address is: + + sieve-address = addr-spec ; simple address + / phrase "<" addr-spec ">" ; name & addr-spec + + That is, routes and group syntax are not permitted. If multiple + addresses are required, use a string list. Named groups are not used + here. + + Implementations MUST ensure that the addresses are syntactically + valid, but need not ensure that they actually identify an email + recipient. + + + + + +Showalter Standards Track [Page 8] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +2.4.2.4. MIME Parts + + In a few places, [MIME] body parts are represented as strings. These + parts include MIME headers and the body. This provides a way of + embedding typed data within a Sieve script so that, among other + things, character sets other than UTF-8 can be used for output + messages. + +2.5. Tests + + Tests are given as arguments to commands in order to control their + actions. 
In this document, tests are given to if/elsif/else to + decide which block of code is run. + + Tests MUST NOT have side effects. That is, a test cannot affect the + state of the filter or message. No tests in this specification have + side effects, and side effects are forbidden in extension tests as + well. + + The rationale for this is that tests with side effects impair + readability and maintainability and are difficult to represent in a + graphic interface for generating scripts. Side effects are confined + to actions where they are clearer. + +2.5.1. Test Lists + + Some tests ("allof" and "anyof", which implement logical "and" and + logical "or", respectively) may require more than a single test as an + argument. The test-list syntax element provides a way of grouping + tests. + + Example: if anyof (not exists ["From", "Date"], + header :contains "from" "fool@example.edu") { + discard; + } + +2.6. Arguments + + In order to specify what to do, most commands take arguments. There + are three types of arguments: positional, tagged, and optional. + +2.6.1. Positional Arguments + + Positional arguments are given to a command which discerns their + meaning based on their order. When a command takes positional + arguments, all positional arguments must be supplied and must be in + the order prescribed. + + + + +Showalter Standards Track [Page 9] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +2.6.2. Tagged Arguments + + This document provides for tagged arguments in the style of + CommonLISP. These are also similar to flags given to commands in + most command-line systems. + + A tagged argument is an argument for a command that begins with ":" + followed by a tag naming the argument, such as ":contains". This + argument means that zero or more of the next tokens have some + particular meaning depending on the argument. These next tokens may + be numbers or strings but they are never blocks. 
+ + Tagged arguments are similar to positional arguments, except that + instead of the meaning being derived from the command, it is derived + from the tag. + + Tagged arguments must appear before positional arguments, but they + may appear in any order with other tagged arguments. For simplicity + of the specification, this is not expressed in the syntax definitions + with commands, but they still may be reordered arbitrarily provided + they appear before positional arguments. Tagged arguments may be + mixed with optional arguments. + + To simplify this specification, tagged arguments SHOULD NOT take + tagged arguments as arguments. + +2.6.3. Optional Arguments + + Optional arguments are exactly like tagged arguments except that they + may be left out, in which case a default value is implied. Because + optional arguments tend to result in shorter scripts, they have been + used far more than tagged arguments. + + One particularly noteworthy case is the ":comparator" argument, which + allows the user to specify which [ACAP] comparator will be used to + compare two strings, since different languages may impose different + orderings on UTF-8 [UTF-8] characters. + +2.6.4. Types of Arguments + + Abstractly, arguments may be literal data, tests, or blocks of + commands. In this way, an "if" control structure is merely a command + that happens to take a test and a block as arguments and may execute + the block of code. + + However, this abstraction is ambiguous from a parsing standpoint. + The grammar in section 9.2 presents a parsable version of this: + Arguments are string-lists, numbers, and tags, which may be followed + + + +Showalter Standards Track [Page 10] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + by a test or a test-list, which may be followed by a block of + commands. No more than one test or test list, nor more than one + block of commands, may be used, and commands that end with blocks of + commands do not end with semicolons. + +2.7. 
String Comparison + + When matching one string against another, there are a number of ways + of performing the match operation. These are accomplished with three + types of matches: an exact match, a substring match, and a wildcard + glob-style match. These are described below. + + In order to provide for matches between character sets and case + insensitivity, Sieve borrows ACAP's comparator registry. + + However, when a string represents the name of a header, the + comparator is never user-specified. Header comparisons are always + done with the "i;ascii-casemap" operator, i.e., case-insensitive + comparisons, because this is the way things are defined in the + message specification [IMAIL]. + +2.7.1. Match Type + + There are three match types describing the matching used in this + specification: ":is", ":contains", and ":matches". Match type + arguments are supplied to those commands which allow them to specify + what kind of match is to be performed. + + These are used as tagged arguments to tests that perform string + comparison. + + The ":contains" match type describes a substring match. If the value + argument contains the key argument as a substring, the match is true. + For instance, the string "frobnitzm" contains "frob" and "nit", but + not "fbm". The null key ("") is contained in all values. + + The ":is" match type describes an absolute match; if the contents of + the first string are absolutely the same as the contents of the + second string, they match. Only the string "frobnitzm" is the string + "frobnitzm". The null key ":is" and only ":is" the null value. + + The ":matches" version specifies a wildcard match using the + characters "*" and "?". "*" matches zero or more characters, and "?" + matches a single character. "?" and "*" may be escaped as "\\?" and + "\\*" in strings to match against themselves. The first backslash + escapes the second backslash; together, they escape the "*". 
This is + awkward, but it is commonplace in several programming languages that + use globs and regular expressions. + + + +Showalter Standards Track [Page 11] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + In order to specify what type of match is supposed to happen, + commands that support matching take optional tagged arguments + ":matches", ":is", and ":contains". Commands default to using ":is" + matching if no match type argument is supplied. Note that these + modifiers may interact with comparators; in particular, some + comparators are not suitable for matching with ":contains" or + ":matches". It is an error to use a comparator with ":contains" or + ":matches" that is not compatible with it. + + It is an error to give more than one of these arguments to a given + command. + + For convenience, the "MATCH-TYPE" syntax element is defined here as + follows: + + Syntax: ":is" / ":contains" / ":matches" + +2.7.2. Comparisons Across Character Sets + + All Sieve scripts are represented in UTF-8, but messages may involve + a number of character sets. In order for comparisons to work across + character sets, implementations SHOULD implement the following + behavior: + + Implementations decode header charsets to UTF-8. Two strings are + considered equal if their UTF-8 representations are identical. + Implementations should decode charsets represented in the forms + specified by [MIME] for both message headers and bodies. + Implementations must be capable of decoding US-ASCII, ISO-8859-1, + the ASCII subset of ISO-8859-* character sets, and UTF-8. + + If implementations fail to support the above behavior, they MUST + conform to the following: + + No two strings can be considered equal if one contains octets + greater than 127. + +2.7.3. Comparators + + In order to allow for language-independent, case-independent matches, + the match type may be coupled with a comparator name. 
Comparators + are described for [ACAP]; a registry is defined for ACAP, and this + specification uses that registry. + + ACAP defines multiple comparator types. Only equality types are used + in this specification. + + + + + +Showalter Standards Track [Page 12] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + All implementations MUST support the "i;octet" comparator (simply + compares octets) and the "i;ascii-casemap" comparator (which treats + uppercase and lowercase characters in the ASCII subset of UTF-8 as + the same). If left unspecified, the default is "i;ascii-casemap". + + Some comparators may not be usable with substring matches; that is, + they may only work with ":is". It is an error to try and use a + comparator with ":matches" or ":contains" that is not compatible with + it. + + A comparator is specified by the ":comparator" option with commands + that support matching. This option is followed by a string providing + the name of the comparator to be used. For convenience, the syntax + of a comparator is abbreviated to "COMPARATOR", and (repeated in + several tests) is as follows: + + Syntax: ":comparator" <comparator-name: string> + + So in this example, + + Example: if header :contains :comparator "i;octet" "Subject" + "MAKE MONEY FAST" { + discard; + } + + would discard any message with subjects like "You can MAKE MONEY + FAST", but not "You can Make Money Fast", since the comparator used + is case-sensitive. + + Comparators other than i;octet and i;ascii-casemap must be declared + with require, as they are extensions. If a comparator declared with + require is not known, it is an error, and execution fails. If the + comparator is not declared with require, it is also an error, even if + the comparator is supported. (See 2.10.5.) + + Both ":matches" and ":contains" match types are compatible with the + "i;octet" and "i;ascii-casemap" comparators and may be used with + them. 
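As a rough illustration of how the three match types combine with the two mandatory comparators, the following Python sketch may help. It is not part of the specification: the names normalize, glob_to_regex, and string_match are hypothetical, and "i;ascii-casemap" is approximated by upper-casing ASCII letters only.

```python
import re

def normalize(value: str, comparator: str = "i;ascii-casemap") -> str:
    """Apply the comparator's notion of equality to a string."""
    if comparator == "i;octet":
        return value  # compared as-is
    if comparator == "i;ascii-casemap":
        # Fold only ASCII letters; all other characters are left untouched.
        return "".join(c.upper() if "a" <= c <= "z" else c for c in value)
    raise ValueError(f"unsupported comparator: {comparator}")

def glob_to_regex(key: str) -> str:
    """Translate a ":matches" key ("*", "?", backslash escapes) to a regex."""
    out, i = [], 0
    while i < len(key):
        c = key[i]
        if c == "\\" and i + 1 < len(key):
            out.append(re.escape(key[i + 1]))  # escaped character is literal
            i += 2
            continue
        out.append(".*" if c == "*" else "." if c == "?" else re.escape(c))
        i += 1
    return "".join(out)

def string_match(value, key, match_type=":is", comparator="i;ascii-casemap"):
    """Return True if value matches key under the match type and comparator."""
    v, k = normalize(value, comparator), normalize(key, comparator)
    if match_type == ":is":
        return v == k
    if match_type == ":contains":
        return k in v
    if match_type == ":matches":
        return re.fullmatch(glob_to_regex(k), v, re.DOTALL) is not None
    raise ValueError(f"unknown match type: {match_type}")
```

With this sketch, string_match("You can Make Money Fast", "MAKE MONEY FAST", ":contains", "i;octet") is false, while the same call with the default comparator is true, which mirrors the "MAKE MONEY FAST" example.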
+ + It is an error to give more than one of these arguments to a given + command. + +2.7.4. Comparisons Against Addresses + + Addresses are one of the most frequent things represented as strings. + These are structured, and being able to compare against the local- + part or the domain of an address is useful, so some tests that act + + + + +Showalter Standards Track [Page 13] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + exclusively on addresses take an additional optional argument that + specifies what the test acts on. + + These optional arguments are ":localpart", ":domain", and ":all", + which act on the local-part (left-side), the domain part (right- + side), and the whole address. + + The kind of comparison done, such as whether or not the test done is + case-insensitive, is specified as a comparator argument to the test. + + If an optional address-part is omitted, the default is ":all". + + It is an error to give more than one of these arguments to a given + command. + + For convenience, the "ADDRESS-PART" syntax element is defined here as + follows: + + Syntax: ":localpart" / ":domain" / ":all" + +2.8. Blocks + + Blocks are sets of commands enclosed within curly braces. Blocks are + supplied to commands so that the commands can implement control + commands. + + A control structure is a command that happens to take a test and a + block as one of its arguments; depending on the result of the test + supplied as another argument, it runs the code in the block some + number of times. + + With the commands supplied in this memo, there are no loops. The + control structures supplied--if, elsif, and else--run a block either + once or not at all. So there are two arguments, the test and the + block. + +2.9. Commands + + Sieve scripts are sequences of commands. Commands can take any of + the tokens above as arguments, and arguments may be either tagged or + positional arguments. Not all commands take all arguments. 
+ + There are three kinds of commands: test commands, action commands, + and control commands. + + The simplest is an action command. An action command is an + identifier followed by zero or more arguments, terminated by a + semicolon. Action commands do not take tests or blocks as arguments. + + + +Showalter Standards Track [Page 14] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + A control command is similar, but it takes a test as an argument, and + ends with a block instead of a semicolon. + + A test command is used as part of a control command. It is used to + specify whether or not the block of code given to the control command + is executed. + +2.10. Evaluation + +2.10.1. Action Interaction + + Some actions cannot be used with other actions because the result + would be absurd. These restrictions are noted throughout this memo. + + Extension actions MUST state how they interact with actions defined + in this specification. + +2.10.2. Implicit Keep + + Previous experience with filtering systems suggests that cases tend + to be missed in scripts. To prevent errors, Sieve has an "implicit + keep". + + An implicit keep is a keep action (see 4.4) performed in absence of + any action that cancels the implicit keep. + + An implicit keep is performed if a message is not written to a + mailbox, redirected to a new address, or explicitly thrown out. That + is, if a fileinto, a keep, a redirect, or a discard is performed, an + implicit keep is not. + + Some actions may be defined to not cancel the implicit keep. These + actions may not directly affect the delivery of a message, and are + used for their side effects. None of the actions specified in this + document meet that criteria, but extension actions will. + + For instance, with any of the short messages offered above, the + following script produces no actions. + + Example: if size :over 500K { discard; } + + As a result, the implicit keep is taken. + +2.10.3. 
Message Uniqueness in a Mailbox + + Implementations SHOULD NOT deliver a message to the same folder more + than once, even if a script explicitly asks for a message to be + written to a mailbox twice. + + + +Showalter Standards Track [Page 15] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + The test for equality of two messages is implementation-defined. + + If a script asks for a message to be written to a mailbox twice, it + MUST NOT be treated as an error. + +2.10.4. Limits on Numbers of Actions + + Site policy MAY limit numbers of actions taken and MAY impose + restrictions on which actions can be used together. In the event + that a script hits a policy limit on the number of actions taken for + a particular message, an error occurs. + + Implementations MUST prohibit more than one reject. + + Implementations MUST allow at least one keep or one fileinto. If + fileinto is not implemented, implementations MUST allow at least one + keep. + + Implementations SHOULD prohibit reject when used with other actions. + +2.10.5. Extensions and Optional Features + + Because of the differing capabilities of many mail systems, several + features of this specification are optional. Before any of these + extensions can be executed, they must be declared with the "require" + action. + + If an extension is not enabled with "require", implementations MUST + treat it as if they did not support it at all. + + If a script does not understand an extension declared with require, + the script must not be used at all. Implementations MUST NOT execute + scripts which require unknown capability names. + + Note: The reason for this restriction is that prior experiences with + languages such as LISP and Tcl suggest that this is a workable + way of noting that a given script uses an extension. + + Experience with PostScript suggests that mechanisms that allow + a script to work around missing extensions are not used in + practice. 
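The require rules above can be pictured as a check made before a script runs. The Python below is only a sketch under assumptions: SUPPORTED is a made-up capability set for an imaginary implementation, and check_requires and is_enabled are hypothetical names, not taken from any real Sieve library.

```python
# Hypothetical capabilities offered by an imaginary implementation.
SUPPORTED = {"fileinto", "reject", "envelope"}

def check_requires(required, supported=SUPPORTED):
    """Return the set of enabled capabilities, or raise if any is unknown.

    A script that requires an unknown capability must not be run at all.
    """
    unknown = [name for name in required if name not in supported]
    if unknown:
        raise ValueError("unknown capabilities: " + ", ".join(unknown))
    return set(required)

def is_enabled(extension, enabled):
    """An extension not declared with require is treated as unsupported."""
    return extension in enabled
```

Here check_requires(["fileinto"]) enables only fileinto, so a later use of reject in the same script would be treated as unsupported even though the implementation knows it.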
+ + Extensions which define actions MUST state how they interact with + actions discussed in the base specification. + + + + + + + +Showalter Standards Track [Page 16] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +2.10.6. Errors + + In any programming language, there are compile-time and run-time + errors. + + Compile-time errors are ones in syntax that are detectable if a + syntax check is done. + + Run-time errors are not detectable until the script is run. This + includes transient failures like disk full conditions, but also + includes issues like invalid combinations of actions. + + When an error occurs in a Sieve script, all processing stops. + + Implementations MAY choose to do a full parse, then evaluate the + script, then do all actions. Implementations might even go so far as + to ensure that execution is atomic (either all actions are executed + or none are executed). + + Other implementations may choose to parse and run at the same time. + Such implementations are simpler, but have issues with partial + failure (some actions happen, others don't). + + Implementations might even go so far as to ensure that scripts can + never execute an invalid set of actions (e.g., reject + fileinto) + before execution, although this could involve solving the Halting + Problem. + + This specification allows any of these approaches. Solving the + Halting Problem is considered extra credit. + + When an error happens, implementations MUST notify the user that an + error occurred, which actions (if any) were taken, and do an implicit + keep. + +2.10.7. Limits on Execution + + Implementations may limit certain constructs. However, this + specification places a lower bound on some of these limits. + + Implementations MUST support fifteen levels of nested blocks. + + Implementations MUST support fifteen levels of nested test lists. + +3. Control Commands + + Control structures are needed to allow for multiple and conditional + actions. 
+ + + +Showalter Standards Track [Page 17] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +3.1. Control Structure If + + There are three pieces to if: "if", "elsif", and "else". Each is + actually a separate command in terms of the grammar. However, an + elsif MUST only follow an if, and an else MUST follow only either an + if or an elsif. An error occurs if these conditions are not met. + + Syntax: if <test1: test> <block1: block> + + Syntax: elsif <test2: test> <block2: block> + + Syntax: else <block> + + The semantics are similar to those of any of the many other + programming languages these control commands appear in. When the + interpreter sees an "if", it evaluates the test associated with it. + If the test is true, it executes the block associated with it. + + If the test of the "if" is false, it evaluates the test of the first + "elsif" (if any). If the test of "elsif" is true, it runs the + elsif's block. An elsif may be followed by an elsif, in which case, + the interpreter repeats this process until it runs out of elsifs. + + When the interpreter runs out of elsifs, there may be an "else" case. + If there is, and none of the if or elsif tests were true, the + interpreter runs the else case. + + This provides a way of performing exactly one of the blocks in the + chain. + + In the following example, both Message A and B are dropped. + + Example: require "fileinto"; + if header :contains "from" "coyote" { + discard; + } elsif header :contains ["subject"] ["$$$"] { + discard; + } else { + fileinto "INBOX"; + } + + + When the script below is run over message A, it redirects the message + to acm@example.edu; message B, to postmaster@example.edu; any other + message is redirected to field@example.edu. 
+ + + + + + +Showalter Standards Track [Page 18] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + Example: if header :contains ["From"] ["coyote"] { + redirect "acm@example.edu"; + } elsif header :contains "Subject" "$$$" { + redirect "postmaster@example.edu"; + } else { + redirect "field@example.edu"; + } + + Note that this definition prohibits the "... else if ..." sequence + used by C. This is intentional, because this construct produces a + shift-reduce conflict. + +3.2. Control Structure Require + + Syntax: require <capabilities: string-list> + + The require action notes that a script makes use of a certain + extension. Such a declaration is required to use the extension, as + discussed in section 2.10.5. Multiple capabilities can be declared + with a single require. + + The require command, if present, MUST be used before anything other + than a require can be used. An error occurs if a require appears + after a command other than require. + + Example: require ["fileinto", "reject"]; + + Example: require "fileinto"; + require "vacation"; + +3.3. Control Structure Stop + + Syntax: stop + + The "stop" action ends all processing. If no actions have been + executed, then the keep action is taken. + +4. Action Commands + + This document supplies five actions that may be taken on a message: + keep, fileinto, redirect, reject, and discard. + + Implementations MUST support the "keep", "discard", and "redirect" + actions. + + Implementations SHOULD support "reject" and "fileinto". + + + + + +Showalter Standards Track [Page 19] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + Implementations MAY limit the number of certain actions taken (see + section 2.10.4). + +4.1. Action reject + + Syntax: reject <reason: string> + + The optional "reject" action refuses delivery of a message by sending + back an [MDN] to the sender. It resends the message to the sender, + wrapping it in a "reject" form, noting that it was rejected by the + recipient. 
In the following script, message A is rejected and + returned to the sender. + + Example: if header :contains "from" "coyote@desert.example.org" { + reject "I am not taking mail from you, and I don't want + your birdseed, either!"; + } + + A reject message MUST take the form of a failure MDN as specified by + [MDN]. The human-readable portion of the message, the first + component of the MDN, contains the human readable message describing + the error, and it SHOULD contain additional text alerting the + original sender that mail was refused by a filter. This part of the + MDN might appear as follows: + + ------------------------------------------------------------ + Message was refused by recipient's mail filtering program. Reason + given was as follows: + + I am not taking mail from you, and I don't want your birdseed, + either! + ------------------------------------------------------------ + + The MDN action-value field as defined in the MDN specification MUST + be "deleted" and MUST have the MDN-sent-automatically and automatic- + action modes set. + + Because some implementations can not or will not implement the reject + command, it is optional. The capability string to be used with the + require command is "reject". + +4.2. Action fileinto + + Syntax: fileinto <folder: string> + + The "fileinto" action delivers the message into the specified folder. + Implementations SHOULD support fileinto, but in some environments + this may be impossible. + + + +Showalter Standards Track [Page 20] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + The capability string for use with the require command is "fileinto". + + In the following script, message A is filed into folder + "INBOX.harassment". + + Example: require "fileinto"; + if header :contains ["from"] "coyote" { + fileinto "INBOX.harassment"; + } + +4.3. 
Action redirect + + Syntax: redirect <address: string> + + The "redirect" action is used to send the message to another user at + a supplied address, as a mail forwarding feature does. The + "redirect" action makes no changes to the message body or existing + headers, but it may add new headers. The "redirect" modifies the + envelope recipient. + + The redirect command performs an MTA-style "forward"--that is, what + you get from a .forward file using sendmail under UNIX. The address + on the SMTP envelope is replaced with the one on the redirect command + and the message is sent back out. (This is not an MUA-style forward, + which creates a new message with a different sender and message ID, + wrapping the old message in a new one.) + + A simple script can be used for redirecting all mail: + + Example: redirect "bart@example.edu"; + + Implementations SHOULD take measures to implement loop control, + possibly including adding headers to the message or counting received + headers. If an implementation detects a loop, it causes an error. + +4.4. Action keep + + Syntax: keep + + The "keep" action is whatever action is taken in lieu of all other + actions, if no filtering happens at all; generally, this simply means + to file the message into the user's main mailbox. This command + provides a way to execute this action without needing to know the + name of the user's main mailbox, providing a way to call it without + needing to understand the user's setup, or the underlying mail + system. + + + + + +Showalter Standards Track [Page 21] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + For instance, in an implementation where the IMAP server is running + scripts on behalf of the user at time of delivery, a keep command is + equivalent to a fileinto "INBOX". + + Example: if size :under 1M { keep; } else { discard; } + + Note that the above script is identical to the one below. + + Example: if not size :under 1M { discard; } + +4.5. 
Action discard + + Syntax: discard + + Discard is used to silently throw away the message. It does so by + simply canceling the implicit keep. If discard is used with other + actions, the other actions still happen. Discard is compatible with + all other actions. (For instance fileinto+discard is equivalent to + fileinto.) + + Discard MUST be silent; that is, it MUST NOT return a non-delivery + notification of any kind ([DSN], [MDN], or otherwise). + + In the following script, any mail from "idiot@example.edu" is thrown + out. + + Example: if header :contains ["from"] ["idiot@example.edu"] { + discard; + } + + While an important part of this language, "discard" has the potential + to create serious problems for users: Students who leave themselves + logged in to an unattended machine in a public computer lab may find + their script changed to just "discard". In order to protect users in + this situation (along with similar situations), implementations MAY + keep messages destroyed by a script for an indefinite period, and MAY + disallow scripts that throw out all mail. + +5. Test Commands + + Tests are used in conditionals to decide which part(s) of the + conditional to execute. + + Implementations MUST support these tests: "address", "allof", + "anyof", "exists", "false", "header", "not", "size", and "true". + + Implementations SHOULD support the "envelope" test. + + + + +Showalter Standards Track [Page 22] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +5.1. Test address + + Syntax: address [ADDRESS-PART] [COMPARATOR] [MATCH-TYPE] + <header-list: string-list> <key-list: string-list> + + The address test matches Internet addresses in structured headers + that contain addresses. It returns true if any header contains any + key in the specified part of the address, as modified by the + comparator and the match keyword. + + Like envelope and header, this test returns true if any combination + of the header-list and key-list arguments match. 
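The "any combination" rule and the address-part selection can be sketched in Python as follows. This is an illustration, not a reference implementation: address_part and address_test are hypothetical names, only the ":is" match type is shown, and str.lower() stands in for the "i;ascii-casemap" comparator. The parsing relies on getaddresses from Python's standard email.utils module.

```python
from email.utils import getaddresses

def address_part(addr: str, part: str = ":all") -> str:
    """Select the part of an addr-spec that the test acts on."""
    local, _, domain = addr.rpartition("@")
    if part == ":localpart":
        return local
    if part == ":domain":
        return domain
    if part == ":all":
        return addr
    raise ValueError(f"unknown address part: {part}")

def address_test(headers, header_list, key_list, part=":all"):
    """True if any address in any named header equals any key (":is").

    `headers` maps lower-cased header names to lists of raw header values.
    """
    for name in header_list:
        values = headers.get(name.lower(), [])
        # getaddresses() separates the display name from the addr-spec,
        # so the phrase part is never compared.
        for _phrase, addr in getaddresses(values):
            candidate = address_part(addr, part).lower()
            if any(candidate == key.lower() for key in key_list):
                return True
    return False
```

For a message whose From header is "Tim <tim@example.com>", address_test(headers, ["from"], ["tim@example.com"]) is true, while a key of "Tim" does not match because the phrase is ignored.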
+ + Internet email addresses [IMAIL] have the somewhat awkward + characteristic that the local-part to the left of the at-sign is + considered case sensitive, and the domain-part to the right of the + at-sign is case insensitive. The "address" command does not deal + with this itself, but provides the ADDRESS-PART argument for allowing + users to deal with it. + + The address primitive never acts on the phrase part of an email + address, nor on comments within that address. It also never acts on + group names, although it does act on the addresses within the group + construct. + + Implementations MUST restrict the address test to headers that + contain addresses, but MUST include at least From, To, Cc, Bcc, + Sender, Resent-From, Resent-To, and SHOULD include any other header + that utilizes an "address-list" structured header body. + + Example: if address :is :all "from" "tim@example.com" { + discard; + } + +5.2. Test allof + + Syntax: allof <tests: test-list> + + The allof test performs a logical AND on the tests supplied to it. + + Example: allof (false, false) => false + allof (false, true) => false + allof (true, true) => true + + The allof test takes as its argument a test-list. + + + + + + + +Showalter Standards Track [Page 23] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +5.3. Test anyof + + Syntax: anyof <tests: test-list> + + The anyof test performs a logical OR on the tests supplied to it. + + Example: anyof (false, false) => false + anyof (false, true) => true + anyof (true, true) => true + +5.4. Test envelope + + Syntax: envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] + <envelope-part: string-list> <key-list: string-list> + + The "envelope" test is true if the specified part of the SMTP (or + equivalent) envelope matches the specified key. + + If one of the envelope-part strings is (case insensitive) "from", + then matching occurs against the FROM address used in the SMTP MAIL + command.
+ + If one of the envelope-part strings is (case insensitive) "to", then + matching occurs against the TO address used in the SMTP RCPT command + that resulted in this message getting delivered to this user. Note + that only the most recent TO is available, and only the one relevant + to this user. + + The envelope-part is a string list and may contain more than one + parameter, in which case all of the strings specified in the key-list + are matched against all parts given in the envelope-part list. + + Like address and header, this test returns true if any combination of + the envelope-part and key-list arguments is true. + + All tests against envelopes MUST drop source routes. + + If the SMTP transaction involved several RCPT commands, only the data + from the RCPT command that caused delivery to this user is available + in the "to" part of the envelope. + + If a protocol other than SMTP is used for message transport, + implementations are expected to adapt this command appropriately. + + The envelope command is optional. Implementations SHOULD support it, + but the necessary information may not be available in all cases. + + + + + +Showalter Standards Track [Page 24] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + Example: require "envelope"; + if envelope :all :is "from" "tim@example.com" { + discard; + } + +5.5. Test exists + + Syntax: exists <header-names: string-list> + + The "exists" test is true if the headers listed in the header-names + argument exist within the message. All of the headers must exist or + the test is false. + + The following example throws out mail that doesn't have a From header + and a Date header. + + Example: if not exists ["From","Date"] { + discard; + } + +5.6. Test false + + Syntax: false + + The "false" test always evaluates to false. + +5.7. 
Test header + + Syntax: header [COMPARATOR] [MATCH-TYPE] + <header-names: string-list> <key-list: string-list> + + The "header" test evaluates to true if any header name matches any + key. The type of match is specified by the optional match argument, + which defaults to ":is" if not specified, as specified in section + 2.6. + + Like address and envelope, this test returns true if any combination + of the string-list and key-list arguments match. + + If a header listed in the header-names argument exists, it contains + the null key (""). However, if the named header is not present, it + does not contain the null key. So if a message contained the header + + X-Caffeine: C8H10N4O2 + + + + + + + +Showalter Standards Track [Page 25] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + these tests on that header evaluate as follows: + + header :is ["X-Caffeine"] [""] => false + header :contains ["X-Caffeine"] [""] => true + +5.8. Test not + + Syntax: not <test> + + The "not" test takes some other test as an argument, and yields the + opposite result. "not false" evaluates to "true" and "not true" + evaluates to "false". + +5.9. Test size + + Syntax: size <":over" / ":under"> <limit: number> + + The "size" test deals with the size of a message. It takes either a + tagged argument of ":over" or ":under", followed by a number + representing the size of the message. + + If the argument is ":over", and the size of the message is greater + than the number provided, the test is true; otherwise, it is false. + + If the argument is ":under", and the size of the message is less than + the number provided, the test is true; otherwise, it is false. + + Exactly one of ":over" or ":under" must be specified, and anything + else is an error. + + The size of a message is defined to be the number of octets from the + initial header until the last character in the message body. 
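The two strict comparisons described above for the "size" test can be sketched as follows. This is a minimal illustration, not part of the specification: the function name size_test and its arguments are assumptions made for this example, and the message size is taken to be already known in octets.

```python
def size_test(tag, message_octets, limit):
    """Evaluate a Sieve-style size test with strict comparisons.

    tag is ":over" or ":under"; anything else is an error.
    """
    if tag == ":over":
        return message_octets > limit   # strictly greater than the limit
    if tag == ":under":
        return message_octets < limit   # strictly less than the limit
    raise ValueError("exactly one of :over or :under must be specified")


# A message whose size equals the limit satisfies neither test.
print(size_test(":over", 4000, 4000), size_test(":under", 4000, 4000))
```

Because both comparisons are strict, a script that needs to cover every message size has to treat the boundary case explicitly.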
+ + Note that for a message that is exactly 4,000 octets, the message is + neither ":over" 4000 octets or ":under" 4000 octets. + +5.10. Test true + + Syntax: true + + The "true" test always evaluates to true. + +6. Extensibility + + New control structures, actions, and tests can be added to the + language. Sites must make these features known to their users; this + document does not define a way to discover the list of extensions + supported by the server. + + + +Showalter Standards Track [Page 26] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + Any extensions to this language MUST define a capability string that + uniquely identifies that extension. If a new version of an extension + changes the functionality of a previously defined extension, it MUST + use a different name. + + In a situation where there is a submission protocol and an extension + advertisement mechanism aware of the details of this language, + scripts submitted can be checked against the mail server to prevent + use of an extension that the server does not support. + + Extensions MUST state how they interact with constraints defined in + section 2.10, e.g., whether they cancel the implicit keep, and which + actions they are compatible and incompatible with. + +6.1. Capability String + + Capability strings are typically short strings describing what + capabilities are supported by the server. + + Capability strings beginning with "vnd." represent vendor-defined + extensions. Such extensions are not defined by Internet standards or + RFCs, but are still registered with IANA in order to prevent + conflicts. Extensions starting with "vnd." SHOULD be followed by the + name of the vendor and product, such as "vnd.acme.rocket-sled". + + The following capability strings are defined by this document: + + envelope The string "envelope" indicates that the implementation + supports the "envelope" command. 
+ + fileinto The string "fileinto" indicates that the implementation + supports the "fileinto" command. + + reject The string "reject" indicates that the implementation + supports the "reject" command. + + comparator- The string "comparator-elbonia" is provided if the + implementation supports the "elbonia" comparator. + Therefore, all implementations have at least the + "comparator-i;octet" and "comparator-i;ascii-casemap" + capabilities. However, these comparators may be used + without being declared with require. + + + + + + + + + +Showalter Standards Track [Page 27] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +6.2. IANA Considerations + + In order to provide a standard set of extensions, a registry is + provided by IANA. Capability names may be registered on a first- + come, first-served basis. Extensions designed for interoperable use + SHOULD be defined as standards track or IESG approved experimental + RFCs. + +6.2.1. Template for Capability Registrations + + The following template is to be used for registering new Sieve + extensions with IANA. + + To: iana@iana.org + Subject: Registration of new Sieve extension + + Capability name: + Capability keyword: + Capability arguments: + Standards Track/IESG-approved experimental RFC number: + Person and email address to contact for further information: + +6.2.2. Initial Capability Registrations + + The following are to be added to the IANA registry for Sieve + extensions as the initial contents of the capability registry. 
+ + Capability name: fileinto + Capability keyword: fileinto + Capability arguments: fileinto <folder: string> + Standards Track/IESG-approved experimental RFC number: + RFC 3028 (Sieve base spec) + Person and email address to contact for further information: + Tim Showalter + tjs@mirapoint.com + + Capability name: reject + Capability keyword: reject + Capability arguments: reject <reason: string> + Standards Track/IESG-approved experimental RFC number: + RFC 3028 (Sieve base spec) + Person and email address to contact for further information: + Tim Showalter + tjs@mirapoint.com + + + + + + + +Showalter Standards Track [Page 28] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + Capability name: envelope + Capability keyword: envelope + Capability arguments: + envelope [COMPARATOR] [ADDRESS-PART] [MATCH-TYPE] + <envelope-part: string-list> <key-list: string-list> + Standards Track/IESG-approved experimental RFC number: + RFC 3028 (Sieve base spec) + Person and email address to contact for further information: + Tim Showalter + tjs@mirapoint.com + + Capability name: comparator-* + Capability keyword: + comparator-* (anything starting with "comparator-") + Capability arguments: (none) + Standards Track/IESG-approved experimental RFC number: + RFC 3028, Sieve, by reference of + RFC 2244, Application Configuration Access Protocol + Person and email address to contact for further information: + Tim Showalter + tjs@mirapoint.com + +6.3. Capability Transport + + As the range of mail systems that this document is intended to apply + to is quite varied, a method of advertising which capabilities an + implementation supports is difficult due to the wide range of + possible implementations. Such a mechanism, however, should have + property that the implementation can advertise the complete set of + extensions that it supports. + +7. Transmission + + The MIME type for a Sieve script is "application/sieve". 
+ + The registration of this type for RFC 2048 requirements is as + follows: + + Subject: Registration of MIME media type application/sieve + + MIME media type name: application + MIME subtype name: sieve + Required parameters: none + Optional parameters: none + Encoding considerations: Most sieve scripts will be textual, + written in UTF-8. When non-7bit characters are used, + quoted-printable is appropriate for transport systems + that require 7bit encoding. + + + +Showalter Standards Track [Page 29] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + Security considerations: Discussed in section 10 of RFC 3028. + Interoperability considerations: Discussed in section 2.10.5 + of RFC 3028. + Published specification: RFC 3028. + Applications which use this media type: sieve-enabled mail servers + Additional information: + Magic number(s): + File extension(s): .siv + Macintosh File Type Code(s): + Person & email address to contact for further information: + See the discussion list at ietf-mta-filters@imc.org. + Intended usage: + COMMON + Author/Change controller: + See Author information in RFC 3028. + +8. Parsing + + The Sieve grammar is separated into tokens and a separate grammar as + most programming languages are. + +8.1. Lexical Tokens + + Sieve scripts are encoded in UTF-8. The following assumes a valid + UTF-8 encoding; special characters in Sieve scripts are all ASCII. + + The following are tokens in Sieve: + + - identifiers + - tags + - numbers + - quoted strings + - multi-line strings + - other separators + + Blanks, horizontal tabs, CRLFs, and comments ("white space") are + ignored except as they separate tokens. Some white space is required + to separate otherwise adjacent tokens and in specific places in the + multi-line strings. + + The other separators are single individual characters, and are + mentioned explicitly in the grammar. 
+ + The lexical structure of sieve is defined in the following BNF (as + described in [ABNF]): + + + + + + +Showalter Standards Track [Page 30] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + bracket-comment = "/*" *(CHAR-NOT-STAR / ("*" CHAR-NOT-SLASH)) "*/" + ;; No */ allowed inside a comment. + ;; (No * is allowed unless it is the last character, + ;; or unless it is followed by a character that isn't a + ;; slash.) + + CHAR-NOT-DOT = (%x01-09 / %x0b-0c / %x0e-2d / %x2f-ff) + ;; no dots, no CRLFs + + CHAR-NOT-CRLF = (%x01-09 / %x0b-0c / %x0e-ff) + + CHAR-NOT-SLASH = (%x00-2e / %x30-ff) + + CHAR-NOT-STAR = (%x00-29 / %x2b-ff) + + comment = bracket-comment / hash-comment + + hash-comment = ( "#" *CHAR-NOT-CRLF CRLF ) + + identifier = (ALPHA / "_") *(ALPHA / DIGIT / "_") + + tag = ":" identifier + + number = 1*DIGIT [QUANTIFIER] + + QUANTIFIER = "K" / "M" / "G" + + quoted-string = DQUOTE *CHAR DQUOTE + ;; in general, \ CHAR inside a string maps to CHAR + ;; so \" maps to " and \\ maps to \ + ;; note that newlines and other characters are all allowed + ;; in strings + + multi-line = "text:" *(SP / HTAB) (hash-comment / CRLF) + *(multi-line-literal / multi-line-dotstuff) + "." CRLF + multi-line-literal = [CHAR-NOT-DOT *CHAR-NOT-CRLF] CRLF + multi-line-dotstuff = "." 1*CHAR-NOT-CRLF CRLF + ;; A line containing only "." ends the multi-line. + ;; Remove a leading '.' if followed by another '.'. + + white-space = 1*(SP / CRLF / HTAB) / comment + +8.2. Grammar + + The following is the grammar of Sieve after it has been lexically + interpreted. No white space or comments appear below. The start + symbol is "start". 
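As an illustration of the multi-line string rule in section 8.1, the sketch below decodes the lines that follow a "text:" introducer. It is a minimal sketch under stated assumptions: the name decode_multiline is hypothetical, and the input is assumed to be a list of lines already split on CRLF with the line endings removed.

```python
def decode_multiline(lines):
    """Decode the body of a Sieve multi-line string.

    lines: the lines following the "text:" line, without CRLF.
    Returns the decoded lines up to (not including) the terminator.
    """
    decoded = []
    for line in lines:
        if line == ".":
            # A line containing only "." ends the multi-line string.
            return decoded
        if line.startswith(".."):
            # Remove a leading "." only if it is followed by another ".".
            line = line[1:]
        decoded.append(line)
    raise ValueError("multi-line string is missing its terminating '.' line")


print(decode_multiline(["Thank you.", ".... Fred", "."]))
```

With this rule, a line written with four leading dots is read back with three, and a line such as ".foo" is left unchanged because its leading dot is not followed by another dot.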
+ + + +Showalter Standards Track [Page 31] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + argument = string-list / number / tag + + arguments = *argument [test / test-list] + + block = "{" commands "}" + + command = identifier arguments ( ";" / block ) + + commands = *command + + start = commands + + string = quoted-string / multi-line + + string-list = "[" string *("," string) "]" / string ;; if + there is only a single string, the brackets are optional + + test = identifier arguments + + test-list = "(" test *("," test) ")" + +9. Extended Example + + The following is an extended example of a Sieve script. Note that it + does not make use of the implicit keep. + + # + # Example Sieve Filter + # Declare any optional features or extension used by the script + # + require ["fileinto", "reject"]; + + # + # Reject any large messages (note that the four leading dots get + # "stuffed" to three) + # + if size :over 1M + { + reject text: + Please do not send me large attachments. + Put your file on a server and send me the URL. + Thank you. + .... Fred + . + ; + stop; + } + # + + + +Showalter Standards Track [Page 32] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + # Handle messages from known mailing lists + # Move messages from IETF filter discussion list to filter folder + # + if header :is "Sender" "owner-ietf-mta-filters@imc.org" + { + fileinto "filter"; # move to "filter" folder + } + # + # Keep all messages to or from people in my company + # + elsif address :domain :is ["From", "To"] "example.com" + { + keep; # keep in "In" folder + } + + # + # Try and catch unsolicited email. If a message is not to me, + # or it contains a subject known to be spam, file it away. + # + elsif anyof (not address :all :contains + ["To", "Cc", "Bcc"] "me@example.com", + header :matches "subject" + ["*make*money*fast*", "*university*dipl*mas*"]) + { + # If message header does not contain my address, + # it's from a list. 
+ fileinto "spam"; # move to "spam" folder + } + else + { + # Move all other (non-company) mail to "personal" + # folder. + fileinto "personal"; + } + + + + + + + + + + + + + + + + + +Showalter Standards Track [Page 33] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +10. Security Considerations + + Users must get their mail. It is imperative that whatever method + implementations use to store the user-defined filtering scripts be + secure. + + It is equally important that implementations sanity-check the user's + scripts, and not allow users to create on-demand mailbombs. For + instance, an implementation that allows a user to reject or redirect + multiple times to a single message might also allow a user to create + a mailbomb triggered by mail from a specific user. Site- or + implementation-defined limits on actions are useful for this. + + Several commands, such as "discard", "redirect", and "fileinto" allow + for actions to be taken that are potentially very dangerous. + + Implementations SHOULD take measures to prevent languages from + looping. + +11. Acknowledgments + + I am very thankful to Chris Newman for his support and his ABNF + syntax checker, to John Myers and Steve Hole for outlining the + requirements for the original drafts, to Larry Greenfield for nagging + me about the grammar and finally fixing it, to Greg Sereda for + repeatedly fixing and providing examples, to Ned Freed for fixing + everything else, to Rob Earhart for an early implementation and a + great deal of help, and to Randall Gellens for endless amounts of + proofreading. I am grateful to Carnegie Mellon University where most + of the work on this document was done. I am also indebted to all of + the readers of the ietf-mta-filters@imc.org mailing list. + +12. Author's Address + + Tim Showalter + Mirapoint, Inc. + 909 Hermosa Court + Sunnyvale, CA 94085 + + EMail: tjs@mirapoint.com + +13. References + + [ABNF] Crocker, D. and P. 
Overell, "Augmented BNF for Syntax + Specifications: ABNF", RFC 2234, November 1997. + + + + + + +Showalter Standards Track [Page 34] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + + [ACAP] Newman, C. and J. G. Myers, "ACAP -- Application + Configuration Access Protocol", RFC 2244, November 1997. + + [BINARY-SI] "Standard IEC 60027-2: Letter symbols to be used in + electrical technology - Part 2: Telecommunications and + electronics", January 1999. + + [DSN] Moore, K. and G. Vaudreuil, "An Extensible Message Format + for Delivery Status Notifications", RFC 1894, January + 1996. + + [FLAMES] Borenstein, N, and C. Thyberg, "Power, Ease of Use, and + Cooperative Work in a Practical Multimedia Message + System", Int. J. of Man-Machine Studies, April, 1991. + Reprinted in Computer-Supported Cooperative Work and + Groupware, Saul Greenberg, editor, Harcourt Brace + Jovanovich, 1991. Reprinted in Readings in Groupware and + Computer-Supported Cooperative Work, Ronald Baecker, + editor, Morgan Kaufmann, 1993. + + [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [IMAP] Crispin, M., "Internet Message Access Protocol - version + 4rev1", RFC 2060, December 1996. + + [IMAIL] Crocker, D., "Standard for the Format of ARPA Internet + Text Messages", STD 11, RFC 822, August 1982. + + [MIME] Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part One: Format of Internet Message + Bodies", RFC 2045, November 1996. + + [MDN] Fajman, R., "An Extensible Message Format for Message + Disposition Notifications", RFC 2298, March 1998. + + [RFC1123] Braden, R., "Requirements for Internet Hosts -- + Application and Support", STD 3, RFC 1123, November 1989. + + [SMTP] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC + 821, August 1982. + + [UTF-8] Yergeau, F., "UTF-8, a transformation format of Unicode + and ISO 10646", RFC 2044, October 1996. 
+ + + + + + + +Showalter Standards Track [Page 35] + +RFC 3028 Sieve: A Mail Filtering Language January 2001 + + +14. Full Copyright Statement + + Copyright (C) The Internet Society (2001). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Showalter Standards Track [Page 36] + diff --git a/rfc/sieve/rfc3431.txt b/rfc/sieve/rfc3431.txt @@ -0,0 +1,451 @@ + + + + + + +Network Working Group W. Segmuller +Request for Comment: 3431 IBM T.J. 
Watson Research Center +Category: Standards Track December 2002 + + + Sieve Extension: Relational Tests + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2002). All Rights Reserved. + +Abstract + + This document describes the RELATIONAL extension to the Sieve mail + filtering language defined in RFC 3028. This extension extends + existing conditional tests in Sieve to allow relational operators. + In addition to testing their content, it also allows for testing of + the number of entities in header and envelope fields. + +1 Introduction + + Sieve [SIEVE] is a language for filtering e-mail messages at the time + of final delivery. It is designed to be implementable on either a + mail client or mail server. It is meant to be extensible, simple, + and independent of access protocol, mail architecture, and operating + system. It is suitable for running on a mail server where users may + not be allowed to execute arbitrary programs, such as on black box + Internet Messages Access Protocol (IMAP) servers, as it has no + variables, loops, nor the ability to shell out to external programs. + + The RELATIONAL extension provides relational operators on the + address, envelope, and header tests. This extension also provides a + way of counting the entities in a message header or address field. + + With this extension, the sieve script may now determine if a field is + greater than or less than a value instead of just equivalent. One + use is for the x-priority field: move messages with a priority + greater than 3 to the "work on later" folder. Mail could also be + sorted by the from address. 
Those userids that start with 'a'-'m' go + to one folder, and the rest go to another folder. + + + +Segmuller Standards Track [Page 1] + +RFC 3431 Sieve Extension: Relational Tests December 2002 + + + The sieve script can also determine the number of fields in the + header, or the number of addresses in a recipient field. For + example: are there more than 5 addresses in the to and cc fields. + +2 Conventions used in this document + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in BCP 14, RFC 2119. + + Conventions for notations are as in [SIEVE] section 1.1, including + the use of [KEYWORDS] and "Syntax:" label for the definition of + action and tagged arguments syntax, and the use of [ABNF]. + + The capability string associated with extension defined in this + document is "relational". + +3 Comparators + + This document does not define any comparators or exempt any + comparators from the require clause. Any comparator used, other than + "i;octet" and "i;ascii-casemap", MUST be declared a require clause as + defined in [SIEVE]. + + The "i;ascii-numeric" comparator, as defined in [ACAP], MUST be + supported for any implementation of this extension. The comparator + "i;ascii-numeric" MUST support at least 32 bit unsigned integers. + + Larger integers MAY be supported. Note: the "i;ascii-numeric" + comparator does not support negative numbers. + +4 Match Type + + This document defines two new match types. They are the VALUE match + type and the COUNT match type. 
+ + The syntax is: + + MATCH-TYPE =/ COUNT / VALUE + + COUNT = ":count" relational-match + + VALUE = ":value" relational-match + + relational-match = DQUOTE ( "gt" / "ge" / "lt" + / "le" / "eq" / "ne" ) DQUOTE + + + + + +Segmuller Standards Track [Page 2] + +RFC 3431 Sieve Extension: Relational Tests December 2002 + + +4.1 Match Type Value + + The VALUE match type does a relational comparison between strings. + + The VALUE match type may be used with any comparator which returns + sort information. + + Leading and trailing white space MUST be removed from the value of + the message for the comparison. White space is defined as + + SP / HTAB / CRLF + + A value from the message is considered the left side of the relation. + A value from the test expression, the key-list for address, envelope, + and header tests, is the right side of the relation. + + If there are multiple values on either side or both sides, the test + is considered true, if any pair is true. + +4.2 Match Type Count + + The COUNT match type first determines the number of the specified + entities in the message and does a relational comparison of the + number of entities to the values specified in the test expression. + + The COUNT match type SHOULD only be used with numeric comparators. + + The Address Test counts the number of recipients in the specified + fields. Group names are ignored. + + The Envelope Test counts the number of recipients in the specified + envelope parts. The envelope "to" will always have only one entry, + which is the address of the user for whom the sieve script is + running. There is no way a sieve script can determine if the message + was actually sent to someone else using this test. The envelope + "from" will be 0 if the MAIL FROM is blank, or 1 if MAIL FROM is not + blank. + + The Header Test counts the total number of instances of the specified + fields. This does not count individual addresses in the "to", "cc", + and other recipient fields. 
+ + In all cases, if more than one field name is specified, the counts + for all specified fields are added together to obtain the number for + comparison. Thus, specifying ["to", "cc"] in an address COUNT test, + comparing the total number of "to" and "cc" addresses; if separate + counts are desired, they must be done in two comparisons, perhaps + joined by "allof" or "anyof". + + + +Segmuller Standards Track [Page 3] + +RFC 3431 Sieve Extension: Relational Tests December 2002 + + +5 Security Considerations + + Security considerations are discussed in [SIEVE]. + + An implementation MUST ensure that the test for envelope "to" only + reflects the delivery to the current user. It MUST not be possible + for a user to determine if this message was delivered to someone else + using this test. + +6 Example + + Using the message: + + received: ... + received: ... + subject: example + to: foo@example.com.invalid, baz@example.com.invalid + cc: qux@example.com.invalid + + The test: + + address :count "ge" :comparator "i;ascii-numeric" ["to", "cc"] + ["3"] + + would be true and the test + + anyof ( address :count "ge" :comparator "i;ascii-numeric" + ["to"] ["3"], + address :count "ge" :comparator "i;ascii-numeric" + ["cc"] ["3"] ) + + would be false. + + To check the number of received fields in the header, the + following test may be used: + + header :count "ge" :comparator "i;ascii-numeric" + ["received"] ["3"] + + This would return false. But + + header :count "ge" :comparator "i;ascii-numeric" + ["received", "subject"] ["3"] + + would return true. + + + + + + +Segmuller Standards Track [Page 4] + +RFC 3431 Sieve Extension: Relational Tests December 2002 + + + The test: + + header :count "ge" :comparator "i;ascii-numeric" + ["to", "cc"] ["3"] + + will always return false on an RFC 2822 compliant message [RFC2822], + since a message can have at most one "to" field and at most one "cc" + field. This test counts the number of fields, not the number of + addresses. 
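The difference between counting addresses and counting header fields in the example above can be reproduced with a short sketch. This is only an illustration using Python's standard email package: the helper names header_count and address_count are assumptions made for this example, and group syntax is not given any special treatment.

```python
from email import message_from_string
from email.utils import getaddresses

# The example message from the text, with an empty body.
RAW_MESSAGE = (
    "received: ...\n"
    "received: ...\n"
    "subject: example\n"
    "to: foo@example.com.invalid, baz@example.com.invalid\n"
    "cc: qux@example.com.invalid\n"
    "\n"
)


def header_count(msg, names):
    """Count instances of the named header fields (header :count)."""
    return sum(len(msg.get_all(name, [])) for name in names)


def address_count(msg, names):
    """Count addresses in the named header fields (address :count)."""
    values = [value for name in names for value in msg.get_all(name, [])]
    return len(getaddresses(values))


msg = message_from_string(RAW_MESSAGE)

# address :count "ge" ["to", "cc"] ["3"]
print(address_count(msg, ["to", "cc"]) >= 3)
# header :count "ge" ["received"] ["3"]
print(header_count(msg, ["received"]) >= 3)
# header :count "ge" ["received", "subject"] ["3"]
print(header_count(msg, ["received", "subject"]) >= 3)
# header :count "ge" ["to", "cc"] ["3"]
print(header_count(msg, ["to", "cc"]) >= 3)
```

The counts for all named fields are summed before the comparison, which is why the combined "to" and "cc" address test succeeds while the two separate tests joined by anyof do not.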
+ +7 Extended Example + + require ["relational", "comparator-i;ascii-numeric", "fileinto"]; + + if header :value "lt" :comparator "i;ascii-numeric" + ["x-priority"] ["3"] + { + fileinto "Priority"; + } + + elsif address :count "gt" :comparator "i;ascii-numeric" + ["to"] ["5"] + { + # everything with more than 5 recipients in the "to" field + # is considered SPAM + fileinto "SPAM"; + } + + elsif address :value "gt" :all :comparator "i;ascii-casemap" + ["from"] ["M"] + { + fileinto "From N-Z"; + } else { + fileinto "From A-M"; + } + + if allof ( address :count "eq" :comparator "i;ascii-numeric" + ["to", "cc"] ["1"] , + address :all :comparator "i;ascii-casemap" + ["to", "cc"] ["me@foo.example.com.invalid"] ) + { + fileinto "Only me"; + } + + + + + + + + +Segmuller Standards Track [Page 5] + +RFC 3431 Sieve Extension: Relational Tests December 2002 + + +8 IANA Considerations + + The following template specifies the IANA registration of the Sieve + extension specified in this document: + + To: iana@iana.org + Subject: Registration of new Sieve extension + + Capability name: RELATIONAL + Capability keyword: relational + Capability arguments: N/A + Standards Track/IESG-approved experimental RFC number: this RFC + Person and email address to contact for further information: + Wolfgang Segmuller + IBM T.J. Watson Research Center + 30 Saw Mill River Rd + Hawthorne, NY 10532 + + Email: whs@watson.ibm.com + + This information should be added to the list of sieve extensions + given on http://www.iana.org/assignments/sieve-extensions. + +9 References + +9.1 Normative References + + [SIEVE] Showalter, T., "Sieve: A Mail Filtering Language", RFC + 3028, January 2001. + + [Keywords] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [ABNF] Crocker, D. and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", RFC 2234, November 1997. + + [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, April + 2001. 
+ +9.2 Non-Normative References + + [ACAP] Newman, C. and J. G. Myers, "ACAP -- Application + Configuration Access Protocol", RFC 2244, November 1997. + + + + + + + + +Segmuller Standards Track [Page 6] + +RFC 3431 Sieve Extension: Relational Tests December 2002 + + +10 Author's Address + + Wolfgang Segmuller + IBM T.J. Watson Research Center + 30 Saw Mill River Rd + Hawthorne, NY 10532 + + EMail: whs@watson.ibm.com + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Segmuller Standards Track [Page 7] + +RFC 3431 Sieve Extension: Relational Tests December 2002 + + +11 Full Copyright Statement + + Copyright (C) The Internet Society (2002). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. 
+ + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Segmuller Standards Track [Page 8] + diff --git a/rfc/sieve/rfc5231.txt b/rfc/sieve/rfc5231.txt @@ -0,0 +1,507 @@ + + + + + + +Network Working Group W. Segmuller +Request for Comments: 5231 B. Leiba +Obsoletes: 3431 IBM T.J. Watson Research Center +Category: Standards Track January 2008 + + + Sieve Email Filtering: Relational Extension + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + This document describes the RELATIONAL extension to the Sieve mail + filtering language defined in RFC 3028. This extension extends + existing conditional tests in Sieve to allow relational operators. + In addition to testing their content, it also allows for testing of + the number of entities in header and envelope fields. + + This document obsoletes RFC 3431. + +Table of Contents + + 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 2 + 2. Conventions Used in This Document . . . . . . . . . . . . . . . 2 + 3. Comparators . . . . . . . . . . . . . . . . . . . . . . . . . . 2 + 4. Match Types . . . . . . . . . . . . . . . . . . . . . . . . . . 3 + 4.1. Match Type VALUE . . . . . . . . . . . . . . . . 
. . . . . 3 + 4.2. Match Type COUNT . . . . . . . . . . . . . . . . . . . . . 3 + 5. Interaction with Other Sieve Actions . . . . . . . . . . . . . 4 + 6. Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 7. Extended Example . . . . . . . . . . . . . . . . . . . . . . . 6 + 8. Changes since RFC 3431 . . . . . . . . . . . . . . . . . . . . 6 + 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 7 + 10. Security Considerations . . . . . . . . . . . . . . . . . . . . 7 + 11. Normative References . . . . . . . . . . . . . . . . . . . . . 7 + + + + + + + + + + +Segmuller & Leiba Standards Track [Page 1] + +RFC 5231 Sieve: Relational Extension January 2008 + + +1. Introduction + + The RELATIONAL extension to the Sieve mail filtering language [Sieve] + provides relational operators on the address, envelope, and header + tests. This extension also provides a way of counting the entities + in a message header or address field. + + With this extension, the Sieve script may now determine if a field is + greater than or less than a value instead of just equivalent. One + use is for the x-priority field: move messages with a priority + greater than 3 to the "work on later" folder. Mail could also be + sorted by the from address. Those userids that start with 'a'-'m' go + to one folder, and the rest go to another folder. + + The Sieve script can also determine the number of fields in the + header, or the number of addresses in a recipient field, for example, + whether there are more than 5 addresses in the to and cc fields. + + The capability string associated with the extension defined in this + document is "relational". + +2. Conventions Used in This Document + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in BCP 14, RFC 2119. 
+ + Conventions for notations are as in [Sieve] section 1.1, including + the use of [Kwds] and the use of [ABNF]. + +3. Comparators + + This document does not define any comparators or exempt any + comparators from the require clause. Any comparator used must be + treated as defined in [Sieve]. + + The "i;ascii-numeric" comparator, as defined in [RFC4790], MUST be + supported for any implementation of this extension. The comparator + "i;ascii-numeric" MUST support at least 32-bit unsigned integers. + + Larger integers MAY be supported. Note: the "i;ascii-numeric" + comparator does not support negative numbers. + + + + + + + + + +Segmuller & Leiba Standards Track [Page 2] + +RFC 5231 Sieve: Relational Extension January 2008 + + +4. Match Types + + This document defines two new match types. They are the VALUE match + type and the COUNT match type. + + The syntax is: + + MATCH-TYPE =/ COUNT / VALUE + + COUNT = ":count" relational-match + + VALUE = ":value" relational-match + + relational-match = DQUOTE + ("gt" / "ge" / "lt" / "le" / "eq" / "ne") DQUOTE + ; "gt" means "greater than", the C operator ">". + ; "ge" means "greater than or equal", the C operator ">=". + ; "lt" means "less than", the C operator "<". + ; "le" means "less than or equal", the C operator "<=". + ; "eq" means "equal to", the C operator "==". + ; "ne" means "not equal to", the C operator "!=". + +4.1. Match Type VALUE + + The VALUE match type does a relational comparison between strings. + + The VALUE match type may be used with any comparator that returns + sort information. + + A value from the message is considered the left side of the relation. + A value from the test expression, the key-list for address, envelope, + and header tests, is the right side of the relation. + + If there are multiple values on either side or both sides, the test + is considered true if any pair is true. + +4.2. 
Match Type COUNT + + The COUNT match type first determines the number of the specified + entities in the message and does a relational comparison of the + number of entities, as defined below, to the values specified in the + test expression. + + The COUNT match type SHOULD only be used with numeric comparators. + + The Address Test counts the number of addresses (the number of + "mailbox" elements, as defined in [RFC2822]) in the specified fields. + Group names are ignored, but the contained mailboxes are counted. + + + +Segmuller & Leiba Standards Track [Page 3] + +RFC 5231 Sieve: Relational Extension January 2008 + + + The Envelope Test counts the number of addresses in the specified + envelope parts. The envelope "to" will always have only one entry, + which is the address of the user for whom the Sieve script is + running. Using this test, there is no way a Sieve script can + determine if the message was actually sent to someone else. The + envelope "from" will be 0 if the MAIL FROM is empty, or 1 if MAIL + FROM is not empty. + + The Header Test counts the total number of instances of the specified + fields. This does not count individual addresses in the "to", "cc", + and other recipient fields. + + In all cases, if more than one field name is specified, the counts + for all specified fields are added together to obtain the number for + comparison. Thus, specifying ["to", "cc"] in an address COUNT test + compares the total number of "to" and "cc" addresses; if separate + counts are desired, they must be done in two comparisons, perhaps + joined by "allof" or "anyof". + +5. Interaction with Other Sieve Actions + + This specification adds two match types. The VALUE match type only + works with comparators that return sort information. The COUNT match + type only makes sense with numeric comparators. + + There is no interaction with any other Sieve operations, nor with any + known extensions. 
In particular, this specification has no effect on + implicit KEEP, nor on any explicit message actions. + +6. Example + + Using the message: + + received: ... + received: ... + subject: example + to: foo@example.com, baz@example.com + cc: qux@example.com + + The test: + + address :count "ge" :comparator "i;ascii-numeric" + ["to", "cc"] ["3"] + + would evaluate to true, and the test + + + + + + +Segmuller & Leiba Standards Track [Page 4] + +RFC 5231 Sieve: Relational Extension January 2008 + + + anyof ( address :count "ge" :comparator "i;ascii-numeric" + ["to"] ["3"], + address :count "ge" :comparator "i;ascii-numeric" + ["cc"] ["3"] ) + + would evaluate to false. + + To check the number of received fields in the header, the following + test may be used: + + header :count "ge" :comparator "i;ascii-numeric" + ["received"] ["3"] + + This would evaluate to false. But + + header :count "ge" :comparator "i;ascii-numeric" + ["received", "subject"] ["3"] + + would evaluate to true. + + The test: + + header :count "ge" :comparator "i;ascii-numeric" + ["to", "cc"] ["3"] + + will always evaluate to false on an RFC 2822 compliant message + [RFC2822], since a message can have at most one "to" field and at + most one "cc" field. This test counts the number of fields, not the + number of addresses. + + + + + + + + + + + + + + + + + + + + + + +Segmuller & Leiba Standards Track [Page 5] + +RFC 5231 Sieve: Relational Extension January 2008 + + +7. 
Extended Example + + require ["relational", "comparator-i;ascii-numeric", "fileinto"]; + + if header :value "lt" :comparator "i;ascii-numeric" + ["x-priority"] ["3"] + { + fileinto "Priority"; + } + + elsif address :count "gt" :comparator "i;ascii-numeric" + ["to"] ["5"] + { + # everything with more than 5 recipients in the "to" field + # is considered SPAM + fileinto "SPAM"; + } + + elsif address :value "gt" :all :comparator "i;ascii-casemap" + ["from"] ["M"] + { + fileinto "From N-Z"; + } else { + fileinto "From A-M"; + } + + if allof ( address :count "eq" :comparator "i;ascii-numeric" + ["to", "cc"] ["1"] , + address :all :comparator "i;ascii-casemap" + ["to", "cc"] ["me@foo.example.com"] ) + { + fileinto "Only me"; + } + +8. Changes since RFC 3431 + + Apart from several minor editorial/wording changes, the following + list describes the notable changes to this specification since RFC + 3431. + + o Updated references, including changing the comparator reference + from the Application Configuration Access Protocol (ACAP) to the + "Internet Application Protocol Collation Registry" document + [RFC4790]. + + o Updated and corrected the examples. + + + + + +Segmuller & Leiba Standards Track [Page 6] + +RFC 5231 Sieve: Relational Extension January 2008 + + + o Added definition comments to ABNF for "gt", "lt", etc. + + o Clarified what RFC 2822 elements are counted in the COUNT test. + + o Removed the requirement to strip white space from header fields + before comparing; a more general version of this requirement has + been added to the Sieve base spec. + +9. 
IANA Considerations + + The following template specifies the IANA registration of the + relational Sieve extension specified in this document: + + To: iana@iana.org + Subject: Registration of new Sieve extension + + Capability name: relational + Description: Extends existing conditional tests in Sieve language + to allow relational operators + RFC number: RFC 5231 + Contact address: The Sieve discussion list <ietf-mta-filters@imc.org> + +10. Security Considerations + + An implementation MUST ensure that the test for envelope "to" only + reflects the delivery to the current user. Using this test, it MUST + not be possible for a user to determine if this message was delivered + to someone else. + + Additional security considerations are discussed in [Sieve]. + +11. Normative References + + [ABNF] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", RFC 4234, October 2005. + + [Kwds] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", RFC 2119, March 1997. + + [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, + April 2001. + + [RFC4790] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet + Application Protocol Collation Registry", RFC 4790, + March 2007. + + [Sieve] Guenther, P., Ed. and T. Showalter, Ed., "Sieve: An Email + Filtering Language", RFC 5228, January 2008. + + + +Segmuller & Leiba Standards Track [Page 7] + +RFC 5231 Sieve: Relational Extension January 2008 + + +Authors' Addresses + + Wolfgang Segmuller + IBM T.J. Watson Research Center + 19 Skyline Drive + Hawthorne, NY 10532 + US + + Phone: +1 914 784 7408 + EMail: werewolf@us.ibm.com + + + Barry Leiba + IBM T.J. 
Watson Research Center + 19 Skyline Drive + Hawthorne, NY 10532 + US + + Phone: +1 914 784 7941 + EMail: leiba@watson.ibm.com + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Segmuller & Leiba Standards Track [Page 8] + +RFC 5231 Sieve: Relational Extension January 2008 + + +Full Copyright Statement + + Copyright (C) The IETF Trust (2008). + + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors + retain all their rights. + + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND + THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS + OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF + THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. 
+ + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary + rights that may cover technology that may be required to implement + this standard. Please address the information to the IETF at + ietf-ipr@ietf.org. + + + + + + + + + + + + +Segmuller & Leiba Standards Track [Page 9] + diff --git a/rfc/sieve/rfc5260.txt b/rfc/sieve/rfc5260.txt @@ -0,0 +1,731 @@ + + + + + + +Network Working Group N. Freed +Request for Comments: 5260 Sun Microsystems +Category: Standards Track July 2008 + + + Sieve Email Filtering: Date and Index Extensions + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + This document describes the "date" and "index" extensions to the + Sieve email filtering language. The "date" extension gives Sieve the + ability to test date and time values in various ways. The "index" + extension provides a means to limit header and address tests to + specific instances of header fields when header fields are repeated. + +Table of Contents + + 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 2 + 2. Conventions Used in This Document . . . . . . . . . . . . . . 2 + 3. Capability Identifiers . . . . . . . . . . . . . . . . . . . . 3 + 4. Date Test . . . . . . . . . . . . . . . . . . . . . . . . . . 3 + 4.1. Zone and Originalzone Arguments . . . . . . . . . . . . . 4 + 4.2. Date-part Argument . . . . . . . . . . . . . . . . . . . . 4 + 4.3. Comparator Interactions with Date-part Arguments . . . . . 5 + 4.4. Examples . . . . . . . . . . . . . . . . . . . . . . . . . 6 + 5. Currentdate Test . . . . . . . . . . . . . . . . . . . . . . . 6 + 5.1. 
Examples . . . . . . . . . . . . . . . . . . . . . . . . . 6 + 6. Index Extension . . . . . . . . . . . . . . . . . . . . . . . 7 + 6.1. Example . . . . . . . . . . . . . . . . . . . . . . . . . 8 + 7. Security Considerations . . . . . . . . . . . . . . . . . . . 8 + 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 + 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 9 + 9.1. Normative References . . . . . . . . . . . . . . . . . . . 9 + 9.2. Informative References . . . . . . . . . . . . . . . . . . 10 + Appendix A. Julian Date Conversions . . . . . . . . . . . . . . . 11 + Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 12 + + + + + + + +Freed Standards Track [Page 1] + +RFC 5260 Sieve Date and Index Extensions July 2008 + + +1. Introduction + + Sieve [RFC5228] is a language for filtering email messages at or + around the time of final delivery. It is designed to be + implementable on either a mail client or mail server. It is meant to + be extensible, simple, and independent of access protocol, mail + architecture, and operating system. It is suitable for running on a + mail server where users may not be allowed to execute arbitrary + programs, such as on black box Internet Message Access Protocol + [RFC3501] servers, as it does not have user-controlled loops or the + ability to run external programs. + + The "date" extension provides a new date test to extract and match + date/time information from structured header fields. The date test + is similar in concept to the address test specified in [RFC5228], + which performs similar operations on addresses in header fields. + + The "date" extension also provides a currentdate test that operates + on the date and time when the Sieve script is executed. + + Some header fields containing date/time information, e.g., Received:, + naturally occur more than once in a single header. 
In such cases it + is useful to be able to restrict the date test to some subset of the + fields that are present. For example, it may be useful to apply a + date test to the last (earliest) Received: field. Additionally, it + may also be useful to apply similar restrictions to either the header + or address tests specified in [RFC5228]. + + For this reason, this specification also defines an "index" + extension. This extension adds two additional tagged arguments + :index and :last to the header, address, and date tests. If present, + these arguments specify which occurrence of the named header field is + to be tested. + +2. Conventions Used in This Document + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in RFC 2119 [RFC2119]. + + The terms used to describe the various components of the Sieve + language are taken from Section 1.1 of [RFC5228]. Section 2 of the + same document describes basic Sieve language syntax and semantics. + The date-time syntactic element defined using ABNF notation [RFC5234] + in [RFC3339] is also used here. + + + + + + +Freed Standards Track [Page 2] + +RFC 5260 Sieve Date and Index Extensions July 2008 + + +3. Capability Identifiers + + The capability strings associated with the two extensions defined in + this document are "date" and "index". + +4. Date Test + + Usage: date [<":zone" <time-zone: string>> / ":originalzone"] + [COMPARATOR] [MATCH-TYPE] <header-name: string> + <date-part: string> <key-list: string-list> + + The date test matches date/time information derived from headers + containing [RFC2822] date-time values. The date/time information is + extracted from the header, shifted to the specified time zone, and + the value of the given date-part is determined. 
The test returns + true if the resulting string matches any of the strings specified in + the key-list, as controlled by the comparator and match keywords. + The date test returns false unconditionally if the specified header + field does not exist, the field exists but does not contain a + syntactically valid date-time specification, the date-time isn't + valid according to the rules of the calendar system (e.g., January + 32nd, February 29 in a non-leap year), or the resulting string fails + to match any key-list value. + + The type of match defaults to ":is" and the default comparator is + "i;ascii-casemap". + + Unlike the header and address tests, the date test can only be + applied to a single header field at a time. If multiple header + fields with the same name are present, only the first field that is + found is used. (Note, however, that this behavior can be modified + with the "index" extension defined below.) These restrictions + simplify the test and keep the meaning clear. + + The "relational" extension [RFC5231] adds a match type called + ":count". The count of a date test is 1 if the specified field + exists and contains a valid date; 0, otherwise. + + Implementations MUST support extraction of RFC 2822 date-time + information that either makes up the entire header field (e.g., as it + does in a standard Date: header field) or appears at the end of a + header field following a semicolon (e.g., as it does in a standard + Received: header field). Implementations MAY support extraction of + date and time information in RFC2822 or other formats that appears in + other positions in header field content. In the case of a field + containing more than one date or time value, the last one that + appears SHOULD be used. + + + + +Freed Standards Track [Page 3] + +RFC 5260 Sieve Date and Index Extensions July 2008 + + +4.1. 
Zone and Originalzone Arguments + + The :originalzone argument specifies that the time zone offset + originally in the extracted date-time value should be retained. The + :zone argument specifies a specific time zone offset that the date- + time value is to be shifted to prior to testing. It is an error to + specify both :zone and :originalzone. + + The value of time-zone MUST be an offset relative to UTC with the + following syntax: + + time-zone = ( "+" / "-" ) 4DIGIT + + The "+" or "-" indicates whether the time-of-day is ahead of (i.e., + east of) or behind (i.e., west of) UTC. The first two digits + indicate the number of hours difference from Universal Time, and the + last two digits indicate the number of minutes difference from + Universal Time. Note that this agrees with the RFC 2822 format for + time zone offsets, not the ISO 8601 format. + + If both the :zone and :originalzone arguments are omitted, the local + time zone MUST be used. + +4.2. Date-part Argument + + The date-part argument specifies a particular part of the resulting + date/time value to match against the key-list. Possible case- + insensitive values are: + + "year" => the year, "0000" .. "9999". + "month" => the month, "01" .. "12". + "day" => the day, "01" .. "31". + "date" => the date in "yyyy-mm-dd" format. + "julian" => the Modified Julian Day, that is, the date + expressed as an integer number of days since + 00:00 UTC on November 17, 1858 (using the Gregorian + calendar). This corresponds to the regular + Julian Day minus 2400000.5. Sample routines to + convert to and from modified Julian dates are + given in Appendix A. + "hour" => the hour, "00" .. "23". + "minute" => the minute, "00" .. "59". + "second" => the second, "00" .. "60". + "time" => the time in "hh:mm:ss" format. + "iso8601" => the date and time in restricted ISO 8601 format. + "std11" => the date and time in a format appropriate + for use in a Date: header field [RFC2822]. 
+ + + + +Freed Standards Track [Page 4] + +RFC 5260 Sieve Date and Index Extensions July 2008 + + + "zone" => the time zone in use. If the user specified a + time zone with ":zone", "zone" will + contain that value. If :originalzone is specified + this value will be the original zone specified + in the date-time value. If neither argument is + specified the value will be the server's default + time zone in offset format "+hhmm" or "-hhmm". An + offset of 0 (Zulu) always has a positive sign. + "weekday" => the day of the week expressed as an integer between + "0" and "6". "0" is Sunday, "1" is Monday, etc. + + The restricted ISO 8601 format is specified by the date-time ABNF + production given in [RFC3339], Section 5.6, with the added + restrictions that the letters "T" and "Z" MUST be in upper case, and + a time zone offset of zero MUST be represented by "Z" and not + "+00:00". + +4.3. Comparator Interactions with Date-part Arguments + + Not all comparators are suitable with all date-part arguments. In + general, the date-parts can be compared and tested for equality with + either "i;ascii-casemap" (the default) or "i;octet", but there are + two exceptions: + + julian This is an integer, and may or may not have leading zeros. + As such, "i;ascii-numeric" is almost certainly the best + comparator to use with it. + + std11 This is provided as a means to obtain date/time values in a + format appropriate for inclusion in email header fields. The + wide range of possible syntaxes for a std11 date/time -- + which implementations of this extension are free to use when + composing a std11 string -- makes this format a poor choice + for comparisons. Nevertheless, if a comparison must be + performed, this is case-insensitive, and therefore "i;ascii- + casemap" needs to be used. 
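The "julian" exception above can be illustrated with a small C sketch. It is not part of RFC 5260: cmp_octet and cmp_numeric are hypothetical helpers that only approximate "i;octet" and "i;ascii-numeric" (the latter following the leading-digit rule of RFC 4790), and they assume NUL-terminated ASCII input.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Octet-wise ordering, standing in for "i;octet" (hypothetical helper). */
static int cmp_octet(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/*
 * Simplified numeric ordering in the spirit of "i;ascii-numeric"
 * (hypothetical helper).  Only the leading run of digits is considered;
 * a string that does not start with a digit sorts as "infinity".
 * Digit runs are compared by length and then by content, so no
 * fixed-size integer can overflow.
 */
static int cmp_numeric(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    int a_inf = !isdigit((unsigned char)*a);
    int b_inf = !isdigit((unsigned char)*b);
    if (a_inf || b_inf)
        return a_inf - b_inf;

    /* Leading zeros do not change the numeric value. */
    while (*a == '0') a++;
    while (*b == '0') b++;

    size_t la = strspn(a, "0123456789");
    size_t lb = strspn(b, "0123456789");
    if (la != lb)
        return la < lb ? -1 : 1;

    int r = strncmp(a, b, la);
    return (r > 0) - (r < 0);
}
```

With these helpers, cmp_octet("54000", "054001") is positive while cmp_numeric("54000", "054001") is negative, which is why a julian value that may carry leading zeros is better compared numerically.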
+ + "year", "month", "day", "hour", "minute", "second" and "weekday" all + use fixed-width string representations of integers, and can therefore + be compared with "i;octet", "i;ascii-casemap", and "i;ascii-numeric" + with equivalent results. + + "date" and "time" also use fixed-width string representations of + integers, and can therefore be compared with "i;octet" and "i;ascii- + casemap"; however, "i;ascii-numeric" can't be used with it, as + "i;ascii-numeric" doesn't allow for non-digit characters. + + + + + +Freed Standards Track [Page 5] + +RFC 5260 Sieve Date and Index Extensions July 2008 + + +4.4. Examples + + The Date: field can be checked to test when the sender claims to have + created the message and act accordingly: + + require ["date", "relational", "fileinto"]; + if allof(header :is "from" "boss@example.com", + date :value "ge" :originalzone "date" "hour" "09", + date :value "lt" :originalzone "date" "hour" "17") + { fileinto "urgent"; } + + Testing the initial Received: field can provide an indication of when + a message was actually received by the local system: + + require ["date", "relational", "fileinto"]; + if anyof(date :is "received" "weekday" "0", + date :is "received" "weekday" "6") + { fileinto "weekend"; } + +5. Currentdate Test + + Usage: currentdate [":zone" <time-zone: string>] + [COMPARATOR] [MATCH-TYPE] + <date-part: string> + <key-list: string-list> + + The currentdate test is similar to the date test, except that it + operates on the current date/time rather than a value extracted from + the message header. In particular, the ":zone" and date-part + arguments are the same as those in the date test. + + All currentdate tests in a single Sieve script MUST refer to the same + point in time during execution of the script. + + The :count value of a currentdate test is always 1. + +5.1. Examples + + The simplest use of currentdate is to have an action that only + operates at certain times. 
For example, a user might want to have + messages redirected to their pager after business hours and on + weekends: + + + + + + + + + +Freed Standards Track [Page 6] + +RFC 5260 Sieve Date and Index Extensions July 2008 + + + require ["date", "relational"]; + if anyof(currentdate :is "weekday" "0", + currentdate :is "weekday" "6", + currentdate :value "lt" "hour" "09", + currentdate :value "ge" "hour" "17") + { redirect "pager@example.com"; } + + Currentdate can be used to set up vacation [RFC5230] responses in + advance and to stop response generation automatically: + + require ["date", "relational", "vacation"]; + if allof(currentdate :value "ge" "date" "2007-06-30", + currentdate :value "le" "date" "2007-07-07") + { vacation :days 7 "I'm away during the first week in July."; } + + Currentdate may also be used in conjunction with the variables + extension to pass time-dependent arguments to other tests and + actions. The following Sieve places messages in a folder named + according to the current month and year: + + require ["date", "variables", "fileinto"]; + if currentdate :matches "month" "*" { set "month" "${1}"; } + if currentdate :matches "year" "*" { set "year" "${1}"; } + fileinto "${month}-${year}"; + + Finally, currentdate can be used in conjunction with the editheader + extension to insert a header-field containing date/time information: + + require ["variables", "date", "editheader"]; + if currentdate :matches "std11" "*" + {addheader "Processing-date" "${0}";} + +6. 
Index Extension + + The "index" extension, if specified, adds optional :index and :last + arguments to the header, address, and date tests as follows: + + Syntax: date [":index" <fieldno: number> [":last"]] + [<":zone" <time-zone: string>> / ":originalzone"] + [COMPARATOR] [MATCH-TYPE] <header-name: string> + <date-part: string> <key-list: string-list> + + + Syntax: header [":index" <fieldno: number> [":last"]] + [COMPARATOR] [MATCH-TYPE] + <header-names: string-list> <key-list: string-list> + + + + + +Freed Standards Track [Page 7] + +RFC 5260 Sieve Date and Index Extensions July 2008 + + + Syntax: address [":index" <fieldno: number> [":last"]] + [ADDRESS-PART] [COMPARATOR] [MATCH-TYPE] + <header-list: string-list> <key-list: string-list> + + If :index <fieldno> is specified, the attempts to match a value are + limited to the header field fieldno (beginning at 1, the first named + header field). If :last is also specified, the count is backwards; 1 + denotes the last named header field, 2 the second to last, and so on. + Specifying :last without :index is an error. + + :index only counts separate header fields, not multiple occurrences + within a single field. In particular, :index cannot be used to test + a specific address in an address list contained within a single + header field. + + Both header and address allow the specification of more than one + header field name. If more than one header field name is specified, + all the named header fields are counted in the order specified by the + header-list. + +6.1. Example + + Mail delivery may involve multiple hops, resulting in the Received: + field containing information about when a message first entered the + local administrative domain being the second or subsequent field in + the message. 
As long as the field offset is consistent, it can be
+   tested:
+
+     # Implement the Internet-Draft cutoff date check assuming the
+     # second Received: field specifies when the message first
+     # entered the local email infrastructure.
+     require ["date", "relational", "index"];
+     if date :value "gt" :index 2 :zone "-0500" "received"
+             "iso8601" "2007-02-26T09:00:00-05:00"
+         { redirect "aftercutoff@example.org"; }
+
+7.  Security Considerations
+
+   The facilities defined here, like the facilities in the base Sieve
+   specification, operate on message header information that can easily
+   be forged.  Note, however, that some fields are inherently more
+   reliable than others.  For example, the Date: field is typically
+   inserted by the message sender and can be altered at any point.  By
+   contrast, the uppermost Received: field is typically inserted by the
+   local mail system and is therefore difficult for the sender or an
+   intermediary to falsify.
+
+   Use of the currentdate test makes script behavior inherently less
+   predictable and harder to analyze.  This may have consequences for
+   systems that use script analysis to try and spot problematic scripts.
+
+   All of the security considerations given in the base Sieve
+   specification also apply to these extensions.
+
+8.  IANA Considerations
+
+   The following templates specify the IANA registrations of the two
+   Sieve extensions specified in this document:
+
+   To: iana@iana.org
+   Subject: Registration of new Sieve extensions
+
+   Capability name: date
+   Description:     The "date" extension gives Sieve the ability
+                    to test date and time values.
+   RFC number:      RFC 5260
+   Contact address: Sieve discussion list <ietf-mta-filters@imc.org>
+
+   Capability name: index
+   Description:     The "index" extension provides a means to
+                    limit header and address tests to specific
+                    instances when more than one field of a
+                    given type is present.
+ RFC number: RFC 5260 + Contact address: Sieve discussion list <ietf-mta-filters@imc.org> + +9. References + +9.1. Normative References + + [CALGO199] Tantzen, R., "Algorithm 199: Conversions Between Calendar + Date and Julian Day Number", Collected Algorithms from + CACM 199. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC2822] Resnick, P., "Internet Message Format", RFC 2822, + April 2001. + + [RFC3339] Klyne, G., Ed. and C. Newman, "Date and Time on the + Internet: Timestamps", RFC 3339, July 2002. + + [RFC5228] Guenther, P. and T. Showalter, "Sieve: An Email Filtering + Language", RFC 5228, January 2008. + + + +Freed Standards Track [Page 9] + +RFC 5260 Sieve Date and Index Extensions July 2008 + + + [RFC5231] Segmuller, W. and B. Leiba, "Sieve Email Filtering: + Relational Extension", RFC 5231, January 2008. + + [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", STD 68, RFC 5234, January 2008. + +9.2. Informative References + + [RFC3501] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION + 4rev1", RFC 3501, March 2003. + + [RFC5230] Showalter, T. and N. Freed, "Sieve Email Filtering: + Vacation Extension", RFC 5230, January 2008. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Freed Standards Track [Page 10] + +RFC 5260 Sieve Date and Index Extensions July 2008 + + +Appendix A. Julian Date Conversions + + The following C routines show how to translate day/month/year + information to and from modified Julian dates. These routines are + straightforward translations of the Algol routines specified in CACM + Algorithm 199 [CALGO199]. + + Given the day, month, and year, jday returns the modified Julian + date. 
+
+      int jday(int year, int month, int day)
+      {
+          int j, c, ya;
+
+          if (month > 2)
+              month -= 3;
+          else
+          {
+              month += 9;
+              year--;
+          }
+          c = year / 100;
+          ya = year - c * 100;
+          return (c * 146097 / 4 + ya * 1461 / 4 + (month * 153 + 2) / 5 +
+                  day + 1721119);
+      }
+
+   Given j, the modified Julian date, jdate returns the day, month, and
+   year.
+
+      void jdate(int j, int *year, int *month, int *day)
+      {
+          int y, m, d;
+
+          j -= 1721119;
+          y = (j * 4 - 1) / 146097;
+          j = j * 4 - y * 146097 - 1;
+          d = j / 4;
+          j = (d * 4 + 3) / 1461;
+          d = d * 4 - j * 1461 + 3;
+          d = (d + 4) / 4;
+          m = (d * 5 - 3) / 153;
+          d = d * 5 - m * 153 - 3;
+          *day = (d + 5) / 5;
+          *year = y * 100 + j;
+          if (m < 10)
+              *month = m + 3;
+          else
+          {
+              *month = m - 9;
+              *year += 1;
+          }
+      }
+
+Appendix B.  Acknowledgements
+
+   Dave Cridland contributed the text describing the proper comparators
+   to use with different date-parts.  Cyrus Daboo, Frank Ellerman,
+   Alexey Melnikov, Chris Newman, Dilyan Palauzov, and Aaron Stone
+   provided helpful suggestions and corrections.
+
+Author's Address
+
+   Ned Freed
+   Sun Microsystems
+   800 Royal Oaks
+   Monrovia, CA 91016-6347
+   USA
+
+   Phone: +1 909 457 4293
+   EMail: ned.freed@mrochek.com
+
+Full Copyright Statement
+
+   Copyright (C) The IETF Trust (2008).
+
+   This document is subject to the rights, licenses and restrictions
+   contained in BCP 78, and except as set forth therein, the authors
+   retain all their rights.
+ + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND + THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS + OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF + THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. + + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary + rights that may cover technology that may be required to implement + this standard. Please address the information to the IETF at + ietf-ipr@ietf.org. + + + + + + + + + + + + +Freed Standards Track [Page 13] + diff --git a/rfc/sieve/rfc5437.txt b/rfc/sieve/rfc5437.txt @@ -0,0 +1,787 @@ + + + + + + +Network Working Group P. Saint-Andre +Request for Comments: 5437 Cisco +Category: Standards Track A. 
Melnikov + Isode Limited + January 2009 + + + Sieve Notification Mechanism: + Extensible Messaging and Presence Protocol (XMPP) + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (c) 2009 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (http://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. + +Abstract + + This document describes a profile of the Sieve extension for + notifications, to allow notifications to be sent over the Extensible + Messaging and Presence Protocol (XMPP), also known as Jabber. + + + + + + + + + + + + + + + + + +Saint-Andre & Melnikov Standards Track [Page 1] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + +Table of Contents + + 1. Introduction ....................................................3 + 1.1. Overview ...................................................3 + 1.2. Terminology ................................................3 + 2. Definition ......................................................3 + 2.1. Notify Parameter "method" ..................................3 + 2.2. Test notify_method_capability ..............................3 + 2.3. Notify Tag ":from" .........................................4 + 2.4. Notify Tag ":importance" ...................................4 + 2.5. Notify Tag ":message" ......................................4 + 2.6. 
Notify Tag ":options" ......................................4 + 2.7. XMPP Syntax ................................................4 + 3. Examples ........................................................6 + 3.1. Basic Action ...............................................6 + 3.2. Action with "body" .........................................7 + 3.3. Action with "body", ":importance", ":message", and + "subject" ..................................................7 + 3.4. Action with ":from", ":message", ":importance", + "body", and "subject" ......................................8 + 4. Requirements Conformance ........................................9 + 5. Internationalization Considerations ............................10 + 6. Security Considerations ........................................11 + 7. IANA Considerations ............................................12 + 8. References .....................................................12 + 8.1. Normative References ......................................12 + 8.2. Informative References ....................................13 + + + + + + + + + + + + + + + + + + + + + + + + +Saint-Andre & Melnikov Standards Track [Page 2] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + +1. Introduction + +1.1. Overview + + The [NOTIFY] extension to the [SIEVE] mail filtering language is a + framework for providing notifications by employing URIs to specify + the notification mechanism. This document defines how xmpp URIs (see + [XMPP-URI]) are used to generate notifications via the Extensible + Messaging and Presence Protocol [XMPP], which is widely implemented + in Jabber instant messaging technologies. + +1.2. Terminology + + This document inherits terminology from [NOTIFY], [SIEVE], and + [XMPP]. In particular, the terms "parameter" and "tag" are used as + described in [NOTIFY] to refer to aspects of Sieve scripts, and the + term "key" is used as described in [XMPP-URI] to refer to aspects of + an XMPP URI. 
+ + The capitalized key words "MUST", "MUST NOT", "REQUIRED", "SHALL", + "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT + RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be + interpreted as described in [TERMS]. + +2. Definition + +2.1. Notify Parameter "method" + + The "method" parameter MUST be a URI that conforms to the xmpp URI + scheme (as specified in [XMPP-URI]) and that identifies an XMPP + account associated with the email inbox. The URI MAY include the + resource identifier of an XMPP address and/or the query component + portion of an XMPP URI, but SHOULD NOT include an authority component + or fragment identifier component. The processing application MUST + extract an XMPP address from the URI in accordance with the + processing rules specified in [XMPP-URI]. The resulting XMPP address + MUST be encapsulated in XMPP syntax as the value of the XMPP 'to' + attribute. + +2.2. Test notify_method_capability + + In response to a notify_method_capability test for the "online" + notification-capability, an implementation SHOULD return a value of + "yes" if it has knowledge of an active presence session (see + [XMPP-IM]) for the specified XMPP notification-uri; otherwise, it + SHOULD return a value of "maybe" (since typical XMPP systems may not + allow a Sieve engine to gain knowledge about the presence of XMPP + entities). + + + +Saint-Andre & Melnikov Standards Track [Page 3] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + +2.3. Notify Tag ":from" + + If included, the ":from" tag MUST be an electronic address that + conforms to the "Mailbox" rule defined in [RFC5321]. The value of + the ":from" tag MAY be included in the human-readable XML character + data of the XMPP notification; alternatively or in addition, it MAY + be transformed into formal XMPP syntax, in which case it MUST be + encapsulated as the value of an XMPP SHIM (Stanza Headers and + Internet Metadata) [SHIM] header named "Resent-From". + +2.4. 
Notify Tag ":importance" + + The ":importance" tag has no special meaning for this notification + mechanism, and this specification puts no restriction on its use. + The value of the ":importance" tag MAY be transformed into XMPP + syntax (in addition to or instead of including appropriate text in + the XML character data of the XMPP <body/> element); if so, it SHOULD + be encapsulated as the value of an XMPP SHIM (Stanza Headers and + Internet Metadata) [SHIM] header named "Urgency", where the XML + character of that header is "high" if the value of the ":importance" + tag is "1", "medium" if the value of the ":importance" tag is "2", + and "low" if the value of the ":importance" tag is "3". + +2.5. Notify Tag ":message" + + If the ":message" tag is included, that string MUST be transformed + into the XML character data of an XMPP <body/> element (where the + string is generated according to the guidelines specified in Section + 3.6 of [NOTIFY]). + +2.6. Notify Tag ":options" + + The ":options" tag has no special meaning for this notification + mechanism. Any handling of this tag is the responsibility of an + implementation. + +2.7. XMPP Syntax + + The xmpp mechanism results in the sending of an XMPP message to + notify a recipient about an email message. The general XMPP syntax + is as follows: + + o The notification MUST be an XMPP <message/> stanza. + + + + + + + + +Saint-Andre & Melnikov Standards Track [Page 4] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + + o The value of the XMPP 'from' attribute SHOULD be the XMPP address + of the notification service associated with the Sieve engine or + the XMPP address of the entity to be notified. The value of the + XMPP 'from' attribute MUST NOT be generated from the Sieve ":from" + tag. + + o The value of the XMPP 'to' attribute MUST be the XMPP address + specified in the XMPP URI contained in the "method" notify + parameter. + + o The value of the XMPP 'type' attribute MUST be 'headline' or + 'normal'. 
+ + o The XMPP <message/> stanza MUST include a <body/> child element. + If the ":message" tag is included in the Sieve script, that string + MUST be used as the XML character data of the <body/> element. If + not and if the XMPP URI contained in the "method" notify parameter + specified a "body" key in the query component, that value SHOULD + be used. Otherwise, the XML character data SHOULD be some + configurable text indicating that the message is a Sieve + notification. + + o The XMPP <message/> stanza MAY include a <subject/> child element. + If the XMPP URI contained in the "method" notify parameter + specified a "subject" key in the query component, that value + SHOULD be used as the XML character data of the <subject/> + element. Otherwise, the XML character data SHOULD be some + configurable text indicating that the message is a Sieve + notification. + + o The XMPP <message/> stanza SHOULD include a URI, for the recipient + to use as a hint in locating the message, encapsulated as the XML + character data of a <url/> child element of an <x/> element + qualified by the 'jabber:x:oob' namespace, as specified in [OOB]. + If included, the URI SHOULD be an Internet Message Access Protocol + [IMAP] URL that specifies the location of the message, as defined + in [IMAP-URL], but MAY be another URI type that can specify or + hint at the location of an email message, such as a URI for an + HTTP resource [HTTP] or a Post Office Protocol Version 3 (POP3) + mailbox [POP-URL] at which the message can be accessed. It is not + expected that an XMPP user agent shall directly handle such a URI, + but instead that it shall invoke an appropriate helper application + to handle the URI. + + o The XMPP <message/> stanza MAY include an XMPP SHIM (Stanza + Headers and Internet Metadata) [SHIM] header named "Resent-From". 
+ If the Sieve script included a ":from" tag, the "Resent-From" + + + + +Saint-Andre & Melnikov Standards Track [Page 5] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + + value MUST be the value of the ":from" tag; otherwise, the + "Resent-From" value SHOULD be the envelope recipient address of + the original email message that triggered the notification. + +3. Examples + + In the following examples, the sender of the email has an address of + <mailto:juliet@example.org>, the entity to be notified has an email + address of <mailto:romeo@example.com> and an XMPP address of + romeo@im.example.com (resulting in an XMPP URI of + <xmpp:romeo@im.example.com>), and the notification service associated + with the Sieve engine has an XMPP address of notify.example.com. + + Note: In the following examples, line breaks are included in XMPP + URIs solely for the purpose of readability. + +3.1. Basic Action + + The following is a basic Sieve notify action with only a method. The + XML character data of the XMPP <body/> and <subject/> elements are + therefore generated by the Sieve engine based on configuration. In + addition, the Sieve engine includes a URI pointing to the message. + + Basic action (Sieve syntax) + + notify "xmpp:romeo@im.example.com" + + The resulting XMPP <message/> stanza might be as follows: + + Basic action (XMPP syntax) + + <message from='notify.example.com' + to='romeo@im.example.com' + type='headline' + xml:lang='en'> + <subject>SIEVE</subject> + <body>&lt;juliet@example.com&gt; You got mail.</body> + <x xmlns='jabber:x:oob'> + <url> + imap://romeo@example.com/INBOX;UIDVALIDITY=385759043/;UID=18 + </url> + </x> + </message> + + + + + + + + +Saint-Andre & Melnikov Standards Track [Page 6] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + +3.2. Action with "body" + + The following action contains a "body" key in the query component of + the XMPP URI but no ":message" tag in the Sieve script. 
As a result, + the XML character data of the XMPP <body/> element in the XMPP + notification is taken from the XMPP URI. In addition, the Sieve + engine includes a URI pointing to the message. + + Action with "body" (Sieve syntax) + + notify "xmpp:romeo@im.example.com?message + ;body=Wherefore%20art%20thou%3F" + + The resulting XMPP <message/> stanza might be as follows. + + Action with "body" (XMPP syntax) + + <message from='notify.example.com' + to='romeo@im.example.com' + type='headline' + xml:lang='en'> + <subject>SIEVE</subject> + <body>Wherefore art thou?</body> + <x xmlns='jabber:x:oob'> + <url> + imap://romeo@example.com/INBOX;UIDVALIDITY=385759044/;UID=19 + </url> + </x> + </message> + +3.3. Action with "body", ":importance", ":message", and "subject" + + The following action specifies an ":importance" tag and a ":message" + tag in the Sieve script, as well as a "body" key and a "subject" key + in the query component of the XMPP URI. As a result, the ":message" + tag from the Sieve script overrides the "body" key from the XMPP URI + when generating the XML character data of the XMPP <body/> element. + In addition, the Sieve engine includes a URI pointing to the message. + + Action with "body", ":importance", ":message", and "subject" (Sieve + syntax) + + notify :importance "1" + :message "Contact Juliet immediately!" + "xmpp:romeo@im.example.com?message + ;body=You%27re%20in%20trouble + ;subject=ALERT%21" + + + + +Saint-Andre & Melnikov Standards Track [Page 7] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + + The resulting XMPP <message/> stanza might be as follows. 
+ + Action with "body", ":importance", ":message", and "subject" (XMPP + syntax) + + <message from='notify.example.com' + to='romeo@im.example.com' + type='headline' + xml:lang='en'> + <subject>ALERT!</subject> + <body>Contact Juliet immediately!</body> + <headers xmlns='http://jabber.org/protocol/shim'> + <header name='Urgency'>high</header> + </headers> + <x xmlns='jabber:x:oob'> + <url> + imap://romeo@example.com/INBOX;UIDVALIDITY=385759045/;UID=20 + </url> + </x> + </message> + +3.4. Action with ":from", ":message", ":importance", "body", and + "subject" + + The following action specifies a ":from" tag, an ":importance" tag, + and a ":message" tag in the Sieve script, as well as a "body" key and + a "subject" key in the query component of the XMPP URI. As a result, + the ":message" tag from the Sieve script overrides the "body" key + from the XMPP URI when generating the XML character data of the XMPP + <body/> element. In addition, the Sieve engine includes a URI + pointing to the message, as well as an XMPP SHIM (Stanza Headers and + Internet Metadata) [SHIM] header named "Resent-From" (which + encapsulates the value of the ":from" tag). + + Action with ":from", ":importance", ":message", "body", and "subject" + (Sieve syntax) + + notify :from "romeo.my.romeo@example.com" + :importance "1" + :message "Contact Juliet immediately!" + "xmpp:romeo@im.example.com?message + ;body=You%27re%20in%20trouble + ;subject=ALERT%21" + + The resulting XMPP <message/> stanza might be as follows. 
+ + + + + + +Saint-Andre & Melnikov Standards Track [Page 8] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + + Action with ":from", ":importance", ":message", "body", and "subject" + (XMPP syntax) + + <message from='notify.example.com' + to='romeo@im.example.com' + type='headline' + xml:lang='en'> + <subject>ALERT!</subject> + <body>Contact Juliet immediately!</body> + <headers xmlns='http://jabber.org/protocol/shim'> + <header name='Resent-From'>romeo.my.romeo@example.com</header> + <header name='Urgency'>high</header> + </headers> + <x xmlns='jabber:x:oob'> + <url> + imap://romeo@example.com/INBOX;UIDVALIDITY=385759045/;UID=21 + </url> + </x> + </message> + +4. Requirements Conformance + + Section 3.8 of [NOTIFY] specifies a set of requirements for Sieve + notification methods. The conformance of the xmpp notification + mechanism is provided here. + + 1. An implementation of the xmpp notification method SHOULD NOT + modify the final notification text (e.g., to limit the length); + however, a given deployment MAY do so (e.g., if recipients pay + per character or byte for XMPP messages). Modification of + characters themselves should not be necessary, since XMPP + character data is encoded in [UTF-8]. + + 2. An implementation MAY ignore parameters specified in the + ":from", ":importance", and ":options" tags. + + 3. There is no recommended default message for an implementation to + include if the ":message" tag is not specified. + + 4. A notification sent via the xmpp notification method MAY include + a timestamp in the textual message. + + 5. The value of the XMPP 'from' attribute MUST be the XMPP address + of the notification service associated with the Sieve engine. + The value of the Sieve ":from" tag MAY be transformed into the + value of an XMPP SHIM (Stanza Headers and Internet Metadata) + [SHIM] header named "Resent-From". + + + + +Saint-Andre & Melnikov Standards Track [Page 9] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + + 6. 
The value of the XMPP 'to' attribute MUST be the XMPP address + specified in the XMPP URI contained in the "method" parameter. + + 7. In accordance with [XMPP-URI], an implementation MUST ignore any + URI action or key it does not understand (i.e., the URI MUST be + processed as if the action or key were not present). It is + RECOMMENDED to support the XMPP "message" query type (see + [QUERIES]) and the associated "body" and "subject" keys, which + SHOULD be mapped to the XMPP <body/> and <subject/> child + elements of the XMPP <message/> stanza, respectively. However, + if included, then the Sieve notify ":message" tag MUST be mapped + to the XMPP <body/> element, overriding the "body" key (if any) + included in the XMPP URI. + + 8. An implementation MUST NOT include any other extraneous + information not specified in parameters to the notify action. + + 9. In response to a notify_method_capability test for the "online" + notification-capability, an implementation SHOULD return a value + of "yes" if it has knowledge of an active presence session (see + [XMPP-IM]) for the specified XMPP notification-uri, but only if + the entity that requested the test is authorized to know the + presence of the associated XMPP entity (e.g., via explicit + presence subscription as specified in [XMPP-IM]); otherwise, it + SHOULD return a value of "maybe" (since typical XMPP systems may + not allow a Sieve engine to gain knowledge about the presence of + XMPP entities). + + 10. An implementation SHOULD NOT attempt to retry delivery of a + notification if it receives an XMPP error of type "auth" or + "cancel", MAY attempt to retry delivery if it receives an XMPP + error of type "wait", and MAY attempt to retry delivery if it + receives an XMPP error of "modify", but only if it makes + appropriate modifications to the notification (see [XMPP]); in + any case, the number of retries SHOULD be limited to a + configurable number no less than 3 and no more than 10. 
An + implementation MAY throttle notifications if the number of + notifications within a given time period becomes excessive + according to local service policy. Duplicate suppression (if + any) is a matter of implementation and is not specified herein. + +5. Internationalization Considerations + + Although an XMPP address may contain nearly any [UNICODE] character, + the value of the "method" parameter MUST be a Uniform Resource + Identifier (see [URI]) rather than an Internationalized Resource + Identifier (see [IRI]). The rules specified in [XMPP-URI] MUST be + followed when generating XMPP URIs. + + + +Saint-Andre & Melnikov Standards Track [Page 10] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + + In accordance with Section 13 of RFC 3920, all data sent over XMPP + MUST be encoded in [UTF-8]. + +6. Security Considerations + + Depending on the information included, sending a notification can be + comparable to forwarding mail to the notification recipient. Care + must be taken when forwarding mail automatically, to ensure that + confidential information is not sent into an insecure environment. + In particular, implementations MUST conform to the security + considerations given in [NOTIFY], [SIEVE], and [XMPP]. + + [NOTIFY] specifies that a notification method MUST provide mechanisms + for avoiding notification loops. One type of notification loop can + be caused by message forwarding; however, such loops are prevented + because XMPP does not support the forwarding of messages from one + XMPP address to another. Another type of notification loop can be + caused by auto-replies to XMPP messages received by the XMPP + notification service associated with the Sieve engine; therefore, + such a service MUST NOT auto-reply to XMPP messages it receives. 
+ + A common use case might be for a user to create a script that enables + the Sieve engine to act differently if the user is currently + available at a particular type of service (e.g., send notifications + to the user's XMPP address if the user has an active session at an + XMPP service). Whether the user is currently available can be + determined by means of a notify_method_capability test for the + "online" notification-capability. In XMPP, information about current + network availability is called "presence" (see also [MODEL]). Since + [XMPP-IM] requires that a user must approve a presence subscription + before an entity can gain access to the user's presence information, + a limited but reasonably safe implementation might be for the Sieve + engine to request a subscription to the user's presence. The user + would then need to approve that subscription request so that the + Sieve engine can act appropriately depending on whether the user is + online or offline. However, the Sieve engine MUST NOT use the user's + presence information when processing scripts on behalf of a script + owner other than the user, unless the Sieve engine has explicit + knowledge (e.g., via integration with an XMPP server's presence + authorization rules) that the script owner is authorized to know the + user's presence. While it would be possible to design a more + advanced approach to the delegation of presence authorization, any + such approach is left to future standards work. + + + + + + + + +Saint-Andre & Melnikov Standards Track [Page 11] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + +7. 
IANA Considerations + + The following template provides the IANA registration of the Sieve + notification mechanism specified in this document: + + To: iana@iana.org + Subject: Registration of new Sieve notification mechanism + Mechanism name: xmpp + Mechanism URI: RFC 5122 [XMPP-URI] + Mechanism-specific options: none + Permanent and readily available reference: RFC 5437 + Person and email address to contact for further information: + Peter Saint-Andre <registrar@xmpp.org> + + This information has been added to the list of Sieve notification + mechanisms maintained at <http://www.iana.org>. + +8. References + +8.1. Normative References + + [NOTIFY] Melnikov, A., Ed., Leiba, B., Ed., Segmuller, W., and T. + Martin, "Sieve Email Filtering: Extension for + Notifications", RFC 5435, January 2009. + + [OOB] Saint-Andre, P., "Out of Band Data", XSF XEP 0066, + August 2006. + + [QUERIES] Saint-Andre, P., "XMPP URI Scheme Query Components", XSF + XEP 0147, September 2006. + + [RFC5321] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321, + October 2008. + + [SHIM] Saint-Andre, P. and J. Hildebrand, "Stanza Headers and + Internet Metadata", XSF XEP 0131, July 2006. + + [SIEVE] Guenther, P., Ed. and T. Showalter, Ed., "Sieve: An Email + Filtering Language", RFC 5228, January 2008. + + [TERMS] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [XMPP-URI] Saint-Andre, P., "Internationalized Resource Identifiers + (IRIs) and Uniform Resource Identifiers (URIs) for the + Extensible Messaging and Presence Protocol (XMPP)", + RFC 5122, February 2008. + + + + +Saint-Andre & Melnikov Standards Track [Page 12] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + +8.2. Informative References + + [HTTP] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., + Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext + Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999. 
+ + [IMAP] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION + 4rev1", RFC 3501, March 2003. + + [IMAP-URL] Melnikov, A. and C. Newman, "IMAP URL Scheme", RFC 5092, + November 2007. + + [IRI] Duerst, M. and M. Suignard, "Internationalized Resource + Identifiers (IRIs)", RFC 3987, January 2005. + + [MODEL] Day, M., Rosenberg, J., and H. Sugano, "A Model for + Presence and Instant Messaging", RFC 2778, February 2000. + + [POP-URL] Gellens, R., "POP URL Scheme", RFC 2384, August 1998. + + [UNICODE] The Unicode Consortium, "The Unicode Standard, Version + 3.2.0", 2000. + + The Unicode Standard, Version 3.2.0 is defined by The + Unicode Standard, Version 3.0 (Reading, MA, Addison- + Wesley, 2000. ISBN 0-201-61633-5), as amended by the + Unicode Standard Annex #27: Unicode 3.1 + (http://www.unicode.org/reports/tr27/) and by the Unicode + Standard Annex #28: Unicode 3.2 + (http://www.unicode.org/reports/tr28/). + + [URI] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform + Resource Identifier (URI): Generic Syntax", STD 66, + RFC 3986, January 2005. + + [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO + 10646", STD 63, RFC 3629, November 2003. + + [XMPP] Saint-Andre, P., "Extensible Messaging and Presence + Protocol (XMPP): Core", RFC 3920, October 2004. + + [XMPP-IM] Saint-Andre, P., "Extensible Messaging and Presence + Protocol (XMPP): Instant Messaging and Presence", + RFC 3921, October 2004. + + + + + + + +Saint-Andre & Melnikov Standards Track [Page 13] + +RFC 5437 Sieve Notify Method: XMPP January 2009 + + +Authors' Addresses + + Peter Saint-Andre + Cisco + + EMail: psaintan@cisco.com + + + Alexey Melnikov + Isode Limited + + EMail: Alexey.Melnikov@isode.com + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Saint-Andre & Melnikov Standards Track [Page 14] + diff --git a/rfc/sieve/rfc5804.txt b/rfc/sieve/rfc5804.txt @@ -0,0 +1,2747 @@ + + + + + + +Internet Engineering Task Force (IETF) A. Melnikov, Ed. 
+Request for Comments: 5804 Isode Limited +Category: Standards Track T. Martin +ISSN: 2070-1721 BeThereBeSquare, Inc. + July 2010 + + + A Protocol for Remotely Managing Sieve Scripts + +Abstract + + Sieve scripts allow users to filter incoming email. Message stores + are commonly sealed servers so users cannot log into them, yet users + must be able to update their scripts on them. This document + describes a protocol "ManageSieve" for securely managing Sieve + scripts on a remote server. This protocol allows a user to have + multiple scripts, and also alerts a user to syntactically flawed + scripts. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc5804. + +Copyright Notice + + Copyright (c) 2010 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + + + + +Melnikov & Martin Standards Track [Page 1] + +RFC 5804 ManageSieve July 2010 + + +Table of Contents + + 1. 
Introduction ....................................................3 + 1.1. Commands and Responses .....................................3 + 1.2. Syntax .....................................................3 + 1.3. Response Codes .............................................3 + 1.4. Active Script ..............................................6 + 1.5. Quotas .....................................................6 + 1.6. Script Names ...............................................6 + 1.7. Capabilities ...............................................7 + 1.8. Transport ..................................................9 + 1.9. Conventions Used in This Document .........................10 + 2. Commands .......................................................10 + 2.1. AUTHENTICATE Command ......................................11 + 2.1.1. Use of SASL PLAIN Mechanism over TLS ...............16 + 2.2. STARTTLS Command ..........................................16 + 2.2.1. Server Identity Check ..............................17 + 2.3. LOGOUT Command ............................................20 + 2.4. CAPABILITY Command ........................................20 + 2.5. HAVESPACE Command .........................................20 + 2.6. PUTSCRIPT Command .........................................21 + 2.7. LISTSCRIPTS Command .......................................23 + 2.8. SETACTIVE Command .........................................24 + 2.9. GETSCRIPT Command .........................................25 + 2.10. DELETESCRIPT Command .....................................25 + 2.11. RENAMESCRIPT Command .....................................26 + 2.12. CHECKSCRIPT Command ......................................27 + 2.13. NOOP Command .............................................28 + 2.14. Recommended Extensions ...................................28 + 2.14.1. UNAUTHENTICATE Command ............................28 + 3. Sieve URL Scheme ...............................................29 + 4. 
Formal Syntax ..................................................31 + 5. Security Considerations ........................................37 + 6. IANA Considerations ............................................38 + 6.1. ManageSieve Capability Registration Template ..............39 + 6.2. Registration of Initial ManageSieve Capabilities ..........39 + 6.3. ManageSieve Response Code Registration Template ...........41 + 6.4. Registration of Initial ManageSieve Response Codes ........41 + 7. Internationalization Considerations ............................46 + 8. Acknowledgements ...............................................46 + 9. References .....................................................47 + 9.1. Normative References ......................................47 + 9.2. Informative References ....................................48 + + + + + + + + +Melnikov & Martin Standards Track [Page 2] + +RFC 5804 ManageSieve July 2010 + + +1. Introduction + +1.1. Commands and Responses + + A ManageSieve connection consists of the establishment of a client/ + server network connection, an initial greeting from the server, and + client/server interactions. These client/server interactions consist + of a client command, server data, and a server completion result + response. + + All interactions transmitted by client and server are in the form of + lines, that is, strings that end with a CRLF. The protocol receiver + of a ManageSieve client or server is either reading a line or reading + a sequence of octets with a known count followed by a line. + +1.2. Syntax + + ManageSieve is a line-oriented protocol much like [IMAP] or [ACAP], + which runs over TCP. There are three data types: atoms, numbers and + strings. Strings may be quoted or literal. See [ACAP] for detailed + descriptions of these types. + + Each command consists of an atom (the command name) followed by zero + or more strings and numbers terminated by CRLF. 
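As an illustration of the command shape described above (an atom followed by strings and numbers, terminated by CRLF), a minimal sketch follows. The helper names `encode_string` and `format_command` are hypothetical. The backslash escaping and the 1024-octet cut-off for quoted strings are assumptions taken from the formal syntax in Section 4, which is not reproduced here. Strings that cannot be quoted are sent as client literals of the form `{<octets>+}`, as seen in the authentication examples.

```python
def encode_string(value: str) -> str:
    """Encode a str as a quoted string or, failing that, a client literal."""
    octets = value.encode("utf-8")
    # Assumption: quoted strings carry no CR/LF and at most 1024 octets.
    if "\r" in value or "\n" in value or len(octets) > 1024:
        # Literal: octet count, "+" marker, CRLF, then the raw data.
        return "{%d+}\r\n%s" % (len(octets), value)
    # Assumption: backslash and double quote are backslash-escaped.
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def format_command(name: str, *args) -> bytes:
    """Build one command line: atom, then strings/numbers, then CRLF."""
    parts = [name]
    for arg in args:
        # Numbers are sent as bare decimal digits, strings are encoded.
        parts.append(str(arg) if isinstance(arg, int) else encode_string(arg))
    return (" ".join(parts) + "\r\n").encode("utf-8")
```

For instance, `format_command("GETSCRIPT", "vacation")` yields the octets of `GETSCRIPT "vacation"` followed by CRLF.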
+ + All client queries are replied to with either an OK, NO, or BYE + response. Each response may be followed by a response code (see + Section 1.3) and by a string consisting of human-readable text in the + local language (as returned by the LANGUAGE capability; see + Section 1.7), encoded in UTF-8 [UTF-8]. The contents of the string + SHOULD be shown to the user, and implementations MUST NOT attempt to + parse the message for meaning. + + The BYE response SHOULD be used if the server wishes to close the + connection. A server may wish to do this because the client was idle + for too long or there were too many failed authentication attempts. + This response can be issued at any time and should be immediately + followed by a server hang-up of the connection. If a server has an + inactivity timeout resulting in client autologout, it MUST be no less + than 30 minutes after successful authentication. The inactivity + timeout MAY be less before authentication. + +1.3. Response Codes + + An OK, NO, or BYE response from the server MAY contain a response + code to describe the event in a more detailed machine-parsable + fashion. A response code consists of data inside parentheses in the + form of an atom, possibly followed by a space and arguments. + + + +Melnikov & Martin Standards Track [Page 3] + +RFC 5804 ManageSieve July 2010 + + + Response codes are defined when there is a specific action that a + client can take based upon the additional information. In order to + support future extension, the response code is represented as a + slash-separated (Solidus, %x2F) hierarchy with each level of + hierarchy representing increasing detail about the error. Response + codes MUST NOT start with the Solidus character. Clients MUST + tolerate additional hierarchical response code detail that they don't + understand. 
For example, if the client supports the "QUOTA" response + code, but doesn't understand the "QUOTA/MAXSCRIPTS" response code, it + should treat "QUOTA/MAXSCRIPTS" as "QUOTA". + + Client implementations MUST tolerate (ignore) response codes that + they do not recognize. + + The currently defined response codes are the following: + + AUTH-TOO-WEAK + + This response code is returned in the NO or BYE response from an + AUTHENTICATE command. It indicates that site security policy forbids + the use of the requested mechanism for the specified authentication + identity. + + ENCRYPT-NEEDED + + This response code is returned in the NO or BYE response from an + AUTHENTICATE command. It indicates that site security policy + requires the use of a strong encryption mechanism for the specified + authentication identity and mechanism. + + QUOTA + + If this response code is returned in the NO/BYE response, it means + that the command would have placed the user above the site-defined + quota constraints. If this response code is returned in the OK + response, it can mean that the user's storage is near its quota, or + it can mean that the account exceeded its quota but that the + condition is being allowed by the server (the server supports + so-called soft quotas). The QUOTA response code has two more + detailed variants: "QUOTA/MAXSCRIPTS" (the maximum number of per-user + scripts) and "QUOTA/MAXSIZE" (the maximum script size). + + REFERRAL + + This response code may be returned with a BYE result from any + command, and includes a mandatory parameter that indicates what + server to access to manage this user's Sieve scripts. The server + will be specified by a Sieve URL (see Section 3). The scriptname + + + +Melnikov & Martin Standards Track [Page 4] + +RFC 5804 ManageSieve July 2010 + + + portion of the URL MUST NOT be specified. The client should + authenticate to the specified server and use it for all further + commands in the current session. 
+ + SASL + + This response code can occur in the OK response to a successful + AUTHENTICATE command and includes the optional final server response + data from the server as specified by [SASL]. + + TRANSITION-NEEDED + + This response code occurs in a NO response of an AUTHENTICATE + command. It indicates that the user name is valid, but the entry in + the authentication database needs to be updated in order to permit + authentication with the specified mechanism. This is typically done + by establishing a secure channel using TLS, verifying server identity + as specified in Section 2.2.1, and finally authenticating once using + the [PLAIN] authentication mechanism. The selected mechanism SHOULD + then work for authentications in subsequent sessions. + + This condition can happen if a user has an entry in a system + authentication database such as Unix /etc/passwd, but does not have + credentials suitable for use by the specified mechanism. + + TRYLATER + + A command failed due to a temporary server failure. The client MAY + continue using local information and try the command later. This + response code only makes sense when returned in a NO/BYE response. + + ACTIVE + + A command failed because it is not allowed on the active script, for + example, DELETESCRIPT on the active script. This response code only + makes sense when returned in a NO/BYE response. + + NONEXISTENT + + A command failed because the referenced script name doesn't exist. + This response code only makes sense when returned in a NO/BYE + response. + + ALREADYEXISTS + + A command failed because the referenced script name already exists. + This response code only makes sense when returned in a NO/BYE + response. + + + +Melnikov & Martin Standards Track [Page 5] + +RFC 5804 ManageSieve July 2010 + + + TAG + + This response code name is followed by a string specified in the + command. See Section 2.13 for a possible use case. 
+ + WARNINGS + + This response code MAY be returned by the server in the OK response + (but it might be returned with the NO/BYE response as well) and + signals the client that even though the script is syntactically + valid, it might contain errors not intended by the script writer. + This response code is typically returned in response to PUTSCRIPT + and/or CHECKSCRIPT commands. A client seeing such response code + SHOULD present the returned warning text to the user. + +1.4. Active Script + + A user may have multiple Sieve scripts on the server, yet only one + script may be used for filtering of incoming messages. This is the + active script. Users may have zero or one active script and MUST use + the SETACTIVE command described below for changing the active script + or disabling Sieve processing. For example, users may have an + everyday script they normally use and a special script they use when + they go on vacation. Users can change which script is being used + without having to download and upload a script stored somewhere else. + +1.5. Quotas + + Servers SHOULD impose quotas to prevent malicious users from + overflowing available storage. If a command would place a user over + a quota setting, servers that impose such quotas MUST reply with a NO + response containing the QUOTA response code. Client implementations + MUST be able to handle commands failing because of quota + restrictions. + +1.6. Script Names + + A Sieve script name is a sequence of Unicode characters encoded in + UTF-8 [UTF-8]. 
A script name MUST comply with the Net-Unicode Definition + (Section 2 of [NET-UNICODE]), with the additional restriction of + prohibiting the following Unicode characters: + + o 0000-001F; [CONTROL CHARACTERS] + + o 007F; DELETE + + o 0080-009F; [CONTROL CHARACTERS] + + + + +Melnikov & Martin Standards Track [Page 6] + +RFC 5804 ManageSieve July 2010 + + + o 2028; LINE SEPARATOR + + o 2029; PARAGRAPH SEPARATOR + + Sieve script names MUST be at least one octet (and hence Unicode + character) long. A zero-octet script name has a special meaning (see + Section 2.8). Servers MUST allow names of up to 128 Unicode + characters in length (which can take up to 512 bytes when encoded in + UTF-8, not counting the terminating NUL), and MAY allow longer names. + A server that receives a script name longer than its internal limit + MUST reject the corresponding operation; in particular, it MUST NOT + truncate the script name. + +1.7. Capabilities + + Server capabilities are sent automatically by the server upon a + client connection, or after successful STARTTLS and AUTHENTICATE + (which establishes a Simple Authentication and Security Layer (SASL)) + commands. Capabilities may change immediately after a successfully + completed STARTTLS command, and/or immediately after a successfully + completed AUTHENTICATE command, and/or after a successfully completed + UNAUTHENTICATE command (see Section 2.14.1). Capabilities MUST + remain static at all other times. + + Clients MAY request the capabilities at a later time by issuing the + CAPABILITY command described later. The capabilities consist of a + series of lines each with one or two strings. The first string is + the name of the capability, which is case-insensitive. The second + optional string is the value associated with that capability. Order + of capabilities is arbitrary, but each capability name can appear at + most once. 
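The capability listing described above (one or two strings per line, case-insensitive names, arbitrary order) could be read roughly as follows. This is a sketch under stated assumptions, not a complete parser: the function name `parse_capabilities` is hypothetical, only quoted strings are handled (literals are not), and lines are assumed to arrive already split at CRLF without the editorial "S:" prefix.

```python
import re

# Matches one quoted string, allowing backslash-escaped characters inside.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_capabilities(lines):
    """Map upper-cased capability names to their value (None if absent)."""
    caps = {}
    for line in lines:
        # Capability lines start with a quoted name; the closing
        # OK/NO/BYE response line does not, so it is skipped here.
        if not line.startswith('"'):
            continue
        strings = [re.sub(r"\\(.)", r"\1", s) for s in _QUOTED.findall(line)]
        name = strings[0].upper()  # names are case-insensitive
        caps[name] = strings[1] if len(strings) > 1 else None
    return caps
```

A client would then look up, for example, `caps.get("SASL", "").split()` to obtain the advertised mechanisms, and ignore any names it does not understand.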
+ + The following capabilities are defined in this document: + + IMPLEMENTATION - Name of implementation and version. This capability + MUST always be returned by the server. + + SASL - List of SASL mechanisms supported by the server, each + separated by a space. This list can be empty if and only if STARTTLS + is also advertised. This means that the client must negotiate TLS + encryption with STARTTLS first, at which point the SASL capability + will list a non-empty list of SASL mechanisms. + + SIEVE - List of space-separated Sieve extensions (as listed in Sieve + "require" action [SIEVE]) supported by the Sieve engine. This + capability MUST always be returned by the server. + + + + + +Melnikov & Martin Standards Track [Page 7] + +RFC 5804 ManageSieve July 2010 + + + STARTTLS - If TLS [TLS] is supported by this implementation. Before + advertising this capability a server MUST verify to the best of its + ability that TLS can be successfully negotiated by a client with + common cipher suites. Specifically, a server should verify that a + server certificate has been installed and that the TLS subsystem has + successfully initialized. This capability SHOULD NOT be advertised + once STARTTLS or AUTHENTICATE command completes successfully. Client + and server implementations MUST implement the STARTTLS extension. + + MAXREDIRECTS - Specifies the limit on the number of Sieve "redirect" + actions a script can perform during a single evaluation. Note that + this is different from the total number of "redirect" actions a + script can contain. The value is a non-negative number represented + as a ManageSieve string. + + NOTIFY - A space-separated list of URI schema parts for supported + notification methods. This capability MUST be specified if the Sieve + implementation supports the "enotify" extension [NOTIFY]. + + LANGUAGE - The language (<Language-Tag> from [RFC5646]) currently + used for human-readable error messages. 
If this capability is not + returned, the "i-default" [RFC2277] language is assumed. Note that + the current language MAY be per-user configurable (i.e., it MAY + change after authentication). + + OWNER - The canonical name of the logged-in user (SASL "authorization + identity") encoded in UTF-8. This capability MUST NOT be returned in + unauthenticated state and SHOULD be returned once the AUTHENTICATE + command succeeds. + + VERSION - This capability MUST be returned by servers compliant with + this document or its successor. For servers compliant with this + document, the capability value is the string "1.0". Lack of this + capability means that the server predates this specification and thus + doesn't support the following commands: RENAMESCRIPT, CHECKSCRIPT, + and NOOP. + + Section 2.14 defines some additional ManageSieve extensions and their + respective capabilities. + + A server implementation MUST return SIEVE, IMPLEMENTATION, and + VERSION capabilities. + + A client implementation MUST ignore any listed capabilities that it + does not understand. + + + + + + +Melnikov & Martin Standards Track [Page 8] + +RFC 5804 ManageSieve July 2010 + + + Example: + + S: "IMPlemENTATION" "Example1 ManageSieved v001" + S: "SASl" "DIGEST-MD5 GSSAPI" + S: "SIeVE" "fileinto vacation" + S: "StaRTTLS" + S: "NOTIFY" "xmpp mailto" + S: "MAXREdIRECTS" "5" + S: "VERSION" "1.0" + S: OK + + After successful authentication, this might look like this: + + Example: + + S: "IMPlemENTATION" "Example1 ManageSieved v001" + S: "SASl" "DIGEST-MD5 GSSAPI" + S: "SIeVE" "fileinto vacation" + S: "NOTIFY" "xmpp mailto" + S: "OWNER" "alexey@example.com" + S: "MAXREdIRECTS" "5" + S: "VERSION" "1.0" + S: OK + +1.8. Transport + + The ManageSieve protocol assumes a reliable data stream such as that + provided by TCP. When TCP is used, a ManageSieve server typically + listens on port 4190. 
+ + Before opening the TCP connection, the ManageSieve client first MUST + resolve the Domain Name System (DNS) hostname associated with the + receiving entity and determine the appropriate TCP port for + communication with the receiving entity. The process is as follows: + + 1. Attempt to resolve the hostname using a [DNS-SRV] Service of + "sieve" and a Proto of "tcp" for the target domain (e.g., + "example.net"), resulting in resource records such as + "_sieve._tcp.example.net.". The result of the SRV lookup, if + successful, will be one or more combinations of a port and + hostname; the ManageSieve client MUST resolve the returned + hostnames to IPv4/IPv6 addresses according to returned SRV record + weight. IP addresses from the first successfully resolved + hostname (with the corresponding port number returned by SRV + lookup) are used to connect to the server. If connection using + one of the IP addresses fails, the next resolved IP address is + + + + + +Melnikov & Martin Standards Track [Page 9] + +RFC 5804 ManageSieve July 2010 + + + used to connect. If connection to all resolved IP addresses + fails, then the resolution/connect is repeated for the next + hostname returned by SRV lookup. + + 2. If the SRV lookup fails, the fallback SHOULD be a normal IPv4 or + IPv6 address record resolution to determine the IP address, where + the port used is the default ManageSieve port of 4190. + +1.9. Conventions Used in This Document + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [KEYWORDS]. + + In examples, "C:" and "S:" indicate lines sent by the client and + server respectively. Line breaks that do not start a new "C:" or + "S:" exist for editorial reasons. + + Examples of authentication in this document are using DIGEST-MD5 + [DIGEST-MD5] and GSSAPI [GSSAPI] SASL mechanisms. + +2. 
Commands + + This section and its subsections describe valid ManageSieve commands. + Upon initial connection to the server, the client's session is in + non-authenticated state. Prior to successful authentication, only + the AUTHENTICATE, CAPABILITY, STARTTLS, LOGOUT, and NOOP (see Section + 2.13) commands are valid. ManageSieve extensions MAY define other + commands that are valid in non-authenticated state. Servers MUST + reject all other commands with a NO response. Clients may pipeline + commands (send more than one command at a time without waiting for + completion of the first command). However, a group of commands sent + together MUST NOT have an AUTHENTICATE (*), a STARTTLS, or a + HAVESPACE command anywhere but the last command in the list. + + (*) - The only exception to this rule is when the AUTHENTICATE + command contains an initial response for a SASL mechanism that allows + clients to send data first, the mechanism is known to complete in one + round trip, and the mechanism doesn't negotiate a SASL security + layer. Two examples of such SASL mechanisms are PLAIN [PLAIN] and + EXTERNAL [SASL]. + + + + + + + + + + +Melnikov & Martin Standards Track [Page 10] + +RFC 5804 ManageSieve July 2010 + + +2.1. AUTHENTICATE Command + + Arguments: String - mechanism + String - initial data (optional) + + The AUTHENTICATE command indicates a SASL [SASL] authentication + mechanism to the server. If the server supports the requested + authentication mechanism, it performs an authentication protocol + exchange to identify and authenticate the user. Optionally, it also + negotiates a security layer for subsequent protocol interactions. If + the requested authentication mechanism is not supported, the server + rejects the AUTHENTICATE command by sending the NO response. + + The authentication protocol exchange consists of a series of server + challenges and client responses that are specific to the selected + authentication mechanism. 
A server challenge consists of a string + (quoted or literal) followed by a CRLF. The contents of the string + is a base-64 encoding [BASE64] of the SASL data. A client response + consists of a string (quoted or literal) with the base-64 encoding of + the SASL data followed by a CRLF. If the client wishes to cancel the + authentication exchange, it issues a string containing a single "*". + If the server receives such a response, it MUST reject the + AUTHENTICATE command by sending a NO reply. + + Note that an empty challenge/response is sent as an empty string. If + the mechanism dictates that the final response is sent by the server, + this data MAY be placed within the data portion of the SASL response + code to save a round trip. + + The optional initial-response argument to the AUTHENTICATE command is + used to save a round trip when using authentication mechanisms that + are defined to send no data in the initial challenge. When the + initial-response argument is used with such a mechanism, the initial + empty challenge is not sent to the client and the server uses the + data in the initial-response argument as if it were sent in response + to the empty challenge. If the initial-response argument to the + AUTHENTICATE command is used with a mechanism that sends data in the + initial challenge, the server MUST reject the AUTHENTICATE command by + sending the NO response. + + The service name specified by this protocol's profile of SASL is + "sieve". + + Reauthentication is not supported by ManageSieve protocol's profile + of SASL. That is, after a successfully completed AUTHENTICATE + command, no more AUTHENTICATE commands may be issued in the same + session. After a successful AUTHENTICATE command completes, a server + MUST reject any further AUTHENTICATE commands with a NO reply. 
+ + + +Melnikov & Martin Standards Track [Page 11] + +RFC 5804 ManageSieve July 2010 + + + However, note that a server may implement the UNAUTHENTICATE + extension described in Section 2.14.1. + + If a security layer is negotiated through the SASL authentication + exchange, it takes effect immediately following the CRLF that + concludes the successful authentication exchange for the client, and + the CRLF of the OK response for the server. + + When a security layer takes effect, the ManageSieve protocol is reset + to the initial state (the state in ManageSieve after a client has + connected to the server). The server MUST discard any knowledge + obtained from the client that was not obtained from the SASL (or TLS) + negotiation itself. Likewise, the client MUST discard any knowledge + obtained from the server, such as the list of ManageSieve extensions, + that was not obtained from the SASL (and/or TLS) negotiation itself. + (Note that a client MAY compare the advertised SASL mechanisms before + and after authentication in order to detect an active down- + negotiation attack. See below.) + + Once a SASL security layer is established, the server MUST re-issue + the capability results, followed by an OK response. This is + necessary to protect against man-in-the-middle attacks that alter the + capabilities list prior to SASL negotiation. The capability results + MUST include all SASL mechanisms the server was capable of + negotiating with that client. This is done in order to allow the + client to detect an active down-negotiation attack. If a user- + oriented client detects such a down-negotiation attack, it SHOULD + either notify the user (it MAY give the user the opportunity to + continue with the ManageSieve session in this case) or close the + transport connection and indicate that a down-negotiation attack + might be in progress. 
If an automated client detects a down- + negotiation attack, it SHOULD return or log an error indicating that + a possible attack might be in progress and/or SHOULD close the + transport connection. + + When both [TLS] and SASL security layers are in effect, the TLS + encoding MUST be applied (when sending data) after the SASL encoding. + + Server implementations SHOULD support SASL proxy authentication so + that an administrator can administer a user's scripts. Proxy + authentication is when a user authenticates as herself/himself but + requests the server to act (authorize) as another user. + + The authorization identity generated by this [SASL] exchange is a + "simple username" (in the sense defined in [SASLprep]), and both + client and server MUST use the [SASLprep] profile of the [StringPrep] + algorithm to prepare these names for transmission or comparison. If + preparation of the authorization identity fails or results in an + + + +Melnikov & Martin Standards Track [Page 12] + +RFC 5804 ManageSieve July 2010 + + + empty string (unless it was transmitted as the empty string), the + server MUST fail the authentication. + + If an AUTHENTICATE command fails with a NO response, the client MAY + try another authentication mechanism by issuing another AUTHENTICATE + command. In other words, the client may request authentication types + in decreasing order of preference. + + Note that a failed (NO) response to the AUTHENTICATE command may + contain one of the following response codes: AUTH-TOO-WEAK, ENCRYPT- + NEEDED, or TRANSITION-NEEDED. See Section 1.3 for detailed + description of the relevant conditions. + + To ensure interoperability, both client and server implementations of + the ManageSieve protocol MUST implement the SCRAM-SHA-1 [SCRAM] SASL + mechanism, as well as [PLAIN] over [TLS]. 
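For the mandatory PLAIN mechanism, the base-64 initial response passed as the second argument to AUTHENTICATE might be built as sketched below. The function names are hypothetical. The message layout (authorization identity, NUL, authentication identity, NUL, password) comes from the PLAIN specification [PLAIN], not from this document. The sketch assumes TLS is already established and the server identity has been verified, and it leaves out SASLprep preparation of the identities.

```python
import base64


def plain_initial_response(authcid: str, password: str, authzid: str = "") -> str:
    """Return the base-64 SASL PLAIN message: authzid NUL authcid NUL password."""
    raw = "\0".join([authzid, authcid, password]).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def authenticate_plain(authcid: str, password: str) -> bytes:
    """Build an AUTHENTICATE command carrying the initial response."""
    # Base-64 output contains no quote or backslash, so it can be sent
    # as a quoted string without escaping.
    response = plain_initial_response(authcid, password)
    return ('AUTHENTICATE "PLAIN" "%s"\r\n' % response).encode("ascii")
```

Sending the response in the command itself saves the round trip that an empty server challenge would otherwise cost.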
+ + Note: use of PLAIN over TLS reflects current use of PLAIN over TLS in + other email-related protocols; however, a longer-term goal is to + migrate email-related protocols from using PLAIN over TLS to SCRAM- + SHA-1 mechanism. + + Examples (Note that long lines are folded for readability and are not + part of protocol exchange): + + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "SASL" "DIGEST-MD5 GSSAPI" + S: "SIEVE" "fileinto vacation" + S: "STARTTLS" + S: "VERSION" "1.0" + S: OK + C: Authenticate "DIGEST-MD5" + S: "cmVhbG09ImVsd29vZC5pbm5vc29mdC5leGFtcGxlLmNvbSIsbm9uY2U9Ik + 9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgiLGFsZ29yaXRobT1tZDUtc2Vz + cyxjaGFyc2V0PXV0Zi04" + C: "Y2hhcnNldD11dGYtOCx1c2VybmFtZT0iY2hyaXMiLHJlYWxtPSJlbHdvb2 + QuaW5ub3NvZnQuZXhhbXBsZS5jb20iLG5vbmNlPSJPQTZNRzl0RVFHbTJo + aCIsbmM9MDAwMDAwMDEsY25vbmNlPSJPQTZNSFhoNlZxVHJSayIsZGlnZX + N0LXVyaT0ic2lldmUvZWx3b29kLmlubm9zb2Z0LmV4YW1wbGUuY29tIixy + ZXNwb25zZT1kMzg4ZGFkOTBkNGJiZDc2MGExNTIzMjFmMjE0M2FmNyxxb3 + A9YXV0aA==" + S: OK (SASL "cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZ + mZmZA==") + + + + + + + + +Melnikov & Martin Standards Track [Page 13] + +RFC 5804 ManageSieve July 2010 + + + A slightly different variant of the same authentication exchange is: + + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "SASL" "DIGEST-MD5 GSSAPI" + S: "SIEVE" "fileinto vacation" + S: "VERSION" "1.0" + S: "STARTTLS" + S: OK + C: Authenticate "DIGEST-MD5" + S: {136} + S: cmVhbG09ImVsd29vZC5pbm5vc29mdC5leGFtcGxlLmNvbSIsbm9uY2U9Ik + 9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgiLGFsZ29yaXRobT1tZDUtc2Vz + cyxjaGFyc2V0PXV0Zi04 + C: {300+} + C: Y2hhcnNldD11dGYtOCx1c2VybmFtZT0iY2hyaXMiLHJlYWxtPSJlbHdvb2 + QuaW5ub3NvZnQuZXhhbXBsZS5jb20iLG5vbmNlPSJPQTZNRzl0RVFHbTJo + aCIsbmM9MDAwMDAwMDEsY25vbmNlPSJPQTZNSFhoNlZxVHJSayIsZGlnZX + N0LXVyaT0ic2lldmUvZWx3b29kLmlubm9zb2Z0LmV4YW1wbGUuY29tIixy + ZXNwb25zZT1kMzg4ZGFkOTBkNGJiZDc2MGExNTIzMjFmMjE0M2FmNyxxb3 + A9YXV0aA== + S: {56} + S: 
cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA== + C: "" + S: OK + + + + + + + + + + + + + + + + + + + + + + + + + + + +Melnikov & Martin Standards Track [Page 14] + +RFC 5804 ManageSieve July 2010 + + + Another example demonstrating use of SASL PLAIN mechanism under TLS + follows. This example also demonstrates use of SASL "initial + response" (the second parameter to the Authenticate command): + + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "VERSION" "1.0" + S: "SASL" "" + S: "SIEVE" "fileinto vacation" + S: "STARTTLS" + S: OK + C: STARTTLS + S: OK + <TLS negotiation, further commands are under TLS layer> + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "VERSION" "1.0" + S: "SASL" "PLAIN" + S: "SIEVE" "fileinto vacation" + S: OK + C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xu" + S: NO + C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xz" + S: NO + C: Authenticate "PLAIN" "QJIrweAPyo6Q1T9xy" + S: BYE "Too many failed authentication attempts" + <Server closes connection> + + + + + + + + + + + + + + + + + + + + + + + + + + +Melnikov & Martin Standards Track [Page 15] + +RFC 5804 ManageSieve July 2010 + + + The following example demonstrates use of SASL "initial response". + It also demonstrates that an empty response can be sent as a literal + and that negotiating a SASL security layer results in the server + re-issuing server capabilities: + + C: AUTHENTICATE "GSSAPI" {1488+} + C: YIIE[...1480 octets here ...]dA== + S: {208} + S: YIGZBgkqhkiG9xIBAgICAG+BiTCBhqADAgEFoQMCAQ+iejB4oAMCARKic + [...114 octets here ...] 
+ /yzpAy9p+Y0LanLskOTvMc0MnjgAa4YEr3eJ6 + C: {0+} + C: + S: {44} + S: BQQF/wAMAAwAAAAAYRGFAo6W0vIHti8i1UXODgEAEAA= + C: {44+} + C: BQQE/wAMAAwAAAAAIsT1iv9UkZApw471iXt6cwEAAAE= + S: OK + <Further commands/responses are under SASL security layer> + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "VERSION" "1.0" + S: "SASL" "PLAIN DIGEST-MD5 GSSAPI" + S: "SIEVE" "fileinto vacation" + S: "LANGUAGE" "ru" + S: "MAXREDIRECTS" "3" + S: ok + +2.1.1. Use of SASL PLAIN Mechanism over TLS + + This section is normative for ManageSieve client implementations that + support SASL [PLAIN] over [TLS]. + + If a ManageSieve client is willing to use SASL PLAIN over TLS to + authenticate to the ManageSieve server, the client MUST verify the + server identity (see Section 2.2.1). If the server identity can't be + verified (e.g., the server has not provided any certificate, or if + the certificate verification fails), the client MUST NOT attempt to + authenticate using the SASL PLAIN mechanism. + +2.2. STARTTLS Command + + Support for STARTTLS command in servers is optional. Its + availability is advertised with "STARTTLS" capability as described in + Section 1.7. + + The STARTTLS command requests commencement of a TLS [TLS] + negotiation. The negotiation begins immediately after the CRLF in + the OK response. After a client issues a STARTTLS command, it MUST + + + +Melnikov & Martin Standards Track [Page 16] + +RFC 5804 ManageSieve July 2010 + + + NOT issue further commands until a server response is seen and the + TLS negotiation is complete. + + The STARTTLS command is only valid in non-authenticated state. The + server remains in non-authenticated state, even if client credentials + are supplied during the TLS negotiation. The SASL [SASL] EXTERNAL + mechanism MAY be used to authenticate once TLS client credentials are + successfully exchanged, but servers supporting the STARTTLS command + are not required to support the EXTERNAL mechanism. 
+ + After the TLS layer is established, the server MUST re-issue the + capability results, followed by an OK response. This is necessary to + protect against man-in-the-middle attacks that alter the capabilities + list prior to STARTTLS. This capability result MUST NOT include the + STARTTLS capability. + + The client MUST discard cached capability information and replace it + with the new information. The server MAY advertise different + capabilities after STARTTLS. + + Example: + + C: StartTls + S: oK + <TLS negotiation, further commands are under TLS layer> + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "SASL" "PLAIN DIGEST-MD5 GSSAPI" + S: "SIEVE" "fileinto vacation" + S: "VERSION" "1.0" + S: "LANGUAGE" "fr" + S: ok + +2.2.1. Server Identity Check + + During the TLS negotiation, the ManageSieve client MUST check its + understanding of the server hostname/IP address against the server's + identity as presented in the server Certificate message, in order to + prevent man-in-the-middle attacks. In this section, the client's + understanding of the server's identity is called the "reference + identity". + + Checking is performed according to the following rules: + + o If the reference identity is a hostname: + + 1. If a subjectAltName extension of the SRVName [X509-SRV], + dNSName [X509] (in that order of preference) type is present + in the server's certificate, then it SHOULD be used as the + + + +Melnikov & Martin Standards Track [Page 17] + +RFC 5804 ManageSieve July 2010 + + + source of the server's identity. Matching is performed as + described in Section 2.2.1.1, with the exception that no + wildcard matching is allowed for SRVName type. If the + certificate contains multiple names (e.g., more than one + dNSName field), then a match with any one of the fields is + considered acceptable. + + 2. The client MAY use other types of subjectAltName for + performing comparison. + + 3. 
The server's identity MAY also be verified by comparing the + reference identity to the Common Name (CN) [RFC4519] value in + the leaf Relative Distinguished Name (RDN) of the subjectName + field of the server's certificate. This comparison is + performed using the rules for comparison of DNS names in + Section 2.2.1.1, below. Although the use of the Common Name + value is existing practice, it is deprecated, and + Certification Authorities are encouraged to provide + subjectAltName values instead. Note that the TLS + implementation may represent DNs in certificates according to + X.500 or other conventions. For example, some X.500 + implementations order the RDNs in a DN using a left-to-right + (most significant to least significant) convention instead of + LDAP's right-to-left convention. + + o When the reference identity is an IP address, the iPAddress + subjectAltName SHOULD be used by the client for comparison. The + comparison is performed as described in Section 2.2.1.2. + + If the server identity check fails, user-oriented clients SHOULD + either notify the user (clients MAY give the user the opportunity to + continue with the ManageSieve session in this case) or close the + transport connection and indicate that the server's identity is + suspect. Automated clients SHOULD return or log an error indicating + that the server's identity is suspect and/or SHOULD close the + transport connection. Automated clients MAY provide a configuration + setting that disables this check, but MUST provide a setting that + enables it. + + Beyond the server identity check described in this section, clients + should be prepared to do further checking to ensure that the server + is authorized to provide the service it is requested to provide. The + client may need to make use of local policy information in making + this determination. + + + + + + + +Melnikov & Martin Standards Track [Page 18] + +RFC 5804 ManageSieve July 2010 + + +2.2.1.1. 
Comparison of DNS Names + + If the reference identity is an internationalized domain name, + conforming implementations MUST convert it to the ASCII Compatible + Encoding (ACE) format as specified in Section 4 of RFC 3490 [RFC3490] + before comparison with subjectAltName values of type dNSName. + Specifically, conforming implementations MUST perform the conversion + operation specified in Section 4 of [RFC3490] as follows: + + o in step 1, the domain name SHALL be considered a "stored string"; + + o in step 3, set the flag called "UseSTD3ASCIIRules"; + + o in step 4, process each label with the "ToASCII" operation; and + + o in step 5, change all label separators to U+002E (full stop). + + After performing the "to-ASCII" conversion, the DNS labels and names + MUST be compared for equality according to the rules specified in + Section 3 of [RFC3490]; i.e., once all label separators are replaced + with U+002E (dot) they are compared in the case-insensitive manner. + + The '*' (ASCII 42) wildcard character is allowed in subjectAltName + values of type dNSName, and then only as the left-most (least + significant) DNS label in that value. This wildcard matches any + left-most DNS label in the server name. That is, the subject + *.example.com matches the server names a.example.com and + b.example.com, but does not match example.com or a.b.example.com. + +2.2.1.2. Comparison of IP Addresses + + When the reference identity is an IP address, the identity MUST be + converted to the "network byte order" octet string representation + [RFC791][RFC2460]. For IP Version 4, as specified in RFC 791, the + octet string will contain exactly four octets. For IP Version 6, as + specified in RFC 2460, the octet string will contain exactly sixteen + octets. This octet string is then compared against subjectAltName + values of type iPAddress. A match occurs if the reference identity + octet string and value octet strings are identical. + +2.2.1.3. 
Comparison of Other subjectName Types + + Client implementations MAY support matching against subjectAltName + values of other types as described in other documents. + + + + + + + +Melnikov & Martin Standards Track [Page 19] + +RFC 5804 ManageSieve July 2010 + + +2.3. LOGOUT Command + + The client sends the LOGOUT command when it is finished with a + connection and wishes to terminate it. The server MUST reply with an + OK response. The server MUST ignore commands issued by the client + after the LOGOUT command. + + The client SHOULD wait for the OK response before closing the + connection. This avoids the TCP connection going into the TIME_WAIT + state on the server. In order to avoid going into the TIME_WAIT TCP + state, the server MAY wait for a short while for the client to close + the TCP connection first. Whether or not the server waits for the + client to close the connection, it MUST then close the connection + itself. + + Example: + + C: Logout + S: Ok + <connection is terminated> + +2.4. CAPABILITY Command + + The CAPABILITY command requests the server capabilities as described + earlier in this document. It has no parameters. + + Example: + + C: CAPABILITY + S: "IMPLEMENTATION" "Example1 ManageSieved v001" + S: "VERSION" "1.0" + S: "SASL" "PLAIN SCRAM-SHA-1 GSSAPI" + S: "SIEVE" "fileinto vacation" + S: "STARTTLS" + S: OK + +2.5. HAVESPACE Command + + Arguments: String - name + Number - script size + + The HAVESPACE command is used to query the server for available + space. Clients specify the name they wish to save the script as and + its size in octets. Both parameters can be used by the server to see + if the script with the specified name and size is within a user's + quota(s). For example, the server MAY use the script name to check + if a script would be replaced or a new one would be created. 
Servers + respond with a NO if storing a script with that name and size would + + + +Melnikov & Martin Standards Track [Page 20] + +RFC 5804 ManageSieve July 2010 + + + fail or OK otherwise. Clients SHOULD issue this command before + attempting to place a script on the server. + + Note that the OK response from the HAVESPACE command does not + constitute a guarantee of success as server disk space conditions + could change between the client issuing the HAVESPACE and the client + issuing the PUTSCRIPT commands. A QUOTA response code (see + Section 1.3) remains a possible (albeit unlikely) response to a + subsequent PUTSCRIPT with the same name and size. + + Example: + + C: HAVESPACE "myscript" 999999 + S: NO (QUOTA/MAXSIZE) "Quota exceeded" + + C: HAVESPACE "foobar" 435 + S: OK + +2.6. PUTSCRIPT Command + + Arguments: String - Script name + String - Script content + + The PUTSCRIPT command is used by the client to submit a Sieve script + to the server. + + If the script already exists, upon success the old script will be + overwritten. The old script MUST NOT be overwritten if PUTSCRIPT + fails in any way. A script of zero length SHOULD be disallowed. + + This command places the script on the server. It does not affect + whether the script is processed on incoming mail, unless it replaces + the script that is already active. The SETACTIVE command is used to + mark a script as active. + + When submitting large scripts, clients SHOULD use the HAVESPACE + command beforehand to query if the server is willing to accept a + script of that size. + + The server MUST check the submitted script for validity, which + includes checking that the script complies with the Sieve grammar + [SIEVE] and that all Sieve extensions mentioned in the script's + "require" statement(s) are supported by the Sieve interpreter. 
(Note + that if the Sieve interpreter supports the Sieve "ihave" extension + [I-HAVE], any unrecognized/unsupported extension mentioned in the + "ihave" test MUST NOT cause the validation failure.) Other checks + such as validating the supplied command arguments for each command + MAY be performed. Essentially, the performed validation SHOULD be + + + +Melnikov & Martin Standards Track [Page 21] + +RFC 5804 ManageSieve July 2010 + + + the same as performed when compiling the script for execution. + Implementations that use a binary representation to store compiled + scripts can extend the validation to a full compilation, in order to + avoid validating uploaded scripts multiple times. + + If the script fails the validation, the server MUST reply with a NO + response. Any script that fails the validity test MUST NOT be stored + on the server. The message given with a NO response MUST be human + readable and SHOULD contain a specific error message giving the line + number of the first error. Implementors should strive to produce + helpful error messages similar to those given by programming language + compilers. Client implementations should note that this may be a + multiline literal string with more than one error message separated + by CRLFs. The human-readable message is in the language returned in + the latest LANGUAGE capability (or in "i-default"; see Section 1.7), + encoded in UTF-8 [UTF-8]. + + An OK response MAY contain the WARNINGS response code. In such a + case the human-readable message that follows the OK response SHOULD + contain a specific warning message (or messages) giving the line + number(s) in the script that might contain errors not intended by the + script writer. The human-readable message is in the language + returned in the latest LANGUAGE capability (or in "i-default"; see + Section 1.7), encoded in UTF-8 [UTF-8]. A client seeing such a + response code SHOULD present the message to the user. 
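The octet count in a client-to-server literal is easy to get wrong, so here is a minimal sketch of how a client might frame a PUTSCRIPT command. It follows the literal-c2s and command-putscript rules from the Formal Syntax section. The helper names quote_string and build_putscript are hypothetical, and the sketch does not enforce the 1024-character limit on quoted strings.

```python
def quote_string(value: str) -> bytes:
    """Encode value as a ManageSieve quoted string (hypothetical helper)."""
    # CR, LF and NUL are outside SAFE-CHAR, so they cannot appear in a quoted string.
    if any(ch in value for ch in "\r\n\0"):
        raise ValueError("value must be sent as a literal, not a quoted string")
    # QUOTED-SPECIALS (backslash and double quote) are escaped with a backslash.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return b'"' + escaped.encode("utf-8") + b'"'


def build_putscript(name: str, script: str) -> bytes:
    """Return the bytes of a PUTSCRIPT command that sends the script as a literal."""
    body = script.encode("utf-8")
    # literal-c2s: "{" number "+}" CRLF, then exactly that many octets.
    # The number counts octets of the UTF-8 encoding, not characters.
    literal = b"{" + str(len(body)).encode("ascii") + b"+}\r\n" + body
    # The command itself ends with CRLF after its last argument.
    return b"PUTSCRIPT " + quote_string(name) + b" " + literal + b"\r\n"
```

For the two-line script used in the first PUTSCRIPT example of this section, the body is 31 octets including both CRLFs. The empty "C:" line in that example is the CRLF that terminates the command.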
+ + + + + + + + + + + + + + + + + + + + + + + + + + +Melnikov & Martin Standards Track [Page 22] + +RFC 5804 ManageSieve July 2010 + + + Examples: + + C: Putscript "foo" {31+} + C: #comment + C: InvalidSieveCommand + C: + S: NO "line 2: Syntax error" + + C: Putscript "mysievescript" {110+} + C: require ["fileinto"]; + C: + C: if envelope :contains "to" "tmartin+sent" { + C: fileinto "INBOX.sent"; + C: } + S: OK + + C: Putscript "myforwards" {190+} + C: redirect "111@example.net"; + C: + C: if size :under 10k { + C: redirect "mobile@cell.example.com"; + C: } + C: + C: if envelope :contains "to" "tmartin+lists" { + C: redirect "lists@groups.example.com"; + C: } + S: OK (WARNINGS) "line 8: server redirect action + limit is 2, this redirect might be ignored" + +2.7. LISTSCRIPTS Command + + This command lists the scripts the user has on the server. Upon + success, a list of CRLF-separated script names (each represented as a + quoted or literal string) is returned followed by an OK response. If + there exists an active script, the atom ACTIVE is appended to the + corresponding script name. The atom ACTIVE MUST NOT appear on more + than one response line. + + + + + + + + + + + + + + +Melnikov & Martin Standards Track [Page 23] + +RFC 5804 ManageSieve July 2010 + + + Example: + + C: Listscripts + S: "summer_script" + S: "vacation_script" + S: {13} + S: clever"script + S: "main_script" ACTIVE + S: OK + + C: listscripts + S: "summer_script" + S: "main_script" active + S: OK + +2.8. SETACTIVE Command + + Arguments: String - script name + + This command sets a script active. If the script name is the empty + string (i.e., ""), then any active script is disabled. Disabling an + active script when there is no script active is not an error and MUST + result in an OK reply. + + If the script does not exist on the server, then the server MUST + reply with a NO response. Such a reply SHOULD contain the + NONEXISTENT response code. 
+ + Examples: + + C: Setactive "vacationscript" + S: Ok + + C: Setactive "" + S: Ok + + C: Setactive "baz" + S: No (NONEXISTENT) "There is no script by that name" + + C: Setactive "baz" + S: No (NONEXISTENT) {31} + S: There is no script by that name + + + + + + + + + +Melnikov & Martin Standards Track [Page 24] + +RFC 5804 ManageSieve July 2010 + + +2.9. GETSCRIPT Command + + Arguments: String - script name + + This command gets the contents of the specified script. If the + script does not exist, the server MUST reply with a NO response. + Such a reply SHOULD contain the NONEXISTENT response code. + + Upon success, a string with the contents of the script is returned + followed by an OK response. + + Example: + + C: Getscript "myscript" + S: {54} + S: #this is my wonderful script + S: reject "I reject all"; + S: + S: OK + +2.10. DELETESCRIPT Command + + Arguments: String - script name + + This command is used to delete a user's Sieve script. Servers MUST + reply with a NO response if the script does not exist. Such + responses SHOULD include the NONEXISTENT response code. + + The server MUST NOT allow the client to delete an active script, so + the server MUST reply with a NO response if attempted. Such a + response SHOULD contain the ACTIVE response code. If a client wishes + to delete an active script, it should use the SETACTIVE command to + disable the script first. + + Example: + + C: Deletescript "foo" + S: Ok + + C: Deletescript "baz" + S: No (ACTIVE) "You may not delete an active script" + + + + + + + + + + +Melnikov & Martin Standards Track [Page 25] + +RFC 5804 ManageSieve July 2010 + + +2.11. RENAMESCRIPT Command + + Arguments: String - Old Script name + String - New Script name + + This command is used to rename a user's Sieve script. 
Servers MUST + reply with a NO response if the old script does not exist (in which + case the NONEXISTENT response code SHOULD be included), or a script + with the new name already exists (in which case the ALREADYEXISTS + response code SHOULD be included). Renaming the active script is + allowed; the renamed script remains active. + + Example: + + C: Renamescript "foo" "bar" + S: Ok + + C: Renamescript "baz" "bar" + S: No "bar already exists" + + If the server doesn't support the RENAMESCRIPT command, the client + can emulate it by performing the following steps: + + 1. List available scripts with LISTSCRIPTS. If the script with the + new script name exists, then the client should ask the user + whether to abort the operation, to replace the script (by issuing + the DELETESCRIPT <newname> after that), or to choose a different + name. + + 2. Download the old script with GETSCRIPT <oldname>. + + 3. Upload the old script with the new name: PUTSCRIPT <newname>. + + 4. If the old script was active (as reported by LISTSCRIPTS in step + 1), then make the new script active: SETACTIVE <newname>. + + 5. Delete the old script: DELETESCRIPT <oldname>. + + Note that these steps don't describe how to handle various other + error conditions (for example, NO response containing QUOTA response + code in step 3). Error handling is left as an exercise for the + reader. + + + + + + + + + +Melnikov & Martin Standards Track [Page 26] + +RFC 5804 ManageSieve July 2010 + + +2.12. CHECKSCRIPT Command + + Arguments: String - Script content + + The CHECKSCRIPT command is used by the client to verify Sieve script + validity without storing the script on the server. + + The server MUST check the submitted script for syntactic validity, + which includes checking that all Sieve extensions mentioned in Sieve + script "require" statement(s) are supported by the Sieve interpreter. 
+ (Note that if the Sieve interpreter supports the Sieve "ihave" + extension [I-HAVE], any unrecognized/unsupported extension mentioned + in the "ihave" test MUST NOT cause the syntactic validation failure.) + If the script fails this test, the server MUST reply with a NO + response. The message given with a NO response MUST be human + readable and SHOULD contain a specific error message giving the line + number of the first error. Implementors should strive to produce + helpful error messages similar to those given by programming language + compilers. Client implementations should note that this may be a + multiline literal string with more than one error message separated + by CRLFs. The human-readable message is in the language returned in + the latest LANGUAGE capability (or in "i-default"; see Section 1.7), + encoded in UTF-8 [UTF-8]. + + Examples: + + C: CheckScript {31+} + C: #comment + C: InvalidSieveCommand + C: + S: NO "line 2: Syntax error" + + A ManageSieve server supporting this command MUST NOT check if the + script will put the current user over its quota limit. + + An OK response MAY contain the WARNINGS response code. In such a + case, the human-readable message that follows the OK response SHOULD + contain a specific warning message (or messages) giving the line + number(s) in the script that might contain errors not intended by the + script writer. The human-readable message is in the language + returned in the latest LANGUAGE capability (or in "i-default"; see + Section 1.7), encoded in UTF-8 [UTF-8]. A client seeing such a + response code SHOULD present the message to the user. + + + + + + + + +Melnikov & Martin Standards Track [Page 27] + +RFC 5804 ManageSieve July 2010 + + +2.13. NOOP Command + + Arguments: String - tag to echo back (optional) + + The NOOP command does nothing, beyond returning a response to the + client. It may be used by clients for protocol re-synchronization or + to reset any inactivity auto-logout timer on the server. 
+ + The response to the NOOP command is always OK, followed by the TAG + response code together with the supplied string. If no string was + supplied in the NOOP command, the TAG response code MUST NOT be + included. + + Examples: + + C: NOOP + S: OK "NOOP completed" + + C: NOOP "STARTTLS-SYNC-42" + S: OK (TAG {16} + S: STARTTLS-SYNC-42) "Done" + +2.14. Recommended Extensions + + The UNAUTHENTICATE extension (advertised as the "UNAUTHENTICATE" + capability with no parameters) defines a new UNAUTHENTICATE command, + which allows a client to return the server to non-authenticated + state. Support for this extension is RECOMMENDED. + +2.14.1. UNAUTHENTICATE Command + + The UNAUTHENTICATE command returns the server to the + non-authenticated state. It doesn't affect any previously + established TLS [TLS] or SASL (Section 2.1) security layer. + + The UNAUTHENTICATE command is only valid in authenticated state. If + issued in a wrong state, the server MUST reject it with a NO + response. + + The UNAUTHENTICATE command has no parameters. + + When issued in the authenticated state, the UNAUTHENTICATE command + MUST NOT fail (i.e., it must never return anything other than OK or + BYE). + + + + + + + +Melnikov & Martin Standards Track [Page 28] + +RFC 5804 ManageSieve July 2010 + + +3. Sieve URL Scheme + + URI scheme name: sieve + + Status: permanent + + URI scheme syntax: Described using ABNF [ABNF]. Some ABNF + productions not defined below are from [URI-GEN]. + + sieveurl = sieveurl-server / sieveurl-list-scripts / + sieveurl-script + + sieveurl-server = "sieve://" authority + + sieveurl-list-scripts = "sieve://" authority ["/"] + + sieveurl-script = "sieve://" authority "/" + [owner "/"] scriptname + + authority = <defined in [URI-GEN]> + + owner = *ochar + ;; %-encoded version of [SASL] authorization + ;; identity (script owner) or "userid". + ;; + ;; Empty owner is used to reference + ;; global scripts. 
+ ;; + ;; Note that ASCII characters such as " ", ";", + ;; "&", "=", "/" and "?" must be %-encoded + ;; as per rule specified in [URI-GEN]. + + scriptname = 1*ochar + ;; %-encoded version of UTF-8 representation + ;; of the script name. + ;; Note that ASCII characters such as " ", ";", + ;; "&", "=", "/" and "?" must be %-encoded + ;; as per rule specified in [URI-GEN]. + + ochar = unreserved / pct-encoded / sub-delims-sh / + ":" / "@" + ;; Same as [URI-GEN] 'pchar', + ;; but without ";", "&" and "=". + + unreserved = <defined in [URI-GEN]> + + pct-encoded = <defined in [URI-GEN]> + + + + +Melnikov & Martin Standards Track [Page 29] + +RFC 5804 ManageSieve July 2010 + + + sub-delims-sh = "!" / "$" / "'" / "(" / ")" / + "*" / "+" / "," + ;; Same as [URI-GEN] sub-delims, + ;; but without ";", "&" and "=". + + URI scheme semantics: + + A Sieve URL identifies a Sieve server or a Sieve script on a Sieve + server. The latter form is associated with the application/sieve + MIME type defined in [SIEVE]. There is no MIME type associated + with the former form of Sieve URI. + + The server form is used in the REFERRAL response code (see Section + 1.3) in order to designate another server where the client should + perform its operations. + + The script form allows a client to retrieve (GETSCRIPT), update + (PUTSCRIPT), delete (DELETESCRIPT), or activate (SETACTIVE) the + named script; however, the most typical action would be to + retrieve the script. If the script name is empty (omitted), the + URI requests that the client list available scripts using the + LISTSCRIPTS command. + + Encoding considerations: + + The script name and/or the owner, if present, is in UTF-8. + Non-US-ASCII UTF-8 octets MUST be percent-encoded as described in + [URI-GEN]. US-ASCII characters such as " " (space), ";", "&", + "=", "/" and "?" MUST be %-encoded as described in [URI-GEN]. + Note that "&" and "?" are in this list in order to allow for + future extensions. 
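As an illustration of the encoding rules above, the following sketch builds a sieveurl-script from an authority, an optional owner, and a script name. The function name sieve_url and the constant OCHAR_SAFE are assumptions made for this example. The authority is passed through unchanged and is assumed to be valid already.

```python
from typing import Optional
from urllib.parse import quote

# Characters the ochar rule allows unescaped in addition to "unreserved":
# sub-delims-sh plus ":" and "@". quote() already leaves unreserved
# characters (letters, digits, "-", ".", "_", "~") untouched.
OCHAR_SAFE = "!$'()*+,:@"


def sieve_url(authority: str, scriptname: str, owner: Optional[str] = None) -> str:
    """Build a sieveurl-script (hypothetical helper, not part of any library)."""
    if not scriptname:
        raise ValueError("scriptname must not be empty (1*ochar)")
    # quote() encodes the string as UTF-8 and percent-encodes every octet
    # that is not listed as safe, including " ", ";", "&", "=", "/" and "?".
    path = quote(scriptname, safe=OCHAR_SAFE)
    if owner is not None:
        # An empty owner produces "//", which references global scripts.
        path = quote(owner, safe=OCHAR_SAFE) + "/" + path
    return "sieve://" + authority + "/" + path
```

Passing owner=None omits the owner segment entirely, while owner="" keeps an empty segment, so the two cases produce different URLs.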
+ + Note that the empty owner (e.g., sieve://example.com//script) is + different from the missing owner (e.g., + sieve://example.com/script) and is reserved for referencing global + scripts. + + The user name (in the "authority" part), if present, is in UTF-8. + Non-US-ASCII UTF-8 octets MUST be percent-encoded as described in + [URI-GEN]. + + Applications/protocols that use this URI scheme name: + ManageSieve [RFC5804] clients and servers. Clients that can store + user preferences in protocols such as [LDAP] or [ACAP]. + + Interoperability considerations: None. + + + + + +Melnikov & Martin Standards Track [Page 30] + +RFC 5804 ManageSieve July 2010 + + + Security considerations: + The <scriptname> part of a ManageSieve URL might potentially disclose + some confidential information about the author of the script or, + depending on a ManageSieve implementation, about configuration of the + mail system. The latter might be used to prepare for a more complex + attack on the mail system. + + Clients resolving ManageSieve URLs that wish to achieve data + confidentiality and/or integrity SHOULD use the STARTTLS command (if + supported by the server) before starting authentication, or use a + SASL mechanism, such as GSSAPI, that provides a confidentiality + security layer. + + Contact: Alexey Melnikov <alexey.melnikov@isode.com> + + Author/Change controller: IESG. + + References: This document and RFC 5228 [SIEVE]. + +4. Formal Syntax + + The following syntax specification uses the Augmented Backus-Naur + Form (BNF) notation as specified in [ABNF]. This uses the ABNF core + rules as specified in Appendix A of the ABNF specification [ABNF]. + "UTF8-2", "UTF8-3", and "UTF8-4" non-terminal are defined in [UTF-8]. + + Except as noted otherwise, all alphabetic characters are case- + insensitive. The use of upper- or lowercase characters to define + token strings is for editorial clarity only. Implementations MUST + accept these strings in a case-insensitive fashion. 
+ + SAFE-CHAR = %x01-09 / %x0B-0C / %x0E-21 / %x23-5B / + %x5D-7F + ;; any TEXT-CHAR except QUOTED-SPECIALS + + QUOTED-CHAR = SAFE-UTF8-CHAR / "\" QUOTED-SPECIALS + + QUOTED-SPECIALS = DQUOTE / "\" + + SAFE-UTF8-CHAR = SAFE-CHAR / UTF8-2 / UTF8-3 / UTF8-4 + ;; <UTF8-2>, <UTF8-3>, and <UTF8-4> + ;; are defined in [UTF-8]. + + ATOM-CHAR = "!" / %x23-27 / %x2A-5B / %x5D-7A / %x7C-7E + ;; Any CHAR except ATOM-SPECIALS + + ATOM-SPECIALS = "(" / ")" / "{" / SP / CTL / QUOTED-SPECIALS + + + + +Melnikov & Martin Standards Track [Page 31] + +RFC 5804 ManageSieve July 2010 + + + NZDIGIT = %x31-39 + ;; 1-9 + + atom = 1*1024ATOM-CHAR + + iana-token = atom + ;; MUST be registered with IANA + + auth-type = DQUOTE auth-type-name DQUOTE + + auth-type-name = iana-token + ;; as defined in SASL [SASL] + + command = (command-any / command-auth / + command-nonauth) CRLF + ;; Modal based on state + + command-any = command-capability / command-logout / + command-noop + ;; Valid in all states + + command-auth = command-getscript / command-setactive / + command-listscripts / command-deletescript / + command-putscript / command-checkscript / + command-havespace / + command-renamescript / + command-unauthenticate + ;; Valid only in Authenticated state + + command-nonauth = command-authenticate / command-starttls + ;; Valid only when in Non-Authenticated + ;; state + + command-authenticate = "AUTHENTICATE" SP auth-type [SP string] + *(CRLF string) + + command-capability = "CAPABILITY" + + command-deletescript = "DELETESCRIPT" SP sieve-name + + command-getscript = "GETSCRIPT" SP sieve-name + + command-havespace = "HAVESPACE" SP sieve-name SP number + + command-listscripts = "LISTSCRIPTS" + + command-noop = "NOOP" [SP string] + + + + +Melnikov & Martin Standards Track [Page 32] + +RFC 5804 ManageSieve July 2010 + + + command-logout = "LOGOUT" + + command-putscript = "PUTSCRIPT" SP sieve-name SP sieve-script + + command-checkscript = "CHECKSCRIPT" SP sieve-script + + sieve-script = string + + 
command-renamescript = "RENAMESCRIPT" SP old-sieve-name SP + new-sieve-name + + old-sieve-name = sieve-name + + new-sieve-name = sieve-name + + command-setactive = "SETACTIVE" SP active-sieve-name + + command-starttls = "STARTTLS" + + command-unauthenticate= "UNAUTHENTICATE" + + extend-token = atom + ;; MUST be defined by a Standards Track or + ;; IESG-approved experimental protocol + ;; extension + + extension-data = extension-item *(SP extension-item) + + extension-item = extend-token / string / number / + "(" [extension-data] ")" + + literal-c2s = "{" number "+}" CRLF *OCTET + ;; The number represents the number of + ;; octets. + ;; This type of literal can only be sent + ;; from the client to the server. + + literal-s2c = "{" number "}" CRLF *OCTET + ;; Almost identical to literal-c2s, + ;; but with no '+' character. + ;; The number represents the number of + ;; octets. + ;; This type of literal can only be sent + ;; from the server to the client. + + + + + + + +Melnikov & Martin Standards Track [Page 33] + +RFC 5804 ManageSieve July 2010 + + + number = (NZDIGIT *DIGIT) / "0" + ;; A 32-bit unsigned number + ;; with no extra leading zeros. + ;; (0 <= n < 4,294,967,296) + + number-str = string + ;; <number> encoded as a <string>. + + quoted = DQUOTE *1024QUOTED-CHAR DQUOTE + ;; limited to 1024 octets between the <">s + + resp-code = "AUTH-TOO-WEAK" / "ENCRYPT-NEEDED" / "QUOTA" + ["/" ("MAXSCRIPTS" / "MAXSIZE")] / + resp-code-sasl / + resp-code-referral / + "TRANSITION-NEEDED" / "TRYLATER" / + "ACTIVE" / "NONEXISTENT" / + "ALREADYEXISTS" / "WARNINGS" / + "TAG" SP string / + resp-code-ext + + resp-code-referral = "REFERRAL" SP sieveurl + + resp-code-sasl = "SASL" SP string + + resp-code-name = iana-token + ;; The response code name is hierarchical, + ;; separated by '/'. + ;; The response code name MUST NOT start + ;; with '/'. + + resp-code-ext = resp-code-name [SP extension-data] + ;; unknown response codes MUST be tolerated + ;; by the client. 
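The number rule above combines a syntactic constraint (no extra leading zeros) with a range constraint (a 32-bit unsigned value). A minimal sketch of a validator, with the hypothetical name parse_number, might look like this:

```python
import re

# number = (NZDIGIT *DIGIT) / "0"
# [0-9] is used instead of \d so that only ASCII digits are accepted.
_NUMBER = re.compile(r"[1-9][0-9]*|0")


def parse_number(token: str) -> int:
    """Parse a ManageSieve <number> token (hypothetical helper)."""
    if not _NUMBER.fullmatch(token):
        raise ValueError("not a valid number token: %r" % token)
    value = int(token)
    # The grammar comment bounds the value: 0 <= n < 4,294,967,296.
    if value >= 2 ** 32:
        raise ValueError("number does not fit in 32 bits: %r" % token)
    return value
```

The same check can be applied to the count inside a literal, since both literal forms reuse the number rule.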
+ + response = response-authenticate / + response-logout / + response-getscript / + response-setactive / + response-listscripts / + response-deletescript / + response-putscript / + response-checkscript / + response-capability / + response-havespace / + response-starttls / + response-renamescript / + response-noop / + + + +Melnikov & Martin Standards Track [Page 34] + +RFC 5804 ManageSieve July 2010 + + + response-unauthenticate + + response-authenticate = *(string CRLF) + ((response-ok [response-capability]) / + response-nobye) + ;; <response-capability> is REQUIRED if a + ;; SASL security layer was negotiated and + ;; MUST be omitted otherwise. + + response-capability = *(single-capability) response-oknobye + + single-capability = capability-name [SP string] CRLF + + capability-name = string + + ;; Note that literal-s2c is allowed. + + initial-capabilities = DQUOTE "IMPLEMENTATION" DQUOTE SP string / + DQUOTE "SASL" DQUOTE SP sasl-mechs / + DQUOTE "SIEVE" DQUOTE SP sieve-extensions / + DQUOTE "MAXREDIRECTS" DQUOTE SP number-str / + DQUOTE "NOTIFY" DQUOTE SP notify-mechs / + DQUOTE "STARTTLS" DQUOTE / + DQUOTE "LANGUAGE" DQUOTE SP language / + DQUOTE "VERSION" DQUOTE SP version / + DQUOTE "OWNER" DQUOTE SP string + ;; Each capability conforms to + ;; the syntax for single-capability. + ;; Also, note that the capability name + ;; can be returned as either literal-s2c + ;; or quoted, even though only "quoted" + ;; string is shown above. + + version = ( DQUOTE "1.0" DQUOTE ) / version-ext + + version-ext = DQUOTE ver-major "." ver-minor DQUOTE + ; Future versions specified in updates + ; to this document. An increment to + ; the ver-major means a backward-incompatible + ; change to the protocol, e.g., "3.5" (ver-major "3") + ; is not backward-compatible with any "2.X" version. + ; Any version "Z.W" MUST be backward compatible + ; with any version "Z.Q", where Q < W. + ; For example, version "2.4" is backward compatible + ; with version "2.0", "2.1", "2.2", and "2.3". 
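The compatibility rule in the version-ext comment can be written as a small predicate. This is only a sketch under the assumption that both arguments are the unquoted value of a VERSION capability. The names parse_version and is_backward_compatible are hypothetical.

```python
from typing import Tuple


def parse_version(value: str) -> Tuple[int, int]:
    """Split an unquoted version such as "2.4" into (ver-major, ver-minor)."""
    major, minor = value.split(".")
    return int(major), int(minor)


def is_backward_compatible(newer: str, older: str) -> bool:
    """Return True if version `newer` is backward compatible with `older`."""
    new_major, new_minor = parse_version(newer)
    old_major, old_minor = parse_version(older)
    # A change of ver-major is backward-incompatible. Within one ver-major,
    # "Z.W" is compatible with every "Z.Q" where Q < W. Equal versions are
    # treated as compatible here, which the rule does not state explicitly.
    return new_major == old_major and new_minor >= old_minor
```

For instance, is_backward_compatible("2.4", "2.0") holds, while is_backward_compatible("3.5", "2.9") does not.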
+ + ver-major = number + + + + +Melnikov & Martin Standards Track [Page 35] + +RFC 5804 ManageSieve July 2010 + + + ver-minor = number + + sasl-mechs = string + ; Space-separated list of SASL mechanisms, + ; each SASL mechanism name complies with rules + ; specified in [SASL]. + ; Can be empty. + + sieve-extensions = string + ; Space-separated list of supported SIEVE extensions. + ; Can be empty. + + language = string + ; Contains <Language-Tag> from [RFC5646]. + + + notify-mechs = string + ; Space-separated list of URI schema parts + ; for supported notification [NOTIFY] methods. + ; MUST NOT be empty. + + response-deletescript = response-oknobye + + response-getscript = (sieve-script CRLF response-ok) / + response-nobye + + response-havespace = response-oknobye + + response-listscripts = *(sieve-name [SP "ACTIVE"] CRLF) + response-oknobye + ;; ACTIVE may only occur with one sieve-name + + response-logout = response-oknobye + + response-unauthenticate= response-oknobye + ;; "NO" response can only be returned when + ;; the command is issued in a wrong state + ;; or has a wrong number of parameters + + response-ok = "OK" [SP "(" resp-code ")"] + [SP string] CRLF + ;; The string contains human-readable text + ;; encoded as UTF-8. + + response-nobye = ("NO" / "BYE") [SP "(" resp-code ")"] + [SP string] CRLF + ;; The string contains human-readable text + ;; encoded as UTF-8. + + + +Melnikov & Martin Standards Track [Page 36] + +RFC 5804 ManageSieve July 2010 + + + response-oknobye = response-ok / response-nobye + + response-noop = response-ok + + response-putscript = response-oknobye + + response-checkscript = response-oknobye + + response-renamescript = response-oknobye + + response-setactive = response-oknobye + + response-starttls = (response-ok response-capability) / + response-nobye + + sieve-name = string + ;; See Section 1.6 for the full list of + ;; prohibited characters. + ;; Empty string is not allowed. 
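To make the response-ok and response-nobye rules concrete, here is a deliberately limited sketch of a parser for a single status line. It assumes the human-readable text, if present, is a quoted string. It does not handle literal-s2c text or response codes whose arguments contain ")", both of which a complete client must support. The names Response and parse_response are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Response(NamedTuple):
    status: str            # "OK", "NO" or "BYE", normalized to upper case
    code: Optional[str]    # raw response code without parentheses, if any
    text: Optional[str]    # unescaped human-readable text, if any


_RESPONSE = re.compile(
    r'(?P<status>OK|NO|BYE)'                  # status keyword, case-insensitive
    r'(?: \((?P<code>[^)]*)\))?'              # optional SP "(" resp-code ")"
    r'(?: "(?P<text>(?:[^"\\]|\\.)*)")?',     # optional SP quoted string
    re.IGNORECASE,
)


def parse_response(line: str) -> Response:
    """Parse one OK/NO/BYE line whose text is a quoted string (simplified)."""
    match = _RESPONSE.fullmatch(line.rstrip("\r\n"))
    if match is None:
        raise ValueError("unsupported or malformed response line: %r" % line)
    text = match.group("text")
    if text is not None:
        # Undo the backslash escaping of QUOTED-SPECIALS.
        text = re.sub(r"\\(.)", r"\1", text)
    return Response(match.group("status").upper(), match.group("code"), text)
```

Because unknown response codes must be tolerated by the client, the sketch keeps the code as an uninterpreted string and leaves its meaning to the caller.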
+ + active-sieve-name = string + ;; See Section 1.6 for the full list of + ;; prohibited characters. + ;; This is similar to <sieve-name>, but + ;; empty string is allowed and has a special + ;; meaning. + + string = quoted / literal-c2s / literal-s2c + ;; literal-c2s is only allowed when sent + ;; from the client to the server. + ;; literal-s2c is only allowed when sent + ;; from the server to the client. + ;; quoted is allowed in either direction. + +5. Security Considerations + + The AUTHENTICATE command uses SASL [SASL] to provide authentication + and authorization services. Integrity and privacy services can be + provided by [SASL] and/or [TLS]. When a SASL mechanism is used, the + security considerations for that mechanism apply. + + This protocol's transactions are susceptible to passive observers or + man-in-the-middle attacks that alter the data, unless the optional + encryption and integrity services of the SASL (via the AUTHENTICATE + command) and/or [TLS] (via the STARTTLS command) are enabled, or an + external security mechanism is used for protection. It may be useful + to allow configuration of both clients and servers to refuse to + transfer sensitive information in the absence of strong encryption. + + + +Melnikov & Martin Standards Track [Page 37] + +RFC 5804 ManageSieve July 2010 + + + If an implementation supports SASL mechanisms that are vulnerable to + passive eavesdropping attacks (such as [PLAIN]), then the + implementation MUST support at least one configuration where these + SASL mechanisms are not advertised or used without the presence of an + external security layer such as [TLS]. + + Some response codes returned on failed AUTHENTICATE command may + disclose whether or not the username is valid (e.g., TRANSITION- + NEEDED), so server implementations SHOULD provide the ability to + disable these features (or make them not conditional on a per-user + basis) for sites concerned about such disclosure. 
+   In the case of ENCRYPT-NEEDED, if it is applied to all identities
+   then no extra information is disclosed, but if it is applied on a
+   per-user basis it can disclose information.
+
+   A compromised or malicious server can use the TRANSITION-NEEDED
+   response code to force the client that is configured to use a
+   mechanism that does not disclose the user's password to the server
+   (e.g., Kerberos), to send the bare password to the server. Clients
+   SHOULD have the ability to disable the password transition feature,
+   or disclose that risk to the user and offer the user an option of how
+   to proceed.
+
+6. IANA Considerations
+
+   IANA has reserved TCP port number 4190 for use with the ManageSieve
+   protocol described in this document.
+
+   IANA has registered the "sieve" URI scheme defined in Section 3 of
+   this document.
+
+   IANA has registered "sieve" in the "GSSAPI/Kerberos/SASL Service
+   Names" registry.
+
+   IANA has created a new registry for ManageSieve capabilities. The
+   registration template for ManageSieve capabilities is specified in
+   Section 6.1. ManageSieve protocol capabilities MUST be specified in
+   a Standards-Track or IESG-approved Experimental RFC.
+
+   IANA has created a new registry for ManageSieve response codes. The
+   registration template for ManageSieve response codes is specified in
+   Section 6.3. ManageSieve protocol response codes MUST be specified
+   in a Standards-Track or IESG-approved Experimental RFC.
+
+6.1. ManageSieve Capability Registration Template
+
+   To: iana@iana.org
+   Subject: ManageSieve Capability Registration
+
+   Please register the following ManageSieve capability:
+
+   Capability name:
+   Description:
+   Relevant publications:
+   Person & email address to contact for further information:
+   Author/Change controller:
+
+6.2. Registration of Initial ManageSieve Capabilities
+
+   To: iana@iana.org
+   Subject: ManageSieve Capability Registration
+
+   Please register the following ManageSieve capabilities:
+
+   Capability name: IMPLEMENTATION
+   Description: Its value contains the name of the server
+                implementation and its version.
+   Relevant publications: this RFC, Section 1.7.
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Capability name: SASL
+   Description: Its value contains a space-separated list of SASL
+                mechanisms supported by the server.
+   Relevant publications: this RFC, Sections 1.7 and 2.1.
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Capability name: SIEVE
+   Description: Its value contains a space-separated list of supported
+                SIEVE extensions.
+   Relevant publications: this RFC, Section 1.7. Also [SIEVE].
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Capability name: STARTTLS
+   Description: This capability is returned if the server supports TLS
+                (STARTTLS command).
+   Relevant publications: this RFC, Sections 1.7 and 2.2.
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Capability name: NOTIFY
+   Description: This capability is returned if the server supports the
+                'enotify' [NOTIFY] Sieve extension.
+   Relevant publications: this RFC, Section 1.7.
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Capability name: MAXREDIRECTS
+   Description: This capability returns the limit on the number of
+                Sieve "redirect" actions a script can perform during a
+                single evaluation. The value is a non-negative number
+                represented as a ManageSieve string.
+   Relevant publications: this RFC, Section 1.7.
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Capability name: LANGUAGE
+   Description: The language (<Language-Tag> from [RFC5646]) currently
+                used for human-readable error messages.
+   Relevant publications: this RFC, Section 1.7.
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Capability name: OWNER
+   Description: Its value contains the UTF-8-encoded name of the
+                currently logged-in user ("authorization identity"
+                according to RFC 4422).
+   Relevant publications: this RFC, Section 1.7.
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Capability name: VERSION
+   Description: This capability is returned if the server is compliant
+                with RFC 5804; i.e., that it supports RENAMESCRIPT,
+                CHECKSCRIPT, and NOOP commands.
+   Relevant publications: this RFC, Sections 2.11, 2.12, and 2.13.
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+6.3. ManageSieve Response Code Registration Template
+
+   To: iana@iana.org
+   Subject: ManageSieve Response Code Registration
+
+   Please register the following ManageSieve response code:
+
+   Response Code:
+   Arguments (use ABNF to specify syntax, or the word NONE if none
+      can be specified):
+   Purpose:
+   Published Specification(s):
+   Person & email address to contact for further information:
+   Author/Change controller:
+
+6.4. Registration of Initial ManageSieve Response Codes
+
+   To: iana@iana.org
+   Subject: ManageSieve Response Code Registration
+
+   Please register the following ManageSieve response codes:
+
+   Response Code: AUTH-TOO-WEAK
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: This response code is returned in the NO response from
+            an AUTHENTICATE command. It indicates that site
+            security policy forbids the use of the requested
+            mechanism for the specified authentication identity.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: ENCRYPT-NEEDED
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: This response code is returned in the NO response from
+            an AUTHENTICATE command. It indicates that site
+            security policy requires the use of a strong
+            encryption mechanism for the specified authentication
+            identity and mechanism.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: QUOTA
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: If this response code is returned in the NO/BYE
+            response, it means that the command would have placed
+            the user above the site-defined quota constraints. If
+            this response code is returned in the OK response, it
+            can mean that the user is near its quota or that the
+            user exceeded its quota, but the server supports soft
+            quotas.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: QUOTA/MAXSCRIPTS
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: If this response code is returned in the NO/BYE
+            response, it means that the command would have placed
+            the user above the site-defined limit on the number of
+            Sieve scripts. If this response code is returned in
+            the OK response, it can mean that the user is near its
+            quota or that the user exceeded its quota, but the
+            server supports soft quotas. This response code is a
+            more specific version of the QUOTA response code.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: QUOTA/MAXSIZE
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: If this response code is returned in the NO/BYE
+            response, it means that the command would have placed
+            the user above the site-defined maximum script size.
+            If this response code is returned in the OK response,
+            it can mean that the user is near its quota or that
+            the user exceeded its quota, but the server supports
+            soft quotas.
+            This response code is a more specific version of the QUOTA
+            response code.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: REFERRAL
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): <sieveurl>
+   Purpose: This response code may be returned with a BYE result
+            from any command, and includes a mandatory parameter
+            that indicates what server to access to manage this
+            user's Sieve scripts. The server will be specified by
+            a Sieve URL (see Section 3). The scriptname portion
+            of the URL MUST NOT be specified. The client should
+            authenticate to the specified server and use it for
+            all further commands in the current session.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: SASL
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): <string>
+   Purpose: This response code can occur in the OK response to a
+            successful AUTHENTICATE command and includes the
+            optional final server response data from the server as
+            specified by [SASL].
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: TRANSITION-NEEDED
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: This response code occurs in a NO response of an
+            AUTHENTICATE command. It indicates that the user name
+            is valid, but the entry in the authentication database
+            needs to be updated in order to permit authentication
+            with the specified mechanism.
+            This is typically done by establishing a secure channel
+            using TLS, followed by authenticating once using the
+            [PLAIN] authentication mechanism. The selected mechanism
+            SHOULD then work for authentications in subsequent
+            sessions.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: TRYLATER
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: A command failed due to a temporary server failure.
+            The client MAY continue using local information and
+            try the command later. This response code only makes
+            sense when returned in a NO/BYE response.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: ACTIVE
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: A command failed because it is not allowed on the
+            active script, for example, DELETESCRIPT on the active
+            script. This response code only makes sense when
+            returned in a NO/BYE response.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: NONEXISTENT
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: A command failed because the referenced script name
+            doesn't exist. This response code only makes sense
+            when returned in a NO/BYE response.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: ALREADYEXISTS
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: A command failed because the referenced script name
+            already exists. This response code only makes sense
+            when returned in a NO/BYE response.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: WARNINGS
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): NONE
+   Purpose: This response code MAY be returned by the server in
+            the OK response (but it might be returned with the
+            NO/BYE response as well) and signals the client that
+            even though the script is syntactically valid, it might
+            contain errors not intended by the script writer.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+   Response Code: TAG
+   Arguments (use ABNF to specify syntax, or the word NONE if none can
+      be specified): string
+   Purpose: This response code name is followed by a string
+            specified in the command that caused this response.
+            It is typically used for client state synchronization.
+   Published Specification(s): [RFC5804]
+   Person & email address to contact for further information:
+      Alexey Melnikov <alexey.melnikov@isode.com>
+   Author/Change controller: IESG.
+
+7. Internationalization Considerations
+
+   The LANGUAGE capability (see Section 1.7) allows a client to discover
+   the current language used in all human-readable responses that might
+   be returned at the end of any OK/NO/BYE response.
+   Human-readable text in OK responses typically doesn't need to be
+   shown to the user, unless it is returned in response to a PUTSCRIPT
+   or CHECKSCRIPT command that also contains the WARNINGS response code
+   (Section 1.3). Human-readable text from NO/BYE responses is intended
+   to be shown to the user, unless the client can automatically handle
+   failure of the command that caused such a response. Clients SHOULD
+   use response codes (Section 1.3) for automatic error handling.
+   Response codes MAY also be used by the client to present error
+   messages in a language understood by the user, for example, if the
+   LANGUAGE capability doesn't return a language understood by the user.
+
+   Note that the human-readable text from OK (WARNINGS) or NO/BYE
+   responses for PUTSCRIPT/CHECKSCRIPT commands is intended for advanced
+   users that understand Sieve language. Such advanced users are often
+   sophisticated enough to be able to handle whatever language the
+   server is using, even if it is not their preferred language, and will
+   want to see error/warning text no matter what language the server
+   puts it in.
+
+   A client that generates Sieve script automatically, for example, if
+   the script is generated without user intervention or from a UI that
+   presents an abstract list of conditions and corresponding actions,
+   SHOULD NOT present warning/error messages to the user, because the
+   user might not even be aware that the client is using Sieve
+   underneath. However, if the client has a debugging mode, such
+   warnings/errors SHOULD be available in the debugging mode.
+
+   Note that this document doesn't provide a way to modify the currently
+   used language. It is expected that a future extension will address
+   that.
+
+8. Acknowledgements
+
+   Thanks to Simon Josefsson, Larry Greenfield, Allen Johnson, Chris
+   Newman, Lyndon Nerenberg, Tim Showalter, Sarah Robeson, Walter Wong,
+   Barry Leiba, Arnt Gulbrandsen, Stephan Bosch, Ken Murchison, Phil
+   Pennock, Ned Freed, Jeffrey Hutzelman, Mark E. Mallett, Dilyan
+   Palauzov, Dave Cridland, Aaron Stone, Robert Burrell Donkin, Patrick
+   Ben Koetter, Bjoern Hoehrmann, Martin Duerst, Pasi Eronen, Magnus
+   Westerlund, Tim Polk, and Julien Coloos for help with this document.
+   Special thank you to Phil Pennock for providing text for the NOOP
+   command, as well as finding various bugs in the document.
+
+9. References
+
+9.1. Normative References
+
+   [ABNF]         Crocker, D. and P. Overell, "Augmented BNF for Syntax
+                  Specifications: ABNF", STD 68, RFC 5234, January 2008.
+
+   [ACAP]         Newman, C. and J. Myers, "ACAP -- Application
+                  Configuration Access Protocol", RFC 2244, November
+                  1997.
+
+   [BASE64]       Josefsson, S., "The Base16, Base32, and Base64 Data
+                  Encodings", RFC 4648, October 2006.
+
+   [DNS-SRV]      Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR
+                  for specifying the location of services (DNS SRV)",
+                  RFC 2782, February 2000.
+
+   [KEYWORDS]     Bradner, S., "Key words for use in RFCs to Indicate
+                  Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+   [NET-UNICODE]  Klensin, J. and M. Padlipsky, "Unicode Format for
+                  Network Interchange", RFC 5198, March 2008.
+
+   [NOTIFY]       Melnikov, A., Leiba, B., Segmuller, W., and T. Martin,
+                  "Sieve Email Filtering: Extension for Notifications",
+                  RFC 5435, January 2009.
+
+   [RFC2277]      Alvestrand, H., "IETF Policy on Character Sets and
+                  Languages", BCP 18, RFC 2277, January 1998.
+
+   [RFC2460]      Deering, S. and R. Hinden, "Internet Protocol, Version
+                  6 (IPv6) Specification", RFC 2460, December 1998.
+
+   [RFC3490]      Faltstrom, P., Hoffman, P., and A.
+                  Costello, "Internationalizing Domain Names in
+                  Applications (IDNA)", RFC 3490, March 2003.
+
+   [RFC4519]      Sciberras, A., "Lightweight Directory Access Protocol
+                  (LDAP): Schema for User Applications", RFC 4519, June
+                  2006.
+
+   [RFC5646]      Phillips, A. and M. Davis, "Tags for Identifying
+                  Languages", BCP 47, RFC 5646, September 2009.
+
+   [RFC791]       Postel, J., "Internet Protocol", STD 5, RFC 791,
+                  September 1981.
+
+   [SASL]         Melnikov, A. and K. Zeilenga, "Simple Authentication
+                  and Security Layer (SASL)", RFC 4422, June 2006.
+
+   [SASLprep]     Zeilenga, K., "SASLprep: Stringprep Profile for User
+                  Names and Passwords", RFC 4013, February 2005.
+
+   [SCRAM]        Menon-Sen, A., Melnikov, A., Newman, C., and N.
+                  Williams, "Salted Challenge Response Authentication
+                  Mechanism (SCRAM) SASL and GSS-API Mechanisms", RFC
+                  5802, July 2010.
+
+   [SIEVE]        Guenther, P. and T. Showalter, "Sieve: An Email
+                  Filtering Language", RFC 5228, January 2008.
+
+   [StringPrep]   Hoffman, P. and M. Blanchet, "Preparation of
+                  Internationalized Strings ("stringprep")", RFC 3454,
+                  December 2002.
+
+   [TLS]          Dierks, T. and E. Rescorla, "The Transport Layer
+                  Security (TLS) Protocol Version 1.2", RFC 5246, August
+                  2008.
+
+   [URI-GEN]      Berners-Lee, T., Fielding, R., and L. Masinter,
+                  "Uniform Resource Identifier (URI): Generic Syntax",
+                  STD 66, RFC 3986, January 2005.
+
+   [UTF-8]        Yergeau, F., "UTF-8, a transformation format of ISO
+                  10646", STD 63, RFC 3629, November 2003.
+
+   [X509]         Cooper, D., Santesson, S., Farrell, S., Boeyen, S.,
+                  Housley, R., and W. Polk, "Internet X.509 Public Key
+                  Infrastructure Certificate and Certificate Revocation
+                  List (CRL) Profile", RFC 5280, May 2008.
+
+   [X509-SRV]     Santesson, S., "Internet X.509 Public Key
+                  Infrastructure Subject Alternative Name for Expression
+                  of Service Name", RFC 4985, August 2007.
+
+9.2. Informative References
+
+   [DIGEST-MD5]   Leach, P. and C.
+                  Newman, "Using Digest Authentication as a SASL
+                  Mechanism", RFC 2831, May 2000.
+
+   [GSSAPI]       Melnikov, A., "The Kerberos V5 ("GSSAPI") Simple
+                  Authentication and Security Layer (SASL) Mechanism",
+                  RFC 4752, November 2006.
+
+   [I-HAVE]       Freed, N., "Sieve Email Filtering: Ihave Extension",
+                  RFC 5463, March 2009.
+
+   [IMAP]         Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL -
+                  VERSION 4rev1", RFC 3501, March 2003.
+
+   [LDAP]         Zeilenga, K., "Lightweight Directory Access Protocol
+                  (LDAP): Technical Specification Road Map", RFC 4510,
+                  June 2006.
+
+   [PLAIN]        Zeilenga, K., "The PLAIN Simple Authentication and
+                  Security Layer (SASL) Mechanism", RFC 4616, August
+                  2006.
+
+Authors' Addresses
+
+   Alexey Melnikov (editor)
+   Isode Limited
+   5 Castle Business Village
+   36 Station Road
+   Hampton, Middlesex TW12 2BX
+   UK
+
+   EMail: Alexey.Melnikov@isode.com
+
+
+   Tim Martin
+   BeThereBeSquare, Inc.
+   672 Haight st.
+   San Francisco, CA 94117
+   USA
+
+   Phone: +1 510 260-4175
+   EMail: timmartin@alumni.cmu.edu