      7 Network Working Group                                         M. Crispin
      8 Request for Comments: 5256                             Panda Programming
      9 Category: Standards Track                                   K. Murchison
     10                                               Carnegie Mellon University
     11                                                                June 2008
     12 
     13 
     14      Internet Message Access Protocol - SORT and THREAD Extensions
     15 
     16 Status of This Memo
     17 
     18    This document specifies an Internet standards track protocol for the
     19    Internet community, and requests discussion and suggestions for
     20    improvements.  Please refer to the current edition of the "Internet
     21    Official Protocol Standards" (STD 1) for the standardization state
     22    and status of this protocol.  Distribution of this memo is unlimited.
     23 
     24 Abstract
     25 
     26    This document describes the base-level server-based sorting and
     27    threading extensions to the IMAP protocol.  These extensions provide
     28    substantial performance improvements for IMAP clients that offer
     29    sorted and threaded views.
     30 
     31 1.  Introduction
     32 
     33    The SORT and THREAD extensions to the [IMAP] protocol provide a means
     34    of server-based sorting and threading of messages, without requiring
     35    that the client download the necessary data to do so itself.  This is
     36    particularly useful for online clients as described in [IMAP-MODELS].
     37 
     38    A server that supports the base-level SORT extension indicates this
     39    with a capability name which starts with "SORT".  Future, upwards-
     40    compatible extensions to the SORT extension will all start with
     41    "SORT", indicating support for this base level.
     42 
     43    A server that supports the THREAD extension indicates this with one
     44    or more capability names consisting of "THREAD=" followed by a
     45    supported threading algorithm name as described in this document.
     46    This provides for future upwards-compatible extensions.
     47 
     48    A server that implements the SORT and/or THREAD extensions MUST
     49    collate strings in accordance with the requirements of I18NLEVEL=1,
     50    as described in [IMAP-I18N], and SHOULD implement and advertise the
     51    I18NLEVEL=1 extension.  Alternatively, a server MAY implement
     52    I18NLEVEL=2 (or higher) and comply with the rules of that level.
     53 
     54 
     55 
     56 
     57 
     58 Crispin & Murchison         Standards Track                     [Page 1]
     59 
     60 RFC 5256                       IMAP Sort                       June 2008
     61 
     62 
     63       Discussion: The SORT and THREAD extensions predate [IMAP-I18N] by
     64       several years.  At the time of this writing, all known server
     65       implementations of SORT and THREAD comply with the rules of
     66       I18NLEVEL=1, but do not necessarily advertise it.  As discussed in
     67       [IMAP-I18N] section 4.5, all server implementations should
     68       eventually be updated to comply with the I18NLEVEL=2 extension.
     69 
     70    Historical note: The REFERENCES threading algorithm is based on the
     71    [THREADING] algorithm written and used in "Netscape Mail and News"
     72    versions 2.0 through 3.0.
     73 
     74 2.  Terminology
     75 
     76    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
     77    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
     78    document are to be interpreted as described in [KEYWORDS].
     79 
     80    The word "can" (not "may") is used to refer to a possible
     81    circumstance or situation, as opposed to an optional facility of the
     82    protocol.
     83 
     84    "User" is used to refer to a human user, whereas "client" refers to
     85    the software being run by the user.
     86 
     87    In examples, "C:" and "S:" indicate lines sent by the client and
     88    server, respectively.
     89 
     90 2.1.  Base Subject
     91 
     92    Subject sorting and threading use the "base subject", which has
     93    specific subject artifacts removed.  Due to the complexity of these
     94    artifacts, the formal syntax for the subject extraction rules is
     95    ambiguous.  The following procedure is followed to determine the
     96    "base subject", using the [ABNF] formal syntax rules described in
     97    section 5:
     98 
     99       (1) Convert any RFC 2047 encoded-words in the subject to [UTF-8]
    100           as described in "Internationalization Considerations".
    101           Convert all tabs and continuations to space.  Convert all
    102           multiple spaces to a single space.
    103 
    104       (2) Remove all trailing text of the subject that matches the
    105           subj-trailer ABNF; repeat until no more matches are possible.
    106 
    107       (3) Remove all prefix text of the subject that matches the subj-
    108           leader ABNF.
    109 
    119       (4) If there is prefix text of the subject that matches the subj-
    120           blob ABNF, and removing that prefix leaves a non-empty subj-
    121           base, then remove the prefix text.
    122 
    123       (5) Repeat (3) and (4) until no matches remain.
    124 
    125    Note: It is possible to defer step (2) until step (6), but this
    126    requires checking for subj-trailer in step (4).
    127 
    128       (6) If the resulting text begins with the subj-fwd-hdr ABNF and
    129           ends with the subj-fwd-trl ABNF, remove the subj-fwd-hdr and
    130           subj-fwd-trl and repeat from step (2).
    131 
    132       (7) The resulting text is the "base subject" used in the SORT.
    133 
    134    All servers and disconnected (as described in [IMAP-MODELS]) clients
    135    MUST use exactly this algorithm to determine the "base subject".
    136    Otherwise, there is potential for a user to get inconsistent results
    137    based on whether they are running in connected or disconnected mode.
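
The procedure above can be sketched in Python as follows. This is a minimal illustration, not a conforming implementation: the regular expressions are an approximation of the subj-blob, subj-leader, and subj-trailer rules from section 5 (which is not reproduced here), the function name base_subject is hypothetical, and the input is assumed to have had its RFC 2047 encoded-words decoded already.

```python
import re

# Approximations of the section 5 ABNF (assumed; not part of this excerpt).
BLOB = r"\[[^\[\]]*\][ \t]*"                       # subj-blob
TRAILER = re.compile(r"(?:\(fwd\)|[ \t])$", re.I)  # subj-trailer
LEADER = re.compile(                               # subj-leader
    r"^(?:(?:%s)*(?:re|fwd?)[ \t]*(?:%s)?:|[ \t])" % (BLOB, BLOB), re.I)
BLOB_PREFIX = re.compile("^" + BLOB)


def base_subject(subject):
    """Return the base subject of an already-decoded subject string."""
    # Step 1 (partial): tabs and continuations to space, collapse spaces.
    s = re.sub(r"[\t\r\n]+", " ", subject)
    s = re.sub(r" {2,}", " ", s)
    while True:
        # Step 2: strip trailers until none remain.
        while TRAILER.search(s):
            s = TRAILER.sub("", s)
        # Steps 3-5: strip leaders and blobs until nothing changes.
        while True:
            before = s
            s = LEADER.sub("", s, count=1)          # step 3
            m = BLOB_PREFIX.match(s)                # step 4
            if m and m.end() < len(s):              # keep a non-empty base
                s = s[m.end():]
            if s == before:                         # step 5
                break
        # Step 6: unwrap "[fwd: ... ]" and start over from step 2.
        if s[:5].lower() == "[fwd:" and s.endswith("]"):
            s = s[5:-1]
            continue
        return s                                    # step 7
```

For instance, under these assumptions both "Re: [list] Re: Hello (fwd)" and "[fwd: Re: Hello]" reduce to "Hello".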
    138 
    139 2.2.  Sent Date
    140 
    141    As used in this document, the term "sent date" refers to the date and
    142    time from the Date: header, adjusted by time zone to normalize to
    143    UTC.  For example, "31 Dec 2000 16:01:33 -0800" is equivalent to the
    144    UTC date and time of "1 Jan 2001 00:01:33 +0000".
    145 
    146    If the time zone is invalid, the date and time SHOULD be treated as
    147    UTC.  If the time is also invalid, the time SHOULD be treated as
    148    00:00:00.  If there is no valid date or time, the date and time
    149    SHOULD be treated as 00:00:00 on the earliest possible date.
    150 
    151    This differs from the date-related criteria in the SEARCH command
    152    (described in [IMAP] section 6.4.4), which use just the date and not
    153    the time, and are not adjusted by time zone.
    154 
    155    If the sent date cannot be determined (a Date: header is missing or
    156    cannot be parsed), the INTERNALDATE for that message is used as the
    157    sent date.
    158 
    159    When comparing two sent dates that match exactly, the order in which
    160    the two messages appear in the mailbox (that is, by sequence number)
    161    is used as a tie-breaker to determine the order.
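
A minimal Python sketch of the sent date rules follows. It leans on the standard library parser, so it covers only the UTC normalization, the INTERNALDATE fallback, and the sequence-number tie-breaker; the finer-grained SHOULD rules for a partially invalid Date: header are not modelled. The names sent_date and date_sort_key are hypothetical, and internaldate is assumed to be a timezone-aware datetime supplied by the caller.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def sent_date(date_header, internaldate):
    """Return the sent date as an aware UTC datetime."""
    if not date_header:
        return internaldate                 # Date: header is missing
    try:
        parsed = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        return internaldate                 # Date: header cannot be parsed
    if parsed.tzinfo is None:
        # No usable zone information: treat the value as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def date_sort_key(seq, date_header, internaldate):
    """Sort key: sent date first, sequence number as the tie-breaker."""
    return (sent_date(date_header, internaldate), seq)
```

With this sketch, "31 Dec 2000 16:01:33 -0800" maps to the same instant as "1 Jan 2001 00:01:33 +0000", matching the example in this section.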
    162 
    175 3.  Additional Commands
    176 
    177    These commands are extensions to the [IMAP] base protocol.
    178 
    179    The section headings are intended to correspond with where they would
    180    be located in the main document if they were part of the base
    181    specification.
    182 
    183 BASE.6.4.SORT. SORT Command
    184 
    185    Arguments:  sort program
    186                charset specification
    187                searching criteria (one or more)
    188 
    189    Data:       untagged responses: SORT
    190 
    191    Result:     OK - sort completed
    192                NO - sort error: can't sort that charset or
    193                     criteria
    194                BAD - command unknown or arguments invalid
    195 
    196       The SORT command is a variant of SEARCH with sorting semantics for
    197       the results.  There are two arguments before the searching
    198       criteria argument: a parenthesized list of sort criteria, and the
    199       searching charset.
    200 
    201       The charset argument is mandatory (unlike SEARCH) and indicates
    202       the [CHARSET] of the strings that appear in the searching
    203       criteria.  The US-ASCII and [UTF-8] charsets MUST be implemented.
    204       All other charsets are optional.
    205 
    206       There is also a UID SORT command that returns unique identifiers
    207       instead of message sequence numbers.  Note that there are separate
    208       searching criteria for message sequence numbers and UIDs; thus,
    209       the arguments to UID SORT are interpreted the same as in SORT.
    210       This is analogous to the behavior of UID SEARCH, as opposed to UID
    211       COPY, UID FETCH, or UID STORE.
    212 
    213       The SORT command first searches the mailbox for messages that
    214       match the given searching criteria using the charset argument for
    215       the interpretation of strings in the searching criteria.  It then
    216       returns the matching messages in an untagged SORT response, sorted
    217       according to one or more sort criteria.
    218 
    219       Sorting is in ascending order.  Earlier dates sort before later
    220       dates; smaller sizes sort before larger sizes; and strings are
    221       sorted according to ascending values established by their
    222       collation algorithm (see "Internationalization Considerations").
    223 
    231       If two or more messages exactly match according to the sorting
    232       criteria, these messages are sorted according to the order in
    233       which they appear in the mailbox.  In other words, there is an
    234       implicit sort criterion of "sequence number".
    235 
    236       When multiple sort criteria are specified, the result is sorted in
    237       the priority order that the criteria appear.  For example,
    238       (SUBJECT DATE) will sort messages in order by their base subject
    239       text; and for messages with the same base subject text, it will
    240       sort by their sent date.
    241 
    242       Untagged EXPUNGE responses are not permitted while the server is
    243       responding to a SORT command, but are permitted during a UID SORT
    244       command.
    245 
    246       The defined sort criteria are as follows.  Refer to the Formal
    247       Syntax section for the precise syntactic definitions of the
    248       arguments.  If the associated RFC-822 header for a particular
    249       criterion is absent, it is treated as the empty string.  The empty
    250       string always collates before non-empty strings.
    251 
    252       ARRIVAL
    253          Internal date and time of the message.  This differs from the
    254          ON criterion in SEARCH, which uses just the internal date.
    255 
    256       CC
    257          [IMAP] addr-mailbox of the first "cc" address.
    258 
    259       DATE
    260          Sent date and time, as described in section 2.2.
    261 
    262       FROM
    263          [IMAP] addr-mailbox of the first "From" address.
    264 
    265       REVERSE
    266          Followed by another sort criterion, has the effect of that
    267          criterion but in reverse (descending) order.
    268             Note: REVERSE only reverses a single criterion, and does not
    269             affect the implicit "sequence number" sort criterion if all
    270             other criteria are identical.  Consequently, a sort of
    271             REVERSE SUBJECT is not the same as a reverse ordering of a
    272             SUBJECT sort.  This can be avoided by use of additional
    273             criteria, e.g., SUBJECT DATE vs. REVERSE SUBJECT REVERSE
    274             DATE.  In general, however, it's better (and faster, if the
    275             client has a "reverse current ordering" command) to reverse
    276             the results in the client instead of issuing a new SORT.
    277 
    287       SIZE
    288          Size of the message in octets.
    289 
    290       SUBJECT
    291          Base subject text.
    292 
    293       TO
    294          [IMAP] addr-mailbox of the first "To" address.
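
The ordering rules for multiple criteria, REVERSE, and the implicit "sequence number" criterion can be imitated on the client side with stable sorts. The sketch below is illustrative only: imap_sort is a hypothetical helper, each message is assumed to be a dict whose keys ("seq", "subject", "date", and so on) already hold comparable values such as a case-folded base subject or a UTC sent date, and plain Python comparison stands in for the collation algorithm.

```python
def imap_sort(messages, criteria):
    """Return sequence numbers ordered by a list of SORT criteria tokens.

    criteria is a token list such as ["SUBJECT", "REVERSE", "DATE"].
    """
    # Pair each criterion with a flag saying whether REVERSE preceded it.
    keys = []
    tokens = iter(criteria)
    for token in tokens:
        if token == "REVERSE":
            keys.append((next(tokens).lower(), True))
        else:
            keys.append((token.lower(), False))

    # Start from mailbox order: the implicit "sequence number" criterion.
    result = sorted(messages, key=lambda m: m["seq"])

    # Apply stable sorts from the least to the most significant criterion.
    # A stable sort with reverse=True keeps equal elements in their existing
    # order, so REVERSE never flips the sequence-number tie-breaker.
    for name, descending in reversed(keys):
        result.sort(key=lambda m: m[name], reverse=descending)
    return [m["seq"] for m in result]
```

Given messages 1, 2, 3 with subjects "b", "a", "a", a SUBJECT sort yields 2 3 1 while REVERSE SUBJECT yields 1 2 3 rather than 1 3 2, which is the behavior the REVERSE note describes.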
    295 
    296    Example:    C: A282 SORT (SUBJECT) UTF-8 SINCE 1-Feb-1994
    297                S: * SORT 2 84 882
    298                S: A282 OK SORT completed
    299                C: A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL
    300                S: * SORT 5 3 4 1 2
    301                S: A283 OK SORT completed
    302                C: A284 SORT (SUBJECT) US-ASCII TEXT "not in mailbox"
    303                S: * SORT
    304                S: A284 OK SORT completed
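
On the client side, the untagged SORT response in the example above is a flat list of numbers. A small, hypothetical parser (parse_sort_response is not part of any library) might look like this, assuming the line has already been read from the connection with its CRLF removed:

```python
def parse_sort_response(line):
    """Parse an untagged SORT response line into a list of integers."""
    parts = line.split()
    if parts[:2] != ["*", "SORT"]:
        raise ValueError("not an untagged SORT response: %r" % line)
    # An empty result ("* SORT") simply yields an empty list.
    return [int(number) for number in parts[2:]]
```

The numbers are message sequence numbers for SORT and unique identifiers for UID SORT; the parser cannot tell the two apart, so the caller has to remember which command it issued.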
    305 
    306 BASE.6.4.THREAD. THREAD Command
    307 
    308    Arguments:  threading algorithm
    309                charset specification
    310                searching criteria (one or more)
    311 
    312    Data:       untagged responses: THREAD
    313 
    314    Result:     OK - thread completed
    315                NO - thread error: can't thread that charset or
    316                     criteria
    317                BAD - command unknown or arguments invalid
    318 
    319       The THREAD command is a variant of SEARCH with threading semantics
    320       for the results.  THREAD has two arguments before the searching
    321       criteria argument: a threading algorithm and the searching
    322       charset.
    323 
    324       The charset argument is mandatory (unlike SEARCH) and indicates
    325       the [CHARSET] of the strings that appear in the searching
    326       criteria.  The US-ASCII and [UTF-8] charsets MUST be implemented.
    327       All other charsets are optional.
    328 
    329       There is also a UID THREAD command that returns unique identifiers
    330       instead of message sequence numbers.  Note that there are separate
    331       searching criteria for message sequence numbers and UIDs; thus the
    332       arguments to UID THREAD are interpreted the same as in THREAD.
    333       This is analogous to the behavior of UID SEARCH, as opposed to UID
    334       COPY, UID FETCH, or UID STORE.
    335 
    343       The THREAD command first searches the mailbox for messages that
    344       match the given searching criteria using the charset argument for
    345       the interpretation of strings in the searching criteria.  It then
    346       returns the matching messages in an untagged THREAD response,
    347       threaded according to the specified threading algorithm.
    348 
    349       All collation is in ascending order.  Earlier dates collate before
    350       later dates and strings are collated according to ascending values
    351       established by their collation algorithm (see
    352       "Internationalization Considerations").
    353 
    354       Untagged EXPUNGE responses are not permitted while the server is
    355       responding to a THREAD command, but are permitted during a UID
    356       THREAD command.
    357 
    358       The defined threading algorithms are as follows:
    359 
    360       ORDEREDSUBJECT
    361 
    362          The ORDEREDSUBJECT threading algorithm is also referred to as
    363          "poor man's threading".  The searched messages are sorted by
    364          base subject and then by the sent date.  The messages are then
    365          split into separate threads, with each thread containing
    366          messages with the same base subject text.  Finally, the threads
    367          are sorted by the sent date of the first message in the thread.
    368 
    369          The top level or "root" in ORDEREDSUBJECT threading contains
    370          the first message of every thread.  All messages in the root
    371          are siblings of each other.  The second message of a thread is
    372          the child of the first message, and subsequent messages of the
    373          thread are siblings of the second message and hence children of
    374          the message at the root.  Hence, there are no grandchildren in
    375          ORDEREDSUBJECT threading.
    376 
    377          Children in ORDEREDSUBJECT threading do not have descendants.
    378          Client implementations SHOULD treat descendants of a child in a
    379          server response as being siblings of that child.
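
The ORDEREDSUBJECT description can be sketched in a few lines of Python. This is an illustration under assumptions, not a reference implementation: each message is a hypothetical (sequence number, base subject, sent date) tuple whose base subject and sent date were computed beforehand, and plain tuple comparison stands in for the collation rules.

```python
from itertools import groupby


def ordered_subject_threads(messages):
    """Group (seq, base_subject, sent_date) tuples into threads."""
    # Sort by base subject, then sent date, then sequence number.
    ordered = sorted(messages, key=lambda m: (m[1], m[2], m[0]))
    # Split into threads of messages sharing a base subject.
    threads = [list(group) for _, group in groupby(ordered, key=lambda m: m[1])]
    # Order the threads by the sent date of their first message.
    threads.sort(key=lambda t: (t[0][2], t[0][0]))
    return [[m[0] for m in thread] for thread in threads]


def format_thread(thread):
    """Render one thread in the style of an untagged THREAD response."""
    if len(thread) <= 2:
        return "(%s)" % " ".join(str(n) for n in thread)
    # The first message is the parent; the rest are sibling children.
    children = "".join("(%d)" % n for n in thread[1:])
    return "(%d %s)" % (thread[0], children)
```

A thread of three messages renders as a parent followed by parenthesized siblings, for example "(3 (1)(4))", so no message ever has grandchildren.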
    380 
    381       REFERENCES
    382 
    383          The REFERENCES threading algorithm threads the searched
    384          messages by grouping them together in parent/child
    385          relationships based on which messages are replies to others.
    386          The parent/child relationships are built using two methods:
    387          reconstructing a message's ancestry using the references
    388          contained within it; and checking the original (not base)
    389          subject of a message to see if it is a reply to (or forward of)
    390          another message.
    391 
    399             Note: "Message ID" in the following description refers to a
    400             normalized form of the msg-id in [RFC2822].  The actual text
    401             in RFC 2822 may use quoting, resulting in multiple ways of
    402             expressing the same Message ID.  Implementations of the
    403             REFERENCES threading algorithm MUST normalize any msg-id in
    404             order to avoid false non-matches due to differences in
    405             quoting.
    406 
    407             For example, the msg-id
    408                <"01KF8JCEOCBS0045PS"@xxx.yyy.com>
    409             and the msg-id
    410                <01KF8JCEOCBS0045PS@xxx.yyy.com>
    411             MUST be interpreted as being the same Message ID.
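
One possible normalization, sketched in Python, removes the angle brackets and unquotes a quoted local part. The helper name normalize_msg_id is hypothetical, and the sketch handles only the quoting case shown above; it does not attempt the full [RFC2822] msg-id grammar.

```python
import re


def normalize_msg_id(raw):
    """Return a msg-id with angle brackets and local-part quoting removed."""
    text = raw.strip()
    if text.startswith("<") and text.endswith(">"):
        text = text[1:-1]
    local, at, domain = text.rpartition("@")
    if len(local) >= 2 and local[0] == '"' and local[-1] == '"':
        # Drop the surrounding quotes and undo backslash escapes.
        local = re.sub(r"\\(.)", r"\1", local[1:-1])
    # Case is preserved on purpose; only the quoting is normalized.
    return local + at + domain
```

Both forms in the example above normalize to the same string, so they compare equal as dictionary keys.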
    412 
    413          The references used for reconstructing a message's ancestry are
    414          found using the following rules:
    415 
    416             If a message contains a References header line, then use the
    417             Message IDs in the References header line as the references.
    418 
    419             If a message does not contain a References header line, or
    420             the References header line does not contain any valid
    421             Message IDs, then use the first (if any) valid Message ID
    422             found in the In-Reply-To header line as the only reference
    423             (parent) for this message.
    424 
    425                Note: Although [RFC2822] permits multiple Message IDs in
    426                the In-Reply-To header, in actual practice this
    427                discipline has not been followed.  For example,
    428                In-Reply-To headers have been observed with message
    429                addresses after the Message ID, and there are no good
    430                heuristics for software to determine the difference.
    431                This is not a problem with the References header,
    432                however.
    433 
    434             If a message does not contain an In-Reply-To header line, or
    435             the In-Reply-To header line does not contain a valid Message
    436             ID, then the message does not have any references (NIL).
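
These three rules amount to a short fallback chain. The Python sketch below assumes the headers are available as a plain dict of unfolded strings and uses a deliberately simple pattern for msg-ids; both the helper name references_of and the pattern are illustrative assumptions, and the returned IDs would still need the normalization described in the note above.

```python
import re

# Simplified msg-id pattern (an assumption): anything between "<" and ">".
MSG_ID = re.compile(r"<[^<>]+>")


def references_of(headers):
    """Return the list of reference Message IDs for one message."""
    refs = MSG_ID.findall(headers.get("References", ""))
    if refs:
        return refs
    # Fall back to the first valid Message ID of In-Reply-To only.
    in_reply_to = MSG_ID.findall(headers.get("In-Reply-To", ""))
    return in_reply_to[:1]          # empty list: no references (NIL)
```

Taking only the first In-Reply-To match sidesteps the problem described in the note, where other text follows the Message ID in that header.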
    437 
    438          A message is considered to be a reply or forward if the base
    439          subject extraction rules, applied to the original subject,
    440          remove any of the following: a subj-refwd, a "(fwd)" subj-
    441          trailer, or a subj-fwd-hdr and subj-fwd-trl.
    442 
    443          The REFERENCES algorithm is significantly more complex than
    444          ORDEREDSUBJECT and consists of six main steps.  These steps are
    445          outlined in detail below.
    446 
    455          (1) For each searched message:
    456 
    457              (A) Using the Message IDs in the message's references, link
    458                  the corresponding messages (those whose Message-ID
    459                  header line contains the given reference Message ID)
    460                  together as parent/child.  Make the first reference the
    461                  parent of the second (and the second a child of the
    462                  first), the second the parent of the third (and the
    463                  third a child of the second), etc.  The following rules
    464                  govern the creation of these links:
    465 
    466                      If a message does not contain a Message-ID header
    467                      line, or the Message-ID header line does not
    468                      contain a valid Message ID, then assign a unique
    469                      Message ID to this message.
    470 
    471                      If two or more messages have the same Message ID,
    472                      then only use that Message ID in the first (lowest
    473                      sequence number) message, and assign a unique
    474                      Message ID to each of the subsequent messages with
    475                      a duplicate of that Message ID.
    476 
    477                      If no message can be found with a given Message ID,
    478                      create a dummy message with this ID.  Use this
    479                      dummy message for all subsequent references to this
    480                      ID.
    481 
    482                      If a message already has a parent, don't change the
    483                      existing link.  This is done because the References
    484                      header line may have been truncated by a Mail User
    485                      Agent (MUA).  As a result, there is no guarantee
    486                      that the messages corresponding to adjacent Message
    487                      IDs in the References header line are parent and
    488                      child.
    489 
    490                      Do not create a parent/child link if creating that
    491                      link would introduce a loop.  For example, before
    492                      making message A the parent of B, make sure that A
    493                      is not a descendant of B.
    494 
    495                         Note: Message ID comparisons are case-sensitive.
    496 
    497              (B) Create a parent/child link between the last reference
    498                  (or NIL if there are no references) and the current
    499                  message.  If the current message already has a parent,
    500                  it is probably the result of a truncated References
    501                  header line, so break the current parent/child link
    502                  before creating the new correct one.  As in step 1.A,
    511                  do not create the parent/child link if creating that
    512                  link would introduce a loop.  Note that if this message
    513                  has no references, it will now have no parent.
    514 
    515                     Note: The parent/child links created in steps 1.A
    516                     and 1.B MUST be kept consistent with one another at
    517                     ALL times.
    518 
    519          (2) Gather together all of the messages that have no parents
    520              and make them all children (siblings of one another) of a
    521              dummy parent (the "root").  These messages constitute the
    522              first (head) message of the threads created thus far.
    523 
    524          (3) Prune dummy messages from the thread tree.  Traverse each
    525              thread under the root, and for each message:
    526 
    527                  If it is a dummy message with NO children, delete it.
    528 
    529                  If it is a dummy message with children, delete it, but
    530                  promote its children to the current level.  In other
    531                  words, splice them in with the dummy's siblings.
    532 
    533                  Do not promote the children if doing so would make them
    534                  children of the root, unless there is only one child.
    535 
    536          (4) Sort the messages under the root (top-level siblings only)
    537              by sent date as described in section 2.2.  In the case of a
    538              dummy message, sort its children by sent date and then use
    539              the first child for the top-level sort.
    540 
    541          (5) Gather together messages under the root that have the same
    542              base subject text.
    543 
    544              (A) Create a table for associating base subjects with
    545                  messages, called the subject table.
    546 
    547              (B) Populate the subject table with one message per
    548                  base subject.  For each child of the root:
    549 
    550                  (i)   Find the subject of this thread, by using the
    551                        base subject from either the current message or
    552                        its first child if the current message is a
    553                        dummy.  This is the thread subject.
    554 
    555                  (ii)  If the thread subject is empty, skip this
    556                        message.
    557 
    567                  (iii) Look up the message associated with the thread
    568                        subject in the subject table.
    569 
    570                  (iv)  If there is no message in the subject table with
    571                        the thread subject, add the current message and
    572                        the thread subject to the subject table.
    573 
    574                        Otherwise, if the message in the subject table is
    575                        not a dummy, AND either of the following criteria
    576                        is true:
    577 
    578                            The current message is a dummy, OR
    579 
    580                            The message in the subject table is a reply
    581                            or forward and the current message is not.
    582 
    583                        then replace the message in the subject table
    584                        with the current message.
    585 
    586              (C) Merge threads with the same thread subject.  For each
    587                  child of the root:
    588 
    589                  (i)   Find the message's thread subject as in step
    590                        5.B.i above.
    591 
    592                  (ii)  If the thread subject is empty, skip this
    593                        message.
    594 
    595                  (iii) Look up the message associated with this thread
    596                        subject in the subject table.
    597 
    598                  (iv)  If the message in the subject table is the
    599                        current message, skip this message.
    600 
    601                  Otherwise, merge the current message with the one in
    602                  the subject table using the following rules:
    603 
    604                      If both messages are dummies, append the current
    605                      message's children to the children of the message
    606                      in the subject table (the children of both messages
    607                      become siblings), and then delete the current
    608                      message.
    609 
    610                      If the message in the subject table is a dummy and
    611                      the current message is not, make the current
    612                      message a child of the message in the subject table
    613                      (a sibling of its children).
    614 
    623                      If the current message is a reply or forward and
    624                      the message in the subject table is not, make the
    625                      current message a child of the message in the
    626                      subject table (a sibling of its children).
    627 
    628                      Otherwise, create a new dummy message and make both
    629                      the current message and the message in the subject
    630                      table children of the dummy.  Then replace the
    631                      message in the subject table with the dummy
    632                      message.
    633 
    634                         Note: Subject comparisons are case-insensitive,
    635                         as described under "Internationalization
    636                         Considerations".
    637 
    638          (6) Traverse the messages under the root and sort each set of
    639              siblings by sent date as described in section 2.2.
    640              Traverse the messages in such a way that the "youngest" set
    641              of siblings is sorted first, and the "oldest" set of
    642              siblings is sorted last (grandchildren are sorted before
    643              children, etc).  In the case of a dummy message (which can
    644              only occur with top-level siblings), use its first child
    645              for sorting.
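
The merge rules of step (5) and the sibling ordering of step (6) can be sketched as follows. This is a minimal illustration, not part of the specification: the `Node` class, its `is_reply` and `sent` fields, and the `merge` and `sort_siblings` helpers are hypothetical names, and the sketch assumes the subject table and the list of top-level nodes have already been built by the earlier steps.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # eq=False: nodes compare by identity, not by field values
class Node:
    msg: Optional[int] = None   # None marks a dummy (no message of its own)
    is_reply: bool = False      # subject had a reply/forward artifact removed
    sent: float = 0.0           # sent date as a sortable number (assumption)
    children: List["Node"] = field(default_factory=list)


def merge(root: List[Node], table: Dict[str, Node], key: str, cur: Node) -> None:
    """Merge top-level node `cur` into the thread stored at table[key]."""
    other = table[key]
    root.remove(cur)  # in every case `cur` stops being a top-level node
    if cur.msg is None and other.msg is None:
        # Both dummies: the children of both become siblings.
        other.children.extend(cur.children)
    elif other.msg is None:
        # Table entry is a dummy, current message is not.
        other.children.append(cur)
    elif cur.is_reply and not other.is_reply:
        # Current message is a reply or forward, table entry is not.
        other.children.append(cur)
    else:
        # Otherwise: put both under a new dummy and record it in the table.
        dummy = Node(children=[other, cur])
        root[root.index(other)] = dummy
        table[key] = dummy


def sort_key(node: Node) -> float:
    # A dummy has no date of its own, so use its first child.
    while node.msg is None:
        node = node.children[0]
    return node.sent


def sort_siblings(siblings: List[Node]) -> None:
    # Sort the deepest sets of siblings first, then work back up.
    for node in siblings:
        sort_siblings(node.children)
    siblings.sort(key=sort_key)
```

Sorting the children before their parents means that, by the time a dummy is compared with its siblings, its first child is already its earliest one.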
    646 
    647    Example:    C: A283 THREAD ORDEREDSUBJECT UTF-8 SINCE 5-MAR-2000
    648                S: * THREAD (166)(167)(168)(169)(172)(170)(171)
    649                   (173)(174 (175)(176)(178)(181)(180))(179)(177
    650                   (183)(182)(188)(184)(185)(186)(187)(189))(190)
    651                   (191)(192)(193)(194 195)(196 (197)(198))(199)
    652                   (200 202)(201)(203)(204)(205)(206 207)(208)
    653                S: A283 OK THREAD completed
    654                C: A284 THREAD ORDEREDSUBJECT US-ASCII TEXT "gewp"
    655                S: * THREAD
    656                S: A284 OK THREAD completed
    657                C: A285 THREAD REFERENCES UTF-8 SINCE 5-MAR-2000
    658                S: * THREAD (166)(167)(168)(169)(172)((170)(179))
    659                   (171)(173)((174)(175)(176)(178)(181)(180))
    660                   ((177)(183)(182)(188 (184)(189))(185 186)(187))
    661                   (190)(191)(192)(193)((194)(195 196))(197 198)
    662                   (199)(200 202)(201)(203)(204)(205 206 207)(208)
    663                S: A285 OK THREAD completed
    664 
    665              Note: The line breaks in the first and third server
    666              responses are for editorial clarity and do not appear in
    667              real THREAD responses.
    668 
    679 4.  Additional Responses
    680 
    681    These responses are extensions to the [IMAP] base protocol.
    682 
    683    The section headings of these responses are intended to correspond
    684    with where they would be located in the main document.
    685 
    686 BASE.7.2.SORT. SORT Response
    687 
    688    Data:       zero or more numbers
    689 
    690       The SORT response occurs as a result of a SORT or UID SORT
    691       command.  The number(s) refer to those messages that match the
    692       search criteria.  For SORT, these are message sequence numbers;
    693       for UID SORT, these are unique identifiers.  Each number is
    694       delimited by a space.
    695 
    696    Example:    S: * SORT 2 3 6
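
A client might read the untagged SORT response along these lines. This is a minimal sketch; `parse_sort` is a hypothetical helper, and it assumes the response has already been isolated as a single line with the trailing CRLF removed.

```python
from typing import List


def parse_sort(line: str) -> List[int]:
    """Return the numbers carried by an untagged SORT response line."""
    parts = line.split()
    if parts[:2] != ["*", "SORT"]:
        raise ValueError("not a SORT response: %r" % line)
    # Zero or more numbers follow, each delimited by a space.
    return [int(p) for p in parts[2:]]
```

Whether the returned values are message sequence numbers or unique identifiers depends on whether the command was SORT or UID SORT; the response itself does not say.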
    697 
    698 BASE.7.2.THREAD. THREAD Response
    699 
    700    Data:       zero or more threads
    701 
    702       The THREAD response occurs as a result of a THREAD or UID THREAD
    703       command.  It contains zero or more threads.  A thread consists of
    704       a parenthesized list of thread members.
    705 
    706       Thread members consist of zero or more message numbers, delimited
    707       by spaces, indicating successive parent and child.  This continues
    708       until the thread splits into multiple sub-threads, at which point,
    709       the thread nests into multiple sub-threads with the first member
    710       of each sub-thread being siblings at this level.  There is no
    711       limit to the nesting of threads.
    712 
    713       The message numbers refer to those messages that match the search
    714       criteria.  For THREAD, these are message sequence numbers; for UID
    715       THREAD, these are unique identifiers.
    716 
    717    Example:    S: * THREAD (2)(3 6 (4 23)(44 7 96))
    718 
    719       The first thread consists only of message 2.  The second thread
    720       consists of the messages 3 (parent) and 6 (child), after which it
    721       splits into two sub-threads; the first of which contains messages
    722       4 (child of 6, sibling of 44) and 23 (child of 4), and the second
    723       of which contains messages 44 (child of 6, sibling of 4), 7 (child
    724       of 44), and 96 (child of 7).  Since some later messages are
    725       parents of earlier messages, the messages were probably moved from
    726       some other mailbox at different times.
    727 
    735             -- 2
    736 
    737             -- 3
    738                \-- 6
    739                    |-- 4
    740                    |   \-- 23
    741                    |
    742                    \-- 44
    743                         \-- 7
    744                             \-- 96
    745 
    746    Example:    S: * THREAD ((3)(5))
    747 
    748       In this example, 3 and 5 are siblings of a parent that does not
    749       match the search criteria (and/or does not exist in the mailbox);
    750       however they are members of the same thread.
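
The nesting described above can be unpacked into explicit parent/child trees. The following is a minimal sketch: `parse_thread` and the `(number, children)` tuple layout are illustrative choices, not defined by this document, and the input is assumed to be one well-formed response line with no line breaks.

```python
import re
from typing import List, Optional, Tuple

# (message number, child trees); the number is None for a parent that
# is absent from the response, as in "((3)(5))".
Tree = Tuple[Optional[int], list]


def parse_thread(line: str) -> List[Tree]:
    prefix = "* THREAD"
    if not line.startswith(prefix):
        raise ValueError("not a THREAD response: %r" % line)
    tokens = re.findall(r"\(|\)|\d+", line[len(prefix):])
    pos = 0

    def thread_list() -> Tree:
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        chain = []                      # successive parent and child
        while tokens[pos].isdigit():
            chain.append(int(tokens[pos]))
            pos += 1
        subs = []                       # sub-threads after the split
        while tokens[pos] == "(":
            subs.append(thread_list())
        pos += 1                        # consume the closing ")"
        if not chain:
            return (None, subs)
        node = (chain[-1], subs)        # last member owns the sub-threads
        for number in reversed(chain[:-1]):
            node = (number, [node])     # each earlier member is the parent
        return node

    threads = []
    while pos < len(tokens):
        threads.append(thread_list())
    return threads
```

For the first example above, message 3 becomes the root of the second tree, with 6 as its only child and the trees rooted at 4 and 44 as the children of 6.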
    751 
    752 5.  Formal Syntax of SORT and THREAD Commands and Responses
    753 
    754    The following syntax specification uses the Augmented Backus-Naur
    755    Form (ABNF) notation as specified in [ABNF].  It also uses [ABNF]
    756    rules defined in [IMAP].
    757 
    758 sort            = ["UID" SP] "SORT" SP sort-criteria SP search-criteria
    759 
    760 sort-criteria   = "(" sort-criterion *(SP sort-criterion) ")"
    761 
    762 sort-criterion  = ["REVERSE" SP] sort-key
    763 
    764 sort-key        = "ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" /
    765                   "SUBJECT" / "TO"
    766 
    767 thread          = ["UID" SP] "THREAD" SP thread-alg SP search-criteria
    768 
    769 thread-alg      = "ORDEREDSUBJECT" / "REFERENCES" / thread-alg-ext
    770 
    771 thread-alg-ext  = atom
    772                     ; New algorithms MUST be registered with IANA
    773 
    774 search-criteria = charset 1*(SP search-key)
    775 
    776 charset         = atom / quoted
    777                     ; CHARSET values MUST be registered with IANA
    778 
    779 sort-data       = "SORT" *(SP nz-number)
    780 
    781 thread-data     = "THREAD" [SP 1*thread-list]
    782 
    791 thread-list     = "(" (thread-members / thread-nested) ")"
    792 
    793 thread-members  = nz-number *(SP nz-number) [SP thread-nested]
    794 
    795 thread-nested   = 2*thread-list
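
As an illustration of the command grammar above, a client could assemble SORT and THREAD commands as in the sketch below. The helper names `sort_command` and `thread_command` are hypothetical; the search keys are passed through as an already formatted string, and the command tag and line terminator are left to the caller.

```python
from typing import List, Tuple

SORT_KEYS = {"ARRIVAL", "CC", "DATE", "FROM", "SIZE", "SUBJECT", "TO"}


def sort_command(criteria: List[Tuple[str, bool]], charset: str,
                 search: str, uid: bool = False) -> str:
    """criteria is a list of (sort-key, reverse) pairs, most significant first."""
    parts = []
    for key, reverse in criteria:
        key = key.upper()
        if key not in SORT_KEYS:
            raise ValueError("unknown sort key: %r" % key)
        parts.append(("REVERSE " if reverse else "") + key)
    if not parts or not search:
        # The grammar requires at least one sort-criterion and one search-key.
        raise ValueError("need at least one sort key and one search key")
    return "%sSORT (%s) %s %s" % ("UID " if uid else "",
                                  " ".join(parts), charset, search)


def thread_command(algorithm: str, charset: str, search: str,
                   uid: bool = False) -> str:
    if not search:
        raise ValueError("need at least one search key")
    return "%sTHREAD %s %s %s" % ("UID " if uid else "",
                                  algorithm.upper(), charset, search)
```

The sketch does not check the threading algorithm name, since thread-alg-ext allows any registered atom.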
    796 
    797    The following syntax describes base subject extraction rules (2)-(6):
    798 
    799 subject         = *subj-leader [subj-middle] *subj-trailer
    800 
    801 subj-refwd      = ("re" / ("fw" ["d"])) *WSP [subj-blob] ":"
    802 
    803 subj-blob       = "[" *BLOBCHAR "]" *WSP
    804 
    805 subj-fwd        = subj-fwd-hdr subject subj-fwd-trl
    806 
    807 subj-fwd-hdr    = "[fwd:"
    808 
    809 subj-fwd-trl    = "]"
    810 
    811 subj-leader     = (*subj-blob subj-refwd) / WSP
    812 
    813 subj-middle     = *subj-blob (subj-base / subj-fwd)
    814                     ; last subj-blob is subj-base if subj-base would
    815                     ; otherwise be empty
    816 
    817 subj-trailer    = "(fwd)" / WSP
    818 
    819 subj-base       = NONWSP *(*WSP NONWSP)
    820                     ; can be a subj-blob
    821 
    822 BLOBCHAR        = %x01-5a / %x5c / %x5e-ff
    823                     ; any CHAR8 except '[' and ']'.
    824                     ; SHOULD comply with [UTF-8]
    825 
    826 NONWSP          = %x01-08 / %x0a-1f / %x21-ff
    827                     ; any CHAR8 other than WSP.
    828                     ; SHOULD comply with [UTF-8]
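
The grammar above can be approximated with regular expressions. The sketch below is one reading of base subject extraction rules (2)-(6), not a reference implementation: `base_subject` is a hypothetical helper, and it assumes rule (1) has already been applied, so the input is decoded text with whitespace collapsed.

```python
import re

_BLOB = r"\[[^\[\]]*\][ \t]*"                                  # subj-blob
_LEADER = re.compile(                                           # subj-leader
    r"(?:%s)*(?:re|fwd?)[ \t]*(?:%s)?:|[ \t]" % (_BLOB, _BLOB),
    re.IGNORECASE)
_TRAILER = re.compile(r"(?:\(fwd\)|[ \t])\Z", re.IGNORECASE)   # subj-trailer
_BLOB_RE = re.compile(_BLOB)


def base_subject(subject: str) -> str:
    s = subject
    while True:
        # Rule (2): strip every trailing "(fwd)" and whitespace character.
        m = _TRAILER.search(s)
        while m:
            s = s[:m.start()]
            m = _TRAILER.search(s)
        # Rules (3)-(5): strip leaders, then a leading blob if something
        # non-empty would remain; repeat until neither applies.
        while True:
            m = _LEADER.match(s)
            if m is None:
                m = _BLOB_RE.match(s)
                if m is None or not s[m.end():].strip(" \t"):
                    break
            s = s[m.end():]
        # Rule (6): unwrap "[fwd: ... ]" and start again from rule (2).
        if s[:5].lower() == "[fwd:" and s.endswith("]"):
            s = s[5:-1]
            continue
        return s
```

A subject that consists only of a blob, such as "[blob]", is returned unchanged, which corresponds to the comment on subj-middle that the last subj-blob is the subj-base if the subj-base would otherwise be empty.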
    829 
    830 6.  Security Considerations
    831 
    832    The SORT and THREAD extensions do not raise any security
    833    considerations that are not present in the base [IMAP] protocol, and
    834    these issues are discussed in [IMAP].  Nevertheless, it is important
    835    to remember that [IMAP] protocol transactions, including message
    836    data, are sent in the clear over the network unless protection from
    837    snooping is negotiated, either by the use of STARTTLS, privacy
    838    protection in AUTHENTICATE, or some other protection mechanism.
    839 
    847    Although not a security consideration, it is important to recognize
    848    that threading by REFERENCES can lead to misleading threading trees.
    849    For example, a message with false References: header data will cause
    850    a thread to be incorporated into another thread.
    851 
    852    The process of extracting the base subject may lead to incorrect
    853    collation if the extracted data was significant text as opposed to a
    854    subject artifact.
    855 
    856 7.  Internationalization Considerations
    857 
    858    As stated in the introduction, the rules of I18NLEVEL=1 as described
    859    in [IMAP-I18N] MUST be followed; that is, the SORT and THREAD
    860    extensions MUST collate strings according to the i;unicode-casemap
    861    collation described in [UNICASEMAP].  Servers SHOULD also advertise
    862    the I18NLEVEL=1 extension.  Alternatively, a server MAY implement
    863    I18NLEVEL=2 (or higher) and comply with the rules of that level.
    864 
    865    As discussed in [IMAP-I18N] section 4.5, all server implementations
    866    should eventually be updated to support the [IMAP-I18N] I18NLEVEL=2
    867    extension.
    868 
    869    Translations of the "re" or "fw"/"fwd" tokens are not specified for
    870    removal in the base subject extraction process.  An attempt to add
    871    such translated tokens would result in a geometrically complex, and
    872    ultimately unimplementable, task.
    873 
    874    Instead, note that [RFC2822] section 3.6.5 recommends that "re:"
    875    (from the Latin "res", meaning "in the matter of") be used to
    876    identify a reply.  Although it is evident that, from the multiple
    877    forms of token to identify a forwarded message, there is considerable
    878    variation found in the wild, the variations are (still) manageable.
    879    Consequently, it is suggested that "re:" and one of the variations of
    880    the tokens for a forward supported by the base subject extraction
    881    rules be adopted for Internet mail messages, since doing so makes it
    882    a simple display-time task to localize the token language for the
    883    user.
    884 
    885 8.  IANA Considerations
    886 
    887    [IMAP] capabilities are registered by publishing a standards track or
    888    IESG-approved experimental RFC.  This document constitutes
    889    registration of the SORT and THREAD capabilities in the [IMAP]
    890    capabilities registry.
    891 
    903    This document creates a new [IMAP] threading algorithms registry,
    904    which registers threading algorithms by publishing a standards track
    905    or IESG-approved experimental RFC.  This document constitutes
    906    registration of the ORDEREDSUBJECT and REFERENCES algorithms in that
    907    registry.
    908 
    909 9.  Normative References
    910 
    911    [ABNF]        Crocker, D., Ed., and P. Overell, "Augmented BNF for
    912                  Syntax Specifications: ABNF", STD 68, RFC 5234, January
    913                  2008.
    914 
    915    [CHARSET]     Freed, N. and J. Postel, "IANA Charset Registration
    916                  Procedures", BCP 19, RFC 2978, October 2000.
    917 
    918    [IMAP]        Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL -
    919                  VERSION 4rev1", RFC 3501, March 2003.
    920 
    921    [IMAP-I18N]   Newman, C., Gulbrandsen, A., and A. Melnikov, "Internet
    922                  Message Access Protocol Internationalization", RFC
    923                  5255, June 2008.
    924 
    925    [KEYWORDS]    Bradner, S., "Key words for use in RFCs to Indicate
    926                  Requirement Levels", BCP 14, RFC 2119, March 1997.
    927 
    928    [RFC2822]     Resnick, P., Ed., "Internet Message Format", RFC 2822,
    929                  April 2001.
    930 
    931    [UNICASEMAP]  Crispin, M., "i;unicode-casemap - Simple Unicode
    932                  Collation Algorithm", RFC 5051, October 2007.
    933 
    934    [UTF-8]       Yergeau, F., "UTF-8, a transformation format of ISO
    935                  10646", STD 63, RFC 3629, November 2003.
    936 
    937 10.  Informative References
    938 
    939    [IMAP-MODELS] Crispin, M., "Distributed Electronic Mail Models in
    940                  IMAP4", RFC 1733, December 1994.
    941 
    942    [THREADING]   Zawinski, J., "Message Threading",
    943                  http://www.jwz.org/doc/threading.html, 1997-2002.
    944 
    959 Authors' Addresses
    960 
    961    Mark R. Crispin
    962    Panda Programming
    963    6158 NE Lariat Loop
    964    Bainbridge Island, WA 98110-2098
    965 
    966    Phone: +1 (206) 842-2385
    967    EMail: IMAP+SORT+THREAD@Lingling.Panda.COM
    968 
    969 
    970    Kenneth Murchison
    971    Carnegie Mellon University
    972    5000 Forbes Avenue
    973    Cyert Hall 285
    974    Pittsburgh, PA  15213
    975 
    976    Phone: +1 (412) 268-2638
    977    EMail: murch@andrew.cmu.edu
    978 
   1015 Full Copyright Statement
   1016 
   1017    Copyright (C) The IETF Trust (2008).
   1018 
   1019    This document is subject to the rights, licenses and restrictions
   1020    contained in BCP 78, and except as set forth therein, the authors
   1021    retain all their rights.
   1022 
   1023    This document and the information contained herein are provided on an
   1024    "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   1025    OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   1026    THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   1027    OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   1028    THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   1029    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
   1030 
   1031 Intellectual Property
   1032 
   1033    The IETF takes no position regarding the validity or scope of any
   1034    Intellectual Property Rights or other rights that might be claimed to
   1035    pertain to the implementation or use of the technology described in
   1036    this document or the extent to which any license under such rights
   1037    might or might not be available; nor does it represent that it has
   1038    made any independent effort to identify any such rights.  Information
   1039    on the procedures with respect to rights in RFC documents can be
   1040    found in BCP 78 and BCP 79.
   1041 
   1042    Copies of IPR disclosures made to the IETF Secretariat and any
   1043    assurances of licenses to be made available, or the result of an
   1044    attempt made to obtain a general license or permission for the use of
   1045    such proprietary rights by implementers or users of this
   1046    specification can be obtained from the IETF on-line IPR repository at
   1047    http://www.ietf.org/ipr.
   1048 
   1049    The IETF invites any interested party to bring to its attention any
   1050    copyrights, patents or patent applications, or other proprietary
   1051    rights that may cover technology that may be required to implement
   1052    this standard.  Please address the information to the IETF at
   1053    ietf-ipr@ietf.org.
   1054 