Ken Pogran, MIT-LCS/CSR (Pogran at MIT-Multics)
John Vittal, BBN (Vittal at BBN-TENEXA)
Dave Crocker, RAND-ISD (DCrocker at Rand-Unix)
Austin Henderson, BBN (Henderson at BBN-TENEXD)
ARPA's Committee on Computer-Aided Human Communication
(CAHCOM) wishes to promulgate an official standard for the format
of ARPA Network mail headers which will adequately meet the needs
of the various message service subsystems on the Network today.
The authors of this RFC constitute the CAHCOM subcommittee
charged with the task of developing this new standard; this
document presents our current thoughts on the matter and a
specific proposal.
This document is organized as follows: First, we present a
history of the development of what has become known as the ARPA
Network "mail" or "message" service, and the issues which we feel
are most pressing -- problems for which solutions are needed
today, inhibiting the further development of message subsystems.
We then present the specification for the new ARPA Network
Message Header standard. This is followed by a
section
Essentially, we propose a revision to Request for Comments
(RFC) 561, "Standardizing Network Mail Headers", and RFC 680,
"Message Transmission Protocol". This revision removes and
compacts portions of the previous syntax and adds features to
network address specification. In particular, we focus on people
and not mailboxes as recipients and allow reference to stored
address lists. We expect this syntax to provide sufficient
capabilities to meet most users' needs and, therefore, give
developers enough breathing room to produce a new mail
transmission protocol "properly". We believe that there is enough
of a consensus in the Network community in favor of such a
standard syntax to make possible its adoption at this time.
We would like to make clear the status of this standard: The
CAHCOM Steering Committee has replaced the Message Service
Committee as the ARPANET standards-setting body in the area of
message services. It is expected that the proposal of this CAHCOM
subcommittee, when in its final form, will be adopted as an
ARPANET standard by CAHCOM. In the interests of making this
standard the best possible one, we are distributing this proposal
as an RFC.
Please send any comments and criticisms to any of the
authors of this RFC by 15 June 1977. It is planned that the
standard will be officially adopted by 1 September 1977, with
hosts expected to accept its syntax by 1 January 1978.
As message service subsystems on various host systems
(especially TENEX) developed to the point where
parsing of incoming messages was being done, it became clear that
it would be desirable to standardize the format and content of
the headers of messages transmitted between hosts using these
commands. To this end, an ad hoc committee wrote RFC 561, which
suggested a standard message header format. The committee was
unofficial, so it could not legislate a standard; it could only
recommend. However, the standard it suggested adequately met an
urgent need, and was generally adopted.
Several salient points should be noted:
1. RFC 561 defined the concept of a message header, and
specified the syntax which delimited it from the
text of a message;
2. It proposed a standard format for the most obvious and
most urgently-needed header items: "From:", "Date:", and
"Subject:";
3. It proposed that a general standard syntax be used for
all other header items;
4. RFC 561 is still, today, an unofficial standard, adhered
to by most because of its utility;
5. Its syntax was designed to allow humans to read the
easily, without the aid of special message
systems
As message services grew in sophistication, the need for
specific header items in RFC 561's "miscellaneous" category grew.
"To:" and "cc:", especially, were generated and recognized by
several different message services. However, there was no
specific standard for the syntax of the contents of these items.
The message service subsystems on TENEX developed a
format for these items; since more messages originated from
TENEX hosts on the Network than from any other type of host
system, the TENEX format for these fields soon became a de facto
standard. Message service subsystems on TENEX began to parse
these fields, expecting them to be in the TENEX-generated format.
Message service subsystems on other hosts -- Multics, for example
-- began to dabble with other formats for these fields, since
there was no standard for them, only to receive complaints from
users of TENEX message service subsystems that their
"non-standard" message headers could not be parsed according to
the (de facto) "standard" syntax.
Recognizing that the time had come to make an attempt to
standardize the additional header fields that had come into use
since RFC 561 was published, ARPA's Message Service Committee
chartered a small group in 1975 to develop a revised version of
RFC 561 which would define the syntax of these additional
header fields. Several things should be noted about this
group of people: first, they were TENEX-oriented; when the
functionality of the message header items they desired was
matched by the functionality of an already-existing
header item of the TENEX message subsystems, they adopted the
syntax used by the TENEX message subsystems. Second, they based
additional header items not already found on TENEX message
subsystems on the deliberations of the Message Service Committee.
Third, they were not familiar with the procedure for publication
of a document as a Network RFC.
For all its shortcomings, RFC 680 has performed a
service, just as did RFC 561 before it. It defined
message header items at a time when this needed to be done.
Unfortunately, since the group had not sought ideas and comments
from others, the specification did not adequately respond to a
sufficient set of community needs. In addition, the manner in
which the document was promulgated -- or not promulgated -- left
a great deal to be desired. Implementors of message service
subsystems who had not received RFC 680 proceeded to go their own
ways, feeling justified in doing so, while those who accepted RFC
680 as a standard felt justified in complaining to -- and
about -- those whom they considered to be maverick implementors of
idiosyncratic message service subsystems.
Perhaps because of the ad-hoc nature of the interim
facility, users have not, until recently, attempted to push the
system to the limits of their imagination. Presently, however,
several different sites are using the "interim" mail facility for
more than it was designed for and in ways which are incompatible
with each other and with the original intent of the facility.
Mail subsystem implementors are increasingly being asked to
provide for the handling of mail from idiosyncratic hosts. Also,
it has become clear that there are a few very specific features,
too useful to ignore, which cannot reasonably be specified in
the syntax of RFC 680.
B. ISSUES AND
At first glance, it would seem that a resolution of today's
somewhat chaotic situation could best be obtained by
junking the existing "interim" mail facility, and adopting a new
mail transmission protocol. We strongly believe that this would
be ill-advised at this time, for we feel that there is no
understanding within the Network community today of how to
specify and implement a full and adequate mail protocol.
However, we are convinced that there is, finally, a
strong commitment within the Network community to attack the
problem (which there was not at the time the "interim"
transmission facility was specified and developed).
The frontal attacks on the mail protocol problem have, so
far, resulted in at least two suggestions for a mail protocol.
Why should not one of these protocols be adopted immediately? We
feel that, in general, there has been a tendency
for experimental Network software to be prematurely treated as
though it were adequately designed and fully operational.
Typically, the system or protocol proposed is so much better than
what was previously available that its experimental nature is
disregarded, and it is pressed into service before it has had a
chance to properly develop and mature. We are very concerned
that this phenomenon not afflict the Network mail system any more
than it already has.
While it is true that there are several sites in the
Community which have mail systems that understand the syntax
specified in RFC's 561 and 680, in addition to some of the
"non-standard" syntax provided by the mail generating programs of
several other sites, most mail systems do not parse much of the
contents of received messages. A consideration of the syntax
specified here is that messages which are sent to people should
be easily read by people. Parsers which can turn an ugly but
syntactically expedient form into something which is easy to read
are the exception, rather than the rule, in today's
systems. Also, the modifications to the existing "non-standard"
syntax should be kept to a minimum, enhancing the likelihood
that the requirement of small perturbations to existing systems
will be accepted.
1. Users of mail systems can have multiple mailboxes,
on one machine or multiple machines, all of which are
treated identically; the default mailbox for a user is
not necessarily associated (directly) with his
name.
2. Mail for a person can be sent to other than a single
default mailbox.
3. Named groups may consist of both individuals and
(possibly) other named groups (i.e., nesting of
groups is permitted).
5. Address lists may contain references to addresses which
are not accessible through the standard ARPANET
system. For example, U.S. Postal system addresses may
be specified. Such addresses are, of course, expected to
be ignored by the ARPANET system, although
sites may provide services for using the information
(e.g., automatically sending a copy of the message to a
line printer, in preparation for transmission through the
Postal system).
6. Parenthetical remarks, or comments, can be included and
syntactically recognized as such within some header
items.
7. Received messages are capable of being read by humans
without a program having to parse the message (or part
of it) before presenting the message to the user; however,
there is sufficient formal syntax to enable a
program to modify the appearance and content of messages
presented to users. Although message-display programs
may exercise considerable control over
appearance, the degree to which a message's actual format
is PLEASANT for humans to read is entirely the
responsibility of the message creation program.
No mechanism for authentication is provided, since the Network
provides no mechanisms for enforcing mail security. The syntax
does provide for one aspect of "correctness": a distinction is
made between an address which is claimed to be a valid
address and one which is simply free text, included for the
convenience of the human participants.
In computer-based message systems, human users do not
generally encounter "envelopes", which are often generated
automatically, to be used by the participating system(s) to
deliver the message. For example, on TENEX, the envelope is the
name of the file containing a message awaiting transmission. For
FTP servers, it is the data portion of the MAIL or MLFL command
line. Some systems attach "envelope-like" information to the
message header, such as time-stamp and originating host name.
In paper-based communications, headers occur both before
(e.g., "To:" and "From:") and after (e.g., "cc:" and "enclosure:")
the body of the message. Within this standard, all headers occur
before the body of the message, although local message programs
may choose to alter that ordering.
Wayne Hathaway has pointed out that ARPANET message
does not support specification of letterheads, since these are
type of organizational public relations symbol.
idiosyncrasies are supported, however, by way of choosing
field names
In general, it is important to realize that the header
portion of a message plays several roles during the life of the
message, variously participating in each of the three suggested by Stefferud
D. ADOPTION OF THE STANDARD
During the early phases of specifying this standard, a great
deal of concern was expressed over the problems which may be
experienced during the transition from the current standard to
this new one. We feel that the true problem is the lack of
realization that THERE IS NO CURRENT OFFICIAL STANDARD. Existing
systems have enough overlapping behaviors to allow the
mail environment to function, but this in no way constitutes a
standard.
In fact, we strongly believe that the new constraints
imposed by the proposed standard involve less complexity than the
ambiguities resulting from the current variations in
behaviors.
This specification is intended strictly as a definition of
what is to be passed between hosts on the ARPANET. It is NOT
intended to dictate either features which systems on the Network
are expected to support, or user interfaces to message creating
or reading programs.
A distinction should be made between what the specification
requires and what it allows. Certain equivalences are defined,
such as between a space character and an end-of-line character,
which both facilitate the formal specification
and indicate what the OFFICIAL semantics are for messages.
Particular implementations may wish to preserve
distinctions which the specification does not require.
A.
Since there are many message systems which exist outside the
ARPANET environment, as well as those within it, it may be useful
to consider the general framework, and resulting capabilities and
limitations, of this standard.
No significant consideration has been given to questions of
data compression or transmission/storage efficiency. The
standard, in fact, tends to be very free with the number of bits
consumed. For example, field names are specified as free text,
rather than special terse codes.
A general "memo" framework is used. That is, a message
consists of some information, in a rigid format, followed by the
main part of the message, which is text and whose format is
This syntax is given in four parts. The first part
describes a base-level lexical analyzer which feeds the higher
level parser described in the succeeding sections. The second
part gives a general syntax for messages and standard header
fields. The third part specifies the syntax of addresses. The
final section specifies some general syntax which supports the
other sections.
A message consists of headers and, optionally, a body (i.e.,
the message text). The message text is just a sequence of ASCII
characters; it is separated from the
headers by a null line (i.e., a line with nothing preceding
the CRLF).
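
As an illustration of the header/body separation just described, the
following is a minimal sketch in Python. The function name
split_message is hypothetical and not part of this proposal; the
sketch assumes that lines end in CRLF, so that the null line appears
as two consecutive CRLFs, and that at least one header is present.

```python
from typing import Tuple

CRLF = "\r\n"


def split_message(raw: str) -> Tuple[str, str]:
    """Split a message into (header block, body) at the first null line.

    Hypothetical helper: assumes CRLF line endings and at least one header.
    """
    # A null line is a line with nothing before its CRLF, so it shows up
    # in the raw text as one CRLF immediately followed by another.
    head, separator, body = raw.partition(CRLF + CRLF)
    if not separator:
        # No null line was found: the body is optional, so treat the
        # whole text as headers.
        return raw, ""
    # Give the last header line back its own terminating CRLF.
    return head + CRLF, body
```

A message with no null line is treated here as consisting of headers
only, which matches the statement that the body is optional.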
1) Folding and unfolding of headers
Each header item can be viewed as a single, logical,
line of ASCII characters. For convenience, this conceptual entity
can be split into a multiple-line representation (i.e., "folded").
The general rule is that
wherever there can be linear white space characters, one
can instead insert a CRLF immediately followed by AT
LEAST one linear white space character. Thus, the
single
Once header fields have been unfolded, they may be viewed
as being composed of a field name followed by a ":"
(colon), followed by a field body. The field name
must be composed of printable ASCII characters (i.e.,
characters which have decimal values between 33 and 126)
and linear white space characters. The field body may be
composed of any ASCII characters (other than CRLFs, which
have been removed by unfolding).
Certain header fields may be interpreted according to an
internal syntax which some systems may wish to parse.
These fields will be referred to as structured fields.
Examples include fields containing dates and addresses.
Other fields, such as the subject field, are regarded
simply as a single line of text.
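
The unfolding rule and the name/colon/body view of a header field can
be sketched as follows. This is only an illustration under stated
assumptions: the helper names unfold and split_field are
hypothetical, linear white space is taken to mean space and
horizontal tab, and lines are assumed to end in CRLF.

```python
import re
from typing import List, Tuple


def unfold(header_block: str) -> List[str]:
    """Return the logical header lines of a (possibly folded) header block."""
    # A CRLF immediately followed by a space or tab marks a fold; dropping
    # just the CRLF rejoins the continuation with the line before it.
    unfolded = re.sub(r"\r\n(?=[ \t])", "", header_block)
    # The CRLFs that remain terminate logical header lines.
    return [line for line in unfolded.split("\r\n") if line]


def split_field(line: str) -> Tuple[str, str]:
    """Split one unfolded header line into (field name, field body)."""
    name, colon, body = line.partition(":")
    if not colon:
        raise ValueError("header line has no colon: %r" % line)
    return name, body
```

For example, the folded block "To: Pogran at MIT-Multics," followed
by CRLF and an indented "Vittal at BBN-TENEXA" unfolds to a single
"To" field whose body lists both addresses.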
3) Field names
To aid in the creation and reading of field names, the
free insertion of linear white space characters is
allowed in reasonable places. Rather than obscuring the
syntax specification for field names with the explicit
syntax for these characters, the existence of a simple
"lexical" analyzer is assumed. The
analyzer reinterprets the unfolded text which comprises
the field name as a sequence of atoms separated by linear
white space characters. The field name may be
conveniently represented by the sequence of these atoms
separated by a single ASCII space character.
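
A minimal sketch of such a "lexical" analyzer for field names is
given below. The name normalize_field_name is hypothetical, and the
sketch assumes that only space and horizontal tab separate the atoms
of a field name.

```python
import re


def normalize_field_name(raw_name: str) -> str:
    """Represent a field name as its atoms joined by single spaces."""
    # Split on runs of spaces and tabs; empty pieces come from leading
    # or trailing white space and are discarded.
    atoms = [atom for atom in re.split(r"[ \t]+", raw_name) if atom]
    return " ".join(atoms)
```

With this representation, differently spaced spellings of the same
field name compare equal after normalization.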
To aid in the creation and reading of structured fields,
the free insertion of linear white space characters is
allowed in reasonable places. Rather than obscuring the
syntax specifications for these structured fields with the
explicit syntax for these characters,
the existence of another simple "lexical" analyzer is
assumed. It provides an interpretation of the unfolded
text comprising the body of the field as a sequence of
lexical symbols. These
contents> ::=
up the , as defined
the following sections, consisting of combinations , , ,
and tokens
::= sequence of one or more
ASCII alpha-numeric or
characters, excluding all
characters (those characters
a decimal value less than 33
equal to 127) and >
::= sequence of one or more
ASCII characters, where adjacent quotes are treated as
single quote and part of
string> <">
::= sequence of one or more
ASCII characters excluding
and >
Comments may appear only within structured fields. A
comment is any set of TELNET ASCII
characters, which is not within a quoted string, and which
is enclosed in matching parentheses; parentheses nest, so
that if a left paren occurs in a comment string, there
must also be a matching right paren.
Comments are NOT passed to the FTP server, as part of a
MAIL or MLFL command, since comments are not part of the
"formal" address.
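
The comment rule (matching, nesting parentheses that are not inside
a quoted string) can be sketched as a small scanner. The function
name strip_comments is hypothetical, and the choice to raise an
error on unbalanced parentheses is an assumption of this sketch, not
something this document prescribes.

```python
def strip_comments(field_body: str) -> str:
    """Remove parenthesized comments from an unfolded structured field body."""
    kept = []
    depth = 0          # current nesting depth of comment parentheses
    in_quotes = False  # True while scanning inside a quoted string
    for ch in field_body:
        if in_quotes:
            # Parentheses inside a quoted string are ordinary text.
            kept.append(ch)
            if ch == '"':
                in_quotes = False
        elif depth > 0:
            # Inside a comment: only track nesting, keep nothing.
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        elif ch == "(":
            depth = 1
        elif ch == ")":
            raise ValueError("right paren without a matching left paren")
        else:
            kept.append(ch)
            if ch == '"':
                in_quotes = True
    if depth:
        raise ValueError("left paren without a matching right paren")
    return "".join(kept)
```

A sending program could apply something like this before handing an
address to the FTP server, since comments are not part of the
"formal" address.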
2) "White space"
Remember that in structured fields, MULTIPLE LINEAR WHITE
SPACE TELNET ASCII CHARACTERS (namely tabs and spaces)
ARE TREATED AS SINGLE SPACES AND MAY FREELY SURROUND ANY
SYMBOL. In all header fields, at least one such character is
REQUIRED only at the beginning of folded lines.
Writers of mail-sending (i.e. header generating) programs
should realize that there is no Network-wide definition of
the effect of TELNET ASCII tab characters on the
appearance of text at another Network host; therefore, the
use of tabs in message headers, though permitted, is
discouraged.
Note that the contents of messages are required to comply
with TELNET NVT conventions (e.g., CR must be followed
by either LF, making a CRLF, or NUL, if the CR is
to stand alone).
3) Quoted strings
Where permitted (i.e., in structured fields), quoted
strings are treated as a single symbol (i.e., equivalent
to an atom syntactically). However, if quoted strings
are to be "folded" onto multiple lines, then the rules
for folding must be adhered to. (See items II.B.1.a.1,
above, and II.B.1.c.6, below.) Note that the semantics do
not encounter CRLFs in quoted strings,
although particular parsing programs may wish to note
their presence.
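
The syntax given earlier states that, within a quoted string,
adjacent quotes are treated as a single quote which is part of the
string. A minimal sketch of recovering the text of a quoted string
on that basis follows; the name decode_quoted_string is
hypothetical, and rejecting a lone interior quote is an assumption
of the sketch.

```python
def decode_quoted_string(token: str) -> str:
    """Return the text carried by a quoted-string token."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("not a quoted string: %r" % token)
    inner = token[1:-1]
    # After removing every doubled quote, any quote left over would be a
    # lone interior quote, which cannot be part of the string.
    if '"' in inner.replace('""', ""):
        raise ValueError("unpaired quote inside quoted string: %r" % token)
    # Each pair of adjacent quotes stands for one literal quote.
    return inner.replace('""', '"')
```

The decoded text, like an atom, is then handled as a single symbol
by the higher-level parser.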
- Angle brackets ("<" and ">") are
where there is a question of the
of machine-usable code (e.g. mailboxes).
5) Case independence of certain specials
It should be assumed by all mail reading programs
certain s can be represented in any combination
upper and lower case. These are
- s
- "File", in a ,
- "at", in an indicator>,
- s
- s
- s,
-
For example, the field names "From", "FROM", "from", and
even "FroM" should all be treated identically. Note that,
at the level of this specification, case IS relevant to
other s and s. Also see
II.C.1.a.4, below
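
The field-name example above can be sketched as a case-insensitive
lookup. The names same_field_name and find_field are hypothetical;
the sketch assumes that header fields are held as (name, body)
pairs, and it deliberately leaves the case of field bodies
untouched.

```python
from typing import List, Tuple


def same_field_name(a: str, b: str) -> bool:
    """Compare two field names without regard to upper or lower case."""
    return a.lower() == b.lower()


def find_field(fields: List[Tuple[str, str]], wanted: str) -> List[str]:
    """Return the bodies of all fields whose name matches `wanted`."""
    # Only the field name is compared case-insensitively; the bodies are
    # returned exactly as received.
    return [body for name, body in fields if same_field_name(name, wanted)]
```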
6) Folding long
Each header item (field of the message) may be represented
on exactly one line consisting of the name of the field
and its body, and this is what the parser sees. For
readability, it is recommended that the field body
portion of long header items be "folded" onto multiple
lines of the actual header.
7) Backspace
Backspace TELNET ASCII characters (ASCII BS, decimal 8)
may be included in and
effect overstriking; however, any use of backspaces
effects an overstrike to the left of the beginning of or is prohibited
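
The prohibition on backspacing past the beginning of a piece of
header text can be checked with a simple column counter. This is a
sketch only: the name backspaces_stay_in_bounds is hypothetical, and
the caller is assumed to pass in exactly the text within which
overstriking is permitted.

```python
BACKSPACE = "\x08"  # ASCII BS, decimal 8


def backspaces_stay_in_bounds(text: str) -> bool:
    """Return True if no backspace moves left of the start of `text`."""
    column = 0
    for ch in text:
        if ch == BACKSPACE:
            if column == 0:
                # This backspace would overstrike to the left of the
                # beginning of the text, which is prohibited.
                return False
            column -= 1
        else:
            column += 1
    return True
```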
The following syntax for the bodies of various fields should be
thought of as describing each field body as a single long string
(or line). The section on Lexical Analysis (section II.B.1)
indicated how such long strings can be represented on more than
one line in the actual transmitted message.