As per Relevance of the word reference, we have this rfc below:











Network Working Group J.
Request for Comments: 2557 Stockholm University/
Obsoletes: 2110 A.
Category: Standards Track Microsoft
N.
Lotus Development
March 1999


MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1999). All Rights Reserved.



Abstract

HTML [RFC 1866] defines a powerful means of specifying multimedia
documents. These multimedia documents consist of a text/html root
resource (object) and other subsidiary resources (image, video clip,
applet, etc. objects) referenced by Uniform Resource Identifiers
(URIs) within the text/html root resource. When an HTML multimedia
document is retrieved by a browser, each of these component resources
is individually retrieved in real time from a location, and using a
protocol, specified by each URI.

In order to transfer a complete HTML multimedia document in a single
e-mail message, it is necessary to: a) aggregate a text/html root
resource and all of the subsidiary resources it references into a
single composite message structure, and b) define a means by which
URIs in the text/html root can reference subsidiary resources within
that composite message structure.

This document a) defines the use of a MIME multipart/related
structure to aggregate a text/html root resource and the subsidiary
resources it references, and b) specifies a MIME content-header
(Content-Location) that allows URIs in a multipart/related text/html
root body part to reference subsidiary resources in other body parts
of the same multipart/related structure.




Palme, et al. Standards Track [Page 1]

RFC 2557 MIME Encapsulation of Aggregate Documents March 1999


While initially designed to support e-mail transfer of complete
multi-resource HTML multimedia documents, these conventions can also
be applied to resources retrieved by other transfer protocols such
as HTTP and FTP, to retrieve a complete multi-resource HTML multimedia
document in a single transfer, or for storage and archiving of
complete HTML documents.

Differences between this and a previous version of this standard,
which was published as RFC 2110, are summarized in chapter 12.

Table of Contents

1. Introduction
2. Terminology
   2.1 Conformance requirement terminology
   2.2 Other terminology
3. Overview
4. The Content-Location MIME Content Header
   4.1 MIME content headers
   4.2 The Content-Location Header
   4.3 URIs of MHTML aggregates
   4.4 Encoding and decoding of URIs in MIME header fields
5. Base URIs for resolution of relative URIs
6. Sending documents without linked objects
7. Use of the Content-Type "multipart/related"
8. Usage of Links to Other Body Parts
   8.1 General principle
   8.2 Resolution of URIs in text/html body parts
   8.3 Use of the Content-ID header and CID URLs
9. Examples
   9.1 Example of a HTML body without included linked objects
   9.2 Example with an absolute URI to an embedded GIF picture
   9.3 Example with relative URIs to embedded GIF pictures
   9.4 Example with a relative URI and no BASE available
   9.5 Example using CID URL and Content-ID header to an embedded
       GIF picture
   9.6 Example showing permitted and forbidden references between
       nested body parts
10. Character encoding issues and end-of-line issues
11. Security Considerations
   11.1 Security considerations not related to caching
   11.2 Security considerations related to caching
12. Differences as compared to the previous version of this
    proposed standard in RFC 2110
13. Acknowledgments
14. References
15. Authors' Addresses
16. Full Copyright Statement





1. Introduction

There are a number of document formats (Hypertext Markup Language
[HTML2], Extensible Markup Language [XML], Portable Document Format
[PDF] and Virtual Reality Markup Language [VRML]) that specify
documents consisting of a root resource and a number of distinct
subsidiary resources referenced by URIs within that root resource.
There is an obvious need to be able to send such multi-resource
documents in e-mail [SMTP], [RFC822] messages.

The standard defined in this document specifies how to aggregate such
multi-resource documents in MIME-formatted [MIME1 to MIME5] messages
for precisely this purpose.

While this specification was developed to satisfy the specific
aggregation requirements of multi-resource HTML documents, it may
also be applicable to other multi-resource document representations
linked by URIs. Even so, there is no requirement that
implementations claiming conformance to this standard be able to
handle any URI linked document representations other than those whose
root is HTML.

This aggregation into a single message of a root resource and the
subsidiary resources it references may also be applicable to
resources retrieved by other protocols such as HTTP or FTP, or to the
archiving of complete web pages as they appeared at a particular
point in time.

An informational RFC will be published as a supplement to this
standard. The informational RFC will discuss implementation methods
and some implementation problems. Implementers are
recommended to read this informational RFC when developing
implementations of this standard. You can find it through URL
http://www.dsv.su.se/~jpalme/ietf/mhtml.html

This standard specifies that body parts to be referenced can be
identified either by a Content-ID (containing a Message-ID value) or
by a Content-Location (containing an arbitrary URL). The reason why
this standard does not only recommend the use of Content-IDs is that
it should be possible to forward existing web pages via e-mail
without having to rewrite the source text of the web pages. Such
rewriting has several disadvantages, one of them being that security
checksums will probably be invalidated.










2. Terminology

2.1 Conformance requirement terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [IETF-TERMS].

An implementation is not compliant if it fails to satisfy one or more
of the MUST requirements for the protocols it implements. An
implementation that satisfies all the MUST and all the SHOULD
requirements for its protocols is said to be "unconditionally
compliant"; one that satisfies all the MUST requirements but not all
the SHOULD requirements for its protocols is said to be
"conditionally compliant."

2.2 Other terminology

Most of the terms used in this document are defined in other RFCs.

Absolute URI,          See Relative Uniform Resource Locators
AbsoluteURI            [RELURL].

CID                    See Message/External Body Content-ID [MIDCID].

Content-Base           This header was specified in RFC 2110, but has
                       been removed in this new version of the MHTML
                       standard.

Content-ID             See Message/External Body Content-ID [MIDCID].

Content-Location       MIME message or content part header with the
                       URI of the MIME message or content part body,
                       defined in section 4.2 below.

Content-Transfer-      Conversion of a text into 7-bit octets as
Encoding               specified in [MIME1] chapter 6.

CR                     See [RFC822].

CRLF                   See [RFC822].

Displayed text         The text shown to the user reading a document
                       with a web browser. This may be different from
                       the HTML markup, see the definition of HTML
                       markup below.

Header                 Field in a message or content heading
                       specifying the value of one attribute.

Heading                Part of a message or content before the first
                       CRLFCRLF, containing formatted fields with
                       attributes of the message or content.

HTML                   See HTML 2 specification [HTML2].

HTML Aggregate         HTML objects together with some or all objects
objects                to which the HTML object contains hyperlinks,
                       directly or indirectly.

HTML markup            A file containing HTML encodings as specified
                       in [HTML] which may be different from the
                       displayed text which a person using a web
                       browser sees. For example, the HTML markup may
                       contain "&lt;" where the displayed text
                       contains the character "<".

LF                     See [RFC822].

MIC                    Message Integrity Codes, codes used to verify
                       that a message has not been modified.

MIME                   See the MIME specifications [MIME1 to MIME5].

MUA                    Messaging User Agent.

PDF                    Portable Document Format, see [PDF].

Relative URI,          See HTML 2 [HTML2] and RFC 1808 [RELURL].
RelativeURI

URI, absolute and      See RFC 1866 [HTML2].
relative

URL                    See RFC 1738 [URL].

URL, relative          See Relative Uniform Resource Locators [RELURL].

VRML                   See Virtual Reality Markup Language [VRML].











3. Overview

An aggregate document is a MIME-encoded message that contains a root
resource (object) as well as other resources linked to it via URIs.
These other resources may be required to display a multimedia
document based on the root resource (inline pictures, style sheets,
applets, etc.), or be the root resources of other multimedia
documents. It is important to keep in mind that aggregate documents
need to satisfy the differing needs of several audiences.

Mail sending agents might send aggregate documents as an encoding of
normal day-to-day electronic mail. Mail sending agents might also
send aggregate documents when a user wishes to mail a particular
document from the web to someone else. Finally, mail sending agents
might send aggregate documents as automatic responders, providing
access to WWW resources for non-IP connected clients. Also with other
protocols such as HTTP or FTP, there may sometimes be a need to
retrieve aggregate documents. Receiving agents also have several
differing needs. Some receiving agents might be able to receive an
aggregate document and display it just as any other text content type
would be displayed. Others might have to pass this aggregate
document to a browsing program, and provisions need to be made to
make this possible.

Finally, several other constraints on the problem arise. It is
important that it be possible for a document to be signed and for it
to be transmitted and displayed without breaking the message
integrity (MIC) checksum that is part of the signature.

4. The Content-Location MIME Content Header

4.1 MIME content headers

In order to resolve URI references to resources in other body parts,
one MIME content header is defined, Content-Location. This header can
occur in any message or content heading.

The syntax for this header is, using the syntax definition tools from
[ABNF]:

   quoted-pair = ("\" text)

   text = %d1-9 /        ; Characters excluding CR and LF
          %d11-12 /
          %d14-127

   WSP = SP / HTAB       ; Whitespace characters

   FWS = ([*WSP CRLF] 1*WSP)   ; Folding white-space

   ctext = NO-WS-CTL /   ; Non-white-space controls
           %d33-39 /     ; The rest of the US-ASCII
           %d42-91 /     ; characters not including "(",
           %d93-127      ; ")", or "\"

   comment = "(" *([FWS] (ctext / quoted-pair / comment))
             [FWS] ")"

   CFWS = *([FWS] comment) (([FWS] comment) / FWS)

   content-location = "Content-Location:" [CFWS] URI [CFWS]

   URI = absoluteURI | relativeURI

where URI is restricted to the syntax for URLs as defined in Uniform
Resource Locators [URL] until IETF specifies other kinds of URIs.

4.2 The Content-Location Header

A Content-Location header specifies a URI that labels the content of
the body part in whose heading it is placed. Its value can be an
absolute or a relative URI. Any URI or URL scheme may be used, but
use of non-standardized URI or URL schemes might entail some risk
that recipients cannot handle them correctly.

A URI in a Content-Location header need not refer to a resource
which is globally available for retrieval using this URI (after
resolution of relative URIs). However, URIs in Content-Location
headers (if absolute, or resolvable to absolute URIs) SHOULD still be
globally unique.

A Content-Location header can thus be used to label a resource which
is not retrievable by some or all recipients of a message. For
example, a Content-Location header may label an object which is only
retrievable using this URI in a restricted domain, such as within a
company-internal web space. A Content-Location header can even
contain a fictitious URI. Such a URI need not be globally unique.

A single Content-Location header field is allowed in any message or
content heading, in addition to a Content-ID header (as specified in
[MIME1]) and, in Message headings, a Message-ID (as specified in
[RFC822]). All of these constitute different, equally valid body part
labels, and any of them may be used to satisfy a reference to a body
part. Multiple Content-Location header fields in the same message
heading are not allowed.






Example of a multipart/related structure containing body parts with
both Content-Location and Content-ID labels:

Content-Type: multipart/related; boundary="boundary-example";
              type="text/html"

--boundary-example

Content-Type: text/html; charset="US-ASCII"

... ... ... ...
... ... ... ...

--boundary-example
Content-Type: image/
Content-ID: <97116092511xyz@foo.bar.net>
Content-Location: fiction1/fiction

--boundary-example
Content-Type: image/
Content-ID: <97116092811xyz@foo.bar.net>
Content-Location: fiction1/fiction

--boundary-example--
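
A structure of this kind could be assembled with Python's standard
email package. The sketch below is illustrative only and is not part
of the specification: the function name build_labeled_aggregate, the
example.invalid domain, and the choice of GIF image parts are all
assumptions made for the example.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import make_msgid


def build_labeled_aggregate(html, images):
    """Build a multipart/related message (hypothetical helper).

    html   -- ASCII text of the text/html root resource
    images -- list of (gif_bytes, content_location) tuples
    """
    # The type parameter names the media type of the start object.
    related = MIMEMultipart("related", type="text/html")
    related.attach(MIMEText(html, "html", "us-ascii"))
    for data, location in images:
        part = MIMEImage(data, _subtype="gif")
        # Each subsidiary part carries both labels; either one may be
        # used to satisfy a reference from the root resource.
        part["Content-ID"] = make_msgid(domain="example.invalid")
        part["Content-Location"] = location
        related.attach(part)
    return related
```

Calling str() on the returned object serializes the whole aggregate,
including a generated boundary.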

4.3 URIs of MHTML aggregates

The URI of an MHTML aggregate is not the same as the URI of its root.
The URI of its root will directly retrieve only the root resource
itself, even if it may cause a web browser to separately retrieve
in-line linked resources. If a Content-Location header field is used
in the heading of a multipart/related, this Content-Location SHOULD
apply to the whole aggregate, not to its root part.

When a URI referring to an MHTML aggregate is used to retrieve that
aggregate, the set of resources retrieved can be different from the
set of resources retrieved using the Content-Locations of its parts.
For example, retrieving an MHTML aggregate may return an old version,
while retrieving the root URI and its in-line linked objects may
return a newer version.

4.4 Encoding and decoding of URIs in MIME header fields

4.4.1 Encoding of URIs containing inappropriate characters

Some documents may contain URIs with characters that are
inappropriate for an RFC 822 header, either because the URI itself
has an incorrect syntax according to [URL] or the URI syntax standard
has been changed to allow characters not previously allowed in MIME
headers. These URIs cannot be sent directly in a message header. If
such a URI occurs, all spaces and other illegal characters in it MUST
be encoded using one of the methods described in [MIME3] section 4.
This encoding MUST only be done in the header, not in the HTML text.
Receiving clients MUST decode the [MIME3] encoding in the heading
before comparing URIs in body text to URIs in Content-Location
headers.

The charset parameter value "US-ASCII" SHOULD be used if the URI
contains no octets outside of the 7-bit range. If such octets are
present, the correct charset parameter value (derived e.g. from
information about the HTML document the URI was found in) SHOULD be
used. If this cannot be safely established, the value "UNKNOWN-8BIT"
[RFC 1428] MUST be used.

Note that for the matching of URIs in text/html body parts to URIs
in Content-Location headers, the value of the charset parameter is
irrelevant, but that it may be relevant for other purposes, and that
incorrect labeling MUST, therefore, be avoided. Warning: Irrelevance
of the charset parameter may not be true in the future, if different
character encodings of the same non-English filename are used in
HTML.
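
As a rough illustration of the header encoding described above, the
following sketch wraps a URI in [MIME3] encoded-words using the "B"
(base64) method and decodes it again on receipt. The function names,
the example URI, and the 10-character chunk size (chosen so that each
encoded-word stays short for common charset names) are assumptions of
this example, not requirements of the standard. The "UNKNOWN-8BIT"
case is not handled.

```python
import base64
from email.header import decode_header


def encode_location(uri, charset="us-ascii", chunk=10):
    """Encode a URI for use in a Content-Location header (sketch)."""
    # URIs made only of printable, non-space ASCII are sent unchanged.
    if all(32 < ord(c) < 127 for c in uri):
        return uri
    words = []
    for i in range(0, len(uri), chunk):
        raw = uri[i:i + chunk].encode(charset)
        b64 = base64.b64encode(raw).decode("ascii")
        words.append("=?%s?B?%s?=" % (charset, b64))
    # White space between adjacent encoded-words is ignored on decoding.
    return " ".join(words)


def decode_location(value):
    """Undo the header encoding before URIs are compared (sketch)."""
    out = []
    for text, charset in decode_header(value):
        if isinstance(text, bytes):
            text = text.decode(charset or "ascii")
        out.append(text)
    return "".join(out)
```

Only the header value is transformed; the URI as written in the HTML
text is left untouched, which is why the receiver has to decode the
header before comparing the two.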

4.4.2 Folding of long URIs

Since MIME header fields have a limited length and long URIs can
result in Content-Location headers that exceed this length,
Content-Location headers may have to be folded.

Encoding as discussed in clause 4.4.1 MUST be done before such
folding. After that, the folding can be done, using the algorithm
defined in [URLBODY] section 3.1.

4.4.3 Unfolding and decoding of received URLs in MIME header fields

Upon receipt, folded MIME header fields should be unfolded, and then
any MIME encoding should be removed, to retrieve the original URI.

5. Base URIs for resolution of relative URIs

Relative URIs inside the contents of MIME body parts are resolved
relative to a base URI using the methods for resolving relative URIs
described in [RELURL]. In order to determine this base URI, the
first-applicable method in the following list applies.

(a) There is a base specification inside the MIME body part
    containing the relative URI which resolves relative URIs into
    absolute URIs. For example, HTML provides the BASE element for
    this purpose.

(b) There is a Content-Location header in the immediately surrounding
    heading of the body part and it contains an absolute URI. This
    URI can serve as a base in the same way as a requested URI can
    serve as a base for relative URIs within a file retrieved via
    HTTP [HTTP].

(c) If necessary, step (b) can be repeated recursively to find a
    suitable Content-Location header in a surrounding multi-part or
    message heading.

(d) If the MIME object is returned in a HTTP response, use the URI
    used to initiate the request.

(e) When the methods above do not yield an absolute URI, a base URL
    of "thismessage:/" MUST be employed. This base URL has been
    defined for the sole purpose of resolving relative references
    within a multipart/related structure when no other base URI is
    specified.

This is also described in other words in section 8.2 below.
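
The first-applicable rule in (a) to (e) can be sketched roughly as
follows. This is a minimal illustration under stated assumptions: the
function names and the placeholder host are invented for the example,
and the caller is assumed to have already collected the BASE value and
the Content-Location values from the innermost heading outward. The
placeholder is needed because Python's urljoin() does not join
relative references against schemes it does not know, such as
"thismessage".

```python
from urllib.parse import urljoin, urlsplit

THISMESSAGE = "thismessage:/"
_PLACEHOLDER = "http://placeholder.invalid/"  # hypothetical stand-in


def is_absolute(uri):
    """True if uri is non-empty and carries a scheme."""
    return bool(uri) and bool(urlsplit(uri).scheme)


def pick_base(html_base=None, content_locations=(), request_uri=None):
    """Return the base URI chosen by the first applicable method."""
    if is_absolute(html_base):            # (a) BASE inside the body part
        return html_base
    for location in content_locations:    # (b), (c) innermost first
        if is_absolute(location):
            return location
    if is_absolute(request_uri):          # (d) URI of the HTTP request
        return request_uri
    return THISMESSAGE                    # (e) fallback base


def resolve(base, ref):
    """Resolve ref against base, including the thismessage:/ base."""
    if is_absolute(ref):
        return ref
    if base.startswith(THISMESSAGE):
        joined = urljoin(_PLACEHOLDER + base[len(THISMESSAGE):], ref)
        if joined.startswith(_PLACEHOLDER):
            return THISMESSAGE + joined[len(_PLACEHOLDER):]
        return joined
    return urljoin(base, ref)
```

With no BASE, no absolute Content-Location and no request URI,
pick_base() falls through to "thismessage:/", so a relative reference
and a relative Content-Location with the same text resolve to the
same fictitious absolute URI.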

6. Sending documents without linked objects

If a text/html resource (object) is sent without the subsidiary
resources to which it refers, it MAY be sent by itself. In this
case, embedding it in a multipart/related structure is not necessary.

Such a text/html resource may either contain no URIs, or URIs which
the recipient is expected to retrieve (if possible) via a URI
specified protocol. A text/html resource may also be sent with
unresolvable links in special cases, such as when two authors
exchange drafts of unfinished resources.

Inclusion of URIs referencing resources which the recipient has to
retrieve via a URI specified protocol may not work for some
recipients. This is because not all e-mail recipients have full
Internet connectivity, or because URIs which work for a sender will
not work for a recipient. This occurs, for example, when a URI
refers to a resource within a company-internal network that is not
accessible from outside the company.








7. Use of the Content-Type "multipart/related"

If a message contains one or more MIME body parts containing URIs and
also contains, as separate body parts, resources to which these URIs
(as defined, for example, in HTML 2.0 [HTML2]) refer, then this whole
set of body parts (referring body parts and referred-to body parts)
SHOULD be sent within a multipart/related structure as defined in
[REL].

Even though headers can occur in a message that lacks an associated
multipart/related structure, this standard only covers their use for
resolution of URIs between body parts inside a multipart/related
structure. This standard does cover the case where a resource in a
nested multipart/related structure contains URIs that reference MIME
body parts in another multipart/related structure, in which it is
enclosed. This standard does not cover the case where a resource in a
multipart/related structure contains URIs that reference MIME body
parts in another parallel or nested multipart/related structure, or
in another MIME message, even if methods similar to those described
in this standard are used. Implementers who employ such URIs are
warned that receiving agents implementing this standard may not be
able to process such references.

When the start body part of a multipart/related structure is an
atomic object, such as a text/html resource, it SHOULD be employed as
the root resource of that multipart/related structure. When the start
body part of a multipart/related structure is a multipart/alternative
structure, and that structure contains at least one alternative body
part which is a suitable atomic object, such as a text/html resource,
then that body part SHOULD be employed as the root resource of the
aggregate document. Implementers are warned, however, that some
receiving agents treat multipart/alternative as if it had been
multipart/mixed (even though MIME [MIME1] requires support for
multipart/alternative).

[REL] specifies that a type parameter is mandatory in a
"Content-Type: multipart/related" header, and requires that it be
employed to specify the type of the multipart/related start object.
Thus, the type parameter value shall be "multipart/alternative" when
the start part is of "Content-type multipart/alternative", even if the
actual root resource is of type "text/html". In addition, if the
multipart/related start object is not the first body part in a
multipart/related structure, [REL] further requires that its
Content-ID MUST be specified as the value of a start parameter in the
"Content-Type: multipart/related" header.

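On the receiving side, locating the start object of a
multipart/related structure might look roughly like the sketch below,
which uses Python's standard email package. The function name
find_start_part is an assumption of this example, and the sketch does
not descend into a multipart/alternative start part to pick the root
resource.

```python
def find_start_part(related):
    """Return the start body part of a parsed multipart/related message.

    related -- an email.message.Message whose payload is a list of parts
    """
    parts = related.get_payload()
    start = related.get_param("start")
    if start:
        wanted = start.strip().strip("<>")
        for part in parts:
            content_id = part.get("Content-ID", "")
            if content_id.strip().strip("<>") == wanted:
                return part
    # Without a (matching) start parameter, the first body part is
    # the start object.
    return parts[0]
```

Angle brackets are stripped from both values before comparison so that
a start parameter written with or without them still matches the
Content-ID header.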







When rendering a resource in a multipart/related structure, URI
references within that resource can be satisfied by body parts within
the same multipart/related structure (see section 8.2 below). This is
useful:

(a) For those recipients who only have email but not full Internet
    access.

(b) For those recipients who for other reasons, such as firewalls or
    the use of company-internal links, cannot retrieve URI linked
    resources via URI specified protocols.

    Note that this means that you can, via e-mail, send text/html
    objects which include URIs which the recipient cannot resolve
    via HTTP or other connectivity-requiring URIs.

(c) To send a document whose content is preserved even if the
    resources to which embedded URIs refer are later changed or
    deleted.

(d) For resources which are not available for protocol based
    retrieval.

(e) To speed up access.

When a sending MUA sends objects which were retrieved from the WWW,
it SHOULD maintain their WWW URIs. It SHOULD NOT transform these URIs
into some other URI form prior to transmitting them. This will allow
the receiving MUA to both verify MICs included with the message, as
well as verify the documents against their WWW counterparts, if this
is appropriate.

In certain cases this will not work, for example if a resource
contains URIs as parameters to objects and applets. In such a case,
it might be better to rewrite the document before sending it. This
problem is discussed in more detail in the informational RFC which
will be published as a supplement to this standard.

Within a multipart/related structure, each body part MUST have, if
assigned, a different Content-ID header value and a Content-Location
header field value which resolves to a different URI.

Two body parts in the same multipart/related structure can have the
same relative Content-Location header value only if, when resolved to
absolute URIs, they become different.
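
A sender could check this uniqueness requirement before transmission
along the following lines. The function name and the input shape (one
pair per body part, with Content-Location values already resolved to
absolute URIs) are assumptions made for this sketch.

```python
from collections import Counter


def find_label_conflicts(parts):
    """Report labels shared by more than one body part.

    parts -- list of (content_id, resolved_location) pairs; either
             element may be None when the header is absent.
    Returns (duplicate_content_ids, duplicate_locations).
    """
    ids = Counter(cid for cid, _ in parts if cid is not None)
    locations = Counter(loc for _, loc in parts if loc is not None)
    return (
        [cid for cid, count in ids.items() if count > 1],
        [loc for loc, count in locations.items() if count > 1],
    )
```

Two empty lists mean that every assigned label identifies exactly one
body part within the structure.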







8. Usage of Links to Other Body Parts

8.1 General principle

A body part, such as a text/html body part, may contain URIs that
reference resources which are included as body parts in the same
message, specifically as body parts within the same multipart/related
structure. Often such URI linked resources are meant to be displayed
inline to the viewer of the referencing body part; for example,
objects referenced with the SRC attribute of the IMG element in HTML
2.0 [HTML2]. New elements and attributes with this property are
proposed in the ongoing development of HTML (examples: applet, frame,
profile, OBJECT, classid, codebase, data, SCRIPT). A sender might
also want to send a set of HTML documents which the reader can
traverse, and which are related with the attribute href of the A
element.

If a user retrieves and displays a web page formed from a text/html
resource and the subsidiary resources it references, and merely
saves the text/html resource, that user may not at a later time be
able to retrieve and display the web page as it appeared when saved.
The format described in this standard can be used to archive and
retrieve all of the resources required to display the web page, as it
originally appeared at a certain moment of time, in one single
file.

In order to send or store such messages complete, there is a need to
specify how a URI in one body part can reference a resource in
another body part.

8.2 Resolution of URIs in text/html body parts

The resolution of inline, retrieval and other kinds of URIs in
text/html body parts is performed in the following way:

(a) Unfold multiple line header values according to [URLBODY]. Do NOT
    however translate character encodings of the kind described in
    [URL]. Example: Do not transform "a%2eb/c%20d" into "a.b/c d".

(b) Remove all MIME encodings, such as content-transfer encoding and
    header encodings as defined in MIME part 3 [MIME3]. Do NOT however
    translate character encodings of the kind described in [URL].
    Example: Do not transform "a%2eb/c%20d" into "a.b/c d".

(c) Try to resolve all relative URIs in the HTML content and in
    Content-Location headers using the procedure described in chapter
    5 above. The result of this resolution can be an absolute URI, or
    an absolute URI with the base "thismessage:/" as specified in
    chapter 5.

(d) For each referencing URI in a text/html body part, compare the
    value of the referencing URI after resolution as described in (a)
    and (b), with the URI derived from Content-ID and Content-Location
    headers for other body parts within the same or a
    surrounding Multipart/related structure. If the strings are
    identical, octet by octet, then the referencing URI references
    that body part. This comparison will only succeed if the two URIs
    are identical. This means that if one of the two URIs to be
    compared was a fictitious absolute URI with the base
    "thismessage:/", the other must also be such a fictitious
    absolute URI, and not resolvable to a real absolute URI.

(e) If (d) fails, try to retrieve the URI referenced in the
    hyperlink through ordinary Internet lookup. Resolution of URIs of
    the URL-types "mid" or "cid" to other content-parts, outside the
    same multipart/related structure, or in other separately sent
    messages, is not covered by this standard, and is thus neither
    encouraged nor forbidden.

8.3 Use of the Content-ID header and CID URLs

When URIs employing a CID (Content-ID) scheme as defined in [URL] and
[MIDCID] are used to reference other body parts in an MHTML
multipart/related structure, they MUST only be matched against
Content-ID header values, and not against Content-Location headers
with CID: values. Thus, even though the following two headers are
identical in meaning, only the Content-ID value will be matched, and
the Content-Location value will be ignored.

Content-ID: Content-Location: CID: foo@bar.

Note: Content-IDs MUST be globally unique [MIME1]. It is thus not
permitted to make them unique only within a message or within a
single multipart/related structure.
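
The matching rules of sections 8.2 (d) and 8.3 might be sketched as
follows. This is an illustration under stated assumptions, not a
complete implementation: find_part is a hypothetical name, resolve is
a caller-supplied function that applies the chapter 5 procedure to a
possibly relative URI, only the body parts of one multipart/related
structure are searched, and any percent-encoding inside CID URLs is
ignored.

```python
from urllib.parse import urlsplit


def find_part(related, ref, resolve):
    """Return the body part referenced by ref, or None.

    related -- parsed multipart/related message (payload is a list)
    ref     -- referencing URI as found in the text/html body part
    resolve -- callable mapping a URI to its resolved absolute form
    """
    target = resolve(ref)
    is_cid = urlsplit(target).scheme.lower() == "cid"
    for part in related.get_payload():
        if is_cid:
            # CID URLs are matched against Content-ID values only,
            # never against Content-Location headers.
            content_id = part.get("Content-ID")
            if content_id is not None:
                if content_id.strip().strip("<>") == target[4:]:
                    return part
            continue
        location = part.get("Content-Location")
        if location is not None and resolve(location.strip()) == target:
            return part
    return None
```

If no part matches, the caller would fall back to ordinary retrieval
of the URI as described in step (e).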

9. Examples

Warning: The examples are provided for illustrative purposes only. If
there is a contradiction between the explanatory text and the
examples in this standard, then the explanatory text is normative.

Notation: The examples contain indentation to show the structure, but
real objects should not be indented in this way.







9.1 Example of a HTML body without included linked objects

The first example is the simplest form of an HTML email message. This
message does not contain an aggregate HTML object, but simply a
message with a single HTML body part. This body part contains a URI
but the message does not contain the resource referenced by the
URI. To retrieve the resource referenced by the URI the receiving
client would need either IP access to the Internet, or an electronic
mail web gateway.

From: foo1@bar.
To: foo2@bar.
Subject: A simple
Mime-Version: 1.0
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: 8

Acute accent


The following two lines look have the same screen rendering: E with acute accent becomes Ï. E with acute accent becomes É. Try clicking
here.

9.2 Example with an absolute URI to an embedded GIF picture

The second example is an HTML message which includes a single image,
referenced using the Content-Location mechanism.

From: foo1@bar.
To: foo2@bar.
Subject: A simple
Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example";
type="text/html"; start=""

--boundary-
Content-Type: text/html;charset="US-ASCII
Content-ID:
... text of the HTML document, which might contain a
referencing a resource in another body part, for
through a statement such as


ALT="IETF logo">

--boundary-
Content-Location
http://www.ietf.cnri.reston.va.us/images/ietflogo.
Content-Type: IMAGE/
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4
etc...

--boundary-example--

9.3 Example with relative URIs to embedded GIF pictures

In this example, a Content-Location header field in the message
heading will be a base to all relative URLs, also inside the HTML
text being sent.

From: foo1@bar.
To: foo2@bar.
Subject: A simple
Mime-Version: 1.0
Content-Location: http://www.ietf.cnri.reston.va.us
Content-Type: multipart/related; boundary="boundary-example";
type="text/html

--boundary-
Content-Type: text/html; charset="ISO-8859-1"
Content-Transfer-Encoding: QUOTED-

... text of the HTML document, which might contain
referencing resources in other body parts, for example
statements such as

IETF logo1
IETF logo2
IETF logo3

Example of a copyright sign encoded with Quoted-Printable: =A
Example of a copyright sign mapped onto HTML markup: ¨

--boundary-
Content-Location
http://www.ietf.cnri.reston.va.us/images/ietflogo1.
; Note - Absolute Content-Location does not require
;





Content-Type: IMAGE/
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4
etc...

--boundary-
Content-Location: images/ietflogo2.
; Note - Relative Content-Location is resolved by
; specified in the Multipart/Related Content-Location
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4
etc...

--boundary-
Content-Location
http://www.ietf.cnri.reston.va.us/images/ietflogo3.
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4
etc...

--boundary-example--

9.4 Example with a relative URI and no BASE available

From: foo1@bar.
To: foo2@bar.
Subject: A simple
Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example";
type="text/html

--boundary-
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: QUOTED-

... text of the HTML document, which might contain a
referencing a resource in another body part, for
through a statement such as
IETF logo
Example of a copyright sign encoded with Quoted-Printable: =A
Example of a copyright sign mapped onto HTML markup: ¨






--boundary-
Content-Location: ietflogo.
Content-Type: IMAGE/
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4
etc...

--boundary-example--

9.5 Example using CID URL and Content-ID header to an embedded GIF image


From: foo1@bar.
To: foo2@bar.
Subject: A simple
Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example";
type="text/html"

--boundary-example
Content-Type: text/html; charset="US-ASCII"

... text of the HTML document, which might contain a hyperlink
referencing a resource in another body part, for example
through a statement such as
IETF logo

--boundary-example
Content-Location: CID:something@else ; this header is disregarded
Content-ID:
Content-Type: IMAGE/
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4
etc...

--boundary-example--
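
A minimal sketch of how a receiving agent might match a "cid:" URL in
the text/html part against the Content-ID header of a sibling body
part, as in the example above. The message text, the addresses and the
Content-ID value are placeholders invented for this sketch (the values
in the example above are not legible), and find_by_cid is a
hypothetical helper, not something this specification defines.

```python
from email import message_from_string
from urllib.parse import unquote

# Hypothetical message; the addresses, the Content-ID value and the tiny
# GIF payload are placeholders chosen for this sketch.
RAW_MESSAGE = """\
From: sender@example.org
To: recipient@example.org
Subject: CID sketch
Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example";
    type="text/html"

--boundary-example
Content-Type: text/html; charset="US-ASCII"

<IMG SRC="cid:foo4@example.org" ALT="IETF logo">

--boundary-example
Content-ID: <foo4@example.org>
Content-Type: image/gif
Content-Transfer-Encoding: base64

R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7

--boundary-example--
"""


def find_by_cid(message, cid_url):
    """Return the body part whose Content-ID matches a cid: URL, or None."""
    scheme, _, ident = cid_url.partition(":")
    if scheme.lower() != "cid":
        return None
    # A cid: URL carries the Content-ID without the angle brackets,
    # with reserved characters percent-escaped.
    wanted = "<" + unquote(ident) + ">"
    for part in message.walk():
        if (part.get("Content-ID") or "").strip() == wanted:
            return part
    return None


msg = message_from_string(RAW_MESSAGE)
image_part = find_by_cid(msg, "cid:foo4@example.org")
```

The lookup ignores Content-Location entirely, which mirrors the note
in the example that the Content-Location header on that part plays no
role when the reference is a CID URL.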

9.6 Example showing permitted and forbidden references between
multiple body parts

This example shows in which cases references are allowed between
multiple multipart/related body parts in a message.

From: foo1@bar.
To: foo2@bar.
Subject: A simple
Mime-Version: 1.0
Content-Type: multipart/related; boundary="boundary-example-1";
type="text/html"

--boundary-example-1
Content-Type: text/html;charset="US-ASCII"
Content-ID:

The image reference below will be resolved with the image
in the next body part


The image reference below cannot be resolved within this
MIME message, since it contains a reference from an outside
body part to an inside body part, which is not supported
by this standard
ALT="IETF logo with transparent background">

The anchor reference immediately below will be resolved within
the nested text/html body part below

The anchor reference immediately below will be resolved within
the nested text/html body part below

--boundary-example-1
Content-Location:
    http://www.ietf.cnri.reston.va.us/images/ietflogo.
Content-Type: IMAGE/
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4
etc...



--boundary-example-1
Content-Location:
    http://www.ietf.cnri.reston.va.us/more-
Content-Type: multipart/related; boundary="boundary-example-2";
type="text/html"

--boundary-example-2
Content-Type: text/html;charset="US-ASCII"
Content-ID:

The image reference below will be resolved with the image
in the surrounding multipart/related above


The image reference below will be resolved with the image
inside the current nested multipart/related below
ALT="IETF logo with transparent background">

--boundary-example-2
Content-Location: http:images/ietflogo2.
Content-Type: IMAGE/
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgANX/ACkpKTExMTk5OUJCQkpKSlJSUlpaWmNjY2tra3Nzc3t7e
SEhIyMjJSUlJycnKWlpa2trbW1tcDAwM7Ozv/eQnNzjHNzlGtrjGNjhFpae1
etc...

--boundary-example-2--
--boundary-example-1
Content-Location:
    http://www.ietf.cnri.reston.va.us/even-more-
Content-Type: multipart/related; boundary="boundary-example-3";
type="text/html"

--boundary-example-3
Content-Type: text/html;charset="US-ASCII"
Content-ID: <4@foo@bar.net>

The image reference below will be resolved with the image
inside the current nested multipart/related below
ALT="IETF logo with shadows">

The image reference below cannot be resolved according to
this standard since references between parallel
multipart/related structures are not supported
ALT="IETF logo with transparent background">



--boundary-example-3
Content-Location: http:images/ietflogo2d.
Content-Type: IMAGE/
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgANX/AMDAwCkpKTExMTk5OUJCQkpKSlJSUlpaWmNjY2tra3
c3t7e4SEhIyMjJSUlJycnKWlpa2trbW1tb29vcbGxs7OztbW1t7e3ufn5+/
etc...

--boundary-example-3--
--boundary-example-1--
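
The scoping shown in this example can be sketched as a lookup that
searches only the multipart/related structures enclosing the
referencing body part, innermost first, and never descends into a
nested or parallel structure. This is an illustration under
simplifying assumptions: URIs are compared as exact absolute strings,
the example.org locations are placeholders, and make_part,
ancestors_of and resolve are hypothetical helper names.

```python
from email.message import Message


def make_part(content_type, location=None, children=()):
    """Build a bare body part; passing children makes it a container."""
    part = Message()
    part["Content-Type"] = content_type
    if location:
        part["Content-Location"] = location
    for child in children:
        part.attach(child)
    return part


def ancestors_of(root, target, path=()):
    """Return the containers enclosing target, outermost first, or None."""
    if root is target:
        return path
    if root.is_multipart():
        for child in root.get_payload():
            found = ancestors_of(child, target, path + (root,))
            if found is not None:
                return found
    return None


def resolve(root, source, uri):
    """Find the part labelled uri that source is permitted to reference."""
    chain = ancestors_of(root, source) or ()
    for container in reversed(chain):  # innermost structure first
        if container.get_content_type() != "multipart/related":
            continue
        # Only direct members are candidates; nested structures are
        # visible as a whole but their inner parts are not.
        for candidate in container.get_payload():
            if candidate is source:
                continue
            if candidate.get("Content-Location") == uri:
                return candidate
    return None


BASE = "http://example.org/"  # placeholder, not from the example above
html3 = make_part("text/html")
img3 = make_part("image/gif", BASE + "images/logo2d.gif")
nested3 = make_part("multipart/related", BASE + "even-more-info",
                    [html3, img3])
html2 = make_part("text/html")
img2 = make_part("image/gif", BASE + "images/logo2.gif")
nested2 = make_part("multipart/related", BASE + "more-info",
                    [html2, img2])
html1 = make_part("text/html")
img1 = make_part("image/gif", BASE + "images/logo.gif")
root = make_part("multipart/related", None,
                 [html1, img1, nested2, nested3])
```

With this layout, a reference from html2 to the outer image resolves,
while a reference from html1 into nested2, or from html3 into nested2,
does not.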

10. Character encoding issues and end-of-line issues

For the encoding of characters in HTML documents and other text
documents into a MIME-compatible octet stream, the following
mechanisms are relevant:

- HTML [HTML2], [HTML-I18N] as an application of SGML [SGML] allows
characters to be denoted by character entities as well as by
numeric character references (e.g. "Latin small letter a with
acute accent" may be represented by "&aacute;" or "&#225;") in the
HTML markup.

- HTML documents, in common with other documents of the MIME
Content-Type "text", can be represented in MIME using one of
several character encodings. The MIME Content-Type "charset"
parameter value indicates the particular encoding used. For the
exact meaning and use of the "charset" parameter, please see
[MIME2] chapter 4.

Note that the "charset" parameter refers only to the MIME
character encoding. For example, the string "&aacute;" can be sent
in MIME with "charset=US-ASCII", while the raw character "Latin
small letter a with acute accent" cannot.
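
The distinction between the two mechanisms can be illustrated with a
short sketch: the entity and the numeric reference are plain ASCII
markup that a browser later expands, whereas the raw character cannot
be encoded in US-ASCII at all. The variable names are chosen for this
illustration only.

```python
import html

entity = "&aacute;"   # named character entity
numeric = "&#225;"    # numeric character reference
raw = "\u00e1"        # the raw character itself

# The markup consists of ASCII characters only, so it can be sent
# with charset=US-ASCII.
entity_bytes = entity.encode("us-ascii")

# The raw character has no representation in US-ASCII.
try:
    raw.encode("us-ascii")
    raw_fits_ascii = True
except UnicodeEncodeError:
    raw_fits_ascii = False

# Both markup forms denote the same character once HTML is processed.
expanded_entity = html.unescape(entity)
expanded_numeric = html.unescape(numeric)
```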

The above mechanisms are well defined and documented, and therefore
not further explained here. In sending a message, all the above
mentioned mechanisms MAY be used, and any mixture of them MAY occur
when sending the document in MIME format. Receiving user agents
(together with any Web browser they may use to display the document)
MUST be capable of handling any combinations of these mechanisms.

Also note that:

- Any documents including HTML documents that contain octet values
outside the 7-bit range need a content-transfer-encoding applied
before transmission over certain transport protocols [MIME1,
chapter 5].

- The MIME standard [MIME2] requires that e-mailed documents of
"Content-Type: Text/" MUST be in canonical form before a
Content-Transfer-Encoding is applied, i.e. that line breaks are
encoded as CRLFs, not as bare CRs or bare LFs or something else.
This is in contrast to [HTTP] where section 3.6.1 allows other
representations of line breaks.

Note that this might cause problems with integrity checks based on
checksums, which might not be preserved when moving a document from
the HTTP to the MIME environment. If a document has to be converted
in such a way that a checksum based message integrity check becomes
invalid, then this integrity check header SHOULD be removed from the
document.
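
A minimal sketch of the canonicalization step and its effect on a
checksum, assuming a text body that arrived over HTTP with mixed line
breaks. The function name to_canonical_text and the sample body are
illustrative; MD5 is used only because it is the digest referenced by
this document.

```python
import hashlib
import re


def to_canonical_text(text):
    """Rewrite CRLF, bare CR and bare LF line breaks uniformly as CRLF."""
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


# Hypothetical body as it might arrive over HTTP, with mixed line breaks.
http_body = "<P>first\n<P>second\r<P>third\r\n"
mime_body = to_canonical_text(http_body)

# The digest computed over the HTTP form no longer matches the
# canonical MIME form, so a stored checksum header would be stale.
digest_before = hashlib.md5(http_body.encode("us-ascii")).hexdigest()
digest_after = hashlib.md5(mime_body.encode("us-ascii")).hexdigest()
```

Because the two digests differ, a sender that converts line breaks in
this way would need to drop or recompute any checksum header carried
over from the HTTP environment.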

Other sources of problems are Content-Encoding used in HTTP but not
allowed in MIME, and character sets that are not able to represent
line breaks as CRLF. A good overview of the differences between HTTP
and MIME with regards to Content-Type: "text" can be found in [HTTP],
appendix C.

Some transport mechanisms may specify a default "charset" parameter
if none is supplied [HTTP, MIME1]. Because the default differs for
different mechanisms, when HTML is transferred through e-mail, the
charset parameter SHOULD be included, rather than relying on the
default.
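
As an illustration, a sender built on Python's standard email package
could state the charset explicitly when creating the text/html body
part. The sample markup is a placeholder, and this is only one way of
producing such a part.

```python
from email.mime.text import MIMEText

# Hypothetical document containing one non-ASCII character.
html_body = "<HTML><BODY><P>caf\u00e9</BODY></HTML>"

# Passing the charset makes the library emit it as a Content-Type
# parameter instead of leaving the receiver to assume a default.
html_part = MIMEText(html_body, "html", "iso-8859-1")
declared_charset = html_part.get_content_charset()
transfer_encoding = html_part["Content-Transfer-Encoding"]
```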

11. Security Considerations

11.1 Security considerations not related to caching

It is possible for a message sender to misrepresent the source of a
multipart/related body part to a message recipient by labeling it
with a Content-Location URI that references another resource.
Therefore, message recipients should only interpret Content-Location
URIs as labeling a body part for the resolution of references from
body parts in the same multipart/related message structure, and not
as the source of a resource, unless this can be verified by other
means.

URIs, especially File URIs, if used without change in a message, may
inadvertently reveal information that was not intended to be revealed
outside a particular security context. Message senders should take
care when constructing messages containing the new header fields
defined in this standard, that they are not revealing information
outside of any security contexts to which they belong.

Some resource servers hide passwords and tickets (access tokens to
information which should not be revealed to others) and other
sensitive information in non-visible fields or URIs within a
text/html resource. If such a text/html resource is forwarded in an
email message, this sensitive information may be inadvertently
revealed to others.

Since HTML documents can either directly contain executable content
(e.g. JavaScript) or indirectly reference executable content (the
"INSERT" specification, Java), it is exceedingly dangerous for a
receiving User Agent to execute content received in a mail message
without careful attention to restrictions on the capabilities of that
executable content.

HTML-formatted messages can be used to investigate user behaviour,
for example to break anonymity, in ways which invade the privacy of
individuals. If you send a message with an inline link to an object
which is not itself included in the message, the recipient's mailer
or browser may request that object through HTTP. The HTTP transaction
will then reveal who is reading the message. Example: A person who
wants to find out who is behind an anonymous user identity, or from
which workstation a user is reading his mail, can do this by sending
a message with an inline link and then observe from where this link
is used to request the object.
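
A receiving agent that wants to warn about such links could, as a
rough sketch, list the inline references in the HTML that are not
satisfied by a body part of the same message. The set of tags treated
as automatically fetched, the class and function names, and the sample
URLs are assumptions made for this illustration.

```python
from html.parser import HTMLParser


class InlineRefCollector(HTMLParser):
    """Collect URIs from attributes that are typically fetched unasked."""

    AUTO_FETCH = {("img", "src"), ("frame", "src"),
                  ("embed", "src"), ("body", "background")}

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser delivers tag and attribute names in lower case.
        for name, value in attrs:
            if (tag, name) in self.AUTO_FETCH and value:
                self.refs.append(value)


def external_refs(html_text, included_locations):
    """Return inline references that no included body part satisfies."""
    collector = InlineRefCollector()
    collector.feed(html_text)
    return [uri for uri in collector.refs
            if uri not in included_locations
            and not uri.lower().startswith("cid:")]


# Hypothetical message content: one image is included, one is not.
sample_html = ('<IMG SRC="http://example.org/logo.gif">'
               '<IMG SRC="http://tracker.example/pixel.gif?id=42">'
               '<IMG SRC="cid:part1@example.org">')
included = {"http://example.org/logo.gif"}
suspicious = external_refs(sample_html, included)
```

Only the reference to the object that is absent from the message is
reported; retrieving it would expose the reader to the kind of
observation described above.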

11.2 Security considerations related to caching

There is a well-known problem with the caching of directly retrieved
web resources. A resource retrieved from a cache may differ from that
re-retrieved from its source. This problem also manifests itself
when a copy of a resource is delivered in a multipart/related
structure.

When processing (rendering) a text/html body part in an MHTML
multipart/related structure, all URIs in that text/html body part
which reference subsidiary resources within the same
multipart/related structure SHALL be satisfied by those resources and
not by resources from any other local or remote source.

Therefore, if a sender wishes a recipient to always retrieve a URI
referenced resource from its source, a URI labeled copy of that
resource MUST NOT be included in the same multipart/related
structure.

In addition, since the source of a resource received in a
multipart/related structure can be misrepresented (see 11.1 above),
if a resource received in a multipart/related structure is stored in
a cache, it MUST NOT be retrieved from that cache other than by a
reference contained in a body part of the same multipart/related
structure. Failure to honor this directive will allow a
multipart/related structure to be employed as a Trojan Horse, for
example to inject bogus resources (e.g. a misrepresentation of a
competitor's Web site) into a recipient's generally accessible Web
cache.
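
One way to honor both rules is to key resources that arrived in a
message by the structure they came from, so that they can only be
found by references from that same structure. The following is a
sketch of that idea under stated assumptions: ScopedCache, its method
names and the structure identifiers are hypothetical and not defined
by this specification.

```python
class ScopedCache:
    """Cache that keeps message-delivered resources out of general use."""

    def __init__(self):
        self._web = {}    # fetched from the source, keyed by URI
        self._mhtml = {}  # delivered in a message, keyed by (structure, URI)

    def store_from_network(self, uri, data):
        self._web[uri] = data

    def store_from_message(self, structure_id, uri, data):
        self._mhtml[(structure_id, uri)] = data

    def lookup(self, uri, referring_structure=None):
        # A reference from inside a structure is satisfied by that
        # structure's own copy first.
        if referring_structure is not None:
            hit = self._mhtml.get((referring_structure, uri))
            if hit is not None:
                return hit
        # Everything else sees only resources fetched from their source.
        return self._web.get(uri)


cache = ScopedCache()
uri = "http://competitor.example/"  # placeholder URI
cache.store_from_message("message-1", uri, b"bogus copy")
seen_by_browser_before = cache.lookup(uri)
cache.store_from_network(uri, b"real page")
seen_by_browser_after = cache.lookup(uri)
seen_inside_message = cache.lookup(uri, referring_structure="message-1")
seen_by_other_message = cache.lookup(uri, referring_structure="message-2")
```

The copy delivered in the message never becomes visible to ordinary
browsing or to other messages, which is the property the paragraph
above requires.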

12. Differences as compared to the previous version of this proposed
standard in RFC 2110

The specification has been changed to show that the formats described
do not only apply to multipart MIME in email, but also to multipart
MIME transferred through other protocols such as HTTP or FTP.

In order to agree with [RELURL], Content-Location headers in
multipart Content-Headings can now be used as a base to resolve
relative URIs in their component parts, but only if no base URI can
be derived from the component part itself. Base URIs in
Content-Location header fields in inner headings have precedence over
base URIs in outer multipart headings.
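
The precedence rule can be sketched by folding the Content-Location
values from the outermost heading inwards with ordinary relative URL
resolution, so that an absolute inner value replaces an outer base and
a relative inner value is resolved against it. This simplification
considers Content-Location headers only (not, for instance, a BASE
element in the HTML), and the function name and URLs are illustrative.

```python
from urllib.parse import urljoin


def absolute_location(locations_outer_to_inner):
    """Combine Content-Location values, outermost heading first.

    Entries are None where a heading carries no Content-Location.
    Returns the resulting URI, or None if no heading supplied one.
    """
    base = ""
    for location in locations_outer_to_inner:
        if location:
            # urljoin keeps an absolute inner value as-is and resolves
            # a relative one against the base collected so far.
            base = urljoin(base, location)
    return base or None


relative_in_part = absolute_location(
    ["http://example.org/dir/", None, "images/logo.gif"])
inner_overrides_outer = absolute_location(
    ["http://outer.example/a/", "http://inner.example/b/", "pic.gif"])
nothing_declared = absolute_location([None, None])
```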

The Content-Base header, which was present in RFC 2110, has been
removed. A conservative implementor may choose to accept this header
in input for compatibility with implementations of RFC 2110, but MUST
never send any Content-Base header, since this header is no longer
a part of this standard.

A section 4.4.1 has been added, specifying how to handle the case of
sending a body part whose URI does not agree with the correct URI
syntax.

The handling of relative and absolute URIs for matching between body
parts has been merged into a single description, by specifying that
relative URIs, which cannot be resolved otherwise, should be handled
as if they had been given the URL "thismessage:/".
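
A sketch of this fallback, assuming plain relative path references:
both the reference in the HTML and the Content-Location label are
normalised the same way before they are compared. Because Python's
urljoin does not resolve against schemes it does not know, the sketch
resolves against a stand-in http base and then swaps the prefix; the
stand-in host and the function name normalise are artifacts of this
illustration, not part of the specification.

```python
from urllib.parse import urljoin, urlsplit

# Placeholder base used only to borrow urljoin's path handling; it is
# never dereferenced and never appears in the result.
_STAND_IN = "http://thismessage.invalid/"


def normalise(uri, base=None):
    """Return an absolute form of uri suitable for string comparison."""
    if urlsplit(uri).scheme:
        return uri  # already absolute
    if base:
        return urljoin(base, uri)
    # No base could be determined: treat the message itself as the base.
    # Assumes uri is a plain relative path (no "//host" form).
    joined = urljoin(_STAND_IN, uri)
    return "thismessage:/" + joined[len(_STAND_IN):]


reference_in_html = normalise("ietflogo.gif")
label_on_part = normalise("./ietflogo.gif")
with_known_base = normalise("images/a.gif", "http://example.org/dir/")
```

The reference and the label compare equal after normalisation even
though neither was absolute and no base was available.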

13. Acknowledgments

Harald T. Alvestrand, Richard Baker, Isaac Chan, Dave Crocker,
J. Duerst, Lewis Geer, Roy Fielding, Ned Freed, Al Gilman,
Hoffman, Andy Jacobs, Richard W. Jesmajian, Mark K. Joseph,
Herlihy, Valdis Kletnieks, Daniel LaLiberte, Ed Levinson, Jay Levitt,
Albert Lunde, Larry Masinter, Keith Moore, Gavin Nicol, Martyn W.
Peck, Pete Resnick, Jon Smirl, Einar Stefferud, Jamie Zawinski,
Zilles and several other people have helped us with preparing this
document. We alone take responsibility for any errors which may still
be in the document.

14. References

[ABNF] Crocker, D. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", RFC 2234, November 1997.

[CONDISP] Troost, R. and S. Dorner, "Communicating Presentation
Information in Internet Messages: The Content-Disposition
Header", RFC 2183, August 1997.

[HOSTS] Braden, R., Ed., "Requirements for Internet Hosts --
Application and Support", STD 3, RFC 1123, October 1989.

[HTML-I18N] Yergeau, F., Nicol, G., Adams, G. and M. Duerst,
"Internationalization of the Hypertext Markup Language",
RFC 2070, January 1997.

[HTML2] Berners-Lee, T. and D. Connolly: "Hypertext Markup
Language - 2.0", RFC 1866, November 1995.

[HTML3.2] Dave Raggett: HTML 3.2 Reference Specification, W3C
Recommendation, January 1997, at
http://www.w3.org/TR/REC-html32.

[HTTP] Berners-Lee, T., Fielding, R. and H. Frystyk,
"Hypertext Transfer Protocol -- HTTP/1.0", RFC 1945,
May 1996.

[IETF-TERMS] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.

[INFO] J. Palme: Sending HTML in MIME, an informational
supplement to the RFC: MIME Encapsulation of
Aggregate Documents, such as HTML (MHTML), Work in
Progress.

[MD5] Rivest, R., "The MD5 Message-Digest Algorithm", RFC
1321, April 1992.

[MIDCID] Levinson, E., "Content-ID and Message-ID Uniform
Resource Locators", RFC 2392, August 1998.

[MIME1] Freed, N. and N. Borenstein, "Multipurpose Internet
Mail Extensions (MIME) Part One: Format of Internet
Message Bodies", RFC 2045, December 1996.

[MIME2] Freed, N. and N. Borenstein, "Multipurpose Internet
Mail Extensions (MIME) Part Two: Media Types", RFC
2046, December 1996.

[MIME3] Moore, K., "MIME (Multipurpose Internet Mail
Extensions) Part Three: Message Header Extensions for
Non-ASCII Text", RFC 2047, December 1996.

[MIME4] Freed, N., Klensin, J. and J. Postel, "Multipurpose
Internet Mail Extensions (MIME) Part Four:
Registration Procedures", RFC 2048, January 1997.

[MIME5] Freed, N. and N. Borenstein, "Multipurpose Internet
Mail Extensions (MIME) Part Five: Conformance
Criteria and Examples", RFC 2049, November 1996.

[NEWS] Horton, M. and R. Adams: "Standard for interchange of
USENET messages", RFC 1036, December 1987.

[PDF] Tim Bienz and Richard Cohn: "Portable Document Format
Reference Manual", Addison-Wesley, Reading, MA, USA,
1993, ISBN 0-201-62628-4.

[REL] Levinson, E., "The MIME Multipart/Related Content-
Type", RFC 2387, August 1998.

[RELURL] Fielding, R., "Relative Uniform Resource Locators",
RFC 1808, June 1995.

[RFC822] Crocker, D., "Standard for the format of ARPA
Internet text messages", STD 11, RFC 822, August
1982.

[SGML] ISO 8879. Information Processing -- Text and Office
Systems -- Standard Generalized Markup Language (SGML),
1986.

[SMTP] Postel, J., "Simple Mail Transfer Protocol", STD 10,
RFC 821, August 1982.

[URL] Berners-Lee, T., Masinter, L. and M. McCahill,
"Uniform Resource Locators (URL)", RFC 1738, December
1994.

[URLBODY] Freed, N. and K. Moore, "Definition of the URL MIME
External-Body Access-Type", RFC 2017, October 1996.

[VRML] Gavin Bell, Anthony Parisi, Mark Pesce: "The Virtual
Reality Modeling Language (VRML) Version 1.0 Language
Specification", May 1995,
http://www.vrml.org/Specifications/.

[XML] Extensible Markup Language, published by the World
Wide Web Consortium, URL http://www.w3.org/XML

15. Authors' Addresses

For contacting the editors, preferably write to Jacob Palme.

Jacob
Stockholm University and
Electrum 230
S-164 40 Kista,

Phone: +46-8-16 16 67
Fax: +46-8-783 08 29
EMail: jpalme@dsv.su.


Alex
Microsoft
One Microsoft
Redmond WA 98052

Phone: +1-425-703-8238
EMail: alexhop@microsoft.


Nick
Lotus Development
55 Cambridge
Cambridge MA 02142-1295

EMail: Shelness@lotus.


Working group chairman

Einar
EMail: stef@nma.

16. Full Copyright Statement

Copyright (C) The Internet Society (1999). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.























