Network Working Group E. Whitehead
Request for Comments: 2376 UC Irvine
Category: Informational M. Murata
Fuji Xerox Info. Systems
July 1998


XML Media Types

Status of this Memo

This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1998). All Rights Reserved.

Abstract

This document proposes two new media subtypes, text/xml and
application/xml, for use in exchanging network entities which are
conforming Extensible Markup Language (XML). XML entities are
currently exchanged via the HyperText Transfer Protocol on the World
Wide Web, are an integral part of the WebDAV protocol for remote web
authoring, and are expected to have utility in many domains.

Table of Contents

1 INTRODUCTION ....................................................2
2 NOTATIONAL CONVENTIONS ..........................................3
3 XML MEDIA TYPES .................................................3
3.1 Text/xml Registration ........................................3
3.2 Application/xml Registration .................................6
4 SECURITY CONSIDERATIONS .........................................8
5 THE BYTE ORDER MARK (BOM) AND CONVERSIONS TO/FROM UTF-16 ........9
6 EXAMPLES ........................................................9
6.1 text/xml with UTF-8 Charset .................................10
6.2 text/xml with UTF-16 Charset ................................10
6.3 text/xml with ISO-2022-KR Charset ...........................10
6.4 text/xml with Omitted Charset ...............................11
6.5 application/xml with UTF-16 Charset .........................11
6.6 application/xml with ISO-2022-KR Charset ....................11
6.7 application/xml with Omitted Charset and UTF-16 XML Entity ..12
6.8 application/xml with Omitted Charset and UTF-8 Entity .......12
6.9 application/xml with Omitted Charset and Internal Encoding
Declaration.......................................................12



Whitehead & Murata Informational [Page 1]

RFC 2376 XML Media Types July 1998


7 REFERENCES .....................................................13
8 ACKNOWLEDGEMENTS ...............................................14
9 ADDRESSES OF AUTHORS ...........................................14
10 FULL COPYRIGHT STATEMENT ......................................15

1 Introduction

The World Wide Web Consortium (W3C) has issued a Recommendation
[REC-XML] which defines the Extensible Markup Language (XML), version
1. To enable the exchange of XML network entities, this document
proposes two new media types, text/xml and application/xml.

XML entities are currently exchanged on the World Wide Web, and XML
is also used for property values and parameter marshalling by the
WebDAV protocol for remote web authoring. Thus, there is a need for a
media type to properly label the exchange of XML network entities.
(Note that, as sometimes happens between two communities, both MIME
and XML have defined the term entity, with different meanings.)

Although XML is a subset of the Standard Generalized Markup Language
(SGML) [ISO-8897], and currently is assigned the media types
text/sgml and application/sgml, there are several reasons why use of
text/sgml or application/sgml to label XML is inappropriate. First,
there exist many applications which can process XML, but which cannot
process SGML, due to SGML's larger feature set. Second, SGML
applications cannot always process XML entities, because XML uses
features of recent technical corrigenda to SGML. Third, the
definition of text/sgml and application/sgml [RFC-1874] includes
parameters for SGML bit combination transformation format
(SGML-bctf), and SGML boot attribute (SGML-boot). Since XML does not
use these parameters, it would be ambiguous if such parameters were
given for an XML entity. For these reasons, the best approach for
labeling XML network entities is to provide new media types for XML.

Since XML is an integral part of the WebDAV Distributed Authoring
Protocol, and since World Wide Web Consortium Recommendations have
conventionally been assigned IETF tree media types, and since similar
media types (HTML, SGML) have been assigned IETF tree media types,
the XML media types also belong in the IETF media types tree.


2 Notational Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC-2119].

3 XML Media Types

This document introduces two new media types for XML entities,
text/xml and application/xml. Registration information for these
media types is described in the sections below.

Every XML entity is suitable for use with the application/xml media
type without modification. But this does not exploit the fact that
XML can be treated as plain text in many cases. MIME user agents
(and web user agents) that do not have explicit support for
application/xml will treat it as application/octet-stream, for
example, by offering to save it to a file.

To indicate that an XML entity should be treated as plain text by
default, use the text/xml media type. This restricts the encoding
used in the XML entity to those that are compatible with the
requirements for text media types as described in [RFC-2045] and
[RFC-2046], e.g., UTF-8, but not UTF-16 (except for HTTP).

XML provides a general framework for defining sequences of structured
data. In some cases, it may be desirable to define new media types
which use XML but define a specific application of XML, perhaps due
to domain-specific security considerations or runtime information.
This document does not prohibit future media types dedicated to such
XML applications. However, developers of such media types are
recommended to use this document as a basis. In particular, the
charset parameter should be used in the same manner.

Within the XML specification, XML entities can be classified into
four types. In the XML terminology, they are called "document
entities", "external DTD subsets", "external parsed entities", and
"external parameter entities". The media types text/xml and
application/xml can be used for any of these four types.

3.1 Text/xml Registration

MIME media type name: text

MIME subtype name: xml

Mandatory parameters: none

Optional parameters: charset

Although listed as an optional parameter, the use of the charset
parameter is STRONGLY RECOMMENDED, since this information can be
used by XML processors to determine authoritatively the character
encoding of the XML entity. The charset parameter can also be used
to provide protocol-specific operations, such as charset-based
content negotiation in HTTP. "UTF-8" [RFC-2279] is the
recommended value, representing the UTF-8 charset. UTF-8 is
supported by all conforming XML processors [REC-XML].

If the XML entity is transmitted via HTTP, which uses a MIME-like
mechanism that is exempt from the restrictions on the text
top-level type (see section 19.4.1 of HTTP 1.1 [RFC-2068]), "UTF-16"
(Appendix C.3 of [UNICODE] and Amendment 1 of [ISO-10646]) is also
recommended. UTF-16 is supported by all conforming XML processors
[REC-XML]. Since the handling of CR, LF and NUL for text types in
most MIME applications would cause undesired transformations of
individual octets in UTF-16 multi-octet characters, gateways from
HTTP to these MIME applications MUST transform the XML entity from
a text/xml; charset="utf-16" to application/xml; charset="utf-16".

Conformant with [RFC-2046], if a text/xml entity is received with
the charset parameter omitted, MIME processors and XML processors
MUST use the default charset value of "us-ascii". In cases where
the XML entity is transmitted via HTTP, the default charset value
is still "us-ascii".

Since the charset parameter is authoritative, the charset is not
always declared within an XML encoding declaration. Thus, special
care is needed when the recipient strips the MIME header and
provides persistent storage of the received XML entity (e.g., in a
file system). Unless the charset is UTF-8 or UTF-16, the recipient
SHOULD also persistently store information about the charset,
perhaps by embedding a correct XML encoding declaration within the
XML entity.
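
The default-charset rule for text/xml described above can be sketched
in Python. This is only an illustration, not part of the registration:
the function name text_xml_charset is hypothetical, and the header
parsing relies on the standard library's email.message.Message class.

```python
from email.message import Message


def text_xml_charset(content_type: str) -> str:
    """Return the charset that governs a text/xml entity.

    Hypothetical helper: the charset parameter wins when present,
    otherwise the text default of "us-ascii" applies (also over HTTP).
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/xml":
        raise ValueError("not a text/xml entity")
    # get_content_charset() returns the lower-cased charset parameter,
    # or the supplied fallback when the parameter is absent.
    return msg.get_content_charset("us-ascii")
```

For instance, text_xml_charset('text/xml') yields "us-ascii", whatever
encoding declaration the entity itself may carry.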

Encoding considerations:

This media type MAY be encoded as appropriate for the charset and
the capabilities of the underlying MIME transport. For 7-bit
transports, data in both UTF-8 and UTF-16 is encoded in
quoted-printable or base64. For 8-bit clean transport (e.g., ESMTP,
8BITMIME, or NNTP), UTF-8 is not encoded, but UTF-16 is base64
encoded. For binary clean transports (e.g., HTTP), no
content-transfer-encoding is necessary.

Security considerations:

See section 4 below.

Interoperability considerations:

XML has proven to be interoperable across WebDAV clients and
servers, and for import and export from multiple XML authoring
tools.

Published specification: see [REC-XML]

Applications which use this media type:

XML is device-, platform-, and vendor-neutral and is supported by
a wide range of Web user agents, WebDAV clients and servers, as
well as XML authoring tools.

Additional information:

Magic number(s): none

Although no byte sequences can be counted on to always be present,
XML entities in ASCII-compatible charsets (including UTF-8) often
begin with hexadecimal 3C 3F 78 6D 6C ("<?xml"). For more
information, see Appendix F of [REC-XML].

File extension(s): .xml, .
Macintosh File Type Code(s): "TEXT"

Person & email address for further information:

Dan Connolly <connolly@w3.org>
Murata Makoto (Family Given)

Intended usage:

Author/Change controller:

The XML specification is a work product of the World Wide Web
Consortium's XML Working Group, and was edited by:

Tim Bray
Jean Paoli (microsoft.com)
C. M. Sperberg-McQueen

The W3C, and the W3C XML working group, has change control over
the XML specification.


3.2 Application/xml Registration

MIME media type name: application

MIME subtype name: xml

Mandatory parameters: none

Optional parameters: charset

Although listed as an optional parameter, the use of the charset
parameter is STRONGLY RECOMMENDED, since this information can be
used by XML processors to determine authoritatively the charset of
the XML entity. The charset parameter can also be used to provide
protocol-specific operations, such as charset-based content
negotiation in HTTP.

"UTF-8" [RFC-2279] and "UTF-16" (Appendix C.3 of [UNICODE] and
Amendment 1 of [ISO-10646]) are the recommended values,
representing the UTF-8 and UTF-16 charsets, respectively. These
charsets are preferred since they are supported by all conforming
XML processors [REC-XML].

If an application/xml entity is received where the charset
parameter is omitted, no information is being provided about the
charset by the MIME Content-Type header. Conforming XML processors
MUST follow the requirements in section 4.3.3 of [REC-XML] which
directly address this contingency. However, MIME processors which
are not XML processors should not assume a default charset if the
charset parameter is omitted from an application/xml entity.

Since the charset parameter is authoritative, the charset is not
always declared within an XML encoding declaration. Thus, special
care is needed when the recipient strips the MIME header and
provides persistent storage of the received XML entity (e.g., in a
file system). Unless the charset is UTF-8 or UTF-16, the
recipient SHOULD also persistently store information about the
charset, perhaps by embedding a correct XML encoding declaration
within the XML entity.

Encoding considerations:

This media type MAY be encoded as appropriate for the charset and
the capabilities of the underlying MIME transport. For 7-bit
transports, data in both UTF-8 and UTF-16 is encoded in
quoted-printable or base64. For 8-bit clean transport (e.g., ESMTP,
8BITMIME, or NNTP), UTF-8 is not encoded, but UTF-16 is base64
encoded. For binary clean transport (e.g., HTTP), no
content-transfer-encoding is necessary.
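
The encoding considerations above amount to a small lookup by
transport class and charset. The Python sketch below restates them;
the function name and the transport labels "7bit", "8bit" and "binary"
are assumptions made for illustration, and only the two charsets named
in the paragraph are covered.

```python
from typing import Tuple


def allowed_transfer_encodings(charset: str, transport: str) -> Tuple[str, ...]:
    """Return the content-transfer-encodings named above.

    An empty tuple means no content-transfer-encoding is necessary.
    `transport` uses the hypothetical labels "7bit", "8bit", "binary".
    """
    charset = charset.lower()
    if charset not in ("utf-8", "utf-16"):
        raise ValueError("only UTF-8 and UTF-16 are covered by this sketch")
    if transport == "binary":
        # Binary clean transports such as HTTP need no encoding.
        return ()
    if transport == "8bit":
        # UTF-8 passes through unencoded; UTF-16 is base64 encoded.
        return () if charset == "utf-8" else ("base64",)
    if transport == "7bit":
        return ("quoted-printable", "base64")
    raise ValueError("unknown transport class: " + transport)
```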

Security considerations:

See section 4 below.

Interoperability considerations:

XML has proven to be interoperable for import and export from
multiple XML authoring tools.

Published specification: see [REC-XML]

Applications which use this media type:

XML is device-, platform-, and vendor-neutral and is supported by
a wide range of Web user agents and XML authoring tools.

Additional information:

Magic number(s): none

Although no byte sequences can be counted on to always be present,
XML entities in ASCII-compatible charsets (including UTF-8) often
begin with hexadecimal 3C 3F 78 6D 6C ("<?xml"), and those in
UTF-16 often begin with hexadecimal FE FF 00 3C 00 3F 00 78 00 6D
or FF FE 3C 00 3F 00 78 00 6D 00 (the Byte Order Mark (BOM)
followed by "<?xm"). For more information, see Annex F of
[REC-XML].

File extension(s): .xml, .
Macintosh File Type Code(s): "TEXT"

Person & email address for further information:

Dan Connolly <connolly@w3.org>
Murata Makoto (Family Given)

Intended usage:

Author/Change controller:

The XML specification is a work product of the World Wide Web
Consortium's XML Working Group, and was edited by:

Tim Bray
Jean Paoli (microsoft.com)
C. M. Sperberg-McQueen

The W3C, and the W3C XML working group, has change control over
the XML specification.

4 Security Considerations

XML, as a subset of SGML, has the same security considerations as
specified in [RFC-1874].

To paraphrase section 3 of [RFC-1874], XML entities contain
information to be parsed and processed by the recipient's XML system.
These entities may contain and such systems may permit explicit
system level commands to be executed while processing the data. To
the extent that an XML system will execute arbitrary command strings,
recipients of XML entities may be at risk. In general, it may be
possible to specify commands that perform unauthorized file
operations or make changes to the display processor's environment
that affect subsequent operations.

Use of XML is expected to be varied, and widespread. XML is under
scrutiny by a wide range of communities for use as a common syntax
for community-specific metadata. For example, the Dublin Core group
is using XML for document metadata, and a new effort has begun that
is considering use of XML for medical information. Other groups view
XML as a mechanism for marshalling parameters for remote procedure
calls. More uses of XML will undoubtedly arise.

Security considerations will vary by domain of use. For example, XML
medical records will have much more stringent privacy and security
considerations than XML library metadata. Similarly, use of XML as a
parameter marshalling syntax necessitates a case by case security
review.

XML may also have some of the same security concerns as plain text.
Like plain text, XML can contain escape sequences which, when
displayed, have the potential to change the display processor
environment in ways that adversely affect subsequent operations.
Possible effects include, but are not limited to, locking the
keyboard, changing display parameters so subsequent displayed text is
unreadable, or even changing display parameters to deliberately
obscure or distort subsequent displayed material so that its meaning
is lost or altered. Display processors should either filter such
material from displayed text or else make sure to reset all important
settings after a given display operation is complete.

Some terminal devices have keys whose output, when pressed, can be
changed by sending the display processor a character sequence. If
this is possible the display of a text object containing such
character sequences could reprogram keys to perform some illicit or
dangerous action when the key is subsequently pressed by the user.
In some cases not only can keys be programmed, they can be triggered
remotely, making it possible for a text display operation to directly
perform some unwanted action. As such, the ability to program keys
should be blocked either by filtering or by disabling the ability to
program keys entirely.

Note that it is also possible to construct XML documents which make
use of what XML terms "entity references" (using the XML meaning of
the term "entity", which differs from the MIME definition of this
term), to construct repeated expansions of text. Recursive expansions
are prohibited [REC-XML] and XML processors are required to detect
them. However, even non-recursive expansions may cause problems with
the finite computing resources of computers, if they are performed
many times.
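
To make the scale of the expansion concern concrete, the following
Python sketch computes how large a chain of non-recursive entity
definitions becomes once expanded, without actually building the
expanded text. The helper expanded_length, the dictionary model of
entity definitions, and the entity names e0 to e5 are all hypothetical;
a real XML processor does this work while parsing a DTD, and this
sketch does not handle predefined entities or character references.

```python
import re
from typing import Dict, Optional, Tuple

# Matches a general entity reference such as "&e3;".
_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9._-]*);")


def expanded_length(
    entities: Dict[str, str],
    name: str,
    _cache: Optional[Dict[str, int]] = None,
    _stack: Tuple[str, ...] = (),
) -> int:
    """Length of entity `name` after all nested references are expanded."""
    if name in _stack:
        # Recursive expansions are prohibited, so report them as errors.
        raise ValueError("recursive entity reference: " + name)
    cache = {} if _cache is None else _cache
    if name not in cache:
        text = entities[name]
        literal = len(_REF.sub("", text))
        nested = sum(
            expanded_length(entities, ref, cache, _stack + (name,))
            for ref in _REF.findall(text)
        )
        cache[name] = literal + nested
    return cache[name]


# Each entity refers ten times to the previous one.
entities = {"e0": "x" * 10}
for i in range(1, 6):
    entities["e%d" % i] = ("&e%d;" % (i - 1)) * 10

source_size = sum(len(value) for value in entities.values())
expanded_size = expanded_length(entities, "e5")
```

Here six short definitions (source_size characters in total) expand to
expanded_size characters; each extra level multiplies the result by
ten, which is why a processor may want to cap the expanded size.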

5 The Byte Order Mark (BOM) and Conversions to/from UTF-16

The XML Recommendation, in section 4.3.3, specifies that UTF-16 XML
entities must begin with a byte order mark (BOM), which is the ZERO
WIDTH NO-BREAK SPACE character, hexadecimal sequence 0xFEFF (or
0xFFFE, depending on endian). The XML Recommendation further states
that the BOM is an encoding signature, and is not part of either the
markup or the character data of the XML document.

Due to the BOM, applications which convert XML from the UTF-16
encoding to another encoding SHOULD strip the BOM before conversion.
Similarly, when converting from another encoding into UTF-16, the BOM
SHOULD be added after conversion is complete.
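
As one possible illustration of the two conversions, Python's built-in
"utf-16" codec already treats the BOM as a signature: it consumes the
BOM when decoding and writes one when encoding. The function names
below are hypothetical, and the sketch deliberately leaves any encoding
declaration inside the entity untouched, which a real converter would
also need to update.

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Convert a UTF-16 entity to UTF-8, dropping the BOM."""
    # The "utf-16" codec reads the BOM to pick the byte order and
    # does not include it in the decoded text.
    text = data.decode("utf-16")
    return text.encode("utf-8")


def utf8_to_utf16(data: bytes) -> bytes:
    """Convert a UTF-8 entity to UTF-16, adding a BOM."""
    # "utf-8-sig" tolerates an optional UTF-8 signature on input.
    text = data.decode("utf-8-sig")
    # The "utf-16" codec prepends a BOM in the platform's byte order.
    return text.encode("utf-16")


big_endian = codecs.BOM_UTF16_BE + "<a/>".encode("utf-16-be")
as_utf8 = utf16_to_utf8(big_endian)
as_utf16 = utf8_to_utf16(as_utf8)
```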

6 Examples

The examples below give the value of the Content-type MIME header and
the XML declaration (which includes the encoding declaration) inside
the XML entity. For UTF-16 examples, the Byte Order Mark character
is denoted as "{BOM}", and the XML declaration is assumed to come at
the beginning of the XML entity, immediately following the BOM. Note
that other MIME headers may be present, and the XML entity may
contain other data in addition to the XML declaration; the examples
focus on the Content-type header and the encoding declaration for
clarity.

6.1 text/xml with UTF-8 Charset

Content-type: text/xml; charset="utf-8"

<?xml version="1.0" encoding="utf-8"?>

This is the recommended charset value for use with text/xml. Since
the charset parameter is provided, MIME and XML processors must treat
the enclosed entity as UTF-8 encoded.

If sent using a 7-bit transport (e.g. SMTP), the XML entity must use
a content-transfer-encoding of either quoted-printable or base64.
For an 8-bit clean transport (e.g., ESMTP, 8BITMIME, or NNTP), or a
binary clean transport (e.g., HTTP) no content-transfer-encoding is
necessary.
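
A message of this shape can be produced with Python's standard email
package, shown here only as a sketch. The sample element name
"greeting" and its content are made up for the illustration; MIMEText
chooses a content-transfer-encoding on its own, so the result is
suitable for a 7-bit transport.

```python
from email.mime.text import MIMEText

# Hypothetical sample entity containing one non-ASCII character.
xml = '<?xml version="1.0" encoding="utf-8"?>\n<greeting>caf\u00e9</greeting>\n'

# Builds a text/xml part labelled with the utf-8 charset parameter.
msg = MIMEText(xml, _subtype="xml", _charset="utf-8")

media_type = msg.get_content_type()
charset = msg.get_content_charset()
transfer_encoding = msg["Content-Transfer-Encoding"]

# Decoding the payload recovers the original UTF-8 octets.
round_trip = msg.get_payload(decode=True).decode("utf-8")
```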

6.2 text/xml with UTF-16 Charset

Content-type: text/xml; charset="utf-16"

{BOM}<?xml version='1.0' encoding='utf-16'?>

This is possible only when the XML entity is transmitted via HTTP,
which uses a MIME-like mechanism and is a binary-clean protocol,
hence does not perform CR and LF transformations and allows NUL
octets. This differs from typical text MIME type processing (see
section 19.4.1 of HTTP 1.1 [RFC-2068] for details).

Since HTTP is binary clean, no content-transfer-encoding is
necessary.

6.3 text/xml with ISO-2022-KR Charset

Content-type: text/xml; charset="iso-2022-kr"

<?xml version='1.0' encoding='iso-2022-kr'?>

This example shows text/xml with a Korean charset (e.g., Hangul)
encoded following the specification in [RFC-1557]. Since the charset
parameter is provided, MIME and XML processors must treat the
enclosed entity as encoded per [RFC-1557].

Since ISO-2022-KR has been defined to use only 7 bits of data, no
content-transfer-encoding is necessary with any transport.

6.4 text/xml with Omitted Charset

Content-type: text/xml

{BOM}<?xml version="1.0" encoding="utf-16"?>

This example shows text/xml with the charset parameter omitted. In
this case, MIME and XML processors must assume the charset is
"us-ascii", the default charset value for text media types specified
in [RFC-2046]. The default of "us-ascii" holds even if the text/xml
entity is transported using HTTP.

Omitting the charset parameter is NOT RECOMMENDED for text/xml. For
example, even if the contents of the XML entity are UTF-16 or UTF-8,
or the XML entity has an explicit encoding declaration, XML and MIME
processors must assume the charset is "us-ascii".

6.5 application/xml with UTF-16 Charset

Content-type: application/xml; charset="utf-16"

{BOM}<?xml version="1.0"?>

This is a recommended charset value for use with application/xml.
Since the charset parameter is provided, MIME and XML processors must
treat the enclosed entity as UTF-16 encoded.

If sent using a 7-bit transport (e.g., SMTP) or an 8-bit clean
transport (e.g., ESMTP, 8BITMIME, or NNTP), the XML entity must be
encoded in quoted-printable or base64. For a binary clean transport
(e.g., HTTP), no content-transfer-encoding is necessary.

6.6 application/xml with ISO-2022-KR Charset

Content-type: application/xml; charset="iso-2022-kr"

<?xml version="1.0" encoding="iso-2022-kr"?>

This example shows application/xml with a Korean charset (e.g.,
Hangul) encoded following the specification in [RFC-1557]. Since the
charset parameter is provided, MIME and XML processors must treat the
enclosed entity as encoded per [RFC-1557], independent of whether the
XML entity has an internal encoding declaration (this example does
show such a declaration, which agrees with the charset parameter).

Since ISO-2022-KR has been defined to use only 7 bits of data, no
content-transfer-encoding is necessary with any transport.

6.7 application/xml with Omitted Charset and UTF-16 XML Entity

Content-type: application/xml

{BOM}<?xml version="1.0"?>

For this example, the XML entity begins with a BOM. Since the
charset has been omitted, a conforming XML processor follows the
requirements of [REC-XML], section 4.3.3. Specifically, the XML
processor reads the BOM, and thus knows deterministically that the
charset encoding is UTF-16.

An XML-unaware MIME processor should make no assumptions about the
charset of the XML entity.

6.8 application/xml with Omitted Charset and UTF-8 Entity

Content-type: application/xml

<?xml version="1.0"?>

In this example, the charset parameter has been omitted, and there is
no BOM. Since there is no BOM, the XML processor follows the
requirements in section 4.3.3, and optionally applies the mechanism
described in appendix F (which is non-normative) of [REC-XML] to
determine the charset encoding of UTF-8. The XML entity does not
contain an encoding declaration, but since the encoding is UTF-8,
this is still a conforming XML entity.

An XML-unaware MIME processor should make no assumptions about the
charset of the XML entity.

6.9 application/xml with Omitted Charset and Internal Encoding
Declaration

Content-type: application/xml

<?xml version="1.0" encoding="ISO-10646-UCS-4"?>

In this example, the charset parameter has been omitted, and there is
no BOM. However, the XML entity does have an encoding declaration
inside the XML entity which specifies the entity's charset. Following
the requirements in section 4.3.3, and optionally applying the
mechanism described in appendix F (non-normative) of [REC-XML], the
XML processor determines the charset encoding of the XML entity (in
this example, UCS-4).

An XML-unaware MIME processor should make no assumptions about the
charset of the XML entity.
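
The decision an XML processor makes in the omitted-charset examples
can be sketched in Python as follows. This is a heavily simplified
stand-in for section 4.3.3 and appendix F of [REC-XML], not a faithful
implementation: the function name is hypothetical, and the sketch only
recognizes a UTF-16 BOM or an encoding declaration written in an
ASCII-compatible charset, so it would not detect a genuine UCS-4 byte
stream.

```python
import re

# An XML declaration at the very start that carries an encoding name.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the charset of an application/xml entity sent without one."""
    if data[:2] in (b"\xfe\xff", b"\xff\xfe"):
        # A byte order mark identifies the entity as UTF-16.
        return "utf-16"
    match = _DECL.match(data)
    if match:
        # The internal encoding declaration names the charset.
        return match.group(1).decode("ascii").lower()
    # No BOM and no encoding declaration: UTF-8.
    return "utf-8"
```

An XML-unaware MIME processor would not run such a check at all, since
it should make no assumption about the charset.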

7 References

[ISO-10646] ISO/IEC, Information Technology - Universal Multiple-
Octet Coded Character Set (UCS) - Part 1: Architecture
and Basic Multilingual Plane, May 1993.

[ISO-8897] ISO (International Organization for Standardization) ISO
8879:1986(E) Information Processing -- Text and Office
Systems -- Standard Generalized Markup Language (SGML).
First edition -- 1986-10-15.

[REC-XML] T. Bray, J. Paoli, C. M. Sperberg-McQueen, "Extensible
Markup Language (XML)" World Wide Web Consortium
Recommendation REC-xml-19980210.
http://www.w3.org/TR/1998/REC-xml-19980210.

[RFC-1557] Choi, U., Chon, K., and H. Park. "Korean Character
Encoding for Internet Messages", RFC 1557. December
1993.

[RFC-1874] Levinson, E., "SGML Media Types", RFC 1874. December
1995.

[RFC-2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC-2045] Freed, N., and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, November 1996.

[RFC-2046] Freed, N., and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046,
November 1996.

[RFC-2068] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., and T.
Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1",
RFC 2068, January 1997.

[RFC-2279] Yergeau, F., "UTF-8, a transformation format of ISO
10646", RFC 2279, January 1998.

[UNICODE] The Unicode Consortium, "The Unicode Standard -- Version
2.0", Addison-Wesley, 1996.


8 Acknowledgements

Chris Newman and Yaron Y. Goland both contributed content to the
security considerations section of this document. In particular,
some text in the security considerations section is copied verbatim
from work in progress, draft-newman-mime-textpara-00, by permission
of the author. Chris Newman additionally contributed content to the
encoding considerations sections. Dan Connolly contributed content
discussing when to use text/xml. Discussions with Ned Freed and Dan
Connolly helped refine the author's understanding of the text media
type; feedback from Larry Masinter was also very helpful in
understanding media type registration issues.

Members of the W3C XML Working Group and XML Special Interest group
have made significant contributions to this document, and the authors
would like to specially recognize James Clark, Martin Duerst, Rick
Jelliffe, Gavin Nicol for their many thoughtful comments.

9 Addresses of Authors

E. James Whitehead, Jr.
Dept. of Information and Computer Science
University of California,
Irvine, CA 92697-3425

EMail: ejw@ics.uci.

Murata Makoto (Family Given)
Fuji Xerox Information Systems
KSP 9A7, 2-1, Sakado 3-chome, Takatsu-ku
Kawasaki-shi, Kanagawa-ken
213

EMail: murata@fxis.fujixerox.co.


10 Full Copyright Statement

Copyright (C) The Internet Society (1998). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.























