Network Working Group D. Goldsmith
Request for Comments: 1641 M. Davis
Category: Experimental Taligent, Inc.
July 1994


Using Unicode with MIME

Status of this Memo

This memo defines an Experimental Protocol for the Internet
community. This memo does not specify an Internet standard of any
kind. Distribution of this memo is unlimited.



Abstract

The Unicode Standard, version 1.1, and ISO/IEC 10646-1:1993(E)
jointly define a 16 bit character set (hereafter referred to as
Unicode) which encompasses most of the world's writing systems.
However, Internet mail (STD 11, RFC 822) currently supports only
7-bit US ASCII as a character set. MIME (RFC 1521 and RFC 1522) extends
Internet mail to support different media types and character sets,
and thus could support Unicode in mail messages. MIME neither defines
Unicode as a permitted character set nor specifies how it would be
encoded, although it does provide for the registration of additional
character sets over time.

This document specifies the usage of Unicode within MIME.



Motivation

Since Unicode is starting to see widespread commercial adoption,
users will want a way to transmit information in this character set
in mail messages and other Internet media. Since MIME was expressly
designed to allow such extensions and is on the standards track for
the Internet, it is the most appropriate means for encoding Unicode.
RFC 1521 and RFC 1522 do not define Unicode as an allowed character
set, but allow registration of additional character sets.

In addition to allowing use of Unicode within MIME bodies, another
goal is to specify a way of using Unicode that allows text which
consists largely, but not entirely, of US-ASCII characters to be
represented in a way that can be read by mail clients that do not
understand Unicode. This is in keeping with the philosophy of MIME.
Such an encoding is described in another document, "UTF-7: A Mail
Safe Transformation Format of Unicode" [UTF-7].





Goldsmith & Davis [Page 1]

RFC 1641 Using Unicode with MIME July 1994




Overview

Several ways of using Unicode are possible. This document specifies
both guidelines for use of Unicode within MIME, and a specific usage.
The usage specified in this document is a straightforward use of
Unicode as specified in "The Unicode Standard, Version 1.1".

This encoding is intended for situations where sender and receiver
do not want to do a lot of processing, when the text does not consist
primarily of characters from the US-ASCII character set, or when
sender and receiver are known in advance to support Unicode.

Another encoding is intended for situations where the text consists
primarily of US-ASCII, with occasional characters from other parts of
Unicode. This encoding allows the US-ASCII portion to be read by all
recipients without having to support Unicode. This encoding is
specified in another document, "UTF-7: A Mail Safe Transformation
Format of Unicode" [UTF-7].

Finally, in keeping with the principles set forth in RFC 1521, text
which can be represented using the US-ASCII or ISO-8859-x character
sets should be so represented where possible, for maximum
interoperability.
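
The preference order in the last paragraph can be sketched in Python. This is only an illustration under stated assumptions: the function name `choose_charset` is hypothetical, the sketch ignores the UTF-7 option, and it relies on Python's built-in `iso8859_1` through `iso8859_9` codecs, whose tables may reflect later revisions of ISO 8859 than the editions cited in this memo.

```python
def choose_charset(text: str) -> str:
    """Pick the most interoperable MIME charset label for the given text."""
    # Try US-ASCII first, then ISO-8859-1 through ISO-8859-9, in that order.
    candidates = [("US-ASCII", "ascii")] + [
        (f"ISO-8859-{part}", f"iso8859_{part}") for part in range(1, 10)
    ]
    for mime_name, codec in candidates:
        try:
            text.encode(codec)
        except UnicodeEncodeError:
            continue  # some character is not representable; try the next set
        return mime_name
    # Fall back to 16 bit Unicode, which cannot carry codes above U+FFFF.
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("text contains characters outside the 16 bit range")
    return "UNICODE-1-1"
```

For instance, `choose_charset("hello")` yields `US-ASCII`, while text containing Han characters falls through to `UNICODE-1-1`.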



Definitions

The definition of character set Unicode:

The 16 bit character set Unicode is defined by "The Unicode
Standard, Version 1.1". This character set is identical with the
character repertoire and coding of the international standard
ISO/IEC 10646-1:1993(E); Coded Representation Form=UCS-2;
Subset=300; Implementation Level=3.

Note. Unicode 1.1 further specifies the use and interaction of
these character codes beyond the ISO standard. However, any valid
10646 BMP (Basic Multilingual Plane) sequence is a valid Unicode
sequence, and vice versa; Unicode supplies interpretations of
sequences on which the ISO standard is silent as to
interpretation.

This character set is encoded as sequences of octets, two per
16-bit character, with the most significant octet first. Text with an
odd number of octets is ill-formed.

Rationale. ISO/IEC 10646-1:1993(E) specifies that when characters
in the UCS-2 form are serialized as octets, the most
significant octet appears first. This is also in keeping with the
common network practice of choosing a canonical format for
transmission.
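
The octet layout described above can be sketched in Python as follows. This is a minimal illustration, not part of the specification: the function names `encode_unicode_1_1` and `decode_unicode_1_1` are hypothetical, and the sketch assumes every character fits in a single 16 bit code.

```python
import struct


def encode_unicode_1_1(text: str) -> bytes:
    """Serialize text as two octets per character, most significant first."""
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if code > 0xFFFF:
            raise ValueError(f"U+{code:04X} does not fit in 16 bits")
        out += struct.pack(">H", code)  # ">H" is a big-endian unsigned 16 bit value
    return bytes(out)


def decode_unicode_1_1(octets: bytes) -> str:
    """Rebuild text from big-endian octet pairs, rejecting ill-formed input."""
    if len(octets) % 2 != 0:
        raise ValueError("ill-formed text: odd number of octets")
    count = len(octets) // 2
    codes = struct.unpack(f">{count}H", octets)
    return "".join(chr(code) for code in codes)
```

The odd-length check mirrors the rule that text with an odd number of octets is ill-formed; what a receiver should do with such text is not stated in this memo, so raising an error is an assumption of the sketch.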

General Specification of Unicode Character Sets Within MIME

The Unicode Standard is currently at version 1.1. Although new
versions should be compatible with old implementations if the
implementation is compliant with the standard, some implementations
may choose to check the version of the character set that is being
used. In order to allow some implementations to check the version
number and allow others to ignore it, all registrations of Unicode
variants and versions for MIME usage should have MIME charset names
which conform to one of the two following patterns:

UNICODE-major-minor
UNICODE-major-minor-variant

Here major and minor are strings of decimal digits (0 through 9)
specifying the major and minor version number of the Unicode standard
to which the text in question conforms. In the interests of
interoperability, the lowest version number compatible with the text
should be used. The lowest acceptable version number is UNICODE-1-1,
corresponding to "The Unicode Standard, Version 1.1". The optional
trailing string "variant" describes the particular transformation
format of Unicode that the registration describes; its content is up
to the particular registration. If there is no trailing variant
string, the charset name refers to the basic two octet form of
Unicode as described in "The Unicode Standard".

Example. A hypothetical charset which referred to the UTF-8
transformation format of Unicode/10646 (also known as UTF-2 or UTF
FSS) might be named UNICODE-1-1-UTF-8.
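
A receiver that wants to inspect the version number, or ignore it, could split a charset name along the two patterns above. The following Python sketch shows one way to do that; the names `parse_unicode_charset` and `UnicodeCharset` are hypothetical, and matching the name case-insensitively is an assumption based on MIME's general treatment of charset names.

```python
import re
from typing import NamedTuple, Optional

# UNICODE-major-minor, optionally followed by -variant.
_CHARSET_RE = re.compile(r"UNICODE-([0-9]+)-([0-9]+)(?:-(.+))?", re.IGNORECASE)


class UnicodeCharset(NamedTuple):
    major: int
    minor: int
    variant: Optional[str]  # None means the basic two octet form


def parse_unicode_charset(name: str) -> Optional[UnicodeCharset]:
    """Return the parsed charset, or None if the name does not conform."""
    match = _CHARSET_RE.fullmatch(name)
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    if (major, minor) < (1, 1):
        return None  # UNICODE-1-1 is the lowest acceptable version
    return UnicodeCharset(major, minor, match.group(3))
```

An implementation that ignores versions would only look at the `variant` field; one that checks versions would compare `(major, minor)` against what it supports.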

Encoding Character Set Unicode Within MIME

Character set Unicode uses 16 bit characters, and therefore would
normally be used with the Binary or Base64 content transfer encodings
of MIME. In header fields, it would normally be used with the B
content transfer encoding. The MIME character set identifier is
UNICODE-1-1.
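
For header fields, RFC 1522 defines the encoded-word form `=?charset?encoding?encoded-text?=`, where the B encoding is base64. A minimal Python sketch of building such a word for this charset follows. The function name `encoded_word` is hypothetical, and the sketch does not split long text to respect the 75 character limit that RFC 1522 places on a single encoded-word.

```python
import base64


def encoded_word(text: str) -> str:
    """Build one RFC 1522 encoded-word using charset UNICODE-1-1 and B encoding."""
    # Two octets per character, most significant octet first.
    # to_bytes raises OverflowError for codes above U+FFFF.
    octets = b"".join(ord(ch).to_bytes(2, "big") for ch in text)
    payload = base64.b64encode(octets).decode("ascii")
    return f"=?UNICODE-1-1?B?{payload}?="
```

For the three characters 65E5, 672C, 8A9E this produces `=?UNICODE-1-1?B?ZeVnLIqe?=`.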

Example. Here is a text portion of a MIME message containing the
Japanese word "nihongo" (hexadecimal 65E5,672C,8A9E) written in
Han characters.

Content-Type: text/plain; charset=UNICODE-1-1
Content-Transfer-Encoding: base64

ZeVnLIqe

Example. Here is a text portion of a MIME message containing the
Unicode sequence "A<NOT IDENTICAL TO><ALPHA>." (hexadecimal
0041,2262,0391,002E):

Content-Type: text/plain; charset=UNICODE-1-1
Content-Transfer-Encoding: base64

AEEiYgORAC4=
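
The base64 bodies in these examples can be reproduced from the hexadecimal codes. The Python sketch below is illustrative only: the function name `unicode_1_1_part` is hypothetical, it uses bare newlines rather than the CRLF line endings of a real message, and it does not wrap long base64 output into lines of at most 76 characters as MIME requires.

```python
import base64


def unicode_1_1_part(code_points):
    """Render a text/plain MIME part for a sequence of 16 bit codes."""
    # Two octets per code, most significant octet first.
    octets = b"".join(cp.to_bytes(2, "big") for cp in code_points)
    body = base64.b64encode(octets).decode("ascii")
    return (
        "Content-Type: text/plain; charset=UNICODE-1-1\n"
        "Content-Transfer-Encoding: base64\n"
        "\n"
        f"{body}\n"
    )


print(unicode_1_1_part([0x0041, 0x2262, 0x0391, 0x002E]))
```

The printed part ends with the body line `AEEiYgORAC4=`, matching the example above. Calling the function with `[0x65E5, 0x672C, 0x8A9E]` gives the body `ZeVnLIqe`.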



Acknowledgements

Many thanks to the following people for their contributions,
comments, and suggestions. If we have omitted anyone it was through
oversight and not intentionally.

Glenn
Harald T.
Nathaniel
Lee
Jim
Dave
Steve
Dana S.
Ned
John H.
John C.
Valdis
Keith
Masataka
Einar

Security Considerations

Security issues are not discussed in this memo.



References

[UNICODE 1.1] "The Unicode Standard, Version 1.1": Version 1.0, Volume 1
(ISBN 0-201-56788-1), Version 1.0, Volume 2 (ISBN
0-201-60845-6), and "Unicode Technical Report #4, The Unicode
Standard, Version 1.1" (available from The Unicode
Consortium, and soon to be published by Addison-Wesley).

[ISO 10646] ISO/IEC 10646-1:1993(E) Information Technology--Universal
Multiple-octet Coded Character Set (UCS).

[UTF-7] Goldsmith, D., and M. Davis, "UTF-7: A Mail Safe
Transformation Format of Unicode", RFC 1642, Taligent,
Inc., July 1994.

[US-ASCII] Coded Character Set--7-bit American Standard Code for
Information Interchange, ANSI X3.4-1986.

[ISO-8859] Information Processing -- 8-bit Single-Byte Coded
Character Sets -- Part 1: Latin Alphabet No. 1, ISO
8859-1:1987. Part 2: Latin alphabet No. 2, ISO 8859-2, 1987.
Part 3: Latin alphabet No. 3, ISO 8859-3, 1988. Part 4:
Latin alphabet No. 4, ISO 8859-4, 1988. Part 5:
Latin/Cyrillic alphabet, ISO 8859-5, 1988. Part 6:
Latin/Arabic alphabet, ISO 8859-6, 1987. Part 7:
Latin/Greek alphabet, ISO 8859-7, 1987. Part 8:
Latin/Hebrew alphabet, ISO 8859-8, 1988. Part 9: Latin
alphabet No. 5, ISO 8859-9, 1990.

[RFC822] Crocker, D., "Standard for the Format of ARPA Internet
Text Messages", STD 11, RFC 822, UDEL, August 1982.

[RFC-1521] Borenstein N., and N. Freed, "MIME (Multipurpose Internet
Mail Extensions) Part One: Mechanisms for Specifying and
Describing the Format of Internet Message Bodies", RFC
1521, Bellcore, Innosoft, September 1993.

[RFC-1522] Moore, K., "Representation of Non-Ascii Text in Internet
Message Headers", RFC 1522, University of Tennessee,
September 1993.

[UTF-8] X/Open Company Ltd., "File System Safe UCS Transformation
Format (FSS_UTF)", X/Open Preliminary Specification,
Document Number: P316. This information also appears in
Unicode Technical Report #4, and in a forthcoming annex to
ISO/IEC 10646.

Authors' Addresses

David Goldsmith
Taligent, Inc.
10201 N. DeAnza Blvd.
Cupertino, CA 95014-2233

Phone: 408-777-5225
Fax: 408-777-5081
EMail: david_goldsmith@taligent.


Mark Davis
Taligent, Inc.
10201 N. DeAnza Blvd.
Cupertino, CA 95014-2233

Phone: 408-777-5116
Fax: 408-777-5081
EMail: mark_davis@taligent.