Network Working Group                                           J. Murai
Request for Comments: 1468                               Keio University
                                                              M. Crispin
                                                       Panda Programming
                                                         E. van der Poel
                                                               June 1993


Japanese Character Encoding for Internet Messages

Status of this Memo

This memo provides information for the Internet community. It does
not specify an Internet standard. Distribution of this memo is
unlimited.

Introduction

This document describes the encoding used in electronic mail [RFC822]
and network news [RFC1036] messages in several Japanese networks. It
was first specified by and used in JUNET [JUNET]. The encoding is now
also widely used in Japanese IP communities.

The name given to this encoding is "ISO-2022-JP", which is intended
to be used in the "charset" parameter field of MIME headers (see
[MIME1] and [MIME2]).

Description

The text starts in ASCII [ASCII], and switches to Japanese characters
through an escape sequence. For example, the escape sequence ESC $ B
(three bytes, hexadecimal values: 1B 24 42) indicates that the bytes
following this escape sequence are Japanese characters, which are
encoded in two bytes each. To switch back to ASCII, the escape
sequence ESC ( B is used.
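
As a rough illustration of the switching described above, the
following Python sketch encodes a short mixed string and locates the
two escape sequences in the result. It assumes Python's standard
"iso2022_jp" codec, which is not part of this memo, and the sample
string is arbitrary.

```python
# Sketch only: relies on Python's built-in "iso2022_jp" codec, which is an
# assumption of this example and not something the memo defines.
ESC = b"\x1b"

text = "Hi \u65e5\u672c"            # "Hi " followed by two Kanji
encoded = text.encode("iso2022_jp")

# The stream starts in ASCII, switches with ESC $ B, and returns with ESC ( B.
to_kanji = encoded.index(ESC + b"$B")
to_ascii = encoded.index(ESC + b"(B")
payload_length = to_ascii - (to_kanji + 3)   # bytes between the two escapes

print(encoded.hex(" "))
print("ASCII prefix:", encoded[:to_kanji])
print("two-byte payload length:", payload_length)
```

Two Kanji occupy four payload bytes, and the escape sequences
themselves add three bytes each.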

The following table gives the escape sequences and the character sets
used in ISO-2022-JP messages. The ISOREG number is the registration
number in ISO's registry [ISOREG].

Esc Seq    Character Set                  ISOREG

ESC ( B    ASCII                             6
ESC ( J    JIS X 0201-1976 ("Roman" set)    14
ESC $ @    JIS X 0208-1978                  42
ESC $ B    JIS X 0208-1983                  87
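
The table can be carried into code as a small lookup. The sketch below
is one possible transcription in Python; the names ESCAPES and
list_escapes are hypothetical helpers introduced only for this
example.

```python
# Escape sequences permitted in ISO-2022-JP, transcribed from the table.
# Each entry maps the three-byte sequence to (character set, ISOREG number).
ESCAPES = {
    b"\x1b(B": ("ASCII", 6),
    b"\x1b(J": ('JIS X 0201-1976 ("Roman" set)', 14),
    b"\x1b$@": ("JIS X 0208-1978", 42),
    b"\x1b$B": ("JIS X 0208-1983", 87),
}


def list_escapes(data: bytes):
    """Return (offset, character set name) for each ESC found in data."""
    found = []
    i = data.find(b"\x1b")
    while i != -1:
        name = ESCAPES.get(data[i:i + 3], ("unknown", None))[0]
        found.append((i, name))
        i = data.find(b"\x1b", i + 1)
    return found


print(list_escapes(b"a\x1b$BF|\x1b(Bz"))
```

Searching for the ESC byte directly is safe here because two-byte
characters only use values 21 to 7E hexadecimal, so ESC never appears
inside one.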

Note that JIS X 0208 was called JIS C 6226 until the name was changed
on March 1st, 1987. Likewise, JIS C 6220 was renamed JIS X 0201.

Murai, Crispin & van der Poel [Page 1]

RFC 1468 Japanese Character Encoding for Internet Messages June 1993

The "Roman" character set of JIS X 0201 [JISX0201] is identical to
ASCII except for backslash (\) and tilde (~). The backslash is
replaced by the Yen sign, and the tilde is replaced by overline. This
set is Japan's national variant of ISO 646 [ISO646].
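
The two differences can be shown with a very small mapping. In the
Python sketch below, the helper name roman_to_text and the choice of
Unicode code points (U+00A5 for the Yen sign, U+203E for overline) are
assumptions made for illustration.

```python
# The two positions where the JIS X 0201 "Roman" set differs from ASCII.
# The Unicode targets are an assumption of this sketch.
ROMAN_DIFFERENCES = {
    0x5C: "\u00a5",  # backslash position -> YEN SIGN
    0x7E: "\u203e",  # tilde position -> OVERLINE
}


def roman_to_text(data: bytes) -> str:
    """Interpret 7-bit bytes as JIS X 0201 Roman characters."""
    return "".join(ROMAN_DIFFERENCES.get(b, chr(b)) for b in data)


print(roman_to_text(b"price: \\100 ~"))
```

Every other byte value is passed through unchanged, which reflects the
statement that the set is otherwise identical to ASCII.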

The JIS X 0208 [JISX0208] character sets consist of Kanji, Hiragana,
Katakana and some other symbols and characters. Each character takes
up two bytes.

For further details about the JIS Japanese national character set
standards, refer to [JISX0201] and [JISX0208]. For further
information about the escape sequences, see [ISO2022] and [ISOREG].

If there are JIS X 0208 characters on a line, there must be a switch
to ASCII or to the "Roman" set of JIS X 0201 before the end of the
line (i.e., before the CRLF). This means that the next line starts in
the character set that was switched to before the end of the previous
line.

Also, the text must end in ASCII.

Other restrictions are given in the Formal Syntax section below.
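
The two rules above (switch out of JIS X 0208 before each CRLF, and
end the text in ASCII) lend themselves to a simple state check. The
following Python sketch is one way to do it; check_line_rules is a
hypothetical helper, and it does not attempt to enforce the rest of
the formal syntax.

```python
SINGLE = {b"\x1b(B": "ASCII", b"\x1b(J": "Roman"}
DOUBLE = {b"\x1b$@", b"\x1b$B"}


def check_line_rules(body: bytes):
    """Return a list of problems; an empty list means both rules hold."""
    problems = []
    state = "ASCII"                       # the text starts in ASCII
    for number, line in enumerate(body.split(b"\r\n"), start=1):
        i = 0
        while i < len(line):
            seq = line[i:i + 3]
            if seq in SINGLE:
                state = SINGLE[seq]
                i += 3
            elif seq in DOUBLE:
                state = "JIS X 0208"
                i += 3
            else:
                # Skip one character: two bytes in JIS X 0208, else one.
                i += 2 if state == "JIS X 0208" else 1
        if state == "JIS X 0208":
            problems.append(f"line {number}: no switch back before end of line")
            state = "ASCII"               # resynchronise for later lines
    if state != "ASCII":
        problems.append("text does not end in ASCII")
    return problems


print(check_line_rules(b"Hi \x1b$BF|\x1b(B\r\n"))
print(check_line_rules(b"\x1b$BF|\r\n"))
```

A line may legitimately end in the "Roman" set, so only the final
state of the whole text is required to be ASCII.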

Formal Syntax

The notational conventions used here are identical to those used in
RFC 822 [RFC822].

The * (asterisk) convention is as follows:

l*m something

meaning at least l and at most m somethings, with l and m taking
default values of 0 and infinity, respectively.


message             = headers 1*( CRLF *single-byte-char *segment
                      single-byte-seq *single-byte-char )
                                        ; see also [MIME1] "body-part"
                                        ; note: must end in ASCII

headers             = <see [RFC822] "fields" and [MIME1] "body-part">

segment             = single-byte-segment / double-byte-segment

single-byte-segment = single-byte-seq 1*single-byte-char

double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )

single-byte-seq     = ESC "(" ( "B" / "J" )

double-byte-seq     = ESC "$" ( "@" / "B" )

CRLF                = CR LF

                                                 ; ( Octal, Decimal.)

ESC                 = <ISO 2022 ESC, escape>     ; (    33,      27.)

SI                  = <ISO 2022 SI, shift-in>    ; (    17,      15.)

SO                  = <ISO 2022 SO, shift-out>   ; (    16,      14.)

CR                  = <ASCII CR, carriage return>; (    15,      13.)

LF                  = <ASCII LF, linefeed>       ; (    12,      10.)

one-of-94           = <any one of 94 values>     ; (41-176, 33.-126.)

7BIT                = <any 7-bit value>          ; ( 0-177,  0.-127.)

single-byte-char    = <any 7BIT, including bare CR & bare LF, but NOT
                       including CRLF, and not including ESC, SI, SO>
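
The segment rules can be approximated with regular expressions. The
Python sketch below splits one line (without its CRLF) into segments
and rejects byte sequences the grammar does not allow. The function
name split_segments and the regular expressions are an illustrative
transcription, not part of the specification, and the sketch does not
insist that every line contain a single-byte-seq.

```python
import re

# single-byte-char: any 7-bit value except ESC, SI, SO and the pair CRLF.
SB_CHAR = rb"(?:\r(?!\n)|[\x00-\x0c\x10-\x1a\x1c-\x7f])"
LEADING = re.compile(SB_CHAR + rb"*")
# single-byte-seq followed by single-byte chars (emptiness checked below).
SINGLE_SEGMENT = re.compile(rb"(\x1b\([BJ])(" + SB_CHAR + rb"*)")
# double-byte-seq followed by one or more pairs of one-of-94 (0x21-0x7E).
DOUBLE_SEGMENT = re.compile(rb"(\x1b\$[@B])((?:[\x21-\x7e]{2})+)")


def split_segments(line: bytes):
    """Split one line into (escape sequence, payload) pairs."""
    lead = LEADING.match(line)
    parts = [(b"", lead.group())] if lead.group() else []
    pos = lead.end()
    while pos < len(line):
        m = DOUBLE_SEGMENT.match(line, pos) or SINGLE_SEGMENT.match(line, pos)
        if m is None:
            raise ValueError(f"grammar violation at byte {pos}")
        if not m.group(2) and m.end() < len(line):
            # Only the last single-byte-seq of a line may have no characters.
            raise ValueError(f"empty segment at byte {pos}")
        parts.append((m.group(1), m.group(2)))
        pos = m.end()
    return parts


print(split_segments(b"Hi \x1b$BF|K\\\x1b(B"))
```

An odd number of bytes after ESC $ B, or an SI or SO byte anywhere,
causes ValueError to be raised.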

MIME Considerations

The name given to the JUNET character encoding is "ISO-2022-JP". This
name is intended to be used in MIME messages as follows:

Content-Type: text/plain; charset=iso-2022-jp

The ISO-2022-JP encoding is already in 7-bit form, so it is not
necessary to use a Content-Transfer-Encoding header. It should be
noted that applying the Base64 or Quoted-Printable encoding will
render the message unreadable in current JUNET software.
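
For readers using a modern mail library, the following Python sketch
builds such a message. It assumes the standard library "email" package
(not something this memo refers to); that package does add a
Content-Transfer-Encoding header, but with the value 7bit, so the body
itself is left untransformed.

```python
from email.mime.text import MIMEText

# Sketch using Python's standard email package; an assumption of this
# example, since the memo does not prescribe any implementation.
msg = MIMEText("Hi \u65e5\u672c\n", "plain", "iso-2022-jp")

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])

# The body keeps its escape sequences; no Base64 or Quoted-Printable is used.
body = msg.get_payload()
print("\x1b$B" in body)
```

Because the payload stays in its 7-bit escape-sequence form, software
that only understands the raw JUNET encoding can still read the body.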

ISO-2022-JP may also be used in MIME Part 2 headers. The "B"
encoding should be used with ISO-2022-JP text.
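
A header value in the "B" encoding can be produced and read back as
sketched below. The example assumes Python's standard email.header
module, and the three-Kanji sample text is arbitrary.

```python
from email.header import Header, decode_header

# Sketch: encode a header value as an ISO-2022-JP "B" encoded-word using
# Python's standard library (an assumption of this example).
subject = Header("\u65e5\u672c\u8a9e", "iso-2022-jp").encode()
print(subject)

# Decoding the encoded-word recovers the raw ISO-2022-JP bytes.
raw, charset = decode_header(subject)[0]
print(raw, charset)
```

The Base64 step applies only to the header; it wraps the same escape
sequences that would appear unencoded in a message body.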

Background

The JUNET encoding was described in the JUNET User's Guide [JUNET]
(JUNET Riyou No Tebiki Dai Ippan).

The encoding is based on the particular usage of ISO 2022 announced
by 4/1 (see [ISO2022] for details). However, the escape sequence
normally used for this announcement is not included in ISO-2022-JP
messages.

The Kana set of JIS X 0201 is not used in ISO-2022-JP messages.

In the past, some systems erroneously used the escape sequence ESC (
H in JUNET messages. This escape sequence is officially registered
for a Swedish character set [ISOREG], and should not be used in
ISO-2022-JP messages.

Some systems do not distinguish between ESC ( B and ESC ( J or
between ESC $ @ and ESC $ B for display. However, when relaying a
message to another system, the escape sequences must not be altered
in any way.
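
One practical consequence is that a relay should copy the octets
rather than decode and re-encode them. The Python sketch below shows
how a round trip through a codec can change the escape sequences; it
assumes Python's standard "iso2022_jp" codec, and the sample bytes are
made up for the example.

```python
# Sketch: why a relay should forward the bytes untouched. Python's
# "iso2022_jp" codec (an assumption of this example) accepts ESC $ @ and
# ESC ( J on input but does not reproduce them on output.
original = b"\x1b$@F|K\\\x1b(B \x1b(Jabc\x1b(B"

reencoded = original.decode("iso2022_jp").encode("iso2022_jp")
forwarded = bytes(original)          # a relay simply copies the octets

print("re-encoded:", reencoded)
print("forwarded: ", forwarded)
print("escape sequences preserved by re-encoding:", reencoded == original)
```

The decoded text is the same in both cases; only the escape sequences
differ, which is exactly the alteration a relay is asked to avoid.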

The human user (not implementor) should try to keep lines within 80
display columns, or, preferably, within 75 (or so) columns, to allow
insertion of ">" at the beginning of each line in excerpts. Each JIS
X 0208 character takes up two columns, and the escape sequences do
not take up any columns. The implementor is reminded that JIS X 0208
characters take up two bytes and should not be split in the middle to
break lines for displaying, etc.
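
The column arithmetic above can be sketched as follows. The helper
display_columns is hypothetical; it assumes the line starts in a
single-byte set and, as a simplification, counts every single-byte
character (including control characters) as one column.

```python
DOUBLE = (b"\x1b$@", b"\x1b$B")
SINGLE = (b"\x1b(B", b"\x1b(J")


def display_columns(line: bytes) -> int:
    """Count columns: 1 per single-byte char, 2 per JIS X 0208 char."""
    columns, i, two_byte = 0, 0, False
    while i < len(line):
        seq = line[i:i + 3]
        if seq in DOUBLE or seq in SINGLE:
            two_byte = seq in DOUBLE      # escape sequences occupy no columns
            i += 3
        elif two_byte:
            columns += 2                  # one two-byte character, two columns
            i += 2                        # never stop between its two bytes
        else:
            columns += 1
            i += 1
    return columns


line = b"Hi \x1b$BF|K\\\x1b(B"
print(display_columns(line), "columns in", len(line), "bytes")
```

Stepping two bytes at a time while in a two-byte set is also what
keeps a line-breaking routine from splitting a character in the
middle.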

The JIS X 0208 standard was revised in 1990, to add two characters at
the end of the table. Although ISO 2022 specifies special additional
escape sequences to indicate the use of revised character sets, it is
suggested here not to make use of this special escape sequence in
ISO-2022-JP text, even if the two characters added to JIS X 0208 in
1990 are used.

For further information about Japanese character encodings such as PC
codes, FTP locations of implementations, etc, see "Electronic
Handling of Japanese Text" [JPN.INF].

References

[ASCII] American National Standards Institute, "Coded character set
-- 7-bit American national standard code for information
interchange", ANSI X3.4-1986.

[ISO646] International Organization for Standardization (ISO),
"Information technology -- ISO 7-bit coded character set for
information interchange", International Standard, Ref. No. ISO/IEC
646:1991.

[ISO2022] International Organization for Standardization (ISO),
"Information processing -- ISO 7-bit and 8-bit coded character sets
-- Code extension techniques", International Standard, Ref. No. ISO
2022-1986 (E).

[ISOREG] International Organization for Standardization (ISO),
"International Register of Coded Character Sets To Be Used With
Escape Sequences".

[JISX0201] Japanese Standards Association, "Code for Information
Interchange", JIS X 0201-1976.

[JISX0208] Japanese Standards Association, "Code of the Japanese
graphic character set for information interchange", JIS X 0208-1978,
-1983 and -1990.

[JPN.INF] Ken R. Lunde, "Electronic Handling of
Japanese Text", March 1992,
msi.umn.edu(128.101.24.1):pub/lunde/japan[123].

[JUNET] JUNET Riyou No Tebiki Sakusei Iin Kai (JUNET User's Guide
Drafting Committee), "JUNET Riyou No Tebiki (Dai Ippan)" ("JUNET
User's Guide (First Edition)"), February 1988.

[MIME1] Borenstein N., and N. Freed, "MIME (Multipurpose
Internet Mail Extensions): Mechanisms for Specifying and
Describing the Format of Internet Message Bodies", RFC 1341,
Bellcore, Innosoft, June 1992.

[MIME2] Moore, K., "Representation of Non-ASCII Text in Internet
Message Headers", RFC 1342, University of Tennessee, June 1992.

[RFC822] Crocker, D., "Standard for the Format of ARPA Internet
Text Messages", STD 11, RFC 822, UDEL, August 1982.

[RFC1036] Horton M., and R. Adams, "Standard for Interchange of
USENET Messages", RFC 1036, AT&T Bell Laboratories, Center for
Seismic Studies, December 1987.

Acknowledgements

Many people assisted in drafting this document. The authors wish to
thank in particular Akira Kato, Masahiro Sekiguchi and Ken'ichi
Handa.

Security Considerations

Security issues are not discussed in this memo.







Authors' Addresses

Jun Murai
Keio University
5322 Endo,
Kanagawa 252

Fax: +81 466 49 1101
EMail: jun@wide.ad.


Mark Crispin
Panda Programming
6158 Lariat Loop
Bainbridge Island, WA 98110-2098


Phone: +1 206 842 2385
EMail: MRC@PANDA.


Erik M. van der Poel
A-105 Park
4-4-10 Ohta,
Chiba 292

Phone: +81 438 22 5836
Fax: +81 438 22 5837
EMail: erik@poel.juice.or.





















