As per Relevance of the word structure, we have this rfc below:











Network Working Group M. Wildgrube
Request for Comments: 3072 March 2001
Category: Informational


Structured Data Exchange Format (SDXF)

Status of this Memo

This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2001). All Rights Reserved.

IESG Note

This document specifies a data exchange format and, partially, an API
that can be used for creating and parsing such a format. The IESG
notes that the same problem space can be addressed using formats that
the IETF normally uses, including ASN.1 and XML. The document reader
is strongly encouraged to carefully read section 13 before choosing
SDXF over ASN.1 or XML. Further, when storing text in SDXF, the user
is encouraged to use the datatype for UTF-8, specified in section 2.5.



Abstract

This specification describes an all-purpose interchange format for
use as a file format or for networking. Data is organized in chunks,
which can be ordered in hierarchical structures. This format is
self-describing and CPU-independent.

Table of Contents

1. Introduction
2. Description of the SDXF data format
3. Introduction to the SDXF functions
3.1 General remarks
3.2 Writing a SDXF buffer
3.3 Reading a SDXF buffer
3.4 Example
4. Platform independence
5. Compression
6. Encryption
7. Arrays
8. Description of the SDXF functions



Wildgrube Informational [Page 1]

RFC 3072 Structured Data Exchange Format March 2001


8.1 Introduction
8.2 Basic definitions
8.3 Definitions for C++
8.4 Common Definitions
8.5 Special functions
9. 'Support' of UTF-8
10. Security Considerations
11. Some general hints
12. IANA Considerations
13. Discussion
13.1 SDXF vs. ASN.1
13.2 SDXF vs. XML
14. Author's Address
15. Acknowledgements
16. References
17. Full Copyright Statement

1. Introduction

The purpose of the Structured Data eXchange Format (SDXF) is to
permit the interchange of an arbitrary structured data block with
different kinds of data (numerical, text, bitstrings). Because data
is normalized to an abstract, architecture-independent
"network format", SDXF is usable as a network interchange data
format.

This data format is not limited to any application. The demand on
this format is that it be usable as a text format for word
processing, as a picture format, as a sound format, for remote
procedure calls with complex parameters, for document formats, for
interchanging business data, etc.

SDXF is self-describing: every program can unpack every SDXF data
block without knowing the meaning of the individual data elements.

Together with the description of the data format, a set of functions
is introduced. With the help of these functions one can create
and access the data elements of SDXF. The idea is that a programmer
should only use these functions instead of maintaining the structure
by hand at the level of bits and bytes. (In the language of
object-oriented programming, these functions are methods of an object
which works as a handle for a given SDXF data block.)

SDXF is not limited to a specific platform. With a correct
preparation of the SDXF functions, SDXF data can be interchanged
(via network or data carrier) across the boundaries of different
architectures (specified by the character code, like ASCII, ANSI or
EBCDIC, and the byte order for binary data).





SDXF is also prepared to compress and encrypt parts of a block of
SDXF data or the whole block.

2. Description of the SDXF data format

2.1 First we introduce the term "chunk". A chunk is a data structure
with a fixed set of components. A chunk may be "elementary" or
"structured". A structured chunk itself contains one or more other
chunks.

A chunk consists of a header and the data body (content):

+----------+-----+-------+-----------------------------------+
| Name     | Pos.| Length| Description                       |
+----------+-----+-------+-----------------------------------+
| chunk-ID |  1  |   2   | ID of the chunk (unsigned short)  |
| flags    |  3  |   1   | type and properties of this chunk |
| length   |  4  |   3   | length of the following data      |
| content  |  7  |   *)  | net data or a list of chunks      |
+----------+-----+-------+-----------------------------------+

(*) as stated in "length". The total length of a chunk is length+6.
The chunk ID is a non-zero positive number.

Or, more visually:

+----+----+----+----+----+----+----+----+----+-...
| chunkID | fl |    length    |  content
+----+----+----+----+----+----+----+----+----+-...

Or in ASN.1 syntax:

chunk ::= SEQUENCE
{
  chunkID  INTEGER (1..65535),
  flags    BIT STRING,
  length   OCTET STRING SIZE 3, -- or: INTEGER (0..16777215)
  content  OCTET STRING
}
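
The header layout above can be decoded with a few shifts. The following
is a minimal sketch, not part of the SDXF function set: the names
sdxf_header and sdxf_read_header are hypothetical and chosen only for
this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form of the 6-byte chunk header from section 2.1.
   The type name is hypothetical, not part of the specification. */
typedef struct {
    uint16_t id;      /* chunk-ID, bytes 1-2, big-endian */
    uint8_t  flags;   /* flag byte, byte 3 */
    uint32_t length;  /* content length, bytes 4-6, big-endian */
} sdxf_header;

/* Reads the header at buf. Returns 0 on success, -1 if fewer than
   6 bytes are available or the chunk-ID is zero (it must be non-zero). */
static int sdxf_read_header(const uint8_t *buf, size_t avail,
                            sdxf_header *out)
{
    assert(buf != NULL && out != NULL);
    if (avail < 6)
        return -1;
    out->id     = (uint16_t)((buf[0] << 8) | buf[1]);
    out->flags  = buf[2];
    out->length = ((uint32_t)buf[3] << 16) | ((uint32_t)buf[4] << 8)
                | buf[5];
    return out->id == 0 ? -1 : 0;
}
```

The total size of the chunk is then length + 6, as stated in the note
below the table.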

2.2 Structured chunk

A structured chunk is marked as such by the flag byte (see 2.5).
In contrast to an elementary chunk, its content consists of a list of
chunks (elementary or structured):

+----+-+---+-------+-------+-------+-----+-------+
| id |f|len| chunk | chunk | chunk | ... | chunk |
+----+-+---+-------+-------+-------+-----+-------+

With the help of this concept you can map any hierarchically
structured data onto a SDXF chunk.
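
Because every chunk carries its own length, the direct children of a
structured chunk can be visited without understanding their content.
The sketch below counts them. It is an illustration only: the function
name is hypothetical, and the mask used for the 'short' flag (see 2.5
and 2.6) assumes that bit 0 of the flag byte is the most significant
bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mask for the 'short' flag: bit 5, counting bit 0 as the MSB. */
#define SDXF_FLAG_SHORT 0x04

/* Counts the direct children of the structured chunk starting at buf.
   Returns the count, or -1 if a header or body would run past the
   available data. Nested structures are skipped, not descended into. */
static int sdxf_count_children(const uint8_t *buf, size_t avail)
{
    assert(buf != NULL);
    if (avail < 6)
        return -1;
    size_t content_len = ((size_t)buf[3] << 16) | ((size_t)buf[4] << 8)
                       | buf[5];
    if (content_len > avail - 6)
        return -1;

    const uint8_t *p = buf + 6;
    size_t remaining = content_len;
    int count = 0;
    while (remaining > 0) {
        if (remaining < 6)
            return -1;
        size_t body = 0;   /* a short chunk has no data body */
        if (!(p[2] & SDXF_FLAG_SHORT))
            body = ((size_t)p[3] << 16) | ((size_t)p[4] << 8) | p[5];
        if (body > remaining - 6)
            return -1;
        p += 6 + body;
        remaining -= 6 + body;
        count++;
    }
    return count;
}
```

A reader that dispatches on the chunk-ID inside this loop and skips IDs
it does not know gets forward compatibility for free.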

2.3 Some remarks about the internal representation of the chunk's
elements:

Binary values are always in high-order-first (big-endian) format,
like the binary values in the IP header (network format). A length
of 300 (= 256 + 32 + 12) is stored as

+----+----+----+----+----+----+----+----+----+--
|         |    | 00   01   2C |  content
+----+----+----+----+----+----+----+----+----+--

in hexadecimal notation.

This is also valid for the chunk-ID.

2.4 Character values in the content portion are also subject to
adaptation: see chapter 4.

2.5 Meaning of the flag bits: Let us represent the flag byte in this
manner:

+-+-+-+-+-+-+-+-+
|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+
 | | | | | | | |
 | | | | | | | +-- reserved
 | | | | | | +---- array
 | | | | | +------ short chunk
 | | | | +-------- encrypted chunk
 | | | +---------- compressed chunk
 | | |
 +-+-+------------ data type (0..7)

The data types are:

0 -- pending structure (chunk is inconsistent, see also 11.1)
1 -- structure
2 -- bit string
3 -- numeric
4 -- character
5 -- float (ANSI/IEEE 754-1985)
6 -- UTF-8
7 -- reserved

2.6 A short chunk has no data body. The 3-byte length field is used
for data bytes instead. This saves space when there are many small
chunks.

2.7 Compressed and encrypted chunks are explained in chapter 5 and 6.

2.8 Arrays are explained in chapter 7.

2.9 Handling of UTF-8 is explained in chapter 9.

2.10 Not all combinations of bits are allowed or reasonable:

- the flags 'array' and 'short' are mutually exclusive.
- 'short' is not applicable for data type 'structure' and 'float'.
- 'array' is not applicable for data type 'structure'.
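
The three rules can be checked mechanically. The sketch below is an
illustration under one assumption that the text does not state
explicitly: bit 0 of the flag diagram is taken to be the most
significant bit of the byte, so the data type occupies the top three
bits. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed masks: bit 0 of the diagram in 2.5 is the most significant bit. */
enum {
    SDXF_FLAG_COMPRESSED = 0x10,  /* bit 3 */
    SDXF_FLAG_ENCRYPTED  = 0x08,  /* bit 4 */
    SDXF_FLAG_SHORT      = 0x04,  /* bit 5 */
    SDXF_FLAG_ARRAY      = 0x02   /* bit 6 */
};

/* Data type numbers needed for the rules, taken from section 2.5. */
enum { DT_STRUCTURE = 1, DT_FLOAT = 5 };

/* Extracts the data type (0..7) from the flag byte. */
static int sdxf_data_type(uint8_t flags)
{
    return flags >> 5;
}

/* Returns 1 if the flag byte respects the rules of section 2.10, else 0. */
static int sdxf_flags_valid(uint8_t flags)
{
    int type     = sdxf_data_type(flags);
    int is_short = (flags & SDXF_FLAG_SHORT) != 0;
    int is_array = (flags & SDXF_FLAG_ARRAY) != 0;

    if (is_short && is_array)
        return 0;   /* 'array' and 'short' are mutually exclusive */
    if (is_short && (type == DT_STRUCTURE || type == DT_FLOAT))
        return 0;   /* 'short' not applicable for structure and float */
    if (is_array && type == DT_STRUCTURE)
        return 0;   /* 'array' not applicable for structure */
    return 1;
}
```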

3. Introduction to the SDXF functions

3.1 General remarks

The functionality of the SDXF concept is not bound to any
programming language, but of course the functions themselves must be
coded in a particular language. I discuss these functions in C and
C++, because these languages are by now available on almost
all platforms.

All of these functions for reading and writing SDXF chunks use only
one parameter, a parameter structure. In C++ this parameter structure
is part of the "SDXF class" and the SDXF functions are methods of this
class.

An exact description of the interface is given in chapter 8.

3.2 Writing a SDXF buffer

To write SDXF chunks, there are the following functions:

init   -- initialize the parameter structure
create -- create a new chunk
leave  -- "close" a structured chunk










3.3 Reading a SDXF buffer

To read SDXF chunks, there are the following functions:

init    -- initialize the parameter structure
enter   -- "go into" a structured chunk
next    -- "go to" the next chunk inside a structured chunk
extract -- extract the content of an elementary chunk into the
           user's data area
leave   -- "go out" of a structured chunk

3.4 Example

3.4.1 Writing

For demonstration we use a reduced (outlined) C++ form of these
functions with overloaded definitions:

void create (short chunkID);  // opens a new structure
void create (short chunkID, char *string);
             // creates a new chunk with dataType character, etc.

The sequence

SDXF x(new);      // create the SDXF object "x" for a new chunk
                  // includes the "init"
x.create (3301);  // opens a new structure
x.create (3302, "first chunk");
x.create (3303, "second chunk");
x.create (3304);  // opens a new structure
x.create (3305, "chunk in a structure");
x.create (3306, "next chunk in a structure");
x.leave ();       // closes the inner structure
x.create (3307, "third chunk");
x.leave ();       // closes the outer structure


















creates a chunk which we can show graphically like this:

3301
 |
 +--- 3302 = "first chunk"
 |
 +--- 3303 = "second chunk"
 |
 +--- 3304
 |     |
 |     +--- 3305 = "chunk in a structure"
 |     |
 |     +--- 3306 = "next chunk in a structure"
 |
 +--- 3307 = "third chunk"
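
To make the create / leave pairing concrete, here is a minimal writer
sketch in C. It is not the SDXF library: the type writer and the
functions create_struct, create_chars and leave are hypothetical, there
is no compression, encryption or character translation, strings are
stored without a terminator, and the data type is assumed to sit in the
top three bits of the flag byte. An open structure carries data type 0
until leave patches its length and sets data type 1, as described in
hint 11.1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEPTH 8

/* Hypothetical writer state: a caller-supplied buffer plus a stack of
   start offsets of the structured chunks that are still open. */
typedef struct {
    uint8_t *buf;
    size_t   cap, len;
    size_t   open[MAX_DEPTH];
    int      depth;
} writer;

/* Appends a 6-byte header; the caller guarantees that it fits. */
static void put_header(writer *w, unsigned id, uint8_t flags, size_t length)
{
    uint8_t *p = w->buf + w->len;
    p[0] = (uint8_t)(id >> 8);
    p[1] = (uint8_t)id;
    p[2] = flags;
    p[3] = (uint8_t)(length >> 16);
    p[4] = (uint8_t)(length >> 8);
    p[5] = (uint8_t)length;
    w->len += 6;
}

/* Opens a structured chunk; its type stays 0 ("pending") until leave. */
static void create_struct(writer *w, unsigned id)
{
    assert(w->depth < MAX_DEPTH && w->cap - w->len >= 6);
    w->open[w->depth++] = w->len;
    put_header(w, id, 0x00, 0);
}

/* Appends an elementary character chunk (data type 4). */
static void create_chars(writer *w, unsigned id, const char *s)
{
    size_t n = strlen(s);
    assert(w->cap - w->len >= 6 + n);
    put_header(w, id, 4 << 5, n);
    memcpy(w->buf + w->len, s, n);
    w->len += n;
}

/* Closes the innermost open structure: patches its length field and
   sets its data type to 1 (structure). */
static void leave(writer *w)
{
    assert(w->depth > 0);
    size_t start   = w->open[--w->depth];
    size_t content = w->len - start - 6;
    uint8_t *p = w->buf + start;
    p[2] = 1 << 5;
    p[3] = (uint8_t)(content >> 16);
    p[4] = (uint8_t)(content >> 8);
    p[5] = (uint8_t)content;
}
```

Replaying the sequence from 3.4.1 with these functions gives a buffer of
121 bytes: the outer chunk 3301 has 115 bytes of content (17 + 18 + 63 +
17) plus its own 6-byte header.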

3.4.2 Reading

A typical access to a structured SDXF chunk is a selection inside
a loop:

SDXF x(old);  // defines a SDXF object "x" for an old chunk
x.enter ();   // enters the structure

while (x.rc == 0)  // 0 == ok, rc is set by the SDXF functions
{
  switch (x.chunkID)
  {
    case 3302:
      x.extract (data1, maxLength1);
      // extracts the 1st chunk into data1
      break;

    case 3303:
      x.extract (data2, maxLength2);
      // extracts the 2nd chunk into data2
      break;

    case 3304:     // we know this is a structure
      x.enter ();  // enters the inner structure

      while (x.rc == 0)  // inner loop
      {
        switch (x.chunkID)
        {
          case 3305:
            x.extract (data3, maxLength3);
            // extracts the 1st chunk inside the structure
            break;
          case 3306:
            x.extract (data4, maxLength4);
            // extracts the 2nd chunk inside the structure
            break;
        }
        x.next ();  // returns x.rc == 1 at end of structure
      }  // end-while
      break;

    case 3307:
      x.extract (data5, maxLength5);
      // extracts the last chunk into data5
      break;
    // default: none - ignore unknown chunks !!!

  }  // end-switch
  x.next ();  // returns x.rc == 1 at end of structure
}  // end-while

4. Platform independence

Most computer platforms today have an 8-bits-in-a-byte
architecture, which enables data exchange between these platforms.
But there are two significant points in which platforms may
differ:

a) The representation of binary numerical values (short and long
   integers and floats).

b) The representation of characters (ASCII/ANSI vs. EBCDIC).

Point (a) is the phenomenon of "byte swapping": how is a short int
value 259 = 0x0103 = X'0103' stored at address 4402?

The two flavours are:

4402  4403
 01    03     big-endian, and
 03    01     little-endian.

Point (b) is represented by a table which assigns the 256
possible values of a byte to printable or control characters. (In
ASCII the letter "A" is assigned to value (or position) 0x41 = 65, in
EBCDIC it is 0xC1 = 193.)








The solution to these problems is to normalize the data.

We fix:

(a) The internal representation of binary numerals is two's
    complement in big-endian order.

(b) The internal representation of characters is ISO 8859-1 (also
    known as Latin 1).

The fixing of point (b) should be regarded as a first attempt. In
some environments ISO 8859-1 is not the best choice: in a Russian
or Greek environment ISO 8859-5 or ISO 8859-7, respectively, is more
appropriate.

Nevertheless, within a specific group (or world) of applications, that
is to say all the applications which want to interchange data with a
defined protocol (via networking, diskette or something else), this
internal character table must be unique.

So a possibility to define a translation table (and its inverse)
should be given.

Important: You do not construct a SDXF chunk for a specific addressee;
instead you adapt your data into a normalized format (or network
format).

This adaptation is not done by the programmer; it is done by the
create and extract functions. An administrator has to take care of
defining the correct translation tables.

5. Compression

As stated in 2.5, there is a flag bit which declares that the
following data (elementary or structured) is compressed. This data
cannot be interpreted further until it is decompressed. Compression
is done transparently by the SDXF functions: "create" does the
compression for elementary chunks, "leave" for structured chunks;
"extract" does the decompression for elementary chunks, "enter" for
structured chunks.

"Transparently" means that the programmer only has to tell the SDXF
functions that he wants to compress the following chunk(s).

For choosing between different compression methods and for
controlling the decompressed (original) length, there is an
additional definition:








After the chunk header of a compressed chunk, a compression header
follows:

+-----------------------+---------------+---------------->
|      chunk header     | compr. header | compressed data
+---+---+---+---+---+---+---+---+---+---+---------------->
|chunkID|flg|   length  |md | orglength |
+---+---+---+---+---+---+---+---+---+---+---------------->

- 'orglength' is the original (decompressed) length of the data.

- 'md' is the "compression method". Two methods are described here:

  # method 01 for a simple (fast but not very effective)
    "Run Length 1" or "Byte Run 1" algorithm. (More than two
    consecutive identical characters are replaced by the number of
    these characters and the character itself.)

    More precisely:

    The compressed data consists of several sections of various
    lengths. Every section starts with a "counter" byte, a signed
    "tiny" (8 bit) integer, which contains length information.

    If this byte contains the value "n",
    with n >= 0 (and n < 128), the next n+1 bytes will be taken
    unchanged;
    with n < 0 (and n > -128), the next byte will be replicated
    -n+1 times;
    n = -128 will be ignored.

    Trailing blanks are generally cut off. If they are
    necessary, they can be reconstructed while "extract"ing with
    the parameter field "filler" (see 8.2.1) set to the space
    character.

  # method 02 for the wonderful "deflate" algorithm, which comes
    from the "zip" people.
    The authors are:
    Jean-loup Gailly (deflate routine),
    Mark Adler (inflate routine), and others.

    The deflate format is described in [DEFLATE].

The values for the compression method number are maintained by
IANA, see chap. 12.1.
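
The counter rules of method 01 translate directly into a decoder. The
following sketch expands a method-01 byte stream; it is an illustration
with a hypothetical function name, it does not parse the compression
header itself, and it does not restore trailing blanks via "filler".
The caller is expected to size the destination from 'orglength'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Expands method-01 data from src into dst. Returns the number of bytes
   written, or -1 if src is truncated or dst (capacity cap) is too small. */
static long rl1_decode(const uint8_t *src, size_t src_len,
                       uint8_t *dst, size_t cap)
{
    size_t in = 0, out = 0;
    assert(src != NULL && dst != NULL);
    while (in < src_len) {
        /* interpret the counter byte as a signed 8-bit integer */
        int n = src[in] < 128 ? src[in] : (int)src[in] - 256;
        in++;
        if (n == -128)
            continue;                     /* ignored */
        if (n >= 0) {                     /* n+1 bytes taken unchanged */
            size_t run = (size_t)n + 1;
            if (run > src_len - in || run > cap - out)
                return -1;
            memcpy(dst + out, src + in, run);
            in  += run;
            out += run;
        } else {                          /* next byte replicated -n+1 times */
            size_t run = (size_t)(-n) + 1;
            if (in >= src_len || run > cap - out)
                return -1;
            memset(dst + out, src[in], run);
            in++;
            out += run;
        }
    }
    return (long)out;
}
```

For example, the input 02 'a' 'b' 'c' FD 'x' expands to "abcxxxx": the
counter 02 copies three bytes, and the counter FD (-3) repeats the next
byte four times.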







6. Encryption

As stated in 2.5, there is a flag bit which declares that the
following data (elementary or structured) is encrypted. This data
cannot be interpreted until it is decrypted. Encryption and
decryption are done transparently by the SDXF functions: "create"
does the encryption for elementary chunks, "leave" for structured
chunks; "extract" does the decryption for elementary chunks, "enter"
for structured chunks. (Yes, it sounds very similar to chapter 5.)

More than one encryption method for a given range of applications is
not very reasonable.

Some encryption algorithms work on blocks. That means that the length
of the data to encrypt must be rounded up to the next multiple of the
block length. This blocksize (zero means non-blocking) is reported by
the encryption interface routine (addressed by the option field
*encryptProc, see chapter 8.5) with mode=3. If blocking is used, at
least one byte is added; the last byte of the padding contains the
number of added bytes minus one. With this, the decryption interface
routine can calculate
the real data length

If an application (or a network connection handshaking protocol) needs
to negotiate an encryption method, it should use a method number
maintained by IANA, see chap. 12.2.

Even though en/decryption is done transparently, an encryption key
(password) must be given to the SDXF functions. Encryption is done
after translating character data into the internal ("network")
format; decryption is done before translating from it.

If both encryption and compression are applied to the same chunk,
compression is done first: compressing well-encrypted data (identical
strings appear different after encryption) tends towards zero
compression rates.

7. Arrays

An array is a sequence of chunks with identical chunk-ID, length and
data type.

First, a hint: in principle a special definition in SDXF for such
an array is not really necessary.

It is not forbidden to have more than one chunk with the same
chunk-ID within the same structured chunk.

Therefore, with a sequence of SDX_next / SDX_extract calls one can
fill the destination array step by step.






If there are many occurrences of chunks with the same chunk-ID (and a
comparatively small length), the overhead of the chunk headers may be
significant.

Therefore the array flag is introduced. An array chunk has only one
chunk header for the complete sequence of elementary chunks. After
the chunk header of an array chunk, an array header follows.

This is a short integer (big-endian!) which contains the number of
array elements (CT). Every element has a fixed length (EL), so
the chunk length (CL) is CL = EL * CT + 2.

The data elements follow immediately after the array header.
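
The layout and the formula CL = EL * CT + 2 can be checked with a small
builder for the content of an array chunk. This is a sketch with a
hypothetical function name; it writes only the content that follows the
6-byte chunk header, and it expects the elements to be in network
format already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes the content of an array chunk: a 2-byte big-endian element
   count (CT) followed by CT elements of el_len bytes each (EL).
   Returns the chunk length CL = EL * CT + 2, or 0 if dst (capacity
   cap) is too small. */
static size_t array_content(uint8_t *dst, size_t cap,
                            const uint8_t *elems, size_t el_len,
                            uint16_t ct)
{
    size_t cl = el_len * ct + 2;
    assert(dst != NULL && elems != NULL);
    if (cl > cap)
        return 0;
    dst[0] = (uint8_t)(ct >> 8);
    dst[1] = (uint8_t)(ct & 0xFF);
    memcpy(dst + 2, elems, el_len * ct);
    return cl;
}
```

Three 4-byte elements give CL = 14, so the array chunk occupies 20 bytes
including its header, compared with 30 bytes for three separate chunks.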

The complete array is constructed by SDX_create, and the complete
array is read by SDX_extract.

The parameter fields (see 8.2.1) 'dataLength' and 'count' are used
by the SDXF functions 'extract' and 'create':

Field 'dataLength' is the common length of the array elements;
'count' is the actual dimension of the array for 'create' (input).

For the 'extract' function 'count' acts both as an input and an output
parameter:

input : the maximum dimension
output: the actual array dimension

(If the output count is greater than the input count, the 'data cutted'
warning is returned and the destination array is filled up to
the maximum dimension.)
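
The double role of 'count' can be sketched as follows. This is not
SDX_extract itself: the function name and the two result codes are
hypothetical, and no conversion to host format is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result codes for this sketch. */
enum { EXTRACT_OK = 0, EXTRACT_DATA_CUT = 1 };

/* Copies array elements from the content of an array chunk (2-byte
   big-endian count, then the elements) into dst. On entry *count is
   the capacity of dst in elements; on return it is the actual
   dimension of the array, even if fewer elements were copied. */
static int array_extract(const uint8_t *content, size_t el_len,
                         uint8_t *dst, short *count)
{
    assert(content != NULL && dst != NULL && count != NULL && *count >= 0);
    short actual = (short)((content[0] << 8) | content[1]);
    short copy   = actual < *count ? actual : *count;
    memcpy(dst, content + 2, el_len * (size_t)copy);
    int result = actual > *count ? EXTRACT_DATA_CUT : EXTRACT_OK;
    *count = actual;
    return result;
}
```

A caller that receives the "data cut" result can compare the returned
count with its own capacity to see how many elements were dropped.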

8. Description of the SDXF functions

8.1 Introduction

Following the principles of object-oriented programming, not only the
description of the data is necessary, but also the functions which
manipulate the data: the "methods".

For the programmer, knowing the methods is more important than knowing
the data structure. The methods have to know the exact specification
of the data and guarantee the consistency of the data while creating
it.








A SDXF object is an instance of a parameter structure which acts as a
programming interface. In particular, it points to a current SDXF data
chunk, and, while this data is processed, it holds a pointer to the
current inner chunk, which will be the focus of the next operation.

The benefit of an exact interface description is the same as that of
using, for example, the standard C library functions: by using standard
interfaces your code remains platform independent.

8.2 Basic definitions

8.2.1 The SDXF parameter structure

All SDXF access functions need only one parameter, a pointer to the
SDXF parameter structure.

First, three prerequisite definitions:

typedef short int     ChunkID;
typedef unsigned char Byte;

typedef struct
{
  ChunkID chunkID;
  Byte    flags;
  char    length [3];
  Byte    data;
} Chunk;

And now the parameter structure:

typedef struct
{
  ChunkID  chunkID;        // name (ID) of Chunk
  Byte    *container;      // pointer to the whole Chunk
  long     bufferSize;     // size of container
  Chunk   *currChunk;      // pointer to actual Chunk
  long     dataLength;     // length of data in Chunk
  long     maxLength;      // max. length of Chunk for SDX_extract
  long     remainingSize;  // rem. size in cont. after SDX_create
  long     value;          // for data type numeric
  double   fvalue;         // for data type float
  char    *function;       // name of the executed SDXF function
  Byte    *data;           // pointer to data
  Byte    *cryptkey;       // pointer to crypt key
  short    count;          // (max.) number of elements in an array
  short    dataType;       // Chunk data type / init open type
  short    ec;             // extended return-code
  short    rc;             // return-code
  short    level;          // level of hierarchy
  char     filler;         // filler char for SDX_extract
  Byte     encrypt;        // Indication if data to encrypt (0 / 1)
  Byte     compression;    // compression method
                           // (00=none, 01=RL1, 02=zip/deflate)
} SDX_obj, *SDX_handle;

Only the "public" fields of the parameter structure, which act as
input and output for the SDXF functions, are described here. A given
implementation may add some "private" fields to this structure.

8.2.2 Basic functions

All of these functions work with a SDX_handle as the only formal
parameter. Every function returns ec and rc as output to report
success. For the values of ec, rc and dataType see chap. 8.4.

1. SDX_init : Initialize the parameter structure.

   input : container, dataType, bufferSize (for dataType =
           SDX_NEW only)
   output: currChunk, dataLength (for dataType = SDX_OLD only),
           ec, rc;
           the other fields of the parameter structure will be
           initialized.

2. SDX_enter : Enter a structured chunk.
   You can then access the first chunk inside this structured chunk.
   input : none
   output: currChunk, chunkID, dataLength, level, dataType,
           ec, rc

3. SDX_leave : Leave the currently entered structured chunk.
   input : none
   output: currChunk, chunkID, dataLength, level, dataType,
           ec, rc

4. SDX_next : Go to the next chunk inside a structured chunk.
   input : none
   output: currChunk, chunkID, dataLength, dataType, count, ec, rc

   At the end of a structured chunk SDX_next returns rc =
   SDX_RC_failed and ec = SDX_EC_eoc (end of chunk).
   The current structured chunk is SDX_leave'd automatically.








5. SDX_extract : Extract the data of the current chunk.
   (If the current chunk is structured, only a copy is made;
   otherwise the data is converted to host format.)
   Input and output depend on the dataType:

   if dataType is structured, binary or char:
   input : data, maxLength, count, filler
   output: dataLength, count, ec, rc

   if dataType is numeric (or float, respectively):
   input : none
   output: value (or fvalue, respectively), ec, rc

6. SDX_select : Go to the (next) chunk with a given chunkID.
   input : chunkID
   output: currChunk, dataLength, dataType, ec, rc

7. SDX_create : Create a new chunk (at the end of the current
   structured chunk).
   input : chunkID, dataLength, data, (f)value, dataType,
           compression, encrypt, count
   update: remainingSize, level
   output: currChunk, dataLength, ec, rc

8. SDX_append : Append a complete chunk (at the end of the current
   structured chunk).
   input : data, maxLength, currChunk
   update: remainingSize, level
   output: chunkID, chunkLength, maxLength, dataType, ec, rc

8.3 Definitions for C++

This is the specification of the SDXF class in C++. (The type 'Byte'
is defined as "unsigned char" for bitstrings, as opposed to "signed
char" for character strings.)

class C_SDXF
{
  public:

  // constructors and destructor:
  C_SDXF  ();                           // dummy
  C_SDXF  (Byte *cont);                 // old container
  C_SDXF  (Byte *cont, long size);      // new container
  C_SDXF  (long size);                  // new container
  ~C_SDXF ();

  // methods:
  void init (void);                     // old container
  void init (Byte *cont);               // old container
  void init (Byte *cont, long size);    // new container
  void init (long size);                // new container

  void enter (void);
  void leave (void);
  void next  (void);
  long extract (Byte *data, long length);          // chars, bits
  long extract (void);                             // numeric data
  void create (ChunkID);                           // structured
  void create (ChunkID, long value);               // numeric
  void create (ChunkID, double fvalue);            // float
  void create (ChunkID, Byte *data, long length);  // binary
  void create (ChunkID, char *data);               // chars
  void set_compression (Byte compression_method);
  void set_encryption  (Byte *encryption_key);

  // interface:

  ChunkID id;        // see 8.4.1
  short   dataType;  // see 8.4.2
  long    length;    // length of data or chunk

  long    value;
  double  fvalue;
  short   rc;        // the raw return code, see 8.4.3
  short   ec;        // the extended return code, see 8.4.4

  protected:
  // implementation dependent ...

};

8.4 Common Definitions

8.4.1 Definition of ChunkID

typedef short ChunkID;

8.4.2 Values for dataType

SDX_DT_inconsistent = 0
SDX_DT_structured   = 1
SDX_DT_binary       = 2
SDX_DT_numeric      = 3
SDX_DT_char         = 4
SDX_DT_float        = 5
SDX_DT_UTF8         = 6

data types for SDX_init:
SDX_OLD = 1
SDX_NEW = 2

8.4.3 Values for rc

SDX_RC_ok               = 0
SDX_RC_failed           = 1
SDX_RC_warning          = 1
SDX_RC_illegalOperation = 2
SDX_RC_dataError        = 3
SDX_RC_parameterError   = 4
SDX_RC_programError     = 5
SDX_RC_noMemory         = 6

8.4.4 Values for ec

SDX_EC_ok             = 0
SDX_EC_eoc            = 1   // end of chunk
SDX_EC_notFound       = 2
SDX_EC_dataCutted     = 3
SDX_EC_overflow       = 4
SDX_EC_wrongInitType  = 5
SDX_EC_comprerr       = 6   // compression error
SDX_EC_forbidden      = 7
SDX_EC_unknown        = 8
SDX_EC_levelOvflw     = 9
SDX_EC_paramMissing   = 10
SDX_EC_magicError     = 11
SDX_EC_not_consistent = 12
SDX_EC_wrongDataType  = 13
SDX_EC_noMemory       = 14
SDX_EC_error          = 99  // rc is sufficient

8.5 Special functions

Besides the basic definitions there is a global function
(SDX_getOptions) which returns a pointer to a global table of
options.

With the help of these options you can adapt the behaviour of SDXF.
In particular, you can define an alternative pair of translation
tables or an alternative function which reads these tables from an
external resource (e.g. from disk).







Within this table of options there is also a pointer to the function
which is used for encryption / decryption: you can install your own
encryption algorithm by setting this pointer.

The options pointer is obtained by:

SDX_TOptions *opt = SDX_getOptions ();

with:

typedef struct
{
  Byte           *toHost;        // translation table net -> host
  Byte           *toNet;         // translation table host -> net
  int             maxlevel;      // highest possible level
  int             translation;   // translation net <-> host
                                 // is in effect=1 or not=0
  TEncryptProc   *encryptProc;   // alternate encryption routine
  TGetTablesProc *getTablesProc; // alternate routine defining
                                 // translation tables
  TcvtUTF8Proc   *convertUTF8;   // routine to convert to/from UTF-8
} SDX_TOptions;

typedef long TEncryptProc (
  int   mode,    // 1= to encrypt, 2= to decrypt, 3= encrypted length
  Byte *buffer,  // data to en/decrypt
  long  len,     // len: length of buffer
  char *passw);  // password

// returns length of en/de-crypted data
// (parameters buffer and passw are ignored for mode=3)
// returns blocksize for mode=3 and len=0.
// blocksize is zero for non-blocking algorithms

typedef int TGetTablesProc (Byte **toNet, Byte **toHost);
// toNet, toHost: pointers to output params. Both params
// point to translation tables of 256 Bytes
// returns success: 1 = ok, 0 = error

typedef int TcvtUTF8Proc
( int   mode,                       // 1 = to UTF-8, 2 = from UTF-8
  Byte *target, int *targetlength,  // output
  Byte *source, int  sourcelength); // input
// targetlength contains maximal size as input param
// returns success: 1 = ok, 0 = no conversion
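
To show how the three modes of the encryption interface fit together,
here is a toy routine with the parameter list of the encryption
routine described above. It is a sketch only: the function name is
hypothetical, the "cipher" is a plain XOR with the repeated password
and provides no real security, and it is non-blocking, so mode 3
reports a blocksize of zero and an unchanged length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char Byte;

/* Toy stand-in for an encryption interface routine.
   mode 1 = encrypt, mode 2 = decrypt, mode 3 = report lengths.
   XOR is its own inverse, so modes 1 and 2 do the same work. */
static long xor_encrypt_proc(int mode, Byte *buffer, long len, char *passw)
{
    if (mode == 3)
        return len;   /* len == 0: blocksize 0; otherwise same length */
    assert(buffer != NULL && passw != NULL && passw[0] != '\0');
    size_t klen = strlen(passw);
    for (long i = 0; i < len; i++)
        buffer[i] ^= (Byte)passw[(size_t)i % klen];
    return len;
}
```

Such a routine would be installed by storing its address in the
encryptProc field of the options table returned by SDX_getOptions.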








9. 'Support' of UTF-8

Many systems support [UTF-8] as a character format for transferred
data. The benefit is that an application need not fix a specific
character set, because the set of 'all' characters is used,
represented by the 'Universal Character Set' UCS-2 [UCS], a
double-byte coding for characters.

SDXF does not really deal with UTF-8 by itself; there are many
possibilities to interpret a UTF-8 sequence. The application may:

- reconstruct the UCS-2 sequence,
- accept only pure ASCII characters and map non-ASCII to a
  special 'non-printable' character,
- target pure ASCII, with non-ASCII replaced in a sensible manner
  (French accented vowels replaced by vowels without accents, etc.),
- target a specific ANSI character set, with the non-ASCII chars
  mapped where possible and the others replaced by a 'non-printable',
- etc.

But SDXF offers an interface for the 'extract' and 'create'
functions:

A function pointer may be specified in the options table to provide
this possibility (see 8.5). The default for this pointer is NULL: no
further conversions are done by SDXF, the data is copied 'as is' and
is treated as a bit string, as for data type 'binary'.

If this function is specified, it is used by the 'create' function
with the 'toUTF8' mode, and by the 'extract' function with the
'fromUTF8' mode. These functions are invoked by SDXF
transparently.

If the function returns zero (no conversion), SDXF copies the data
without conversion
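
One possible conversion routine is sketched below. It converts between
ISO 8859-1 and UTF-8 and uses the parameter list of the UTF-8
conversion routine from 8.5. The function name is hypothetical, and the
sketch assumes that targetlength receives the number of bytes written
on success, which the text does not spell out. UTF-8 input that is
malformed or lies outside Latin 1 yields "no conversion".

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char Byte;

/* mode 1: ISO 8859-1 -> UTF-8, mode 2: UTF-8 -> ISO 8859-1.
   *targetlength is the capacity of target on entry and the number of
   bytes written on success. Returns 1 = ok, 0 = no conversion. */
static int latin1_utf8_proc(int mode, Byte *target, int *targetlength,
                            Byte *source, int sourcelength)
{
    int cap, out = 0;
    assert(target != NULL && targetlength != NULL && source != NULL);
    cap = *targetlength;

    if (mode == 1) {
        for (int i = 0; i < sourcelength; i++) {
            Byte c = source[i];
            if (c < 0x80) {
                if (out + 1 > cap) return 0;
                target[out++] = c;
            } else {                      /* U+0080..U+00FF: two bytes */
                if (out + 2 > cap) return 0;
                target[out++] = (Byte)(0xC0 | (c >> 6));
                target[out++] = (Byte)(0x80 | (c & 0x3F));
            }
        }
    } else if (mode == 2) {
        for (int i = 0; i < sourcelength; i++) {
            Byte c = source[i];
            if (out + 1 > cap) return 0;
            if (c < 0x80) {
                target[out++] = c;
            } else if ((c == 0xC2 || c == 0xC3) && i + 1 < sourcelength
                       && (source[i + 1] & 0xC0) == 0x80) {
                target[out++] = (Byte)(((c & 0x03) << 6)
                                       | (source[i + 1] & 0x3F));
                i++;                      /* consumed the second byte */
            } else {
                return 0;                 /* not representable in Latin 1 */
            }
        }
    } else {
        return 0;                         /* unknown mode */
    }
    *targetlength = out;
    return 1;
}
```

Returning 0 for input that cannot be represented matches the fallback
described above: SDXF then copies the data without conversion.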

10. Security Considerations

Any corruption of data in the chunk headers invalidates the complete
SDXF structure.

Any corruption of data in an encrypted or compressed SDXF structure
makes this chunk unusable. An integrity check after decryption or
decompression should be done by the "enter" function.

When TCP/IP (more precisely: IP) is used as the transmission medium,
we can rely on its checksum on the transport layer.






11. Some general hints

1. A SDXF structure is constructed consistently if every
   "create" of a structured chunk is closed by a paired "leave".
   While a structured chunk is under construction, its data type is
   set to zero, which means: this chunk is inconsistent. The
   SDX_leave function sets the data type to "structured".

2. While an elementary chunk is created, a platform-dependent
   transformation of the data into a platform-independent format is
   performed. At the end of construction the content of the buffer
   is ready to be transported to another site without any further
   translation.

3. As you see, no data definition in your programming language is
   needed to construct a specific SDXF structure. The data is
   created dynamically by function calls.

4. With SDXF as a base you can define protocols for client / server
   applications. These protocols may be extended in a downward
   compatible manner by following two rules:

   Rule 1: Ignore unknown chunkIDs.

   Rule 2: The sequence of chunks should not be significant.

12. IANA Considerations

The compression and encryption algorithms for SDXF are not fixed; SDXF
is open for various algorithms. Therefore an agreement is necessary
to interpret the compression and encryption algorithm method
numbers. (Encryption methods are not a semantic part of SDXF, but
may be used by a connection protocol to negotiate the encryption
method to use.)

The following two items are registered by IANA:

12.1 COMPRESSION METHODS FOR SDXF

The compressed SDXF chunk starts with a "compression header". This
header contains the compression method as an unsigned 1-Byte integer
(1-255). These numbers are assigned by IANA and listed here:

compression
method       Description                      Hints
---------    -------------------------------  -------------
01           RUN-LENGTH algorithm             see chap. 5
02           DEFLATE (ZIP)                    see [DEFLATE]
03-239       IANA to assign
240-255      private or application specific

12.2 ENCRYPTION METHODS FOR SDXF

A unique encryption method is fixed or negotiated by handshaking.
For the latter, a number for each encryption method is necessary.
These numbers are unsigned 1-Byte integers (1-255). These numbers
are assigned by IANA and listed here:

encryption
method       Description
---------    ------------------------------
01-239       IANA to assign
240-255      private or application specific

12.3 Hints for assigning a number

Developers who want to register a compression or encryption method for
SDXF should contact IANA for a method number. The ASSIGNED NUMBERS
document should be referred to for a current list of METHOD numbers
and their corresponding protocols, see [IANA]. The new method should
be a standard published as an RFC or by an established standardization
organization (such as OSI).

13. Discussion

There are already some standards for Internet data exchange; the IETF
prefers ASN.1 and XML for this purpose. So the reasons for
establishing a new data format should be discussed.

13.1 SDXF vs. ASN.1

The aim of ASN.1 (see [ASN.1]) is to provide a means to define data
structures independently of the programming language. The real data
format which is used to send the data is not defined by ASN.1, but
usually BER or PER (or some derivatives of them like CER and DER) are
used in this context, see [BER] and [PER].










The idea behind ASN.1 is: On every platform on which a communicating
application is to be developed, descriptions of the data structures
used are available in ASN.1 notation. Out of these notations the real
language dependent definitions are generated with the help of an
ASN.1 compiler.

This compiler also generates transform functions for these data
structures, to pack and unpack them to and from the BER (or other)
format.

A direct comparison between ASN.1 and SDXF is somewhat inappropriate:
The data format of SDXF is related rather to BER (and relatives).
The use of ASN.1 to define data structures is no contradiction to
SDXF, but: SDXF does not require a complete data structure to build
the message to send, nor will a complete data structure be generated
out of the received message.

The main difference lies in the concept of building and
interpreting the message, which I want to name the "static" and
"dynamic" concepts:

o ASN.1 uses a "static" approach: The whole data structure must
exist before the message can be created.

o SDXF constructs and interprets the message in a "dynamic" way:
the message is packed and unpacked step by step by SDXF
functions.

The use of static structures may be appropriate for a range of
applications, but for complex tasks it is often impossible to define
the message as a whole. As an example, try to define an ASN.1
description for a complex structured text document which is presented
in XML: There are sections and paragraphs and text elements which
may recursively consist of sections with specific text attributes.
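
The "dynamic" approach can be illustrated with a minimal sketch in C.
It does not use the SDXF functions described in this document: the
names (struct builder, chunk_open, chunk_add_text, chunk_close) are
hypothetical, the 6-byte header (2-byte ID, 1-byte flags, 3-byte
length, big-endian) is a simplifying assumption, and the flag values
are placeholders. The point is only that nested chunks are appended
step by step and that the length of a structured chunk is filled in
when it is closed, so no complete data structure has to exist
beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN     6     /* assumed: 2-byte ID, 1-byte flags, 3-byte length */
#define MAX_DEPTH   8
#define FLAG_STRUCT 0x01  /* placeholder flag values, not the real encoding */
#define FLAG_CHAR   0x02

struct builder {
    uint8_t buf[256];
    size_t  used;             /* bytes written so far */
    size_t  open[MAX_DEPTH];  /* header offsets of the open structured chunks */
    int     depth;
};

/* Write a chunk header at the current position. */
static void put_header(struct builder *b, uint16_t id, uint8_t flags, size_t len)
{
    uint8_t *h = b->buf + b->used;
    assert(b->used + HDR_LEN + len <= sizeof b->buf);
    h[0] = (uint8_t)(id >> 8);
    h[1] = (uint8_t)id;
    h[2] = flags;
    h[3] = (uint8_t)(len >> 16);
    h[4] = (uint8_t)(len >> 8);
    h[5] = (uint8_t)len;
    b->used += HDR_LEN;
}

/* Open a structured chunk; its length is not known yet. */
void chunk_open(struct builder *b, uint16_t id)
{
    assert(b->depth < MAX_DEPTH);
    b->open[b->depth++] = b->used;
    put_header(b, id, FLAG_STRUCT, 0);
}

/* Append an elementary character chunk inside the current structure. */
void chunk_add_text(struct builder *b, uint16_t id, const char *text)
{
    size_t len = strlen(text);
    put_header(b, id, FLAG_CHAR, len);
    memcpy(b->buf + b->used, text, len);
    b->used += len;
}

/* Close the innermost structured chunk and patch its length field. */
void chunk_close(struct builder *b)
{
    size_t start, len;
    assert(b->depth > 0);
    start = b->open[--b->depth];
    len = b->used - start - HDR_LEN;
    b->buf[start + 3] = (uint8_t)(len >> 16);
    b->buf[start + 4] = (uint8_t)(len >> 8);
    b->buf[start + 5] = (uint8_t)len;
}
```

A caller would zero-initialize a builder, call chunk_open for the
outer structure, add elementary chunks or open further structures as
the data becomes available, and finish with one chunk_close per open
structure.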

13.2 SDXF vs. XML

On the one hand SDXF and XML are similar, as both can handle any
recursive complex data stream. The main difference is the kind of
data which is to be maintained:

o XML works with pure text data (though it should be noted that the
character representation is not standardized by XML). And: a XML
document with all its tags is readable by humans. Binary data such as
graphics is not included directly but may be referenced by an
external link as in HTML.

In XML there is no strong separation between informational and
control data; escape characters (like "&lt;" and "&amp;") and the
<![CDATA[...]]> construction are used to distinguish between these
two types of data.

o SDXF maintains machine-readable data. It is not designed to be
readable by humans, nor is SDXF data meant to be edited with a text
editor (even less so if compression and encryption are used). With
the help of the SDXF functions you have quick and easy access to
every data element. The standard parser for a SDXF data structure is
always a simple template, the "while - switch - case ID -
enter/extract" pattern as outlined in chap. 3.4.2.
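
The parser template mentioned in the last item might look as follows
in C. This is a simplified sketch and does not use the SDXF functions
of this document: the chunk iterator (struct cursor, next_chunk), the
chunk IDs and the 6-byte header layout (2-byte ID, 1-byte flags,
3-byte length, big-endian) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 6  /* assumed: 2-byte ID, 1-byte flags, 3-byte length */

/* Hypothetical chunk IDs for this example. */
enum { ID_PERSON = 1, ID_NAME = 2, ID_ADDRESS = 3, ID_CITY = 4 };

struct cursor { const uint8_t *p, *end; };  /* walks one list of chunks */
struct chunk  { uint16_t id; const uint8_t *data; size_t len; };
struct person { char name[32]; char city[32]; };

/* Fetch the next chunk of the current list; returns 0 at the end of the
   list or on a malformed header. */
static int next_chunk(struct cursor *c, struct chunk *out)
{
    size_t len;
    if ((size_t)(c->end - c->p) < HDR_LEN) return 0;
    len = ((size_t)c->p[3] << 16) | ((size_t)c->p[4] << 8) | c->p[5];
    if ((size_t)(c->end - c->p) - HDR_LEN < len) return 0;
    out->id = (uint16_t)((c->p[0] << 8) | c->p[1]);
    out->data = c->p + HDR_LEN;
    out->len = len;
    c->p += HDR_LEN + len;
    return 1;
}

/* "extract": copy character data into a fixed-size, terminated string. */
static void extract(const struct chunk *ch, char *dst, size_t cap)
{
    size_t n = ch->len < cap - 1 ? ch->len : cap - 1;
    memcpy(dst, ch->data, n);
    dst[n] = '\0';
}

/* The "while - switch - case ID - enter/extract" template. */
void parse_list(const uint8_t *buf, size_t len, struct person *out)
{
    struct cursor c = { buf, buf + len };
    struct chunk ch;
    while (next_chunk(&c, &ch)) {
        switch (ch.id) {
        case ID_PERSON:                  /* structured chunks: enter */
        case ID_ADDRESS:
            parse_list(ch.data, ch.len, out);
            break;
        case ID_NAME:                    /* elementary chunks: extract */
            extract(&ch, out->name, sizeof out->name);
            break;
        case ID_CITY:
            extract(&ch, out->city, sizeof out->city);
            break;
        default:                         /* unknown IDs are skipped */
            break;
        }
    }
}
```

Because unknown chunk IDs fall through to the default case, a parser
written this way simply skips elements it does not know.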

Because of the completely different philosophies behind XML and SDXF
(and even ASN.1), a direct comparison may not be very meaningful: XML
has its own right to exist next to ASN.1 (and even SDXF).

Nevertheless there is a way to convert a XML data stream into a
SDXF structure: As a first step, every XML tag becomes a SDXF
chunk ID. An elementary sequence <tag> pure text </tag> can be
transformed into an elementary (non-structured) chunk with data type
"character". Tags with attributes and sequences with nested tags are
transformed into structured chunks. Because XML allows a tag
sequence everywhere in a text stream, an artificial "elementary
text" tag must be introduced:
If <t> is the tag for text elements, the sequence:

<t> this is a text <attr value='bold'> with </attr> attributes </t>
is conceptually replaced by:

<t><et> this is a text </et> <attr value='bold'> <et> with </et>
</attr> <et> attributes </et> </t>
(With "et" as the "elementary text" tag)

This results in the following SDXF structure:

ID_t
|
+-- ID_et = " this is a text "
|
+-- ID_attr
|   |
|   +-- ID_value = "bold"
|   |
|   +-- ID_et = "with"
|
+-- ID_et = " attributes"

ID_t and ID_et may be represented by the same chunk ID, only
distinguished by the data type ("structured" for <t> and "character"
for <et>).
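
How one chunk ID can serve both purposes may be sketched in C with an
in-memory model of the structure shown above. This is an illustration
only: struct node, the ID values and collect_text are hypothetical and
not part of the SDXF functions. The data type alone decides whether a
text chunk is entered or extracted.

```c
#include <stddef.h>
#include <string.h>

enum dtype { STRUCTURED, CHARACTER };

/* Hypothetical chunk IDs; the text tag and the elementary text tag
   deliberately share one ID. */
enum { ID_TEXT = 1, ID_ATTR = 2, ID_VALUE = 3 };

struct node {
    int id;
    enum dtype type;
    const char *text;          /* CHARACTER chunks only */
    const struct node *child;  /* STRUCTURED chunks only: first child */
    const struct node *next;   /* next sibling in the same list */
};

/* Append the text of all elementary text chunks in document order.
   'out' must hold a terminated string shorter than 'cap' bytes. */
void collect_text(const struct node *n, char *out, size_t cap)
{
    for (; n != NULL; n = n->next) {
        if (n->type == STRUCTURED) {
            collect_text(n->child, out, cap);        /* enter */
        } else if (n->id == ID_TEXT) {
            size_t room = cap - strlen(out) - 1;
            strncat(out, n->text, room);             /* extract */
        }
        /* other elementary chunks (the attribute value) are skipped */
    }
}

/* The example structure from the text above. */
static const struct node et3   = { ID_TEXT,  CHARACTER,  " attributes", NULL, NULL };
static const struct node et2   = { ID_TEXT,  CHARACTER,  "with", NULL, NULL };
static const struct node value = { ID_VALUE, CHARACTER,  "bold", NULL, &et2 };
static const struct node attr  = { ID_ATTR,  STRUCTURED, NULL, &value, &et3 };
static const struct node et1   = { ID_TEXT,  CHARACTER,  " this is a text ", NULL, &attr };
const struct node example_root = { ID_TEXT,  STRUCTURED, NULL, &et1, NULL };
```

Calling collect_text on example_root with an empty buffer gathers the
three elementary text chunks and leaves out the attribute value.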

Binary data such as pictures can be directly embedded into a SDXF
structure instead of referencing them by an external link as in HTML.

14. Author's Address

Max Wildgrube
Schlossstrasse 120
60486


EMail: max@wildgrube.

15. Acknowledgements

I would like to thank Michael J. Slifcak (mslifcak@iss.net) for the
supporting discussions.

16. References

[ASN.1] Information processing systems - Open Systems
Interconnection, "Specification of Abstract Syntax Notation
One (ASN.1)", International Organization for
Standardization, International Standard 8824, December
1987.

[BER] Information Processing Systems - Open Systems
Interconnection - "Specification of Basic Encoding Rules
for Abstract Syntax Notation One (ASN.1)", International
Organization for Standardization, International Standard
8825-1, December 1987.

[DEFLATE] Deutsch, P., "DEFLATE Compressed Data Format Specification
version 1.3", RFC 1951, May 1996.

[IANA] Internet Assigned Numbers Authority
http://www.iana.org/numbers.

[PER] Information Processing Systems - Open Systems
Interconnection - "Specification of Packed Encoding Rules
for Abstract Syntax Notation One (ASN.1)", International
Organization for Standardization, International Standard
8825-2.

[UCS] ISO/IEC 10646-1:1993. International Standard -- Information
technology -- Universal Multiple-Octet Coded Character Set
(UCS).

[UTF8] Yergeau, F., "UTF-8, a transformation format of ISO 10646",
RFC 2279, January 1998.

17. Full Copyright Statement

Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.



Funding for the RFC Editor function is currently provided by the
Internet Society.