[Nym3-devel] Alternative packet format.

Laurent Fousse laurent at komite.net
Sat Nov 19 18:45:37 CET 2005


Hello list,

As a followup to a recent discussion on irc, I'd like to put the
current thoughts about an alternative packet format on the list.

It all started when we wanted to add a means for the server to tell the
nymuser which email domain the account is hosted on. That's when we
realized the packet format is not nearly extensible enough: any change
(like adding a field to the CREATED message) results in an
incompatible packet format. This calls for a more
flexible format.

The contenders for a new packet format were ad-hoc dialects of XML or
YAML[1]. XML wastes bytes with starting/ending tags, YAML wastes less,
but this is mostly a non-issue since everything is bound to be gzip'ed
at the mixminion layer.

From the anonymity POV, however, XML, like YAML, allows several
representations of the same data (if only through insignificant
whitespace in XML). Where users have a choice of representation they
can be segmented by that choice, and this is bad. I could come up with
a constrained version of YAML that allows only one canonical
representation of a given data structure, but this will probably not
be used because Nick proposed SEXP[2].

As Nick puts it:

<quote>
   It has the advantages of
	1) being even easier to parse and
	2) having a "canonical" representation [to avoid distinguishability
	   attacks]. 
   It has the disadvantages of
	A) not directly representing non-tree data structures (but we
	   don't care about that)
	B) being octet-oriented, and as such requiring a little
	   thought about unicode issues (but that thought needs to
	   happen anyway)
	C) Not mapping directly to python objects as well as yaml.
    As a concrete example, writing a non-recursive parser/generator
    for canonical-form sexp took about 25 minutes, 100 lines and had
    no bugs.
</quote>
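
To make the canonical form concrete, here is a rough sketch of a
non-recursive generator/parser in Python. It is not Nick's code, which
was not posted; it only assumes the canonical form described in the
SEXP spec[2], where an atom is written as <decimal length>:<octets> and
a list is wrapped in parentheses. The function names and the "domain"
field in the usage note are made up for illustration.

```python
def encode(obj):
    """Serialize bytes / nested lists of bytes into canonical sexp form."""
    out = []
    # Explicit stack of iterators instead of recursion. The outermost
    # iterator is only a wrapper around the single top-level value.
    stack = [iter([obj])]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()
            if stack:  # we just finished a real list, not the wrapper
                out.append(b")")
            continue
        if isinstance(item, bytes):
            out.append(str(len(item)).encode("ascii") + b":" + item)
        elif isinstance(item, list):
            out.append(b"(")
            stack.append(iter(item))
        else:
            raise TypeError("only bytes and lists are supported")
    return b"".join(out)


def decode(data):
    """Parse one canonical sexp; reject anything that is not canonical."""
    stack = [[]]  # stack[0] collects the top-level value
    i = 0
    while i < len(data):
        c = data[i:i + 1]
        if c == b"(":
            stack.append([])
            i += 1
        elif c == b")":
            if len(stack) < 2:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
            i += 1
        elif c.isdigit():
            colon = data.index(b":", i)  # ValueError if there is no ':'
            digits = data[i:colon]
            # A length such as "03" would be a second spelling of "3".
            if not digits.isdigit() or (len(digits) > 1 and digits[:1] == b"0"):
                raise ValueError("non-canonical length prefix")
            n = int(digits.decode("ascii"))
            start = colon + 1
            atom = data[start:start + n]
            if len(atom) != n:
                raise ValueError("truncated atom")
            stack[-1].append(atom)
            i = start + n
        else:
            raise ValueError("unexpected octet at offset %d" % i)
    if len(stack) != 1 or len(stack[0]) != 1:
        raise ValueError("expected exactly one complete expression")
    return stack[0][0]
```

With this sketch, encode([b"CREATED", [b"domain", b"example.org"]])
gives b"(7:CREATED(6:domain11:example.org))", and decode() of those
octets gives the original list back. Since the parser refuses
whitespace and zero-padded lengths, every value has exactly one
accepted encoding.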

We can solve A) by encoding a python dictionary as a list of length-2
lists, possibly with a "start-dict" marker, if we need to. Or we could
say that we don't need dictionaries, and that we solve the issue by
specifying positional parameters in the format. This solves C) too,
because SEXP deals with lists and raw octet tokens, and if all we need
in the format is lists and "strings", then SEXP maps very well to
python objects.
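
The dictionary idea could look roughly like the sketch below. Nothing
here is decided: the function names are invented, the "start-dict"
marker is left out, and sorting the keys is an extra assumption of
mine, added so that one dictionary cannot be written in two different
orders (which would bring back the distinguishability problem).

```python
def dict_to_pairs(d):
    """Turn {key: value} into [[key, value], ...] with keys in sorted order.

    Sorting is an assumption, not part of any agreed format: it keeps the
    encoding of a given dictionary unique.
    """
    return [[key, d[key]] for key in sorted(d)]


def pairs_to_dict(pairs):
    """Inverse of dict_to_pairs; rejects unsorted or duplicate keys."""
    result = {}
    previous = None
    for pair in pairs:
        if not isinstance(pair, list) or len(pair) != 2:
            raise ValueError("every entry must be a length-2 list")
        key, value = pair
        if previous is not None and key <= previous:
            raise ValueError("keys must be strictly increasing")
        result[key] = value
        previous = key
    return result


def to_canonical(obj):
    """Tiny recursive canonical-sexp writer, only here to show the octets."""
    if isinstance(obj, bytes):
        return str(len(obj)).encode("ascii") + b":" + obj
    return b"(" + b"".join(to_canonical(item) for item in obj) + b")"
```

For example, to_canonical(dict_to_pairs({b"nym": b"alice", b"domain":
b"example.org"})) gives b"((6:domain11:example.org)(3:nym5:alice))"
whatever order the dictionary was built in. With positional parameters
instead, the same message would just be a flat list and none of this
would be needed.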

B) is about specifying the encoding for the string values in the
format. Do we need more than ASCII? It seems to me that for most
parameters non-ASCII values would be invalid in any case (for the nym
username for example, since it maps to an email address). Allowing
UTF-8 encoding could lead to a distinguishability issue: I can't even
produce most of the characters that can be UTF-8 encoded!

If we want to allow more than ASCII nonetheless, should we let the
users specify the encoding once at the start of the control
message? Or maybe §4.6 in the SEXP specs can be used for a per-value
choice of encoding. (For the moment I'd say ASCII is good enough.)
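
If we go with ASCII only, the check on each string value is trivial.
A sketch, with an invented function name, assuming "ASCII" simply
means every octet is below 0x80 (whether control characters should
also be refused is a separate question):

```python
def check_ascii_value(value):
    """Return value unchanged if it is ASCII-only bytes, else raise."""
    if not isinstance(value, bytes):
        raise TypeError("string values are raw octets (bytes)")
    # Assumption: plain 7-bit ASCII, control characters included.
    if any(octet >= 0x80 for octet in value):
        raise ValueError("non-ASCII octet in string value")
    return value
```

check_ascii_value(b"alice") passes, while the UTF-8 octets of a name
with an accented letter are rejected.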

[1] http://yaml.org/
[2] http://theory.lcs.mit.edu/~rivest/sexp.html