reworking to a slightly more concise syntax:
[argument name="arg0" value="value"]
[argument name="arg1" value="value"]
[argument name="arg0" value="value"]]
[argument name="arg0" value="anothervalue"]]]]
so, yeah, not a huge difference...
a bigger saving (for performance) comes from eliminating things like free-floating text, omitting support for full namespaces, and adding support for explicit numeric values (this is essentially what my XML-based compiler AST formats did, though they retained the normal external syntax). also sometimes useful are options for encoding raw binary data (in ASCII form, generally dumped out as Base64 or a Base85 variant). a lot here depends on the exact in-memory node representation, ...
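as a rough illustration of the binary-as-ASCII option, here is a minimal sketch using Python's standard `base64` module. the helper name `encode_blob` and the sample payload are made up for the example, and the `b85encode` alphabet is just one Base85 variant, not necessarily the one my formats used:

```python
import base64

def encode_blob(raw: bytes) -> dict:
    """Return ASCII encodings of a binary payload (hypothetical helper)."""
    return {
        "base64": base64.b64encode(raw).decode("ascii"),
        "base85": base64.b85encode(raw).decode("ascii"),
    }

raw = bytes(range(16))          # arbitrary 16-byte sample payload
encoded = encode_blob(raw)

# both forms decode back to the original bytes
assert base64.b64decode(encoded["base64"]) == raw
assert base64.b85decode(encoded["base85"]) == raw

# Base64 spends 4 chars per 3 bytes, Base85 spends 5 chars per 4 bytes
print(len(raw), len(encoded["base64"]), len(encoded["base85"]))
```

the tradeoff is the usual one: Base85 is a bit denser, but uses more punctuation characters, which may need escaping depending on the surrounding syntax.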
as for reducing size (via compression/serialization), there are a few options:
XML+Deflate, which is relatively straightforward, and compresses fairly well, but is slightly more expensive to encode/decode;
WBXML, which basically works, but has some limitations, and results in bigger files than XML+Deflate;
EXI, which I never got around to fully evaluating; it compresses well, but I found the spec difficult to make much sense of;
I also had a few of my own variants; one example was SBXE, which was a "slightly improved" alternative to WBXML (slightly more compact, and with more features).
SBXE+Deflate was generally slightly more compact than XML+Deflate, but the difference was fairly small.
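for reference, the XML+Deflate option is about as simple as it sounds. a minimal sketch with Python's `zlib` follows; the sample document is invented for the example, and note that `zlib.compress` emits a zlib-wrapped Deflate stream (small header plus checksum) rather than a bare Deflate stream:

```python
import zlib

# made-up sample document with lots of repeated tag/attribute names
xml_text = (
    "<args>"
    + "".join(f'<argument name="arg{i}" value="value"/>' for i in range(100))
    + "</args>"
)
raw = xml_text.encode("utf-8")

packed = zlib.compress(raw, 9)      # level 9 = best compression, slowest

# decompression restores the exact original text
assert zlib.decompress(packed) == raw

print(len(raw), len(packed))
```

highly repetitive markup like this is the easy case for Deflate; how much a binary format gains on top of it depends a lot on the data.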
another was related to the (never fully implemented or used) XML-coding mode of my "BSXRP" protocol, which would have used Huffman compression and VLC coding for values. (as-is, the protocol is mostly used for encoding S-Expression like data...).
both formats were loosely based (in concept) on LZP: the encoding tries to predict the following tag or attribute (based on recent history), allowing this case to be coded more efficiently (and without depending on a schema). should this prediction fail, there is the option of reusing a recently-coded value, or (as needed) explicitly encoding the tag or attribute name (as a string). SBXE used an LZP variant for strings, whereas BSXRP used LZ77 (and an otherwise Deflate-like data representation).
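the predict / reuse / explicit-string fallback can be sketched roughly as below. this is not the actual SBXE or BSXRP bitstream, just an illustration of the idea: the one-tag context, the `MRU_SIZE` window, and the `P`/`R`/`S` symbol names are all assumptions made for the example, and a real coder would then entropy-code or bit-pack these symbols.

```python
MRU_SIZE = 16  # hypothetical size of the "recently coded" window

class TagModel:
    """State shared by encoder and decoder; both update it identically."""
    def __init__(self):
        self.predict = {}   # previous tag -> tag that followed it last time
        self.recent = []    # most-recently-used tags, newest first
        self.prev = None

    def update(self, tag):
        self.predict[self.prev] = tag
        if tag in self.recent:
            self.recent.remove(tag)
        self.recent.insert(0, tag)
        del self.recent[MRU_SIZE:]
        self.prev = tag

def encode_tags(tags):
    model, out = TagModel(), []
    for tag in tags:
        if model.predict.get(model.prev) == tag:
            out.append(("P",))                          # prediction hit
        elif tag in model.recent:
            out.append(("R", model.recent.index(tag)))  # reuse recent value
        else:
            out.append(("S", tag))                      # explicit string
        model.update(tag)
    return out

def decode_tags(symbols):
    model, tags = TagModel(), []
    for sym in symbols:
        if sym[0] == "P":
            tag = model.predict[model.prev]
        elif sym[0] == "R":
            tag = model.recent[sym[1]]
        else:
            tag = sym[1]
        tags.append(tag)
        model.update(tag)
    return tags

tags = ["args", "argument", "argument", "argument", "args", "argument"]
symbols = encode_tags(tags)
assert decode_tags(symbols) == tags   # round-trips without any schema
```

the point is that only the first occurrence of each name costs a full string; after that, names are coded either as a prediction hit or as a small index.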