# Friday, 24 June 2005

We’ve got some XML documents that are getting written out with way too many namespace declarations. That probably wouldn’t be much of a problem, except that we then use those documents as templates to generate other documents, many with repetitive elements, so we end up with namespace bloat. Scott and I found an example that was coming across the network at about 1.5 MB. That’s a lot, and a large part of it turned out to be namespace declarations. Because of the way XmlTextWriter does namespace scoping, it writes a namespace declaration on the first element where it sees that namespace, and the declaration goes out of scope again when that element closes. That means leaf nodes in a different namespace than their parent each get their own declaration, like this…

<?xml version="1.0" encoding="UTF-8"?>
<ns0:RootNode xmlns:ns0="http://namespace/0">
    <ns1:FirstChild xmlns:ns1="http://namespace/1">
        <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
        <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
        <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
        <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
        <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
        <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
        <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
        <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
    </ns1:FirstChild>
</ns0:RootNode>

With our actual namespace strings, that’s about 60 extra bytes per element that we don’t really need. What we’d like to see is the namespaces declared once at the top of the file and then referenced elsewhere, like this…

<?xml version="1.0" encoding="UTF-8"?>
<ns0:RootNode xmlns:ns0="http://namespace/0" xmlns:ns1="http://namespace/1" xmlns:ns2="http://namespace/2">
    <ns1:FirstChild>
        <ns2:SecondChild>Value</ns2:SecondChild>
        <ns2:SecondChild>Value</ns2:SecondChild>
        <ns2:SecondChild>Value</ns2:SecondChild>
        <ns2:SecondChild>Value</ns2:SecondChild>
        <ns2:SecondChild>Value</ns2:SecondChild>
        <ns2:SecondChild>Value</ns2:SecondChild>
        <ns2:SecondChild>Value</ns2:SecondChild>
        <ns2:SecondChild>Value</ns2:SecondChild>
    </ns1:FirstChild>
</ns0:RootNode>

When we edited the templates manually to achieve this effect, the 1.5 MB document dropped to about 660 KB. Much better.
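For what it’s worth, that manual edit could probably be automated as a post-processing step outside of XmlTextWriter. Here is a minimal sketch in Python (not .NET, and not what we actually run), assuming the document fits in memory and that the exact prefix names don’t matter. The `BLOATED` string and the `hoist_namespaces` helper are made up for illustration. It leans on the fact that Python’s `xml.etree.ElementTree` serializer writes all namespace declarations on the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one of the templates, mirroring the first example.
BLOATED = (
    '<ns0:RootNode xmlns:ns0="http://namespace/0">'
    '<ns1:FirstChild xmlns:ns1="http://namespace/1">'
    + '<ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>' * 8
    + '</ns1:FirstChild>'
    '</ns0:RootNode>'
)


def hoist_namespaces(xml_text: str) -> str:
    """Re-serialize a document so each namespace is declared once, on the root.

    ElementTree gathers every namespace used by element and attribute names
    and writes the declarations on the root element only. Prefixes are
    regenerated (ns0, ns1, ...) unless they were registered beforehand, so
    this sketch assumes the original prefix names need not be preserved.
    """
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


hoisted = hoist_namespaces(BLOATED)
print(len(BLOATED), "->", len(hoisted), "characters")
print(hoisted.count("xmlns:"), "namespace declarations remain")
```

Two caveats: the default parser drops comments and processing instructions, and the XML declaration is not written back with `encoding="unicode"`, so a real version would need to handle both if the templates depend on them.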

There doesn’t seem to be any way to get XmlTextWriter to do this, however.  Even if you explicitly write out the extra namespaces on the root element, you still get them everywhere, since the writer sees those as just attributes you chose to write, and not namespace declarations. 

Curses!  I’ve spent all day on this and have no ideas.  Anyone have any input?
