# Thursday, June 30, 2005

I finally solved the namespace issue I was having, although I’ll probably burn for all eternity for the solution.  In short, because of the behavior of XmlTextWriter, the only solution that could be implemented in a reasonable amount of time was to post-process the XML and strip out the extra namespace declarations. 

So I started down the path of using XmlTextReader to spin through and collect up all the namespaces that I needed, then add those to the root node.  After that I could use a regular expression to strip out all the unneeded ones.  Turns out I had overlooked the fact that the input isn’t guaranteed to be well-formed XML.  :-(

The “XML” is actually a template that our system uses to do some tag replacement.  So the output of that process is well-formed, but the input can contain the “@” character inside element names.  A no-no according to the XML spec. 

So here it is, the all-regular-expression solution.  I wouldn’t suggest you try this at home, but it does actually work, and seems to be quite fast (under a quarter of a second for a 1.5MB input, and the typical input is more like 10KB). 

Note: this is made a little simpler because I know (since I just wrote out the “XML”) that all the namespace prefixes we care about start with ns, e.g. ns0, ns1, etc.

    #region begin hairy namespace rectifying code here
    //this is necessary because the XmlTextWriter puts in more namespace
    //declarations than we want, which causes file bloat.

    //matches declarations like xmlns:ns0="..." (\d+ so that ns10 and up match too)
    Regex strip = new Regex(@"xmlns\:ns\d+=""[^""]*""");
    ArrayList names = new ArrayList();
    MatchCollection matches = strip.Matches(result);
    foreach(Match match in matches)
    {
        string val = match.Value;
        //collect each distinct declaration once
        if(!names.Contains(val))
        {
            names.Add(val);
        }
    }

    string fixedNamespaces = null;
    StringBuilder sb = new StringBuilder();
    foreach(string name in names)
    {
        sb.AppendFormat(" {0}",name);
    }

    fixedNamespaces = result;

    int pos = fixedNamespaces.IndexOf(">",0);//should be the end of the xml declaration

    //strip every declaration after the xml declaration, including any on the
    //root node, so the root doesn't end up with duplicate attributes
    fixedNamespaces = strip.Replace(fixedNamespaces, "", -1, pos);

    pos = fixedNamespaces.IndexOf(">",pos+1);//should be end of root node.

    //write each declaration exactly once, on the root node
    result = fixedNamespaces.Insert(pos,sb.ToString());
    #endregion
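
To see the idea end to end, here is a minimal, self-contained sketch of the same regex approach wrapped in a method.  The names `NamespaceHoister` and `Hoist` are made up for illustration, and it assumes (as above) that the input starts with an XML declaration and that every prefix we care about looks like ns0, ns1, and so on.  It strips every matching declaration after the XML declaration and then writes the collected set once on the root node, which is one possible arrangement rather than the exact production code.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not the production code.
public static class NamespaceHoister
{
    // Matches declarations such as xmlns:ns0="..." (prefixes assumed to be "ns" + digits).
    private static readonly Regex Declaration = new Regex(@"xmlns:ns\d+=""[^""]*""");

    public static string Hoist(string xml)
    {
        // Collect each distinct declaration once, in document order.
        List<string> names = new List<string>();
        foreach (Match match in Declaration.Matches(xml))
        {
            if (!names.Contains(match.Value))
            {
                names.Add(match.Value);
            }
        }

        StringBuilder sb = new StringBuilder();
        foreach (string name in names)
        {
            sb.AppendFormat(" {0}", name);
        }

        // Assumes the input starts with an XML declaration followed by the root node.
        int declEnd = xml.IndexOf(">", 0);

        // Strip every declaration after the XML declaration, root node included.
        string stripped = Declaration.Replace(xml, "", -1, declEnd);

        // Write the collected declarations once, just before the ">" that ends the root tag.
        int rootEnd = stripped.IndexOf(">", declEnd + 1);
        return stripped.Insert(rootEnd, sb.ToString());
    }
}
```

Calling `NamespaceHoister.Hoist(result)` on a document with repeated declarations leaves one copy of each on the root node.  Each stripped declaration leaves a stray space inside its tag, which is harmless to a parser and far cheaper than the declaration it replaces.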



Thursday, June 30, 2005 12:53:56 PM (Pacific Daylight Time, UTC-07:00)
# Friday, June 24, 2005

We’ve got some XML documents that are getting written out with way too many namespace declarations.  That probably wouldn’t be too much of a problem, except we then use those XML documents as templates to generate other documents, many with repetitive elements.  So we’re ending up with namespace bloat.  Scott and I found an example that was coming across the network at about 1.5MB.  That’s a lot.  A large part of that turned out to be namespace declarations.  Because of the way XmlTextWriter does namespace scoping, it doesn’t write out a namespace declaration until the first element that uses it, and that declaration goes out of scope when the element closes.  That means for leaf nodes with a different namespace than their parent node, you end up with a namespace declaration on every element, like this…

    <?xml version="1.0" encoding="UTF-8"?>
    <ns0:RootNode xmlns:ns0="http://namespace/0">
        <ns1:FirstChild xmlns:ns1="http://namespace/1">
            <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
            <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
            <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
            <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
            <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
            <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
            <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
            <ns2:SecondChild xmlns:ns2="http://namespace/2">Value</ns2:SecondChild>
        </ns1:FirstChild>
    </ns0:RootNode>


With our actual namespace strings, that’s roughly an additional 60 bytes per element that we don’t really need.  What we’d like to see is the namespaces declared once at the top of the file, then referenced elsewhere, like this…

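    <?xml version="1.0" encoding="UTF-8"?>
    <ns0:RootNode xmlns:ns0="http://namespace/0" xmlns:ns1="http://namespace/1" xmlns:ns2="http://namespace/2">
        <ns1:FirstChild>
            <ns2:SecondChild>Value</ns2:SecondChild>
            <ns2:SecondChild>Value</ns2:SecondChild>
            <ns2:SecondChild>Value</ns2:SecondChild>
            <ns2:SecondChild>Value</ns2:SecondChild>
            <ns2:SecondChild>Value</ns2:SecondChild>
            <ns2:SecondChild>Value</ns2:SecondChild>
            <ns2:SecondChild>Value</ns2:SecondChild>
            <ns2:SecondChild>Value</ns2:SecondChild>
        </ns1:FirstChild>
    </ns0:RootNode>
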
When we edited the templates manually to achieve this effect, the 1.5MB document went down to about 660KB.  Much better.

There doesn’t seem to be any way to get XmlTextWriter to do this, however.  Even if you explicitly write out the extra namespaces on the root element, you still get them everywhere, since the writer sees those as just attributes you chose to write, and not namespace declarations. 
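
For anyone who wants to see the bloat for themselves, here is a minimal sketch that reproduces it.  It assumes the elements are written with the prefix / local name / namespace overload of WriteStartElement, which may not match how our code writes them, and the class name `WriterBloatDemo` is made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical demo class; it only reproduces the repeated declarations.
public static class WriterBloatDemo
{
    public static string Write(int childCount)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(sw);

        writer.WriteStartElement("ns0", "RootNode", "http://namespace/0");
        writer.WriteStartElement("ns1", "FirstChild", "http://namespace/1");
        for (int i = 0; i < childCount; i++)
        {
            // ns2 is not in scope on the parent, so each sibling gets its own declaration.
            writer.WriteStartElement("ns2", "SecondChild", "http://namespace/2");
            writer.WriteString("Value");
            writer.WriteEndElement();
        }
        writer.WriteEndElement(); // FirstChild
        writer.WriteEndElement(); // RootNode
        writer.Flush();
        return sw.ToString();
    }
}
```

`WriterBloatDemo.Write(8)` produces one ns2 declaration per SecondChild element, which is the pattern in the first listing above.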

Curses!  I’ve spent all day on this and have no ideas.  Anyone have any input?

Work | XML
Friday, June 24, 2005 2:53:54 PM (Pacific Daylight Time, UTC-07:00)

Vikki and I went to see Batman Begins last weekend up in Seattle, and really enjoyed it.  I was pondering the phenomenon that is Batman during the movie, and started thinking that Batman has become such an iconic figure in our contemporary mythos that it really frees the director.  It’s like making a Robin Hood movie.  You don’t have to worry about telling the story, because everyone already knows the story.  So the director can focus on the details. 

Christian Bale was fantastic as the brooding playboy-without-conscience who beats up bad guys in his spare time.  He really brought a lot of detail to the character, and you can really start to understand what kind of guy Bruce Wayne would have to be to become Batman. 

Great supporting cast too.  Liam Neeson makes a great villain.  Good, atmospheric physical culture.  They did a good job of bringing the brooding Gothic/Art Deco style of Gotham into the modern age.  Definitely worth seeing. 

Friday, June 24, 2005 2:25:36 PM (Pacific Daylight Time, UTC-07:00)
# Wednesday, June 15, 2005
The New York Times has a very positive review (reg. required) of Batman Begins.  So maybe there is hope.  It’s amazing what you can do with a director who cares, and some actors who can really act.  I’ve been a fan of Christian Bale ever since he was “Falstaff’s Boy” in Henry V.  Maybe I’ll get a chance to see it this weekend…
Wednesday, June 15, 2005 3:03:35 PM (Pacific Daylight Time, UTC-07:00)
# Tuesday, June 14, 2005

I haven’t seen too many new films lately, but here’s a quick rundown on a few…

  • Hitchhiker’s Guide to the Galaxy: if you’ve never read the books or seen the BBC series, not a bad film.  If you are hoping that it will represent the genius of the book (or even the BBC series) forget it.  Yet another terrible adaptation to the screen.  But in its own right I thought it had its moments.  My biggest complaint was that they lost many of the great linguistic jokes Adams was so good at and replaced them with slapstick pie-in-the-face antics.  Still amusing, but nowhere near as satisfying.
  • Merchant of Venice: totally fell asleep.  The parts I did see seemed to be a bit overacted.  I’ll have to try to watch the whole thing and see what I think.
  • Elektra: in short, it blew.  Very disappointing.  I’m a big fan of Alias, so I was hoping for more.  The story line was so disjointed that it was hard to follow, but intrusive enough to spoil the martial bits.  Yuck!
  • A Series of Unfortunate Events: one of the better ones I’ve seen of late.  My kids really liked it too.  Funny both intellectually and in typical over-the-top Jim Carrey style.  He was fantastic, as were many of the character parts.  The children also did very well.  They did a great job of maintaining the ambiance.  Easily accessible for children, but plenty there for the grown-ups.
  • Angel (Season 3): I started re-watching Season 3, and I think this is the one where they were really firing on all cylinders.  They lost their way a bit in Season 4, but 3 was great.  All the characters had settled into their parts, the story arc was good, and hadn’t gotten too wacky.  Introduces some great bit characters, like Skip the Demon, who commutes to his job in Hell.  Good stuff.

I’d like to see Mr. & Mrs. Smith, as it seems to be getting some good reviews.  Maybe this weekend…

Tuesday, June 14, 2005 4:27:56 PM (Pacific Daylight Time, UTC-07:00)
The new 2.0 version of MaxiVista is out, and as usual it kicks complete a**.  It supports additional monitors, so I’ve got it running across 3 right now.  Plus, the new remote control feature is fantastic for times when you need to get at your other machines but don’t want to have to shuffle keyboards, etc.  If you are doing any debugging in VS.NET, you owe it to yourself to run on at least two monitors.  I’ve found it better than one big one.
Tuesday, June 14, 2005 4:15:01 PM (Pacific Daylight Time, UTC-07:00)
# Monday, June 13, 2005

Sigh.  It’s a constant battle.  I knew full well that XmlSerializer leaks temp assemblies if you don’t use the right constructor.  (The one that takes only a type will cache internally, so it’s not a problem.)  And then I went and wrote some code that called one of the other constructors without caching the resulting XmlSerializer instances. 

The result:  one process I looked at had over 1,500 instances of System.Reflection.Assembly on the heap.  Not so good. 

The fix?  Not as simple as I would have hoped.  The constructor that I’m using takes the Type to serialize, and an XmlRootAttribute instance.  It would be nice to be able to cache the serializers based on that XmlRootAttribute, since that’d be simple and all.  Unfortunately, two instances of XmlRootAttribute with the same parameters return different values from GetHashCode(), so it’s not that easy.  I ended up using a string key compounded from the type’s full name and the parameters I’m using on XmlRootAttribute.  Not the most beautiful, but it’ll work.  Better than having 1,500 temp assemblies hanging around.
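
A rough sketch of that kind of cache might look like the following.  The names (`SerializerCache`, `Get`) and the choice of root element name plus namespace as the XmlRootAttribute parameters are assumptions for illustration, not the actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Serialization;

// Hypothetical cache for illustration; names are not from the original code.
public static class SerializerCache
{
    private static readonly Dictionary<string, XmlSerializer> Cache =
        new Dictionary<string, XmlSerializer>();
    private static readonly object Sync = new object();

    public static XmlSerializer Get(Type type, string rootName, string rootNamespace)
    {
        // Compound string key: the type's full name plus the XmlRootAttribute parameters in use.
        string key = type.FullName + "|" + rootName + "|" + rootNamespace;

        lock (Sync)
        {
            XmlSerializer serializer;
            if (!Cache.TryGetValue(key, out serializer))
            {
                XmlRootAttribute root = new XmlRootAttribute(rootName);
                root.Namespace = rootNamespace;

                // This constructor generates a new temp assembly on every call,
                // so it should run only once per distinct key.
                serializer = new XmlSerializer(type, root);
                Cache[key] = serializer;
            }
            return serializer;
        }
    }
}
```

Callers then ask for `SerializerCache.Get(typeof(Order), "Order", "http://example.com/orders")` (a made-up type and namespace) instead of constructing a serializer each time.  If other XmlRootAttribute properties vary, they would need to go into the key as well.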

Work | XML
Monday, June 13, 2005 4:09:22 PM (Pacific Daylight Time, UTC-07:00)
# Wednesday, May 25, 2005
Wired has a short piece on hams and hamfests that’s a good read for those not familiar with the subject.  I love to see the word übernerd in print. :-)
Wednesday, May 25, 2005 9:55:51 AM (Pacific Daylight Time, UTC-07:00)
# Monday, May 23, 2005

Several times recently I’ve been bitten by “code-freeze”.  Granted, I’ve been bitten because I wasn’t tracking the “freeze” closely enough, my bad.  But there are alternatives to this (in my mind) counterproductive process.  Developers need to be free to check in changes.  They fix bugs, or add new features, and need to get those things checked in.  Their dev machine could collapse catastrophically, they could forget to check them in, or the fixes could become irrelevant if they wait too long.  Plus, the farther you get away from the point where you made the changes, the more likely you are to get merge conflicts.  If the “freeze” lasts too long, multiple people may make changes to the same files, so that when the freeze is lifted, merge conflicts ensue, and that’s not fun for anyone.  Not good for continuous integration.  Code freezes pretty much take the continuous right out of it.

The solution, you ask?  Labels.  I think most people don’t realize how many groovy things you can do with good labeling policy.  Part of this stems from the widespread use of Visual SourceSafe.  If you are still using VSS, stop.  VSS and continuous integration just don’t belong in the same sentence.  Labeling doesn’t work right, branches are hard and have lasting consequences, etc.  The common excuse is “it comes for ‘free’ with my development environment”.  Of course, it’s not really “free” but that’s another issue.  Even if it were, there are several (much better) free alternatives.  CVS and Subversion top the list.  Much better choices, since they do labeling and branching right, and they discourage the exclusive check-out, which is pure evil anyway.  (Can you tell I have an opinion or two on this topic?)

What I have done (quite successfully) in the past is to use labels to “freeze” the tree rather than actually preventing people from making check-ins.  It runs something like this…  Every developer is responsible for checking in changes, and for deciding when those changes are stable.  It’s a common mistake to assume that the tip or HEAD of your source tree must always be pristine.  That keeps people from checking things in when they should.  So, decide on a label (like “BUILDABLE”) that represents that stable state.  That way developers can continue to check in changes, even potentially destabilizing ones, on the tip without causing havoc.  When the developer decides that his/her changes are stable, he or she can move the label out to encompass those changes. 

How does this work with continuous integration?  Your build server always builds from the label you’ve chosen to represent stability.  Not from the tip.  That way you should always get the latest stable build, even if that doesn’t represent the latest changes.  When it’s done successfully building, the build server should label whatever it just built with something like “BUILT” and the time/date (in another label).  Remember, each revision can have as many labels as you need, so extras don’t hurt. 

Another big benefit of this process is that if someone is just joining the team, or needs to build the software for the first time, they can pull from the BUILT label, and be assured that they are getting the most recent version that is known to be buildable. 

The end result is that you have two labels that move, leapfrogging each other as new things get marked as stable and then built, and some that don’t move indicating what was successfully built for each date/time/version.  That’s good information to have.  It’s also good for developers to be able to freely check in changes without having to worry about “freezes” or breaking the build.

Give it a shot.  It’s work up front, and requires a little training and discipline, but you’ll be glad you made the effort.

Monday, May 23, 2005 10:58:15 AM (Pacific Daylight Time, UTC-07:00)
# Friday, May 20, 2005

By now, no doubt, the three of you who read this have heard plenty of stuff about Episode III already, but I feel compelled to add my $.02.

It’s pretty darn good.

One of the “great films”?  No.

Entertaining?  Yes.  Sucks less?  Yes.  In fact, didn’t suck at all. 

It would have been an added bonus if there had been any acting by the principals.  I would have found Anakin’s descent into darkness way more plausible if he had even a shred of acting ability.  Sadly, he does not.  But I found I could suspend that bit of disbelief. 

Visually stunning, brings all the loose ends together in a way that’s plausible, no annoying comic relief in sight.  Very textured environments. 

Again, it would have been nice if the best acting in the film hadn’t been done by Yoda, but what can you realistically expect from Lucas?

Definitely worth seeing.  And just in case you’ve missed out, check out Darth Vader’s blog at www.darthside.com.  Genius.

Friday, May 20, 2005 10:13:59 AM (Pacific Daylight Time, UTC-07:00)