Friday, July 23, 2004

Back in the day I went to high school in what passes for "inner city" in Seattle (Garfield HS: Go Bulldogs!) and was therefore exposed to a pretty fascinating cultural milieu. :-)  Anyway, out of pure nostalgia for days bygone, I couldn't resist picking up copies of Breakin' (1 and 2) on DVD.  While not great pieces of cinematic history, they sure are fun to watch.  My 9-year-old son is totally into them.  He says #1 is better.

I wonder whatever happened to the lovely Lucinda Dickey?  Her career just wasn't the same after Ninja III: The Domination.

Now I just need a copy of Krush Groove. :-D

Friday, July 23, 2004 9:53:36 PM (Pacific Daylight Time, UTC-07:00)  #    Disclaimer  |  Comments [1]  | 

This should be totally obvious to those with XML experience, but to those who don't fall into that category, keep in mind that it's of utmost importance to not mix data and meta-data when designing your XML.  For example, when creating an XML document for a purchase order, I've often seen stuff like

<PO>
 <lineItems>
  <item1>
    <name>widget x</name>
    <price>$4.99</price>
  </item1>
  <item2>
    <name>widget z</name>
    <price>$.99</price>
  </item2>
 </lineItems>
</PO>

This is what I mean by mixing data and meta-data.  By naming elements "item1" and "item2" you've mixed data (ordinal values "1" and "2") with meta-data (the description "item").  Now when you go to write a schema to match this document, what do you do?  Explicitly name elements item1 and item2?  What happens when you get a PO with 3 items?  You're screwed.

Again, to those who are used to working with XML, this is readily apparent, but I found out from the class I taught this summer that it isn't obvious to everyone.  A much better solution would be something like

<PO>
 <lineItems>
  <item number="1">
    <name>widget x</name>
    <price>$4.99</price>
  </item>
  <item number="2">  <!--[Update:] fixed.  Thanks Haacked-->
    <name>widget z</name>
    <price>$.99</price>
  </item>
 </lineItems>
</PO>

In that case the meta-data is properly separated.  In this particular case, you don't actually have to specify the number at all, since the elements are inherently ordered, but you get the idea.
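
To make the payoff concrete, here is a minimal sketch of reading the better document.  I'm using Python's standard xml.etree.ElementTree purely for illustration (the language, the function name read_line_items and the tuple layout are my assumptions, not part of any real PO system).  The point is that one query for "item" handles a PO with 2 items or 200.

```python
import xml.etree.ElementTree as ET

# The improved purchase order from above: every line item uses the same element name.
PO_XML = """
<PO>
 <lineItems>
  <item number="1">
    <name>widget x</name>
    <price>$4.99</price>
  </item>
  <item number="2">
    <name>widget z</name>
    <price>$.99</price>
  </item>
 </lineItems>
</PO>
"""


def read_line_items(xml_text):
    """Return a (number, name, price) tuple for every <item>, however many there are."""
    root = ET.fromstring(xml_text)
    items = []
    # One path expression matches all items because the ordinal lives in an
    # attribute (data) rather than in the element name (meta-data).
    for item in root.findall("./lineItems/item"):
        items.append((int(item.get("number")), item.findtext("name"), item.findtext("price")))
    return items


ITEMS = read_line_items(PO_XML)
```

With the item1/item2 naming, the same code would have to guess element names in a loop until one came back missing, which is exactly the kind of thing a schema can't describe either.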

Remember, friends don't let friends write bad XML.

Friday, July 23, 2004 8:10:29 PM (Pacific Daylight Time, UTC-07:00)

Our continuous integration efforts came crashing down around our ears yesterday.  Our build started failing on Wednesday evening, after someone checked in a ton of changes all at once (not very continuous, I know).  That turned into a snowball of issues that took Scott and me all day yesterday to unravel.  Even after we'd fixed all the consequences of such large code changes (that was done maybe by noon) things continued not to work.  To make matters more dicey, at some point during the day our CVS server went TU, and had to be rebooted.  Builds continued to fail, mostly with weird timeout problems.  We were also seeing lots of locks being held open, which was slowing down the build, contributing to the timeouts, etc.

We ended up building manually just to get a build out, which was suboptimal but necessary. 

Thankfully, Scott was able to track down the problem last night.  Turns out that when the CVS box went down, our build server was in the middle of a CVS "tag" operation, a.k.a. labeling.  That left a bunch of crud on the CVS box that meant that subsequent tagging operations failed miserably, thus causing the timeouts, etc.  A few well placed file deletions on the CVS server cleaned things up, and we're (relatively) back to normal now. 

While I think that continuous integration is a fabulous idea, it's times like these that bring home just how many things have to go right at the same time for it to work.  What a tangled web we weave. :-)

Friday, July 23, 2004 5:54:00 PM (Pacific Daylight Time, UTC-07:00)
 Monday, July 19, 2004

Just in case I wasn't a big enough geek before, I'm now officially KE7BJG.  That's right.  A licensed amateur radio operator (Technician class). 

I got interested in the idea of amateur radio through the emergency responder training I had last year.  I'd previously had no idea, but it turns out that in times of disaster/emergency, hams are instrumental in providing emergency communications through programs such as ARES and RACES.  That's a pretty important service, and I decided I'd like to be able to help out. 

In the process of studying for the licensing exam, I found out that the whole art and science of radio wave propagation is pretty darned fascinating. 

Ah well, it's not like anyone didn't know I was a nerd before :-).  My wife just shakes her head and sighs.  She says if I ever put up a tower in the backyard for antennas it's all over between us. 

Monday, July 19, 2004 6:33:16 PM (Pacific Daylight Time, UTC-07:00)
 Thursday, July 15, 2004

I finished up my Web Services Theory class at OIT last night.  Just the final to go on Monday (mwah ha ha).

We ended with some stuff on WS-* and all the various specs.  I tried to spend minimal time on the actual syntax of WS-*, since some of them are pretty hairy, and spent more time on the business cases for WS-*.  That seemed to go over pretty well.  I think it's easier to understand the business case for why we need WS-Security than it is to understand the spec itself.  Unfortunately, one of the underlying assumptions about all the GXA/WS-* specs is that eventually they will just fade into the background, and you'll never see the actual XML, since some framework piece (like WSE 2.0) will just "take care of it" for you.  What that means is that the actual XML can be pretty complex.  The unfortunate part is that we don't have all those framework bits yet, so we have to deal with all the complexity ourselves.  Thankfully more tools like WSE 2 are available to hide some of that from the average developer.  On the other hand, I'm a great believer in taking the red pill and understanding what really goes on underneath our framework implementations.

Thursday, July 15, 2004 11:40:53 PM (Pacific Daylight Time, UTC-07:00)
 Wednesday, July 14, 2004

James Avery points out a site about "Metric Time", which describes (in great detail) a way of defining a rational, base 10 time system.  What a great idea!  As I've watched my son (who just turned 9) learning math, I can see how confusing it is that all our math is done in base 10 units, with the exception of time keeping, in which we still rely on Babylonian base 60 math.  It's a hard system to keep in your head if you are used to decimal math.  Of course, it would be nice if we backwards Americans just got over the fact that we need to manufacture incompatible car parts and adopt the metric system for everything else.  It's such a strange way we deal with the metric system here.  I grew up knowing about the metric system, understanding how to convert standard to metric, etc.  And yet I don't have that intuitive "gut" feeling about the metric system.  It's like a second language.  I know there are 1.6 kilometers in a mile, but I don't have that intuitive sense of how far a kilometer is in space.  I can conceptualize how far a mile is, or how much something weighs in pounds, and understand rationally how to convert to metric, but I didn't learn it early enough or it's not common enough for it to be intuitive. 

Which is too bad, since as everyone agrees the metric system makes way more sense. 

Unfortunately, I think most people would suffer the same problem with switching to metric time.  People have an intuitive sense of how long an hour is, but it would take some time to get an intuitive feeling for the deciday.  I'm surprised that with all the other geek watches there are out there that none of them tell metric time.  Maybe it's a product opportunity waiting to happen.  I'd buy one.
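
For anyone who wants to build that intuition, here is a small sketch of the arithmetic.  I'm assuming the simplest definition, a day split into 10 decidays, so the function name to_decimal_time and the choice of unit are mine rather than anything taken from the Metric Time site.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 conventional seconds in one day


def to_decimal_time(hours, minutes, seconds=0):
    """Convert a conventional clock time to decidays elapsed since midnight."""
    if not (0 <= hours < 24 and 0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("not a valid clock time")
    elapsed = hours * 3600 + minutes * 60 + seconds
    # Fraction of the day that has passed, scaled to 10 decidays per day.
    return elapsed * 10 / SECONDS_PER_DAY
```

Under that assumption noon is 5 decidays, 6:00 AM is 2.5, and a single deciday works out to 2 hours 24 minutes of conventional time, which gives you some idea of how far off our "hour" intuition would be.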

Wednesday, July 14, 2004 4:40:26 PM (Pacific Daylight Time, UTC-07:00)
 Monday, July 12, 2004

[Update]:  Turns out that this problem was caused by one and only one COM class, that happens to be a COM singleton, meaning that no matter how many times you call CoCreateInstance, you always get back the same instance.  It gets wrapped by the first interop assembly that gets loaded (which makes sense) and when another bit of code asks for a reference it gets the same wrapper back.  Which also makes sense.  You can't have two wrapper objects pointing to the same COM instance, or chaos would ensue.  So, the problem is not that you can't load two interop assemblies with different versions, it's that you can't wrap the same instance twice with two differently-versioned wrapper classes.

We ran afoul of an interesting problem at work today, which upon reflection (no pun intended) makes sense, but I'd never thought about before. 

Due to an historical accident, we've got an application that's using our libraries, and also some primary interop assemblies to our core product.  Unfortunately, they aren't the same version of the primary interop assemblies that we're using for our libraries.  So primary is kind of fuzzy in this case.  The problem is that only one of them ever gets loaded.  They don't work in standard .NET side-by-side fashion.  They have different version numbers, etc. so they should.  However, when we started thinking about it, there can really only be one LoadLibrary and CoCreateInstance happening deep in the bowels somewhere, so having two primary interop assemblies pointing to the same COM objects doesn't really make any sense.  Dang.  Anyway, since they don't both get loaded, we get a nasty invalid cast exception where there wouldn't be one if we were dealing with properly side-by-sideable .NET assemblies.

If it's not one thing it's another.
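
For the curious, here is a toy model of what the update at the top of this post describes.  It is plain Python, not real COM or .NET interop, and every name in it (ComSingleton, WrapperV1, WrapperV2, wrap, cast) is made up for illustration.  All it assumes is the behavior described above: one wrapper per COM instance, created by whoever asks first.

```python
class ComSingleton:
    """Stands in for the one COM object living behind the class factory."""


_THE_INSTANCE = ComSingleton()


def co_create_instance():
    """Toy stand-in for CoCreateInstance on a singleton: always the same object."""
    return _THE_INSTANCE


class WrapperV1:
    """Wrapper type from the first interop assembly version (hypothetical)."""

    def __init__(self, com_object):
        self.com_object = com_object


class WrapperV2:
    """Wrapper type from the second interop assembly version (hypothetical)."""

    def __init__(self, com_object):
        self.com_object = com_object


_wrapper_cache = {}


def wrap(com_object, wrapper_type):
    """Return the existing wrapper for this instance, or create one of wrapper_type."""
    key = id(com_object)
    if key not in _wrapper_cache:
        _wrapper_cache[key] = wrapper_type(com_object)
    return _wrapper_cache[key]


def cast(wrapper, expected_type):
    """Mimic a checked cast: fail if the wrapper is not the type the caller expects."""
    if not isinstance(wrapper, expected_type):
        raise TypeError(f"cannot cast {type(wrapper).__name__} to {expected_type.__name__}")
    return wrapper


first = wrap(co_create_instance(), WrapperV1)   # first caller gets a V1 wrapper
second = wrap(co_create_instance(), WrapperV2)  # second caller gets the same V1 wrapper back
```

Calling cast(second, WrapperV2) raises a TypeError here, which is the toy equivalent of our invalid cast exception: the second caller asked for its own wrapper type but was handed the one that already existed.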

 

Tuesday, July 13, 2004 1:07:38 AM (Pacific Daylight Time, UTC-07:00)
 Friday, July 09, 2004

Dare Obasanjo posits that the usefulness of the W3C might be at an end, and I couldn't agree more.  Yes, the W3C was largely behind the standards that "made" the Web, but they've become so bloated and slow that they can't get anything done.

There's no reason why XQuery, XInclude, and any number of other standards that people could be using today aren't finished other than the fact that all the bureaucrats on the committee all want their pet feature in the spec, and the W3C process is all about consensus.  What that ends up meaning is that no one is willing to implement any of these specs seriously until they are full recommendations.  6 years now, and still no XQuery.  It's sufficiently complex that nobody is going to try to implement anything other than toy/test implementations until the spec is a full recommendation.

By contrast, the formerly GXA, now WS-* specs have been coming along very quickly, and we're seeing real implementation because of it.  The best thing that ever happened to Web Services was the day that IBM and Microsoft agreed to "agree on standards, compete on implementations".  That's all it took.  As soon as you get not one but two 800 lb. gorillas writing specs together, the reality is that the industry will fall in behind them.  As a result, we have real implementations of WS-Security, WS-Addressing, etc.  When we in the business world are still working on "Internet time", we can't wait around 6-7 years for a real spec just so every academic in the world gets his favorite thing in the spec.  That's how you get XML Schema, and all the irrelevant junk that's in that spec.

The specs that have really taken off and gotten wide acceptance have largely been de facto, non-W3C blessed specs, like SAX, RSS, SOAP, etc.  It's time for us to move on and start getting more work done with real standards based on the real world.

Friday, July 09, 2004 5:35:44 PM (Pacific Daylight Time, UTC-07:00)
 Tuesday, June 29, 2004

I'm into the second week of my Web Services Theory class at OIT (Portland).  It's been a lot of fun so far.  We've gone over XML modeling, DOM, XmlTextReader, and last night some XPath/XQuery.  Not in too much depth, since what I'm really shooting for is a grounding in the idea of Web Services, rather than the technical details, but I think it's important to do some practical exercises to really understand the basics. 

Next we're on to XML Schema, then the joy that is WSDL.  I'm a little worried about WSDL.  It's a hard sell, and it takes a lot of time to explain the problems that WSDL was designed to solve that it turned out 95% of people didn't understand or care about.  Ah well.  It's what we have for now.

 

Tuesday, June 29, 2004 9:16:38 PM (Pacific Daylight Time, UTC-07:00)
I took my son backpacking this past weekend with some friends of mine and some of their boys.  It was the first time my son had been backpacking (he's 8) and it's the first time I've been in probably 12-13 years.  It was a great time.  We were up on the Southern slopes of Mt. Hood, on Timothy Lake.  The weather was nice, not too hot.  Far enough from parking lots to cut down on the crowds, but not so far that you felt like you had to struggle to get there and back. There are some images of the spot here.   
Tuesday, June 29, 2004 9:10:45 PM (Pacific Daylight Time, UTC-07:00)
 Thursday, June 24, 2004
Jim Newkirk posts a fabulous use of the much overlooked alias feature in C# to make existing NUnit test classes compile with Team System.  That's just cool.
Thursday, June 24, 2004 6:32:55 PM (Pacific Daylight Time, UTC-07:00)

I started teaching a class at OIT this week on "Web Services Theory", in which I'm trying to capture not only reality, but the grand utopian vision that Web Services were meant to solve (more on that later).  That got me thinking about the way the industry as a whole has approached file formats over the last 15 years or so. 

There was a great contraction of file formats in the early 90s, which resulted in way more problems than anyone had anticipated I think, followed by a re-expansion in the late 90s when everyone figured out that the whole Internet thing was here to stay and not just a fad among USENET geeks. 

Once upon a time, back when I was in college, I worked as a lab monkey in a big room full of Macs as a "support technician".  What that mostly meant was answering questions about how to format Word documents, and trying to recover the odd thesis paper from the 800k floppy that was the only copy of the 200 page paper and had somehow gotten beer spilled all over it.  (This is back when I was pursuing my degree in East Asian Studies and couldn't imagine why people wanted to work with computers all day.)

Back then, Word documents were RTF.  Which meant that Word docs written on Windows 2.0 running on PS/2 model 40s were easily translatable into Word docs running under System 7 on Mac SEs.  Life was good.  And when somebody backed over a floppy in their VW bug and just had to get their thesis back, we could scrape most of the text off the disk even if it had lost the odd sector here and there.  Sure, the RTF was trashed and you had to sift out the now-useless formatting goo, but the text was recoverable in large part.  In other sectors of the industry, files were happily being saved in CSV or fixed length text files (EDI?) and it might have been a pain to write yet another CSV parser, but with a little effort people could get data from one place to another.
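
To show why that kind of salvage was even possible, here is a crude sketch of "sifting out the formatting goo" in Python.  It is not a real RTF parser and not what we actually ran in the lab; the function name salvage_text and both regular expressions are my own rough approximations.  It only works at all because RTF keeps the readable text inline with backslash control words and braces.

```python
import re

# A control word is a backslash, letters, an optional numeric argument,
# and at most one trailing space that acts as its delimiter.
CONTROL_WORD = re.compile(r"\\[a-zA-Z]+-?\d* ?")
# A control symbol is a backslash followed by a single non-letter character.
CONTROL_SYMBOL = re.compile(r"\\[^a-zA-Z]")


def salvage_text(rtf):
    """Strip RTF markup very roughly and return whatever plain text is left."""
    text = CONTROL_WORD.sub("", rtf)
    text = CONTROL_SYMBOL.sub("", text)
    text = text.replace("{", "").replace("}", "")
    # Collapse the whitespace left behind by the removed markup.
    return " ".join(text.split())


SAMPLE = r"{\rtf1\ansi{\b My thesis} is \i fine\i0 .}"
RECOVERED = salvage_text(SAMPLE)
```

A real document would also leave behind font names and other header debris that you'd have to delete by hand, but the prose itself comes through, damaged sectors and all.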

Then the industry suddenly decided that it could add lots more value to documents by making them completely inscrutable.  In our microcosm example, Word moved from RTF to OLE Structured Storage.  We support monkeys rued the day!  Sure, it made it really easy to serialize OLE embedded objects, and all kinds of neat value added junk that most people didn't take advantage of anyway.  On the other hand, we now had to treat our floppies as holy relics, because if so much as one byte went awry, forget ever recovering anything out of your document.  Best to just consider it gone.  We all learned to be completely paranoid about backing up important documents on 3-4 disks just to make sure.  (Since the entire collection of all the papers I ever wrote in college fit on a couple of 1.4MB floppies, not a big deal, but still a hassle.)

Apple and IBM were just as guilty.  They were off inventing "OpenDoc" which was OLE Structured Storage only invented somewhere else.  And OpenDoc failed horribly, but for lots of non-technical reasons.  The point is, the industry in general was moving file formats towards mutually incomprehensible binary formats.  In part to "add value" and in part to assure "lock in".  If you could only move to another word processing platform by losing all your formatting, it might not be worth it. 

When documents were only likely to be consumed within one office or school environment, this was less of an issue, since it was relatively easy to standardize on a single platform, etc.  When the Internet entered the picture, it posed a real problem, since people now wanted to share information over a much broader range, and the fact that you couldn't possibly read a Word for Windows doc on the Mac just wasn't acceptable. 

When XML first started to be everyone's buzzword of choice in the late 90s, there were lots of detractors who said things like "aren't we just going back to delimited text files? what a lame idea!".  In some ways it was like going back to CSV text files.  Documents became human readable (and machine readable) again.  Sure, they got bigger, but compression got better too, and disks and networks became much more capable.  It was hard to shake people loose from proprietary document formats, but it's mostly happened.  Witness WordML.  OLE structured storage out, XML in.  Of course, WordML is functionally RTF, only way more verbose and bloated, but it's easy to parse and humans can understand it (given time). 

So from a world of all text, we contracted down to binary silo-ed formats, then expanded out to text files again (only with meta-data this time).  It's like a Big Bang of data compatibility.  Let's hope it's a long while before we hit another contracting cycle.  Now if we could just agree on schemas...

Thursday, June 24, 2004 6:24:31 PM (Pacific Daylight Time, UTC-07:00)