The experiences of a software developer as he wades through the dynamic world of technology. Discussions of new industry developments and current technologies he finds himself wrapped up in.

Wednesday, July 26, 2006

The XPath to Cleaner Java Code

It was about three years ago that I got heavily involved in XML. The company I was working for at the time took the leap into SOA, and XML was a major component. I can remember the dilemmas I faced when trying to parse the XML messages being passed around from system to system. There were many options for dealing with the XML, and after evaluating several of them, I decided to use the Document Object Model (DOM).

Granted, using the DOM gives you a lot of power, but writing the code to actually navigate an XML document can be a real hassle. And once you do finally get your logic in place, trying to come back to look at that code a few months later can bring on a serious migraine. No one said maintaining such code would be easy.

Now that I look back on those days, I often wonder why I didn't choose another API such as Xalan or Saxon. Sure, I'd have introduced a dependency on an external engine, thereby locking myself into its API, but I would have had the advantage of using XML Path Language, which is more commonly referred to as XPath. Since joining my new company, I have done extensive work using XPath (mostly while writing XSLT in a .NET 2.0 environment), and it was only then that I fully realized its power. It's safe to say that I didn't fully appreciate it during my initial evaluation.

For those who aren't familiar with XPath, it is a powerful query language for extracting information from an XML document. Just as SQL is a query language optimized for extracting data from a relational database, XPath is optimized for navigating an XML document to find the information you're interested in. IBM's technical XML library on its developerWorks network makes a great analogy:


If you send someone out to purchase a gallon of milk, what would you rather tell that person? "Please go buy a gallon of milk." Or, "Exit the house through the front door. Turn left at the sidewalk. Walk three blocks. Turn right. Walk one half block. Turn right and enter the store. Go to aisle four. Walk five meters down the aisle. Turn left. Pick up a gallon jug of milk. Bring it to the checkout counter. Pay for it. Then retrace your steps home." That's ridiculous. Most adults are intelligent enough to procure the milk on their own with little more instruction than "Please go buy a gallon of milk."

This gives a great sense of how much simpler it can be to write an XPath expression than to write complicated DOM code. Let's say I had an XML document that listed countries, the territories within each country, and the cities within each territory. If I wanted to find all of the cities listed in Ontario, my XPath expression would look something like this (assuming 'world' is the root element of the XML document, and that each country and territory element carries a name attribute):

/world/country[@name='canada']/territory[@name='ontario']/city

I'm not even going to get into writing the DOM code for doing something like this. I think you can imagine what it would look like. I sure can, because I got plenty of practice writing it during the project I was referring to earlier. If you are not going to have a lot of lookups, using the DOM may be fine, and as a developer it is up to you to weigh your options. Do you want to write code that will be difficult to maintain, or do you want to get locked into an external API and its particular engine?
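For readers who would rather see it than imagine it, here is a rough sketch of what the DOM version of the Ontario lookup might look like. The sample document, the name attributes on country and territory, and the DomCities class with its children helper are all assumptions made for illustration, not code from the original project.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomCities {
    // Hypothetical sample data; the name attributes are an assumption.
    static final String SAMPLE_XML =
          "<world>"
        + "<country name='canada'>"
        + "<territory name='ontario'><city>Toronto</city><city>Ottawa</city></territory>"
        + "<territory name='quebec'><city>Montreal</city></territory>"
        + "</country>"
        + "<country name='usa'>"
        + "<territory name='ohio'><city>Columbus</city></territory>"
        + "</country>"
        + "</world>";

    // Walks world -> country -> territory -> city by hand, one level at a time.
    static List<String> citiesIn(String xml, String country, String territory) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            List<String> cities = new ArrayList<String>();
            for (Element c : children(doc.getDocumentElement(), "country")) {
                if (!country.equals(c.getAttribute("name"))) continue;
                for (Element t : children(c, "territory")) {
                    if (!territory.equals(t.getAttribute("name"))) continue;
                    for (Element city : children(t, "city")) {
                        cities.add(city.getTextContent());
                    }
                }
            }
            return cities;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse the XML", e);
        }
    }

    // Collects the direct child elements with the given tag name,
    // skipping text, comment, and other non-element nodes.
    static List<Element> children(Element parent, String tag) {
        List<Element> result = new ArrayList<Element>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE && tag.equals(n.getNodeName())) {
                result.add((Element) n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(citiesIn(SAMPLE_XML, "canada", "ontario"));
    }
}
```

Three nested loops and a helper method just to reach one level of the tree, and every new query means another variation on the same traversal.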

Well, with Sun's introduction of Java 5, this decision may have gotten a lot easier. This release of the Java platform includes the javax.xml.xpath package, which has everything you need to perform XPath queries right out of the box. No longer do you have to rely on an external engine to take advantage of XPath. I just wish it had been available when I was trying to make that decision three years ago.
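As a minimal sketch of what this looks like in practice, the example below runs the Ontario query with nothing but javax.xml.xpath from the standard library. The sample document, the name attributes on country and territory, and the XPathCities class are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathCities {
    // Hypothetical sample data; the name attributes are an assumption.
    static final String SAMPLE_XML =
          "<world>"
        + "<country name='canada'>"
        + "<territory name='ontario'><city>Toronto</city><city>Ottawa</city></territory>"
        + "<territory name='quebec'><city>Montreal</city></territory>"
        + "</country>"
        + "<country name='usa'>"
        + "<territory name='ohio'><city>Columbus</city></territory>"
        + "</country>"
        + "</world>";

    // Every city under the territory named 'ontario' in the country named 'canada'.
    static final String ONTARIO_CITIES =
        "/world/country[@name='canada']/territory[@name='ontario']/city";

    // Evaluates an XPath expression against an XML string and returns
    // the text content of each matching node.
    static List<String> select(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);

            List<String> values = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(SAMPLE_XML, ONTARIO_CITIES)); // prints [Toronto, Ottawa]
    }
}
```

The query itself is a single string, so changing what you look up means editing the expression rather than rewriting traversal code. If the same expression is evaluated many times, XPath.compile returns a reusable XPathExpression.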

For further information on using javax.xml.xpath, check out IBM's article, The Java XPath API, and Sun's official JavaDocs for Java 5.
