| ewtroan ( @ 2008-12-23 16:34:00 |
A couple of weeks ago, I was talking with Brett Adam about XML decoding. We're looking at how to handle OVF in python, and I was talking about the approach the open-ovf had taken and my desire to get real python classes around the OVF objects instead of a python layer around the XML. Something which turns the XML into an artifcat of the object model instead of an object model which allows access to the XML.
Brett told be about the approach to XML he'd used for the ActionScript implementation of the rPath Management Console, and I thought it was a good approach. He enhanced the XML parser provided with ActionScript by making it type aware; elements would get parsed into classes of the right type which could then be serialized on the output. This provides a natural place to hang methods as well as a nice way of providing typing hints into the parser.
I decided to code up a similar approach for python. As I worked on it, I kept walking into Brett's office with corner cases, XML oddities, and things I just didn't understand. We decided to work together on a common approach to XML parsing in the two languages, and release the work as the XObj open source project under an MIT license. The code hasn't been released yet, but it is available via mercurial at http://hg.rpath.com/xobj/".
You should really think of what we've built as an object reflector. It's goal is to either take the objects described by an XML document and turn those into a set of classes (either python or AS3) consistent with that document, or to take a set of objects and serialize those in XML. It allows type information for both elements and attributes, the caller to specify which objects to use where, and places to put serialization hints.
One of this projects primary goals is to be easy to get started with and let the programmer use the more complicated features as he finds the need. Let me show you a couple of examples from python.
>>> from xobj import xobj
>>> class Foo(object):
... pass
...
>>> class Bar(object):
... def sum(self):
... return self.first + self.second
...
>>> foo = Foo()
>>> foo.val = "value"
>>> foo.bar = Bar()
>>> foo.bar.first = 1
>>> foo.bar.second = 2
>>> print xobj.toxml(foo, 'foo', xml_declaration = False)
<foo>
<bar>
<first></first>
<second>2</second>
</bar>
<<val>value</val>
</foo>
See? Nice and simple. Now, let's say that same hunk of XML is stored in the string varaible xml. Here's how you turn that into a python object tree:
>>> doc = xobj.parse(xml)
The object which is returned, doc, represents the entire document. If includes some housekeeping elements which make generation cleaner, as well as objects which repesent the XML document. The top level element in the XML was called foo, so that's where the element representing the top level element was stored.
>>> doc.foo.val 'value' >>> doc.foo.bar.first '1' >>> doc.foo.bar.second '2'
Simple in and out object serialization with XML in between. Nothing fancy, and there are certainly other ways of doing this. Let's look at what else we can do now though. We have lost the class information for Bar though since we didn't tell the parser what object to use for the bar element. We can fix that though.
>>> doc = xobj.parse(xml, typeMap = {'bar' : Bar})
>>> doc.foo.bar.sum()
'12'
Okay, so maybe that's not quite what we wanted. We need to tell the parser that first and second are integers, not strings.
>>> doc = xobj.parse(xml, typeMap = {'bar' : Bar, 'first' : int, 'second' : int})
>>> doc.foo.bar.sum()
3
That typeMap specifies the type for bar elements and it's elements first and second. To avoid those maps getting overly complicated, here is another way of doing the same thing:
>>> class Bar2(Bar):
... first = int
... second = int
>>> doc = xobj.parse(xml, typeMap = {'bar' : Bar2})
>>> doc.foo.bar.sum()
3
Here we're using class variables as form of prototyping. We could carry this even further, and specify a class for foo which tells what
There are lots of other things you can do with prototypes like this. The last I'm going to show here (as it is 5pm two days before Christmas!) shows how you can force an item to be a list.
By making mapping the bar element to a list of </tt>Bar</tt> classes, we've told the parser to always use a list (normally an element creates a list of objects only if that element appears more than once. (Note that I used a typeMap here instead of the prototype in the Foo class, but both methods are identical).
There are lots more things XObj can do, all of which are demonstrated in the python test case. Give it a try, and tell me what you think!
>>> doc = xobj.parse(xml, typeMap = {'bar' : [ Bar2 ] } )
>>> doc.foo.bar[0].sum()
3