For my first blog post I thought I would start with something easy - parsing XML using VMware Orchestrator (a.k.a. vCO)! I started playing with vCO in September 2012 for a "Cloud" project, so I still consider myself a newbie. If you happen across this post and find something incorrect or something that could be done better, please don't hesitate to speak up.
Since I can't post our actual XML, I'll be using the following XML which will give the gist of how to parse for elements & attributes.
<?xml version="1.0" encoding="UTF-8" ?>
<people>
<person firstname="Jack" lastname="Smith" age="40">
<phone type="home" number="1234567890" />
<phone type="cell" number="1234567891" />
<sport name="basketball" position="shooting guard" />
</person>
<person firstname="Jill" lastname="Smith" age="39">
<phone type="home" number="1234567890" />
<phone type="cell" number="1234567892" />
</person>
</people>
My initial attempt was to use the XMLManager API class that is part of vCO. This parser was pretty simple but lacking in error handling, as you can see:
var errorCode = "success";
var document = XMLManager.fromString(XMLString);
if (!document) {
errorCode = "Invalid XML Document";
throw "Invalid XML document";
}
// make sure we have at least one <person> element
// (search for "person", not the root "people", so the loop below visits each person)
var peopleElementList = document.getElementsByTagName("person");
var numOfPeople = peopleElementList.length;
System.log("numOfPeople : "+ numOfPeople);
if (numOfPeople == 0) {
errorCode = "Invalid XML Document - person element missing";
throw "Invalid XML document";
}
// loop through the people
for (var i = 0; i < numOfPeople; i++) {
// get the person
var person = peopleElementList.item(i);
//get the attributes of the person element
var personAttributes = person.attributes;
// get each of the attributes on the element
var firstname = person.getAttribute("firstname");
var lastname = person.getAttribute("lastname");
var age = person.getAttribute("age");
// get the <phone> element and attributes
var phoneElementList = person.getElementsByTagName("phone");
for (var j = 0; j < phoneElementList.length; j++) {
var phone = phoneElementList.item(j);
var phoneType = phone.getAttribute("type");
var phoneNumber = phone.getAttribute("number");
System.log("phone type : "+ phoneType);
System.log("phone number : "+ phoneNumber);
}
// get the <sport> element and attributes
var sportElementList = person.getElementsByTagName("sport");
for (var j = 0; j < sportElementList.length; j++) {
var sport = sportElementList.item(j);
var sportName = sport.getAttribute("name");
var sportPosition = sport.getAttribute("position");
System.log("sport name : "+ sportName);
System.log("sport position : "+ sportPosition);
}
} //end the loop
System.log("XML Parsing Completed");
This worked fine until we tied into a 3rd-party application that generates XML using namespaces (http://www.w3schools.com/xml/xml_namespaces.asp), which turned the XML into something like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns2:people majorVersion="1" minorVersion="0" xmlns:ns2="http://xmlns.local.com/shared/people" xmlns:ns3="http://xmlns.local.com/common/address">
<ns2:person firstname="Jack" lastname="Smith" age="40">
<ns2:phone type="home" number="1234567890" />
<ns2:phone type="cell" number="1234567891" />
<ns2:sport name="basketball" position="shooting guard" />
</ns2:person>
<ns2:person firstname="Jill" lastname="Smith" age="39">
<ns2:phone type="home" number="1234567890" />
<ns2:phone type="cell" number="1234567892" />
</ns2:person>
</ns2:people>
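Note that in a document like this the prefix (ns2) is only a local alias for the namespace URI, so another producer could emit the same elements under a different prefix. Here is a minimal sketch, in plain JavaScript with no vCO or E4X APIs, of splitting a qualified name into its prefix and local name. The helper name splitQName is hypothetical, and console.log is used only so the snippet runs outside vCO (inside vCO you would use System.log).

```javascript
// Hypothetical helper (not part of vCO or E4X): split a qualified element
// name such as "ns2:person" into its prefix and its local name.
function splitQName(qname) {
    var idx = qname.indexOf(":");
    if (idx === -1) {
        // no prefix: the whole string is the local name
        return { prefix: "", localName: qname };
    }
    return {
        prefix: qname.substring(0, idx),
        localName: qname.substring(idx + 1)
    };
}

// The same element written with no prefix and with two different prefixes
var names = ["person", "ns2:person", "p:person"];
var localNames = names.map(function (n) { return splitQName(n).localName; });
console.log(localNames.join(", "));
```

All three names share the local name "person", which is why matching on the local name keeps working when the prefix changes.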
My first thought was that the namespace prefix is ns2: so I could just prepend that to each element name I'm searching for. While that would work for this particular case, we are dealing with multiple namespaces and there's no guarantee that the prefixes will always be the same - remember, these messages are generated by a 3rd-party app that we have little control over. Back to square one.
I remembered reading about JavaScript's E4X (http://wso2.org/project/mashup/0.2/docs/e4xquickstart.html) when I was first looking at parsing XML, so it was time to see what it was all about. The big question - does vCO's JavaScript engine support E4X? The answer is YES!
The key here is the .*:: in the inline query, which matches the element regardless of its namespace. In our case the element names are unique even without the namespaces, so I was safe to go with .*::. If this isn't the case for you, then check out http://communities.vmware.com/thread/391844 on the VMware community site for some additional information on dealing with namespaces. Now for the updated code:
var document = new XML(XMLString);
if (!document) {
var errorCode = "Invalid XML Document";
throw "Invalid XML document";
}
// make sure we have at least one <person> element
var numOfPeople = document.*::person.length();
System.log("numOfPeople : "+ numOfPeople);
if (numOfPeople == 0) {
System.error("Invalid XML - no people provided");
errorCode = "Invalid XML - no people provided";
throw "Invalid XML";
}
// ass-u-me the XML is correct
var isValidXML = true;
// set the errorCode which is used if we throw an exception
var errorCode = "Invalid XML submitted";
// parse the <person> element
for (var i=0; i<numOfPeople ; i++) {
var person = document.*::person[i];
// populate the local variables for the attributes
// these are required so make sure they exist
// if they don't then we log what's missing
// firstname attribute
if (person.hasOwnProperty('@firstname')) {
var firstname= person.@firstname;
System.log("firstname : "+ firstname);
} else {
System.error("no firstname attribute found");
errorCode += "\n firstname attribute not found on <person> element number "+ (i+1);
isValidXML = false;
}
// lastname attribute
if (person.hasOwnProperty('@lastname')) {
var lastname = person.@lastname;
System.log("lastname : "+ lastname);
} else {
System.error("no lastname attribute found");
errorCode += "\n lastname attribute not found on <person> element number "+ (i+1);
isValidXML = false;
}
// age attribute
if (person.hasOwnProperty('@age')) {
var age = person.@age;
System.log("age : "+ age);
} else {
System.error("no age attribute found");
errorCode += "\n age attribute not found on <person> element number "+ (i+1);
isValidXML = false;
}
// if anything is invalid then throw the exception
if (isValidXML === false) {
System.error(errorCode);
throw "Invalid XML submitted";
}
// get child elements of this particular element
// these are all optional so we dont throw an exception if they are missing
var numChildren = person.*.length();
System.log("numChildren: "+ numChildren);
for (var j=0 ; j<numChildren ; j++) {
var tag = person.*[j];
var tagName = tag.localName(); // localName() gets element name without the namespace
System.log("found "+ tagName);
// check for phone element
if (tagName == "phone") {
var phoneType = tag.@type;
var phoneNumber = tag.@number;
System.log("phone type : "+ phoneType);
System.log("phone number : "+ phoneNumber);
// check for sport element
} else if (tagName == "sport") {
System.log("found a sport element");
var sportName = tag.@name;
var sportPosition = tag.@position;
System.log("sport name : "+ sportName);
System.log("sport position : "+ sportPosition);
// throw away anything else
} else {
System.log("cannot handle "+ tagName +" : skipping");
}
}
} //end the loop
System.log("XML Parsing Completed");
This new approach will now work regardless of the namespace so we are back in business - plus, I was able to add some much needed XML validation and error logging. Happy Parsing!
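The three required-attribute checks above follow the same pattern, so they could be factored into one helper. This is a minimal sketch, not the code we actually run: the name findMissingAttributes and its parameters are hypothetical, the existence check is passed in as a callback so the helper does not depend on any XML API, and console.log stands in for System.log so the snippet runs outside vCO.

```javascript
// Hypothetical helper: collect one message per missing required attribute.
// hasAttribute is supplied by the caller and returns true when the named
// attribute exists on the element being checked.
function findMissingAttributes(required, hasAttribute, elementName, position) {
    var errors = [];
    for (var i = 0; i < required.length; i++) {
        if (!hasAttribute(required[i])) {
            errors.push(required[i] + " attribute not found on <" + elementName +
                "> element number " + position);
        }
    }
    return errors;
}

// Plain-object stand-in for a parsed <person> that has no age attribute
var attrs = { firstname: "Jack", lastname: "Smith" };
var errors = findMissingAttributes(
    ["firstname", "lastname", "age"],
    function (name) { return Object.prototype.hasOwnProperty.call(attrs, name); },
    "person",
    1
);
console.log(errors.join("\n"));
```

With E4X the callback would presumably wrap the same check used in the loop above, person.hasOwnProperty('@' + name), and the exception would be thrown only when the returned list is not empty.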
UPDATE:
After posting I received a "tweet" from @vCOTeam with the following three lines of code:
var document = new XML(XMLString);
var ns = new Namespace("ns", document.namespace());
default xml namespace = ns;
This sets the default namespace so we no longer need the .*::. The following lines should replace the two lines above that use .*:: (the numOfPeople count and the person lookup inside the loop):
var numOfPeople = document.person.length();
var person = document.person[i];
Thanks to the vCOTeam for improving the code. One thing I forgot to include - the values returned by the E4X parsing methods are of type XMLList. You'll need to call toString() to cast them to String.