It surprises many users that the output from SQL Server 2000's new built in XML features
do not produce well formed XML. However it is possible to produce well formed XML
without editing the XML externally, just by modifying the SQL query. This
article describes how to do it.
SQL Server provides 3 ways of producing the output of a query as XML, these are
RAW, AUTO and EXPLICIT which have differing levels of complexity and flexibility.
One thing that all of these have in common is that none of them usually produce
'well formed' XML. The reason is that they don't have a pair of opening and
closing tags which is a requirement. This might seem surprising but it has been
done for a reason - it means that result sets can be concatenated together
before adding the opening and closing tags. However doing this externally can
be a pain or impossible. However the EXPLICIT option is sufficiently flexible
for us to format the output so that it has a single pair of root tags.
select pub_name,city,state,country from publishers FOR XML AUTO
This will return the following XML:
<publishers pub_name="New Moon Books" city="Boston" state="MA" country="USA"/>
<publishers pub_name="Binnet & Hardley" city="Washington" state="DC" country="USA"/>
<publishers pub_name="Algodata Infosystems" city="Berkeley" state="CA" country="USA"/>
<publishers pub_name="Five Lakes Publishing" city="Chicago" state="IL" country="USA"/>
<publishers pub_name="Ramona Publishers" city="Dallas" state="TX" country="USA"/>
<publishers pub_name="GGG&G" city="München" country="Germany"/>
<publishers pub_name="Scootney Books" city="New York" state="NY" country="USA"/>
<publishers pub_name="Lucerne Publishing" city="Paris" country="France"/>
Figure 1
If we paste this into XML Spy or any other tool that verifies whether XML is
well formed then it will confirm that it isn't. However all we need to do is
to add a root tag to the beginning and end of the XML to rectify this :
<root>
<publishers pub_name="New Moon Books" city="Boston" state="MA" country="USA"/>
<publishers pub_name="Binnet & Hardley" city="Washington" state="DC" country="USA"/>
<publishers pub_name="Algodata Infosystems" city="Berkeley" state="CA" country="USA"/>
<publishers pub_name="Five Lakes Publishing" city="Chicago" state="IL" country="USA"/>
<publishers pub_name="Ramona Publishers" city="Dallas" state="TX" country="USA"/>
<publishers pub_name="GGG&G" city="München" country="Germany"/>
<publishers pub_name="Scootney Books" city="New York" state="NY" country="USA"/>
<publishers pub_name="Lucerne Publishing" city="Paris" country="France"/>
</root>
Figure 2
As an aside note that if we can guarantee that there will be only one root
level node then this step becomes unnecessary. For example, the following
XML is deemed to be well formed because there is only one row in the recordset:
<publishers pub_name="New Moon Books" city="Boston" state="MA" country="USA"/>
However there is no harm in adding a pair of root nodes.
To return to the problem we need to use the advanced features of the
FOR XML EXPLICIT option. To start lets rewrite the same SQL query shown
above but using XML EXPLICIT instead of XML AUTO:
select
1 as tag,
null as parent,
pub_name as [publishers!1!pub_name],
city as [publishers!1!city],
state as [publishers!1!state],
country as [publishers!1!country]
from publishers FOR XML EXPLICIT
Figure 3
To add a pair of root nodes we just alter this query to add a
parent root node, which has no attributes, as follows:
select
1 as tag,
null as parent,
'' as [root!1!],
null as [publishers!2!pub_name],
null as [publishers!2!city],
null as [publishers!2!state],
null as [publishers!2!country]
UNION
select
2 as tag,
1 as parent,
'',
pub_name,
city,
state,
country
from publishers
FOR XML EXPLICIT
Figure 4
This produces the required XML with the data within a pair of root nodes as shown in
figure 2. The first part of the query just adds an extra parent node around the main
data. The second part returns the main bulk of the data which will be enclosed within
the parent nodes. Note that it's only necessary to define the column names in the
first select statement of a union statement - they do not need to be repeated in
subsequent parts.
|