Sigh. That's not the most attractive kind of question title, methinks...

I'm about to use openxml() to import from a typical XML file, and I'm looking how to parse subtrees. Basically, I want to select several nodes with particular entries and all subnotes below them.

In SQL, the tree structure would be designed via a set of FK-related tables (say, named "level_1" to "level_7"), where most child tables can have several rows per parent (i.e. as one-to-many relationships), so I would query something like

select level_4.name, level_5.name, level_7.value
from level_4 key join level_5 key join level_6 key join level_7
where level_4.name in ('value_1', 'value_2', 'value_10')
order by 1;

Of course, I could import the whole XML nodes into according tables, and then query that in a similar way, but that's not desired here.

Ideally, I would try to use one according openxml() call - but unfortunaley I'm running into several issues:

  • When the main xpath expression (i.e. the 2nd parameter of openxml) is focussed on a higher level (say, level_4), and I'm trying to request the lower level nodes (say, level_7.value) in the WITH clause, I will only get the first match per level_4 element, as the docs tell correctly
    When there are multiple matches for a column expression, the first match in the document order (the order of the original XML document before it was parsed) is used.

So the main xpath expression has to focus on the lowest level, if there are one-to-many relationships involved.

  • I can only set very simply filters in the main xpath expression, e.g. "level_1/[level_2="value_1"]". In particular, I seemingly cannot set multiple filters or combine them, as asked in this FAQ. So I guess the filtering has to be done afterwards, i.e. by using a WHERE clause on the final result set.

Question: If these limits do hold, are the following general guidelines correct?

  1. Focus the main xpath expression on the lowest level needed, so you get one entry per lowest level node.
  2. Use parent axis expressions in the WITH clause to access the desired parent nodes.
  3. Instead of filtering via xpath expression, use ordinary WHERE conditions to filter the result set.

So, for the hinted sample (assuming the values are designed as elements, not attributes):

select * from openxml(myXmlFile),
  '//level_7')
   with (level_4_name long varchar '../../../name',
         level_5_name long varchar '../../name',
         level_7_value long varchar 'value')
where level_4_name in ('value_1', 'value_2', 'value_10')
order by 1;

(The apparent drawback compared to directly filtering via xpath would be the fact that the whole XML file needs to be parsed, and openxml() may return a much bigger interim result set than needed.)

Or am I missing the point here completely?


I guess this is kind of a general question on how to "flatten" XML structures in a way similar to joining tables...

asked 30 Jan '13, 16:54

Volker%20Barth's gravatar image

Volker Barth
30.6k304455662
accept rate: 32%

edited 30 Jan '13, 17:07

I certainly confess that this is a complex (and possibly not too comprehensible) question - but I would still be glad to get advice...

(22 Feb '13, 09:40) Volker Barth
Be the first one to answer this question!
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×412
×19
×12

question asked: 30 Jan '13, 16:54

question was seen: 3,073 times

last updated: 22 Feb '13, 09:40