Is it possible to open two documents in a single XQuery and do a join on them?

A: 

In XQuery if you write something like the following:

for $x in doc('doc1.xml')//a
for $y in doc('doc2.xml')//a
where $x/@name = $y/@name
return $x

then your XQuery processor should be smart enough to spot that this is a join.

You never explicitly specify in XQuery that something is a join. The common theme in XQuery is that your program says what information you want, not how to compute it.
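
To try this without any files on disk, here is a minimal, self-contained sketch. The inline $doc1 and $doc2 elements are made-up stand-ins for doc1.xml and doc2.xml, and the sample names are my own assumptions. It also shows the same match written as a predicate, which is another way of stating what you want rather than how to compute it:

```xquery
(: Hypothetical inline data standing in for doc('doc1.xml') and doc('doc2.xml') :)
let $doc1 := <root><a name="x"/><a name="y"/></root>
let $doc2 := <root><a name="y"/><a name="z"/></root>
return (
  (: FLWOR form: one result per matching ($x, $y) pair :)
  <flwor>{
    for $x in $doc1//a
    for $y in $doc2//a
    where $x/@name = $y/@name
    return $x
  }</flwor>,
  (: Predicate form: each matching element of $doc1 at most once :)
  <predicate>{ $doc1//a[@name = $doc2//a/@name] }</predicate>
)
```

With this data both forms select only the element whose name is "y". They differ when one element in the first document matches several in the second: the FLWOR form returns it once per match, the predicate form only once.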

Although it may look like you are repeatedly looping over the second document, in practice a real XQuery processor would evaluate this more intelligently, roughly as a database would evaluate the following SQL statement (my SQL is very rusty, so I apologise if the syntax is off):

SELECT doc1.a
FROM doc1 INNER JOIN doc2
ON doc1.name = doc2.name
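
To give a feel for what "more intelligently" can mean, here is a hand-written sketch of a hash-style join: index one side once, then probe it for each element of the other side. This is only an illustration, not what any particular processor does internally. It assumes an XQuery 3.1 processor (for maps) and uses made-up inline data in place of the two documents:

```xquery
xquery version "3.1";
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

(: Hypothetical inline data standing in for doc1.xml and doc2.xml :)
let $doc1 := <root><a name="x"/><a name="y"/></root>
let $doc2 := <root><a name="y"/><a name="z"/></root>

(: Build phase: index the second document by @name, once :)
let $index := map:merge(
  for $y in $doc2//a
  return map:entry(string($y/@name), $y)
)

(: Probe phase: one map lookup per element of the first document :)
for $x in $doc1//a
where map:contains($index, string($x/@name))
return $x
```

Note that this version returns each matching element of the first document only once, even if several elements in the second document share its name. You would not normally write this yourself; the point is that the optimizer is free to pick such a strategy for the plain nested for.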

The XMark benchmark contains several sample queries that are well worth a look; in particular, queries 9 through 12 perform joins.

Oliver Hallam
A: 

Yes. Here is an example from the XQuery spec:

"Joins, which combine data from multiple sources into a single result, are a very important type of query. In this section we will illustrate how several types of joins can be expressed in XQuery. We will base our examples on the following three documents:

  1. A document named parts.xml that contains many part elements; each part element in turn contains partno and description subelements.
  2. A document named suppliers.xml that contains many supplier elements; each supplier element in turn contains suppno and suppname subelements.
  3. A document named catalog.xml that contains information about the relationships between suppliers and parts. The catalog document contains many item elements, each of which in turn contains partno, suppno, and price subelements.

A conventional ("inner") join returns information from two or more related sources, as illustrated by the following example, which combines information from three documents. The example generates a "descriptive catalog" derived from the catalog document, but containing part descriptions instead of part numbers and supplier names instead of supplier numbers. The new catalog is ordered alphabetically by part description and secondarily by supplier name.

<descriptive-catalog>
   { 
     for $i in fn:doc("catalog.xml")/items/item,
         $p in fn:doc("parts.xml")/parts/part[partno = $i/partno],
         $s in fn:doc("suppliers.xml")/suppliers
                  /supplier[suppno = $i/suppno]
     order by $p/description, $s/suppname
     return
        <item>
           {
           $p/description,
           $s/suppname,
           $i/price
           }
        </item>
   }
</descriptive-catalog>

The previous query returns information only about parts that have suppliers and suppliers that have parts. An outer join is a join that preserves information from one or more of the participating sources, including elements that have no matching element in the other source. For example, a left outer join between suppliers and parts might return information about suppliers that have no matching parts."
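
The left outer join described at the end of that quote can be sketched as follows, in the spirit of the spec's own outer join example. To keep it self-contained I have replaced the three documents with small inline elements; the sample suppliers, part, and price are invented for illustration:

```xquery
(: Hypothetical inline data standing in for suppliers.xml, catalog.xml and parts.xml :)
let $suppliers :=
  <suppliers>
    <supplier><suppno>1</suppno><suppname>Acme</suppname></supplier>
    <supplier><suppno>2</suppno><suppname>Bolt Co</suppname></supplier>
  </suppliers>
let $items :=
  <items>
    <item><partno>10</partno><suppno>1</suppno><price>5.00</price></item>
  </items>
let $parts :=
  <parts>
    <part><partno>10</partno><description>Widget</description></part>
  </parts>

(: Outer loop over every supplier, so suppliers without parts are preserved :)
for $s in $suppliers/supplier
order by $s/suppname
return
  <supplier>
    {
      $s/suppname,
      (: Inner join from this supplier to its parts; may be empty :)
      for $i in $items/item[suppno = $s/suppno],
          $p in $parts/part[partno = $i/partno]
      order by $p/description
      return $p/description
    }
  </supplier>
```

Here the supplier "Bolt Co" has no catalog items, yet it still appears in the result, just without any description children. That is what makes it a left outer join rather than an inner join.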

Note that XQuery has no standard document() function (that is an XSLT function). It has the doc() function instead, which is defined in "XQuery 1.0 and XPath 2.0 Functions and Operators".

There are at least two errors in the answer by Chris:

  1. XQuery is case sensitive: the capitalised keywords used in Chris' example will be rejected by a conforming XQuery processor.
  2. The function document() is not a standard XQuery/XPath function. The doc() function should be used instead.

It is also not necessary to prefix standard functions such as doc(). The example above carries the "fn" prefix only because I am quoting the XQuery spec; in my own code I would omit it.

Dimitre Novatchev