HXQ with Database Connectivity

Download HXQ-0.10.0.tar.gz

Installation Instructions (HXQ with database connectivity)

You may use either MySQL or sqlite. MySQL through an ODBC driver is the better choice. sqlite is the easiest to install, but it cannot be used to store large XML files.

Installation with MySQL

For background, see the MySQL on Linux Tutorial. To install the MySQL database server and the MySQL/ODBC driver on Linux, run:

yum install mysql mysql-devel mysql-server unixODBC-devel mysql-connector-odbc
and you may use the following sample top-level file .odbc.ini:
[ODBC Data Sources]
HXQ     = MyODBC 3.51 Driver DSN

[HXQ]
Driver       = /usr/lib/libmyodbc3.so
Description  = Connector/ODBC 3.51 Driver DSN
user=root
password=xxxxx
option=262144
(Make sure that your username/password works and that the Driver has the correct path.) Then, start the mysql server (using service mysqld start as root on Linux) and create a database using the mysql command create database hxq.

Next, install the Haskell packages HDBC 1.1.4 (not version 1.1.5) and the HDBC-odbc driver. Then build and install HXQ:

runhaskell Setup.lhs configure -fmysql
runhaskell Setup.lhs build
runhaskell Setup.lhs install

Installation with sqlite

To use sqlite, first install SQLite. On Linux, you can install it using yum install sqlite. Next, install the Haskell packages HDBC 1.1.4 (not version 1.1.5) and the HDBC-sqlite3 driver. Then build and install HXQ:

runhaskell Setup.lhs configure -fsqlite
runhaskell Setup.lhs build
runhaskell Setup.lhs install

Working with Databases

HXQ provides an interface to HDBC to query relational data inside an XQuery. For the HXQ compiler, the main function that allows database connectivity is:

$(xqdb query) :: (IConnection conn) => conn -> IO XSeq
For example, if the database name is "hxq", then
do db <- connect "hxq"
   result <- $(xqdb xquery) db
For the HXQ interpreter, the function is:
xqueryDB :: (IConnection conn) => String -> conn -> IO XSeq
The xquery executable can also run XQueries that use a database if you specify the database name with the -db option, e.g. xquery -db hxq.

Querying an Existing Database

An XQuery may contain multiple SQL queries of the form sql(query,args), where query is an SQL query that may contain parameters (denoted by ?), which are bound to the values in args (an XSeq). An example can be found in TestDB.hs. To run this example, first install the company database (using source data/company.sql in mysql or .read data/company.sql in sqlite3), then compile and run TestDB.hs.
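
The positional binding of ? parameters described above can be illustrated with a small toy model. This is only a sketch of the idea, not how HXQ or HDBC implement it: a real driver passes the values separately to a prepared statement rather than splicing them into the SQL text. The function name bindParams is hypothetical, and the sketch does not recognize a ? that appears inside a string literal.

```haskell
-- Toy model of positional '?' binding (hypothetical helper, not part of HXQ).
-- Each '?' in the query is paired with the next argument, left to right.
-- Returns Left when the number of placeholders and arguments differ.
bindParams :: String -> [String] -> Either String String
bindParams = go
  where
    go [] []           = Right []
    go [] (_:_)        = Left "more arguments than placeholders"
    go ('?':_) []      = Left "more placeholders than arguments"
    go ('?':cs) (a:as) = (quote a ++) <$> go cs as
    go (c:cs) as       = (c :) <$> go cs as

    -- Render a value as an SQL string literal, doubling embedded quotes.
    quote s = "'" ++ concatMap esc s ++ "'"
    esc '\'' = "''"
    esc c    = [c]
```

Loaded in GHCi, bindParams "select a from t where b = ?" ["1"] pairs the single placeholder with the single argument, while a mismatch between the number of ? marks and the number of arguments is reported as a Left value. The table and column names here are made up for illustration.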

Shredding

To synthesize a relational schema schemaname to store an XML document located at pathname, use the following Haskell function:

genSchema :: (IConnection conn) => conn -> String -> String -> IO ()
genSchema db pathname schemaname
for a database db. HXQ scans the document to extract its structural summary and then derives a good relational schema from that summary (using hybrid inlining). To actually store the data from the XML document into the relational schema, use the following Haskell function:
shred :: (IConnection conn) => conn -> String -> String -> IO ()
shred db pathname schemaname
For example,
do db <- connect "hxq"
   genSchema db "data/cs.xml" "c"
   shred db "data/cs.xml" "c"
For large XML documents, it is better to use the compiled version of shred, $(shredC db pathname schemaname).
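
To make the notion of a structural summary more concrete, here is a minimal sketch under simplifying assumptions. It treats the summary as the list of distinct root-to-element tag paths, which is one common definition; HXQ's actual summary and its hybrid-inlining step are not reproduced here. The Elem type, the summary function, and the sample document are all hypothetical and use only the Haskell base library.

```haskell
import Data.List (nub)

-- A minimal XML-like tree: an element with a tag and child elements.
-- Text content and attributes are ignored in this toy model.
data Elem = Elem String [Elem]

-- | All distinct root-to-element tag paths, in first-seen document order.
summary :: Elem -> [[String]]
summary = nub . paths []
  where
    -- Emit the path of this element, then the paths of its descendants.
    paths prefix (Elem tag kids) =
      let here = prefix ++ [tag]
      in here : concatMap (paths here) kids

-- Hypothetical sample document, for illustration only.
sample :: Elem
sample =
  Elem "dept"
    [ Elem "student" [Elem "name" [], Elem "gpa" []]
    , Elem "student" [Elem "name" []]
    , Elem "course"  [Elem "title" []]
    ]
```

Evaluating summary sample in GHCi lists each path once, even though student occurs twice in the document. A schema generator can work from such a summary instead of the full document, which is why a single scan is enough.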

The Haskell function

printSchema db schemaname
displays the relational schema for the shredded document under the given schemaname, while
createIndex db schemaname tagname
creates a secondary index on tagname for the shredded document.

Publishing

You can query a shredded XML document using the XQuery function:

publish(dbname,schemaname)
where dbname is the database file name and schemaname is the unique schema name assigned to the XML document when it was shredded. The translation from XQuery to SQL is done at compile-time, so both dbname and schemaname must be constant strings. HXQ will do its best to push relevant predicates to the generated SQL query (using partial evaluation and code folding), thus deriving an efficient execution. One example is TestDB2.hs.
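
The benefit of pushing predicates into the generated SQL can be sketched with a toy query plan. This is an illustration of the general idea only, not HXQ's translation: the Plan type, pushDown, toSQL, and the table and condition strings are all hypothetical. A condition left in a Filter would be checked by the host program after every row has been fetched, whereas a condition moved into the Scan becomes part of the WHERE clause and is evaluated by the database.

```haskell
import Data.List (intercalate)

-- Toy query plan (hypothetical, not HXQ's internal representation).
data Plan
  = Scan String [String]   -- table name, conditions evaluated by the database
  | Filter String Plan     -- condition checked by the host program
  deriving (Eq, Show)

-- | Move every Filter condition into the Scan beneath it.
pushDown :: Plan -> Plan
pushDown (Filter c p) = case pushDown p of
  Scan t cs -> Scan t (cs ++ [c])
  q         -> Filter c q
pushDown p = p

-- | Render the part of the plan that is sent to the database.
toSQL :: Plan -> String
toSQL (Scan t [])  = "select * from " ++ t
toSQL (Scan t cs)  = "select * from " ++ t ++ " where " ++ intercalate " and " cs
toSQL (Filter _ p) = toSQL p  -- a remaining Filter stays outside the SQL text
```

For example, toSQL (pushDown (Filter "year > 2000" (Scan "paper" []))) produces a query with the condition in its WHERE clause, so the database can return only the matching rows.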

Example: Installing and Querying the DBLP Database

First download and uncompress dbpl.xml.gz from DBLP. To install the DBLP database using MySQL, compile and execute first TestDBPL1.hs and then TestDBPL2.hs. You may then evaluate queries using the HXQ interpreter; for example, data/q4.xq takes about 4 seconds.


Last modified: 10/26/08 by Leonidas Fegaras