Demo of TwigList

Give an XML tree T, a twig-pattern matching query, Q, represented as a query tree, is to find all the occurrences of such twig pattern in T. We propose a new algorithm, called TwigList, in dasfaa'07, which uses simple lists to compactly represent the matched results. Both time and space complexity of our algorithm are linear with respect to the total number of pattern occurrences and the size of XML tree. We implement a demo using the external algorithm as explained in our paper, and set the buffer size to be 4kb. In order to process pattern matching using our demo, two steps are needed: Regional code generating and Query processing. To clear all the data, just delete all files in the data and temporary directories.

 

Step1 : Regional code generating

The regional code will be generated in a separated folder with the same name of the xml file in the data directory. All code of the labels in the xml document will be saved in that folder as a separated file with the same name of the label. This part is designed using java 5.0. Before generating, you should make sure the java virtual machine is installed in your system. The format of the command is listed below and it must be executed in the twiglist dictionary.

Format: java -DentityExpansionLimit=640000 encoding.Encoding [xml file]

Example: java -DentityExpansionLimit=640000 encoding.Encoding test.xml

 

Step 2: Query processing

The query is an xquery with only labels, "/", "//", ".", "[" and "]". The algorithm of query processing is designed using Visual c++ .Net 2003, the format is as follows, the command must be executed in the twiglist dictionary, and you must make sure of the correctness of the xquery and xml file. The result will be shown in the standard output.

Format: processing\PatternMatching [xml file] [xquery]

Example: processing\PatternMatching test.xml //a[.//b][c]//d

 

Downloads

A demo for xml pattern matching using TwigList

TwigList: Make Twig Pattern Matching Fast