From CS 294-10 Visualization Sp10
Description: Visual Queries for Tree Structures
The Penn Treebank is a collection of parse trees for sentences from the Wall Street Journal and other corpora, used for NLP tasks such as part-of-speech tagging and parsing. Because the data is tree-structured, querying the treebank for patterns or substructures is difficult with standard text-searching tools like regular expressions. Some specialized text-based tools, such as tgrep/tgrep2 and tregex, allow structure-aware querying of parse trees. However, such text-based queries become unwieldy when they span multiple tree levels.
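To make the difficulty concrete, here is a minimal sketch (not part of the project code) of what a structure-aware query has to do. It assumes a single bracketed tree with a labelled root; the example sentence and the helper names `parse_tree`, `subtrees`, and `immediately_dominates` are hypothetical and chosen only for illustration. The query "an NP that directly contains a PP" requires knowing where each constituent starts and ends, which a flat text search does not track.

```python
import re

def parse_tree(text):
    """Parse one bracketed tree into (label, children) tuples; leaves are strings.

    Assumes every bracket has a label, e.g. "(NP (DT the) (NN dog))".
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok  # a word (leaf)
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(read())
        pos += 1  # consume the closing ")"
        return (label, children)

    return read()

def subtrees(tree):
    """Yield every labelled node in the tree, root first."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1]:
            yield from subtrees(child)

def immediately_dominates(tree, parent_label, child_label):
    """Return nodes labelled parent_label that have a direct child labelled child_label."""
    return [
        node for node in subtrees(tree)
        if node[0] == parent_label
        and any(isinstance(c, tuple) and c[0] == child_label for c in node[1])
    ]

# Hypothetical example sentence in Penn Treebank bracket notation.
SENTENCE = (
    "(S (NP (DT the) (NN dog)) "
    "(VP (VBD chased) (NP (NP (DT a) (NN cat)) (PP (IN with) (NP (NNS stripes))))))"
)

tree = parse_tree(SENTENCE)
matches = immediately_dominates(tree, "NP", "PP")
```

Here `matches` holds the object noun phrase "a cat with stripes", the only NP whose direct children include a PP. Extending this by hand to conditions that span three or four levels quickly becomes hard to read, which is the problem the text-based query tools address.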
For my project, I plan to create an interface for constructing queries visually. The user will create and connect nodes to form a search structure that can be used against a treebank. The visual construction of the search query will more closely match the nature of the search itself, allowing for more complex, yet still comprehensible, tree queries.
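As a rough illustration of the idea, the sketch below models a visually built query as connected nodes and flattens it into a tregex-style pattern string. This is an assumption about how such an interface could work, not the project's actual implementation: `QueryNode`, `connect`, and `to_pattern` are hypothetical names, and only two relations are used, `<` (immediately dominates) and `<<` (dominates).

```python
from dataclasses import dataclass, field

@dataclass
class QueryNode:
    """One node the user has placed in the visual query (hypothetical model)."""
    label: str
    edges: list = field(default_factory=list)  # list of (relation, QueryNode)

    def connect(self, relation, child):
        """Attach a child node through a relation such as "<" or "<<"."""
        self.edges.append((relation, child))
        return self

def to_pattern(node):
    """Flatten the query graph into a tregex-style pattern string.

    Children that carry their own relations are parenthesized so that
    each relation stays attached to the intended node.
    """
    parts = [node.label]
    for relation, child in node.edges:
        sub = to_pattern(child)
        if child.edges:
            sub = f"({sub})"
        parts.append(f"{relation} {sub}")
    return " ".join(parts)

# Query: a VP that dominates an NP whose direct children include
# an NP and a PP, where the PP directly contains a preposition (IN).
pp_node = QueryNode("PP").connect("<", QueryNode("IN"))
np_node = QueryNode("NP").connect("<", QueryNode("NP")).connect("<", pp_node)
vp_node = QueryNode("VP").connect("<<", np_node)

pattern = to_pattern(vp_node)
print(pattern)
```

The resulting string nests one parenthesized group per level of the query, while the node graph stays the same shape as the tree fragment being searched for. That contrast is the motivation for building the query visually.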
Initial Problem Presentation
Midpoint Design Discussion
- Link to slides here
- File:Visualtregex-src.tar.gz - Source code for the program. Requires swingfx_all.jar.
- File:Visualtregex.zip - Executable. Rename the file to .jar and run it with Java.
- File:Visualtregex.pdf - Final paper.