Syntactic Analysis Of Natural Language Using Linguistic Rules And Corpus-based Patterns
We are concerned with the syntactic annotation of unrestricted text. We combine a rule-based analysis with subsequent exploitation of empirical data. The rule-based surface syntactic analyser leaves some ambiguity in its output, which is then resolved using empirical patterns. We have implemented a system for generating and applying corpus-based patterns. Some patterns describe the main constituents of the sentence, and others describe the local context of each syntactic function. There are several (partly) redundant patterns, and the ``pattern'' parser selects the analysis of the sentence that matches the strictest possible pattern(s). The system is applied to an experimental corpus. We present the results and discuss possible refinements of the method from a linguistic point of view.
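
The selection step described above, choosing the analysis that matches the strictest pattern, can be illustrated with a minimal sketch. It is not the system's actual implementation. It assumes that an analysis is a sequence of syntactic function tags, that a pattern is a sequence of the same length in which a wildcard leaves a position unconstrained, and that strictness is the number of constrained positions. The tag names, the wildcard marker and the function names are all hypothetical.

```python
from typing import Optional, Sequence, Tuple

WILDCARD = "*"  # hypothetical marker for an unconstrained position

Analysis = Tuple[str, ...]
Pattern = Tuple[str, ...]


def matches(pattern: Pattern, analysis: Analysis) -> bool:
    """True if every constrained position of the pattern equals the tag."""
    return len(pattern) == len(analysis) and all(
        p == WILDCARD or p == a for p, a in zip(pattern, analysis)
    )


def strictness(pattern: Pattern) -> int:
    """Assumed measure: the number of non-wildcard positions."""
    return sum(1 for p in pattern if p != WILDCARD)


def select_analysis(
    analyses: Sequence[Analysis], patterns: Sequence[Pattern]
) -> Optional[Analysis]:
    """Return the analysis matched by the strictest pattern, or None."""
    best: Optional[Analysis] = None
    best_score = -1
    for analysis in analyses:
        for pattern in patterns:
            if matches(pattern, analysis):
                score = strictness(pattern)
                if score > best_score:
                    best, best_score = analysis, score
    return best


# Two analyses left ambiguous by a (hypothetical) rule-based stage.
analyses = [("SUBJ", "MAINV", "ADVL"), ("SUBJ", "MAINV", "OBJ")]
# Partly redundant patterns: the second is a stricter version of the first.
patterns = [("SUBJ", "MAINV", "*"), ("SUBJ", "MAINV", "OBJ")]

chosen = select_analysis(analyses, patterns)
```

Here both analyses match the looser pattern, but only the second matches the fully specified one, so it is the one selected. A real system would need a tie-breaking policy for analyses matched by equally strict patterns; this sketch simply keeps the first one found.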