Abstract

Grammar is a crucial component of most natural language modeling systems: it bounds the range of constructions that a system can handle. However, the conventional method of manual grammar construction is labor-intensive and time-consuming, and it often leads to errors caused by unwanted rule interactions. Moving beyond this traditional approach requires more practical and robust techniques for grammar construction. This talk introduces an innovative approach to grammar construction that uses a grammatical inference technique to acquire linguistic knowledge from large corpora. Experimental results show that this approach can infer a broad-coverage, linguistically meaningful stochastic context-free grammar without significant manual work.