Word-level Text Generation from Language Models
Abstract
This research constructs and evaluates text generation models built from three language models, an n-gram model, a Continuous Bag of Words (CBOW) model, and a gated recurrent unit (GRU) model, trained on two corpora, Berkeley Restaurant (Berkeley) and Alice's Adventures in Wonderland (Alice), and evaluated with two metrics: perplexity and count of grammatical errors. The mean perplexities of the three models are comparable on each corpus, although the n-gram model produces slightly lower perplexity values. On the Alice corpus, all three models produce slightly more grammatical errors than the original corpus. On the Berkeley corpus, the n-gram model produces the fewest errors, fewer even than the original corpus, while the CBOW and GRU models produce the highest numbers of errors.
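To illustrate the perplexity metric used in the evaluation, the following is a minimal sketch of how perplexity can be computed for a simple add-one-smoothed bigram language model. This is an illustrative example only; the thesis does not specify its smoothing scheme or tokenization, and the toy corpus below is hypothetical.

```python
import math
from collections import Counter

def bigram_perplexity(train_tokens, test_tokens):
    """Perplexity of an add-one-smoothed bigram model on test_tokens.

    Perplexity is exp of the average negative log-probability the
    model assigns to each test bigram; lower values indicate a
    better fit, as in the thesis's comparison of the three models.
    """
    vocab = set(train_tokens) | set(test_tokens)
    V = len(vocab)
    unigrams = Counter(train_tokens)
    bigrams = Counter(zip(train_tokens, train_tokens[1:]))
    log_prob = 0.0
    n = 0
    for prev, cur in zip(test_tokens, test_tokens[1:]):
        # Add-one (Laplace) smoothing avoids zero probability
        # for bigrams unseen in training.
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + V)
        log_prob += math.log(p)
        n += 1
    return math.exp(-log_prob / n)

# Toy example (hypothetical sentences, not from the thesis corpora):
train = "i want to eat chinese food".split()
test = "i want chinese food".split()
print(bigram_perplexity(train, test))
```

With add-one smoothing, a test bigram seen in training receives a higher probability than an unseen one, so perplexity reflects how well the training corpus predicts the test text.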