NAME

vocabulary -- extract vocabularies from Penn treebank files

SYNOPSIS

vocabulary [-NT ntfile] [-POS posfile] [-word wordfile] [-count] [-binarized] [-verbose] file1 [file2...]

File1, file2 etc. are the names of Penn treebank files. If none are specified, STDIN is used.

OPTIONS

DESCRIPTION

Given a list of Penn treebank files, this script extracts the words, parts of speech, and non-terminal node names and emits each in a separate file in order of frequency.

Note that giving a "-" argument for any of ntfile, posfile, or wordfile causes the results to be written to STDOUT.

AUTHOR

W.P. McNeill <billmcn@ssli.ee.washington.edu>
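
EXAMPLE

The extraction outlined in the DESCRIPTION can be illustrated with a minimal Python sketch. It is not the script's own code: the function names `count_vocabulary` and `by_frequency` are hypothetical, and the sketch assumes the usual bracketed Penn treebank notation, treating any `(LABEL token)` pair as a part of speech plus word and any other labeled bracket as a non-terminal. It does not model the -count, -binarized, or -verbose options, and it does not filter empty elements such as `-NONE-`.

```python
import re
from collections import Counter

# A token is an opening bracket, a closing bracket, or a run of other
# non-whitespace characters (a node label or a word).
TOKEN = re.compile(r"\(|\)|[^\s()]+")


def count_vocabulary(text):
    """Count non-terminals, parts of speech, and words in bracketed trees.

    Hypothetical helper: returns three Counters (nts, pos, words).
    """
    nts, pos, words = Counter(), Counter(), Counter()
    tokens = TOKEN.findall(text)
    for i, tok in enumerate(tokens):
        if tok != "(" or i + 1 >= len(tokens):
            continue
        label = tokens[i + 1]
        if label in ("(", ")"):
            # Unlabeled wrapper bracket, as in "( (S ...) )".
            continue
        is_preterminal = (
            i + 3 < len(tokens)
            and tokens[i + 2] not in ("(", ")")
            and tokens[i + 3] == ")"
        )
        if is_preterminal:
            # (POS word): the label is a part of speech, the leaf is a word.
            pos[label] += 1
            words[tokens[i + 2]] += 1
        else:
            # Any other labeled bracket is counted as a non-terminal.
            nts[label] += 1
    return nts, pos, words


def by_frequency(counter):
    """Return the entries of a Counter, most frequent first."""
    return [item for item, _ in counter.most_common()]


if __name__ == "__main__":
    sample = "( (S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT the) (NN cat)))) )"
    nts, pos, words = count_vocabulary(sample)
    for name, counter in (("NT", nts), ("POS", pos), ("word", words)):
        print(name, " ".join(by_frequency(counter)))
```

Each of the three lists corresponds to one of the outputs that vocabulary writes to ntfile, posfile, and wordfile. Entries with equal counts appear here in the order they were first seen, which may differ from the tie-breaking used by the script itself.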