vocabulary counts
vocabulary_counts(train_data = NULL, MAX_vocab = 0, MIN_count = 1, output_vocabulary = NULL, trace = FALSE)
train_data | a character string specifying the path to the train text file |
---|---|
MAX_vocab | a value specifying the maximum number of terms to keep in the vocabulary. A MAX_vocab value of 0 keeps all terms. |
MIN_count | a value greater than or equal to 1. It specifies the minimum number of occurrences a word needs to be included in the vocabulary |
output_vocabulary | a character string specifying the path to the output text file |
trace | either TRUE or FALSE. If TRUE, information will be printed out |
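
The effect of MAX_vocab and MIN_count can be illustrated with a small base R sketch. This is not the GloveR implementation: `toy_vocabulary_counts` and `tokens` are hypothetical names, the input is an in-memory character vector rather than a text file, and the alphabetical tie-breaking is an assumption made only so the result is deterministic.

```r
# Illustrative sketch in base R; NOT the GloveR internals.
# `toy_vocabulary_counts` and `tokens` are hypothetical names.
toy_vocabulary_counts <- function(tokens, MAX_vocab = 0, MIN_count = 1) {
  counts <- table(tokens)
  # Sort by descending count; ties are broken alphabetically (an assumption)
  ord <- order(-as.integer(counts), names(counts))
  vocab <- data.frame(term = names(counts)[ord],
                      count = as.integer(counts)[ord],
                      stringsAsFactors = FALSE)
  # MIN_count: drop terms that occur fewer than MIN_count times
  vocab <- vocab[vocab$count >= MIN_count, , drop = FALSE]
  # MAX_vocab: keep only the most frequent terms; 0 means no limit
  if (MAX_vocab > 0 && nrow(vocab) > MAX_vocab) {
    vocab <- vocab[seq_len(MAX_vocab), , drop = FALSE]
  }
  rownames(vocab) <- NULL
  vocab
}

tokens <- strsplit("the cat sat on the mat the cat", " ", fixed = TRUE)[[1]]
toy_vocabulary_counts(tokens, MAX_vocab = 0, MIN_count = 2)
```

With MIN_count = 2 only "the" and "cat" remain, because every other word occurs once. Setting MAX_vocab = 1 would keep only "the".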
a character string specifying the location of the saved data
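
Once the vocabulary has been written, the saved file can be loaded back into R. The sketch below assumes the file follows the layout used by the upstream GloVe vocab_count tool, one space-separated `term count` pair per line; that layout is an assumption about GloveR's output, and the temporary file with its two rows is a hypothetical stand-in for a real output file.

```r
# Hypothetical stand-in for the file written by vocabulary_counts();
# the "term count" layout is assumed from the upstream GloVe tool.
vocab_path <- tempfile(fileext = ".txt")
writeLines(c("the 3", "cat 2"), vocab_path)

# Read the space-separated pairs into a two-column data frame
vocab <- read.table(vocab_path, sep = " ", quote = "", comment.char = "",
                    col.names = c("term", "count"),
                    colClasses = c("character", "integer"))

# Terms with the highest counts come first if the file is sorted by frequency
head(vocab)
```

Disabling `quote` and `comment.char` keeps tokens such as `'` or `#` from being treated as quoting or comment markers.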
https://github.com/stanfordnlp/GloVe
http://nlp.stanford.edu/projects/glove/
http://nlp.stanford.edu/pubs/glove.pdf
# library(GloveR)
# res = vocabulary_counts(train_data = '/data_GloveR/dat.txt', MAX_vocab = 0,
#                         MIN_count = 5, output_vocabulary = '/data_GloveR/VOCAB.txt', trace = T)