vocabulary counts

vocabulary_counts(train_data = NULL, MAX_vocab = 0, MIN_count = 1,
  output_vocabulary = NULL, trace = FALSE)

Arguments

train_data	a character string specifying the path to the training text file
MAX_vocab	a value specifying the maximum number of terms to keep in the vocabulary. A value of 0 keeps all terms.
MIN_count	a value greater than or equal to 1. It specifies the minimum number of occurrences a word needs for inclusion in the vocabulary
output_vocabulary	a character string specifying the path to the output text file
trace	either TRUE or FALSE. If TRUE, information will be printed out
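
To make the roles of MAX_vocab and MIN_count concrete, the following base-R sketch counts whitespace-separated terms in a small in-memory corpus and applies both filters. It is an illustration only, not the GloveR implementation: the helper name sketch_vocabulary_counts, the whitespace tokenisation, and the rule of keeping the most frequent terms when MAX_vocab is positive are all assumptions made for this example.

```r
# Illustrative sketch only: this is NOT the code used by GloveR.
# sketch_vocabulary_counts is a hypothetical helper name.
sketch_vocabulary_counts <- function(lines, MAX_vocab = 0, MIN_count = 1) {
  # Split on whitespace (assumed tokenisation) and drop empty tokens
  tokens <- unlist(strsplit(lines, "[[:space:]]+"))
  tokens <- tokens[nzchar(tokens)]

  # Count occurrences per term as a named integer vector
  tab <- table(tokens)
  counts <- setNames(as.integer(tab), names(tab))

  # Order terms from most to least frequent
  counts <- sort(counts, decreasing = TRUE)

  # Keep only terms that occur at least MIN_count times
  counts <- counts[counts >= MIN_count]

  # MAX_vocab = 0 keeps every term; otherwise keep the top MAX_vocab terms
  if (MAX_vocab > 0 && length(counts) > MAX_vocab) {
    counts <- counts[seq_len(MAX_vocab)]
  }

  data.frame(term = names(counts), count = unname(counts),
             stringsAsFactors = FALSE)
}

toy <- c("the cat sat on the mat", "the dog sat")
vocab_all  <- sketch_vocabulary_counts(toy)                 # all six terms
vocab_min2 <- sketch_vocabulary_counts(toy, MIN_count = 2)  # "the" and "sat"
vocab_top1 <- sketch_vocabulary_counts(toy, MAX_vocab = 1)  # "the" only
```

Raising MIN_count removes rare terms, while a positive MAX_vocab caps the vocabulary size regardless of how many terms pass the count threshold.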

Value

a character string specifying the location of the saved data
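
Because the return value is a file path, the saved vocabulary can be loaded back into R with base functions. The sketch below assumes the file holds one space-separated "term count" pair per line, as the vocab_count tool in the referenced GloVe repository writes. That layout is an assumption about GloveR's output, so the example builds its own toy file (vocab_path is a hypothetical stand-in for the returned path) rather than relying on a real run.

```r
# Stand-in for the path returned by vocabulary_counts(); the file layout
# ("term count" per line) is an assumption, not confirmed by this help page.
vocab_path <- tempfile(fileext = ".txt")
writeLines(c("the 3", "sat 2", "cat 1"), vocab_path)

# quote = "" and comment.char = "" stop terms such as "'" or "#" from being
# interpreted as quoting or comment characters.
vocab <- read.table(vocab_path, sep = " ", quote = "", comment.char = "",
                    col.names = c("term", "count"),
                    stringsAsFactors = FALSE)

head(vocab)
```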

References

https://github.com/stanfordnlp/GloVe

http://nlp.stanford.edu/projects/glove/

http://nlp.stanford.edu/pubs/glove.pdf

Examples


# library(GloveR)

# res = vocabulary_counts(train_data = '/data_GloveR/dat.txt', MAX_vocab = 0,
#                         MIN_count = 5, output_vocabulary = '/data_GloveR/VOCAB.txt',
#                         trace = TRUE)