GloVe (Global Vectors for Word Representation)
Glove(input_cooccurences = NULL, output_vectors = NULL, vocab_input = NULL, model_output = 0, iter_num = 5, learn_rate = 0.1, save_squared_grads_file = NULL, alpha_weight = 0.75, cutoff = 100, binary_output = 0, vectorSize = 10, threads = 1, trace = FALSE)
Argument | Description |
---|---|
input_cooccurences | a character string specifying the path to the co-occurrences text file |
output_vectors | a character string specifying the path to the output vectors file(s) (depending on the binary_output parameter, the output is a .bin file, a .txt file, or both) |
vocab_input | a character string specifying the path to the vocabulary text file (the output file of the vocabulary_counts function) |
model_output | an integer specifying the model-for-word-vector-output (for text output only). [ 0: output all data, for both word and context word vectors, including bias terms; 1: output word vectors, excluding bias terms; 2: output word vectors + context word vectors, excluding bias terms ] |
iter_num | an integer specifying the number of training iterations |
learn_rate | a floating-point number specifying the learning rate |
save_squared_grads_file | either NULL or a character string specifying the path where the accumulated squared gradients should be saved |
alpha_weight | a floating-point number specifying the exponent of the weighting function |
cutoff | a number specifying the cutoff parameter of the weighting function |
binary_output | an integer specifying the output format of the saved data (0: text, 1: binary, 2: both) |
vectorSize | a number specifying the dimension of word vector representations (excluding the bias term) |
threads | an integer specifying the number of threads to run in parallel |
trace | either TRUE or FALSE. If TRUE, progress information is printed to the console |
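
The alpha_weight and cutoff arguments refer to the weighting function of the GloVe paper. As a sketch, and assuming that alpha_weight corresponds to the exponent $\alpha$ and cutoff to $x_{\max}$ in that paper (this page does not state the mapping explicitly), a co-occurrence count $x$ is weighted as:

```latex
f(x) =
\begin{cases}
  \left( x / x_{\max} \right)^{\alpha} & \text{if } x < x_{\max} \\
  1 & \text{otherwise}
\end{cases}
```

Under that assumption, with the defaults cutoff = 100 and alpha_weight = 0.75, a pair that co-occurs 16 times would receive a weight of about $0.16^{0.75} \approx 0.25$, while any pair with a count of 100 or more receives the full weight of 1.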
Returns a character string specifying the location of the saved data and the number of word vectors.
https://github.com/stanfordnlp/GloVe
http://nlp.stanford.edu/projects/glove/
http://nlp.stanford.edu/pubs/glove.pdf
# library(GloveR)
# gl = Glove(input_cooccurences = '/data_GloveR/COOCUR_output.bin',
#            output_vectors = '/data_GloveR/vectors',
#            vocab_input = '/data_GloveR/VOCAB.txt',
#            model_output = 2, iter_num = 5, learn_rate = 0.05, save_squared_grads_file = NULL,
#            alpha_weight = 0.75, cutoff = 10, binary_output = 0, vectorSize = 50, threads = 6, trace = TRUE)
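
Once training with binary_output = 0 has finished, the text output can be loaded back into R. The sketch below is not part of GloveR: `read_glove_text` and `cosine_similarity` are hypothetical helpers, and the file layout (one word per line followed by its space-separated vector values, with no bias columns) is an assumption based on the Stanford GloVe text format for model_output = 1 or 2. To stay runnable without a trained model, it writes a tiny stand-in file to a temporary location instead of reading a real output file.

```r
# Hypothetical helper (not part of GloveR): parse a GloVe-style text file,
# assumed to hold one "word v1 v2 ... vn" entry per line.
read_glove_text <- function(path) {
  lines <- readLines(path, warn = FALSE)
  parts <- strsplit(lines, " ", fixed = TRUE)
  words <- vapply(parts, function(p) p[1], character(1))
  # Bind the numeric part of every line into a words-by-dimensions matrix
  vecs <- do.call(rbind, lapply(parts, function(p) as.numeric(p[-1])))
  rownames(vecs) <- words
  vecs
}

# Hypothetical helper: cosine similarity between two numeric vectors
cosine_similarity <- function(a, b) {
  sum(a * b) / sqrt(sum(a^2) * sum(b^2))
}

# Stand-in for a real output file, written in the assumed format
tmp <- tempfile(fileext = ".txt")
writeLines(c("king 1 0 1", "queen 1 0 1", "apple 0 1 0"), tmp)

vectors <- read_glove_text(tmp)
sim_king_queen <- cosine_similarity(vectors["king", ], vectors["queen", ])
sim_king_apple <- cosine_similarity(vectors["king", ], vectors["apple", ])
print(dim(vectors))
print(c(sim_king_queen, sim_king_apple))
```

For a real run, `tmp` would be replaced by the path of the text file written by Glove. The exact file name derived from output_vectors is not specified here, so check the string returned by the function for the actual location.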