R/wrapper_functions.R
shuffle_cooccurrences.Rd
Function to shuffle the entries of the word-word cooccurrence file
shuffle_cooccurrences(input_cooccurences = NULL, output_cooccurences = NULL, memory_gb = 4, arraySize = 0, trace = FALSE)
Argument | Description |
---|---|
input_cooccurences | a character string specifying the path to the cooccurrence text file |
output_cooccurences | a character string specifying the path to the shuffled output cooccurrence text file |
memory_gb | a number specifying the memory limit in GB. The limit is applied with a simple heuristic, so it is only approximate. The default is 4.0 |
arraySize | a number specifying the maximum length of the buffer that stores chunks of data to shuffle before writing to disk. This value overrides the length derived automatically from memory_gb |
trace | either TRUE or FALSE. If TRUE, information will be printed out |
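
The relationship between memory_gb and arraySize can be illustrated with a minimal sketch. It is not the package's implementation: the record size (two 4-byte integers plus one 8-byte double) and the 0.95 safety factor are assumptions loosely modelled on the upstream GloVe shuffle tool, and `buffer_length()` and `shuffle_in_chunks()` are hypothetical helper names. The sketch only permutes records within each chunk; a complete shuffle would also need to mix records across chunks.

```r
# Assumed constants (not taken from the GloveR documentation):
# each cooccurrence record is two 4-byte integers plus one 8-byte double,
# and only 95% of the requested memory is used for the buffer.
record_bytes <- 16
safety_factor <- 0.95

# Hypothetical helper: translate a memory limit in GB into a buffer length.
# A positive array_size takes precedence, mirroring the documented override.
buffer_length <- function(memory_gb = 4, array_size = 0) {
  if (array_size > 0) return(array_size)
  floor(safety_factor * memory_gb * 1024^3 / record_bytes)
}

# Hypothetical helper: permute records chunk by chunk, so that at most
# `chunk_len` records are held and shuffled at a time.
shuffle_in_chunks <- function(records, chunk_len) {
  if (length(records) == 0) return(records)
  starts <- seq(1, length(records), by = chunk_len)
  out <- vector("list", length(starts))
  for (i in seq_along(starts)) {
    idx <- starts[i]:min(starts[i] + chunk_len - 1, length(records))
    chunk <- records[idx]
    # sample.int() returns a random permutation of 1..length(chunk)
    out[[i]] <- chunk[sample.int(length(chunk))]
  }
  unlist(out)
}

# Usage: ten toy records shuffled with a buffer of four records.
set.seed(1)
toy_records <- 1:10
shuffled <- shuffle_in_chunks(toy_records, chunk_len = 4)
```

Under these assumptions a smaller buffer lowers peak memory use but restricts how far any record can move in a single pass, which is why the buffer length is exposed as a tuning parameter.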
a character string specifying the location of the saved data
https://github.com/stanfordnlp/GloVe
http://nlp.stanford.edu/projects/glove/
http://nlp.stanford.edu/pubs/glove.pdf
# library(GloveR)
# shfl = shuffle_cooccurrences(input_cooccurences = '/data_GloveR/COOCUR.bin',
#                              output_cooccurences = '/data_GloveR/COOCUR_output.bin',
#                              memory_gb = 4.0, arraySize = 0, trace = TRUE)
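
Because the example above is commented out and depends on files that may not exist, a guarded call can be sketched as follows. `safe_shuffle()` is a hypothetical wrapper, not part of GloveR; it assumes only the `shuffle_cooccurrences()` signature shown in the usage line and skips the call when the package or the input file is missing.

```r
# Hypothetical wrapper (not part of GloveR): run the shuffle step only when
# the package is installed and the input cooccurrence file exists.
safe_shuffle <- function(input_path, output_path, memory_gb = 4.0, trace = FALSE) {
  if (!requireNamespace("GloveR", quietly = TRUE)) {
    message("GloveR is not installed; skipping the shuffle step.")
    return(invisible(NULL))
  }
  if (!file.exists(input_path)) {
    message("Input cooccurrence file not found: ", input_path)
    return(invisible(NULL))
  }
  # Argument names follow the spelling used in the documented signature.
  GloveR::shuffle_cooccurrences(input_cooccurences = input_path,
                                output_cooccurences = output_path,
                                memory_gb = memory_gb,
                                arraySize = 0,
                                trace = trace)
}

# Usage with a path that does not exist: the wrapper returns NULL
# instead of calling the shuffle function.
res <- safe_shuffle(tempfile(fileext = ".bin"), tempfile(fileext = ".bin"))
```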