R/wrapper_functions.R
shuffle_cooccurrences.Rd
Function to shuffle the entries of the word-word cooccurrence file
shuffle_cooccurrences(input_cooccurences = NULL, output_cooccurences = NULL, memory_gb = 4, arraySize = 0, trace = FALSE)
| Argument | Description |
|---|---|
| input_cooccurences | a character string specifying the path to the cooccurrence text file |
| output_cooccurences | a character string specifying the path to the shuffled output cooccurrence text file |
| memory_gb | a floating-point number specifying the memory limit, in GB. The limit is based on a simple heuristic, so it is only approximate. The default is 4.0 |
| arraySize | a number specifying the maximum length of the buffer that stores chunks of data to shuffle before they are written to disk. This value overrides the length derived automatically from memory_gb |
| trace | either TRUE or FALSE. If TRUE, progress information is printed |
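The relationship between memory_gb and arraySize can be sketched in plain R. This is a minimal illustration, not the package's implementation: the helpers `estimate_array_size`, `effective_array_size` and `shuffle_in_chunks` are hypothetical, the 0.95 safety factor and the 16-byte record size are assumptions modelled on the upstream GloVe shuffle tool, and treating arraySize = 0 as "no override" is also an assumption. The last helper covers only the within-chunk pass, in memory, and does not mix records across chunks.

```r
# Hypothetical helper (not part of GloveR): estimate how many records fit in
# the buffer for a given memory limit. The 0.95 factor and the 16-byte record
# size are assumptions, not values confirmed by this documentation.
estimate_array_size <- function(memory_gb, record_bytes = 16) {
  floor(0.95 * memory_gb * 1024^3 / record_bytes)
}

# Hypothetical helper: choose the effective buffer length.
# Assumption: arraySize = 0 means "derive the length from memory_gb".
effective_array_size <- function(memory_gb = 4, arraySize = 0) {
  if (arraySize > 0) arraySize else estimate_array_size(memory_gb)
}

# Shuffle a vector of records one buffer-sized chunk at a time.
# Records are permuted within each chunk; chunk order is preserved.
shuffle_in_chunks <- function(records, array_size) {
  chunk_id <- ceiling(seq_along(records) / array_size)
  chunks <- split(records, chunk_id)
  shuffled <- lapply(chunks, function(chunk) chunk[sample.int(length(chunk))])
  unlist(shuffled, use.names = FALSE)
}

set.seed(1)
buffer_len <- effective_array_size(memory_gb = 4, arraySize = 4)
shuffled <- shuffle_in_chunks(1:10, buffer_len)
```

With a buffer length of 4, the ten records are permuted in three chunks (records 1 to 4, 5 to 8, and 9 to 10), which shows why a small arraySize gives a less thorough shuffle per pass than a large one.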
a character string specifying the location of the saved data
https://github.com/stanfordnlp/GloVe
http://nlp.stanford.edu/projects/glove/
http://nlp.stanford.edu/pubs/glove.pdf
# library(GloveR)
# shfl = shuffle_cooccurrences(input_cooccurences = '/data_GloveR/COOCUR.bin',
#                              output_cooccurences = '/data_GloveR/COOCUR_output.bin',
#                              memory_gb = 4.0, arraySize = 0, trace = TRUE)