Function to shuffle the entries of the word-word cooccurrence file

shuffle_cooccurrences(input_cooccurences = NULL,
  output_cooccurences = NULL, memory_gb = 4, arraySize = 0,
  trace = FALSE)

Arguments

input_cooccurences

a character string specifying the path to the cooccurrences text file

output_cooccurences

a character string specifying the path to the output text file that will hold the shuffled cooccurrences

memory_gb

a numeric value specifying the limit for memory consumption, in GB. The limit is based on a simple heuristic, so it is not extremely accurate. The default is 4.0

arraySize

a number specifying the maximum length of the buffer that stores chunks of data to shuffle before writing to disk. This value overrides the buffer length that is automatically derived from memory_gb

trace

either TRUE or FALSE. If TRUE, information will be printed out

Value

a character string specifying the location of the saved data

References

https://github.com/stanfordnlp/GloVe

http://nlp.stanford.edu/projects/glove/

http://nlp.stanford.edu/pubs/glove.pdf

Examples

# library(GloveR)
# shfl = shuffle_cooccurrences(input_cooccurences = '/data_GloveR/COOCUR.bin',
#                              output_cooccurences = '/data_GloveR/COOCUR_output.bin',
#                              memory_gb = 4.0, arraySize = 0, trace = TRUE)
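
The role of the buffer limit set by arraySize can be pictured with a small sketch. This is only an illustration under stated assumptions, not the package's implementation: chunked_shuffle is a hypothetical helper, the records are made-up strings held in memory rather than read from a cooccurrence file, and only the "shuffle one buffer at a time" step is shown.

```r
# Hypothetical helper, NOT part of GloveR: shuffle records one buffer at a
# time, so that at most `array_size` records are permuted together.
chunked_shuffle <- function(records, array_size) {
  stopifnot(array_size >= 1)
  n <- length(records)
  if (n == 0) return(records)

  out <- records[0]                       # empty vector of the same type
  for (start in seq(1, n, by = array_size)) {
    end <- min(start + array_size - 1, n)
    buffer <- records[start:end]          # fill the buffer with one chunk
    buffer <- buffer[sample.int(length(buffer))]  # permute within the buffer
    out <- c(out, buffer)                 # stands in for "write to disk"
  }
  out
}

set.seed(1)
# Made-up records of the form "word_i word_j count"
recs <- sprintf("w%d w%d %d", 1:10, 10:1, 1:10)
shuffled <- chunked_shuffle(recs, array_size = 4)
```

In this sketch every record stays inside its own chunk, so a smaller buffer gives a less thorough shuffle; a complete out-of-core shuffle would need a further pass that mixes records across chunks.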