Function reference • textTinyR

All functions
`COS_TEXT()`	Cosine similarity for text documents
`Count_Rows()`	Number of rows of a file
`Doc2Vec`	Conversion of text documents to word-vector-representation features ( Doc2Vec )
`JACCARD_DICE()`	Jaccard or Dice similarity for text documents
`TEXT_DOC_DISSIM()`	Dissimilarity calculation of text documents
`big_tokenize_transform`	String tokenization and transformation for big data sets
`bytes_converter()`	bytes converter of a text file ( KB, MB or GB )
`cluster_frequency()`	Frequencies of an existing cluster object
`cosine_distance()`	cosine distance of two character strings (each string consists of more than one word)
`dense_2sparse()`	convert a dense matrix to a sparse matrix
`dice_distance()`	dice similarity of words using n-grams
`dims_of_word_vecs()`	dimensions of a word vectors file
`levenshtein_distance()`	levenshtein distance of two words
`load_sparse_binary()`	load a sparse matrix in binary format
`matrix_sparsity()`	sparsity percentage of a sparse matrix
`read_characters()`	read a specific number of characters from a text file
`read_rows()`	read a specific number of rows from a text file
`save_sparse_binary()`	save a sparse matrix in binary format
`select_predictors()`	Exclude highly correlated predictors
`sparse_Means()`	RowMeans and colMeans for a sparse matrix
`sparse_Sums()`	RowSums and colSums for a sparse matrix
`sparse_term_matrix`	Term matrices and statistics ( document-term-matrix, term-document-matrix)
`text_file_parser()`	text file parser
`text_intersect`	intersection of words or letters in tokenized text
`token_stats`	token statistics
`tokenize_transform_text()`	String tokenization and transformation ( character string or path to a file )
`tokenize_transform_vec_docs()`	String tokenization and transformation ( vector of documents )
`utf_locale()`	utf-locale for the available languages
`vocabulary_parser()`	returns the vocabulary counts for small or medium files ( xml and other formats )
