
This function calculates a d2_S-type dissimilarity between the n sequences (each of which can represent a sample) in a FASTA file. See doi:10.1186/s12859-016-1186-3 for details.

Usage

fasta2dist(
  ...,
  outputFile = NULL,
  threads = 2,
  kmer = 6,
  normalize = FALSE,
  compress = TRUE,
  verbose = FALSE
)

Arguments

...

Locations of the input FASTA files (uncompressed or gzip-compressed).

outputFile

Location of the output distances file.

threads

Number of Java threads to use.

kmer

K-mer length to use when analyzing the FASTA sequences.

normalize

Logical. If TRUE, normalize by sequence length.

compress

Compress output (adds .gz extension).

verbose

Logical. If TRUE, enables verbose output from the Java backend.
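
To illustrate what the kmer argument controls, the following base R sketch counts the overlapping k-mers of a single sequence. It is an illustration only: count_kmers is a hypothetical helper, not part of fastreeR, and it does not reproduce the package's Java implementation or the d2_S computation itself.

```r
# Hypothetical helper (not part of fastreeR): count the overlapping
# k-mers of one sequence using base R only.
count_kmers <- function(sequence, k = 6) {
    n <- nchar(sequence)
    if (n < k) {
        # Sequence is shorter than k: there are no k-mers to count.
        return(table(character(0)))
    }
    # Start position of every overlapping window of length k.
    starts <- seq_len(n - k + 1)
    kmers <- substring(sequence, starts, starts + k - 1)
    table(kmers)
}

# A 10-base toy sequence has 10 - 6 + 1 = 5 overlapping 6-mers.
counts <- count_kmers("ACGTACGTAC", k = 6)
print(counts)
```

In this toy sequence the 6-mer ACGTAC occurs twice and the other three 6-mers occur once each. Larger values of kmer give more specific but sparser count profiles.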

Value

A dist object containing the calculated distances.

References

Java implementation: https://github.com/gkanogiannis/BioInfoJava-Utils

Author

Anestis Gkanogiannis, anestis@gkanogiannis.com

Examples

my.dist <- fasta2dist(
    inputfile = system.file("extdata", "samples.fasta.gz",
        package = "fastreeR"
    )
)
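
Because the return value is a dist object, it can be passed to base R functions that accept one. The sketch below sets the documented arguments explicitly and then runs hierarchical clustering with stats::hclust. It assumes that fastreeR and its Java backend are installed and is skipped otherwise; the choice of hclust is only an example of downstream use, not something the package prescribes.

```r
# Assumption: fastreeR and a working Java setup are available.
# The block is skipped when the package is not installed.
if (requireNamespace("fastreeR", quietly = TRUE)) {
    fasta <- system.file("extdata", "samples.fasta.gz",
        package = "fastreeR"
    )

    # Same call as in the example above, with documented arguments
    # written out at their default values.
    my.dist <- fastreeR::fasta2dist(
        inputfile = fasta,
        kmer = 6,
        threads = 2,
        normalize = FALSE
    )

    # A dist object converts to a full symmetric matrix ...
    dist.matrix <- as.matrix(my.dist)
    print(dim(dist.matrix))

    # ... and can be clustered directly with base R.
    my.hclust <- stats::hclust(my.dist)
    print(my.hclust)
}
```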