GENERAND(1) FreeBSD General Commands Manual GENERAND(1)

generand - Generate random genomics data in various formats

generand fasta sequences sequence-length
generand fastq sequences sequence-length
generand sam chromosomes alignments-per-chromosome sequence-length
generand vcf chromosomes calls-per-chromosome samples

generand is a simple program to rapidly generate random genomics data streams in common formats such as FASTA, FASTQ, SAM, and VCF.

This may be useful for generating very short examples for academic purposes or large streams for testing and benchmarking genomics programs.

generand fast[aq] sequences sequence-length generates a FASTA or FASTQ stream containing "sequences" records, each of length "sequence-length". Bases are drawn from a uniform distribution, so GC content should be very close to 50%.
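The uniform-base property can be illustrated with a minimal Python sketch. This is not generand's implementation; the function names, the "seq1", "seq2", ... header labels, and the seed parameter are assumptions made for the example.

```python
import random

def random_fasta(sequences, sequence_length, seed=0):
    """Return FASTA text with uniformly random bases (illustrative only)."""
    rng = random.Random(seed)
    records = []
    for i in range(1, sequences + 1):
        # Each base is chosen independently with probability 1/4.
        seq = "".join(rng.choice("ACGT") for _ in range(sequence_length))
        records.append(f">seq{i}\n{seq}")  # header label is hypothetical
    return "\n".join(records) + "\n"

def gc_fraction(fasta_text):
    """Fraction of G and C bases over all sequence lines."""
    bases = "".join(
        line for line in fasta_text.splitlines() if not line.startswith(">")
    )
    return sum(b in "GC" for b in bases) / len(bases)

if __name__ == "__main__":
    text = random_fasta(100, 200)
    print(f"GC fraction: {gc_fraction(text):.3f}")
```

With uniform sampling, the measured GC fraction approaches 0.5 as the total number of bases grows.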

PHRED scores in FASTQ streams are generated in blocks of equal scores and are mostly high-quality. The last few scores are lower quality and independent to simulate Illumina sequencing, where quality tends to drop near the end of each read.
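A rough Python sketch of this quality pattern follows. The block size, tail length, and score ranges are assumptions chosen for illustration; the actual values used by generand are not stated here. The encoding step uses the common Phred+33 ASCII offset.

```python
import random

def random_phred(length, block=10, tail=5, seed=0):
    """Return a list of PHRED scores: equal-score blocks, then a weaker tail.

    block, tail, and the score ranges are hypothetical parameters.
    """
    rng = random.Random(seed)
    tail = min(tail, length)
    scores = []
    # High-quality region: one random score repeated across each block.
    while len(scores) < length - tail:
        q = rng.randint(30, 40)
        scores.extend([q] * block)
    scores = scores[: length - tail]
    # Tail: independent, lower scores to mimic the end-of-read quality drop.
    scores += [rng.randint(2, 20) for _ in range(tail)]
    return scores

def to_fastq_quality(scores):
    """Encode PHRED scores as a Phred+33 quality string."""
    return "".join(chr(q + 33) for q in scores)

if __name__ == "__main__":
    print(to_fastq_quality(random_phred(50)))
```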

generand sam chromosomes alignments-per-chromosome sequence-length generates a SAM stream with chromosomes * alignments-per-chromosome total alignments. It outputs increasing indexes for QNAME and CHROM, randomly increasing POS, random QUAL scores, and random sequences and PHRED scores as stated for FASTQ above.
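The record count and ordering described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the "read" and "chr" name prefixes, the POS step range, and the placeholder values for FLAG, mapping quality, CIGAR, and the quality string are not taken from generand.

```python
import random

def random_sam_records(chromosomes, per_chromosome, sequence_length, seed=0):
    """Return SAM-like alignment lines (11 tab-separated mandatory fields)."""
    rng = random.Random(seed)
    lines = []
    qname = 0
    for chrom in range(1, chromosomes + 1):
        pos = 0  # POS restarts for each chromosome
        for _ in range(per_chromosome):
            qname += 1
            pos += rng.randint(1, 1000)  # strictly increasing within a chromosome
            seq = "".join(rng.choice("ACGT") for _ in range(sequence_length))
            fields = [
                f"read{qname}",           # QNAME (hypothetical prefix)
                "0",                      # FLAG (placeholder)
                f"chr{chrom}",            # reference name (hypothetical prefix)
                str(pos),                 # POS
                str(rng.randint(0, 60)),  # mapping quality (placeholder range)
                f"{sequence_length}M",    # CIGAR (placeholder: all matches)
                "*", "0", "0",            # RNEXT, PNEXT, TLEN (unpaired)
                seq,                      # SEQ
                "I" * sequence_length,    # base qualities (constant placeholder)
            ]
            lines.append("\t".join(fields))
    return lines

if __name__ == "__main__":
    for line in random_sam_records(2, 3, 20):
        print(line)
```

The total number of lines equals chromosomes * alignments-per-chromosome, matching the description.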

generand vcf chromosomes calls-per-chromosome samples generates a VCF stream with chromosomes * calls-per-chromosome calls. It outputs chromosomes with increasing indexes, randomly increasing POS, uniformly random REF and ALT, uniformly random QUAL scores, and random sample columns including GT (genotype), AD (allelic depth) and DP (depth). In the AD data, REF counts are always >= ALT counts, and DP = REF count + ALT count.
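The AD and DP constraints on sample columns can be sketched in Python as shown below. The GT:AD:DP field order, the genotype choices, and the max_depth parameter are assumptions for this example, not details of generand itself.

```python
import random

def random_sample_field(rng, max_depth=100):
    """Return one sample column as GT:AD:DP with REF count >= ALT count."""
    ref_count = rng.randint(0, max_depth)
    alt_count = rng.randint(0, ref_count)   # never exceeds the REF count
    gt = rng.choice(["0/0", "0/1", "1/1"])  # hypothetical genotype set
    depth = ref_count + alt_count           # DP = REF count + ALT count
    return f"{gt}:{ref_count},{alt_count}:{depth}"

if __name__ == "__main__":
    rng = random.Random(0)
    print("\t".join(random_sample_field(rng) for _ in range(3)))
```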

bcftools, fastqc, samtools, vcftools

J. Bacon
