samtools-cat(1) Bioinformatics tools samtools-cat(1)

samtools cat - concatenate files together

samtools cat [-b list] [-h header.sam] [-o out.bam] in1.bam in2.bam [ ... ]

Concatenate BAMs or CRAMs. Although this works on either BAM or CRAM, all input files must be the same format as each other. The sequence dictionary of each input file must be identical, although this command does not check this. This command uses a similar trick to reheader which enables fast BAM concatenation.
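Because the sequence dictionaries are not checked, it may be worth comparing them before concatenating. The following is a minimal sketch, not part of samtools: same_sq_lines is a hypothetical helper that compares the @SQ lines of two header files previously dumped with samtools view -H.

```shell
# Hypothetical helper (not part of samtools): succeed only if the two SAM
# header text files contain identical @SQ lines in the same order.
same_sq_lines() {
    a=$(grep '^@SQ' "$1" || true)
    b=$(grep '^@SQ' "$2" || true)
    [ "$a" = "$b" ]
}

# Usage (assumes samtools is installed and in1.bam / in2.bam exist):
#   samtools view -H in1.bam > h1.sam
#   samtools view -H in2.bam > h2.sam
#   same_sq_lines h1.sam h2.sam && samtools cat -o out.bam in1.bam in2.bam
```

Other header lines such as @PG are ignored by this check, since only the sequence dictionary has to match.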

-b FOFN
Read the list of input BAM or CRAM files from FOFN. These are concatenated prior to any files specified on the command line. Multiple -b FOFN options may be specified to concatenate multiple lists of BAM/CRAM files.
-h FILE
Use the SAM header from FILE. By default the header is taken from the first file to be concatenated.
-o FILE
Write the concatenated output to FILE. By default this is sent to stdout.
-q
[CRAM only] Query the number of containers in the CRAM file. The output is the filename, the number of containers, and the first and last container number as an inclusive range, with one file per line.

Note this works in conjunction with the -r RANGE option, in which case the 3rd and 4th columns become useful for identifying which containers span the requested range.

-r RANGE
[CRAM only] Filter the CRAM file to a specific RANGE. This can be the usual chromosome:start-end syntax, or "*" for unmapped records at the end of alignments.

If the range is of the form "#:start-end" then the start and end coordinates are interpreted as inclusive CRAM container numbers, starting at 0 and ending 1 less than the number of containers reported by -q. For example -r "#:0-9" is the first 10 CRAM containers of data.

All range types filter data in as fast a manner as possible, using operating system read/write loops where appropriate.

-p A/B
[CRAM only] Filter the CRAM file using a specific fraction. The file is split into B approximately equal parts and part A is returned, where A is between 1 and B inclusive. If more parts are specified than there are CRAM containers, some of the output will be empty CRAMs.

This can also be combined with the range option above to operate on parts of that range. For example -r chr2 -p 1/10 returns the first 1/10th of data aligned against chromosome 2.

-f
[CRAM only] Enable fast mode. When filtering by chromosome range with -r we normally do careful recoding of any containers that overlap the start and end of the range so the record count precisely matches that returned by a samtools view equivalent. Fast mode does no filtering, so may return additional alignments in the same container but outside of the requested region.
--no-PG
Do not add a @PG line to the header of the output file.
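To see why the fraction option can produce empty output when more parts are requested than there are containers, the arithmetic can be sketched as below. This is an illustration only: part_range is a hypothetical helper, and the integer-division scheme it uses is an assumption, not necessarily how samtools divides the containers.

```shell
# Illustration only. ASSUMPTION: part A of B covers the half-open container
# interval [ (A-1)*N/B , A*N/B ) using integer division. The split used by
# samtools itself may differ; this only shows why B > N yields empty parts.
part_range() {
    a=$1 b=$2 n=$3
    start=$(( (a - 1) * n / b ))
    end=$(( a * n / b ))            # exclusive upper bound
    if [ "$start" -ge "$end" ]; then
        echo "empty"
    else
        echo "$start-$(( end - 1 ))"   # inclusive range, as in -r "#:start-end"
    fi
}

# Usage: part_range A B N, e.g. part 1 of 10 over 100 containers
#   part_range 1 10 100
```

Under this assumed scheme, asking for 4 parts of a 2-container file leaves parts 1 and 3 empty, which matches the behaviour described for -p in general terms.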

Extract a specific chromosome from a CRAM file, outputting to a new CRAM.


samtools cat -o chr10.cram -r chr10 in.cram
    

Split a CRAM file up into separate files, each containing at most 123 containers.


set -- $(samtools cat -q in.cram); nc=$2; s=0
while [ $s -lt $nc ]
do
    # e is one past the last container of this chunk, capped at nc
    e=`expr $s + 123`
    if [ $e -ge $nc ]
    then
        e=$nc
    fi
    r="$s-`expr $e - 1`"; echo $r
    fn=/tmp/_part-`printf "%08d" $s`.cram
    samtools cat -o $fn in.cram -r "#:$r"
    s=$e
done
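The parts written by this loop can be joined again by listing them in a FOFN and passing it with -b. The following is a minimal sketch under stated assumptions: make_part_list is a hypothetical helper, the list file and output names are placeholders, and the FOFN is assumed to hold one filename per line.

```shell
# Hypothetical helper: write a FOFN (one filename per line) listing the part
# files found in DIR. The shell expands the glob in sorted order, which
# matches container order here because the names are zero-padded.
make_part_list() {
    dir=$1 list=$2
    for f in "$dir"/_part-*.cram
    do
        # Skip the literal pattern when no part files exist.
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done > "$list"
}

# Usage (assumes samtools is installed and the parts were written to /tmp):
#   make_part_list /tmp parts.fofn
#   samtools cat -b parts.fofn -o rejoined.cram
```

Zero-padding the start container number in the filenames is what makes plain lexical ordering safe; unpadded names would sort 1000 before 123.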

Split any unaligned data from a (potentially aligned) CRAM file into 10 approximately equal sized pieces.


for i in `seq 1 10`
do
    samtools cat in.cram -r "*" -p $i/10 -o part-`printf "%02d" $i`.cram
done

Written by Heng Li from the Sanger Institute. Updated for CRAM by James Bonfield (also Sanger Institute).

samtools(1)

Samtools website: <http://www.htslib.org/>

30 May 2025 samtools-1.22
