How to count the number of reads in each chromosome in a bam file?
How to count the number of reads in each chromosome in a bam file? The bam file is already sorted by the chromosome names.
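If the bam file is indexed, you can quickly get this information from the index (column 1 of the idxstats output is the reference name and column 3 is the number of mapped reads):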
samtools idxstats in.bam | awk '{print $1" "$3}'
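As a rough sketch of what else the index reports: samtools idxstats prints four tab-separated columns (reference name, sequence length, mapped reads, unmapped reads), with a final `*` row for reads that have no reference. The helper below, whose name `idxstats_summary` is made up for illustration, prints the mapped and unmapped counts per chromosome plus a grand total. It only reads text from stdin, so it assumes you pipe idxstats output into it.

```shell
# Summarize `samtools idxstats` output read from stdin.
# Columns: 1 = reference name, 2 = length, 3 = mapped, 4 = unmapped.
# idxstats_summary is a hypothetical helper name, not part of samtools.
idxstats_summary() {
    awk -F'\t' '
        { print $1, $3, $4; total += $3 + $4 }   # name, mapped, unmapped
        END { print "total", total + 0 }         # all reads in the file
    '
}

# Usage (assumes samtools is installed and in.bam is indexed):
#   samtools idxstats in.bam | idxstats_summary
```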
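If the bam file is not indexed, you can count the reads with uniq -c, which works here because the file is sorted by chromosome (unmapped reads are reported under "*"):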
samtools view in.bam | awk '{print $3}' | uniq -c
(If it is a sam file like in.sam, replace samtools view in.bam with grep -v '^@' in.sam, so that the header lines, which start with "@", are not counted.)
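The uniq -c approach relies on the reads being grouped by chromosome. If the file might not be sorted, one possible alternative is to tally the third column in an awk associative array. This is only a sketch: the function name `count_reads_per_chrom` is made up for illustration, and it assumes SAM text on stdin. Header lines are skipped, which is safe because read names cannot start with "@".

```shell
# Count reads per reference name from SAM text on stdin.
# Skips header lines (starting with "@"); does not need sorted input.
# count_reads_per_chrom is a hypothetical helper name.
count_reads_per_chrom() {
    awk -F'\t' '!/^@/ { n[$3]++ } END { for (c in n) print c, n[c] }' | sort
}

# Usage with a bam file (assumes samtools is installed):
#   samtools view in.bam | count_reads_per_chrom
# Usage with a sam file:
#   count_reads_per_chrom < in.sam
```

The trade-off is that awk keeps one counter per reference name in memory, which is small for typical genomes, while uniq -c needs no memory but requires sorted input.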
In both cases, samtools provides the tools to parse and show the bam file content.