Missing SAM header with minimap2 and samtools

When mapping sequencing reads onto a reference with minimap2, you might use a command like this one (be careful: it is wrong, as you will see below):

minimap2 -a -x map-pb test.fastq reference.fasta > minimap.sam

The command is verbose and prints information like the following. Note the WARNING:

[M::mm_idx_gen::0.338*0.98] collected minimizers
[M::mm_idx_gen::0.464*1.19] sorted minimizers
[WARNING] For a multi-part index, no @SQ lines will be outputted.
[M::main::0.464*1.19] loaded/built the index for 863 target sequence(s)
......

Then, if you try to convert or read this file, you will most probably get an error. For instance, converting this SAM file to BAM format with samtools gives this error message:

[E::sam_parse1] missing SAM header
[W::sam_read1] Parse error at line 2
[main_samview] truncated file.

The solution took me a while to find, but it is very simple: if you check the help message of minimap2, you will see that the reference (the target) must be provided first, before the reads. So the command above should be:

minimap2 -a -x map-pb reference.fasta test.fastq > minimap.sam

That is, the reference comes first, followed by the reads.
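
To catch this kind of mistake before the conversion step, one option is to check that the SAM file contains at least one @SQ header line. The following is only a minimal sketch: the function names has_sq_header and sam_to_bam are hypothetical helpers made up for this example (they are not part of minimap2 or samtools), and it assumes grep and samtools are available on your PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of minimap2 or samtools):
# succeeds only if the SAM file contains at least one @SQ header line.
has_sq_header() {
    grep -q '^@SQ' "$1"
}

# Hypothetical wrapper: convert SAM ($1) to BAM ($2) only when @SQ lines exist.
sam_to_bam() {
    if has_sq_header "$1"; then
        samtools view -b -o "$2" "$1"
    else
        echo "$1 has no @SQ lines: was the reference given first to minimap2?" >&2
        return 1
    fi
}
```

With these definitions loaded, calling sam_to_bam minimap.sam minimap.bam should refuse to convert a file produced by the swapped command and print a hint instead of the less obvious samtools parse error.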
