Let’s output the first few k-mers and their counts in FASTA format:
For example, this FASTA record:
>2
AAGTTTTCA
means the k-mer AAGTTTTCA was seen twice.
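If you want to work with these records outside of Jellyfish, the layout is simple to read back: the header line holds the count and the sequence line holds the k-mer. The following is a minimal Python sketch under that assumption; the function name `parse_kmer_dump` is made up for illustration and is not part of Jellyfish.

```python
from io import StringIO


def parse_kmer_dump(handle):
    """Read FASTA-style k-mer records into a {kmer: count} dictionary.

    Assumes each record is two lines: '>' followed by the count,
    then the k-mer sequence itself.
    """
    counts = {}
    pending_count = None
    for raw_line in handle:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: everything after '>' is the count.
            pending_count = int(line[1:])
        else:
            if pending_count is None:
                raise ValueError(f"sequence line without a header: {line!r}")
            counts[line] = pending_count
            pending_count = None
    return counts


# The example record from the text, held in memory instead of a file.
example = StringIO(">2\nAAGTTTTCA\n")
print(parse_kmer_dump(example))  # {'AAGTTTTCA': 2}
```

For a real dump you would pass an open file handle instead of the `StringIO` object.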
To query for a particular k-mer of interest, say ACAGTGGAC, you can use jellyfish query:
This tells us that the ACAGTGGAC k-mer is found in Dengue but not in Chikungunya.
To get k-mers found in Dengue but not Chikungunya, we can use jellyfish count --if:
The distribution of k-mers now looks different:
This means there are 10,764 k-mers from the Chikungunya genome that were not found in the Dengue genome.
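The underlying idea, k-mers present in one genome and absent from another, can be illustrated with a plain set difference. This is only a conceptual sketch in Python, not how Jellyfish works internally: the sequences are toy strings rather than the real viral genomes, the helper names are hypothetical, and strand (reverse-complement) handling is ignored.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmers_only_in(first, second, k):
    """Return the k-mers that occur in `first` but never in `second`."""
    return kmers(first, k) - kmers(second, k)


# Toy sequences standing in for two genomes (illustrative only).
genome_a = "ACGTTG"
genome_b = "ACGTAA"

unique_to_a = kmers_only_in(genome_a, genome_b, 3)
print(len(unique_to_a), sorted(unique_to_a))  # 2 ['GTT', 'TTG']
```

Holding every k-mer in a Python set is fine for genomes of this size, but it does not scale the way a dedicated counter such as Jellyfish does.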