Efficient sequence analysis with bqtools
Grep - Pattern Counting

Unlike most grep-like tools, bqtools grep can count the number of matches found for each pattern. This is a useful feature in bioinformatics and comes up often when exploring data.

Let’s pretend that we have a collection of patterns - perhaps sections of a transcript, a list of cell barcodes, or sgRNA protospacers - and we want to count how many times each of them appears in our dataset.

Let’s take a look at these patterns:

cat patterns/patterns.txt

bqtools grep accepts a text file containing a list of patterns to search for, so let’s count how many times each of these patterns appears in our dataset:

bqtools grep merged.vbq --file patterns/patterns.txt -P

This gives us a table with three columns: the pattern, the number of sequences that match it, and the fraction of all records that match it.
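To make the three columns concrete, here is a minimal Python sketch of the same kind of per-pattern tally. It is not how bqtools is implemented. It assumes plain substring matching and counts each record at most once per pattern, whereas bqtools may interpret patterns differently (for example as regular expressions). The function name `count_pattern_matches` and the toy patterns and sequences are hypothetical and exist only for illustration.

```python
def count_pattern_matches(patterns, sequences):
    """Return one (pattern, n_matching_records, fraction_of_records) row per pattern."""
    total = len(sequences)
    rows = []
    for pattern in patterns:
        # Assumption: a record counts once per pattern, no matter how many
        # times the pattern occurs inside that record.
        n_matches = sum(1 for seq in sequences if pattern in seq)
        fraction = n_matches / total if total else 0.0
        rows.append((pattern, n_matches, fraction))
    return rows


# Hypothetical toy data standing in for the pattern file and the dataset.
patterns = ["ACGT", "TTTT", "GGCC"]
sequences = ["AACGTA", "ACGTACGT", "CCTTTTGG", "GATTACA"]

for pattern, n_matches, fraction in count_pattern_matches(patterns, sequences):
    print(f"{pattern}\t{n_matches}\t{fraction:.2f}")
```

With this toy data, `ACGT` is found in two of the four records, `TTTT` in one, and `GGCC` in none, so the fractions are 0.50, 0.25, and 0.00.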
