Extract and filter
The seqkit seq command is used to extract, filter and format your FASTA and FASTQ files.
For example, to extract the sequence names from a FASTA file:
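As a minimal sketch, assuming a small made-up file called example.fa (the file name and its records are illustrative, not from the tutorial data), the --name flag prints the header lines only:

```shell
# Create a tiny FASTA file for illustration (hypothetical records).
cat > example.fa <<'EOF'
>seq1 first example record
ACGTACGTAC
>seq2 second example record
GGGTTTAAAC
EOF

# Print only the sequence names (full header lines), not the sequences.
seqkit seq --name example.fa
```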
If your FASTA is formatted such that the sequence name contains an ID followed by a space and more information, then you can extract just those IDs using --only-id:
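A sketch of this, again assuming a hypothetical example.fa whose headers have an ID followed by a description, combines --name with --only-id:

```shell
# Create a tiny FASTA file for illustration (hypothetical records).
cat > example.fa <<'EOF'
>seq1 first example record
ACGTACGTAC
>seq2 second example record
GGGTTTAAAC
EOF

# Print only the ID part of each header (the text before the first space).
seqkit seq --name --only-id example.fa
```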
If you are interested only in sequences of at least a certain length, e.g. 300 bp or longer, use --min-len to filter out shorter sequences:
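The following sketch assumes a hypothetical file lengths.fa containing one 4 bp record and one 400 bp record; with a threshold of 300, only the longer record should be kept:

```shell
# Build a 400-base sequence of A's for the long record (illustrative data).
long_seq=$(head -c 400 /dev/zero | tr '\0' 'A')

# Write a short and a long record to a hypothetical FASTA file.
printf '>short_seq\nACGT\n>long_seq\n%s\n' "$long_seq" > lengths.fa

# Keep only sequences that are at least 300 bases long.
seqkit seq --min-len 300 lengths.fa
```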
You can also filter out long sequences with --max-len. For FASTQ files, you can filter reads by average quality: --min-qual keeps reads whose average quality is at or above a threshold, and --max-qual keeps reads at or below one.
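As an illustration, the sketch below assumes a hypothetical FASTQ file reads.fq with Phred+33 quality encoding, containing one high-quality and one low-quality read. The thresholds of 300 and 20 are arbitrary example values:

```shell
# Create a tiny FASTQ file for illustration (hypothetical reads).
# 'I' encodes a high quality score and '#' a very low one in Phred+33.
cat > reads.fq <<'EOF'
@read_high
ACGTACGTAC
+
IIIIIIIIII
@read_low
ACGTACGTAC
+
##########
EOF

# Keep only reads no longer than 300 bases (both reads here qualify).
seqkit seq --max-len 300 reads.fq

# Keep only reads with an average quality of at least 20.
seqkit seq --min-qual 20 reads.fq
```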
How would you convert the RNA sequences in hairpins.fa to DNA using SeqKit? Use the manual as a reference.