Example input:
==================================================
>123456789.1
AGCT
>123456789.2
AGCT
>222221122.1
AGCT
==================================================
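The user wants to replace everything before the dot (.) in each header with a new value, and keep the dot and everything after it appended to the replacement. The key-value pairs (old ID, then new ID) are given below. In the actual file passed to seqkit with -k (ids.txt in the command further down), the two columns must be separated by a tab: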
==================================================
123456789 abcde
222221122 ghijk
==================================================
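To reproduce the example, the two input files can be created as sketched below. The file names test.fa and ids.txt are taken from the seqkit command in this post. Using printf is only one convenient way to make sure a real tab ends up between key and value.

```shell
# Assumed file names: test.fa and ids.txt, as used in the seqkit command.
printf '>123456789.1\nAGCT\n>123456789.2\nAGCT\n>222221122.1\nAGCT\n' > test.fa

# \t writes a literal tab between the key (old ID) and the value (new ID).
printf '123456789\tabcde\n222221122\tghijk\n' > ids.txt

# cut splits on tabs by default, so this prints the values only
# if the file really is tab-delimited.
cut -f2 ids.txt
```

If cut prints the whole line instead of just the second column, the file contains spaces rather than a tab.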
Expected output:
==================================================
>abcde.1
AGCT
>abcde.2
AGCT
>ghijk.1
AGCT
===================================================
Code:
===================================================
$ seqkit replace --quiet -p '([0-9]+)(\.[0-9])' -r '{kv}${2}' -k ids.txt test.fa
>abcde.1
AGCT
>abcde.2
AGCT
>ghijk.1
AGCT
=========================================================
Explanation:
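- seqkit replace, by default, edits the sequence headers (names); sequences are only changed when -s is given
- --quiet suppresses extra informational messages on screen
- -p is the option for the search pattern (a regular expression)
- ([0-9]+)(\.[0-9]) - Two capture groups (in parentheses). The first group ([0-9]+) matches one or more digits (the first part of the header, before the dot). The second group (\.[0-9]) matches one literal dot followed by a single digit. Use (\.[0-9]+) if the part after the dot can have more than one digit
- -r is the replacement string and -k is the tab-delimited key-value file (ids.txt here). {kv} in the replacement is substituted with the value looked up in that file
- By default, the first capture group is used as the lookup key, so {kv} becomes the value listed for the number before the dot
- In addition, ${2} re-inserts the second capture group (the dot and the digit) after the replaced value. When {kv} is used, write ${2} rather than $2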
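
For readers who want to see the lookup-and-append logic spelled out, here is a rough awk sketch of the same idea. It is not how seqkit works internally, only an illustration under these assumptions: the key is everything between ">" and the first dot, headers whose key is missing from ids.txt are left unchanged, and the output file name renamed.fa is made up for this example.

```shell
# Recreate the example inputs (file names taken from the seqkit command).
printf '>123456789.1\nAGCT\n>123456789.2\nAGCT\n>222221122.1\nAGCT\n' > test.fa
printf '123456789\tabcde\n222221122\tghijk\n' > ids.txt

# First file (NR == FNR): load key -> value pairs from ids.txt.
# Second file: on header lines, split the ID at the first dot, look up the
# part before the dot, and re-attach the dot plus everything after it.
awk -F'\t' '
    NR == FNR { map[$1] = $2; next }
    /^>/ {
        id     = substr($0, 2)                     # header without ">"
        dot    = index(id, ".")                    # 0 if there is no dot
        key    = dot ? substr(id, 1, dot - 1) : id
        suffix = dot ? substr(id, dot) : ""
        if (key in map) { print ">" map[key] suffix; next }
    }
    { print }                                      # sequences and unmatched headers
' ids.txt test.fa > renamed.fa

cat renamed.fa
```

For the example input this prints the same six lines as the seqkit command. The seqkit one-liner is still the simpler choice. The sketch only makes explicit which part of the header is the key and which part is carried over.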