bash - Using awk with a condition to catch lines where a position in one file falls within a range given by another file
I want to identify the score for each gene. The condition is that the position of a score (column $3 of the score list) must fall within the range given by columns $3 and $4 of the gene list.
Gene list:
chr1 tas1r1 6615000 6615100
chr1 tas1r1 6615130 6615200
chr5 tcerg1 145858055 145858216
Score list:
rs79923433 chr1 6615060 0.327009537545002 0.177578086220885
rs4908925 chr1 6615107 0.492182375024342 0.278821401692196
rs114220820 chr1 6615172 0.24581165286421 0.129806066087895
rs925345 chr5 145858100 1.22569136462918 0.744498627741366
Desired output:
chr1 tas1r1 6615000 6615100 0.327009537545002
chr1 tas1r1 6615130 6615200 0.24581165286421
chr5 tcerg1 145858055 145858216 1.22569136462918
With awk:
awk '
    NR == FNR { score[$3] = $4; next }    # score.list: index score by position
    {
        for (key in score)                # gene.list: test every position
            if ($3 <= key + 0 && key + 0 <= $4)
                print $0, score[key]
    }
' score.list gene.list
chr1 tas1r1 6615000 6615100 0.327009537545002
chr1 tas1r1 6615130 6615200 0.24581165286421
chr5 tcerg1 145858055 145858216 1.22569136462918
It's not super efficient, since it has to iterate over all the scores for each line of genes, but it's pretty straightforward.
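The command above indexes scores by position only, so a position on one chromosome could match a gene on another. A possible variant, sketched below, groups the scores by chromosome so that each gene is compared only with scores on its own chromosome, which also shortens the inner loop. It assumes the chromosome must match (column $2 of the score list against column $1 of the gene list), which the sample data suggests but the question does not state. The `workdir` directory and the array names `n`, `pos` and `score` are illustrative choices, not part of the original answer.

```shell
#!/bin/sh
# Sample inputs copied from the question, written to a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/gene.list" <<'EOF'
chr1 tas1r1 6615000 6615100
chr1 tas1r1 6615130 6615200
chr5 tcerg1 145858055 145858216
EOF
cat > "$workdir/score.list" <<'EOF'
rs79923433 chr1 6615060 0.327009537545002 0.177578086220885
rs4908925 chr1 6615107 0.492182375024342 0.278821401692196
rs114220820 chr1 6615172 0.24581165286421 0.129806066087895
rs925345 chr5 145858100 1.22569136462918 0.744498627741366
EOF

result=$(awk '
    # First file (score.list): store positions and scores per chromosome.
    NR == FNR {
        n[$2]++
        pos[$2, n[$2]] = $3
        score[$2, n[$2]] = $4
        next
    }
    # Second file (gene.list): scan only the scores on the same chromosome.
    {
        for (i = 1; i <= n[$1]; i++) {
            p = pos[$1, i] + 0          # + 0 forces a numeric comparison
            if ($3 <= p && p <= $4)
                print $0, score[$1, i]
        }
    }
' "$workdir/score.list" "$workdir/gene.list")

printf '%s\n' "$result"
rm -rf "$workdir"
```

On the sample data this should print the same three lines as the desired output. It is still a linear scan within each chromosome, so for very large score lists a sorted or interval-based approach would likely be faster.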