unix - Mass grep but sorting results based on first input file -


I have a file with one column of identifiers (call it fileA), including duplicates, that looks like this:

 GO:0005515
 GO:0005737
 GO:0005875
 GO:0005884
 GO:0005200
 GO:0005524
 GO:0005737
 ...

I have a second file (call it fileB) with two columns, the identifier in the first column and the associated text in the second, that looks like this:

 GO:0000001 mitochondrion inheritance
 GO:0000002 mitochondrial genome maintenance
 GO:0000003 reproduction
 GO:0000006 high-affinity zinc uptake transmembrane transporter activity
 GO:0000007 low-affinity zinc ion transmembrane transporter activity
 GO:0000009 alpha-1,6-mannosyltransferase activity
 GO:0000010 trans-hexaprenyltranstransferase activity
 GO:0000011 vacuole inheritance
 ...

I want to grep for the identifiers in fileA to get the lines in fileB with the matching identifier and description, and write them to another file, fileC, in the order of fileA, including the duplicates (fileB itself contains no duplicates).

I've tried a few different things:

 fgrep -f fileA fileB > fileC

This doesn't work because fileC comes out in the order of fileB, not fileA.

 for name in `cat fileA`; do grep "$name" fileB >> fileC; done

This should work, but the output is:

 GO:0005515 protein binding
 GO:0005737 cytoplasm
 GO:0005737 cytoplasm
 GO:0005737 cytoplasm
 GO:0005737 cytoplasm
 GO:0005737 cytoplasm
 GO:0016301 kinase activity
 GO:0005525 GTP binding
 GO:0005737 cytoplasm
 GO:0016021 integral to membrane
 ...

These are not in the order of fileA (except for the first two).

Any thoughts?

Give this awk one-liner a try; it keeps the output in the order of fileA:

  awk 'NR==FNR{b[$1]=$0;next} $1 in b{print b[$1]}' fileB fileA

If the two columns in your fileB are separated by a tab, add -F'\t' to the awk command:

  awk -F'\t' 'NR==FNR......'

A quick test:

  kent$ head fa fb
  ==> fa <==
  GO:0005515
  GO:0005737
  GO:0005875
  GO:0005884
  GO:0005200
  GO:0005524
  GO:0005737

  ==> fb <==
  GO:0005875 #3 foo
  GO:0005515 #1 foo
  GO:0005737 #2 foo
  GO:0005884 #4 foo

  kent$ awk 'NR==FNR{b[$1]=$0;next} $1 in b{print b[$1]}' fb fa
  GO:0005515 #1 foo
  GO:0005737 #2 foo
  GO:0005875 #3 foo
  GO:0005884 #4 foo
  GO:0005737 #2 foo

You can see that the output keeps the duplicates and follows the order of the identifiers in fileA (fa).
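For readers new to the NR==FNR idiom, here is the same one-liner spread over several lines with comments, as a minimal sketch. The scratch directory and the sample rows are made up for illustration (the descriptions are taken from the question's output), and the result is written to fileC as the question asks. It assumes fileB is not empty, since otherwise NR==FNR would also be true while reading fileA.

```shell
# Hypothetical sample data in a scratch directory; file names mirror the question.
tmp=$(mktemp -d)
printf '%s\n' 'GO:0005515' 'GO:0005737' 'GO:0016301' 'GO:0005737' 'GO:0005200' > "$tmp/fileA"
printf '%s\n' 'GO:0016301 kinase activity' 'GO:0005737 cytoplasm' 'GO:0005515 protein binding' > "$tmp/fileB"

awk '
    # First file on the command line (fileB): NR equals FNR only while reading it.
    # Store each whole line, keyed by the identifier in column 1, then skip ahead.
    NR == FNR { b[$1] = $0; next }

    # Second file (fileA): for every identifier that was seen in fileB,
    # print the stored line. Rows come out in fileA order, once per occurrence;
    # identifiers missing from fileB (GO:0005200 here) are silently skipped.
    $1 in b { print b[$1] }
' "$tmp/fileB" "$tmp/fileA" > "$tmp/fileC"

cat "$tmp/fileC"
```

Because the lookup table is built from fileB and the printing is driven by fileA, the order and the duplicates of fileA carry over to fileC, and each file is read only once.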

