Hi guys, I have two files in the format below.
FILE-1:
>contig00001 length=11003 numreads=3312
>contig00002 length=110423 numreads=3323
>contig00003 length=11023 numreads=33233
FILE-2:
>contig00001 length=15918 numreads=6266
>contig00002 length=106210 numreads=27839
>contig00003 length=106213 numreads=26839
>contig00004 length=1023433 numreads=23465
The program must compare each sequence in FILE-1 with the sequences in FILE-2.
That is, take the first sequence in FILE-1 (only the sequence, not its name) and compare it with every sequence in FILE-2. If it matches any sequence in FILE-2, print both names: the name of the sequence being compared and the names of the sequences it matched. Then do the same for the next sequence in FILE-1.
One sequence can match many sequences.
A sequence may also be part of a longer sequence in FILE-2.
Return all the matched sequences.
In the sample sequences here, the first sequence (contig00001) of FILE-1 exactly matches contig00003 and is also present in contig00004 as a subpart, so the output is
contig00001--------------------- contig00003, contig00004
contig00002 and contig00003 of FILE-1 do not match any sequence in FILE-2, so
contig00002-------------- not matched
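To make the comparison concrete, here is a minimal Python sketch of what is described above. It is only an illustration under some assumptions: both files are ordinary FASTA (a `>` header line followed by one or more sequence lines), matching means exact, case-insensitive substring matching (no reverse complement, no mismatches), and both files fit in memory. The function names `parse_fasta`, `find_matches` and `format_report` are made up for this example.

```python
def parse_fasta(lines):
    """Return a list of (name, sequence) pairs from an iterable of FASTA lines.

    `lines` can be an open file handle or a plain list of strings.
    """
    records = []
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header, e.g. "contig00001".
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            # Sequence lines may be wrapped, so collect them and join later.
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def find_matches(queries, targets):
    """For each query, list the target names whose sequence contains it.

    An exact match is just the case where the query is the whole target.
    Assumption: only the sequences are compared, never the names.
    """
    results = []
    for q_name, q_seq in queries:
        hits = [t_name for t_name, t_seq in targets if q_seq and q_seq in t_seq]
        results.append((q_name, hits))
    return results


def format_report(results):
    """Build one output line per query, in the style of the example above."""
    report = []
    for name, hits in results:
        right = ", ".join(hits) if hits else "not matched"
        report.append(f"{name}--------------------- {right}")
    return report
```

To run it on real files, open FILE-1 and FILE-2, pass each handle to `parse_fasta`, call `find_matches(file1_records, file2_records)`, and print each line returned by `format_report`. Every FILE-1 sequence is checked against every FILE-2 sequence, so this can get slow for very large files.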
Guys, I asked a similar question before, but I did not put it clearly there, so to avoid confusion I deleted that question and am posting this new one.