A practical app for Molecular Biology


I'm working on a little VB application that takes a short DNA sequence made up of the letters A, T, C and G, converts it into the complementary sequence, and then prints that sequence back to front.

In molecular biology we define the orientation of a cDNA (sense) strand as 5'->3'. The antisense strand runs in the 3'->5' direction.

Problem scenario:
1.  Input a short sequence of letters A,T,C,G into an array.  If the sequence is e.g. "aactgg", its complementary strand would be "ttgacc".

2.  Once the strand has been translated to its complementary strand (i.e. "ttgacc" in the 3'->5' direction), the orientation is reversed to give "ccagtt" in the 5'->3' direction.

I've written a working solution that converts the letters to their complement, one at a time:

    If string1 = "a" Then
        string2 = "t"
    ElseIf string1 = "t" Then
        string2 = "a"
    ElseIf string1 = "c" Then
        string2 = "g"
    ElseIf string1 = "g" Then
        string2 = "c"
    Else
        string2 = "*"
    End If

This If/Else statement doesn't go over each element of the array (string).  I'm working on various nested loop combinations such as:

For x = 0 To x = strLength
    If string1 = "a" Then
            string2 = "t"
    ElseIf string1 = "t" Then
            string2 = "a"
    ElseIf string1 = "c" Then
            string2 = "g"
    ElseIf string1 = "g" Then
            string2 = "c"
    Else: string2 = "*"
    End If    
Next x
where strLength is the length of the input string and/or the number of elements in the array. Which leads me to ask: is it necessary to put the letters into an array at all?

Any suggestions are much appreciated.

Snerken
Brisbane
Australia
 
igfp commented:
This will never work. Let's say you've got string1 = "ATCGATGT".
You want a string2 that goes like "TAGCTACA", last time I checked biology! :p

Here's what you do: make a loop that reads each char of the string and appends its complement to string2. For this I'll use a function so that you can use it as many times as you want.

CharsInString% = Len(String1)
For i% = 1 To CharsInString%
    Char$ = Mid(String1, i%, 1)
    Char$ = DNA(Char$)          ' complement of the current base
    String2 = String2 & Char$   ' append the converted char, not the function name
Next i%

Notice that this won't work with very big strings! Also, String1 must contain only valid uppercase letters and String2 should be empty before the loop starts.

Now the function DNA, which you may include in a module or as a Public function.

Function DNA(DNAChar$) As String
    ' Reject anything that is not one of the four bases
    If DNAChar$ <> "A" And DNAChar$ <> "G" And _
       DNAChar$ <> "C" And DNAChar$ <> "T" Then
        ' I hope you aren't using RNA with that U
        GoTo BadChar
    End If

    Select Case DNAChar$
        Case "A"
            DNA = "T"
        Case "G"
            DNA = "C"
        Case "C"
            DNA = "G"
        Case "T"
            DNA = "A"
    End Select
    Exit Function   ' skip the error message on success

BadChar:
    MsgBox "Your string has illegal chars"

End Function

Tell me how it goes!
 
supunr commented:
Private Function ComplementaryDNA(ByVal DNA As String) As String
     Dim i As Long
     Dim charAt As String

     ' ByVal keeps the caller's string untouched when we uppercase it
     DNA = UCase(DNA)

     ComplementaryDNA = ""
     ' Loop over the input strand, not the (still empty) result
     For i = 1 To Len(DNA)
          charAt = Mid(DNA, i, 1)
          If (charAt = "A") Then
               ComplementaryDNA = ComplementaryDNA & "T"
          ElseIf (charAt = "T") Then
               ComplementaryDNA = ComplementaryDNA & "A"
          ElseIf (charAt = "G") Then
               ComplementaryDNA = ComplementaryDNA & "C"
          ElseIf (charAt = "C") Then
               ComplementaryDNA = ComplementaryDNA & "G"
          End If
     Next i
     ' Reverse so the result reads 5'->3'
     ComplementaryDNA = ReverseString(ComplementaryDNA)
End Function

Private Function ReverseString(str As String) As String
     Dim i As Long

     ReverseString = ""
     ' Walk the input from the last character to the first
     For i = Len(str) To 1 Step -1
          ReverseString = ReverseString & Mid(str, i, 1)
     Next i
End Function

Somewhere in your code...

MsgBox ComplementaryDNA("AAGCTTCGCT")

Good Luck!
 
jgv commented:
snerken,

Although you could use an array to store the DNA strand, it is not necessary. All of the examples use the "Len" function, which is better suited to this type of situation. If you did use an array, you would set up a loop using LBound (lower bound) and UBound (upper bound):
For x = LBound(DNAArray) To UBound(DNAArray)
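
For what it's worth, here is a minimal sketch of how that array loop might look in full. It assumes the array holds one base per element and has already been dimensioned; the function name ComplementArray is hypothetical and is not part of any of the answers here.

```vb
' Hypothetical helper: returns a new array holding the complement of
' each one-letter element. Assumes DNAArray is already dimensioned.
Private Function ComplementArray(DNAArray() As String) As String()
    Dim result() As String
    Dim x As Long

    ' Give the result the same bounds as the input
    ReDim result(LBound(DNAArray) To UBound(DNAArray))

    For x = LBound(DNAArray) To UBound(DNAArray)
        Select Case UCase$(DNAArray(x))
            Case "A": result(x) = "T"
            Case "T": result(x) = "A"
            Case "C": result(x) = "G"
            Case "G": result(x) = "C"
            Case Else: result(x) = "*"   ' not a recognised base
        End Select
    Next x

    ComplementArray = result
End Function
```

With an array holding "a", "a", "c", "t", "g", "g", the call Join(ComplementArray(bases), "") gives "TTGACC". The extra step of filling the array in the first place is the main reason a plain string is simpler here.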

As you can see, there are many ways to  accomplish what you are trying to do. Both igfp and supunr have given working solutions but I figured I'd throw in a couple more. I documented the code as best I could so that you could understand what is happening. The VB help files will give you the rest of the details on the functions being used. You can test these examples by copying and pasting the code into a new VB project; don't forget to add a command button.


Private Sub Command1_Click()
    Dim strDNACompliment As String
   
    'Pass the DNA strand to the ComplimentaryDNA function; it will
    'return the converted complementary 3->5 strand. Add a "2" to the
    'end of "ComplimentaryDNA" to call the other function; you'll get
    'the same results
    strDNACompliment = ComplimentaryDNA("TAgGCTa")
   
    '"StrReverse" is a built-in VB function that reverses a string
    MsgBox "3-5 Complement: " & strDNACompliment & vbCr & _
           "5-3 Complement: " & StrReverse(strDNACompliment)
           
End Sub

Private Function ComplimentaryDNA(ByVal Strand As String) As String
    Dim intCnt              As Integer
    Dim intMapPosition      As Integer
    Dim strDNAMap1          As String
    Dim strDNAMap2          As String
    Dim strDNAElement       As String
    Dim str3to5Compliment   As String
   
    'The "strDNAMap1" and "strDNAMap2" variables hold the DNA complement map
    '   Map1 is used to locate each element from the DNA strand
    '   Map2 is used to get the complementary element for each element from the DNA strand
    '
    'Example: DNA strand is "TGACT"
    '
    'The position of each element of the strand in Map 1 is:
    '   T=2, G=4, A=1, C=3, T=2
    '
    '   If we use these positions to extract elements from Map 2 we get:
    '   2=A, 4=C, 1=T, 3=G, 2=A
    '
    'So, "TGACT" = "ACTGA"
    strDNAMap1 = "ATCG"
    strDNAMap2 = "TAGC"
       
    'Loop through each element in the strand
    'The "Len" function returns the number of characters in a string
    For intCnt = 1 To Len(Strand)
        'Extract the current element from the DNA strand
        'The "Mid" function extracts a portion of a string; in this case we only want 1 character
        ' - Mid(TheString, StartPosition, Length)
        strDNAElement = Mid(Strand, intCnt, 1)
        'Get the current element's position from Map 1
        'The "Instr" function returns the start position of a string within another string
        ' - Instr(StartPosition, StringToSearch, StringToFind, HowToCompare)
        'We are doing a case insensitive search so that "a" will equal "A".
        intMapPosition = InStr(1, strDNAMap1, strDNAElement, vbTextCompare)
        'If InStr returns 0, then the element was not found in the Map. You would probably
        'want to prompt the user at this point but for now we'll substitute an asterisk.
        If intMapPosition = 0 Then
            'Element was not found in the map...
            str3to5Compliment = str3to5Compliment & "*"
        Else
            'Extract the complementary element from the appropriate position in Map 2 and add it
            'to the complement string variable; again, we use the "Mid" function to accomplish this
            str3to5Compliment = str3to5Compliment & Mid(strDNAMap2, intMapPosition, 1)
        End If
    Next
   
    'Return the complementary sequence
    ComplimentaryDNA = str3to5Compliment
       
End Function

Private Function ComplimentaryDNA2(ByVal Strand As String) As String
    Dim intCnt              As Integer
    Dim strDNAElement       As String
    Dim strCompliment       As String
    Dim str3to5Compliment   As String
   
    'This example uses "Select Case" to figure out each element's
    'complement. It is similar to If...Then statements.
   
    'Loop through each element in the strand
    For intCnt = 1 To Len(Strand)
        'Extract 1 character from the strand; note that the character
        'being extracted is converted to uppercase ("a" = "A")
        strDNAElement = UCase(Mid(Strand, intCnt, 1))
        'Select the Case where the extracted element is equal
        Select Case strDNAElement
            Case "A": strCompliment = "T"
            Case "T": strCompliment = "A"
            Case "C": strCompliment = "G"
            Case "G": strCompliment = "C"
            Case Else
                'If we get here then we did not find a match for the element (it's not A, T, C or G).
                'You would probably want to prompt the user here but for now we'll substitute an
                'asterisk
                strCompliment = "*"
        End Select
        'Add the new element to the complementary DNA variable
        str3to5Compliment = str3to5Compliment & strCompliment
    Next
   
    'Return the complementary sequence
    ComplimentaryDNA2 = str3to5Compliment
       
End Function
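
As a further sketch, the complement and the reversal can be folded into a single pass by walking the strand from its last character to its first. The name ReverseComplement is made up for illustration; the "*" placeholder follows the convention used above, and a Long counter is assumed so that strands longer than 32,767 characters don't overflow the loop variable.

```vb
' Hypothetical one-pass version: builds the 5'->3' reverse complement
' by reading the input strand backwards.
Private Function ReverseComplement(ByVal Strand As String) As String
    Dim i As Long
    Dim strResult As String

    'Start at the last character and work back to the first
    For i = Len(Strand) To 1 Step -1
        Select Case UCase$(Mid$(Strand, i, 1))
            Case "A": strResult = strResult & "T"
            Case "T": strResult = strResult & "A"
            Case "C": strResult = strResult & "G"
            Case "G": strResult = strResult & "C"
            Case Else: strResult = strResult & "*"   'not A, T, C or G
        End Select
    Next i

    ReverseComplement = strResult
End Function
```

Using the sequence from the original question, ReverseComplement("aactgg") returns "CCAGTT" (in uppercase, because of the UCase$ call).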
 
igfp commented:
I agree with jgv. Truth is, I guessed you were looking for easy code, so I gave you a simple way to do it. If you want to add the "*" char, you just add it to the Case and the If. It's easy and effective to build the result in a string because you can send it straight to a textbox etc., so all of these are working solutions. I hope you are not trying to make a very big string or the program will overflow. But it will work with strings of something like 200 chars or more.
 
jgv commented:
igfp, I'm curious as to why you think you would get an overflow with a large string. A variable-length string can contain up to about 2 billion (2^31) characters (according to the help files). With your example, the string could be as large as 32,767 characters (the Integer variable "CharsInString%" is the limitation).
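
To illustrate the point about the counter type, here is a small sketch that walks a string longer than an Integer can count by declaring the counters as Long. The procedure name and the 100,000-character test string are assumptions made up for this example.

```vb
' Hypothetical demo: a Long counter can index strings well past the
' 32,767 limit of an Integer.
Private Sub LongCounterDemo()
    Dim s As String
    Dim i As Long          'Long holds values up to 2,147,483,647
    Dim countA As Long

    s = String$(100000, "a")   'more characters than an Integer can count

    For i = 1 To Len(s)
        If Mid$(s, i, 1) = "a" Then countA = countA + 1
    Next i

    MsgBox countA   'shows 100000
End Sub
```

Changing either counter to Integer should raise an overflow error once the value passes 32,767, which is the limitation described above.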
 
igfp commented:
Well, because I get an overflow! It's weird, people tell me! Actually, when I try to encrypt a rich text box I get an overflow. I have hundreds of lines of formatted text and I can't decrypt it! When I encrypt I go char by char, but I can't do that when I'm decrypting (I don't know why!), so I'm using a string! And it overflows! That's why. In C I had overflows on Linux systems with big strings, and I'm experiencing the same in VB. But everyone tells me it should not happen! :(
 
jgv commented:
As I mentioned in the last post, if you exceed a numeric data type's storage limit you get an overflow. Perhaps you are using too small a data type in your encrypt/decrypt program and that is where the problem is. To show you this, try this simple test:

Dim s As String
Dim i As Integer

s = String(32768, "a")

MsgBox Len(s)

i = Len(s) '<---Overflow error here

The only other thing I can think of is that you have a memory leak somewhere. If you keep encountering this, you should perhaps ask a question of your own and include some code. I'm sure that if it's a coding issue someone will be able to help you track it down.
 
snerken (author) commented:
Thanks folks, I'm overwhelmed by the response.  I'll have to print this out and sit and eyeball the various suggestions.  Once I've got a working model I'll post it here.

Snerken
 