A practical app for Molecular Biology

Posted on 2003-03-09
Medium Priority
Last Modified: 2010-05-01

I'm working on a little VB application that will take a short DNA sequence of the letters A,T,C and G and convert them into the complementary sequence and then print this sequence back to front.  

In molecular biology we define the orientation of a cDNA (sense) strand as 5'->3'.  The antisense strand runs in the 3'->5' direction.

Problem scenario:
1.  Input a short sequence of the letters A, T, C and G into an array.  If the sequence is, e.g., "aactgg", its complementary strand would be "ttgacc".

2.  Once the strand has been translated to its complementary strand (i.e. "ttgacc" in a 3'->5' direction), the orientation is reversed to "ccagtt" so that it reads 5'->3'.

I've written a working solution that converts the  letters to their complement, one at a time:

    If string1 = "a" Then
        string2 = "t"
    ElseIf string1 = "t" Then
        string2 = "a"
    ElseIf string1 = "c" Then
        string2 = "g"
    ElseIf string1 = "g" Then
        string2 = "c"
    Else
        string2 = "*"
    End If

This If/Else statement doesn't go over each element of the array (string).  I'm working on various nested loop combinations such as:

For x = 0 To x = strLength
    If string1 = "a" Then
            string2 = "t"
    ElseIf string1 = "t" Then
            string2 = "a"
    ElseIf string1 = "c" Then
            string2 = "g"
    ElseIf string1 = "g" Then
            string2 = "c"
    Else: string2 = "*"
    End If    
Next x
where strLength is the length of the input string and/or the number of elements in the array.  Which leads me to ask: is it necessary to put the letters into an array at all?

Any suggestions are much appreciated.

Question by:snerken

Accepted Solution

igfp earned 150 total points
ID: 8097386
This will never work. Let's say you've got string1 = "ATCGATGT";
you want a string2 that goes like "TAGCTACA", last time I checked biology! :p

Here's what you do: make a loop that reads each char of the string and appends its complement to string2. For this I'll use a function so that you can use it as many times as you want.

CharsInString% = Len(String1)
For i% = 1 To CharsInString%
    Char$ = Mid(String1, i%, 1)
    ' Replace the char with its complement and append it to the result
    Char$ = DNA(Char$)
    String2 = String2 & Char$
Next i%

Notice that this won't work with very big strings! Also, String1 must contain only valid letters and String2 should start out empty.

Now the function DNA, which you may include in a module or as a Public function.

Function DNA(DNAChar$) As String
    ' Reject anything that is not one of the four DNA letters
    If DNAChar$ <> "A" And DNAChar$ <> "G" And _
       DNAChar$ <> "C" And DNAChar$ <> "T" Then
        ' I hope you aren't using RNA with that U
        GoTo IllegalChar
    End If

    Select Case DNAChar$
        Case "A"
            DNA = "T"
        Case "G"
            DNA = "C"
        Case "C"
            DNA = "G"
        Case "T"
            DNA = "A"
    End Select
    Exit Function

IllegalChar:
    MsgBox "Your string has illegal chars"
End Function

Tell me how it goes!
LVL 11

Expert Comment

ID: 8097481
Private Function ComplementaryDNA(DNA As String) As String
     Dim i As Long
     Dim charAt As String

     DNA = UCase(DNA)

     ComplementaryDNA = ""
     ' Walk the input strand, not the (still empty) result
     For i = 1 To Len(DNA)
          charAt = Mid(DNA, i, 1)
          If (charAt = "A") Then
               ComplementaryDNA = ComplementaryDNA & "T"
          ElseIf (charAt = "T") Then
               ComplementaryDNA = ComplementaryDNA & "A"
          ElseIf (charAt = "G") Then
               ComplementaryDNA = ComplementaryDNA & "C"
          ElseIf (charAt = "C") Then
               ComplementaryDNA = ComplementaryDNA & "G"
          End If
     Next i
     ComplementaryDNA = ReverseString(ComplementaryDNA)
End Function

Private Function ReverseString(str As String) As String
     Dim i As Long
     ReverseString = ""
     ' Append the characters from last to first
     For i = Len(str) To 1 Step -1
          ReverseString = ReverseString & Mid(str, i, 1)
     Next i
End Function

Somewhere in your code...

MsgBox ComplementaryDNA("AAGCTTCGCT")

Good Luck!
LVL 12

Expert Comment

ID: 8098258

Although you could use an array to store the DNA strand it is not necessary. All of the examples use the "Len" function which is more suited to this type of situation.  If you did use an array, you would setup a loop using LBound (lower bound) and UBound (upper bound):
For x = LBound(DNAArray) To UBound(DNAArray)
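
As a rough sketch of that array approach (the names ComplementFromArray and DemoArrayComplement are made up for this example, and it assumes the strand is not empty and has been copied into a String array with one letter per element):

```vb
' Hypothetical helper: complements each element of a String array
' holding one base per element and returns the result as a string.
Private Function ComplementFromArray(DNAArray() As String) As String
    Dim x As Long
    Dim strResult As String

    For x = LBound(DNAArray) To UBound(DNAArray)
        Select Case UCase$(DNAArray(x))
            Case "A": strResult = strResult & "T"
            Case "T": strResult = strResult & "A"
            Case "C": strResult = strResult & "G"
            Case "G": strResult = strResult & "C"
            Case Else: strResult = strResult & "*"   ' unknown letter
        End Select
    Next x

    ComplementFromArray = strResult
End Function

' Hypothetical demo: copies a strand into an array, one letter per element
Private Sub DemoArrayComplement()
    Dim bases() As String
    Dim strand As String
    Dim i As Long

    strand = "aactgg"
    ReDim bases(1 To Len(strand))   ' assumes strand is not empty
    For i = 1 To Len(strand)
        bases(i) = Mid$(strand, i, 1)
    Next i

    MsgBox ComplementFromArray(bases)
End Sub
```

With the "aactgg" input from the question, the message box should show "TTGACC". The extra copy loop is the main cost of the array route, which is why looping over the string with Len and Mid is usually simpler here.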

As you can see, there are many ways to  accomplish what you are trying to do. Both igfp and supunr have given working solutions but I figured I'd throw in a couple more. I documented the code as best I could so that you could understand what is happening. The VB help files will give you the rest of the details on the functions being used. You can test these examples by copying and pasting the code into a new VB project; don't forget to add a command button.

Private Sub Command1_Click()
    Dim strDNACompliment As String
    'Pass the DNA strand to the ComplimentaryDNA function; it will
    'return the converted complimentary 3->5 strand. Add a "2" to the
    'end of "ComplimentaryDNA" to call the other function; you'll get
    'the same results
    strDNACompliment = ComplimentaryDNA("TAgGCTa")
    '"StrReverse" is a built in VB function that will reverse a string...
    MsgBox "3-5 Compliment: " & strDNACompliment & vbCr & _
           "5-3 Compliment: " & StrReverse(strDNACompliment)
End Sub

Private Function ComplimentaryDNA(ByVal Strand As String) As String
    Dim intCnt              As Integer
    Dim intMapPosition      As Integer
    Dim strDNAMap1          As String
    Dim strDNAMap2          As String
    Dim strDNAElement       As String
    Dim str3to5Compliment   As String
    'The "strDNAMap1" and "strDNAMap2" variables hold the DNA compliment map
    '   Map1 is used to locate each element from the DNA strand
    '   Map2 is used to get the complimentary element for each element from the DNA strand
    'Example: DNA strand is "TGACT"
    'The position of each element of the strand in Map 1 is:
    '   T=2, G=4, A=1, C=3, T=2
    '   If we use these positions to extract elements from Map 2 we get:
    '   2=A, 4=C, 1=T, 3=G, 2=A
    'So, "TGACT" = "ACTGA"
    strDNAMap1 = "ATCG"
    strDNAMap2 = "TAGC"
    'Loop through each element in the strand
    'The "Len" function returns the number of characters in a string
    For intCnt = 1 To Len(Strand)
        'Extract the current element from the DNA strand
        'The "Mid" function extracts a portion of a string; in this case we only want 1 character
        ' - Mid(TheString, StartPosition, Length)
        strDNAElement = Mid(Strand, intCnt, 1)
        'Get the current elements position from Map 1
        'The "Instr" function returns the start position of a string within another string
        ' - Instr(StartPosition, StringToSearch, StringToFind, HowToCompare)
        'We are doing a case insensitive search so that "a" will equal "A".
        intMapPosition = InStr(1, strDNAMap1, strDNAElement, vbTextCompare)
        'If InStr returns 0, then the element was not found in the Map. You would probably
        'want to prompt the user at this point but for now we'll substitute an asterisk.
        If intMapPosition = 0 Then
            'Element was not found in the map...
            str3to5Compliment = str3to5Compliment & "*"
        Else
            'Extract the complimentary element from the appropriate position in Map 2 and add it
            'to the Complimentary string variable; again, we use the "Mid" function to accomplish this
            str3to5Compliment = str3to5Compliment & Mid(strDNAMap2, intMapPosition, 1)
        End If
    Next intCnt
    'Return the complimentary sequence
    ComplimentaryDNA = str3to5Compliment
End Function

Private Function ComplimentaryDNA2(ByVal Strand As String) As String
    Dim intCnt              As Integer
    Dim strDNAElement       As String
    Dim strCompliment       As String
    Dim str3to5Compliment   As String
    'This example uses "Select Case" to figure out each elements
    'compliment. It is similar to If Then statements.
    'Loop through each element in the strand
    For intCnt = 1 To Len(Strand)
        'Extract 1 character from the strand; note that the character
        'being extracted is converted to uppercase ("a" = "A")
        strDNAElement = UCase(Mid(Strand, intCnt, 1))
        'Select the Case where the extracted element is equal
        Select Case strDNAElement
            Case "A": strCompliment = "T"
            Case "T": strCompliment = "A"
            Case "C": strCompliment = "G"
            Case "G": strCompliment = "C"
            Case Else
                'If we get here then we did not find a match to the element (it's not A, T, C or G)
                'You would probably want to prompt the user here but for now we'll substitute an
                'asterisk.
                strCompliment = "*"
        End Select
        'Add the new element to the complimentary dna variable
        str3to5Compliment = str3to5Compliment & strCompliment
    Next intCnt
    'Return the complimentary sequence
    ComplimentaryDNA2 = str3to5Compliment
End Function
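
If you would rather do both steps from the original question in one call, something along these lines might work. This is only a sketch: ReverseComplement is a name made up for this example, and it relies on StrReverse, the built-in already used in Command1_Click, which is available in VB6.

```vb
' Hypothetical helper: returns the complement of a strand, reversed
' so that it reads in the 5'->3' direction.
Private Function ReverseComplement(ByVal Strand As String) As String
    Dim lngCnt As Long
    Dim strResult As String

    'Build the complement one character at a time
    For lngCnt = 1 To Len(Strand)
        Select Case UCase$(Mid$(Strand, lngCnt, 1))
            Case "A": strResult = strResult & "T"
            Case "T": strResult = strResult & "A"
            Case "C": strResult = strResult & "G"
            Case "G": strResult = strResult & "C"
            Case Else: strResult = strResult & "*"   'not A, T, C or G
        End Select
    Next lngCnt

    'Reverse the complement to switch from 3'->5' to 5'->3'
    ReverseComplement = StrReverse(strResult)
End Function
```

For the example in the question, ReverseComplement("aactgg") should return "CCAGTT". The counter is declared As Long rather than As Integer so that strands longer than 32,767 characters do not overflow it.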

Expert Comment

ID: 8098369
I agree with jgv. The truth is, I guessed you were looking for easy code, so I gave you a simple and easy way to do it. If you want to add the "*" char, you just add it to the Case and the If. It's easy and effective to do this in a string because you could send it to a textbox etc., so all of these are working solutions. I hope you are not trying to make a very big string or the program will overflow. But it will work with something like more than 200 chars.
LVL 12

Expert Comment

ID: 8098804
igfp, I'm curious as to why you think that you would get an overflow with a large string? A variable-length string can contain up to approximately 2 billion (2^31) characters (according to the help files). With your example, the string could be as large as 32,767 characters (the Integer variable "CharsInString%" is the limitation).

Expert Comment

ID: 8098920
Well, because I get an overflow! It's weird, people tell me! Actually, when I try to encrypt a rich text box I get an overflow. I have hundreds of lines of formatted text and I can't decrypt it! When I encrypt I go char by char, but I can't do that when I'm decrypting (I don't know why!) so I'm using a string! And it overflows! That's why. In C I had overflows on Linux systems with big strings and I'm experiencing the same in VB. But everyone tells me it should not happen! :(
LVL 12

Expert Comment

ID: 8099411
As I mentioned in the last post, if you exceed a numeric data type's storage limit you get an overflow. Perhaps you are using too small a data type in your encrypt/decrypt program and that is where the problem is. To show you this, try this simple test:

Dim s As String
Dim i As Integer

s = String(32768, "a")

MsgBox Len(s)

i = Len(s) '<---Overflow error here
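
For comparison, here is a sketch of the same test with the length stored in a Long, which should run without the overflow (DemoLongLength and the variable n are just names made up for this example):

```vb
' Hypothetical demo: same test as above, but with a Long for the length
Private Sub DemoLongLength()
    Dim s As String
    Dim n As Long   'a Long holds values up to 2,147,483,647

    s = String(32768, "a")

    n = Len(s)      'no overflow: 32768 fits in a Long
    MsgBox n
End Sub
```

The same applies to loop counters: a counter declared As Long (or with the & suffix instead of %) can index every character of a string that is longer than 32,767 characters.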

The only other thing I can think of is that you have a memory leak somewhere. If you keep encountering this you should perhaps ask a question of your own and include some code. I'm sure that if it's a coding issue someone will be able to help you track it down.

Author Comment

ID: 8101527
Thanks folks, I'm overwhelmed by the response.  I'll have to print this out and sit and eyeball the various suggestions.  Once I've got a working model I'll post it here.


Expert Comment

ID: 8900460
This old question needs to be finalized -- accept an answer, split points, or get a refund.  For information on your options, please click here-> http:/help/closing.jsp#1 
Experts: Post your closing recommendations!  Who deserves points here?
