snerken
asked on
A practical app for Molecular Biology
I'm working on a little VB application that will take a short DNA sequence of the letters A,T,C and G and convert them into the complementary sequence and then print this sequence back to front.
In molecular biology we define the orientation of a cDNA (sense) strand as 5'->3'. The antisense strand runs 3'->5' direction.
Problem scenario:
1. Input a short sequence of the letters A, T, C and G into an array. If the sequence is e.g. "aactgg", its complementary strand would be "ttgacc".
2. Once the strand has been translated to its complementary strand (i.e. "ttgacc" in the 3'->5' direction), the orientation is reversed to "ccagtt" so that it reads 5'->3'.
I've written a working solution that converts the letters to their complement, one at a time:
If string1 = "a" Then
    string2 = "t"
ElseIf string1 = "t" Then
    string2 = "a"
ElseIf string1 = "c" Then
    string2 = "g"
ElseIf string1 = "g" Then
    string2 = "c"
Else
    string2 = "*"
End If
This If/Else statement doesn't go over each element of the array (string). I'm working on various nested loop combinations such as:
For x = 0 To x = strLength
    If string1 = "a" Then
        string2 = "t"
    ElseIf string1 = "t" Then
        string2 = "a"
    ElseIf string1 = "c" Then
        string2 = "g"
    ElseIf string1 = "g" Then
        string2 = "c"
    Else
        string2 = "*"
    End If
Next x
where strLength is the length of the input string and/or the number of elements in the array. Which leads me to ask: is it necessary to put the letters into an array at all?
Any suggestions are much appreciated.
Snerken
Brisbane
Australia
ASKER CERTIFIED SOLUTION
snerken,
Although you could use an array to store the DNA strand, it is not necessary. All of the examples use the "Len" function, which is better suited to this type of situation. If you did use an array, you would set up a loop using LBound (lower bound) and UBound (upper bound):
For x = LBound(DNAArray) To UBound(DNAArray)
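To make the array variant concrete, here is a minimal sketch of that loop, assuming the strand has already been stored one letter per array element. The names ComplementArray and TestComplementArray and the comma-separated test input are made up for illustration, and Split assumes VB6 (it is not in earlier versions):

```vb
'Sketch only: ComplementArray is a hypothetical helper, not part of the posted solutions
Private Function ComplementArray(DNAArray() As String) As String
    Dim x As Long
    Dim strResult As String

    'LBound/UBound give the first and last valid index of the array
    For x = LBound(DNAArray) To UBound(DNAArray)
        Select Case UCase(DNAArray(x))
            Case "A": strResult = strResult & "T"
            Case "T": strResult = strResult & "A"
            Case "C": strResult = strResult & "G"
            Case "G": strResult = strResult & "C"
            Case Else: strResult = strResult & "*"
        End Select
    Next x

    ComplementArray = strResult
End Function

Private Sub TestComplementArray()
    Dim DNAArray() As String

    'Split builds a zero-based string array: "a", "a", "c", "t", "g", "g"
    DNAArray = Split("a,a,c,t,g,g", ",")
    MsgBox ComplementArray(DNAArray)   'Shows TTGACC
End Sub
```

The loop body is the same as in the string version; the array only changes how each letter is fetched, which is why it adds little here.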
As you can see, there are many ways to accomplish what you are trying to do. Both igfp and supunr have given working solutions but I figured I'd throw in a couple more. I documented the code as best I could so that you could understand what is happening. The VB help files will give you the rest of the details on the functions being used. You can test these examples by copying and pasting the code into a new VB project; don't forget to add a command button.
Private Sub Command1_Click()
    Dim strDNACompliment As String
    'Pass the DNA strand to the ComplimentaryDNA function; it will
    'return the converted complementary 3->5 strand. Add a "2" to the
    'end of "ComplimentaryDNA" to call the other function; you'll get
    'the same results
    strDNACompliment = ComplimentaryDNA("TAgGCTa")
    '"StrReverse" is a built-in VB function that will reverse a string...
    MsgBox "3-5 Complement: " & strDNACompliment & vbCr & _
           "5-3 Complement: " & StrReverse(strDNACompliment)
End Sub
Private Function ComplimentaryDNA(ByVal Strand As String) As String
    Dim intCnt As Integer
    Dim intMapPosition As Integer
    Dim strDNAMap1 As String
    Dim strDNAMap2 As String
    Dim strDNAElement As String
    Dim str3to5Compliment As String
    'The "strDNAMap1" and "strDNAMap2" variables hold the DNA complement map
    ' Map1 is used to locate each element from the DNA strand
    ' Map2 is used to get the complementary element for each element from the DNA strand
    '
    'Example: DNA strand is "TGACT"
    '
    'The position of each element of the strand in Map 1 is:
    ' T=2, G=4, A=1, C=3, T=2
    '
    'If we use these positions to extract elements from Map 2 we get:
    ' 2=A, 4=C, 1=T, 3=G, 2=A
    '
    'So, "TGACT" = "ACTGA"
    strDNAMap1 = "ATCG"
    strDNAMap2 = "TAGC"
    'Loop through each element in the strand
    'The "Len" function returns the number of characters in a string
    For intCnt = 1 To Len(Strand)
        'Extract the current element from the DNA strand
        'The "Mid" function extracts a portion of a string; in this case we only want 1 character
        ' - Mid(TheString, StartPosition, Length)
        strDNAElement = Mid(Strand, intCnt, 1)
        'Get the current element's position from Map 1
        'The "InStr" function returns the start position of a string within another string
        ' - InStr(StartPosition, StringToSearch, StringToFind, HowToCompare)
        'We are doing a case-insensitive search so that "a" will equal "A".
        intMapPosition = InStr(1, strDNAMap1, strDNAElement, vbTextCompare)
        'If InStr returns 0, then the element was not found in the map. You would probably
        'want to prompt the user at this point but for now we'll substitute an asterisk.
        If intMapPosition = 0 Then
            'Element was not found in the map...
            str3to5Compliment = str3to5Compliment & "*"
        Else
            'Extract the complementary element from the appropriate position in Map 2 and add it
            'to the complementary string variable; again, we use the "Mid" function to accomplish this
            str3to5Compliment = str3to5Compliment & Mid(strDNAMap2, intMapPosition, 1)
        End If
    Next
    'Return the complementary sequence
    ComplimentaryDNA = str3to5Compliment
End Function
Private Function ComplimentaryDNA2(ByVal Strand As String) As String
    Dim intCnt As Integer
    Dim strDNAElement As String
    Dim strCompliment As String
    Dim str3to5Compliment As String
    'This example uses "Select Case" to figure out each element's
    'complement. It is similar to If Then statements.
    'Loop through each element in the strand
    For intCnt = 1 To Len(Strand)
        'Extract 1 character from the strand; note that the character
        'being extracted is converted to uppercase ("a" = "A")
        strDNAElement = UCase(Mid(Strand, intCnt, 1))
        'Select the Case that matches the extracted element
        Select Case strDNAElement
            Case "A": strCompliment = "T"
            Case "T": strCompliment = "A"
            Case "C": strCompliment = "G"
            Case "G": strCompliment = "C"
            Case Else
                'If we get here then we did not find a match to the element (it's not A, T, C or G)
                'You would probably want to prompt the user here but for now we'll substitute an
                'asterisk
                strCompliment = "*"
        End Select
        'Add the new element to the complementary DNA variable
        str3to5Compliment = str3to5Compliment & strCompliment
    Next
    'Return the complementary sequence
    ComplimentaryDNA2 = str3to5Compliment
End Function
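If you would rather get the 5'->3' result in a single pass instead of building the complement and then calling StrReverse, one possible sketch is to walk the strand backwards and look up each letter in the same "ATCG"/"TAGC" map. ReverseComplement is a name invented here for illustration; it is not one of the functions posted above:

```vb
'Sketch only: ReverseComplement is a hypothetical helper
Private Function ReverseComplement(ByVal Strand As String) As String
    Dim lngPos As Long
    Dim lngMapPosition As Long
    Dim strResult As String

    'Walk the strand from its last character to its first
    For lngPos = Len(Strand) To 1 Step -1
        'Case-insensitive lookup of the current letter in "ATCG"
        lngMapPosition = InStr(1, "ATCG", Mid(Strand, lngPos, 1), vbTextCompare)
        If lngMapPosition = 0 Then
            'Unknown letter: substitute an asterisk
            strResult = strResult & "*"
        Else
            'Same position in "TAGC" holds the complement
            strResult = strResult & Mid("TAGC", lngMapPosition, 1)
        End If
    Next lngPos

    ReverseComplement = strResult
End Function
```

Calling MsgBox ReverseComplement("aactgg") shows CCAGTT, which matches the "ccagtt" from the original question. For short strands there is no practical difference from complementing first and reversing afterwards; it is mostly a matter of taste.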
I agree with jgv. The truth is, I guessed you were looking for easy code, so I gave you a simple and easy way to do it. If you want to add the "*" char, you just add it to the Case and the If. It's easy and effective to do this in a string because you can send it to a textbox etc., so all of these are working solutions. I hope you are not trying to make a very big string, or the program will overflow. But it will work with something like more than 200 chars.
igfp, I'm curious as to why you think that you would get an overflow with a large string? A variable-length string can contain up to 2 billion (2^31) characters (according to the help files). With your example, the string could be as large as 32,767 characters (the integer variable "CharsInString%" is the limitation).
Well, because I get an overflow! It's weird, people tell me! Actually, when I try to encrypt a rich text box I get an overflow. I have hundreds of lines of formatted text and I can't decrypt it! When I encrypt I go char by char, but I can't do that when I'm decrypting (I don't know why!), so I'm using a string! And it overflows! That's why. In C I had overflows on Linux systems with big strings, and I'm experiencing the same in VB. But everyone tells me it should not happen! :(
As I mentioned in the last post, if you exceed a numeric data type's storage limit you get an overflow. Perhaps you are using too small a data type in your encrypt/decrypt program and that is where the problem is. To show you this, try this simple test:
Dim s As String
Dim i As Integer
s = String(32768, "a")
MsgBox Len(s)
i = Len(s) '<---Overflow error here
The only other thing I can think of is that you have a memory leak somewhere. If you keep encountering this, you should perhaps ask a question of your own and include some code. I'm sure that if it's a coding issue someone will be able to help you track it down.
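For comparison with the overflow test above, here is the same test with the length held in a Long instead of an Integer. This is only a sketch of the likely fix; the variable name lngLength is made up, and whether it applies to the encrypt/decrypt program depends on code that was not posted:

```vb
Dim s As String
Dim lngLength As Long

s = String(32768, "a")
'A Long holds values up to 2,147,483,647, so 32768 fits without an overflow
lngLength = Len(s)
MsgBox lngLength   'Shows 32768
```

The same applies to loop counters: a counter declared As Integer overflows as soon as it has to step past 32,767, even though the string itself is fine.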
ASKER
Thanks folks, I'm overwhelmed by the response. I'll have to print this out and sit and eyeball the various suggestions. Once I've got a working model I'll post it here.
Snerken
snerken:
This old question needs to be finalized -- accept an answer, split points, or get a refund. For information on your options, please click here-> http:/help/closing.jsp#1
Experts: Post your closing recommendations! Who deserves points here?
Private Function ComplementaryDNA(ByVal DNA As String) As String
    Dim i As Long
    Dim charAt As String
    Dim strResult As String
    DNA = UCase(DNA)
    strResult = ""
    'Loop over the input strand (DNA), not over the still-empty result
    For i = 1 To Len(DNA)
        charAt = Mid(DNA, i, 1)
        If (charAt = "A") Then
            strResult = strResult & "T"
        ElseIf (charAt = "T") Then
            strResult = strResult & "A"
        ElseIf (charAt = "G") Then
            strResult = strResult & "C"
        ElseIf (charAt = "C") Then
            strResult = strResult & "G"
        End If
    Next i
    'Reverse the complement so that it reads 5'->3'
    ComplementaryDNA = ReverseString(strResult)
End Function
Private Function ReverseString(str As String) As String
    Dim i As Long
    Dim strResult As String
    strResult = ""
    'Append the characters from last to first
    For i = Len(str) To 1 Step -1
        strResult = strResult & Mid(str, i, 1)
    Next i
    ReverseString = strResult
End Function
Somewhere in your code...
MsgBox ComplementaryDNA("AAGCTTCG
Good Luck!