Solved

Don't write duplicates into an array in VB.NET

Posted on 2016-08-18
Last Modified: 2016-08-18
I have some duplicate words in my table data.

I identified them with expert help as part of ID: 41760693.

I think removing them after the insert is not right, and I wonder if the duplicates can be removed just after the string is split
but before the words get inserted into the SQL Server db itself.

The For loop I have:

For Each drAccessRecord As DataRow In dtRecordsFromAccess.Rows
    'create array of words from string
    Dim StrArray() As String = Split(drAccessRecord(FieldDescription))
    'deal with progress
    Form1.ProgressBar1.Step = 1
    Form1.ProgressBar1.Minimum = 1
    Form1.ProgressBar1.Maximum = y
    'for each word in the array
    For index = LBound(StrArray) To UBound(StrArray)
        'don't allow unwanted characters
        StrClientCodeWordPos = drAccessRecord(FieldNameClientCode) & "_" & RemoveUnwantedChr(StrArray(index)) & "_" & index + 1
        StrClientCode = drAccessRecord(FieldNameClientCode)
        StrFull = RemoveUnwantedChr(drAccessRecord(FieldDescription))
        StrClientName = StrClientName
        'get the word
        StrWord = RemoveUnwantedChr(StrArray(index))
        intWordLen = Len(StrArray(index))
        'mark position of word (important to preserve sentence creation)
        IntWordPosition = index + 1
        IntNoOfWords = UBound(StrArray) + 1
        'insert word and other values into sql db
        cmdInsert.CommandText = "INSERT INTO TblWords (ClientCodeWordPosition, ClientCode, ClientName, Word, WordLen, StrFull, WordPosition, NoOfWords) VALUES ('" & StrClientCodeWordPos & "','" & StrClientCode & "','" & StrClientName & "','" & StrWord & "'," & intWordLen & ",'" & StrFull & "'," & IntWordPosition & "," & IntNoOfWords & " )"
        cmdInsert.ExecuteNonQuery()

    Next index

Question by:PeterBaileyUk
13 Comments
 

Author Comment

by:PeterBaileyUk
ID: 41761097
Maybe something like this.
I'm not sure how to make the new array equal to the one with duplicates.

Dim alDeDup As New ArrayList

Dim pos As Integer = 0
Do Until pos = alDeDup.Count
    Dim i As Integer
    For i = alDeDup.Count - 1 To pos + 1 Step -1
        If alDeDup(pos).ToString = alDeDup(i).ToString Then
            alDeDup.RemoveAt(i)
        End If
    Next
    pos += 1
Loop

 

Author Comment

by:PeterBaileyUk
ID: 41761118
I've got a loop comparing each element with the previous one. I've kept the same array for now; it's just a question of creating the new array (I think).

I've put this after the split:

Dim pos As Integer = 0
Do Until pos = StrArray.Count
    Dim i As Integer
    For i = StrArray.Count - 1 To pos + 1 Step -1
        If StrArray(pos).ToString = StrArray(i).ToString Then
            'do not add row to new array
        Else
            'add row to new array
        End If
    Next
    pos += 1
Loop

 

Author Comment

by:PeterBaileyUk
ID: 41761124
Here is the latest fragment of what I am trying to solve:

For Each drAccessRecord As DataRow In dtRecordsFromAccess.Rows
    Dim StrArray() As String = Split(drAccessRecord(FieldDescription))
    Dim StrArrayNoDup() As String

    Dim pos As Integer = 0
    Do Until pos = StrArray.Count
        Dim i As Integer
        For i = StrArray.Count - 1 To pos + 1 Step -1
            If StrArray(pos).ToString = StrArray(i).ToString Then
                ' do not add
            Else
                StrArrayNoDup(i) = StrArray(pos)
            End If
        Next
        pos += 1
    Loop


 

Author Comment

by:PeterBaileyUk
ID: 41761133
It didn't work, but maybe it's just something silly I did; the idea seems sound.
 
LVL 33

Expert Comment

by:it_saige
ID: 41761143
So you are saying that the duplicates are in StrArray?  If that is the case then you can use Enumerable.Distinct in order to remove duplicates; e.g. -
Dim StrArray = Split(drAccessRecord(FieldDescription)).Distinct(StringComparer.OrdinalIgnoreCase)


Proof of concept -
Module Module1
	' Declared As String so that Split receives a typed argument rather than an Object.
	Private description As String = "I HAVE duplicates with CAse MiXing the DuPliCates should be removed when I split and use distinct ensuring caSE MixinG is accounted For"

	Sub Main()
		Console.WriteLine("A normal split of: {0}{1}{0}Produces {2} individual words with duplicates.", Environment.NewLine, description, Split(description).Count())
		Console.WriteLine()
		Console.WriteLine("On the other hand, a split of the preceding passed through{0}Enumerable.Distinct() produces {1} individual words not considering case mixing.", Environment.NewLine, Split(description).Distinct().Count())
		Console.WriteLine()
		Console.WriteLine("If case mixing is an issue, a split of the preceding passed through{0}Enumerable.Distinct(StringComparer.OrdinalIgnoreCase) produces {1} individual{0}words with no ordinal duplicates.", Environment.NewLine, Split(description).Distinct(StringComparer.OrdinalIgnoreCase).Count())
		Console.ReadLine()
	End Sub
End Module

Produces the following output: Capture.JPG

But it looks like you need to retain the position of the removed duplicates. Is this accurate?

-saige-
 

Author Comment

by:PeterBaileyUk
ID: 41761157
I take a vehicle description string from an Access db; it is split, then the words are stored along with their positions.

I don't need the duplicated word at all, so:

if FieldDescription is: "BM BM 125 roadstar"

currently it goes to SQL Server and is stored as

bm pos 1
bm pos 2
125 pos 3
roadstar pos 4

The array, if that's possible (and I think it is, from your description),

would be created as

bm pos 1
125 pos 2
roadstar pos 3

Then that can be added to SQL as is, without the duplicate.
 
LVL 33

Expert Comment

by:it_saige
ID: 41761162
That is correct.  Using the method I described, Distinct after your split, on "BM BM 125 roadstar" will produce an array { "BM", "125", "roadstar" }.

Which means the resulting entries into sql will be

BM pos 1
125 pos 2
roadstar pos 3

But if the description is "BM bm 125 roadstar", a plain Distinct will leave you with an array of { "BM", "bm", "125", "roadstar" }. This is why I add a comparer (in this case StringComparer.OrdinalIgnoreCase): it tells Distinct to treat words that differ only in case as duplicates.
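
As a rough illustration of the difference, here is a small console sketch using the two sample descriptions from this thread. The module name DistinctSketch and the variable names are made up for the example. One caveat: the documentation describes the result of Distinct as unordered, although in practice LINQ to Objects yields each word in first-seen order, which is what the renumbered positions below rely on.

```vbnet
Imports System
Imports System.Linq
Imports Microsoft.VisualBasic

Module DistinctSketch
    Sub Main()
        ' Sample descriptions taken from the thread.
        Dim samples() As String = {"BM BM 125 roadstar", "BM bm 125 roadstar"}

        For Each description As String In samples
            ' Case-sensitive: "BM" and "bm" count as different words.
            Dim caseSensitive() As String = Split(description).Distinct().ToArray()
            ' Case-insensitive: words that differ only in case are collapsed.
            Dim ignoreCase() As String = Split(description).Distinct(StringComparer.OrdinalIgnoreCase).ToArray()

            Console.WriteLine("{0} -> {1} word(s) case-sensitive, {2} ignoring case", description, caseSensitive.Length, ignoreCase.Length)

            ' Positions are renumbered from 1 once the duplicates are gone.
            For index As Integer = 0 To ignoreCase.Length - 1
                Console.WriteLine("  {0} pos {1}", ignoreCase(index), index + 1)
            Next
        Next
    End Sub
End Module
```

For "BM bm 125 roadstar" the case-sensitive count should be 4 and the case-insensitive count 3, matching the description above.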

-saige-
 

Author Comment

by:PeterBaileyUk
ID: 41761166
I am just populating the table now. Hopefully it really was a relatively simple change; VB.NET is actually quite amazing.
 

Author Comment

by:PeterBaileyUk
ID: 41761188
I tried it like this:
 Dim StrArray = Split(drAccessRecord(FieldDescription)).Distinct(StringComparer.OrdinalIgnoreCase)

It ran but added no records.

If I do this:
 Dim StrArray ()= Split(drAccessRecord(FieldDescription)).Distinct(StringComparer.OrdinalIgnoreCase)

I get an error underline in the editor.
 
LVL 33

Expert Comment

by:it_saige
ID: 41761211
Try with your original declaration:
Dim StrArray() As String = Split(drAccessRecord(FieldDescription)).Distinct(StringComparer.OrdinalIgnoreCase).ToArray()


Your project's configuration may not infer local variable types (Option Infer Off), or you may have Option Strict On at the top of your code file. Either way, make sure you add .ToArray() after the Distinct method: Distinct returns an IEnumerable(Of String) rather than an array, so LBound and UBound will fail on it.
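
A minimal sketch of why .ToArray() matters, assuming a plain console project; the module and variable names are invented for the illustration. It checks whether each value is actually a System.Array before handing it to LBound and UBound.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports Microsoft.VisualBasic

Module ToArraySketch
    Sub Main()
        Dim description As String = "BM bm 125 roadstar"

        ' Distinct is lazy and yields a sequence, not an array.
        Dim sequence As IEnumerable(Of String) = Split(description).Distinct(StringComparer.OrdinalIgnoreCase)
        Console.WriteLine("Sequence is an array: {0}", TypeOf sequence Is Array)

        ' ToArray materialises the sequence so LBound/UBound can be used on it.
        Dim words() As String = sequence.ToArray()
        Console.WriteLine("Words is an array: {0}", TypeOf words Is Array)

        For index As Integer = LBound(words) To UBound(words)
            Console.WriteLine("{0} pos {1}", words(index), index + 1)
        Next
    End Sub
End Module
```

The first check should report False and the second True, which is why the original For index = LBound(StrArray) To UBound(StrArray) loop only works once .ToArray() is in place.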

-saige-
 

Author Comment

by:PeterBaileyUk
ID: 41761224
It's failing on the For loop: For index = LBound(StrArray) To UBound(StrArray)


       Using cnSql As New SqlClient.SqlConnection("Data Source=MAIN-PC\SQLEXPRESS;Initial Catalog=Dictionary;Integrated Security=True;MultipleActiveResultSets=True")
                Using cmdInsert As New SqlClient.SqlCommand
                    cmdInsert.Connection = cnSql
                    cnSql.Open()


                    y = dtRecordsFromAccess.Rows.Count
                    For Each drAccessRecord As DataRow In dtRecordsFromAccess.Rows
                        'Dim StrArray() As String = Split(drAccessRecord(FieldDescription))


                        Dim StrArray = Split(drAccessRecord(FieldDescription)).Distinct(StringComparer.OrdinalIgnoreCase)

                        Form1.ProgressBar1.Step = 1
                        Form1.ProgressBar1.Minimum = 1
                        Form1.ProgressBar1.Maximum = y
                        For index = LBound(StrArray) To UBound(StrArray)

                            StrClientCodeWordPos = drAccessRecord(FieldNameClientCode) & "_" & RemoveUnwantedChr(StrArray(index)) & "_" & index + 1
                            StrClientCode = drAccessRecord(FieldNameClientCode)
                            StrFull = RemoveUnwantedChr(drAccessRecord(FieldDescription))
                            StrClientName = StrClientName
                            StrWord = RemoveUnwantedChr(StrArray(index))
                            intWordLen = Len(StrArray(index))
                            IntWordPosition = index + 1
                            IntNoOfWords = UBound(StrArray) + 1

                            cmdInsert.CommandText = "INSERT INTO TblWords (ClientCodeWordPosition, ClientCode, ClientName, Word, WordLen, StrFull, WordPosition, NoOfWords) VALUES ('" & StrClientCodeWordPos & "','" & StrClientCode & "','" & StrClientName & "','" & StrWord & "'," & intWordLen & ",'" & StrFull & "'," & IntWordPosition & "," & IntNoOfWords & " )"
                            cmdInsert.ExecuteNonQuery()

                        Next index

                        Form1.ProgressBar1.PerformStep()
                        Form1.Label3.Text = "# of Files Read = " & Math.Round((Form1.ProgressBar1.Value.ToString / y) * 100, 2) & "%"
                        Form1.Label3.Refresh()
                    Next
                End Using
                cnSql.Close()
            End Using

        Catch ex As Exception


        Finally

            con.Close()
        End Try

 
LVL 33

Accepted Solution

by:it_saige (earned 500 total points)
ID: 41761254
Try adding .ToArray() after Distinct(StringComparer.OrdinalIgnoreCase); e.g. -
Dim StrArray = Split(drAccessRecord(FieldDescription)).Distinct(StringComparer.OrdinalIgnoreCase).ToArray()
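
A likely reason the earlier attempt "ran but added no records" is that the failure on LBound was swallowed by the empty Catch block in the posted code. That is an inference from the fragment rather than something confirmed in the thread. The sketch below (hypothetical module and variable names) reproduces the failed conversion and reports it instead of ignoring it.

```vbnet
Imports System
Imports System.Linq
Imports Microsoft.VisualBasic

Module SilentCatchSketch
    Sub Main()
        ' Typed as Object so the failed conversion happens at run time,
        ' as it would with a late-bound variable under Option Strict Off.
        Dim notAnArray As Object = Split("BM bm 125 roadstar").Distinct(StringComparer.OrdinalIgnoreCase)

        Try
            ' LBound expects a System.Array; the Distinct result is not one.
            Dim first As Integer = LBound(CType(notAnArray, Array))
            Console.WriteLine("Lower bound: {0}", first)
        Catch ex As Exception
            ' Report the failure instead of leaving the Catch block empty.
            Console.WriteLine("Loop setup failed: {0}: {1}", ex.GetType().Name, ex.Message)
        End Try
    End Sub
End Module
```

Logging ex.Message (or showing it in a message box) inside the Catch would have pointed straight at the missing .ToArray().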

-saige-
 

Author Closing Comment

by:PeterBaileyUk
ID: 41761339
Thank you. I will have a follow-on question about "" (empty) words, but I will ask it as a new question tomorrow.
Question has a verified solution.