?
Solved

Optimize Format Functions (DataTable,Vb.Net, VS2005)

Posted on 2007-07-31
14
Medium Priority
?
1,465 Views
Last Modified: 2013-11-26
Hi guys..

Need some help with this.. I like to optimize this function, if it's possible?

The function takes a datatable, finds the datecolumns that have the format YYYYMMDD

But I need to change that format to YYYY-MM-DD

So I use this function.. but it takes alot of time now.. it's about 20 000 rows in the datatable

So any help would be great

Thx: :)

 Private Function CreateSQLTable(ByVal dt As DataTable, ByVal SQLcolumNames() As String) As DataTable

        Dim myDataTable As New DataTable("resa")
        Dim column As DataColumn

        For i As Integer = 0 To m_Test_CSVcolumNames.Length - 1
            column = New DataColumn(m_Test_CSVcolumNames(i))
            column.DataType = GetType(String)
            myDataTable.Columns.Add(column)
        Next

        Dim row As DataRow
        Dim Value As String = String.Empty

        For Each MyRow As DataRow In dt.Rows
            'Create a new row in the SQL table
            row = myDataTable.NewRow()

            ' Copy row's
            For Each myColumn As DataColumn In dt.Columns
                ' Get the value
                Value = MyRow(myColumn).ToString
                ' Fix value if it's a date value
                If String.CompareOrdinal(myColumn.ColumnName, "DateStart") = 0 Then
                    Value = Value.Substring(0, 4) + "-" + Value.Substring(4, 2) + "-" + Value.Substring(6, 2)
                ElseIf String.CompareOrdinal(myColumn.ColumnName, "DateEnd") = 0 Then
                    Value = Value.Substring(0, 4) + "-" + Value.Substring(4, 2) + "-" + Value.Substring(6, 2)
                End If
                ' Add data to row
                row(myColumn.ColumnName) = CastValue(Value, myDataTable.Columns(myColumn.ColumnName).DataType)
            Next
            ' Add row to datatable
            myDataTable.Rows.Add(row)
        Next
        Return myDataTable
    End Function
0
Comment
Question by:AWestEng
  • 7
  • 4
  • 3
14 Comments
 
LVL 96

Expert Comment

by:Bob Learned
ID: 19602690
1) Are you trying to import CSV to an SQL Server table?  

2) Are you using VB.NET 2005?

Bob
0
 
LVL 1

Author Comment

by:AWestEng
ID: 19603277
1: yes but I need to fix them because i'm using them for other stuff to
2: Yes
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 19603312
If you are writing dates to a SQL Server table, it can parse dates in the format yyyyMMdd, and store the values.  Are they date columns or string columns?

Bob
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
LVL 1

Author Comment

by:AWestEng
ID: 19603369
date columns in the database, but I use the datatable to create the querys. that's no problem-

 I insert over 8000 rows/sec .so thats optimized to the max, I have tested all types of methods, and this is the fastest. (except LOAD FILE, but I can't use this because how I must handle the data)

the only thing I have a problem with is that it takes alot of time to change the format in the datatable.

So it would be greate if you have any tips to make this faster. :)
0
 
LVL 34

Accepted Solution

by:
Sancler earned 1920 total points
ID: 19607193
For the core operation, using [String].Insert would be quicker than repeatedly using .Substring to build up a String.  I've just tested these two functions

    Private Function doChange(ByVal value As String) As String
        Return value.Substring(0, 4) + "-" + value.Substring(4, 2) + "-" + value.Substring(6, 2)
    End Function

    Private Function doChange2(ByVal value As String) As String
        Return value.Insert(6, "-").Insert(4, "-")
    End Function

and the second is at least 25% faster than the first and can be as much as 50% faster.

As a general rule, if you want to refer to a "calculated" value or reference a number of times the best course is to calculate it once and then use the result.  Every time your code says myColumn.ColumnName, a look up is necessary.  It's fast, but you do it four times in each pass through the inner (myColumn) loop, and that may - you will need to test - make it preferrable to replace it with

    Dim colName As String = ""

before the loops commence, putting

        colName = myColumn.ColumnName

as the first line in the inner loop and then changing the myColumn.ColumnName to colName in the rest of the inner loop code.

But that code, in any event, looks to me to contain unnecessary operations.  All columns in the new datatable are of datatype String.  That is clear from this line

           column.DataType = GetType(String)

when you set the columns up.  Value is of datatype String.  That is clear from this line

        Dim Value As String = String.Empty

So I do not see any real purpose for this part

     CastValue(Value, myDataTable.Columns(myColumn.ColumnName).DataType)

of this line

                row(myColumn.ColumnName) = CastValue(Value, myDataTable.Columns(myColumn.ColumnName).DataType)

The code for CastValue is not shown, but it looks like you are converting what is already a String into a String - and using code that requires complicated look ups to do so.

But I really wonder whether the whole approach is more complicated and time-consuming than it needs to be.  What are the differences in structure between the two tables?  I suppose that, if the CSV-based table does have datatypes in it other than string, some such approach may be necessary.  But if that is not the case I think a datatable.Copy, followed by code to loop just through the two columns - DateStart and DateEnd - that you want to modify would be more efficient.  Indeed, even if there is a change of datatypes involved, it may be more efficient to start with datatable.Copy and then add columns to the target table, loop through the relevant columns putting cast values in the new columns, and then delete the old columns.  And then again, even if you do need a "YYYY-MM-DD" format for string presentation of dates for some purposes, it does not automatically follow that the best approach to achieving that is to create a datatable holding strings in that format.  So, for instance, either of the Functions illustrated at the start of this post could produce such a result specifically when you needed it like that.

All those comments are general.  What is "right" has to be judged in context and we are looking at this particular function out of context.  But I think it might be worth you taking a wider look at the issue of optimization rather than just saying "how can I make this code faster".

Roger
0
 
LVL 96

Assisted Solution

by:Bob Learned
Bob Learned earned 80 total points
ID: 19607487
The SqlBulkCopy for 2005 is very optimized for the bulk insert processing of a DataTable.

Bob
0
 
LVL 1

Author Comment

by:AWestEng
ID: 19616584
Sancler:: Yhx fot all the comments.. :)

You are absolutly correct about the funcntion.

I have a datatable that's in strings and I just ehant to change the "DateStart" and "DateEnd" so it's even ok to change the orginal datatable values and return that.

- : so should I make a new datatable, change ta values
-: or copy the existing one to a new one,, and change the values
-: change the values in the orginal dattable.
0
 
LVL 34

Expert Comment

by:Sancler
ID: 19616658
If you're looking for the quickest, then change the values in the orignal datatable.  Whatever you do you are going to have to change those values.  Moving them anywhere else before you do so is (a) not necessary and (b) takes time and resources.

Roger
0
 
LVL 1

Author Comment

by:AWestEng
ID: 19616755
What do you think about this one ?

    Private Function CreateSQLTable(ByVal dt As DataTable) As DataTable
        Dim ST As New StringBuilder
        Dim Value As String = String.Empty
        Dim ColumnName As String = String.Empty

        For Each MyRow As DataRow In dt.Rows
            ' change values
            For Each myColumn As DataColumn In dt.Columns
   
                ColumnName = myColumn.ColumnName

                ' Fix value if it's a date value
                If String.CompareOrdinal(myColumn.ColumnName, "DateStart") = 0 Then
                    ' Get the value
                    Value = MyRow(myColumn).ToString

                    ST.Append(Value.Substring(0, 4))
                    ST.Append("-")
                    ST.Append(Value.Substring(4, 2))
                    ST.Append("-")
                    ST.Append(Value.Substring(6, 2))
                    Value = ST.ToString
                    ST.Length = 0

                    MyRow.Item(myColumn) = Value

                ElseIf String.CompareOrdinal(myColumn.ColumnName, "DateEnd") = 0 Then
                    ' Get the value
                    Value = MyRow(myColumn).ToString

                    ST.Append(Value.Substring(0, 4))
                    ST.Append("-")
                    ST.Append(Value.Substring(4, 2))
                    ST.Append("-")
                    ST.Append(Value.Substring(6, 2))
                    Value = ST.ToString
                    ST.Length = 0

                    MyRow.Item(myColumn) = Value
                End If

            Next
        Next

        Return dt
    End Function
0
 
LVL 34

Assisted Solution

by:Sancler
Sancler earned 1920 total points
ID: 19616891
I'd need to test to see if using a stringbuilder with substrings was quicker than using [String].Insert.  But you could do that.

This approach

                If String.CompareOrdinal(myColumn.ColumnName, "DateStart") = 0 Then

is less efficient than it needs to be.  Whatever columns contain DateStart and DateEnd they are going to be the same on every row.  So you don't need to search for them every time.  Before you get into the For Each row loops you need to find the indexes for which columns you want and then use those index numbers within the loop.  At the moment you are cycling through every column in dealing with every row.

And if you do that you can get rid of all the ColumnName and myColumn.ColumnName stuff.

Roger
0
 
LVL 1

Author Comment

by:AWestEng
ID: 19616919
It's the index 1 and 2
0
 
LVL 1

Author Comment

by:AWestEng
ID: 19617007
something like this?

    Private Function CreateSQLTable(ByVal dt As DataTable) As DataTable
        Dim ST As New StringBuilder
        Dim Value As String = String.Empty
        Dim ColumnName As String = String.Empty

        For Each MyRow As DataRow In dt.Rows
            Value = MyRow.Item(1).ToString

            ST.Append(Value.Substring(0, 4))
            ST.Append("-")
            ST.Append(Value.Substring(4, 2))
            ST.Append("-")
            ST.Append(Value.Substring(6, 2))
            Value = ST.ToString
            ST.Length = 0

            MyRow.Item(1) = Value

            Value = MyRow.Item(3).ToString

            ST.Append(Value.Substring(0, 4))
            ST.Append("-")
            ST.Append(Value.Substring(4, 2))
            ST.Append("-")
            ST.Append(Value.Substring(6, 2))
            Value = ST.ToString
            ST.Length = 0

            MyRow.Item(3) = Value
        Next

        Return dt
    End Function
0
 
LVL 34

Expert Comment

by:Sancler
ID: 19617457
Yes, something like that.

Roger
0
 
LVL 1

Author Comment

by:AWestEng
ID: 19617985
Thx m8!! :)

I really appreciated all the help.. :)
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Many of us here at EE write code. Many of us write exceptional code; just as many of us write exception-prone code. As we all should know, exceptions are a mechanism for handling errors which are typically out of our control. From database errors, t…
Real-time is more about the business, not the technology. In day-to-day life, to make real-time decisions like buying or investing, business needs the latest information(e.g. Gold Rate/Stock Rate). Unlike traditional days, you need not wait for a fe…
This tutorial covers a step-by-step guide to install VisualVM launcher in eclipse.
The viewer will learn how to use NetBeans IDE 8.0 for Windows to connect to a MySQL database. Open Services Panel: Create a new connection using New Connection Wizard: Create a test database called eetutorial: Create a new test tabel called ee…
Suggested Courses
Course of the Month15 days, 2 hours left to enroll

839 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question