string management

my problem is simple.
i need to import data and insert/update my databases tables.
i am keeping track of the adds, inserts and deletes.

My current structure first reads all the key values from the target table into a dataset. During the process i scan this table for the key. If found it is an update, if not found i add the key and update the database, if it is a delete i remove the key from the table.

This works and eliminates hits on my database and web site.  My database is accessed via web services so it is critical not to make to many hits on the database.

I ran into a problem the other day when i tried to import 200,000 records into a blank table. As the data set did repeated adds my preformance went into the tank.

Can anyone suggest the best way to build an in memory table of keys with up o 1/4 of million records that allows fast add delete and searching. Key size can be upto 40 characters.

By the way I know it is the dataset management, because i disabled it on my initial import where the table was blank and I was able to update my table at a rate of 100+ records a second.  With the dataset enabled by peformance dropped to 5 records a second.

hope this all makes sense
thanks
in advance


 






LVL 8
AutoeformsAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Bob LearnedCommented:
What is the underlying database system type (SQL Server, Oracle, ...)?

Bob
0
AutoeformsAuthor Commented:
it is mysql but could be mssql it is client selectable.

 but what does it matter

greg
0
Bob LearnedCommented:
Greg,

If you have VB.NET 2005, and SQL Server, then there is an optimized way of doing bulk insert with a single DataTable, and the SqlBulkCopy class, versus doing multiple single-record inserts.

Bob
0
Become a CompTIA Certified Healthcare IT Tech

This course will help prep you to earn the CompTIA Healthcare IT Technician certification showing that you have the knowledge and skills needed to succeed in installing, managing, and troubleshooting IT systems in medical and clinical settings.

AutoeformsAuthor Commented:
thanks but my DB does not need to be SQL Server.

It is also not a direct connect, the updates happen to the remote server via web services.  Unless I am missing something is there a better way to be doing this with out web services?

greg
0
Bob LearnedCommented:
Where is the remote server located?  Web services provide a simple approach to database connectivity, but is not the "perfect" solution, since it requires SOAP serialization/deserialization in order to work.  It can really simplify things if you are working with Oracle, since it requires the client software to be installed on each client machine.  

Bob
0
AutoeformsAuthor Commented:
not using oracle and in our model we don't want the database connectivity from the clients. we have centralized it for management, control and security.

the web services are actually working fine. my problem is more with the localized dataset that i am inserting new keys into. It is dragging the system and the process int the ground after insert about 10000 keys in the dataset.

I am thinking about changing it to a collection but I am looking for the best practice for managing and searching what could be 1/4 million strings that could be up to 40 characters in length. My best guess is the alloc and re-alloc of memory is causing the problem.

is this making sense?
thanks for the help, i am glad you are the one that jumped on this.

greg


0
Bob LearnedCommented:
Oracle was only an example of why you would use a web service.  There are plenty of other reasons for web services, but if it is for security, then there are better approaches then using a web service.

DataSets can add a bit of overhead, and it depends on what the end result of adding data to the database requirements are, as to whether they are the right choice.  With hash table-type structures, you would have O(n) to add, and O(1) to search.

Bob
0
AutoeformsAuthor Commented:
i know i can test it but am under the gun :). will the hash table fragment memory or can i reserve a huge block.

what direction would you point me in for better options then web services?

greg
0
Bob LearnedCommented:
What kind of security are you wanting to achieve with web services?  

With 2005, there is the generics version of the Hashtable structure, and is type-safe (Dictionary(Of String, String)).  One of the constructors for the Dictionary takes a capacity argument.  If you don't specify the initial capacity, then it defaults to 16, and doubles in size when additional capacity is required.  

Example:
Dim dataList As New Dictionary(Of String, String)(250000)

Bob
0
AutoeformsAuthor Commented:
we have two conversations running.

the security is for my databases. that is what the web services are managing.

My client does need the security on the keys. I just need a structure that is super fast.

to recap i started with an empty table and needed to insert all the keys. by the time i got to 10000 the system was crawling. this is where i am guessing memory was really busted into thousands of little pieces (i am old time C programmer and can only image what was going on behind the covers).

I am going to try this
Dim dataList As New Dictionary(Of String, String)(250000)


can i change the realloc point so i could say make the size 5000 and realloc in 5000 blocks automatically so i don't waste size for smaller imports.

thanks for the help by the way
g
0
Bob LearnedCommented:
When you reach the point of allocating memory for the dictionary, do you know the import data size?   That number doesn't need to be a fixed, constant size.  It can be a variable size:

Dim document As New XmlDocument()
document.Load(fileName)
Dim nodeList As XmlNodeList = document.SelectNodes("//question")
Dim dataList As New Dictionary(Of String, String)(nodeList.Count)

Bob
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
AutoeformsAuthor Commented:
no i could guesstimate, it is a delimited file terminated with CRLF.
g
0
Bob LearnedCommented:
Do you read that file into memory, before you process?  You could borrow from a WinForms RichTextBox control to process the text.  Then, you could get the number of lines to determine size.

Bob
0
AutoeformsAuthor Commented:
nah, i read line by line, i was trying to conserve memory realizing the file could be hundreds of thousands of lines.

g
0
Bob LearnedCommented:
I would read one of the large files in, and see what kind of memory usage you get, before you say something like that.  If you use a local RichTextBox control, it should be garbage collected fairly quickly, or you could even implement IDisposable, if you don't want to wait.

Bob
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Visual Basic.NET

From novice to tech pro — start learning today.