Understanding DataSets - Under the hood

Posted on 2007-07-29
Last Modified: 2010-04-23
Hello,

This is regarding VB 2005 as I'm trying to get a clearer understanding of DataSets.

The way I understand it is that when you add a DataSource to your application it creates TableAdapters. These initially have nothing to do with the DataSet since one hasn't been created at the time of adding a DataSource which is why they have their own Namespace.

Now we create a DataSet using the Designer (although we don't have to use the designer as I understand it, I am right now). We can add all our tables and then any relationships we want defined at design time. So far so good.

Now, when we add a table to a form, things start to get fuzzy. What I see is a DataSet created for each form I add data to. It seems wasteful to have a full set of data created for each form, particularly if you have a DataSet with a lot of tables, say 25 or more.

Would this mean that each form has a full copy of all the data from the database for all the tables in the DataSet? That would be very wasteful. Or does the DataSet copied onto the form serve as a mechanism for keeping all data used by the form synced up, with only the data that form uses actually pulled from the database, along with any required related data as defined by the relationships in the DataSet?

The second scenario would make more sense from an efficiency point of view unless the Application was relying on some sort of caching of data that all DataSets in the application pull from, which I wouldn't discount with data pooling and so on.

Right now, to keep all forms synced up with each other, I replace the DataSet that is on the form with a global DataSet (such as: Me.TRUWESTDataSet = DS) that I set up at the start of my app. This works well, but understanding more of the underlying efficiencies of how things work, or the lack thereof, would certainly help me know whether my architectures were reasonable for the application/usage for which they are designed.

Question by:RegProctor
    LVL 96

    Expert Comment

    by:Bob Learned
    1) You can use the designer to make any changes to the DataSet.

    2) You can have a single class (not a form), that houses the instance of the typed dataset, and provide access to all forms from a single place.

    3) Using the n-tiered approach, all of the functionality of the data access would be in a single class that wraps up database access.

    4) I don't drag tables to forms, but provide access to data by other means.
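As a rough illustration of point 2, here is a minimal sketch of a class that houses one DataSet instance for the whole application. It uses a plain untyped DataSet in place of the designer-generated typed DataSet so that it stands alone. The names AppData, Data, and the Customers table are made up for the example and are not from the original project.

```vb
Imports System
Imports System.Data

' Hypothetical holder class: one DataSet instance shared by every form.
Public NotInheritable Class AppData
    ' Created once, the first time the class is used.
    Private Shared ReadOnly _data As DataSet = CreateDataSet()

    ' Private constructor: the class is only used through its Shared members.
    Private Sub New()
    End Sub

    Public Shared ReadOnly Property Data() As DataSet
        Get
            Return _data
        End Get
    End Property

    ' Assumed structure for the example; a typed DataSet would define this itself.
    Private Shared Function CreateDataSet() As DataSet
        Dim ds As New DataSet("AppData")
        Dim customers As DataTable = ds.Tables.Add("Customers")
        customers.Columns.Add("Id", GetType(Integer))
        customers.Columns.Add("Name", GetType(String))
        Return ds
    End Function
End Class

Module SharedDataSetDemo
    Sub Main()
        ' Two "forms" asking for data receive the same instance.
        Dim seenByFormA As DataSet = AppData.Data
        Dim seenByFormB As DataSet = AppData.Data

        seenByFormA.Tables("Customers").Rows.Add(1, "Alice")

        Console.WriteLine(Object.ReferenceEquals(seenByFormA, seenByFormB)) ' True
        Console.WriteLine(seenByFormB.Tables("Customers").Rows.Count)       ' 1
    End Sub
End Module
```

Each form would then bind its controls to AppData.Data (or a table in it) in code rather than to a DataSet component dropped on the form.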

    LVL 1

    Author Comment

    Thanks Bob,

    I've just done a form where I systematically removed all the components that dragging tables to forms adds to the forms - in my case from linking DataGridViews to data at design time. So now I can create forms without using the dragging. And, I have all my Data routines in a single class.

    It still doesn't explain, though, what is happening with the data for each form when you link your design-time DataSet to objects on forms at design time, through dragging or otherwise. So far I've figured out that when you run your application the data on the form seems to be pulled into the form independently of other forms. This seems to suggest that the DataSet that is placed on the form is independent of all other DataSets placed on other forms.

    Now, with that being the case, is the DataSet on that form also pulling a complete copy of all data from all tables defined in the design-time DataSet when you run the app, or just the data that that form needs? If it's pulling all data from all tables in the DataSet, that could easily be a complete copy of all data in your database for every form! Wow, that would be terribly inefficient and I would have to question why it worked that way at all.

    LVL 96

    Accepted Solution

    When you create DataSet instances on separate forms, each form gets its own separate instance, and that is not efficient.

    The usual recommendation is to keep a single instance of the DataSet on the main form and share it among the other forms.

    If you have multiple TableAdapters defined, the DataSet isn't filled until there is a call to the Fill method from a TableAdapter.  In this way, you can have a partially filled DataSet.
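To make the "partially filled DataSet" idea concrete, here is a small sketch under some assumptions. It uses an untyped DataSet with two made-up tables, and FillCustomers is a hypothetical stand-in for a generated TableAdapter's Fill method (a real one would query the database). The point is only that defining a table in the DataSet does not load any rows into it.

```vb
Imports System
Imports System.Data

Module PartialFillDemo
    ' Builds a DataSet whose structure covers two tables, with no rows yet.
    Function BuildSchema() As DataSet
        Dim ds As New DataSet("Sample")

        Dim customers As DataTable = ds.Tables.Add("Customers")
        customers.Columns.Add("Id", GetType(Integer))
        customers.Columns.Add("Name", GetType(String))

        Dim orders As DataTable = ds.Tables.Add("Orders")
        orders.Columns.Add("Id", GetType(Integer))
        orders.Columns.Add("CustomerId", GetType(Integer))

        Return ds
    End Function

    ' Hypothetical stand-in for a TableAdapter's Fill call; it adds two
    ' sample rows instead of reading them from a database.
    Function FillCustomers(ByVal table As DataTable) As Integer
        table.Rows.Add(1, "Alice")
        table.Rows.Add(2, "Bob")
        Return table.Rows.Count
    End Function

    Sub Main()
        Dim ds As DataSet = BuildSchema()

        ' Only Customers is filled; Orders keeps its structure but stays empty.
        FillCustomers(ds.Tables("Customers"))

        For Each t As DataTable In ds.Tables
            Console.WriteLine("{0}: {1} row(s)", t.TableName, t.Rows.Count)
        Next
        ' Prints:
        ' Customers: 2 row(s)
        ' Orders: 0 row(s)
    End Sub
End Module
```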

    LVL 34

    Assisted Solution

    "The way I understand it is that when you add a DataSource to your application it creates TableAdapters. These initially have nothing to do with the DataSet since one hasn't been created at the time of adding a DataSource which is why they have their own Namespace.

    Now we create a DataSet using the Designer (although we don't have to use the designer as I understand it, I am right now). We can add all our tables and then any relationships we want defined at design time. So far so good."

    My understanding (based on empirical observation only) is a bit different.  When I use "add datasource" it appears immediately to create both the DataSet and the TableAdapters.  I say this because it is immediately possible to inspect the dataset.  Also, if the dataset had not been created, there would not be any detail of tables and fields/columns in the DataSources window that could be dragged and dropped onto a control.  The dataset has no data in it.  But it has the full structure of all the datatables that, when required, the TableAdapters can fill.

    Both the dataset and the tableadapters thus created are CLASS files.  When you "add a table to a form" and "things start to get fuzzy" what is happening is that INSTANCES of the relevant classes are placed on that form.  So, yes the form gets (a) an INSTANCE of the dataset and (b) an INSTANCE of the TableAdapter relevant to the specific DataTable that you've added to the form.  The instance of the DataSet, like the class, is empty of data, but it contains the structure definitions for all DataTables - including those that you have not put on the form.  The TableAdapter will be used to fill that INSTANCE of the DataSet with the relevant data in, and just in, "its own" DataTable.  So, yes, there is inefficiency in that this form has, in its instance of the DataSet, _definitions_ of DataTables that it is not going to use.  But no, the inefficiency does not extend to it having all those DataTables filled with data in its instance of the DataSet.

    But the other side of that coin is that - because each form has its own instance of the dataset - it has no direct knowledge of what is happening to the same or related data in other instances of the dataset on other forms.  It is not unusual, for instance (and taking a very simple example), for there to be a datagridview on one form and a detail view on another form of the record which is "selected" in the datagridview.  This often happens where the developer (or client) wants records to be added or edited other than on the browsing form.  But the standard set up means that the browsing form is looking at one DataTable - that in its own instance of the dataset - but the editing form is looking at a different DataTable - that in its own instance of the dataset.  So changes that are made in one are not _automatically_ reflected in the other even though, so far as the user is concerned, both are views of the _same_ datatable.
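The browsing-form/editing-form situation can be sketched as follows. This is only an illustration with assumed names: NewCustomerSet stands in for each form creating and filling its own DataSet instance, and the two variables stand in for the DataSets on the two forms. Pointing the editing form at the browsing form's instance is the same idea as the asker's Me.TRUWESTDataSet = DS assignment.

```vb
Imports System
Imports System.Data

Module IndependentInstancesDemo
    ' Stand-in for a form creating and filling its own DataSet instance.
    Function NewCustomerSet() As DataSet
        Dim ds As New DataSet("Sample")
        Dim customers As DataTable = ds.Tables.Add("Customers")
        customers.Columns.Add("Id", GetType(Integer))
        customers.Columns.Add("Name", GetType(String))
        customers.Rows.Add(1, "Alice")
        Return ds
    End Function

    Sub Main()
        ' Each "form" has its own instance, as with drag-and-drop binding.
        Dim browseFormData As DataSet = NewCustomerSet()
        Dim editFormData As DataSet = NewCustomerSet()

        ' An edit made on the editing form...
        editFormData.Tables("Customers").Rows(0)("Name") = "Alicia"
        ' ...is not visible to the browsing form.
        Console.WriteLine(browseFormData.Tables("Customers").Rows(0)("Name")) ' Alice

        ' Pointing the editing form at the browsing form's instance instead
        ' means both forms look at the same DataTable.
        editFormData = browseFormData
        editFormData.Tables("Customers").Rows(0)("Name") = "Alicia"
        Console.WriteLine(browseFormData.Tables("Customers").Rows(0)("Name")) ' Alicia
    End Sub
End Module
```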

    Which is why I think that, although the Add DataSource wizard and the drag and drop creation of bound controls make things _appear_ nice and easy, they are not - except in the simplest of cases - conducive to efficient data-handling.  I agree with Bob that, for anything but a minimalist app, there ought to be a central dataset and, if that then requires a bit of manual coding for bindings, the extra effort is worth it.

    I also don't really like TableAdapters.  But that is a different issue ;-)


    PS.  On re-reading that, and Bob's posts, before pressing "submit" I feel I've covered much of the same ground as him.  But I decided to post anyway as I have added bits about how (at least as I see it) the wizards and drag and drop systems are functioning.
    LVL 1

    Author Comment

    Thanks so much, I think I have a much better idea now. I guess the bottom line is that it was convenient for MS to put all those data objects on forms without regard for syncing up the data between forms. No doubt if you ask them it will be called a feature, not an inconvenience.
