Solved

C# - Parse through DB2 column data and INSERT the data as new rows into SQL

Posted on 2011-09-27
18
509 Views
Last Modified: 2013-12-17
Hi everyone!

I have data stored in a DB2 database which I am attempting to extract and put into a SQL Server 2005 db. My logic is to pull the db2 data into a DataSet so I can manipulate it, and then insert the DataSet into SQL.

Problem is - the data in db2 was stored in such a manner that I have to programatically parse through column data so they can be inserted into SQL as individual rows.

The data in the db2 server looks similare to this:

REQNUM(integer)        STATUS(varchar)        DATE(date)        USERID(varchar)        COMMENT(varchar)  
---------------------------------------------------------------------------
1        Initial,Approved,        08/08/2010,08/12/2010,        BLaher,KReheran,        product ready, real...,      
2        Rejected,        09/13/2011,        KGhracie,        product not ready,
45      Initial,Approved,Rejected        08/08/10,03/02/11,04/15/11         BLaher,KReheran,KGhracie,  n,n,n...,

Notice that all data related to a REQNUM  is stored in the same row separated by ','
REQNUM is not Unique.

In SQL I need it to look like this:

REQNUM(int)        STATUS(varchar)        DATE(datetime)        USERID(varchar)        COMMENT(varchar)  
---------------------------------------------------------------------------
1        Initial        08/08/2010        BLaher        product ready
1        Approved       08/12/2010        KReheran         realease approved      
2        Rejected        09/13/2011        KGhracie        product not ready
45      Initial        08/08/10         ...         ...
45      Approved        03/02/11       ...          ...
45      Rejected        04/15/11       ...          ...                        


So I have started an application in C# to parse the data, however it is returned in column data and I cannot figure out how to put it back into row data. The code I am using to parse the data is below.

Any help is greatly appreciated!!!
/// <summary>
        /// Method parses data and substrings row data on ','
        /// </summary>
        /// <param name="view">data from DataGridView</param>
        /// <returns>output array of CXHistory objects</returns>
        public string[] parseData(DataGridView view)
        {
            int i = 0;
            int count = 0;
            List<string> cxHistoryObj = new List<string>();
            int rowcount = ((view.Rows.Count) - 1);
            // iterrate through entire row
            for (int row = 0; row <= rowcount; row++)
            {
                int colCount = ((view.Columns.Count) - 1);
                // once all columns are complete in a row, go to next row
                for (int col = 2; col <= colCount; col++)
                {
                    // rowStrSub is next row to parse
                    string rowStrSub = view.Rows[row].Cells[col].Value.ToString();
                    if (rowStrSub.Length != null)
                    {
                        if (!rowStrSub.EndsWith(","))
                        {
                            // check if ',' exists at end of string - add one if it does not  
                            // for index purposes
                            rowStrSub = rowStrSub + ",";
                        }
                        // iterrate through the string and find all instances of a ','
                        for (int k = 0; k < rowStrSub.Length - 1; k = k)
                        {
                            count = rowStrSub.Split(',').Length - 1;
                            i = 0;
                            while ((i = rowStrSub.IndexOf(',', i)) != -1)
                            {
                                // retSubStr is returned sub string after indexed on ','
                                string retSubStr = (rowStrSub.Substring(k, i - k));
                                cxHistoryObj.Add(retSubStr);
                                k = i + 1;  
                                i++;
                            }   
                        }
                    }
                }
            }
            string[] output = cxHistoryObj.ToArray();
            //string msg = string.Format("Ouput size = {0}", output.Length);
            //MessageBox.Show(msg);

            return output;
         }

Open in new window

0
Comment
Question by:I_s
  • 10
  • 5
  • 2
  • +1
18 Comments
 
LVL 37

Expert Comment

by:momi_sabag
ID: 36712233
why don't you use the String.split method? it will make your life much easier
0
 

Author Comment

by:I_s
ID: 36712319
I guess I thought using indexOf was the most efficient, I suppose I could use String.Split, but would that take care of the problem at hand?

I have no trouble separating the strings with the following code, the problem is it puts them all into a single column.

I need it in rows?
0
 
LVL 45

Expert Comment

by:Kdo
ID: 36712559
Hi I s,

DB2 supports recursive SQL that will convert columns to rows (or rows to columns).  But if the maximum number of items on a single line is fixed (known), then you might be able to do this easier with straight SQL.  Or split the text withing C#.  All are very workable solutions.

Regarding recursive SQL, here's an Experts Exchange article that describes how to do it.  Adding the logic to split multiple strings is pretty easy.

  http://www.experts-exchange.com/Database/DB2/A_3618-Recursive-SQL-in-DB2-Converting-rows-to-columns.html


Kent
0
 
LVL 4

Expert Comment

by:gmarino
ID: 36714004
I'm sorry, but I cannot see hoe the data could be stored in the DB2 table as described.  The DATE column is defined as datetime - a field which cannot contain more than one date.  Are you SURE that SELECT * FROM TABLE returns columns with multiple comma-separated values???  

It sounds like you are querying an xml table, in which case there are xquery methods to parse out the data.
0
 
LVL 37

Expert Comment

by:momi_sabag
ID: 36714054
i don't understand the problem
you fetch a record from the db2 database and then for each column you do a split by comma
for each column you now have an array of values
you pick the array with most values and that is the number of rows you will insert
you either duplicate the id (or apply what ever logic you need) and add default values if some of the columns are missing some of the values (although, i don't think it is possible according to your description)
0
 

Author Comment

by:I_s
ID: 36716536
I apologize for the confusing question, I will post back with a more clear description of my problem. Just a little side tracked right now.

I appreciate your replies!
0
 

Author Comment

by:I_s
ID: 36716731
@gmarino

You are correct about the date col. This is a history (TBLHISTORY) table and the dates are pulled from TBLMAIN and converted to varchar. The DB2 Table which I am working with is structured like this:

CREATE TABLE "XXXXXX"."TBLHISTORY"  (
              "REQNUM" INTEGER NOT NULL ,
              "DATE" VARCHAR(600) ,
              "STATUS" VARCHAR(300) ,
              "USERID" VARCHAR(300) ,
              "COMMENT" VARCHAR(3000) )  
             IN "XXXXX" ;

I am by no means a DBA in SQL or DB2, I was tasked to port the data from one server to the other by any means possible.

In TBLHISTORY for some reason unknown, someone had the idea to save the data as I had explained above. REQNUM can have many STATUS - For each STATUS change, there is a DATE,USERID, and COMMENT - so they inserted the data in a single row separated by a ',' ...

So, I hope this is a little more clear, in DB2 the data looks as such:

REQNUM                 DATE                                     STATUS                            USERID                    COMMENT
----------------------------------------------------------------------------------------------------------------------------------
1                   08/08/10,09/09/10,...,       Initial,Change,Approved,       Ken,Bob,Nicki,      Comment1,Comment2,...

My code adds the following to List<string> cxHistoryObj; as such:

08/08/10
09/09/10
09/10/10
Initial
Change
Approved
Ken
Bob
Nicki
Comment1
Comment2
Comment3

I then want to put this data back into a DataTable as such:

1     08/08/10     Initial         Ken     Comment1
1     09/09/10     Change     Bob      Comment2
1     09/10/10     Approved  Nicki     Comment3


I hope this clears up my initial question, and thanks in advance for any assistance!
0
 

Author Comment

by:I_s
ID: 36717634
@kdo

Kent,

Great post, could this be done in a sp? I am trying to wrap my head around how I might split the strings in each column where a ',' exists, and then use recursive sql to put them into row data.

I'll post back soon
0
 
LVL 45

Expert Comment

by:Kdo
ID: 36717680
Hi I_s,

You can certainly wrap a stored procedure around that.  You just need to add a couple of columns.  Be glad to do that for you.

The challenge may be the data, no matter which approach you use.  The comment field could well contain embedded commas.  The Query will always split the string on the next comma, not necessarily the original separator.


Kent
0
Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 

Author Comment

by:I_s
ID: 36717682
Basically I am attempting to do the opposite,

Where your example looks like this:


SELECT * FROM rec;

SNUM  WORDNUM    WORD
1     1          This
1     2          is
1     3          a
1     4          fine
1     5          example

Mine looks like this:

SNUM  WORDNUM    WORD
1     1,2,3,4,5,          This,is,a,fine,example,

And I want your example as a result!

0
 

Author Comment

by:I_s
ID: 36717697
kdo,

They were cognizant of embedded commas!

All embedded commas were replaced with the symbol '#@#' - so, that should not present a problem!
0
 
LVL 45

Accepted Solution

by:
Kdo earned 500 total points
ID: 36717702
0
 

Author Comment

by:I_s
ID: 36717798
Again great post! Exactly what I needed,

Im going to except it as the solution, do you mind if I post back with any followup questions? I do not have much sql exposure...

Thanks for your assistance!
0
 
LVL 45

Expert Comment

by:Kdo
ID: 36718687
Hi I_s,

Feel free to ask more questions.  That's what Experts Exchange is all about.  :)

Grasping recursive SQL can be quite a challenge, especially if one is not well schooled in the principals of recursion and at least one recursive language.  Still, it's powerful stuff.

I've taken the liberty of expanding the query to generate the data that you want.  Just change the table name on line 49 to match your needs.  I've not tested it, but it should be pretty close.  If it gives you any trouble, just post back and we'll debug it.


Kent

WITH pquery (reqnum, subrow, status, date, userid, comment, status_string, date_string, userid_string, comment_string )
AS
(
--  Select the first token from the result strings
  SELECT reqnum, 1, 
    CASE WHEN LOCATE (',', status) > 0 THEN 
      SUBSTR (status, 1, LOCATE (',', status) - 1)
    ELSE
      status
    END status,
    CASE WHEN LOCATE (',', date) > 0 THEN 
      SUBSTR (date, 1, LOCATE (',', date) - 1)
    ELSE
      date
    END date,
    CASE WHEN LOCATE (',', userid) > 0 THEN 
      SUBSTR (userid, 1, LOCATE (',', userid) - 1)
    ELSE
      userid
    END userid,
    CASE WHEN LOCATE (',', comment) > 0 THEN 
      SUBSTR (comment, 1, LOCATE (',', comment) - 1)
    ELSE
      comment
    END comment,
    
--  Select the remainder strings that will be passed back to the query
    CASE WHEN locate (',', status) > 0 THEN
      LTRIM (SUBSTR (status, LOCATE (',', status) + 1))
    ELSE
      NULL
    END status_string,
    CASE WHEN locate (',', date) > 0 THEN
      LTRIM (SUBSTR (sentence, LOCATE (',', date) + 1))
    ELSE
      NULL
    END date_string,
    CASE WHEN locate (',', userid) > 0 THEN
      LTRIM (SUBSTR (userid, LOCATE (',', userid) + 1))
    ELSE
      NULL
    END userid_string,
    CASE WHEN locate (',', comment) > 0 THEN
      LTRIM (SUBSTR (comment, LOCATE (',', comment) + 1))
    ELSE
      NULL
    END comment_string
    
  FROM {yourtable} base 
  
  UNION ALL

--  Now select the first (next) token in each of the remainder strings
  SELECT reqnum, subrow + 1, 
    CASE WHEN LOCATE (',', status_string) > 0 then 
      SUBSTR (status_string, 1, LOCATE (',', status_string) - 1)
    ELSE
      status_string
    END status,
    CASE WHEN LOCATE (',', userid_string) > 0 then 
      SUBSTR (userid_string, 1, LOCATE (',', userid_string) - 1)
    ELSE
      userid_string
    END userid,
    CASE WHEN LOCATE (',', date_string) > 0 then 
      SUBSTR (date_string, 1, LOCATE (',', date_string) - 1)
    ELSE
      date_string
    END date,
    CASE WHEN LOCATE (',', comment_string) > 0 then 
      SUBSTR (comment_string, 1, LOCATE (',', comment_string) - 1)
    ELSE
      comment_string
    END comment,

--  And build the new remainder strings    
    CASE WHEN LOCATE (',', status_string) > 0 THEN
      LTRIM (SUBSTR (status_string, LOCATE (',', status_string) + 1))
    ELSE
      NULL
    END status_string
    CASE WHEN LOCATE (',', userid_string) > 0 THEN
      LTRIM (SUBSTR (userid_string, LOCATE (',', userid_string) + 1))
    ELSE
      NULL
    END userid_string
    CASE WHEN LOCATE (',', date_string) > 0 THEN
      LTRIM (SUBSTR (date_string, LOCATE (',', date_string) + 1))
    ELSE
      NULL
    END date_string
    CASE WHEN LOCATE (',', comment_string) > 0 THEN
      LTRIM (SUBSTR (comment_string, LOCATE (',', comment_string) + 1))
    ELSE
      NULL
    END comment_string
  FROM pquery t0
  WHERE t0.status_string IS NOT NULL
     OR t0.userid_string IS NOT NULL
     OR t0.date_string IS NOT NULL
     OR t0.comment_string IS NOT NULL
)
SELECT * 
FROM pquery pq;

Open in new window

0
 

Author Comment

by:I_s
ID: 36718837
Thanks for the code, as it stands, I recieve the following errors.

I am using SQL Server 2005, is LOCATE not compatible?

Msg 195, Level 15, State 10, Line 10
'LOCATE' is not a recognized built-in function name.
Msg 195, Level 15, State 10, Line 59
'LOCATE' is not a recognized built-in function name.
Msg 128, Level 15, State 1, Line 86
The name "userid_string" is not permitted in this context. Valid expressions are constants, constant expressions, and (in some contexts) variables. Column names are not permitted.
Msg 128, Level 15, State 1, Line 91
The name "date_string" is not permitted in this context. Valid expressions are constants, constant expressions, and (in some contexts) variables. Column names are not permitted.
Msg 128, Level 15, State 1, Line 96
The name "comment_string" is not permitted in this context. Valid expressions are constants, constant expressions, and (in some contexts) variables. Column names are not permitted.
0
 

Author Comment

by:I_s
ID: 36718869
I just realized that the error is probably becuase I am writing the query in SSMS... and not DB2..
0
 
LVL 45

Expert Comment

by:Kdo
ID: 36718896

Yep.  That's a DB2 query.  To run it in SQL Server you'll have to replace the LOCATE function with CHARINDEX.


Kent
0
 

Author Comment

by:I_s
ID: 36815223
Works great Kent,

Thanks again for your help! I can definetly use recursive sql in the future!


Also, for anyone who may look at this post, I was able to do this successfully using C# as well with the following code:
/// <summary>
        /// getCXHistoryData is a method which makes a connection to the DB2 udb and returns a CXHistory Obj.
        /// </summary>
        /// <returns>cxhObjectsList is List of CXHistory Obj's</returns>
        public List<CXHISTORY> getCXHistoryData(){
            OleDbCommand oledbCMD;
            OleDbConnection oledbCONN;
            OleDbDataReader oledbReader;
            cxhObjectsList = new List<CXHISTORY>();
            errors = new List<string>();
            try
            {
                using (oledbCONN = new OleDbConnection(ApplicationConsts.DB2_CONNECTION_STRING))
                {
                    oledbCMD = new OleDbCommand(ApplicationConsts.HISTORY_QUERY, oledbCONN);
                    oledbCONN.Open();
                    using (oledbReader = oledbCMD.ExecuteReader())
                    {
                        while (oledbReader.Read())
                        {
                            int requestnum = ((int)oledbReader["REQUESTNUMBER"]);
                            string finstatus = oledbReader["FINALSTATUS"].ToString();
                            string dates = oledbReader["DATE"].ToString();
                            string userId = oledbReader["USERID"].ToString().ToUpper();
                            string status = oledbReader["STATUS"].ToString();
                            string comment = oledbReader["COMMENT"].ToString().Replace("@#@", ",");

                            string[] rowDates = dates.Split(',');
                            string[] rowUserId = userId.Split(',');
                            string[] rowStatus = status.Split(',');
                            string[] rowComment = comment.Split(',');

                            bool equal = rowDates.Length == rowUserId.Length && rowStatus.Length == rowComment.Length && rowDates.Length == rowStatus.Length;
                            if (!equal)
                            {
                                string error = string.Format("Error: REQUESTNUMBER: {0} Inconsistent number of entries found in row data.", requestnum);
                                errors.Add(error);
                                //throw new Exception(error);
                            }
                            else
                            {
                                for (int i = 0; i < rowDates.Length - 1; i++)
                                {
                                    CXHISTORY cxhObj = new CXHISTORY(requestnum, finstatus, rowDates[i], rowStatus[i], rowUserId[i], rowComment[i]);
                                    cxhObjectsList.Add(cxhObj);
                                    i++;
                                }
                            }
                        }
                    }
                }
            }
            catch(Exception e)
            {
                MessageBox.Show(e.ToString());
            }
            return cxhObjectsList;
        }

Open in new window

0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

Recently while returning home from work my wife (another .NET developer) was murmuring something. On further poking she said that she has been assigned a task where she has to serialize and deserialize objects and she is afraid of serialization. Wha…
We all know that functional code is the leg that any good program stands on when it comes right down to it, however, if your program lacks a good user interface your product may not have the appeal needed to keep your customers happy. This issue can…
Internet Business Fax to Email Made Easy - With eFax Corporate (http://www.enterprise.efax.com), you'll receive a dedicated online fax number, which is used the same way as a typical analog fax number. You'll receive secure faxes in your email, fr…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now