• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 547
  • Last Modified:

C# - Parse through DB2 column data and INSERT the data as new rows into SQL

Hi everyone!

I have data stored in a DB2 database which I am attempting to extract and put into a SQL Server 2005 db. My logic is to pull the db2 data into a DataSet so I can manipulate it, and then insert the DataSet into SQL.

Problem is - the data in db2 was stored in such a manner that I have to programatically parse through column data so they can be inserted into SQL as individual rows.

The data in the db2 server looks similare to this:

REQNUM(integer)        STATUS(varchar)        DATE(date)        USERID(varchar)        COMMENT(varchar)  
---------------------------------------------------------------------------
1        Initial,Approved,        08/08/2010,08/12/2010,        BLaher,KReheran,        product ready, real...,      
2        Rejected,        09/13/2011,        KGhracie,        product not ready,
45      Initial,Approved,Rejected        08/08/10,03/02/11,04/15/11         BLaher,KReheran,KGhracie,  n,n,n...,

Notice that all data related to a REQNUM  is stored in the same row separated by ','
REQNUM is not Unique.

In SQL I need it to look like this:

REQNUM(int)        STATUS(varchar)        DATE(datetime)        USERID(varchar)        COMMENT(varchar)  
---------------------------------------------------------------------------
1        Initial        08/08/2010        BLaher        product ready
1        Approved       08/12/2010        KReheran         realease approved      
2        Rejected        09/13/2011        KGhracie        product not ready
45      Initial        08/08/10         ...         ...
45      Approved        03/02/11       ...          ...
45      Rejected        04/15/11       ...          ...                        


So I have started an application in C# to parse the data, however it is returned in column data and I cannot figure out how to put it back into row data. The code I am using to parse the data is below.

Any help is greatly appreciated!!!
/// <summary>
        /// Method parses data and substrings row data on ','
        /// </summary>
        /// <param name="view">data from DataGridView</param>
        /// <returns>output array of CXHistory objects</returns>
        public string[] parseData(DataGridView view)
        {
            int i = 0;
            int count = 0;
            List<string> cxHistoryObj = new List<string>();
            int rowcount = ((view.Rows.Count) - 1);
            // iterrate through entire row
            for (int row = 0; row <= rowcount; row++)
            {
                int colCount = ((view.Columns.Count) - 1);
                // once all columns are complete in a row, go to next row
                for (int col = 2; col <= colCount; col++)
                {
                    // rowStrSub is next row to parse
                    string rowStrSub = view.Rows[row].Cells[col].Value.ToString();
                    if (rowStrSub.Length != null)
                    {
                        if (!rowStrSub.EndsWith(","))
                        {
                            // check if ',' exists at end of string - add one if it does not  
                            // for index purposes
                            rowStrSub = rowStrSub + ",";
                        }
                        // iterrate through the string and find all instances of a ','
                        for (int k = 0; k < rowStrSub.Length - 1; k = k)
                        {
                            count = rowStrSub.Split(',').Length - 1;
                            i = 0;
                            while ((i = rowStrSub.IndexOf(',', i)) != -1)
                            {
                                // retSubStr is returned sub string after indexed on ','
                                string retSubStr = (rowStrSub.Substring(k, i - k));
                                cxHistoryObj.Add(retSubStr);
                                k = i + 1;  
                                i++;
                            }   
                        }
                    }
                }
            }
            string[] output = cxHistoryObj.ToArray();
            //string msg = string.Format("Ouput size = {0}", output.Length);
            //MessageBox.Show(msg);

            return output;
         }

Open in new window

0
I_s
Asked:
I_s
  • 10
  • 5
  • 2
  • +1
1 Solution
 
momi_sabagCommented:
why don't you use the String.split method? it will make your life much easier
0
 
I_sAuthor Commented:
I guess I thought using indexOf was the most efficient, I suppose I could use String.Split, but would that take care of the problem at hand?

I have no trouble separating the strings with the following code, the problem is it puts them all into a single column.

I need it in rows?
0
 
Kent OlsenData Warehouse Architect / DBACommented:
Hi I s,

DB2 supports recursive SQL that will convert columns to rows (or rows to columns).  But if the maximum number of items on a single line is fixed (known), then you might be able to do this easier with straight SQL.  Or split the text withing C#.  All are very workable solutions.

Regarding recursive SQL, here's an Experts Exchange article that describes how to do it.  Adding the logic to split multiple strings is pretty easy.

  http://www.experts-exchange.com/Database/DB2/A_3618-Recursive-SQL-in-DB2-Converting-rows-to-columns.html


Kent
0
Free recovery tool for Microsoft Active Directory

Veeam Explorer for Microsoft Active Directory provides fast and reliable object-level recovery for Active Directory from a single-pass, agentless backup or storage snapshot — without the need to restore an entire virtual machine or use third-party tools.

 
gmarinoCommented:
I'm sorry, but I cannot see hoe the data could be stored in the DB2 table as described.  The DATE column is defined as datetime - a field which cannot contain more than one date.  Are you SURE that SELECT * FROM TABLE returns columns with multiple comma-separated values???  

It sounds like you are querying an xml table, in which case there are xquery methods to parse out the data.
0
 
momi_sabagCommented:
i don't understand the problem
you fetch a record from the db2 database and then for each column you do a split by comma
for each column you now have an array of values
you pick the array with most values and that is the number of rows you will insert
you either duplicate the id (or apply what ever logic you need) and add default values if some of the columns are missing some of the values (although, i don't think it is possible according to your description)
0
 
I_sAuthor Commented:
I apologize for the confusing question, I will post back with a more clear description of my problem. Just a little side tracked right now.

I appreciate your replies!
0
 
I_sAuthor Commented:
@gmarino

You are correct about the date col. This is a history (TBLHISTORY) table and the dates are pulled from TBLMAIN and converted to varchar. The DB2 Table which I am working with is structured like this:

CREATE TABLE "XXXXXX"."TBLHISTORY"  (
              "REQNUM" INTEGER NOT NULL ,
              "DATE" VARCHAR(600) ,
              "STATUS" VARCHAR(300) ,
              "USERID" VARCHAR(300) ,
              "COMMENT" VARCHAR(3000) )  
             IN "XXXXX" ;

I am by no means a DBA in SQL or DB2, I was tasked to port the data from one server to the other by any means possible.

In TBLHISTORY for some reason unknown, someone had the idea to save the data as I had explained above. REQNUM can have many STATUS - For each STATUS change, there is a DATE,USERID, and COMMENT - so they inserted the data in a single row separated by a ',' ...

So, I hope this is a little more clear, in DB2 the data looks as such:

REQNUM                 DATE                                     STATUS                            USERID                    COMMENT
----------------------------------------------------------------------------------------------------------------------------------
1                   08/08/10,09/09/10,...,       Initial,Change,Approved,       Ken,Bob,Nicki,      Comment1,Comment2,...

My code adds the following to List<string> cxHistoryObj; as such:

08/08/10
09/09/10
09/10/10
Initial
Change
Approved
Ken
Bob
Nicki
Comment1
Comment2
Comment3

I then want to put this data back into a DataTable as such:

1     08/08/10     Initial         Ken     Comment1
1     09/09/10     Change     Bob      Comment2
1     09/10/10     Approved  Nicki     Comment3


I hope this clears up my initial question, and thanks in advance for any assistance!
0
 
I_sAuthor Commented:
@kdo

Kent,

Great post, could this be done in a sp? I am trying to wrap my head around how I might split the strings in each column where a ',' exists, and then use recursive sql to put them into row data.

I'll post back soon
0
 
Kent OlsenData Warehouse Architect / DBACommented:
Hi I_s,

You can certainly wrap a stored procedure around that.  You just need to add a couple of columns.  Be glad to do that for you.

The challenge may be the data, no matter which approach you use.  The comment field could well contain embedded commas.  The Query will always split the string on the next comma, not necessarily the original separator.


Kent
0
 
I_sAuthor Commented:
Basically I am attempting to do the opposite,

Where your example looks like this:


SELECT * FROM rec;

SNUM  WORDNUM    WORD
1     1          This
1     2          is
1     3          a
1     4          fine
1     5          example

Mine looks like this:

SNUM  WORDNUM    WORD
1     1,2,3,4,5,          This,is,a,fine,example,

And I want your example as a result!

0
 
I_sAuthor Commented:
kdo,

They were cognizant of embedded commas!

All embedded commas were replaced with the symbol '#@#' - so, that should not present a problem!
0
 
Kent OlsenData Warehouse Architect / DBACommented:
0
 
I_sAuthor Commented:
Again great post! Exactly what I needed,

Im going to except it as the solution, do you mind if I post back with any followup questions? I do not have much sql exposure...

Thanks for your assistance!
0
 
Kent OlsenData Warehouse Architect / DBACommented:
Hi I_s,

Feel free to ask more questions.  That's what Experts Exchange is all about.  :)

Grasping recursive SQL can be quite a challenge, especially if one is not well schooled in the principals of recursion and at least one recursive language.  Still, it's powerful stuff.

I've taken the liberty of expanding the query to generate the data that you want.  Just change the table name on line 49 to match your needs.  I've not tested it, but it should be pretty close.  If it gives you any trouble, just post back and we'll debug it.


Kent

WITH pquery (reqnum, subrow, status, date, userid, comment, status_string, date_string, userid_string, comment_string )
AS
(
--  Select the first token from the result strings
  SELECT reqnum, 1, 
    CASE WHEN LOCATE (',', status) > 0 THEN 
      SUBSTR (status, 1, LOCATE (',', status) - 1)
    ELSE
      status
    END status,
    CASE WHEN LOCATE (',', date) > 0 THEN 
      SUBSTR (date, 1, LOCATE (',', date) - 1)
    ELSE
      date
    END date,
    CASE WHEN LOCATE (',', userid) > 0 THEN 
      SUBSTR (userid, 1, LOCATE (',', userid) - 1)
    ELSE
      userid
    END userid,
    CASE WHEN LOCATE (',', comment) > 0 THEN 
      SUBSTR (comment, 1, LOCATE (',', comment) - 1)
    ELSE
      comment
    END comment,
    
--  Select the remainder strings that will be passed back to the query
    CASE WHEN locate (',', status) > 0 THEN
      LTRIM (SUBSTR (status, LOCATE (',', status) + 1))
    ELSE
      NULL
    END status_string,
    CASE WHEN locate (',', date) > 0 THEN
      LTRIM (SUBSTR (sentence, LOCATE (',', date) + 1))
    ELSE
      NULL
    END date_string,
    CASE WHEN locate (',', userid) > 0 THEN
      LTRIM (SUBSTR (userid, LOCATE (',', userid) + 1))
    ELSE
      NULL
    END userid_string,
    CASE WHEN locate (',', comment) > 0 THEN
      LTRIM (SUBSTR (comment, LOCATE (',', comment) + 1))
    ELSE
      NULL
    END comment_string
    
  FROM {yourtable} base 
  
  UNION ALL

--  Now select the first (next) token in each of the remainder strings
  SELECT reqnum, subrow + 1, 
    CASE WHEN LOCATE (',', status_string) > 0 then 
      SUBSTR (status_string, 1, LOCATE (',', status_string) - 1)
    ELSE
      status_string
    END status,
    CASE WHEN LOCATE (',', userid_string) > 0 then 
      SUBSTR (userid_string, 1, LOCATE (',', userid_string) - 1)
    ELSE
      userid_string
    END userid,
    CASE WHEN LOCATE (',', date_string) > 0 then 
      SUBSTR (date_string, 1, LOCATE (',', date_string) - 1)
    ELSE
      date_string
    END date,
    CASE WHEN LOCATE (',', comment_string) > 0 then 
      SUBSTR (comment_string, 1, LOCATE (',', comment_string) - 1)
    ELSE
      comment_string
    END comment,

--  And build the new remainder strings    
    CASE WHEN LOCATE (',', status_string) > 0 THEN
      LTRIM (SUBSTR (status_string, LOCATE (',', status_string) + 1))
    ELSE
      NULL
    END status_string
    CASE WHEN LOCATE (',', userid_string) > 0 THEN
      LTRIM (SUBSTR (userid_string, LOCATE (',', userid_string) + 1))
    ELSE
      NULL
    END userid_string
    CASE WHEN LOCATE (',', date_string) > 0 THEN
      LTRIM (SUBSTR (date_string, LOCATE (',', date_string) + 1))
    ELSE
      NULL
    END date_string
    CASE WHEN LOCATE (',', comment_string) > 0 THEN
      LTRIM (SUBSTR (comment_string, LOCATE (',', comment_string) + 1))
    ELSE
      NULL
    END comment_string
  FROM pquery t0
  WHERE t0.status_string IS NOT NULL
     OR t0.userid_string IS NOT NULL
     OR t0.date_string IS NOT NULL
     OR t0.comment_string IS NOT NULL
)
SELECT * 
FROM pquery pq;

Open in new window

0
 
I_sAuthor Commented:
Thanks for the code, as it stands, I recieve the following errors.

I am using SQL Server 2005, is LOCATE not compatible?

Msg 195, Level 15, State 10, Line 10
'LOCATE' is not a recognized built-in function name.
Msg 195, Level 15, State 10, Line 59
'LOCATE' is not a recognized built-in function name.
Msg 128, Level 15, State 1, Line 86
The name "userid_string" is not permitted in this context. Valid expressions are constants, constant expressions, and (in some contexts) variables. Column names are not permitted.
Msg 128, Level 15, State 1, Line 91
The name "date_string" is not permitted in this context. Valid expressions are constants, constant expressions, and (in some contexts) variables. Column names are not permitted.
Msg 128, Level 15, State 1, Line 96
The name "comment_string" is not permitted in this context. Valid expressions are constants, constant expressions, and (in some contexts) variables. Column names are not permitted.
0
 
I_sAuthor Commented:
I just realized that the error is probably becuase I am writing the query in SSMS... and not DB2..
0
 
Kent OlsenData Warehouse Architect / DBACommented:

Yep.  That's a DB2 query.  To run it in SQL Server you'll have to replace the LOCATE function with CHARINDEX.


Kent
0
 
I_sAuthor Commented:
Works great Kent,

Thanks again for your help! I can definetly use recursive sql in the future!


Also, for anyone who may look at this post, I was able to do this successfully using C# as well with the following code:
/// <summary>
        /// getCXHistoryData is a method which makes a connection to the DB2 udb and returns a CXHistory Obj.
        /// </summary>
        /// <returns>cxhObjectsList is List of CXHistory Obj's</returns>
        public List<CXHISTORY> getCXHistoryData(){
            OleDbCommand oledbCMD;
            OleDbConnection oledbCONN;
            OleDbDataReader oledbReader;
            cxhObjectsList = new List<CXHISTORY>();
            errors = new List<string>();
            try
            {
                using (oledbCONN = new OleDbConnection(ApplicationConsts.DB2_CONNECTION_STRING))
                {
                    oledbCMD = new OleDbCommand(ApplicationConsts.HISTORY_QUERY, oledbCONN);
                    oledbCONN.Open();
                    using (oledbReader = oledbCMD.ExecuteReader())
                    {
                        while (oledbReader.Read())
                        {
                            int requestnum = ((int)oledbReader["REQUESTNUMBER"]);
                            string finstatus = oledbReader["FINALSTATUS"].ToString();
                            string dates = oledbReader["DATE"].ToString();
                            string userId = oledbReader["USERID"].ToString().ToUpper();
                            string status = oledbReader["STATUS"].ToString();
                            string comment = oledbReader["COMMENT"].ToString().Replace("@#@", ",");

                            string[] rowDates = dates.Split(',');
                            string[] rowUserId = userId.Split(',');
                            string[] rowStatus = status.Split(',');
                            string[] rowComment = comment.Split(',');

                            bool equal = rowDates.Length == rowUserId.Length && rowStatus.Length == rowComment.Length && rowDates.Length == rowStatus.Length;
                            if (!equal)
                            {
                                string error = string.Format("Error: REQUESTNUMBER: {0} Inconsistent number of entries found in row data.", requestnum);
                                errors.Add(error);
                                //throw new Exception(error);
                            }
                            else
                            {
                                for (int i = 0; i < rowDates.Length - 1; i++)
                                {
                                    CXHISTORY cxhObj = new CXHISTORY(requestnum, finstatus, rowDates[i], rowStatus[i], rowUserId[i], rowComment[i]);
                                    cxhObjectsList.Add(cxhObj);
                                    i++;
                                }
                            }
                        }
                    }
                }
            }
            catch(Exception e)
            {
                MessageBox.Show(e.ToString());
            }
            return cxhObjectsList;
        }

Open in new window

0

Featured Post

Get free NFR key for Veeam Availability Suite 9.5

Veeam is happy to provide a free NFR license (1 year, 2 sockets) to all certified IT Pros. The license allows for the non-production use of Veeam Availability Suite v9.5 in your home lab, without any feature limitations. It works for both VMware and Hyper-V environments

  • 10
  • 5
  • 2
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now