?
Solved

C# - Parse through DB2 column data and INSERT the data as new rows into SQL

Posted on 2011-09-27
18
Medium Priority
?
527 Views
Last Modified: 2013-12-17
Hi everyone!

I have data stored in a DB2 database which I am attempting to extract and put into a SQL Server 2005 db. My logic is to pull the db2 data into a DataSet so I can manipulate it, and then insert the DataSet into SQL.

Problem is - the data in db2 was stored in such a manner that I have to programatically parse through column data so they can be inserted into SQL as individual rows.

The data in the db2 server looks similare to this:

REQNUM(integer)        STATUS(varchar)        DATE(date)        USERID(varchar)        COMMENT(varchar)  
---------------------------------------------------------------------------
1        Initial,Approved,        08/08/2010,08/12/2010,        BLaher,KReheran,        product ready, real...,      
2        Rejected,        09/13/2011,        KGhracie,        product not ready,
45      Initial,Approved,Rejected        08/08/10,03/02/11,04/15/11         BLaher,KReheran,KGhracie,  n,n,n...,

Notice that all data related to a REQNUM  is stored in the same row separated by ','
REQNUM is not Unique.

In SQL I need it to look like this:

REQNUM(int)        STATUS(varchar)        DATE(datetime)        USERID(varchar)        COMMENT(varchar)  
---------------------------------------------------------------------------
1        Initial        08/08/2010        BLaher        product ready
1        Approved       08/12/2010        KReheran         realease approved      
2        Rejected        09/13/2011        KGhracie        product not ready
45      Initial        08/08/10         ...         ...
45      Approved        03/02/11       ...          ...
45      Rejected        04/15/11       ...          ...                        


So I have started an application in C# to parse the data, however it is returned in column data and I cannot figure out how to put it back into row data. The code I am using to parse the data is below.

Any help is greatly appreciated!!!
/// <summary>
        /// Method parses data and substrings row data on ','
        /// </summary>
        /// <param name="view">data from DataGridView</param>
        /// <returns>output array of CXHistory objects</returns>
        public string[] parseData(DataGridView view)
        {
            int i = 0;
            int count = 0;
            List<string> cxHistoryObj = new List<string>();
            int rowcount = ((view.Rows.Count) - 1);
            // iterrate through entire row
            for (int row = 0; row <= rowcount; row++)
            {
                int colCount = ((view.Columns.Count) - 1);
                // once all columns are complete in a row, go to next row
                for (int col = 2; col <= colCount; col++)
                {
                    // rowStrSub is next row to parse
                    string rowStrSub = view.Rows[row].Cells[col].Value.ToString();
                    if (rowStrSub.Length != null)
                    {
                        if (!rowStrSub.EndsWith(","))
                        {
                            // check if ',' exists at end of string - add one if it does not  
                            // for index purposes
                            rowStrSub = rowStrSub + ",";
                        }
                        // iterrate through the string and find all instances of a ','
                        for (int k = 0; k < rowStrSub.Length - 1; k = k)
                        {
                            count = rowStrSub.Split(',').Length - 1;
                            i = 0;
                            while ((i = rowStrSub.IndexOf(',', i)) != -1)
                            {
                                // retSubStr is returned sub string after indexed on ','
                                string retSubStr = (rowStrSub.Substring(k, i - k));
                                cxHistoryObj.Add(retSubStr);
                                k = i + 1;  
                                i++;
                            }   
                        }
                    }
                }
            }
            string[] output = cxHistoryObj.ToArray();
            //string msg = string.Format("Ouput size = {0}", output.Length);
            //MessageBox.Show(msg);

            return output;
         }

Open in new window

0
Comment
Question by:I_s
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 10
  • 5
  • 2
  • +1
18 Comments
 
LVL 37

Expert Comment

by:momi_sabag
ID: 36712233
why don't you use the String.split method? it will make your life much easier
0
 

Author Comment

by:I_s
ID: 36712319
I guess I thought using indexOf was the most efficient, I suppose I could use String.Split, but would that take care of the problem at hand?

I have no trouble separating the strings with the following code, the problem is it puts them all into a single column.

I need it in rows?
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 36712559
Hi I s,

DB2 supports recursive SQL that will convert columns to rows (or rows to columns).  But if the maximum number of items on a single line is fixed (known), then you might be able to do this easier with straight SQL.  Or split the text withing C#.  All are very workable solutions.

Regarding recursive SQL, here's an Experts Exchange article that describes how to do it.  Adding the logic to split multiple strings is pretty easy.

  http://www.experts-exchange.com/Database/DB2/A_3618-Recursive-SQL-in-DB2-Converting-rows-to-columns.html


Kent
0
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

 
LVL 4

Expert Comment

by:gmarino
ID: 36714004
I'm sorry, but I cannot see hoe the data could be stored in the DB2 table as described.  The DATE column is defined as datetime - a field which cannot contain more than one date.  Are you SURE that SELECT * FROM TABLE returns columns with multiple comma-separated values???  

It sounds like you are querying an xml table, in which case there are xquery methods to parse out the data.
0
 
LVL 37

Expert Comment

by:momi_sabag
ID: 36714054
i don't understand the problem
you fetch a record from the db2 database and then for each column you do a split by comma
for each column you now have an array of values
you pick the array with most values and that is the number of rows you will insert
you either duplicate the id (or apply what ever logic you need) and add default values if some of the columns are missing some of the values (although, i don't think it is possible according to your description)
0
 

Author Comment

by:I_s
ID: 36716536
I apologize for the confusing question, I will post back with a more clear description of my problem. Just a little side tracked right now.

I appreciate your replies!
0
 

Author Comment

by:I_s
ID: 36716731
@gmarino

You are correct about the date col. This is a history (TBLHISTORY) table and the dates are pulled from TBLMAIN and converted to varchar. The DB2 Table which I am working with is structured like this:

CREATE TABLE "XXXXXX"."TBLHISTORY"  (
              "REQNUM" INTEGER NOT NULL ,
              "DATE" VARCHAR(600) ,
              "STATUS" VARCHAR(300) ,
              "USERID" VARCHAR(300) ,
              "COMMENT" VARCHAR(3000) )  
             IN "XXXXX" ;

I am by no means a DBA in SQL or DB2, I was tasked to port the data from one server to the other by any means possible.

In TBLHISTORY for some reason unknown, someone had the idea to save the data as I had explained above. REQNUM can have many STATUS - For each STATUS change, there is a DATE,USERID, and COMMENT - so they inserted the data in a single row separated by a ',' ...

So, I hope this is a little more clear, in DB2 the data looks as such:

REQNUM                 DATE                                     STATUS                            USERID                    COMMENT
----------------------------------------------------------------------------------------------------------------------------------
1                   08/08/10,09/09/10,...,       Initial,Change,Approved,       Ken,Bob,Nicki,      Comment1,Comment2,...

My code adds the following to List<string> cxHistoryObj; as such:

08/08/10
09/09/10
09/10/10
Initial
Change
Approved
Ken
Bob
Nicki
Comment1
Comment2
Comment3

I then want to put this data back into a DataTable as such:

1     08/08/10     Initial         Ken     Comment1
1     09/09/10     Change     Bob      Comment2
1     09/10/10     Approved  Nicki     Comment3


I hope this clears up my initial question, and thanks in advance for any assistance!
0
 

Author Comment

by:I_s
ID: 36717634
@kdo

Kent,

Great post, could this be done in a sp? I am trying to wrap my head around how I might split the strings in each column where a ',' exists, and then use recursive sql to put them into row data.

I'll post back soon
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 36717680
Hi I_s,

You can certainly wrap a stored procedure around that.  You just need to add a couple of columns.  Be glad to do that for you.

The challenge may be the data, no matter which approach you use.  The comment field could well contain embedded commas.  The Query will always split the string on the next comma, not necessarily the original separator.


Kent
0
 

Author Comment

by:I_s
ID: 36717682
Basically I am attempting to do the opposite,

Where your example looks like this:


SELECT * FROM rec;

SNUM  WORDNUM    WORD
1     1          This
1     2          is
1     3          a
1     4          fine
1     5          example

Mine looks like this:

SNUM  WORDNUM    WORD
1     1,2,3,4,5,          This,is,a,fine,example,

And I want your example as a result!

0
 

Author Comment

by:I_s
ID: 36717697
kdo,

They were cognizant of embedded commas!

All embedded commas were replaced with the symbol '#@#' - so, that should not present a problem!
0
 
LVL 46

Accepted Solution

by:
Kent Olsen earned 2000 total points
ID: 36717702
0
 

Author Comment

by:I_s
ID: 36717798
Again great post! Exactly what I needed,

Im going to except it as the solution, do you mind if I post back with any followup questions? I do not have much sql exposure...

Thanks for your assistance!
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 36718687
Hi I_s,

Feel free to ask more questions.  That's what Experts Exchange is all about.  :)

Grasping recursive SQL can be quite a challenge, especially if one is not well schooled in the principals of recursion and at least one recursive language.  Still, it's powerful stuff.

I've taken the liberty of expanding the query to generate the data that you want.  Just change the table name on line 49 to match your needs.  I've not tested it, but it should be pretty close.  If it gives you any trouble, just post back and we'll debug it.


Kent

WITH pquery (reqnum, subrow, status, date, userid, comment, status_string, date_string, userid_string, comment_string )
AS
(
--  Select the first token from the result strings
  SELECT reqnum, 1, 
    CASE WHEN LOCATE (',', status) > 0 THEN 
      SUBSTR (status, 1, LOCATE (',', status) - 1)
    ELSE
      status
    END status,
    CASE WHEN LOCATE (',', date) > 0 THEN 
      SUBSTR (date, 1, LOCATE (',', date) - 1)
    ELSE
      date
    END date,
    CASE WHEN LOCATE (',', userid) > 0 THEN 
      SUBSTR (userid, 1, LOCATE (',', userid) - 1)
    ELSE
      userid
    END userid,
    CASE WHEN LOCATE (',', comment) > 0 THEN 
      SUBSTR (comment, 1, LOCATE (',', comment) - 1)
    ELSE
      comment
    END comment,
    
--  Select the remainder strings that will be passed back to the query
    CASE WHEN locate (',', status) > 0 THEN
      LTRIM (SUBSTR (status, LOCATE (',', status) + 1))
    ELSE
      NULL
    END status_string,
    CASE WHEN locate (',', date) > 0 THEN
      LTRIM (SUBSTR (sentence, LOCATE (',', date) + 1))
    ELSE
      NULL
    END date_string,
    CASE WHEN locate (',', userid) > 0 THEN
      LTRIM (SUBSTR (userid, LOCATE (',', userid) + 1))
    ELSE
      NULL
    END userid_string,
    CASE WHEN locate (',', comment) > 0 THEN
      LTRIM (SUBSTR (comment, LOCATE (',', comment) + 1))
    ELSE
      NULL
    END comment_string
    
  FROM {yourtable} base 
  
  UNION ALL

--  Now select the first (next) token in each of the remainder strings
  SELECT reqnum, subrow + 1, 
    CASE WHEN LOCATE (',', status_string) > 0 then 
      SUBSTR (status_string, 1, LOCATE (',', status_string) - 1)
    ELSE
      status_string
    END status,
    CASE WHEN LOCATE (',', userid_string) > 0 then 
      SUBSTR (userid_string, 1, LOCATE (',', userid_string) - 1)
    ELSE
      userid_string
    END userid,
    CASE WHEN LOCATE (',', date_string) > 0 then 
      SUBSTR (date_string, 1, LOCATE (',', date_string) - 1)
    ELSE
      date_string
    END date,
    CASE WHEN LOCATE (',', comment_string) > 0 then 
      SUBSTR (comment_string, 1, LOCATE (',', comment_string) - 1)
    ELSE
      comment_string
    END comment,

--  And build the new remainder strings    
    CASE WHEN LOCATE (',', status_string) > 0 THEN
      LTRIM (SUBSTR (status_string, LOCATE (',', status_string) + 1))
    ELSE
      NULL
    END status_string
    CASE WHEN LOCATE (',', userid_string) > 0 THEN
      LTRIM (SUBSTR (userid_string, LOCATE (',', userid_string) + 1))
    ELSE
      NULL
    END userid_string
    CASE WHEN LOCATE (',', date_string) > 0 THEN
      LTRIM (SUBSTR (date_string, LOCATE (',', date_string) + 1))
    ELSE
      NULL
    END date_string
    CASE WHEN LOCATE (',', comment_string) > 0 THEN
      LTRIM (SUBSTR (comment_string, LOCATE (',', comment_string) + 1))
    ELSE
      NULL
    END comment_string
  FROM pquery t0
  WHERE t0.status_string IS NOT NULL
     OR t0.userid_string IS NOT NULL
     OR t0.date_string IS NOT NULL
     OR t0.comment_string IS NOT NULL
)
SELECT * 
FROM pquery pq;

Open in new window

0
 

Author Comment

by:I_s
ID: 36718837
Thanks for the code, as it stands, I recieve the following errors.

I am using SQL Server 2005, is LOCATE not compatible?

Msg 195, Level 15, State 10, Line 10
'LOCATE' is not a recognized built-in function name.
Msg 195, Level 15, State 10, Line 59
'LOCATE' is not a recognized built-in function name.
Msg 128, Level 15, State 1, Line 86
The name "userid_string" is not permitted in this context. Valid expressions are constants, constant expressions, and (in some contexts) variables. Column names are not permitted.
Msg 128, Level 15, State 1, Line 91
The name "date_string" is not permitted in this context. Valid expressions are constants, constant expressions, and (in some contexts) variables. Column names are not permitted.
Msg 128, Level 15, State 1, Line 96
The name "comment_string" is not permitted in this context. Valid expressions are constants, constant expressions, and (in some contexts) variables. Column names are not permitted.
0
 

Author Comment

by:I_s
ID: 36718869
I just realized that the error is probably becuase I am writing the query in SSMS... and not DB2..
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 36718896

Yep.  That's a DB2 query.  To run it in SQL Server you'll have to replace the LOCATE function with CHARINDEX.


Kent
0
 

Author Comment

by:I_s
ID: 36815223
Works great Kent,

Thanks again for your help! I can definetly use recursive sql in the future!


Also, for anyone who may look at this post, I was able to do this successfully using C# as well with the following code:
/// <summary>
        /// getCXHistoryData is a method which makes a connection to the DB2 udb and returns a CXHistory Obj.
        /// </summary>
        /// <returns>cxhObjectsList is List of CXHistory Obj's</returns>
        public List<CXHISTORY> getCXHistoryData(){
            OleDbCommand oledbCMD;
            OleDbConnection oledbCONN;
            OleDbDataReader oledbReader;
            cxhObjectsList = new List<CXHISTORY>();
            errors = new List<string>();
            try
            {
                using (oledbCONN = new OleDbConnection(ApplicationConsts.DB2_CONNECTION_STRING))
                {
                    oledbCMD = new OleDbCommand(ApplicationConsts.HISTORY_QUERY, oledbCONN);
                    oledbCONN.Open();
                    using (oledbReader = oledbCMD.ExecuteReader())
                    {
                        while (oledbReader.Read())
                        {
                            int requestnum = ((int)oledbReader["REQUESTNUMBER"]);
                            string finstatus = oledbReader["FINALSTATUS"].ToString();
                            string dates = oledbReader["DATE"].ToString();
                            string userId = oledbReader["USERID"].ToString().ToUpper();
                            string status = oledbReader["STATUS"].ToString();
                            string comment = oledbReader["COMMENT"].ToString().Replace("@#@", ",");

                            string[] rowDates = dates.Split(',');
                            string[] rowUserId = userId.Split(',');
                            string[] rowStatus = status.Split(',');
                            string[] rowComment = comment.Split(',');

                            bool equal = rowDates.Length == rowUserId.Length && rowStatus.Length == rowComment.Length && rowDates.Length == rowStatus.Length;
                            if (!equal)
                            {
                                string error = string.Format("Error: REQUESTNUMBER: {0} Inconsistent number of entries found in row data.", requestnum);
                                errors.Add(error);
                                //throw new Exception(error);
                            }
                            else
                            {
                                for (int i = 0; i < rowDates.Length - 1; i++)
                                {
                                    CXHISTORY cxhObj = new CXHISTORY(requestnum, finstatus, rowDates[i], rowStatus[i], rowUserId[i], rowComment[i]);
                                    cxhObjectsList.Add(cxhObj);
                                    i++;
                                }
                            }
                        }
                    }
                }
            }
            catch(Exception e)
            {
                MessageBox.Show(e.ToString());
            }
            return cxhObjectsList;
        }

Open in new window

0

Featured Post

How To Install Bash on Windows 10

Windows’ budding partnership with Canonical has certainly led to some great improvements. One of them being the ability to use Bash on your Windows machine without third party applications! This might be one of the greatest things a cloud engineer in a Windows environment can do!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Many of us here at EE write code. Many of us write exceptional code; just as many of us write exception-prone code. As we all should know, exceptions are a mechanism for handling errors which are typically out of our control. From database errors, t…
Calculating holidays and working days is a function that is often needed yet it is not one found within the Framework. This article presents one approach to building a working-day calculator for use in .NET.
This is my first video review of Microsoft Bookings, I will be doing a part two with a bit more information, but wanted to get this out to you folks.
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
Suggested Courses

770 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question