Solved

Split string on commas but not when enclosed in parentheses

Posted on 2016-11-28
Last Modified: 2016-11-29
Given the following input:
	[COMPANY] [VARCHAR](64) NULL,
	[BANKCODE] [CHAR](4) NULL,
	[OPENDATE] [DATETIME] NULL,
	[REFERENCENUMBER] [NUMERIC](13, 0) NULL,
	[VENDORNAME] [CHAR](45) NULL,
	[AMOUNT] [NUMERIC](13, 2) NULL,
	[EMAILADDRESS] [VARCHAR](256) NULL


A basic split would be like this:
string[] columnDefinitions = inputString.Split(new char[] {','});
foreach (string s in columnDefinitions)
{
    Console.WriteLine(s.Trim());
}


But that splits on every comma and produces the following output (each line represents a single item in the array):
[COMPANY] [VARCHAR](64) NULL
[BANKCODE] [CHAR](4) NULL
[OPENDATE] [DATETIME] NULL
[REFERENCENUMBER] [NUMERIC](13
0) NULL
[VENDORNAME] [CHAR](45) NULL
[AMOUNT] [NUMERIC](13
2) NULL
[EMAILADDRESS] [VARCHAR](256) NULL


What I really want is for it to only split on commas that are not inside parentheses. I'm happy to use Regex for this. I'm sure there's a fairly easy pattern for it. The desired output would be a string array containing this:
[COMPANY] [VARCHAR](64) NULL
[BANKCODE] [CHAR](4) NULL
[OPENDATE] [DATETIME] NULL
[REFERENCENUMBER] [NUMERIC](13, 0) NULL
[VENDORNAME] [CHAR](45) NULL
[AMOUNT] [NUMERIC](13, 2) NULL
[EMAILADDRESS] [VARCHAR](256) NULL


Question by:Russ Suter

Expert Comment

by:Dan Craciun
ID: 41904879
I'm sure I'm missing something.

Why don't you just split on \n (the newline character)?

HTH,
Dan

Author Comment

by:Russ Suter
ID: 41904921
Indeed you are missing something. SQL works perfectly well if all statements are on a single line. The following three statements are identical as far as SQL is concerned:
CREATE TABLE FOO(
	[COMPANY] [VARCHAR](64) NULL,
	[BANKCODE] [CHAR](4) NULL
)


CREATE TABLE FOO(	[COMPANY] [VARCHAR](64) NULL, [BANKCODE] [CHAR](4) NULL)


CREATE TABLE FOO(	[COMPANY] [VARCHAR](64) NULL,[BANKCODE] [CHAR](4) NULL)


Note that in the third example there need not even be whitespace between the column definitions. I have no way of guaranteeing that the input will be formatted in any specific way, so the split needs to strictly adhere to SQL parsing rules.

Expert Comment

by:Dan Craciun
ID: 41904930
Try this:
(\[.*?\)\s?(?:NULL)?),
The results should be in group 1.

Author Comment

by:Russ Suter
ID: 41904954
There's no guarantee that the word NULL will be there either.

Expert Comment

by:Gustav Brock
ID: 41905339
You could just scan the SQL one character at a time, skipping the commas that sit inside parentheses:
string sql = "[COMPANY] [VARCHAR](64) NULL,[BANKCODE] [CHAR](4) NULL,[OPENDATE] [DATETIME] NULL, 	[REFERENCENUMBER] [NUMERIC] (13, 0) NULL, 	[VENDORNAME] [CHAR] (45) NULL, 	[AMOUNT] [NUMERIC] (13, 2) NULL, 	[EMAILADDRESS] [VARCHAR] (256) NULL";

System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
byte[] bytes = encoding.GetBytes(sql);
int part = 0;
List<string> sqlParts = new List<string>();
bool skipComma = false;

foreach (byte b in bytes)
{
	// Start a new, empty part the first time we reach a new index.
	if (sqlParts.Count != part + 1)
	{
		sqlParts.Add(string.Empty);
	}

	// Skip splitting by comma while we are inside a set of parentheses.
	if (b == 40)        // '('
	{
		skipComma = true;
	}
	else if (b == 41)   // ')'
	{
		skipComma = false;
	}

	// Drop a top-level comma (44) and start a new part,
	// or append the character to the current part.
	if (!skipComma && b == 44)
	{
		part++;
	}
	else
	{
		sqlParts[part] = (sqlParts[part] + Convert.ToChar(b)).TrimStart();
	}
}

// List the parts. Dump() is a LINQPad extension method;
// use Console.WriteLine(sqlParts[i]) outside LINQPad.
for (int i = 0; i < sqlParts.Count; i++)
{
	sqlParts[i].Dump();
}


This will produce:
[COMPANY] [VARCHAR](64) NULL
[BANKCODE] [CHAR](4) NULL
[OPENDATE] [DATETIME] NULL
[REFERENCENUMBER] [NUMERIC] (13, 0) NULL
[VENDORNAME] [CHAR] (45) NULL
[AMOUNT] [NUMERIC] (13, 2) NULL
[EMAILADDRESS] [VARCHAR] (256) NULL


/gustav

Accepted Solution

by:
Fernando Soto earned 500 total points
ID: 41905828
Hi Russ;

This code snippet should do what you need.
// Input string
var input = "[COMPANY] [VARCHAR](64) NULL, [BANKCODE] [CHAR](4) NULL, [OPENDATE] [DATETIME] NULL, [REFERENCENUMBER] [NUMERIC](13, 0) NULL, [VENDORNAME] [CHAR](45) NULL, [AMOUNT] [NUMERIC](13, 2) NULL, [EMAILADDRESS] [VARCHAR](256) NULL";
// Build new string
StringBuilder sb = new StringBuilder();
// Used to bypass the , inside of ( ... )
bool bypass = false;

foreach (var c in input)
{
	// Toggle bypass on '(' and ')'. This assumes parentheses are never nested.
	if (c == '(' || c == ')')
	{
		bypass = !bypass;
		sb.Append(c);
		continue;
	}

	if (c == ',' && !bypass)
		// Top-level comma: replace it with a space.
		sb.Append(' ');
	else
		// Any other character, including a comma inside ( ... ): keep it.
		sb.Append(c);
}

// Display the new string
Console.WriteLine(sb.ToString());


Author Closing Comment

by:Russ Suter
ID: 41905921
I went with a modified version of Fernando's approach. It looks like this:

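private List<string> ExtractIndividualColumnDefinitions(string columnDefinitions)
{
    int parenLevel = 0;
    List<string> resultSet = new List<string>();
    string currentColumnDefinition = string.Empty;
    // Note: the loop skips the first and last characters. Presumably the input
    // still carries the outer parentheses of CREATE TABLE ( ... ); adjust the
    // bounds if yours does not.
    for (int i = 1; i < columnDefinitions.Length - 1; ++i)
    {
        if (columnDefinitions[i] == ',' && parenLevel == 0)
        {
            // Top-level comma: the current definition is complete.
            resultSet.Add(currentColumnDefinition.Trim());
            currentColumnDefinition = string.Empty;
        }
        else
        {
            // Track nesting depth so commas inside ( ... ) are kept.
            if (columnDefinitions[i] == '(')
            {
                ++parenLevel;
            }
            else if (columnDefinitions[i] == ')')
            {
                --parenLevel;
            }
            currentColumnDefinition += columnDefinitions[i];
        }
    }
    // Add the final definition, which has no trailing comma.
    if (!string.IsNullOrEmpty(currentColumnDefinition))
    {
        resultSet.Add(currentColumnDefinition.Trim());
    }
    return resultSet;
}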

I'm still fairly sure there's a workable Regex solution to this problem, but the pattern is probably beyond my current Regex knowledge and I just need to move on.
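
For readers who do want a Regex, here is a minimal sketch rather than a definitive answer. It splits on any comma that is not followed by a closing parenthesis before another parenthesis appears, using a negative lookahead. The class name ColumnSplitter and its Split method are hypothetical names for illustration and do not appear anywhere in the thread. The pattern assumes parentheses are never nested, as in the sample input; for something like (a, f(b)) it would split incorrectly, and a depth counter like the one above is the safer choice.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the original thread.
public static class ColumnSplitter
{
    // A comma, unless the text after it reaches a ')' without first
    // passing a '(' or another ')'. Assumes parentheses are not nested.
    private static readonly Regex TopLevelComma = new Regex(@",(?![^()]*\))");

    public static string[] Split(string columnDefinitions)
    {
        return TopLevelComma.Split(columnDefinitions)
            .Select(part => part.Trim())        // drop surrounding whitespace
            .Where(part => part.Length > 0)     // ignore empty pieces
            .ToArray();
    }

    public static void Main()
    {
        string input = "[COMPANY] [VARCHAR](64) NULL,[REFERENCENUMBER] [NUMERIC](13, 0) NULL, [OPENDATE] [DATETIME] NULL";
        foreach (string definition in Split(input))
        {
            Console.WriteLine(definition);
        }
    }
}
```

Because the lookahead only inspects what follows each comma, it does not depend on the word NULL being present or on any particular whitespace between definitions.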