Solved

C# regular expression to parse MS Open XML fragment

Posted on 2012-04-04
Last Modified: 2012-06-27
I am looking for an accurate way to parse the following strings. I assume a regex or maybe an XML parser is the way to go. This is for a C# application.

examples:

"w:name w:val=\"CALC_OFFICEADDRFULL\"
"w:enabled w:val=\"true\"
"w:calcOnExit w:val=\"false\"
"w:type w:val=\"regular\"

expected result (key value or something)

name,CALC_OFFICEADDRFULL
enabled,true
calcOnExit,false
type,regular

Much appreciated,

-Markus
Question by:markusr13
 
LVL 23

Expert Comment

by:wdosanjos
ID: 37805958
Please try the following:
var rxKey = new Regex(@"(?<=w:)\w+(?=\s)");
var rxValue = new Regex("(?<=w:val=\\\\\").+(?=\\\\\")");
var tests = new string[]
{
	"\"w:name w:val=\\\"CALC_OFFICEADDRFULL\\\"",
	"\"w:enabled w:val=\\\"true\\\"",
	"\"w:calcOnExit w:val=\\\"false\\\"",
	"\"w:type w:val=\\\"regular\\\""
};

foreach (var test in tests)
{
	Console.WriteLine("{0},{1}", rxKey.Match(test).Value, rxValue.Match(test).Value);
}


Output:
name,CALC_OFFICEADDRFULL
enabled,true
calcOnExit,false
type,regular

 
LVL 23

Expert Comment

by:wdosanjos
ID: 37806007
Here is another (faster) option with simple substrings:
var tests = new string[]
{
	"\"w:name w:val=\\\"CALC_OFFICEADDRFULL\\\"",
	"\"w:enabled w:val=\\\"true\\\"",
	"\"w:calcOnExit w:val=\\\"false\\\"",
	"\"w:type w:val=\\\"regular\\\""
};

foreach (var test in tests)
{
	var key = test.Substring(3, test.IndexOf(" ") - 3);

	int i = test.IndexOf("\\\"") + 2;
	var value = test.Substring(i, test.Length - i - 2);
	
	Console.WriteLine("{0},{1}", key, value);
}

 

Author Comment

by:markusr13
ID: 37807754
Sorry,

The debugger threw in the \'s (and I had an extra quote).

try

w:name w:val="CALC_OFFICEADDRFULL"
w:enabled w:val="true"
w:calcOnExit w:val="false"
w:type w:val="regular"

-Markus
 
LVL 23

Accepted Solution

by:wdosanjos (earned 500 total points)
ID: 37807787
Not a problem.  There you go.

Option 1: (Regex)
var rxKey = new Regex(@"(?<=w:)\w+(?=\s)");
var rxValue = new Regex("(?<=w:val=\").+(?=\")");

var tests = new string[]
{
    "w:name w:val=\"CALC_OFFICEADDRFULL\"",
    "w:enabled w:val=\"true\"",
    "w:calcOnExit w:val=\"false\"",
    "w:type w:val=\"regular\""
};

foreach (var test in tests)
{
    Console.WriteLine("{0},{1}", rxKey.Match(test).Value, rxValue.Match(test).Value);
}
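
A possible variation on Option 1, offered as a sketch rather than a tested drop-in: both parts can be captured by one pattern with named groups. The group names key and value are made up for illustration, and the pattern assumes every input line has exactly the shape shown in the corrected examples (a w:-prefixed name, whitespace, then w:val="..."). Using [^"]* instead of .+ stops the value at the first closing quote, and checking Success avoids silently printing empty strings for lines that do not match.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Single pattern: "key" is the local name after the w: prefix,
        // "value" is everything between the quotes of w:val.
        var rx = new Regex(@"^w:(?<key>\w+)\s+w:val=""(?<value>[^""]*)""$");

        var tests = new[]
        {
            "w:name w:val=\"CALC_OFFICEADDRFULL\"",
            "w:enabled w:val=\"true\"",
            "w:calcOnExit w:val=\"false\"",
            "w:type w:val=\"regular\""
        };

        foreach (var test in tests)
        {
            Match m = rx.Match(test);
            if (m.Success)
                Console.WriteLine("{0},{1}", m.Groups["key"].Value, m.Groups["value"].Value);
            else
                Console.WriteLine("no match: {0}", test);
        }
    }
}
```
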



Option 2: (Substring)
var tests = new string[]
{
    "w:name w:val=\"CALC_OFFICEADDRFULL\"",
    "w:enabled w:val=\"true\"",
    "w:calcOnExit w:val=\"false\"",
    "w:type w:val=\"regular\""
};

foreach (var test in tests)
{
	var key = test.Substring(2, test.IndexOf(" ") - 2);

	int i = test.IndexOf("\"") + 1;
	var value = test.Substring(i, test.Length - i - 1);
	
	Console.WriteLine("{0},{1}", key, value);
}
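
Since the question also mentions an XML parser, here is a rough sketch of that route using LINQ to XML. It rests on assumptions that are not in the thread: that the fragments come from complete elements such as <w:name w:val="..."/>, and that they can be parsed inside a parent element that declares the w prefix. The w:ffData wrapper and the sample XML below are illustrative only; the namespace URI is the standard WordprocessingML main namespace.

```csharp
using System;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Standard WordprocessingML namespace bound to the "w" prefix.
        XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

        // Assumed input: whole elements inside a parent that declares xmlns:w.
        string xml =
            "<w:ffData xmlns:w=\"" + w.NamespaceName + "\">" +
            "<w:name w:val=\"CALC_OFFICEADDRFULL\"/>" +
            "<w:enabled w:val=\"true\"/>" +
            "<w:calcOnExit w:val=\"false\"/>" +
            "</w:ffData>";

        foreach (XElement el in XElement.Parse(xml).Elements())
        {
            // Attributes are namespace-qualified too, so look up w:val by full name.
            XAttribute val = el.Attribute(w + "val");
            if (val != null)
                Console.WriteLine("{0},{1}", el.Name.LocalName, val.Value);
        }
    }
}
```

This only applies when the surrounding markup is available. For the bare strings in the question, which are not well-formed XML on their own, the regex or substring options are the practical choice.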

 

Author Comment

by:markusr13
ID: 37807789
Points increased due to my data error.