Solved

C# regular expression to parse MS Open XML fragment

Posted on 2012-04-04
Last Modified: 2012-06-27
I am looking for an accurate way to parse the following fragments in a C# application. I assume this calls for a regex, or maybe an XML parser.

Examples:

"w:name w:val=\"CALC_OFFICEADDRFULL\"
"w:enabled w:val=\"true\"
"w:calcOnExit w:val=\"false\"
"w:type w:val=\"regular\"

Expected result (key/value pairs or something similar):

name,CALC_OFFICEADDRFULL
enabled,true
calcOnExit,false
type,regular

Much appreciated,

-Markus
Question by:markusr13

Expert Comment

by:wdosanjos
ID: 37805958
Please try the following:
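// Requires: using System; using System.Text.RegularExpressions;
// Key: the word after the first "w:" that is followed by whitespace.
var rxKey = new Regex(@"(?<=w:)\w+(?=\s)");
// Value: everything between w:val=\" and the last \" (literal backslash plus quote).
var rxValue = new Regex("(?<=w:val=\\\\\").+(?=\\\\\")");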
var tests = new string[]
{
	"\"w:name w:val=\\\"CALC_OFFICEADDRFULL\\\"",
	"\"w:enabled w:val=\\\"true\\\"",
	"\"w:calcOnExit w:val=\\\"false\\\"",
	"\"w:type w:val=\\\"regular\\\""
};

foreach (var test in tests)
{
	Console.WriteLine("{0},{1}", rxKey.Match(test).Value, rxValue.Match(test).Value);
}

Output:
name,CALC_OFFICEADDRFULL
enabled,true
calcOnExit,false
type,regular


Expert Comment

by:wdosanjos
ID: 37806007
Here is another (faster) option with simple substrings:
var tests = new string[]
{
	"\"w:name w:val=\\\"CALC_OFFICEADDRFULL\\\"",
	"\"w:enabled w:val=\\\"true\\\"",
	"\"w:calcOnExit w:val=\\\"false\\\"",
	"\"w:type w:val=\\\"regular\\\""
};

foreach (var test in tests)
{
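	// Key: skip the leading "w: (3 characters) and read up to the first space.
	var key = test.Substring(3, test.IndexOf(" ") - 3);

	// Value: starts after the first \" pair and ends before the trailing \" pair.
	int i = test.IndexOf("\\\"") + 2;
	var value = test.Substring(i, test.Length - i - 2);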
	
	Console.WriteLine("{0},{1}", key, value);
}


Author Comment

by:markusr13
ID: 37807754
Sorry,

The debugger threw in the \'s (and I had an extra quote).

Try:

w:name w:val="CALC_OFFICEADDRFULL"
w:enabled w:val="true"
w:calcOnExit w:val="false"
w:type w:val="regular"

-Markus

Accepted Solution

by:wdosanjos (earned 500 total points)
ID: 37807787
Not a problem.  There you go.

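Option 1: (Regex)
// Requires: using System; using System.Text.RegularExpressions;
// Key: the word after the first "w:" that is followed by whitespace.
var rxKey = new Regex(@"(?<=w:)\w+(?=\s)");
// Value: everything between w:val=" and the last double quote on the line.
var rxValue = new Regex("(?<=w:val=\").+(?=\")");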

var tests = new string[]
{
"w:name w:val=\"CALC_OFFICEADDRFULL\"",
"w:enabled w:val=\"true\"",
"w:calcOnExit w:val=\"false\"",
"w:type w:val=\"regular\""
};

foreach (var test in tests)
{
    Console.WriteLine("{0},{1}", rxKey.Match(test).Value, rxValue.Match(test).Value);
}
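
As a variation on Option 1, both parts can probably be captured in one pass with named groups. This is only a sketch, not part of the accepted answer: the class name FragmentParser and the method Parse are made up for illustration, and it assumes each line holds exactly one w:val attribute. Using [^"]* instead of .+ stops the value at the first closing quote, which should matter only if a line ever carries more than one quoted attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the thread: parses one fragment per call.
static class FragmentParser
{
    // key   = the word after "w:"
    // value = the text inside the quotes of w:val, up to the first closing quote
    private static readonly Regex Pair =
        new Regex(@"w:(?<key>\w+)\s+w:val=""(?<value>[^""]*)""");

    // Returns null when the line does not look like: w:name w:val="..."
    public static KeyValuePair<string, string>? Parse(string line)
    {
        Match m = Pair.Match(line);
        if (!m.Success)
        {
            return null;
        }
        return new KeyValuePair<string, string>(
            m.Groups["key"].Value, m.Groups["value"].Value);
    }
}

static class Program
{
    static void Main()
    {
        var tests = new string[]
        {
            "w:name w:val=\"CALC_OFFICEADDRFULL\"",
            "w:enabled w:val=\"true\"",
            "w:calcOnExit w:val=\"false\"",
            "w:type w:val=\"regular\""
        };

        foreach (var test in tests)
        {
            var pair = FragmentParser.Parse(test);
            if (pair.HasValue)
            {
                Console.WriteLine("{0},{1}", pair.Value.Key, pair.Value.Value);
            }
        }
    }
}
```

For the four sample lines this prints the same key,value pairs as the expected result in the question. Lines that do not match are skipped rather than printed as empty strings.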


Option 2: (Substring)
var tests = new string[]
{
"w:name w:val=\"CALC_OFFICEADDRFULL\"",
"w:enabled w:val=\"true\"",
"w:calcOnExit w:val=\"false\"",
"w:type w:val=\"regular\""
};

foreach (var test in tests)
{
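	// Key: skip the leading "w:" (2 characters) and read up to the first space.
	var key = test.Substring(2, test.IndexOf(" ") - 2);

	// Value: starts just after the first double quote and ends before the final one.
	int i = test.IndexOf("\"") + 1;
	var value = test.Substring(i, test.Length - i - 1);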
	
	Console.WriteLine("{0},{1}", key, value);
}
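
The question also mentions an XML parser. If the complete, well-formed XML is available rather than loose fragments, something like the sketch below might be more robust than string matching. Several things here are assumptions, since the thread shows only fragments: the w:ffData wrapper element, the w:textInput parent of w:type, and the sample document itself are guesses at what the source looks like, and FormFieldReader and ReadValues are hypothetical names. The namespace URI is the standard WordprocessingML one.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper, not from the thread.
static class FormFieldReader
{
    // Standard WordprocessingML namespace, normally bound to the "w" prefix.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Collects (element local name, w:val value) for every descendant
    // element that carries a w:val attribute, in document order.
    public static List<KeyValuePair<string, string>> ReadValues(string xml)
    {
        XElement root = XElement.Parse(xml);
        var result = new List<KeyValuePair<string, string>>();

        foreach (XElement el in root.Descendants())
        {
            XAttribute val = el.Attribute(W + "val");
            if (val != null)
            {
                result.Add(new KeyValuePair<string, string>(
                    el.Name.LocalName, val.Value));
            }
        }
        return result;
    }
}

static class Program
{
    static void Main()
    {
        // Assumed sample input; the real document structure may differ.
        string xml =
            "<w:ffData xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">" +
            "<w:name w:val=\"CALC_OFFICEADDRFULL\"/>" +
            "<w:enabled w:val=\"true\"/>" +
            "<w:calcOnExit w:val=\"false\"/>" +
            "<w:textInput><w:type w:val=\"regular\"/></w:textInput>" +
            "</w:ffData>";

        foreach (var pair in FormFieldReader.ReadValues(xml))
        {
            Console.WriteLine("{0},{1}", pair.Key, pair.Value);
        }
    }
}
```

The parser needs the xmlns:w declaration to resolve the w: prefix, so this approach does not work on the bare fragments from the question; for those, the regex or substring options remain the practical choice.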


Author Comment

by:markusr13
ID: 37807789
Points increased due to my data error.