Solved

Can we do this regex in Notepad++?

Posted on 2014-10-17
Last Modified: 2014-10-21
My XML file contains </Group> end tags that have no matching <Group> start tag. I need to remove every such orphaned end tag while leaving everything in between untouched. I'm trying to work out the correct regex for this in Notepad++.

<Car Name="HONA">
  <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
  <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
</Group>
<Group GID="15">
  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
  <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
  <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
</Group>

Question by:bman2011
17 Comments
 
LVL 23

Expert Comment

by:NVIT
ID: 40387276
Does this help?
(<.+>)\r?\n(.+)\r?\n(.+)\r?\n(.+)\r?\n(</Group>)

 
LVL 23

Expert Comment

by:NVIT
ID: 40387291
...or
Find:
(^<.+>\r?\n)(.+\r?\n)(.+\r?\n)(.+\r?\n)(</Group>)

Replace:
<Beg>\r\n\1\2\3\4\5\r\n<End>

 
LVL 23

Expert Comment

by:NVIT
ID: 40387639
Please clarify. What should the resulting XML look like?
 

Author Comment

by:bman2011
ID: 40387684
The result would be the same as listed, but without the </Group> that has no start tag. It would look like this:

<Car Name="HONA">
  <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
  <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
<Group GID="15">
  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
  <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
  <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
</Group>

 

Author Comment

by:bman2011
ID: 40387686
I haven't tried your examples, but will try soon once I get home.
 
LVL 23

Expert Comment

by:NVIT
ID: 40387697
In that case...
Find:
(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>\r?\n)(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>)

Replace:
\1\3\4\r\n

 
LVL 23

Expert Comment

by:NVIT
ID: 40387707
Did you check the Regular Expression option?
Which version of Notepad++ are you using?
See mine attached.
greenshot-2014-10-17-13-41-41.png
 

Author Comment

by:bman2011
ID: 40387710
Hmm, does this assume that there are always exactly 3 ParName lines? That is not always the case: the count can be 3, 20, or 35; there is no fixed number of ParName lines. Also, would it be thrown off if there were empty lines between the lines? It does not seem to work.

Notepad++ 6.5.5
 
LVL 23

Expert Comment

by:NVIT
ID: 40387719
Yes. 3 lines.
Can you post a bigger sample?
 

Author Comment

by:bman2011
ID: 40387744
Here is a sample as close to the original as possible. I'm working with an XML file that has literally 20k lines of these. The first Device (PartID 29000153) is the one I need to correct: it has no starting <Group> tag, so its first </Group> must be removed. The second Device (PartID 29000000) is correct because it contains both tags. This repeats 20k times, with different values for each ParName and a varying number of ParName lines.

    <Device PartID="29000153" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>
<Device PartID="29000000" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
       <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>


Basically, if there were a way to find any end tag that has no matching start tag before it, that would also solve my problem.
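
Outside Notepad++, a small script can do this directly. The following is a minimal Python sketch, not part of the original thread: it walks the text line by line, counts the <Group> elements that are currently open, and drops any </Group> line seen while that count is zero. The function name `drop_orphan_group_ends` is hypothetical, and the sketch assumes that each Group tag sits on its own line and that no <Group ... /> is self-closing, as in the sample above.

```python
import re

# Start and end tags for the Group element (names taken from the sample XML).
OPEN_TAG = re.compile(r"<Group\b[^>]*>")
CLOSE_TAG = re.compile(r"</Group\s*>")


def drop_orphan_group_ends(text: str) -> str:
    """Remove every line holding a </Group> that has no open <Group> before it."""
    depth = 0  # number of <Group> elements currently open
    kept = []
    for line in text.splitlines(keepends=True):
        if CLOSE_TAG.search(line):
            if depth == 0:
                continue  # orphan end tag: skip the whole line
            depth -= 1
        elif OPEN_TAG.search(line):
            depth += 1
        kept.append(line)
    return "".join(kept)
```

When reading and writing the real file, opening it with `newline=""` keeps the original `\r\n` line endings untouched.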
 
LVL 84

Expert Comment

by:ozo
ID: 40387771
Find:
(^|</Group>)(((?!<Group)[\s\S])*?)</Group>
Replace:
\1\2
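
The reply that follows reports that this pattern also hits groups whose start tag is present. A possible reason, offered here as an assumption rather than a confirmed diagnosis: in Notepad++ `^` matches at the start of every line, so a match can begin on the line just after a `<Group ...>` tag and run to that group's own `</Group>`. The Python sketch below reproduces the effect by running the same pattern with and without `re.MULTILINE`; the names `OZO`, `SAMPLE`, `line_anchored`, and `file_anchored` are hypothetical.

```python
import re

# The pattern from the comment above, unchanged.
OZO = r"(^|</Group>)(((?!<Group)[\s\S])*?)</Group>"

SAMPLE = (
    '<Car Name="HONA">\n'
    '  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    '</Group>\n'                      # orphan end tag
    '<Group GID="15">\n'
    '  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />\n'
    '</Group>\n'                      # valid end tag
    '</Car>\n'
)

# ^ matches at every line start (the Notepad++ behaviour assumed here):
# the second match starts on the Pmeter line inside the valid group.
line_anchored = re.sub(OZO, r"\1\2", SAMPLE, flags=re.MULTILINE)

# ^ matches only at the start of the text: a match must begin at the top
# of the file or at a previous </Group>, so the valid group is left alone.
file_anchored = re.sub(OZO, r"\1\2", SAMPLE)

print(line_anchored.count("</Group>"), file_anchored.count("</Group>"))
```

On this sample the line-anchored run strips both end tags, while the text-anchored run strips only the orphan.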
 

Author Comment

by:bman2011
ID: 40387841
Ozo, your regex works, but it also matches cases where the start <Group> tag is already present.

How about this: wherever the following pattern matches,

Color="">
          <Pmeter 

then remove the </Group> that comes next.
 
LVL 84

Expert Comment

by:ozo
ID: 40387858
Do you have an example where it matches even though the start <Group> tag is already there?
 
LVL 23

Expert Comment

by:NVIT
ID: 40387914
bman2011, per your example, should it be the other way around, i.e. for the first Device, add a <Group GID="3"> start tag so that it looks like the second Device?
Just checking...
    <Device PartID="29000153" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
        <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>
<Device PartID="29000000" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
       <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>

 

Author Comment

by:bman2011
ID: 40389294
Nope. The problem is that the XML file was generated incorrectly. I just need to remove all instances of </Group> that lack a starting <Group> tag, which will allow the XML file to be parsed correctly. Let me know if this makes it any clearer.
 
LVL 23

Accepted Solution

by:
NVIT earned 500 total points
ID: 40389648
Try this.

1. Turn off "Wrap around"
2. Move cursor to top of file
3. Find:
(^.+<Car.+\r\n)(^.+<Pmeter.+$)(((?!/Group)[\s\S])*?)(^.+</Group>\r\n)
4. Replace:
\1\2\3
5. Replace All
 

Author Closing Comment

by:bman2011
ID: 40395553
This came closest to resolving my issue. I have much to learn about regular expressions and will be noting this down. Thanks.