Can we do this regex in Notepad++?

My XML file contains closing </Group> tags that have no opening <Group> tag. I need to remove every closing tag that lacks an opening tag, without removing anything in between. I'm trying to work out the correct regex for this in Notepad++.

<Car Name="HONA">
  <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
  <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
</Group>
<Group GID="15">
  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
  <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
  <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
</Group>

bman2011 Asked:

NVIT Commented:
Does this help?
(<.+>)\r?\n(.+)\r?\n(.+)\r?\n(.+)\r?\n(</Group>)

NVIT Commented:
...or
Find:
(^<.+>\r?\n)(.+\r?\n)(.+\r?\n)(.+\r?\n)(</Group>)

Replace:
<Beg>\r\n\1\2\3\4\5\r\n<End>

NVIT Commented:
Please clarify. What should the resultant example look like?
bman2011 (Author) Commented:
The result would be the same as the listing above, but without the </Group> that has no opening tag. It would look like this:

<Car Name="HONA">
  <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
  <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
<Group GID="15">
  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
  <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
  <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
</Group>

bman2011 (Author) Commented:
I haven't tried your examples, but will try soon once I get home.
NVIT Commented:
In that case...
Find:
(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>\r?\n)(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>)

Replace:
\1\3\4\r\n

NVIT Commented:
Did you check the Regular Expression option?
Which version of Notepad++ are you using?
See mine attached.
greenshot-2014-10-17-13-41-41.png
bman2011 (Author) Commented:
Hmmm... is this assuming that there are only 3 ParName lines every time? That is not always the case: the count could be 3, 20, or 35, with no fixed number of ParName lines. Also, would it be thrown off by empty lines between the lines? It does not seem to work.

Notepad++ 6.5.5
NVIT Commented:
Yes. 3 lines.
Can you post a bigger sample?
bman2011 (Author) Commented:
Here is a sample as close to the original as possible. I'm working with an XML file that has literally 20k lines of these. The first Device PartID is the one I need to correct: it has no starting Group tag, so its </Group> must be removed. The second Device PartID is correct because it contains both tags. This repeats 20k times, each time with different values for each ParName, and with more or fewer ParName lines.

    <Device PartID="29000153" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>
<Device PartID="29000000" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
       <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>

Basically, a way to find any closing tag that has no matching start tag before it would also solve my problem.
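For a file where every tag sits on its own line, as in the sample above, that idea can be sketched outside Notepad++ with a simple depth counter. This is only a sketch in Python, not something from the thread: the function name drop_orphan_group_closers is made up for illustration, and it assumes a Group tag is never split across several lines.

```python
import re

def drop_orphan_group_closers(text: str) -> str:
    """Remove each </Group> line that has no unclosed <Group ...> before it."""
    depth = 0  # number of <Group> elements currently open
    kept = []
    for line in text.splitlines(keepends=True):
        tag = line.strip()
        if re.match(r"<Group\b", tag) and not tag.endswith("/>"):
            depth += 1  # a real opening tag (self-closing ones are ignored)
        elif tag == "</Group>":
            if depth == 0:
                continue  # orphan closer: drop the whole line
            depth -= 1
        kept.append(line)
    return "".join(kept)

# Shortened version of the XML from the question (assumed for illustration).
sample = (
    '<Car Name="HONA">\n'
    '  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    '</Group>\n'
    '<Group GID="15">\n'
    '  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />\n'
    '</Group>\n'
)
print(drop_orphan_group_closers(sample))
```

Because it counts open tags instead of matching a fixed shape, the number of Pmeter lines and any empty lines in between should not matter.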
ozo Commented:
Find:
(^|</Group>)(((?!<Group)[\s\S])*?)</Group>
Replace:
\1\2
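
One thing worth checking with this pattern is how ^ is anchored. The Python sketch below is not part of the original answer, and the sample text is a shortened version of the question's XML. It suggests the result depends on whether ^ matches only at the start of the text or at every line start. As far as I know, Notepad++ treats ^ as a line start, which corresponds to re.MULTILINE here.

```python
import re

# Same pattern as in the Find box above.
OZO = r"(^|</Group>)(((?!<Group)[\s\S])*?)</Group>"

# Shortened sample (assumed): one orphan </Group>, then one valid Group.
sample = (
    '<Car Name="HONA">\n'
    '  <Pmeter ParName="DMTZ" Value="" />\n'
    '</Group>\n'
    '<Group GID="15">\n'
    '  <Pmeter ParName="#PT" Value="NONE" />\n'
    '</Group>\n'
)

# ^ matches only at the very start of the text.
anchored_once = re.sub(OZO, r"\1\2", sample)

# ^ matches at the start of every line.
per_line = re.sub(OZO, r"\1\2", sample, flags=re.MULTILINE)

print(anchored_once.count("</Group>"))  # closers left when ^ is start of text
print(per_line.count("</Group>"))       # closers left when ^ is start of line
```

With ^ limited to the start of the text, only the orphan closer is removed in this sample. With ^ matching at every line start, a match can begin on the Pmeter line inside a valid Group, so the valid closer is removed as well.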
bman2011 (Author) Commented:
ozo, your regex works; however, it also matches cases where the start Group tag is already there.

How about this: if anything matches

Color="">
          <Pmeter 

then remove the </Group> that comes next.
ozo Commented:
Do you have an example where it matches even though the start Group tag is already there?
NVIT Commented:
bman2011, per your example, should it be the other way, i.e. for the first Device, add a <Group GID="3">? Then, it will look like the 2nd Device?
Just checking...
    <Device PartID="29000153" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
        <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>
<Device PartID="29000000" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
       <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>

bman2011 (Author) Commented:
Nope, the problem is that the XML file was generated incorrectly. I just need to remove every </Group> that has no starting Group tag, which will allow the XML file to be parsed correctly. Let me know if this makes it any clearer.
NVIT Commented:
Try this.

1. Turn off "Wrap around"
2. Move cursor to top of file
3. Find:
(^.+<Car.+\r\n)(^.+<Pmeter.+$)(((?!/Group)[\s\S])*?)(^.+</Group>\r\n)
4. Replace:
\1\2\3
5. Replace All
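
The same find/replace can be tried in Python's re module as a rough cross-check before running it on the full file. This is a sketch, not part of the answer above: the sample is a shortened, hypothetical version of the posted XML, joined with \r\n because the pattern expects Windows line endings, and re.MULTILINE is used so that ^ and $ work per line, as I understand they do in Notepad++.

```python
import re

# Same pattern as in step 3. It needs CRLF line endings and at least one
# character of indentation before <Car, <Pmeter and </Group>.
NVIT = r"(^.+<Car.+\r\n)(^.+<Pmeter.+$)(((?!/Group)[\s\S])*?)(^.+</Group>\r\n)"

lines = [
    '    <Device PartID="29000153" MODNM="M-880">',
    '      <Car Name="HONA" Color="">',
    '          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />',
    '          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />',
    '        </Group>',            # orphan closer: no <Group> after <Car>
    '        <Group GID="15">',
    '          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />',
    '        </Group>',
    '      </Car>',
    '    </Device>',
    '    <Device PartID="29000000" MODNM="M-880">',
    '      <Car Name="HONA" Color="">',
    '        <Group GID="3">',
    '          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />',
    '        </Group>',
    '      </Car>',
    '    </Device>',
]
sample = "\r\n".join(lines) + "\r\n"

# Keep groups 1-3 and drop group 5, the first </Group> line after <Car>.
fixed = re.sub(NVIT, r"\1\2\3", sample, flags=re.MULTILINE)
print(fixed)
```

The pattern only fires when a Pmeter line directly follows the Car line, so a Device whose Car is followed by an opening Group tag is left alone in this sample.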
bman2011 (Author) Commented:
This was the closest to getting my issue resolved. I have much to learn with regular expressions and will be noting this down. Thanks.