  • Status: Solved
  • Priority: Medium
  • Security: Public

Can we do this regex in Notepad++?

My XML code contains stray </Group> end tags that have no matching <Group> start tag. I need to remove every such end tag while leaving everything in between untouched. I'm trying to figure out the correct regex for this in Notepad++.

<Car Name="HONA">
  <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
  <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
</Group>
<Group GID="15">
  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
  <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
  <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
</Group>

Asked by: bman2011

NVIT commented:
Does this help?
(<.+>)\r?\n(.+)\r?\n(.+)\r?\n(.+)\r?\n(</Group>)

NVIT commented:
...or
Find:
(^<.+>\r?\n)(.+\r?\n)(.+\r?\n)(.+\r?\n)(</Group>)

Replace:
<Beg>\r\n\1\2\3\4\5\r\n<End>

NVIT commented:
Please clarify. What should the resultant example look like?

bman2011 (author) commented:
The result would be the same as listed, but without the </Group> that has no opening tag. It would look like this:

<Car Name="HONA">
  <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
  <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
<Group GID="15">
  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
  <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
  <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
</Group>

bman2011 (author) commented:
I haven't tried your examples, but will try soon once I get home.

NVIT commented:
In that case...
Find:
(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>\r?\n)(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>)

Replace:
\1\3\4\r\n

NVIT commented:
Did you check the Regular Expression option?
Which version of Notepad++ are you using?
See mine attached.
greenshot-2014-10-17-13-41-41.png

bman2011 (author) commented:
Hmm, does this assume there are exactly 3 ParName lines every time? That is not always the case: the count varies (3, 20, 35), with no fixed number of ParName lines. Also, would it be thrown off by empty lines between the lines? It does not seem to work.

Notepad++ 6.5.5

NVIT commented:
Yes. 3 lines.
Can you post a bigger sample?
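
The three-line assumption confirmed above can be checked outside the editor. The following is a minimal sketch in Python, assuming that Python's `re` module with `re.MULTILINE` behaves closely enough to Notepad++'s regex engine for this pattern (the two engines are not identical). The helper `make_sample` and the shortened `Pmeter` line are hypothetical and exist only for illustration. The pattern is the Find expression from the earlier comment, unchanged.

```python
import re

# Find pattern from the comment above, copied unchanged.
# re.MULTILINE lets ^ match at the start of every line.
FIXED = re.compile(
    r"(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>\r?\n)"
    r"(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>)",
    re.MULTILINE,
)

# Hypothetical, shortened parameter line used to build test input.
PM = '  <Pmeter ParName="X" Value="" Type="A" Flag="P" />\n'


def make_sample(first_block_lines: int) -> str:
    """Build a Car block with an orphan </Group>, then one valid Group."""
    return (
        '<Car Name="HONA">\n' + PM * first_block_lines + "</Group>\n"
        + '<Group GID="15">\n' + PM * 3 + "</Group>\n"
    )


# Exactly three parameter lines: the pattern finds the block.
matches_three = FIXED.search(make_sample(3)) is not None

# Four parameter lines: each ".+\r?\n" consumes exactly one line, so no match.
matches_four = FIXED.search(make_sample(4)) is not None

# A blank line between parameters: ".+" needs at least one character.
with_blank = make_sample(3).replace(PM, PM + "\n", 1)
matches_blank = FIXED.search(with_blank) is not None
```

In this sketch only the first case matches, which is consistent with the pattern failing on blocks of varying length or blocks containing empty lines.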

bman2011 (author) commented:
Here is the original, reproduced as closely as possible. I'm working with an XML file that has literally 20k lines of these. The first Device PartID is the one I need to correct: it has no starting <Group>, so its first </Group> must be removed. The second Device PartID is correct because it contains both the start and end tags. This repeats 20k times, each time with different values for each ParName and with more or fewer ParName lines.

    <Device PartID="29000153" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>
<Device PartID="29000000" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
       <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>

Basically, if there were a way to find every </Group> end tag that has no matching start tag before it, that would also solve my problem.
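
Outside Notepad++, the "end tag without a matching start tag" idea can be expressed as a simple depth counter rather than one large regex. The following is a minimal sketch in Python, not something proposed in this thread. It assumes every <Group ...> and </Group> tag sits on its own line, as in the samples above. The function name `drop_orphan_group_ends` and the shortened sample are hypothetical.

```python
import re

# An opening <Group ...> tag that is not self-closing (no "/" before ">").
OPEN_GROUP = re.compile(r"<Group\b[^>]*(?<!/)>")
CLOSE_GROUP = re.compile(r"</Group\s*>")


def drop_orphan_group_ends(text: str) -> str:
    """Drop every line whose </Group> has no open <Group> before it."""
    depth = 0  # number of <Group> tags currently open
    kept = []
    for line in text.splitlines(keepends=True):
        if OPEN_GROUP.search(line):
            depth += 1
            kept.append(line)
        elif CLOSE_GROUP.search(line):
            if depth > 0:
                depth -= 1
                kept.append(line)
            # depth == 0: orphan end tag, so the line is skipped
        else:
            kept.append(line)
    return "".join(kept)


# Shortened, hypothetical sample shaped like the first Device above.
sample = (
    '<Car Name="HONA">\n'
    '  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    "</Group>\n"
    '<Group GID="15">\n'
    '  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />\n'
    "</Group>\n"
    "</Car>\n"
)
cleaned = drop_orphan_group_ends(sample)
```

Because the counter only tracks how many groups are open, the number of Pmeter lines and any blank lines between them do not matter.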

ozo commented:
Find:
(^|</Group>)(((?!<Group)[\s\S])*?)</Group>
Replace:
\1\2

bman2011 (author) commented:
ozo, your regex works; however, it also matches cases where the start <Group> tag is already present.

How about this instead: if anything matches

Color="">
          <Pmeter 

then remove the </Group> that comes next.

ozo commented:
Do you have an example of finding conditions where the start group tag is already there?
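
One possible way to reproduce such a case is sketched below in Python. It is an approximation: Python's `re` module stands in for Notepad++'s regex engine, with `re.MULTILINE` so that `^` matches at every line start. The input is a hypothetical, shortened group that already has its start tag.

```python
import re

# Find pattern from above, copied unchanged.
OZO = re.compile(r"(^|</Group>)(((?!<Group)[\s\S])*?)</Group>", re.MULTILINE)

# A well-formed group: the start tag is present.
valid = (
    '<Group GID="3">\n'
    '  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    "</Group>\n"
)

# ^ also matches at the start of the Pmeter line, and no "<Group" lies
# between that point and the end tag, so the valid </Group> is matched.
result = OZO.sub(r"\1\2", valid)
```

In this sketch the match starts on the Pmeter line rather than on the <Group> line, so the end tag of a valid group is removed as well.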

NVIT commented:
bman2011, going by your example, should it be the other way around, i.e. add a <Group GID="3"> to the first Device so that it looks like the second Device?
Just checking...
    <Device PartID="29000153" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
        <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>
<Device PartID="29000000" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
       <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>

bman2011 (author) commented:
Nope, the problem is that the XML file was generated incorrectly. I just need to remove every </Group> that has no starting <Group>, which will allow the XML file to be parsed correctly. Let me know if this makes it any clearer.

NVIT commented:
Try this.

1. Turn off "Wrap around"
2. Move cursor to top of file
3. Find:
(^.+<Car.+\r\n)(^.+<Pmeter.+$)(((?!/Group)[\s\S])*?)(^.+</Group>\r\n)
4. Replace:
\1\2\3
5. Replace All
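
The steps above can also be tried on a copy of the data outside the editor. The following is a minimal sketch in Python, an adaptation and not the exact Notepad++ expression: `.+` before each tag is relaxed to `.*` so unindented lines also match, `\r\n` is relaxed to `\r?\n`, and the inner group is made non-capturing so the replacement still reads `\1\2\3`. The shortened sample is hypothetical.

```python
import re

PATTERN = re.compile(
    r"(^.*<Car.+\r?\n)"          # 1: the <Car ...> line
    r"(^.*<Pmeter.+$)"           # 2: a <Pmeter> line directly after it
    r"((?:(?!/Group)[\s\S])*?)"  # 3: everything before the next "/Group"
    r"(^.*</Group>\r?\n)",       # 4: the orphan </Group> line (dropped)
    re.MULTILINE,
)

# Shortened, hypothetical sample: first Device is broken, second is valid.
sample = (
    '<Device PartID="29000153" MODNM="M-880">\n'
    '  <Car Name="HONA" Color="">\n'
    '      <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    '      <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />\n'
    "    </Group>\n"
    '    <Group GID="15">\n'
    '      <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />\n'
    "    </Group>\n"
    "  </Car>\n"
    "</Device>\n"
    '<Device PartID="29000000" MODNM="M-880">\n'
    '  <Car Name="HONA" Color="">\n'
    '    <Group GID="3">\n'
    '      <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    "    </Group>\n"
    "  </Car>\n"
    "</Device>\n"
)

# Keep groups 1 to 3 and leave out group 4, the orphan end tag.
fixed = PATTERN.sub(r"\1\2\3", sample)
```

The pattern only fires when a <Pmeter> line directly follows the <Car> line, which is why the second Device is left alone. It also assumes that every such Car block really does contain an orphan </Group>; if one did not, the first valid </Group> after it would be removed instead.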

bman2011 (author) commented:
This came closest to resolving my issue. I have much to learn about regular expressions and will be noting this down. Thanks.