Solved

Can we do this RegEx in Notepad++?

Posted on 2014-10-17
Last Modified: 2014-10-21
My XML has end </Group> tags with no matching start <Group> tag. I need to remove every such orphaned end tag while leaving everything else, including the content in between, untouched. I'm trying to work out the correct regex for this in Notepad++.

<Car Name="HONA">
  <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
  <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
</Group>
<Group GID="15">
  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
  <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
  <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
</Group>

Question by:bman2011
17 Comments
 
LVL 24

Expert Comment

by:NVIT
ID: 40387276
Does this help?
(<.+>)\r?\n(.+)\r?\n(.+)\r?\n(.+)\r?\n(</Group>)

 
LVL 24

Expert Comment

by:NVIT
ID: 40387291
...or
Find:
(^<.+>\r?\n)(.+\r?\n)(.+\r?\n)(.+\r?\n)(</Group>)

Replace:
<Beg>\r\n\1\2\3\4\5\r\n<End>

 
LVL 24

Expert Comment

by:NVIT
ID: 40387639
Please clarify. What should the resultant example look like?

 

Author Comment

by:bman2011
ID: 40387684
The result would be the same as listed, but without the </Group> that has no beginning tag. It would look like this:

<Car Name="HONA">
  <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
  <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
<Group GID="15">
  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
  <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
  <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
</Group>

 

Author Comment

by:bman2011
ID: 40387686
I haven't tried your examples, but will try soon once I get home.
 
LVL 24

Expert Comment

by:NVIT
ID: 40387697
In that case...
Find:
(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>\r?\n)(^<.+>\r?\n.+\r?\n.+\r?\n.+\r?\n)(</Group>)

Replace:
\1\3\4\r\n

 
LVL 24

Expert Comment

by:NVIT
ID: 40387707
Did you check the Regular Expression option?
Which version of Notepad++ are you using?
See mine attached.
greenshot-2014-10-17-13-41-41.png
 

Author Comment

by:bman2011
ID: 40387710
Hmm, is this assuming that there are only 3 ParName lines every time? That is not always the case: the count varies (3, 20, 35), with no fixed number of ParName lines. Also, would it be thrown off if there were empty lines between the lines? It does not seem to work.

Notepad++ 6.5.5
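
Both concerns can be checked outside the editor. The sketch below runs the earlier fixed-count Find pattern against small made-up samples, using Python's `re` module as a stand-in for the Notepad++ regex engine (an assumption; the two engines are not identical). The helper `make_sample` and the sample strings are hypothetical and exist only for illustration.

```python
import re

# The earlier Find pattern: one tag line, exactly three more lines, then </Group>.
FIXED_THREE = re.compile(
    r"(^<.+>\r?\n)(.+\r?\n)(.+\r?\n)(.+\r?\n)(</Group>)", re.MULTILINE
)

def make_sample(body_lines):
    """Build a hypothetical <Car> block followed by an orphan </Group>."""
    return '<Car Name="HONA">\n' + "".join(body_lines) + "</Group>\n"

p = '  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'

three_lines = make_sample([p, p, p])      # exactly three Pmeter lines
four_lines = make_sample([p, p, p, p])    # one Pmeter line too many
with_blank = make_sample([p, "\n", p])    # three lines, but one is empty

# `.+` needs at least one character per line and the line count is hard-coded,
# so only the first sample is expected to match.
matches_three = FIXED_THREE.search(three_lines) is not None
matches_four = FIXED_THREE.search(four_lines) is not None
matches_blank = FIXED_THREE.search(with_blank) is not None

print(matches_three, matches_four, matches_blank)
```

In this sketch only the three-line sample matches, which suggests that a pattern for the real file needs a repeated, variable-length part rather than a fixed number of `.+\r?\n` groups.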
 
LVL 24

Expert Comment

by:NVIT
ID: 40387719
Yes. 3 lines.
Can you post a bigger sample?
 

Author Comment

by:bman2011
ID: 40387744
Here is a sample as close to the original as possible. I'm working with an XML file that has literally 20k lines of these. The first Device PartID is the one I need to correct: it has no starting <Group>, so its first </Group> must be removed. The second Device PartID is correct because it contains both tags. This pattern repeats throughout the 20k lines, with different values for each ParName and a varying number of ParName lines.

    <Device PartID="29000153" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>
<Device PartID="29000000" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
       <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>


Basically, a way to find any end tag that has no matching start tag before it would also solve my problem.
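
That idea can be sketched outside of a regex as a small depth counter. The Python below is only an illustration under stated assumptions: every tag sits on its own line, there are no self-closing `<Group ... />` elements, and the function name `drop_orphan_group_closers` is made up for this example.

```python
import re

def drop_orphan_group_closers(text: str) -> str:
    """Remove </Group> lines that have no unmatched <Group ...> before them."""
    depth = 0   # number of <Group> tags currently open
    kept = []
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        if re.match(r"<Group\b", stripped):
            depth += 1
        elif stripped == "</Group>":
            if depth == 0:
                continue  # orphan end tag: drop the whole line
            depth -= 1
        kept.append(line)
    return "".join(kept)

sample = (
    '<Car Name="HONA">\n'
    '  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    '</Group>\n'
    '<Group GID="15">\n'
    '  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />\n'
    '</Group>\n'
)

result = drop_orphan_group_closers(sample)
print(result)
```

Because the counter only tracks `<Group>` depth, the number of Pmeter lines and any blank lines in between do not matter.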
 
LVL 84

Expert Comment

by:ozo
ID: 40387771
Find:
(^|</Group>)(((?!<Group)[\s\S])*?)</Group>
Replace:
\1\2
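
To check a pattern like this outside the editor, here is a minimal Python sketch that applies the same Find/Replace pair to a shortened, made-up sample. One assumption worth flagging: as far as I know, `^` in Notepad++ matches at the start of every line, which corresponds to `re.MULTILINE` in Python. The sketch runs the substitution both ways so the difference is visible.

```python
import re

PATTERN = r"(^|</Group>)(((?!<Group)[\s\S])*?)</Group>"

sample = (
    '<Car Name="HONA">\n'
    '  <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    '</Group>\n'
    '<Group GID="15">\n'
    '  <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />\n'
    '</Group>\n'
)

# ^ anchors only at the very start of the text (Python's default).
start_only = re.sub(PATTERN, r"\1\2", sample)

# ^ anchors at the start of every line.
every_line = re.sub(PATTERN, r"\1\2", sample, flags=re.MULTILINE)

print(start_only.count("</Group>"), every_line.count("</Group>"))
```

With the start-of-text anchor only the orphan end tag is removed. With the per-line anchor a match can also begin on a line inside a valid group, so that group's end tag is stripped as well.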
 

Author Comment

by:bman2011
ID: 40387841
Ozo, your regex works, but it also matches cases where the start Group tag is already present.

How about this: if anything matches

Color="">
          <Pmeter 

then remove the </Group> that comes next.
 
LVL 84

Expert Comment

by:ozo
ID: 40387858
Do you have an example of finding conditions where the start group tag is already there?
 
LVL 24

Expert Comment

by:NVIT
ID: 40387914
bman2011, per your example, should it be the other way around, i.e. for the first Device, add a <Group GID="3"> so that it looks like the second Device?
Just checking...
    <Device PartID="29000153" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
        <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>
<Device PartID="29000000" MODNM="M-880">
      <Car Name="HONA" LastEdited="" Blah="" Accessed="" Color="">
       <Group GID="3">
          <Pmeter ParName="*CDMZI" Value="F%+A8E=,,,0;+MS=V123,456;" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />
          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />
        </Group>
        <Group GID="15">
          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />
          <Pmeter ParName="CDH6" Value="100.111.100.44" Type="A" Flag="P" />
          <Pmeter ParName="CDEPORT" Value="9003" Type="A" Flag="P" />
        </Group>
      </Car>
      <DevFiles />
    </Device>

 

Author Comment

by:bman2011
ID: 40389294
Nope. The problem is that the XML file was generated incorrectly. I just need to remove every </Group> that has no starting <Group>, which will allow the XML file to be parsed correctly. Let me know if this makes it any clearer.
 
LVL 24

Accepted Solution

by:
NVIT earned 500 total points
ID: 40389648
Try this.

1. Turn off "Wrap around"
2. Move cursor to top of file
3. Find:
(^.+<Car.+\r\n)(^.+<Pmeter.+$)(((?!/Group)[\s\S])*?)(^.+</Group>\r\n)
4. Replace:
\1\2\3
5. Replace All
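
For anyone who wants to try the same idea outside Notepad++, below is a rough Python adaptation, not a drop-in copy of the pattern above. The changes are assumptions made for the sketch: `\r?\n` instead of `\r\n` so that LF-only files also match, `.*` instead of `.+` so that unindented tags match, and a non-capturing inner group, which makes the end tag group 4 instead of 5. The sample text is shortened and made up.

```python
import re

ORPHAN_CLOSER = re.compile(
    r"(^.*<Car.+\r?\n)"            # 1: the <Car ...> line
    r"(^.*<Pmeter.+$)"             # 2: a <Pmeter> line right after it, so no <Group> was opened
    r"((?:(?!/Group)[\s\S])*?)"    # 3: everything up to, but not across, the next /Group
    r"(^.*</Group>\r?\n)",         # 4: the orphan </Group> line, left out of the replacement
    re.MULTILINE,
)

text = (
    '    <Device PartID="29000153" MODNM="M-880">\n'
    '      <Car Name="HONA" Color="">\n'
    '          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    '          <Pmeter ParName="CDRFCP" Value="0" Type="A" Flag="P" />\n'
    '        </Group>\n'
    '        <Group GID="15">\n'
    '          <Pmeter ParName="#PT" Value="NONE" Type="A" Flag="P" />\n'
    '        </Group>\n'
    '      </Car>\n'
    '    </Device>\n'
    '    <Device PartID="29000000" MODNM="M-880">\n'
    '      <Car Name="HONA" Color="">\n'
    '        <Group GID="3">\n'
    '          <Pmeter ParName="DMTZ" Value="" Type="A" Flag="P" />\n'
    '        </Group>\n'
    '      </Car>\n'
    '    </Device>\n'
)

fixed = ORPHAN_CLOSER.sub(r"\1\2\3", text)
fixed_crlf = ORPHAN_CLOSER.sub(r"\1\2\3", text.replace("\n", "\r\n"))

print(fixed)
```

The second Device is left alone because its `<Car>` line is followed by `<Group GID="3">` rather than a `<Pmeter>` line, so group 2 cannot match there.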
 

Author Closing Comment

by:bman2011
ID: 40395553
This came closest to resolving my issue. I have much to learn about regular expressions and will be noting this down. Thanks.