Solved

Java regex replace

Posted on 2014-02-23
Last Modified: 2014-02-24
I have a string in an XML file, <Boxed_Length>   </Boxed_Length>. I need a regular expression in Java to remove all the spaces so that it reads <Boxed_Length></Boxed_Length> instead.
Question by:samjud
11 Comments
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 39881087
You can try:

<Boxed_Length>\s+</Boxed_Lenth>

 

Author Comment

by:samjud
ID: 39881092
I am running this in Talend and got an invalid escape sequence error, so I tried <Boxed_Length>\\s+</Boxed_Lenth>. It still does not work.
 

Author Comment

by:samjud
ID: 39881093
Here is the full string I tried: "<Boxed_Length>\\s+</Boxed_Length>&#(.*);"
 
LVL 35

Expert Comment

by:Terry Woods
ID: 39881231
Different tools and languages have different standards for regular expression syntax. The string you're using looks good for Java, but I don't know whether Talend uses the same syntax; do you know?

It's worth noting, in case you didn't notice, that kaufmed's pattern was missing a letter g from the closing tag, i.e. Boxed_Lenth should be Boxed_Length. It looks like you've fixed it in your latest pattern, but I'm pointing it out just in case it makes a difference.
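
On the invalid escape sequence error: in Java source code, the regex \s has to be written as "\\s" inside a string literal, because the compiler handles backslash escapes before the regex engine ever sees the string. A minimal sketch, assuming the Talend expression is compiled as ordinary Java (the class, field, and method names are made up for illustration):

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // In Java source, "\\s+" is the three-character regex \s+ (one or more whitespace characters).
    static final String WHITESPACE = "\\s+";

    // True when the whole text consists of whitespace only.
    static boolean isBlank(String text) {
        return Pattern.matches(WHITESPACE, text);
    }

    public static void main(String[] args) {
        System.out.println(WHITESPACE.length()); // prints 3
        System.out.println(isBlank("   "));      // prints true
        System.out.println(isBlank(" 12 "));     // prints false
    }
}
```

If the tool takes the pattern as raw text rather than as a Java string literal, the single-backslash form would be the one to use instead.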
 

Author Comment

by:samjud
ID: 39881246
As far as I know, Talend uses regular Java expression syntax, and yes, I did fix the spelling in my test.
 
LVL 35

Expert Comment

by:Terry Woods
ID: 39881247
If it's easy and quick to change the pattern and retest, a good technique is to start with a very simple pattern like "Boxed_Length" and build up the complexity once you have the simple one working.

Alternatively, you may like to provide a copy and paste of the data you're trying to match. There might be (I'm guessing almost certainly is) something in the data that causes it to not match. It might be something as simple as a space between the > and & characters.
 

Author Comment

by:samjud
ID: 39881309
Below is an example of one XML record:

<Row>
            <Item_SKU>MANOWAR16</Item_SKU>
            <Promo_Title>12" Guitar Speakers</Promo_Title>
            <QTY_on_hand>7</QTY_on_hand>
            <COST>61</COST>
            <UPC>876358001583</UPC>
            <Weight>9.9</Weight>
            <Brand>EMINENCE</Brand>
            <MSRP>89.99</MSRP>
            <UAP>89.99</UAP>
            <TOPCATEGORY>DJ,HOME,STAGE,PERSONAL,RECREATION,SCHOOL,INSTRUMENTAL,PORTABLE,CLUB</TOPCATEGORY>
            <Boxed_Length>   </Boxed_Length>
            <Boxed_Height />
            <Boxed_Width>   </Boxed_Width>
            <CCREATEDATE>2011-11-29T12:40:50.68</CCREATEDATE>
            <DATEMODIFIED>2014-1-22</DATEMODIFIED>
            <MFG_PROD_ID>MANOWAR16</MFG_PROD_ID>
            <Image_x0020_URL>/images/XYZ123/MANOWAR16.jpg</Image_x0020_URL>
            <CATEGORY>WOOFERS-GUITAR-12IN</CATEGORY>
            <MFGCOUNTRY>CHINA</MFGCOUNTRY>
            <Long_Description>SPECIFICATION    
Nominal Basket Diameter  12", 304.8mm
Nominal Impedance*  16 ohms
Power Rating    
Watts 120W
Music Program  
Resonance 102Hz
Usable Frequency Range  70Hz-5.5kHz
Sensitivity*** 101.6
Magnet Weight  38 oz.
Gap Height  0.312", 7.92mm
Voice Coil Diameter 1.75", 44.5mm
                SOUND CLIPS
THIELE &amp; SMALL PARAMETERS          Clean    Heavy    OD
Resonant Frequency (fs)  102Hz  
DC Resistance (Re)  13.1  
Coil Inductance (Le)  0.74mH      
      Download PDF Spec Sheet  
 
Mechanical Q (Qms)  12.39
Electromagnetic Q (Qes)  0.97
Total Q (Qts)  0.85
Compliance Equivalent Volume (Vas)  31.5 liters / 1.1 cu.ft.
Mechanical Compliance of Suspension (Cms)  0.08mm/N  
BL Product (BL)  16.5 T-M  
Diaphram Mass inc. Airload (Mms)  30 grams  
Efficiency Bandwidth Product (EBP)  105  
Maximum Linear Excursion (Xmax)  0.8mm  
Surface Area of Cone (Sd)  519.5 cm2  
Maximum Mechanical Limit (Xlim)    
     
MOUNTING INFORMATION      
Recommended Enclosure      
Sealed Acceptable  
Vented Acceptable  
Overall Diameter  12.02", 305.3mm  
Baffle Hole Diameter  10.97", 278.6mm  
Front Sealing Gasket  fitted as standard  
Rear Sealing Gasket  fitted as standard  
Mounting Holes Diameter  0.25", 6.4mm  
Mounting Holes B.C. D.  11.63", 295.4mm  
Depth 5.2", 132mm  
Net Weight  8.1 lbs., 3.7 kg  
Shipping Weight  9.9 lbs., 4.5 kg  
     
MATERIALS OF CONSTRUCTION      
Coil Construction  Copper voice coil  
Coil Former Polyimide former  
Magnet Composition  Ferrite magnet  
Core Details  Non-vented core  
Basket Materials  Pressed steel basket    
Cone Composition  Paper Cone  
Cone Edge Composition  Paper cone edge  
Dustcap Composition Zurette dust cap  
   
</Long_Description>
      </Row>
 
LVL 86

Expert Comment

by:CEHJ
ID: 39881329
Some considerations:

a. Is this actually a Java question?
b. Why are you concerned with that particular whitespace? It's not as if there's much of it.
 

Author Comment

by:samjud
ID: 39881335
a. I think so.
b. It looks like Talend has a feature (bug) where a record in an XML file that contains only whitespace and no data makes it break and not continue, hence the need to remove the whitespace.
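
Since the sample record has the same whitespace-only content in Boxed_Width, one option is to empty every whitespace-only element in a single pass instead of naming each tag. This is only a sketch: the class and method names are hypothetical, it assumes the XML is available as a Java String before Talend parses it, and it only handles simple tags without attributes.

```java
import java.util.regex.Pattern;

class BlankElementCleaner {
    // Group 1 captures the tag name; \1 requires the closing tag to repeat it.
    private static final Pattern BLANK_ELEMENT =
            Pattern.compile("<([A-Za-z_][\\w.-]*)>\\s+</\\1>");

    static String clean(String xml) {
        // $1 in the replacement is the captured tag name.
        return BLANK_ELEMENT.matcher(xml).replaceAll("<$1></$1>");
    }

    public static void main(String[] args) {
        String xml = "<Boxed_Length>   </Boxed_Length>\n"
                   + "<Boxed_Height />\n"
                   + "<Boxed_Width>   </Boxed_Width>\n"
                   + "<Weight>9.9</Weight>";
        // Boxed_Length and Boxed_Width are emptied; the other two lines are untouched.
        System.out.println(clean(xml));
    }
}
```

Elements that contain real data, and self-closing elements such as <Boxed_Height />, do not match the pattern and pass through unchanged.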
 
LVL 35

Accepted Solution

by:
Terry Woods earned 500 total points
ID: 39881477
Unless I'm missing something, the pattern you said you tried:
"<Boxed_Length>\\s+</Boxed_Length>&#(.*);"


won't work because there are no &# characters immediately after the closing Boxed_Length tag.

The more basic pattern:
"<Boxed_Length>\\s+</Boxed_Length>"


should work, provided that we're really dealing with the same patterns that Java accepts. Are you sure the backslash needs the extra escape character, for example?
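
As a rough illustration of the basic pattern in plain Java (outside Talend, so the class name, method name, and sample string here are assumptions), String.replaceAll takes the regex and the replacement text:

```java
class BoxedLengthFix {
    // Replaces a whitespace-only Boxed_Length element with an empty one.
    static String fix(String xml) {
        return xml.replaceAll("<Boxed_Length>\\s+</Boxed_Length>",
                              "<Boxed_Length></Boxed_Length>");
    }

    public static void main(String[] args) {
        String row = "<Row><Boxed_Length>   </Boxed_Length></Row>";
        System.out.println(fix(row)); // prints <Row><Boxed_Length></Boxed_Length></Row>
    }
}
```

If this works in a standalone test but not in Talend, the difference is likely in how Talend passes the pattern or the data, not in the regex itself.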

As I mentioned above, a good technique for fixing patterns that don't work is to start with a simple pattern and get that working before building on it. Start with something like:
"Boxed_Length"


Once the above pattern is known to work, try:
"<Boxed_Length>"


then try:
"<Boxed_Length>\\s+"


and keep building up the pattern until you either encounter a problem (in which case you can ask for more help) or you have the final result you need. This isn't difficult; it just requires a number of iterations of testing/debugging.
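
The build-up approach can be sketched in plain Java by running each step against a fragment of the posted record and printing whether it is found. The class and helper names are hypothetical, and the test data is a shortened copy of the sample above.

```java
import java.util.regex.Pattern;

class PatternBuildUp {
    // True when the regex matches anywhere in the data.
    static boolean found(String regex, String data) {
        return Pattern.compile(regex).matcher(data).find();
    }

    public static void main(String[] args) {
        String data = "<Boxed_Length>   </Boxed_Length>\n            <Boxed_Height />";
        String[] steps = {
            "Boxed_Length",
            "<Boxed_Length>",
            "<Boxed_Length>\\s+",
            "<Boxed_Length>\\s+</Boxed_Length>",
            "<Boxed_Length>\\s+</Boxed_Length>&#(.*);"
        };
        // The first step that reports false is the part of the pattern to look at.
        for (String step : steps) {
            System.out.println(step + " -> " + found(step, data));
        }
    }
}
```

With this data the first four steps are found and only the last one is not, because the closing tag is followed by a line break rather than &#.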
 
LVL 86

Expert Comment

by:CEHJ
ID: 39882086
samjud wrote: "b. it looks like talend has a feature (bug) so that when there is a record in an xml file that just has whitespace without any data it breaks and does not continue.. hence the need to remove the whitespace."
Then that's your real problem. Needless to say, it shouldn't be necessary to find workarounds like this for such a major platform.