Solved

Java regex replace

Posted on 2014-02-23
Last Modified: 2014-02-24
I have a string in an XML file, <Boxed_Length>   </Boxed_Length>, and I need a regular expression in Java that removes the whitespace so that it reads <Boxed_Length></Boxed_Length> instead.
Question by:samjud
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 39881087
You can try:

<Boxed_Length>\s+</Boxed_Lenth>

 

Author Comment

by:samjud
ID: 39881092
I am running this in Talend and got an invalid escape sequence error, so I tried <Boxed_Length>\\s+</Boxed_Lenth>, but it still does not work.
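
For reference, the escaping can be checked in plain Java, outside Talend. This is only a minimal sketch: the class and method names are hypothetical, the closing tag is spelled Boxed_Length, and it assumes Talend hands the string to Java's regex engine unchanged. In Java source the regex \s+ has to be written "\\s+", because a single backslash before s is not a valid string escape.

```java
class EscapeDemo {
    // Hypothetical helper: empties a whitespace-only <Boxed_Length> element.
    static String emptyBoxedLength(String xml) {
        // The Java literal "\\s+" is the regex \s+ (one or more whitespace characters).
        return xml.replaceAll("<Boxed_Length>\\s+</Boxed_Length>",
                              "<Boxed_Length></Boxed_Length>");
    }

    public static void main(String[] args) {
        System.out.println(emptyBoxedLength("<Boxed_Length>   </Boxed_Length>"));
    }
}
```

If this works standalone but not in Talend, the difference is likely in how the pattern or the input reaches the regex engine rather than in the pattern itself.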
 

Author Comment

by:samjud
ID: 39881093
here is the full string i tried "<Boxed_Length>\\s+</Boxed_Length>&#(.*);"
 
LVL 35

Expert Comment

by:Terry Woods
ID: 39881231
Different tools and languages have different standards for regular expression syntax. The string you're using looks good for Java, but I don't know whether Talend uses the same syntax; do you know?

It's worth noting, in case you didn't notice, that kaufmed's pattern was missing the letter g from the closing tag, i.e. Boxed_Lenth should be Boxed_Length. It looks like you've fixed it in your latest pattern, but I'm pointing it out just in case it makes a difference.
 

Author Comment

by:samjud
ID: 39881246
As far as I know, Talend uses standard Java regular expression syntax, and yes, I did fix the spelling in my test.
 
LVL 35

Expert Comment

by:Terry Woods
ID: 39881247
If it's easy and quick to change the pattern and retest, a good technique is to start with a very simple pattern like "Boxed_Length" and build up the complexity once you have the simple one working.

Alternatively, you may like to provide a copy and paste of the data you're trying to match. There might be (I'd guess there almost certainly is) something in the data that causes it to not match. It might be something as simple as a space between the > and & characters.
 

Author Comment

by:samjud
ID: 39881309
Below is an example of one XML record:

<Row>
            <Item_SKU>MANOWAR16</Item_SKU>
            <Promo_Title>12" Guitar Speakers</Promo_Title>
            <QTY_on_hand>7</QTY_on_hand>
            <COST>61</COST>
            <UPC>876358001583</UPC>
            <Weight>9.9</Weight>
            <Brand>EMINENCE</Brand>
            <MSRP>89.99</MSRP>
            <UAP>89.99</UAP>
            <TOPCATEGORY>DJ,HOME,STAGE,PERSONAL,RECREATION,SCHOOL,INSTRUMENTAL,PORTABLE,CLUB</TOPCATEGORY>
            <Boxed_Length>   </Boxed_Length>
            <Boxed_Height />
            <Boxed_Width>   </Boxed_Width>
            <CCREATEDATE>2011-11-29T12:40:50.68</CCREATEDATE>
            <DATEMODIFIED>2014-1-22</DATEMODIFIED>
            <MFG_PROD_ID>MANOWAR16</MFG_PROD_ID>
            <Image_x0020_URL>/images/XYZ123/MANOWAR16.jpg</Image_x0020_URL>
            <CATEGORY>WOOFERS-GUITAR-12IN</CATEGORY>
            <MFGCOUNTRY>CHINA</MFGCOUNTRY>
            <Long_Description>SPECIFICATION    
Nominal Basket Diameter  12", 304.8mm
Nominal Impedance*  16 ohms
Power Rating    
Watts 120W
Music Program  
Resonance 102Hz
Usable Frequency Range  70Hz-5.5kHz
Sensitivity*** 101.6
Magnet Weight  38 oz.
Gap Height  0.312", 7.92mm
Voice Coil Diameter 1.75", 44.5mm
                SOUND CLIPS
THIELE &amp; SMALL PARAMETERS          Clean    Heavy    OD
Resonant Frequency (fs)  102Hz  
DC Resistance (Re)  13.1  
Coil Inductance (Le)  0.74mH      
      Download PDF Spec Sheet  
 
Mechanical Q (Qms)  12.39
Electromagnetic Q (Qes)  0.97
Total Q (Qts)  0.85
Compliance Equivalent Volume (Vas)  31.5 liters / 1.1 cu.ft.
Mechanical Compliance of Suspension (Cms)  0.08mm/N  
BL Product (BL)  16.5 T-M  
Diaphram Mass inc. Airload (Mms)  30 grams  
Efficiency Bandwidth Product (EBP)  105  
Maximum Linear Excursion (Xmax)  0.8mm  
Surface Area of Cone (Sd)  519.5 cm2  
Maximum Mechanical Limit (Xlim)    
     
MOUNTING INFORMATION      
Recommended Enclosure      
Sealed Acceptable  
Vented Acceptable  
Overall Diameter  12.02", 305.3mm  
Baffle Hole Diameter  10.97", 278.6mm  
Front Sealing Gasket  fitted as standard  
Rear Sealing Gasket  fitted as standard  
Mounting Holes Diameter  0.25", 6.4mm  
Mounting Holes B.C. D.  11.63", 295.4mm  
Depth 5.2", 132mm  
Net Weight  8.1 lbs., 3.7 kg  
Shipping Weight  9.9 lbs., 4.5 kg  
     
MATERIALS OF CONSTRUCTION      
Coil Construction  Copper voice coil  
Coil Former Polyimide former  
Magnet Composition  Ferrite magnet  
Core Details  Non-vented core  
Basket Materials  Pressed steel basket    
Cone Composition  Paper Cone  
Cone Edge Composition  Paper cone edge  
Dustcap Composition Zurette dust cap  
   
</Long_Description>
      </Row>
 
LVL 86

Expert Comment

by:CEHJ
ID: 39881329
Some considerations:

a. is this actually a Java question?
b. why in fact are you concerned with that particular whitespace - it's not as if there's much of it ..?
 

Author Comment

by:samjud
ID: 39881335
a. I think so.
b. It looks like Talend has a feature (bug) where a record in an XML file that contains only whitespace and no data makes it break and not continue, hence the need to remove the whitespace.
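
Since the sample record has more than one whitespace-only element (Boxed_Width as well as Boxed_Length), one possible generalisation is a backreference on the tag name. This is a sketch only and not something proposed in the thread: the class name is hypothetical, it assumes the affected elements carry no attributes, and regex-based XML editing is fragile compared with a real XML parser.

```java
import java.util.regex.Pattern;

class BlankElementCleaner {
    // Group 1 captures the tag name; \1 requires the same name in the closing tag.
    // Assumption: the opening tag has no attributes.
    private static final Pattern BLANK_ELEMENT =
            Pattern.compile("<([A-Za-z_][\\w.-]*)>\\s+</\\1>");

    // Replaces every whitespace-only element with an empty element of the same name.
    static String clean(String xml) {
        return BLANK_ELEMENT.matcher(xml).replaceAll("<$1></$1>");
    }

    public static void main(String[] args) {
        String xml = "<Boxed_Length>   </Boxed_Length>\n"
                   + "<Boxed_Height />\n"
                   + "<Boxed_Width>   </Boxed_Width>";
        System.out.println(clean(xml));
    }
}
```

Elements that contain any non-whitespace text, and self-closing elements such as <Boxed_Height />, are left untouched because they do not fit the pattern.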
 
LVL 35

Accepted Solution

by:Terry Woods (earned 2000 total points)
ID: 39881477
Unless I'm missing something, the pattern you said you tried:
"<Boxed_Length>\\s+</Boxed_Length>&#(.*);"

won't work because there are no &# characters immediately after the closing Boxed_Length tag.
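
To see this concretely, the character that follows the closing tag can be inspected in plain Java. This is a rough sketch with hypothetical class and method names, using a two-line excerpt of the posted record as the input; the real file may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingTokenDemo {
    // Returns the character right after the first match, or -1 if there is
    // no match or the match ends at the end of the input.
    static int charAfterMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find() || m.end() >= input.length()) {
            return -1;
        }
        return input.charAt(m.end());
    }

    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Excerpt of the sample record: a line break follows the closing tag.
        String sample = "<Boxed_Length>   </Boxed_Length>\n            <Boxed_Height />";
        String basic = "<Boxed_Length>\\s+</Boxed_Length>";

        System.out.println("code point after closing tag: " + charAfterMatch(basic, sample));
        System.out.println("with trailing &#(.*); matches: " + found(basic + "&#(.*);", sample));
    }
}
```

In this excerpt the closing tag is followed by a line break, so a pattern that demands &# at that position cannot match.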

The more basic pattern:
"<Boxed_Length>\\s+</Boxed_Length>"

should work, provided that we're really dealing with the same patterns that Java accepts. Are you sure the backslash needs the extra escape character, for example?

As I mentioned above, a good technique for fixing patterns that don't work is to start with a simple pattern and get that working before building on it. Start with something like:
"Boxed_Length"

Once the above pattern is known to work, try:
"<Boxed_Length>"

then try:
"<Boxed_Length>\\s+"

and keep building up the pattern until you either encounter a problem (in which case you can ask for more help) or you have the final result you need. This isn't difficult; it just requires a number of iterations of testing/debugging.
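
The build-up can also be scripted so that each step is tested in one run. The following is a minimal sketch in plain Java; the class and method names are hypothetical, and the input is a shortened stand-in for the real record.

```java
import java.util.regex.Pattern;

class PatternBuildUp {
    // Returns the index of the first pattern that fails to match anywhere
    // in the input, or -1 if every pattern matches.
    static int firstFailing(String input, String... patterns) {
        for (int i = 0; i < patterns.length; i++) {
            if (!Pattern.compile(patterns[i]).matcher(input).find()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String sample = "<Boxed_Length>   </Boxed_Length>";
        String[] steps = {
            "Boxed_Length",
            "<Boxed_Length>",
            "<Boxed_Length>\\s+",
            "<Boxed_Length>\\s+</Boxed_Length>",
            "<Boxed_Length>\\s+</Boxed_Length>&#(.*);"
        };
        int failed = firstFailing(sample, steps);
        System.out.println(failed < 0
                ? "all steps match"
                : "first failing step: " + steps[failed]);
    }
}
```

The first step that fails points at the part of the pattern to look at next.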
 
LVL 86

Expert Comment

by:CEHJ
ID: 39882086
samjud wrote: "b. it looks like talend has a feature (bug) so that when there is a record in an xml file that just has whitespace without any data it breaks and does not continue.. hence the need to remove the whitespace."
Then that's your real problem. Needless to say, it shouldn't be necessary to find workarounds like this for such a major platform.