Solved

Java regex replace

Posted on 2014-02-23
Last Modified: 2014-02-24
I have a string in an XML file, <Boxed_Length>   </Boxed_Length>, and I need a regular expression in Java that removes the whitespace so that it reads <Boxed_Length></Boxed_Length> instead.
Question by:samjud
 

Expert Comment

by:käµfm³d 👽
ID: 39881087
You can try:

<Boxed_Length>\s+</Boxed_Lenth>


Author Comment

by:samjud
ID: 39881092
I am running this in Talend and got an invalid escape sequence error, so I tried <Boxed_Length>\\s+</Boxed_Lenth>. It still does not work.

Author Comment

by:samjud
ID: 39881093
Here is the full string I tried: "<Boxed_Length>\\s+</Boxed_Length>&#(.*);"

Expert Comment

by:Terry Woods
ID: 39881231
Different tools and languages have different standards for regular expression syntax. The string you're using looks good for Java, but I don't know whether Talend uses the same syntax; do you know?

It's worth noting, in case you didn't notice, that kaufmed's pattern was missing the letter g in the closing tag, i.e. Boxed_Lenth should be Boxed_Length. It looks like you've fixed it in your latest pattern, but I'm pointing it out just in case it makes a difference.

Author Comment

by:samjud
ID: 39881246
As far as I know, Talend uses standard Java regular expression syntax, and yes, I did fix the spelling in my test.

Expert Comment

by:Terry Woods
ID: 39881247
If it's easy and quick to change the pattern and retest, a good technique is to start with a very simple pattern like "Boxed_Length" and build up the complexity once you have the simple one working.

Alternatively, you may like to provide a copy and paste of the data you're trying to match. There might be (I'm guessing almost certainly is) something in the data that causes it to not match. It might be something as simple as a space between the > and & characters.

Author Comment

by:samjud
ID: 39881309
Below is an example of one XML record:

<Row>
            <Item_SKU>MANOWAR16</Item_SKU>
            <Promo_Title>12" Guitar Speakers</Promo_Title>
            <QTY_on_hand>7</QTY_on_hand>
            <COST>61</COST>
            <UPC>876358001583</UPC>
            <Weight>9.9</Weight>
            <Brand>EMINENCE</Brand>
            <MSRP>89.99</MSRP>
            <UAP>89.99</UAP>
            <TOPCATEGORY>DJ,HOME,STAGE,PERSONAL,RECREATION,SCHOOL,INSTRUMENTAL,PORTABLE,CLUB</TOPCATEGORY>
            <Boxed_Length>   </Boxed_Length>
            <Boxed_Height />
            <Boxed_Width>   </Boxed_Width>
            <CCREATEDATE>2011-11-29T12:40:50.68</CCREATEDATE>
            <DATEMODIFIED>2014-1-22</DATEMODIFIED>
            <MFG_PROD_ID>MANOWAR16</MFG_PROD_ID>
            <Image_x0020_URL>/images/XYZ123/MANOWAR16.jpg</Image_x0020_URL>
            <CATEGORY>WOOFERS-GUITAR-12IN</CATEGORY>
            <MFGCOUNTRY>CHINA</MFGCOUNTRY>
            <Long_Description>SPECIFICATION    
Nominal Basket Diameter  12", 304.8mm
Nominal Impedance*  16 ohms
Power Rating    
Watts 120W
Music Program  
Resonance 102Hz
Usable Frequency Range  70Hz-5.5kHz
Sensitivity*** 101.6
Magnet Weight  38 oz.
Gap Height  0.312", 7.92mm
Voice Coil Diameter 1.75", 44.5mm
                SOUND CLIPS
THIELE &amp; SMALL PARAMETERS          Clean    Heavy    OD
Resonant Frequency (fs)  102Hz  
DC Resistance (Re)  13.1  
Coil Inductance (Le)  0.74mH      
      Download PDF Spec Sheet  
 
Mechanical Q (Qms)  12.39
Electromagnetic Q (Qes)  0.97
Total Q (Qts)  0.85
Compliance Equivalent Volume (Vas)  31.5 liters / 1.1 cu.ft.
Mechanical Compliance of Suspension (Cms)  0.08mm/N  
BL Product (BL)  16.5 T-M  
Diaphram Mass inc. Airload (Mms)  30 grams  
Efficiency Bandwidth Product (EBP)  105  
Maximum Linear Excursion (Xmax)  0.8mm  
Surface Area of Cone (Sd)  519.5 cm2  
Maximum Mechanical Limit (Xlim)    
     
MOUNTING INFORMATION      
Recommended Enclosure      
Sealed Acceptable  
Vented Acceptable  
Overall Diameter  12.02", 305.3mm  
Baffle Hole Diameter  10.97", 278.6mm  
Front Sealing Gasket  fitted as standard  
Rear Sealing Gasket  fitted as standard  
Mounting Holes Diameter  0.25", 6.4mm  
Mounting Holes B.C. D.  11.63", 295.4mm  
Depth 5.2", 132mm  
Net Weight  8.1 lbs., 3.7 kg  
Shipping Weight  9.9 lbs., 4.5 kg  
     
MATERIALS OF CONSTRUCTION      
Coil Construction  Copper voice coil  
Coil Former Polyimide former  
Magnet Composition  Ferrite magnet  
Core Details  Non-vented core  
Basket Materials  Pressed steel basket    
Cone Composition  Paper Cone  
Cone Edge Composition  Paper cone edge  
Dustcap Composition Zurette dust cap  
   
</Long_Description>
      </Row>

Expert Comment

by:CEHJ
ID: 39881329
Some considerations:

a. is this actually a Java question?
b. why in fact are you concerned with that particular whitespace - it's not as if there's much of it ..?

Author Comment

by:samjud
ID: 39881335
a. I think so.
b. It looks like Talend has a feature (bug) where a record in an XML file that contains only whitespace and no data causes it to break and stop processing, hence the need to remove the whitespace.
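
A rough sketch of that workaround in plain Java, assuming the XML is available as a String before Talend parses it. The sample record also has a whitespace-only Boxed_Width element, so this version uses a back-reference to catch any simple element (no attributes) whose content is only whitespace. The class and method names are made up for illustration, and nothing here is Talend-specific.

```java
import java.util.regex.Pattern;

// Hypothetical helper: empties any attribute-free element that contains only whitespace.
final class WhitespaceOnlyElements {
    // <(\w+)> captures the tag name; \s+ is the whitespace-only content;
    // </\1> requires the closing tag to carry the same name.
    private static final Pattern WHITESPACE_ONLY =
            Pattern.compile("<(\\w+)>\\s+</\\1>");

    static String strip(String xml) {
        // $1 in the replacement re-inserts the captured tag name.
        return WHITESPACE_ONLY.matcher(xml).replaceAll("<$1></$1>");
    }

    public static void main(String[] args) {
        String row = "<Boxed_Length>   </Boxed_Length>\n"
                   + "<Boxed_Height />\n"
                   + "<Boxed_Width>   </Boxed_Width>\n"
                   + "<Weight>9.9</Weight>";
        System.out.println(strip(row));
    }
}
```

Elements that hold real data, such as Weight or Long_Description in the record above, are left alone because their content is not purely whitespace.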

Accepted Solution

by:
Terry Woods earned 500 total points
ID: 39881477
Unless I'm missing something, the pattern you said you tried:
"<Boxed_Length>\\s+</Boxed_Length>&#(.*);"

won't work because there are no &# characters immediately after the closing Boxed_Length tag.

The more basic pattern:
"<Boxed_Length>\\s+</Boxed_Length>"

should work, provided that we're really dealing with the same patterns that Java accepts. Are you sure the backslash needs the extra escape character, for example?
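
For reference, here is a minimal sketch of the basic pattern in plain Java, outside Talend. In a Java string literal the backslash does need doubling: "\\s" in source code is the two-character regex \s, while a single "\s" is what triggers the invalid escape sequence error. The class and method names are only illustrative.

```java
import java.util.regex.Pattern;

// Hypothetical helper showing the basic pattern with Java string escaping.
final class BoxedLengthCleaner {
    // "\\s+" in Java source compiles to the regex \s+ (one or more whitespace characters).
    private static final Pattern WHITESPACE_ONLY =
            Pattern.compile("<Boxed_Length>\\s+</Boxed_Length>");

    static String clean(String xml) {
        // Replace every whitespace-only Boxed_Length element with an empty one.
        return WHITESPACE_ONLY.matcher(xml).replaceAll("<Boxed_Length></Boxed_Length>");
    }

    public static void main(String[] args) {
        String row = "<Row>\n    <Boxed_Length>   </Boxed_Length>\n    <Boxed_Height />\n</Row>";
        System.out.println(clean(row));
    }
}
```

The shorter form xml.replaceAll("<Boxed_Length>\\s+</Boxed_Length>", "<Boxed_Length></Boxed_Length>") does the same thing; the precompiled Pattern just avoids recompiling the regex on every call.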

As I mentioned above, a good technique for fixing patterns that don't work is to start with a simple pattern and get that working before building on it. Start with something like:
"Boxed_Length"

Once the above pattern is known to work, try:
"<Boxed_Length>"

then try:
"<Boxed_Length>\\s+"

and keep building up the pattern until you either encounter a problem (in which case you can ask for more help) or you have the final result you need. This isn't difficult; it just requires a number of iterations of testing/debugging.
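
The build-up can also be run as a quick standalone Java check, sketched below. The sample string is a shortened piece of the record posted earlier, and the class and method names are hypothetical. Each step reports whether the pattern is found anywhere in the sample, so the first step that stops matching shows where the problem was introduced.

```java
import java.util.regex.Pattern;

// Hypothetical test harness: tries progressively longer patterns against sample data.
final class PatternBuildUp {
    // True if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String sample = "<Boxed_Length>   </Boxed_Length>\n<Boxed_Height />";
        String[] steps = {
            "Boxed_Length",
            "<Boxed_Length>",
            "<Boxed_Length>\\s+",
            "<Boxed_Length>\\s+</Boxed_Length>",
            "<Boxed_Length>\\s+</Boxed_Length>&#(.*);"
        };
        for (String step : steps) {
            System.out.println(step + " -> " + found(step, sample));
        }
    }
}
```

On this sample only the last step fails, because the closing tag is followed by a line break and the next element rather than by &#.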

Expert Comment

by:CEHJ
ID: 39882086
"b. it looks like talend has a feature (bug) so that when there is a record in an xml file that just has whitespace without any data it breaks and does not continue.. hence the need to remove the whitespace."

Then that's your real problem. Needless to say, it shouldn't be necessary to find workarounds like this for such a major platform.