  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 220

Regular Expression Find and Replace

I have rows of data that look similar to this (2 fields- an English phrase and a Spanish phrase)
"$10 for the first 20 people, $5 for all following attendees", "$10 la primera 20 personas, $5 para los demas participantes"
"$20 for the first 20 people, $0 for all following attendees", "$20 la primera 20 personas,$0 para los demas participantes"
"$30 for the first 50 people, $1.50 for all following attendees", "$30 la primera 50 personas,$1.50 para los demas participantes"
...

I add a row to the table whenever a new previously unused phrase comes in- such as:
"$2.50 for the first 2 people, $1.00 for all following attendees"

Since this is a phrase that I have already translated before (minus the numeric differences), I want to be able to create the equivalent Spanish phrase for the new value by simply substituting the new numeric values into a similar Spanish phrase. For this new example, it would be "$2.50 la primera 2 personas,$1.00 para los demas participantes"

So basically, I need to compare each new English phrase that arrives in a file with all of the English phrases that I already have in my table and see whether there is a match on everything but the numeric values (the numeric values have to be in the same positions). If there is, I want to update the numeric values of the corresponding Spanish phrase to match.
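
The matching step described above can be sketched by reducing each phrase to a number-free template and comparing the templates. This is only an illustration in Python (the thread itself targets VB.NET); the `template_key` helper, the `#` placeholder, and the in-memory `known` dictionary standing in for the database table are all assumptions, not part of the original question.

```python
import re

# One numeric value: digits with an optional decimal part ("20", "1.50").
NUMBER = re.compile(r"\d+(?:\.\d+)?")

def template_key(phrase):
    """Replace every numeric value with '#' so phrases compare by wording only."""
    return NUMBER.sub("#", phrase)

# Hypothetical stand-in for the table: English phrase -> Spanish phrase.
known = {
    "$10 for the first 20 people, $5 for all following attendees":
        "$10 la primera 20 personas, $5 para los demas participantes",
}

# Index the existing rows by their number-free template.
by_template = {template_key(en): (en, es) for en, es in known.items()}

new_phrase = "$2.50 for the first 2 people, $1.00 for all following attendees"
match = by_template.get(template_key(new_phrase))  # None if no row matches
```

Two phrases share a key only when their wording is identical and their numbers sit in the same positions, which is the condition stated in the question.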
Asked by: dws02432
 
käµfm³d 👽 commented:
...with all of the English phrases that I already have in my table...
Where is this table? A database? An HTML table?
 
dws02432 (Author) commented:
Hi kaufmed, it is a database table with 2 fields- one field with English phrases and one field with the corresponding Spanish phrase.
 
käµfm³d 👽 commented:
And are you trying to perform this comparison in code, or on the database itself? If on the DB, which DBMS are you using?
 
dws02432 (Author) commented:
Hi kaufmed, I am trying to perform the comparison using VB.NET.  I am using SQL Server 2008.
 
käµfm³d 👽 commented:
OK. SQL Server doesn't support regex-capable queries out of the box. To get that functionality, you would have to enable CLR integration and write a CLR stored procedure. Barring that, the only thing I can think of is to query and return the entire table, which may or may not be feasible for you.
 
Terry Woods (IT Guru) commented:
Or you could query the data row by row and update it one row at a time. The regex could be done using VB.NET or some other language you're comfortable with. This is assuming the operation doesn't need to be done too often, as performance won't be instant.
 
dws02432 (Author) commented:
Hi TerryAtOpus, that is my plan: to use VB.NET and update row by row. This is for a process that runs only once a month, and the table usually has about 5,000 rows at most. It is the regex that I need help with, to take the numeric changes found in the English phrase and update only the numbers in its Spanish equivalent.

So if the new English field is "$5 for 20 people" then I want to update an existing Spanish phrase to "$5 por 20 personas" (it might originally have been "$2 por 30 personas", but I want to update only the numeric values so that they match the English phrase).
 
Terry Woods (IT Guru) commented:
OK, you can capture the numbers from the English text with a pattern like this (note that both dollar signs are escaped so they match literally):
\$(\d+) for the first (\d+) people, \$(\d+) for all following attendees

Let's say you capture the values into 3 variables: myFirstDollars, myFirstPeople, and myExtraDollars

Then you can use a replacement with pattern something like:
"^(.*?)(\d+)(.*?)(\d+)(.*?)(\d+)(.*?)$
and replacement:
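"${1}" & myFirstDollars & "${3}" & myFirstPeople & "${5}" & myExtraDollars & "${7}"
(The braced form ${1} keeps a following digit from being read as part of the group number; "$1" followed by "5" would otherwise be parsed as "$15".)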

 
Terry Woods (IT Guru) commented:
The pattern used for the replacement should be:
"^(.*?)(\d+)(.*?)(\d+)(.*?)(\d+)(.*?)$"
when enclosed in quotes (I missed the closing quote)
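
As a rough illustration of the capture-then-replace approach above, here is a sketch in Python rather than VB.NET. The sample phrases are made up for the example, and `\g<1>` is Python's unambiguous group reference, used here so that a digit spliced in right after it is not read as part of the group number.

```python
import re

# Hypothetical sample data: a new English phrase and an existing Spanish row.
english = "$5 for the first 20 people, $3 for all following attendees"
spanish = "$10 la primera 50 personas, $1 para los demas participantes"

# Capture the three whole-number values from the English phrase.
# Both dollar signs are escaped so they match literally.
english_pattern = r"\$(\d+) for the first (\d+) people, \$(\d+) for all following attendees"
m = re.fullmatch(english_pattern, english)
first_dollars, first_people, extra_dollars = m.groups()

# Groups 1, 3, 5 and 7 hold the text around the numbers. Groups 2, 4 and 6
# (the old numbers) are dropped and the captured English values spliced in.
spanish_pattern = r"^(.*?)(\d+)(.*?)(\d+)(.*?)(\d+)(.*?)$"
replacement = (r"\g<1>" + first_dollars + r"\g<3>" + first_people
               + r"\g<5>" + extra_dollars + r"\g<7>")
updated = re.sub(spanish_pattern, replacement, spanish)
print(updated)
```

This version only handles whole numbers and exactly three values per phrase, which matches the pattern as posted.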
 
dws02432 (Author) commented:
Hi TerryAtOpus,

Thanks.  I gave that a try, and it almost meets my needs perfectly- but for numbers like $5.00 it sees the value as two numbers 5 and 00.  Is it possible to have a Regex that parses 5.00 as one value while still working for numbers without the decimal?  I won't always know what order the values will be in- sometimes the value 6 (for example) may appear first in the sentence and other times it may be the value of 5% (for example) that appears first and yet other times the first number value in the sentence may be $5.00- so I need a RegEx that can retrieve the entire numeric value whether or not it has a decimal position.  Thanks in advance- your post was very helpful.
 
GwynforWeb commented:
Why not 3 simple regular expressions to

(1) replace "for the first" with "la primera"
(2) replace "people" with "personas"
(3) replace "for all following attendees" with "para los demas participantes"
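
A minimal sketch of these three replacements, assuming the wording of the phrase is fixed and only this one template needs handling. It is written in Python for illustration and uses plain string replacement, since none of the three search strings needs regex features; the `translations` list and the sample phrase are assumptions for the example.

```python
# The three literal substitutions from the list above, applied in order.
translations = [
    ("for the first", "la primera"),
    ("people", "personas"),
    ("for all following attendees", "para los demas participantes"),
]

english = "$2.50 for the first 2 people, $1.00 for all following attendees"

spanish = english
for source_text, target_text in translations:
    spanish = spanish.replace(source_text, target_text)

print(spanish)
```

Because the numbers are never touched, they carry over unchanged, decimals included.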
 
Terry Woods (IT Guru) commented:
This should do as you describe:
"^(.*?)(\d+(?:\.\d+)?)(.*?)(\d+(?:\.\d+)?)(.*?)(\d+(?:\.\d+)?)(.*?)$"
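
The number token in that pattern, `\d+(?:\.\d+)?`, can also be used on its own. The sketch below (Python, for illustration only) is a variation on the posted pattern rather than the same technique: it copies however many numbers the English phrase contains into the Spanish phrase by position, instead of assuming exactly three. The `transfer_numbers` helper is a hypothetical name; the sample phrases come from the question.

```python
import re

# One numeric value: digits with an optional decimal part ("5", "5.00", "1.50").
NUMBER = re.compile(r"\d+(?:\.\d+)?")

def transfer_numbers(english, spanish):
    """Copy the numbers from `english` into `spanish`, position by position."""
    values = NUMBER.findall(english)
    if len(values) != len(NUMBER.findall(spanish)):
        raise ValueError("phrases contain different numbers of numeric values")
    remaining = iter(values)
    # The callback is invoked once per match, left to right.
    return NUMBER.sub(lambda _match: next(remaining), spanish)

new_english = "$2.50 for the first 2 people, $1.00 for all following attendees"
old_spanish = "$30 la primera 50 personas,$1.50 para los demas participantes"
result = transfer_numbers(new_english, old_spanish)
print(result)
```

Symbols such as `$` or `%` stay where they are in the Spanish text, because only the digits (and an optional decimal part) are matched and replaced.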