Bobby

replace a string with a string from a different file

I have a txt file with a lot of [urlid=12345] tags (the 12345 can be any sequence of digits, and not only 5 digits; it could be 3, 6, etc.). I need to replace all of those with the actual URLs that they reference, which I have in an Excel file with two columns: urlid and url. The urlid column matches the urlid in [urlid=12345] in the txt file, and the url column contains the URL I need to replace every [urlid=12345] with in the txt file.
Regular Expressions, Windows Batch, Microsoft Excel, Microsoft Office


8/22/2022 - Mon
David Favor

Use curl to resolve the links to URLs.

Simple Perl or Bash + sed script to replace URLs.

Likely any of your tech staff can do this in a short amount of time.
Bobby

ASKER
We don't know any of that, at least those of us on staff now and not on vacation. I was hoping for something more like this: via regex, we find all the instances of [urlid=12345] (or whatever the number sequence is) in the txt file, copy them, and then I can dump them into the Excel file. Even then, though, it's not one-to-one... there are way more records (urlids) in the Excel file than there are in the txt file.
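The "find all the instances via regex" step described above could be sketched roughly as follows in PHP. This is only an illustration: the function name `extract_urlids` and the inline sample string are assumptions, not something from the thread's files.

```php
<?php
// Hypothetical helper: collect the distinct numeric ids used in [urlid=N] tags.
function extract_urlids(string $text): array
{
    // \d+ matches an id of any length (3, 5, 6 digits, ...)
    preg_match_all('/\[urlid=(\d+)\]/', $text, $matches);
    // Drop repeats, then reindex from 0 so the result is a plain list
    return array_values(array_unique($matches[1]));
}

// Assumed sample text, modelled on the tag format in the question
$sample = "also [urlid=3150]Occupational Therapy Month[/urlid] too. "
        . "What we call [urlid=81328]stress[/urlid], and [urlid=3150]again[/urlid]";

// One id per line, ready to paste into a spreadsheet column
echo implode("\n", extract_urlids($sample)), "\n";
```

Because the list is de-duplicated, it does not matter that the Excel file holds more ids than the txt file uses; only the ids actually present in the text are returned.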
David Favor

This is a very simple task + will likely take some time to code.

If you have no one on staff to do this, just hire someone.

You're also making this way too hard. If you have a link, like https://foolcom/12345 which redirects to another URL, just use curl to resolve the redirect. This way you know for sure you have the correct redirect also.

You can go the Excel data way. If you do, then you just ingest all your data into a simple script, like Perl, into hashes + just correlate each [urlid=12345] with whatever real URL your Excel data holds for that id.
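The hash-lookup idea can be sketched in PHP (the language used later in this thread) instead of Perl. This is a minimal sketch, assuming the id-to-URL pairs are already in an array; the function name `replace_urlid_pairs` and the example.com URL are hypothetical.

```php
<?php
// Hypothetical helper: replace every [urlid=N]text[/urlid] pair in a single pass.
// $urls maps urlid => URL (assumed to have been loaded from the Excel data).
function replace_urlid_pairs(string $text, array $urls): string
{
    return preg_replace_callback(
        '#\[urlid=(\d+)\](.*?)\[/urlid\]#s',
        function (array $m) use ($urls): string {
            if (!isset($urls[$m[1]])) {
                return $m[0]; // id not in the table: leave the tag as it is
            }
            // Escape the URL so characters like & and ' are safe inside the attribute
            $href = htmlspecialchars($urls[$m[1]], ENT_QUOTES);
            return "<a href='" . $href . "'>" . $m[2] . '</a>';
        },
        $text
    );
}

// Assumed data for illustration only
$urls = [3150 => 'https://example.com/occupational-therapy'];
echo replace_urlid_pairs('also [urlid=3150]Occupational Therapy Month[/urlid] too', $urls), "\n";
```

Looking each id up in the array means the text is scanned once, no matter how many rows the Excel table has, and rows that never appear in the text cost nothing.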

As I said, simple to code + will require a bit of time.

Something this simple, just hire someone off Fiverr.
Terry Woods

Can you share the files? This is a pretty quick task I think
Bobby

ASKER
We don't have a link like https://whatever.com/12345 in the txt file, we only have [urlid=12345]. There are no redirects. The txt file is referring to a URL id in a table (an Excel file copy in this case)... in that table, there is the URL ID and the corresponding URL. I'm trying to find the actual URL by tying together the two data sources.
Bobby

ASKER
Yes, I'll share them in 2 minutes...
Bobby

ASKER
Sample of the txt file...

It's <a href="http://www.stresscure.com/hrn/april.html">National Stress Awareness Month</a>.Yes, it's also [urlid=3150]Occupational Therapy Month[/urlid] too, but every month is about 5 different National Months, so bear with us on this. While we won't do a weekly feature on stress, because there's already enough stress in the world, we did want to share with you some helpful reminders about stress and relieving it. </p>\
<br>\
<p>\
Stress is both a biological and psychological term. It's been a popular topic of discussion in healthcare since the 1930's, but the term is thrown around in conversation without much real understanding. It has become a topic of concern for most American and European societies, and yet was scarcely talked about less than 100 years ago. Some recent researchers have called into question the very existence of the popular notion of stress, claiming it is too wide a term for a variety of distinct problems. But for those who experience stress, it's a very real force.\
</p> \
<p>\
In the 1970's a popular idea among scientists dealt with eustress and distress. Eustress, it was theorized was the positive stress that comes from a demanding physical or mental activity; distress was theorized as the negative kind that comes from a similar activity, but proves damaging to the body. In the early 21st century, research showed that any stress response in the human body creates hormones like adrenaline which damage the body's tissues, slightly in small amounts and in large doses can cause serious long term damage.\
</p>\
<p>\
What we call [urlid=81328]stress[/urlid], whether from our jobs, family, friends or communities, ultimately is an inescapable part of life. A utopia free from human worry has not yet been created, but when it is created, I hope I get the email. In the meantime, we are forced to cope. My great-great-great uncle, once removed, Sigmund Freud had some wild ideas about all this. He called it The Pleasure Principle, and even he wasn't quite sure what it was all about, only that humans have a tendency to try to find ways to get happiness in life, even when happiness is nowhere to be found.  \
</p>\


Bobby

ASKER
sample of the Excel file...
sample.xlsx
Terry Woods

If you can't provide the real data, I'd need to provide you a PHP script. Can you run PHP?

Also, you would need to copy and paste the data from Excel into a txt file so that it's easier to work with in PHP (otherwise there's substantial work involved for pulling data out of the xlsx file).
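As a rough sketch of what "easier to work with" means here: once the two Excel columns are pasted into a text file, each line is just `id<TAB>url`, which a few lines of PHP can turn into a lookup array. The function name `parse_url_table` is hypothetical, and the header-row and blank-line handling are assumptions about what the pasted data may contain.

```php
<?php
// Hypothetical helper: turn pasted "id<TAB>url" lines into an array of urlid => URL.
function parse_url_table(string $tsv): array
{
    $urls = [];
    // Accept Windows (\r\n), old Mac (\r) and Unix (\n) line endings
    foreach (preg_split('/\r\n|\r|\n/', $tsv) as $line) {
        $cols = explode("\t", $line);
        if (count($cols) < 2) {
            continue; // blank or malformed line
        }
        $id = trim($cols[0]);
        $url = trim($cols[1]);
        if (!ctype_digit($id) || $url === '') {
            continue; // header row ("urlid", "url") or a row with no URL
        }
        $urls[$id] = $url;
    }
    return $urls;
}

// Assumed sample rows; real data would come from file_get_contents("data_url.txt")
$table = "urlid\turl\r\n3150\thttps://example.com/a\r\n81328\thttps://example.com/b\r\n";
print_r(parse_url_table($table));
```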
Bobby

ASKER
Yes to both questions. Thank you.
Terry Woods

Did you want the output data in HTML format with <a href...   ...</a> ?
Bobby

ASKER
Yes, please.
Terry Woods

Here's the PHP code for you. It should be reasonably self-explanatory, so you can configure the file names of your input data as needed.

Note that each time you run it, the output file will be overwritten.

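<?php
// Input file of URLs copied and pasted from 2 columns of data in Excel.
// The copy and paste process automatically adds a tab character between the id and the URL.
$data_url = file_get_contents("data_url.txt");
$data_in = file_get_contents("data_in.txt"); // Input data

$data_output_filename = "data_out.txt";
$data_out = $data_in;

// Split on \n or \r\n so Windows line endings don't leave a stray \r on each URL
$url_lines = preg_split('/\r?\n/', $data_url);

foreach ($url_lines as $line) {
    $parts = explode("\t", trim($line));
    if (count($parts) < 2 || !ctype_digit($parts[0])) {
        continue; // Skip blank lines, a header row, and anything that isn't "id<TAB>url"
    }
    list($id, $url) = $parts;
    // Escape $ and \ in the URL so preg_replace doesn't read them as backreferences
    $link = "<a href='" . addcslashes($url, '\\$') . "'>$1</a>";
    $data_out = preg_replace("#\[urlid=$id\](.*?)\[/urlid\]#", $link, $data_out);
}

file_put_contents($data_output_filename, $data_out);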


Bobby

ASKER
I named all the files what you have there, created data_out.txt and put it in the same directory, and gave all 3 text files and the PHP file 777 perms. Running it puts the contents of data_in.txt into data_out.txt, but doesn't remove or alter the [urlid=12345] tags. I also added a ?> to the end of your script; no difference.
Bobby

ASKER
oh crap, maybe I see what it is... you have [/urlid\] in there, but I've already replaced all those with </a>. I will alter and try again.
Bobby

ASKER
ugh... I'll have to alter your PHP because it's too late to undo what I did. Do I do this?...

$data_out = preg_replace("#\[urlid=$id\](.*?)\#", "<a href='$url'>$1", $data_out);
Bobby

ASKER
and is this supposed to say data_url.txt at the end?

$url_lines = explode("\n", $data_url);
Terry Woods

I tested it and it was working ok with the given sample data. If you update your sample, I can alter the code to match.
Terry Woods

Your input data should be in data_in.txt

The file data_out.txt is the result file that gets created by the script
Bobby

ASKER
It's <a href="http://www.stresscure.com/hrn/april.html">National Stress Awareness Month</a>.Yes, it's also [urlid=3150]Occupational Therapy Month</a>too, but every month is about 5 different National Months, so bear with us on this. While we won't do a weekly feature on stress, because there's already enough stress in the world, we did want to share with you some helpful reminders about stress and relieving it. </p>\
<br>\
<p>\
Stress is both a biological and psychological term. It's been a popular topic of discussion in healthcare since the 1930's, but the term is thrown around in conversation without much real understanding. It has become a topic of concern for most American and European societies, and yet was scarcely talked about less than 100 years ago. Some recent researchers have called into question the very existence of the popular notion of stress, claiming it is too wide a term for a variety of distinct problems. But for those who experience stress, it's a very real force.\
</p> \
<p>\
In the 1970's a popular idea among scientists dealt with eustress and distress. Eustress, it was theorized was the positive stress that comes from a demanding physical or mental activity; distress was theorized as the negative kind that comes from a similar activity, but proves damaging to the body. In the early 21st century, research showed that any stress response in the human body creates hormones like adrenaline which damage the body's tissues, slightly in small amounts and in large doses can cause serious long term damage.\
</p>\
<p>\
What we call [urlid=81328]stress</a>, whether from our jobs, family, friends or communities, ultimately is an inescapable part of life. A utopia free from human worry has not yet been created, but when it is created, I hope I get the email. In the meantime, we are forced to cope. My great-great-great uncle, once removed, Sigmund Freud had some wild ideas about all this. He called it The Pleasure Principle, and even he wasn't quite sure what it was all about, only that humans have a tendency to try to find ways to get happiness in life, even when happiness is nowhere to be found.  \
</p>\


Terry Woods

All the code I provided was working, so would only require changing if you had data different to your sample.
Terry Woods

Ok, I'm working on a different copy of the code now... will be done in about 2 mins
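For the updated sample, where every [/urlid] has already been turned into </a>, one possible adjustment is to replace only the opening [urlid=N] tag and leave the existing </a> alone. This is a hedged sketch, not necessarily the accepted solution; the function name `replace_open_tags` and the example.com URL are hypothetical. (The attempt posted earlier ends its pattern with `\#`, which escapes the closing delimiter, so PHP would reject that pattern.)

```php
<?php
// Hypothetical helper for text where [/urlid] has already become </a>.
// $urls maps urlid => URL (assumed to be loaded from the Excel data).
function replace_open_tags(string $text, array $urls): string
{
    return preg_replace_callback(
        '/\[urlid=(\d+)\]/',
        function (array $m) use ($urls): string {
            $id = $m[1];
            if (!isset($urls[$id])) {
                return $m[0]; // unknown id: leave the tag untouched
            }
            // Only the opening tag is emitted; the closing </a> is already in the text
            return '<a href="' . htmlspecialchars($urls[$id], ENT_QUOTES) . '">';
        },
        $text
    );
}

// Assumed data for illustration only
$urls = [3150 => 'https://example.com/ot'];
echo replace_open_tags("it's also [urlid=3150]Occupational Therapy Month</a>too", $urls), "\n";
```

Matching just the opening tag avoids having to guess which </a> belongs to which [urlid=N], which matters because the text also contains ordinary <a href=...>...</a> links.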
ASKER CERTIFIED SOLUTION
Terry Woods

Log in or sign up to see answer
Bobby

ASKER
gotta go for tonight but will check first thing tomorrow. Thanks.
Bobby

ASKER
Thanks very much.