Modify HTML content saved in a DB table

I have a table in a DB with a general text field which contains HTML formatted text. I need to parse the content of such a field, find all the "img" tags and perform 2 operations (only for "img" tags):

1) Completely remove the "style" attribute (if there is one).
2) Add a class="img-responsive" attribute.

For example, a simple string to be parsed can be as follows:

<div>
<p>This is some text</p>
<img src="http://www.mywebsite.com/myImage.jpg" alt = "" style="width:600px; height: 400px;"/>
</div>

Or, something more complex:

<p style="margin: 0px 0px 10px; background: white; font-size: 14.6667px; font-family: Calibri, sans-serif;"><span style="font-size: 14pt; font-family: Georgia, serif; color: #404040;">
<img src="http://www.mywebsite.com/myImage.jpg" alt="" style="width:600px; height: 898px;" />
<br /></span></p>  

In both cases, the "img" tag should result in:

<img src="http://www.mywebsite.com/myImage.jpg" alt="" class="img-responsive" />

I know that one option is to use regular expressions but I do not have any experience with them. I will be using C# to perform this task.
I will very much appreciate your help.

Respectfully,
Jorge Maldonado
Chinmay Patel (Chief Technical Ninja) commented:
Hi Jorge,

I would take a different route if I had to do it myself. How about keeping just the src and alt attributes, getting rid of everything else, and then adding the img-responsive class attribute? Is that something we can do here?

Regards,
Chinmay.
Jorge Maldonado (Author) commented:
It sounds good. I need to get rid of all the attributes for every "img" tag except the following:

<img src="http://www.mywebsite.com/myImage1.jpg" alt = ""/>

And then insert class="img-responsive".
What about using HTML Agility Pack?
Does anybody know how to use it?

Regards.
ste5an (Senior Developer) commented:
Is it an existing project or a new one?

In the latter case I would consider using XHTML as the content format, because then it can be processed as XML and you don't need regex.
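
A minimal sketch of the XML route mentioned above, assuming the stored content is a well-formed XHTML fragment with a single root element, no XML namespace, and no HTML-only entities such as &nbsp; (any of these would need extra handling). The class and method names are hypothetical; only System.Xml.Linq from the .NET base library is used.

```csharp
using System;
using System.Xml.Linq;

public static class XhtmlImageCleaner
{
    // Hypothetical helper: parses the fragment as XML, strips "style" from
    // every <img>, and sets class="img-responsive" on it.
    public static string CleanImages(string xhtml)
    {
        // PreserveWhitespace keeps the original line breaks and spacing.
        XElement root = XElement.Parse(xhtml, LoadOptions.PreserveWhitespace);

        // DescendantsAndSelf also covers a fragment whose root is the <img> itself.
        foreach (XElement img in root.DescendantsAndSelf("img"))
        {
            // The null-conditional call makes a missing attribute a no-op.
            img.Attribute("style")?.Remove();

            // Adds the attribute, or overwrites an existing class value.
            img.SetAttributeValue("class", "img-responsive");
        }

        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

XElement.Parse throws an XmlException on markup that is not well-formed, which is why this only suits content known to be XHTML.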
Jorge Maldonado (Author) commented:
It is an existing project and the DB table already has many records with HTML formatted data.

Regards,
Jorge Maldonado
Chinmay Patel (Chief Technical Ninja) commented:
Hi Jorge,

I don't think we need to use HTML Agility Pack for this, but you are free to do so. It is as straightforward to use as they claim.
If you get stuck somewhere, do let me know.

Regards,
Chinmay.
Jorge Maldonado (Author) commented:
Hi Chinmay,

I do not have a final solution yet; I am performing some tests with HTML Agility Pack.
I would appreciate it if you could help with another approach.

Respectfully,
Jorge Maldonado
Chinmay Patel (Chief Technical Ninja) commented:
Hi Jorge,

An approach without HTML Agility Pack would be to use string manipulation:
1. Find the <img> tag using IndexOf.
2. Extract the src and alt attributes using either IndexOf or String.Split('='): in the resulting array, find the attribute names src and alt; their values will be right after them.
3. Build your string with the above info and add the CSS class.
4. Rinse and repeat for the next string.
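
The steps above can be sketched roughly as follows. This is only an illustration under assumptions: attribute values are quoted, no attribute value contains a ">" character, and the names ImgTagRewriter, RewriteImages and GetAttribute are hypothetical. It takes the IndexOf route rather than String.Split.

```csharp
using System;
using System.Text;

public static class ImgTagRewriter
{
    // Rebuilds every <img> tag with only src, alt and class="img-responsive".
    public static string RewriteImages(string html)
    {
        var output = new StringBuilder();
        int position = 0;

        while (true)
        {
            // Step 1: locate the next <img tag and its closing '>'.
            int start = html.IndexOf("<img", position, StringComparison.OrdinalIgnoreCase);
            if (start < 0) break;
            int end = html.IndexOf('>', start);
            if (end < 0) break; // unterminated tag: leave the rest untouched

            string tag = html.Substring(start, end - start + 1);

            // Step 2: pull out the two attributes we want to keep.
            string src = GetAttribute(tag, "src");
            string alt = GetAttribute(tag, "alt");

            // Step 3: copy the text before the tag, then emit the rebuilt tag.
            output.Append(html, position, start - position);
            output.Append("<img src=\"").Append(src)
                  .Append("\" alt=\"").Append(alt)
                  .Append("\" class=\"img-responsive\" />");

            // Step 4: continue after this tag.
            position = end + 1;
        }

        output.Append(html, position, html.Length - position);
        return output.ToString();
    }

    // Returns the quoted value of the named attribute, or "" if it is absent.
    private static string GetAttribute(string tag, string name)
    {
        int search = 0;
        while (true)
        {
            int nameIndex = tag.IndexOf(name, search, StringComparison.OrdinalIgnoreCase);
            if (nameIndex < 0) return "";
            search = nameIndex + name.Length;

            // Must follow whitespace, so "data-src" or a URL containing "alt" is skipped.
            if (nameIndex == 0 || !char.IsWhiteSpace(tag[nameIndex - 1])) continue;

            // Allow optional spaces around '=' (the sample uses alt = "").
            int i = search;
            while (i < tag.Length && char.IsWhiteSpace(tag[i])) i++;
            if (i >= tag.Length || tag[i] != '=') continue;
            i++;
            while (i < tag.Length && char.IsWhiteSpace(tag[i])) i++;
            if (i >= tag.Length) return "";

            char quote = tag[i];
            if (quote != '"' && quote != '\'') continue; // unquoted values are not handled
            int close = tag.IndexOf(quote, i + 1);
            return close < 0 ? "" : tag.Substring(i + 1, close - i - 1);
        }
    }
}
```

For the first sample in the question, RewriteImages leaves the div and p elements untouched and replaces the img tag with the form shown as the desired result. Hand-rolled parsing like this is fragile on unusual markup, which is the usual argument for a real HTML parser.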

Regards,
Chinmay.
lenamtl commented:
You can also generate an Excel or CSV file from the DB table and then use search and replace ...
Jorge Maldonado (Author) commented:
I finally managed to get HTML Agility Pack working. My need is to process the HTML before it is displayed in an ASP.NET MVC website. The approach I took is to:

a) Process the information from the DB in the controller.
b) Use HTML Agility Pack to process the HTML in question.
c) Step (b) is done only with data in memory and not saved to the DB.
d) Call the View and pass the data.
e) Display the data in the View.

In this way, all the information in my DB remains unchanged. So, basically I only apply the required changes to the HTML data at run-time.
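
The thread does not include the actual code, so the following is only a sketch of what step (b) might look like with HTML Agility Pack (the HtmlAgilityPack NuGet package). It implements the original request (drop "style", add the class); the names ResponsiveImageFilter and MakeImagesResponsive are hypothetical.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET itself

public static class ResponsiveImageFilter
{
    // Returns a modified copy of the HTML; the stored DB value is never touched.
    public static string MakeImagesResponsive(string html)
    {
        if (string.IsNullOrEmpty(html)) return html;

        var document = new HtmlDocument();
        document.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection images = document.DocumentNode.SelectNodes("//img");
        if (images == null) return html;

        foreach (HtmlNode img in images)
        {
            // Removing by name is harmless when the attribute is absent.
            img.Attributes.Remove("style");

            // Adds the attribute, or overwrites an existing class value.
            img.SetAttributeValue("class", "img-responsive");
        }

        return document.DocumentNode.OuterHtml;
    }
}
```

In a controller action, the string read from the DB would be passed through MakeImagesResponsive and the result handed to the View, which matches steps (a) to (e). The exact serialization of the output (for example, whether the img tag keeps its self-closing slash) depends on the library's settings, so it is worth checking against real data.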

Best regards,
Jorge Maldonado
