Split cell contents into multiple cells

Hi everyone,

In column A of my spreadsheet, I have data that looks like the following:

<p>Some text is here</p>
<p> A name is here <br />
A title is here <br />
A company name is here <br />
<span class="dated">A date is here</span>
</p>
<a href="http://www.somesite.com/"><img alt="" src="/img/logo.jpg" style="float: right; height: 60px; margin-top: -75px; top: -70px;" /></a>

I need to separate out the information so that the data in each row of column A is separated out as follows:

Cell B = <p>Some text is here</p>
Cell C = <p> A name is here <br />
Cell D = A title is here <br />
Cell E = A company name is here <br />
Cell F = <span class="dated">A date is here</span>
Cell G = http://www.somesite.com/
Cell H = /img/logo.jpg

I don't need the html elements, so the above would look like this:

Cell B = Some text is here
Cell C = A name is here
Cell D = A title is here
Cell E = A company name is here
Cell F = A date is here
Cell G = http://www.somesite.com/
Cell H = /img/logo.jpg

The html elements, I guess, would just be used in the formula to target the different information.

I wondered if anyone could tell me how this could be done, ie. whether a formula could be used or whether it would need a VBA script?

Also note that in the example that I gave above, there is a line for company name and also a line for the image but in some instances, both those two lines are not there so it may look like this:

<p>Some text is here</p>
<p> A name is here <br />
A title is here <br />
<span class="dated">A date is here</span>
</p>

Or another variation of the above is that it would have the company name but not the image as follows:

<p>Some text is here</p>
<p> A name is here <br />
A title is here <br />
A company name is here <br />
<span class="dated">A date is here</span>
</p>

So I'm not sure if the code could cater to these types of variations in the data.

Thanks so  much in advance.
LVL 1
gwh2Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Roy CoxGroup Finance ManagerCommented:
An example workbook would help
0
gwh2Author Commented:
Thanks for the reply,

I've attached some sample data ...
0
gwh2Author Commented:
Sorry - it's attached now
sample-data.xlsx
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

Ryan ChongCommented:
try this...

I use a function to get the content in between.

Function getText(s As String, Optional delimiter As String = "</p>", Optional idx As Integer = 0, Optional replaceStartText As String = "<p>")
    On Error GoTo Err
    Dim tmp As String
    tmp = Split(s, delimiter)(idx)
    lastIdx = InStrRev(tmp, replaceStartText)
    If lastIdx > 0 Then
        getText = Trim(Replace(Mid(tmp, lastIdx + Len(replaceStartText)), vbLf, ""))
    Else
        getText = Trim(Replace(tmp, vbLf, ""))
    End If
    Exit Function
Err:
    getText = ""
End Function

Open in new window

sample-data_b.xlsm
0
gwh2Author Commented:
Thanks for the code. I opened up your file and then went under the Tools menu and then under Macros but I couldn't see the macro there in order to run it. I can see it in the visual basic editor but I don't know how to run it from there. Can you let me know what I should do?
0
Ryan ChongCommented:
I have put the function into the cell's formula... like:

=getText(A1)

you can download the sample file I had uploaded in my previous comment.
0
gwh2Author Commented:
Ok thanks. I've tested the formula but it doesn't seem to be working correctly.

In Row 1:

(1) column F should say "Company Name1A" but instead it's showing the date
(2) column G should be the date but instead it's showing the url
(3) column H should be the url but instead it's showing the image path
(4) column I should be the image path

In Row 2:

(1) column E is showing the span around the date when it should only show the date itself
(2) column F is showing the date by itself when it should be in column E
(3) column G is showing the following:

"<p>Testimonial text is here</p>
<p> Firstname2 Lastname2 <br />
    Position2<br />
<span class=""dated"">22/9/2014</span>
</p>"

(4) column H is showing the following:

"<p>Testimonial text is here</p>
<p> Firstname2 Lastname2 <br />
    Position2<br />
<span class=""dated"">22/9/2014</span>
</p>"

The other rows are not displaying correctly either.

Are you able to help further?
0
gwh2Author Commented:
Perhaps the problem is that the content of the cells differs. I thought that you could use the html elements to formulate some code but maybe it's not possible?
0
Ryan ChongCommented:
>>(2) column G should be the date but instead it's showing the url
>>(3) column H should be the url but instead it's showing the image path
>>(4) column I should be the image path

this is different from what you have defined in your question...

Cell F = A date is here
Cell G = http://www.somesite.com/
Cell H = /img/logo.jpg
0
Ryan ChongCommented:
>>Perhaps the problem is that the content of the cells differs. I thought that you could use the html elements to formulate some code but maybe it's not possible?

some of the content is not "formatted" with proper value, for example: Cell A3 do not have the Position.

<p>Testimonial text is here</p>
<p> Firstname3 Lastname3 <br />
class="dated">14/8/2014</span>
</p>

so we do need a better handler for different scenarios
0
gwh2Author Commented:
Sorry when I first posted the question I didn't realise the variations in Column A. I've since worked out that there will be 5 variations, ie. the rows in Column A will either be broken into 3, 4, 6, 7 or 8 columns.

Due to these variation, will this still be possible?
0
Ryan ChongCommented:
can you post the latest Excel you have and then we shall work from there onwards...
0
gwh2Author Commented:
The latest one that I have is the one that you're working from. If you look at the contents in Column A you can see the variations. For example Row 1 has the following information:

<p>Testimonial text is here</p>
<p> Firstname1 Lastname1 <br />
Position1<br />
Company Name1<br />
Company Name1A <br />
<span class="dated">1/9/2015</span>
</p>
<a href="http://www.someonesite.com1/"><img alt="" src="/img/logo1.jpg" style="float: right; height: 60px; margin-top: -75px; top: -70px;" /></a>

The above row would need to be split into 8 columns as there is 8 sets of information.

The following is Row B:

<p>Testimonial text is here</p>
<p> Firstname2 Lastname2 <br />
Position2<br />
<span class="dated">22/9/2014</span>
</p>

The above row would only be split into 4 columns as there is only 4 sets of information.

The other 3 rows are also different.

Are you able to work from that same spreadsheet?
0
Ryan ChongCommented:
>>Are you able to work from that same spreadsheet?
that's the reason why I suggest you to upload the one you have to avoid discrepancies and waste of efforts. cheers
0
gwh2Author Commented:
Ok attached is the sample data - thanks
sample-data.xlsx
0
Ryan ChongCommented:
it's seems that there are some conflict in your data, for example:

in cell A2:

<p>Testimonial text is here</p>
<p> Firstname2 Lastname2 <br />
   Position2<br />
<span class="dated">22/9/2014</span>
</p>

in cell A4:

<p>Testimonial text is here</p>
<p> Firstname4 Lastname4 <br />
Company Name4<br />
<span class="dated">19/6/2014</span>
</p>
<a href="http://www.someonesite.com4/"><img alt="" src="/img/logo4.jpg" style="float: right; height: 60px; margin-top: -75px; top: -70px;" /></a>

both content hold different values but are in same position. How's your Text Mining logic to handle this?
0
gwh2Author Commented:
Hi, yeah I realise that there are conflicts but to be honest I don't know how to handle them. As long as the information is in its only cell and column, I can then just go through manually and move things around after the code has run.

Are you ok with that? I understand if it's not possible to do.
0
Ryan ChongCommented:
a decision has to be made either by yourself or by your immediate supervisor to firm out the text mining logic. Without a specific tag or logic in text content, we may need to sacrifice some data accuracy when we do the mining
0
gwh2Author Commented:
Maybe the logic could be:

Anything between <p> and </p>, copy to new cell
Anything between <p> and <br />, copy to new cell
Anything before a <br /> tag, copy to new cell
Anything between span tags, copy to new cell

And the same for the url and image file path
0
gwh2Author Commented:
If you can't do that, then could we revise the scope and just copy the date, the url and image file path. That would be enough I think.
0
gwh2Author Commented:
The rest I can do manually.
0
Ryan ChongCommented:
>>If you can't do that, then could we revise the scope and just copy the date, the url and image file path. That would be enough I think.

I think the latest sample already able to extract those info.

Or you can refer to the one I attached in this comment.

hope that helps
sample-data_d.xlsm
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
gwh2Author Commented:
Ok thanks for your help - I appreciate it.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft Excel

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.