We help IT Professionals succeed at work.

Code to remove, duplicate and add information to Excel spreadsheet

shaz0503
shaz0503 asked
on
Medium Priority
229 Views
Last Modified: 2012-05-11
All

I have a large data set (???) that needs tidying before being 'uploaded' to stats program.

As this data set will be updated on a bi annual basis, I am looking for a solution to a main data tidying exercise.

In one field of data, names of authors, there can be a single author or up to 25 authors in a single cell.

What I need to do is to insert duplicate rows for each of these authors.  That is if there are 10 authors, I need to insert 9 rows BUT each row will only contain one name.

My solution for this round was to insert rows and then manually remove names so that each author had one row.

(All other data is repeated for each author)

In addition, where there is an * after the name I need to add Int to column D and where there is a # I need to add Ext to column D

See attached example

Any ideas or thoughts would be much appreciated as I don't necessarily want to be copying and pasting and deleting for two weeks again to get data in usable format.

TIA

Shaz



ERA-EE.xlsx
Comment
Watch Question

Cool, all of the above can be done!!! but i would need some time.. when would u need it by?

and only thing i would do is, will add the cleansed data in a new sheet rather than adding new rows inbetween and clean it. Coz, it might cause some disturbance to existing data.

-Bala

Author

Commented:
Thanks

Any help you are able to offer is appreciated.... Not really urgent, would just like to get it before the next round of data is out (prob in July) so I can see that it works.

Adding a new sheet is not a problem

rgds
Is this what you want?

Please run the macro "Sample" in the attachment. The output will be generated in a new sheet called Output.

HTH

Sid

Code Used

Sub Sample()
    Dim i As Long, j As Long, k As Long, LastRow As Long, r As Long, pos As Long
    Dim ws1 As Worksheet, ws2 As Worksheet
    Dim HashArray() As String, StarArray() As String
    Dim strTemp As String
    
    On Error Resume Next
    Application.DisplayAlerts = False
    Sheets("Output").Delete
    Application.DisplayAlerts = True
    On Error GoTo 0
    
    Set ws1 = Sheets("Raw Data")
    Set ws2 = Sheets.Add
    ws2.Name = "Output"
    
    LastRow = ws1.Range("A" & Rows.Count).End(xlUp).Row
    
    r = 1
    
    For i = 1 To LastRow
        If InStr(1, ws1.Range("B" & i).Value, "#") Or InStr(1, ws1.Range("B" & i).Value, "*") Then
            strTemp = ws1.Range("A" & i).Value
            ws2.Range("A" & r).Value = ws1.Range("A" & i).Value
            HashArray = Split(ws1.Range("B" & i).Value, "#")
            For j = 0 To UBound(HashArray) - 1
                If InStr(1, HashArray(j), "*") Then
                    StarArray = Split(HashArray(j), "*")
                    For k = 0 To UBound(StarArray)
                        If Left(StarArray(k), 1) = "," Then
                            ws2.Range("B" & r).Value = Mid(StarArray(k), 3)
                        Else
                            ws2.Range("B" & r).Value = StarArray(k)
                        End If
                        pos = InStr(1, ws1.Range("B" & i).Value, StarArray(k)) + Len(StarArray(k))
                        If Mid(ws1.Range("B" & i).Value, pos, 1) = "#" Then
                            ws2.Range("C" & r).Value = "EXT"
                        ElseIf Mid(ws1.Range("B" & i).Value, pos, 1) = "*" Then
                            ws2.Range("C" & r).Value = "INT"
                        End If
                        ws2.Range("A" & r).Value = strTemp
                        r = ws2.Range("B" & Rows.Count).End(xlUp).Row + 1
                    Next
                Else
                    If Left(HashArray(j), 1) = "," Then
                        ws2.Range("B" & r).Value = Mid(HashArray(j), 3)
                    Else
                        ws2.Range("B" & r).Value = HashArray(j)
                    End If
                    pos = InStr(1, ws1.Range("B" & i).Value, HashArray(k)) + Len(HashArray(k))
                    If Mid(ws1.Range("B" & i).Value, pos, 1) = "#" Then
                        ws2.Range("C" & r).Value = "EXT"
                    ElseIf Mid(ws1.Range("B" & i).Value, pos, 1) = "*" Then
                        ws2.Range("C" & r).Value = "INT"
                    End If
                    ws2.Range("A" & r).Value = strTemp
                    r = ws2.Range("B" & Rows.Count).End(xlUp).Row + 1
                End If
            Next
        End If
        r = ws2.Range("B" & Rows.Count).End(xlUp).Row + 1
    Next i
End Sub

Open in new window

ERA-EE.xlsm

Author

Commented:
Thanks so much --

This worked in excellent in part...see attached file

There was an error saying 'subscript out of range'

You will see that only 64 rows of data are in the Output sheet.  There should be in excess of 3000.....

Also seems to be skipping some rows - see row three on teh raw data sheet is missing and so are rows 9, 10, 11.

Copy-of-ERA-2010---HASS-master-l.xls
Looking at it now. Sorry I didn't have much data to test the macro with :)

Sid
I found the error. I was assuming that the text will contain both # and * Where as in Line 12 it was just * :)

Please try this file. let me know if you see any discrepancies.

Sid
Raw-Data.xls
No. Some are still missing. Let me recheck it

Sid
Ok amended a small piece of code. Please try this file.

Sid
Raw-Data.xls
SiddarthRout, u  r amazing yaar!!! whichever questions i open, u have already done it!! awesome...

u r the Expert!!!!
Thanks Bala Bhai :)

Sid

Author

Commented:
SiddarthRout

Thank you so much - you have saved me so much time....I will now try this with all the columns of data (30 in total) and get back if any problems...

The only one I can see from what you have done is that the last row does not record and INT or EXT .... Does part of the code need to be 'extended'

rgds
>>>I will now try this with all the columns of data (30 in total) and get back if any problems...

The will have to be amended for 30 columns :)

>>>The only one I can see from what you have done is that the last row does not record and INT or EXT .... Does part of the code need to be 'extended'

Let me check that for you.

Sid

Author

Commented:
Thanks Sid

I didn't want to send the entire file - should have though

The file actually has data in coulmns A - AJ

If this is what is needed

I am off home for the day - will await your reply... talk tomorrow

rgds
>>>The file actually has data in coulmns A - AJ

That is good enough. There is just a minor change which I will upload in the next few moments

Sid
>>>The only one I can see from what you have done is that the last row does not record and INT or EXT .... Does part of the code need to be 'extended'

I just noticed that it created that extra row when it shouldn't it is just a duplicate of the above row :)

I will incorporate that in the code.

Sid
>>>The file actually has data in coulmns A - AJ

Quick question.

What is in Col C or Rawdata? The reason I am asking is because e are going to put INT/EXT in Col C in the output file.

Sid

Author

Commented:
Sid

Apologies for not getting back to you...

Column A has ID numbers; Column B Title of publication; Column C has the authors (this is the one to be 'split/duplicated';

I need to add a column at Column D to insert the 'INT/EXT'

and then columns E - AK contain data

Sorry for the confusion on my part
Unlock this solution and get a sample of our free trial.
(No credit card required)
UNLOCK SOLUTION

Author

Commented:
Sid

Sorry for not getting back sooner...

I ran this and looks ok although got stuck in a loop and had to 'stop'...

Will try again today at some stage and get back to you -

Author

Commented:
Sid...

Thank you so much - this has saved me a lot of time (and sores eyes)...I am now going to try and understand the code to apply to other workbooks....

rgds
Unlock the solution to this question.
Thanks for using Experts Exchange.

Please provide your email to receive a sample view!

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.