Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Manipulating existing Excel rows and columns to new format

Posted on 2014-03-20
14
Medium Priority
?
225 Views
Last Modified: 2014-03-27
I have several very large Excel spreadsheets containing customer information, each record consisting of multiple rows (variable).  I need to coerce the existing data into a standardized format with a fixed number of rows, concatenating some cells into one and breaking others into separate rows.  While I don't think the VBA required to perform any one of these tasks may not be all that difficult, when I tried doing this myself I quickly hit a wall and gave up.  Perhaps someone here can assist...

The existing spreadsheet looks like this:
Existing Excel spreadsheet layout
Notice how the Address field can be comprised of either 3 or 4 lines (depending on whether or not there is a "Suite #) and the Business Classification(s) and Countries where work is performed fields can span multiple rows.  I need to coerce the existing spreadsheet such that:
the Address field is a concatenation of street address and optional suite # separated by a comma
the City, ST Zip information is separated into individually labeled rows (City, ST and Postal Code)
the Country info is separated into its own labeled row (Country)
the Business Classification(s) info is concatenated into one cell separated by commas
the Countries where work is performed info is concatenated into one cell separated by commas

I would like to get the resultant spreadsheet post-manipulation to have one customer record per row with each field (Name, Address, City, ST, Postal Code, Country, etc) in its own column.

The Excel spreadsheets I currently have comprise several hundreds of thousands of records spread across multiple spreadsheets.  Any assistance with the VBA necessary to massage the data into the desired format is greatly appreciated.  Thank you.
0
Comment
Question by:Lloyd Charlier
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 5
  • 3
  • +1
14 Comments
 
LVL 13

Expert Comment

by:Norm Dickinson
ID: 39943062
In the long run the best way to approach this kind of data is to convert/import it into Access.
0
 
LVL 7

Expert Comment

by:COACHMAN99
ID: 39943079
why not add another column and add formula to concatenate the fields?
e.g. address = B2 & B3 & B4
0
 

Author Comment

by:Lloyd Charlier
ID: 39943107
Norm,

Ultimately, all of this data will be imported into Access.  The only reason I asked for an Excel-to-Excel manipulation is that I thought (perhaps ignorantly) that it might be easier to implement.  Once in the desired Excel format, I already know how to import it into Access.  If you can provide guidance on the necessary VBA/queries to perform this import into Access directly, please have at it! <grin>

Thanks.
0
Important Lessons on Recovering from Petya

In their most recent webinar, Skyport Systems explores ways to isolate and protect critical databases to keep the core of your company safe from harm.

 
LVL 13

Expert Comment

by:Norm Dickinson
ID: 39943143
I think that to do it right, you will want to map out all of the data in a normalization map, so that you know ahead of time what fields and tables you will be putting the data into in order to achieve a workable relational database from your existing flat file. That said, manipulating the data in Excel is probably going to be a bit redundant in that regard and may be very difficult to pull off. For example, some records are going to vary in size by record, such as Countries where Work is Performed. Do you really want to take the time to put that into a comma delimited field in a spreadsheet only to have to break it back out into a table in Access? From the other perspective, this is exactly the kind of data import nearly any large database faces and is fairly common for an import. I think setting up the database with the best normalization possible and then fitting the data into the database is going to be far easier than manipulating the data as it sits and then importing it. Of course I am assuming that you are going to import into a relational database, as opposed to a flat file database, and that the benefits of the relational database concept are going to be realized.
0
 
LVL 31

Expert Comment

by:gowflow
ID: 39943146
ok lets take it one step at a time and be practical

Appreciate you correct if my understanding is wrong

1) Your present data is layed out in a (or several) worksheet using only 2 columns A and B like the picture you have attached.

2) if 1) is correct then all the records have the label in A and the data in B and they follow ie when record1 ends, record 2 starts after the last row of record 1)

3) if 2) is correct Are all the Fields have the exact same layout (for multiple rows Filed) like Address, Buisness Classifications, Country where work is Performed
They all start with the Label Address, Buisness Classification etc.. and if data span on a second row the label is blank, the next row label is blank and the data end when there is a new field. Please confirm also this pattern is all along the data. Like you will never have Address on row1 then Address on row 2 for the same record.

4) You need a VBA that will convert this multi/non table layout data in Data Like table format having 1 row per record and having all the fields in Columns like you very clearly explained all along in your post.

Regards
gowflow
0
 

Author Comment

by:Lloyd Charlier
ID: 39943237
gowflow:

Thanks for the response.  I understand your concerns and will try to clear things up a bit.

The data is currently laid out in two (2) columns only.  The number of rows for each customer varies (some have an address comprising two lines if they have a suite # - others just one).  All of the "Address" info (street, optional suite, city, st zip, country) is spread across multiple rows in the "B" column but fronted by a joined 'A' label cell (Address).  The same is true for "Business Classification(s)" and "Countries where work is performed" (i.e. the data spans one or more rows in the 'B' column but is fronted by a joined 'A' label cell.  While I understand and appreciate data normalization for the latter two fields (placing the possible values in their own tables), I suggested concatenation instead for the sake of expediency and the fact that I have no visibility into all of the possible values.  For what it's worth, the importance of the latter two fields is low.

Does this answer your questions or do you require additional clarification?
0
 
LVL 31

Expert Comment

by:gowflow
ID: 39943248
yes it does but best is to have a sample of your data (that you can alter to remove confidentiality) and the more you have the better it is and will build a macro for you.
gowflow
0
 
LVL 13

Expert Comment

by:Norm Dickinson
ID: 39943259
It is really a toss up as to manipulating the data in the current format as gowflow suggests or importing it into a new database and doing the manipulation there. The advantage to the gowflow method would be if you continue to use the data for a period of time prior to actually launching the database.
0
 

Author Comment

by:Lloyd Charlier
ID: 39943275
gowflow,

See attached file...  Thanks.
Engineering-Firms.xlsx
0
 
LVL 31

Expert Comment

by:gowflow
ID: 39943296
Perfect for the file,

One last re

•the City, ST Zip information is separated into individually labeled rows (City, ST and Postal Code)


•the Country info is separated into its own labeled row (Country)

the end result you want
Col A    Col B        Col C   Col D     Col E                Col F
Name   Address   City     State   Postal Code    Country

? pls advise for these (shall we break also Address in its constituants or leave the Adress altogether ?

gowflow
0
 

Author Comment

by:Lloyd Charlier
ID: 39943503
gowflow:
In a perfect world, I suppose the 2nd optional address line (suite, apt, unit, ...) should be stored separately ("Address2").  Practically speaking however, either approach would be fine.
0
 
LVL 31

Accepted Solution

by:
gowflow earned 2000 total points
ID: 39946561
Hi

Sorry it took sometime, as was a bit busy and needed to make sure data is converted correctly.

I have attached the macro in the file posted pls make sure macros are enabled and in sheet1 Activate the button Transfer Data to Table and check results in DataTable sheet.

Here is the code for that.

Sub CreateDataTable()
Dim WS As Worksheet
Dim WSTable As Worksheet
Dim MaxRow As Long, I As Long, J As Long
Dim Title As Variant, CSC As Variant

'---> Disable Events
With Application
    .EnableEvents = False
    .DisplayAlerts = False
    .ScreenUpdating = False
End With

Set WS = ActiveSheet
Set WSTable = Sheets("DataTable")
MaxRow = WS.Range("B" & WS.Rows.Count).End(xlUp).Row

'---> Clean DataTable
WSTable.Cells.Delete

'---> Generate Header
Title = Array("Name", "Address1", "Address2", "City", "State", "Postal Code", "Country", "Phone", "Fax", "Website", "Primary Contact", "Description", "Total Number of employees", "Business Classifications", "Countries where work is performed")
WSTable.Range(WSTable.Range("A1"), WSTable.Cells(1, 15)).Value = Title
WSTable.UsedRange.EntireColumn.AutoFit
J = 1

'---> Start Process Creating DataTable
For I = 1 To MaxRow
    
    Select Case WS.Cells(I, "A")
        Case "Name:", "Phone:", "Fax:", "Website:", "Primary Contact:", "Description:", "Total Number of employees:"
            
            '---> Increment Row
            If WS.Cells(I, "A") = "Name:" Then
                J = J + 1
                WSTable.UsedRange.EntireColumn.AutoFit
            End If
            
            For K = 0 To UBound(Title)
                If InStr(1, WS.Cells(I, "A"), Title(K)) <> 0 Then
                    Exit For
                End If
            Next K
            WSTable.Cells(J, K + 1) = WS.Cells(I, "B")
        
        Case "Address:"
            '---> Address1
            WSTable.Cells(J, 2) = WS.Cells(I, "B")
            I = I + 1
            
            '---> Address2
            If WS.Range("A" & I).End(xlDown).Row - I > 2 Then
            'If InStr(1, LCase(WS.Cells(I, "B")), "suite") <> 0 Then
                WSTable.Cells(J, 3) = WS.Cells(I, "B")
                I = I + 1
            End If
                
            '---> City, State, Postal
            '---> City is all before the comma
            CSC = Split(WS.Cells(I, "B"), ",")
            WSTable.Cells(J, 4) = CSC(0)
            
            CSC = Split(Trim(CSC(1)), " ")
            For K = LBound(CSC) To UBound(CSC)
                WSTable.Cells(J, K + 5) = CSC(K)
            Next K
            I = I + 1
            
            '---> Country
            WSTable.Cells(J, 7) = WS.Cells(I, "B")
            
        Case "Business Classifications:", "Countries where work is performed:"
            For K = 0 To UBound(Title)
                If InStr(1, WS.Cells(I, "A"), Title(K)) <> 0 Then
                    Exit For
                End If
            Next K
            Do
                If WSTable.Cells(J, K + 1) <> "" Then WSTable.Cells(J, K + 1) = WSTable.Cells(J, K + 1) & ", "
                WSTable.Cells(J, K + 1) = WSTable.Cells(J, K + 1) & WS.Cells(I, "B")
                I = I + 1
            Loop Until WS.Cells(I, "A") <> ""
            I = I - 1
            
    End Select

Next I

'---> Enable Events
With Application
    .EnableEvents = False
    .DisplayAlerts = False
    .ScreenUpdating = False
End With


MsgBox ("All data Transfered successfully for " & J & " records")

End Sub

Open in new window



Let me know your comments.
gowflow
Engineering-Firms.xlsm
0
 

Author Closing Comment

by:Lloyd Charlier
ID: 39957370
gowflow,

Your script works perfectly!  I have already run it against multiple spreadsheets, the resultant spreadsheets I subsequently imported into Access to build a 150000+ record database.  I need to tweak it a bit so as to accommodate some anomalous record layouts I've encountered but all-in-all your work will prove to be a real productivity boon for me going forward.  Thanks again for the great job!
0
 
LVL 31

Expert Comment

by:gowflow
ID: 39958262
Your welcome and glad I could help.
gowflow
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Some code to ensure data integrity when using macros within Excel. Also included code that helps secure your data within an Excel workbook.
: Microsoft Office Collaborate for free and online versions of Microsoft  Word, Excel, Powerpoint, OneNote, Onedrive , Email, Calendar etc. In short we can say that Microsoft office is a suite of servers, applications and services developed by  Micr…
Viewers will learn the basics of formula auditing in Excel 2013.
Viewers will learn the basics of the new Quick Analysis feature in Excel 2013.

610 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question