Solved

Split columns that have delimiters in either excel or access

Posted on 2010-08-18
16
380 Views
Last Modified: 2013-11-24
Hi all,  I have a problem where I have a csv file that has data separated by semi-colons.  Easy to import into excel or access and split you may think.  However, my data is formatted as follows:-

0100     Resident;Resident2     Title;Title2

Where the spaces (TAB) is the column delimiter and ; is the delimiter within the next column.  Finding it difficult to explain but I have a line where the first column contains the reference number for that property, but then the second column contains TWO Names split by the semi-colon, the third column contains those people's Title's and so on.  There are always the same amount of entries in the remaining columns as in the second column, so every resident will have a title etc, however the FIRST column only has the ONE reference number entry!!

I want to split/view the data so this it would read as follows:-

0100     Resident     Title
0100     Resident2   Title2

Hope this makes sense!  And hope it's possible!  Have attached sample CSV file.

I'm  no programmer but am pretty good with excel/access if someone could point me in the right direction!

Cheers
Test.csv
0
Comment
Question by:howsonhome
  • 8
  • 3
  • 3
16 Comments
 
LVL 45

Expert Comment

by:patrickab
Comment Utility
howsonhome,

In Excel, and in the File Open dialogue box select Text as the file type, find the file and then yYou need to use different text file import settings. Choose the 'delimited' option, press 'Next' and select 'commas' and 'semi-colon' as the delimiters, press OK. That will separate the data into sensible columns - see attached file.

Patrick


Test-csv-01.xls
0
 

Author Comment

by:howsonhome
Comment Utility
Hi Sorry that's not what is required.  I'm finding it difficult to make clear what I need so have done a manual version in an excel spreadsheet.  The first sheet is the ORIGINAL data, (the second sheet is where I can TRANSPOSE data into columns but not sure if this is any use anyway), and the last sheet is actually what I REQUIRE which I have just typed in manually.  So I'm trying to get the ORIGINAL sheet to look like the REQUIRED sheet without having to do it manually as I actually have around 16,000 records!

Many thanks.
Test2.xls
0
 
LVL 85

Expert Comment

by:Rory Archibald
Comment Utility
Try this - run LoadFile2
Sub LoadFile2()

    Const ForReading As Integer = 1

    Dim varFileName, avarData, varData

    Dim lngRow As Long, lngColumnCount As Long, lngIndex As Long, lngRecCount As Long

    Dim x As Long, y As Long

    Dim fso As Object, tsrStream1 As Object

    Dim shtNew As Worksheet

    

    Set shtNew = ActiveWorkbook.Sheets.Add

    

    varFileName = Application.GetOpenFilename("CSV Files (*.csv), *.csv", , "Select source file")

    

    If TypeName(varFileName) = "Boolean" Then Exit Sub

    

    Set fso = CreateObject("Scripting.FileSystemObject")

    lngRow = 1

    Set tsrStream1 = fso.OpenTextFile(varFileName, ForReading, False)

    avarData = Split(tsrStream1.ReadAll, vbCrLf)

    tsrStream1.Close

    Application.ScreenUpdating = False

    For lngIndex = LBound(avarData) To UBound(avarData)

        If Len(avarData(lngIndex)) > 0 Then

            varData = GetRowData(CStr(avarData(lngIndex)))

            lngRecCount = UBound(varData, 1)

            lngColumnCount = UBound(varData, 2)

            shtNew.Cells(lngRow, "A").Resize(lngRecCount, lngColumnCount).Value = varData

            lngRow = lngRow + lngRecCount

        End If

    Next lngIndex

    Application.ScreenUpdating = True

End Sub

Function GetRowData(strIn As String) As Variant

    Dim varData, varRecs

    Dim avarOutput()

    Dim lngRecCount As Long

    Dim x As Long, y As Long

    

    varData = Split(strIn, ",")

    ' get rec count from second field

    varRecs = Split(varData(1), ";")

    lngRecCount = 0

    For x = LBound(varRecs) To UBound(varRecs)

        If Len(varRecs(x)) > 0 Then lngRecCount = lngRecCount + 1

    Next x

    If lngRecCount = 0 Then lngRecCount = 1

    ReDim avarOutput(1 To lngRecCount, 1 To UBound(varData) + 1)

    

    ' now iterate the row data

    For x = LBound(varData) To UBound(varData)

        If InStr(varData(x), ";") > 0 Then

            varRecs = Split(varData(x), ";")

            For y = 1 To lngRecCount

                avarOutput(y, x + 1) = varRecs(y - 1)

            Next y

        Else

            For y = 1 To lngRecCount

                avarOutput(y, x + 1) = varData(x)

            Next y

        End If

    Next x

    GetRowData = avarOutput

End Function

Open in new window

0
 
LVL 45

Expert Comment

by:patrickab
Comment Utility
howsonhome,

OK, I understand except that the text file that you provided did not have all that data in it. Can you upload a text file with several sets of data - as shown in the Excel file. If each line has the same number of pieces of data, even if some are blank, we should be able to parse the data and make sense of it. So over to you for that new text file.

Patrick
0
 
LVL 45

Expert Comment

by:patrickab
Comment Utility
Seems rorya has got there in one go. I'll bow out at this stage...
0
 

Author Comment

by:howsonhome
Comment Utility
Hi Rorya, thanks for that, it works against the sample data I uploaded, but not against my full file.  I'll attach a copy of an entire record for you to see all the headings as it is slightly different!

the file Truefile.csv is one where I export the data from a progress database into a csv file.  It has  a delimiter of "}".  If you can amend the code so it does it straight from this then that would be fab!

the Amendedfile.csv is the one I do a bit of work on to repeat the TENANCYREF 8 times in that column and have already split the columns by the "}" delimiter.  It's a drag having to do this beforehand but if it's easier to amend the code using this format than that's great too!

Hope this is clear!

There are nearly 16,000 rows in the original file but as the data is sensitive I can't send you all the rows.  All the columns and headings are there now and I have amended the data to two rows so I'm not breaking any laws!
Many thanks!

Sorry but the data is very sensitive which is why I've been giving examples.
Truefile.csv
Amendedfile.csv
0
 
LVL 85

Expert Comment

by:Rory Archibald
Comment Utility
Just change the fifth line in GetRowData from this:
    varData = Split(strIn, ",")


to this:
    varData = Split(strIn, "}")


Regards,
Rory
0
Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

 

Author Comment

by:howsonhome
Comment Utility
I get a 'Subscript out of Range' error on the 16,000 record file but it works on those examples so am going to presume there's an error in the data somewhere in the large file, will take a look!

Apart from that YOU'RE A GENIUS!!

Many thanks,
Debs
0
 

Author Comment

by:howsonhome
Comment Utility
Hi again, the procedure stops at a certain line as in the attached file below but I can't fathom out why, any help would be appreciated!  Thanks


SHEET.txt
0
 

Author Comment

by:howsonhome
Comment Utility
It's the first line the programme encounters that actually has the full 8 entries in some fields if this helps!  Thanks.
0
 
LVL 85

Accepted Solution

by:
Rory Archibald earned 500 total points
Comment Utility
You have a mismatch at the end of that record. There are 8 entries (i.e. 7 semicolons) for every field except the last which only has 6. Try this revised version of the GetRowData function:



Function GetRowData(strIn As String) As Variant

    Dim varData, varRecs

    Dim avarOutput()

    Dim lngRecCount As Long

    Dim x As Long, y As Long

    

    varData = Split(strIn, "}")

    ' get rec count from second field

    varRecs = Split(varData(1), ";")

    lngRecCount = 0

    For x = LBound(varRecs) To UBound(varRecs)

        If Len(varRecs(x)) > 0 Then lngRecCount = lngRecCount + 1

    Next x

    If lngRecCount = 0 Then lngRecCount = 1

    ReDim avarOutput(1 To lngRecCount, 1 To UBound(varData) + 1)

    

    ' now iterate the row data

    For x = LBound(varData) To UBound(varData)

        If InStr(varData(x), ";") > 0 Then

            varRecs = Split(varData(x), ";")

            If lngRecCount > UBound(varRecs) + 1 Then lngRecCount = UBound(varRecs) + 1

            For y = 1 To lngRecCount

                avarOutput(y, x + 1) = varRecs(y - 1)

            Next y

        Else

            For y = 1 To lngRecCount

                avarOutput(y, x + 1) = varData(x)

            Next y

        End If

    Next x

    GetRowData = avarOutput

End Function

Open in new window

0
 

Author Comment

by:howsonhome
Comment Utility
I took that line out of the file and run it again, it went much further until it once again encountered a line with the full 8 entries.  Unfortunatly I don't know VB enough to know how to change the programme but this appears to be the problem.  Regards, Deborah
0
 

Author Comment

by:howsonhome
Comment Utility
Sorry, have been ill in hospital, can you reopen this and I'll rate it, thanks
0
 

Author Comment

by:howsonhome
Comment Utility
See above
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

MS Access 2003 or later To MySQL Migration Project Hello All, this is my second article in the category of MS-OFFICE Automation. In internet I am not able to find any comprehensive resource on the Migration of MS Access back-end to MySQL so I fin…
Technology opened people to different means of presenting information, but PowerPoint remains to be above competition. Know why PPT still works today.
The viewer will learn how to create a normally distributed random variable in Excel, use a normal distribution to simulate the return on an investment over a period of years, Create a Monte Carlo simulation using a normal random variable, and calcul…
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

8 Experts available now in Live!

Get 1:1 Help Now