Link to home
Start Free TrialLog in
Avatar of jskfan
jskfanFlag for Cyprus

asked on

Format Text file to Excel

Format Text file  to Excel.

I have a text file that looks like this one below.

Client:            VM1
Backup ID:       VM1
Policy:            WINDOWS_VM1
Policy Type:       MS-Windows (13)
Proxy Client:      (none specified)
Creator:           root
Name1:             (none specified)
Sched Label:       WEEKLY
Schedule Type:     FULL (0)
Retention Level:   1 year (8)
Backup Time:           2/3/2018 12:00:07 AM (1517634007)
Elapsed Time:      4748 second(s)
Expiration Time:       2/3/2019 12:00:07 AM (1549170007)
Maximum Expiration Time:       2/3/2019 12:00:07 AM (1549170007)
Compressed:        no
Client Encrypted:  no
Kilobytes:         23819641
Number of Files:   40278
Number of Copies:  1
Number of Fragments:   1
Histogram:         0 0 0 0 0 0 0 0 0 0
DB Compressed:     no
Files File Name:    WINDOWS_VM1
Previous Backup Files File Name:   (none specified)
Parent Backup Image File Name:   (none specified)
SW Version:        (none specified)
Options:           0x0
MPX:               1
TIR Info:          0
TIR Expiration:       12/31/1969 7:00:00 PM (0)
Keyword:           (none specified)
Media Descriptor:        1;Dom;MAR-MM33-324PA;MAR-MM33-324PA;MAR-MM33-324PA-MMU;0

This output is just for Client:            VM1
then there will be space ( few lines) in the text file, then the re will be information for Client:            VM2


Client:            VM2
Backup ID:       VM2
Policy:            WINDOWS_VM2
Policy Type:       MS-Windows (13)
Proxy Client:      (none specified)
Creator:           root
Name1:             (none specified)
....

etc...








I want to see it in Excel with the fields below as column headers:

Client:          
Backup ID:      
Policy:            
Policy Type:        
Proxy Client:      
Creator:          
Name1:            

....

etc...



Thank you
Avatar of Shaun Vermaak
Shaun Vermaak
Flag of Australia image

How is this data generated? Can you change the format in the source?

In Excel, this involves Text to Columns and Transposing Data but this is a lot easier to do if the data is already lines in the source.
Why not use Power Query to get the date from text file into the Excel? It's quite easy task for Power Query to get the data in desired format.
please post a representative sample of the file
Avatar of jskfan

ASKER

Avatar of jskfan

ASKER

how to use Power Query to get  data ? is it included in  excel ?
Is the file actually an RTF or some other format, such as TXT?
Power Query is a free Add-In by MS which is builtin in Excel 2016 but can be downloaded from MS site if you have Excel 2010 and later.
Once you download and install Power Query (in case you have 2010 or 2013 version), go to Power Query Tab and insert a Blank Query and paste the query given below into the Advanced Editor, click on Done and click on Close and Load on Home tab to load the data on the sheet.

I am assuming that the file is a Text file and don't forget to change the file path in the query given below in the following line.

Source = Csv.Document(File.Contents("C:\Users\sktneer\Desktop\texttoexcel.txt"),[Delimiter=";", Columns=6, Encoding=1252,

let
    Source = Csv.Document(File.Contents("C:\Users\sktneer\Desktop\texttoexcel.txt"),[Delimiter=";", Columns=6, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", Int64.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column2", "Column3", "Column4", "Column5", "Column6"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns","Column1",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"Column1.1", "Column1.2", "Column1.3", "Column1.4"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}, {"Column1.3", Int64.Type}, {"Column1.4", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type1",{{"Column1.2", Text.Trim}}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Trimmed Text",{{"Column1.3", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type2", "Values", each if [Column1.3] = null then [Column1.2] else [Column1.2] & ":0" & [Column1.3] & ":" & [Column1.4]),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1.2", "Column1.3", "Column1.4"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Columns1", each ([Column1.1] <> "")),
    #"Added Custom1" = Table.AddColumn(#"Filtered Rows", "Headers", each if [Column1.1] = "Client" then "Client" else if [Column1.1] = "Backup ID" then "Backup ID" else if [Column1.1] = "Policy" then "Policy" else if [Column1.1] = "Policy Type" then "Policy Type" else if [Column1.1] = "Proxy Client" then "Proxy Client" else if [Column1.1] = "Creator" then "Creator" else if [Column1.1] = "Name1" then "Name1" else null),
    #"Filtered Rows1" = Table.SelectRows(#"Added Custom1", each ([Headers] <> null)),
    #"Removed Columns2" = Table.RemoveColumns(#"Filtered Rows1",{"Headers"}),
    #"Added Index" = Table.AddIndexColumn(#"Removed Columns2", "Index", 0, 1),
    #"Pivoted Column" = Table.Pivot(#"Added Index", List.Distinct(#"Added Index"[Column1.1]), "Column1.1", "Values"),
    #"Filled Up" = Table.FillUp(#"Pivoted Column",{"Backup ID", "Policy", "Policy Type", "Proxy Client", "Creator", "Name1"}),
    #"Filtered Rows2" = Table.SelectRows(#"Filled Up", each ([Client] <> null)),
    #"Removed Columns3" = Table.RemoveColumns(#"Filtered Rows2",{"Index"})
in
    #"Removed Columns3"

Open in new window


Watch this short video demo to know that how to insert a blank query.
PowerQueryDemo.mp4
Avatar of jskfan

ASKER

Subodh Tiwari (Neeraj)

It worked for me . Thank you very much for your Help.
There is one thing I could not figure out. The Spreadsheet will show just  6 columns , as the one you showed in the video.  it should show more
Ah my bad. Actually I didn't read the line which says etc... and I thought that you are interested only in six columns.

Please replace the existing query with the following one.

To replace the query, follow these steps..

1) On the sheet where the data is extracted, you will see Workbook Queries Pane.
2) Right click on the query and choose Edit.
3) On Query Editor, go to View Tab and click on Advanced Editor and replace the existing query with the query given below.

New Query:

let
    Source = Csv.Document(File.Contents("C:\Users\sktneer\Desktop\texttoexcel.txt"),[Delimiter=";", Columns=6, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", Int64.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column2", "Column3", "Column4", "Column5", "Column6"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns","Column1",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"Column1.1", "Column1.2", "Column1.3", "Column1.4"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}, {"Column1.3", Int64.Type}, {"Column1.4", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type1",{{"Column1.2", Text.Trim}}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Trimmed Text",{{"Column1.3", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type2", "Values", each if [Column1.3] = null then [Column1.2] else [Column1.2] & ":0" & [Column1.3] & ":" & [Column1.4]),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1.2", "Column1.3", "Column1.4"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Columns1", each ([Column1.1] <> "")),
    #"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
    #"Pivoted Column" = Table.Pivot(#"Added Index", List.Distinct(#"Added Index"[Column1.1]), "Column1.1", "Values"),
    #"Filled Up" = Table.FillUp(#"Pivoted Column",{"Backup ID", "Policy", "Policy Type", "Proxy Client", "Creator", "Name1", "Sched Label", "Schedule Type", "Retention Level", "Backup Time", "Elapsed Time", "Expiration Time", "Maximum Expiration Time", "Compressed", "Client Encrypted", "Kilobytes", "Number of Files", "Number of Copies", "Number of Fragments", "Histogram", "DB Compressed", "Files File Name", "Previous Backup Files File Name", "Parent Backup Image File Name", "SW Version", "Options", "MPX", "TIR Info", "TIR Expiration", "Keyword", "Media Descriptor"}),
    #"Filtered Rows1" = Table.SelectRows(#"Filled Up", each ([Client] <> null))
in
    #"Filtered Rows1"

Open in new window

Avatar of jskfan

ASKER

I did. I am not sure why I get the error shown in the screenshot:
User generated image
That means the data you are trying to extract doesn't contain a column Media Descriptor. Is that true?
Did you run the query on the sample file you provided? It worked when I tested it.
Avatar of jskfan

ASKER

It is there, if you open the attached file, you can see it
But it worked with the data you posted in the rtf file which I saved as a text file.
Are you editing the existing query?
Try opening a blank workbook and insert a blank query using the latest query I posted as I showed in the demo video.
Avatar of jskfan

ASKER

I attached clean one
Avatar of jskfan

ASKER

Base on this attached text file : texttoexcel3.txt
SOLUTION
Avatar of Subodh Tiwari (Neeraj)
Subodh Tiwari (Neeraj)
Flag of India image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
So, the input is NOT an rtf file?
Avatar of jskfan

ASKER

I am not sure  how it turned out rtf in EE.  originally it was txt.
Avatar of jskfan

ASKER

User generated image
If you look at the "Media Descriptor" column it shows  a number "1" instead of the real Data that is in the text file.
am not sure why
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
You are right. The Media Descriptor column was not having the correct data.

Please try this tweaked query...
let
    Source = Csv.Document(File.Contents("C:\Users\sktneer\Desktop\texttoexcel3.txt"),[Delimiter=";", Columns=6, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Merged Columns" = Table.CombineColumns(Source,{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Merged Columns","Merged",Splitter.SplitTextByEachDelimiter({":"}, QuoteStyle.Csv, false),{"Merged.1", "Merged.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged.1", type text}, {"Merged.2", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Merged.2", Text.Trim}}),
    #"Filtered Rows" = Table.SelectRows(#"Trimmed Text", each ([Merged.1] <> "")),
    #"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
    #"Pivoted Column" = Table.Pivot(#"Added Index", List.Distinct(#"Added Index"[Merged.1]), "Merged.1", "Merged.2"),
    #"Filled Up" = Table.FillUp(#"Pivoted Column",{"Backup ID", "Policy", "Policy Type", "Proxy Client", "Creator", "Name1", "Sched Label", "Schedule Type", "Retention Level", "Backup Time", "Elapsed Time", "Expiration Time", "Maximum Expiration Time", "Compressed", "Client Encrypted", "Kilobytes", "Number of Files", "Number of Copies", "Number of Fragments", "Histogram", "DB Compressed", "Files File Name", "Previous Backup Files File Name", "Parent Backup Image File Name", "SW Version", "Options", "MPX", "TIR Info", "TIR Expiration", "Keyword", "Ext Security Info", "File Restore Raw", "Image Dump Level", "File System Only", "Object Descriptor", "Previous BI Time", "BI Full Time", "Request Pid", "Backup Status", "Stream Number", "Backup Copy", "Files File size", "PFI type", "IMAGE_ATTRIBUTE", "Primary Copy", "Image Type", "Job ID", "Num Resumes", "Resume Expiration", "Data Classification", "Data_Classification_ID", "Storage Lifecycle Policy", "Storage Lifecycle Policy Version", "STL_Completed", "Remote Expiration Time", "Origin Master Server", "Origin Master GUID", "Snap Time", "IR Enabled", "Client Character Set", "Image On Hold", "Kilobytes Data Transferred", "Copy number", " Fragment", " Kilobytes", " Remainder", " Media Type", " Density", " File Num", " ID", " Host", " Block Size", " Offset", " Media Date", " Dev Written On", " Flags", " Media Descriptor", " Expiration Time", " MPX", " retention_lvl", " Try to Keep Time", " Copy Creation Time", " Data Format", " checkpoint", " resume num", " Key tag", " STL tag", " Copy on hold"}),
    #"Filtered Rows1" = Table.SelectRows(#"Filled Up", each ([Client] = "VM1"))
in
    #"Filtered Rows1"

Open in new window

Avatar of jskfan

ASKER

Wow, this code from :aikimark is wonderful. It worked perfect.


Subodh Tiwari (Neeraj) : Thank you for your Hard work and appreciate your time.
I see your last query. I realized you hard coded it just for VM1. in fact I left it that just to make or info not public. it can be VM1 ,BBB365,etc...
No problem. :)

The code offered by Aikimark is a robust one and would take care of all your text files.

Glad we could help and your issue has been resolved. :)
Avatar of jskfan

ASKER

Thank you very much for your  Big Help Guys!
You're welcome Jskfan! Glad we could help.