asked on

Format Text file to Excel

Format Text file to Excel.

I have a text file that looks like this one below.

Client: VM1
Backup ID: VM1
Policy: WINDOWS_VM1
Policy Type: MS-Windows (13)
Proxy Client: (none specified)
Creator: root
Name1: (none specified)
Sched Label: WEEKLY
Schedule Type: FULL (0)
Retention Level: 1 year (8)
Backup Time: 2/3/2018 12:00:07 AM (1517634007)
Elapsed Time: 4748 second(s)
Expiration Time: 2/3/2019 12:00:07 AM (1549170007)
Maximum Expiration Time: 2/3/2019 12:00:07 AM (1549170007)
Compressed: no
Client Encrypted: no
Kilobytes: 23819641
Number of Files: 40278
Number of Copies: 1
Number of Fragments: 1
Histogram: 0 0 0 0 0 0 0 0 0 0
DB Compressed: no
Files File Name: WINDOWS_VM1
Previous Backup Files File Name: (none specified)
Parent Backup Image File Name: (none specified)
SW Version: (none specified)
Options: 0x0
MPX: 1
TIR Info: 0
TIR Expiration: 12/31/1969 7:00:00 PM (0)
Keyword: (none specified)
Media Descriptor: 1;Dom;MAR-MM33-324PA;MAR-MM33-324PA;MAR-MM33-324PA-MMU;0

This output is just for Client: VM1
then there will be space ( few lines) in the text file, then the re will be information for Client: VM2

Client: VM2
Backup ID: VM2
Policy: WINDOWS_VM2
Policy Type: MS-Windows (13)
Proxy Client: (none specified)
Creator: root
Name1: (none specified)
....

etc...

I want to see it in Excel with the fields below as column headers:

Client:
Backup ID:
Policy:
Policy Type:
Proxy Client:
Creator:
Name1:

....

etc...

Thank you

Shaun Vermaak

How is this data generated? Can you change the format in the source?

In Excel, this involves Text to Columns and Transposing Data but this is a lot easier to do if the data is already lines in the source.

Subodh Tiwari (Neeraj)

Why not use Power Query to get the date from text file into the Excel? It's quite easy task for Power Query to get the data in desired format.

aikimark

please post a representative sample of the file

jskfan

ASKER

texttoexcel.rtf

jskfan

ASKER

how to use Power Query to get data ? is it included in excel ?

aikimark

Is the file actually an RTF or some other format, such as TXT?

Subodh Tiwari (Neeraj)

Power Query is a free Add-In by MS which is builtin in Excel 2016 but can be downloaded from MS site if you have Excel 2010 and later.
Once you download and install Power Query (in case you have 2010 or 2013 version), go to Power Query Tab and insert a Blank Query and paste the query given below into the Advanced Editor, click on Done and click on Close and Load on Home tab to load the data on the sheet.

I am assuming that the file is a Text file and don't forget to change the file path in the query given below in the following line.

Source = Csv.Document(File.Contents("C:\Users\sktneer\Desktop\texttoexcel.txt"),[Delimiter=";", Columns=6, Encoding=1252,

let
    Source = Csv.Document(File.Contents("C:\Users\sktneer\Desktop\texttoexcel.txt"),[Delimiter=";", Columns=6, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", Int64.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column2", "Column3", "Column4", "Column5", "Column6"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns","Column1",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"Column1.1", "Column1.2", "Column1.3", "Column1.4"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}, {"Column1.3", Int64.Type}, {"Column1.4", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type1",{{"Column1.2", Text.Trim}}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Trimmed Text",{{"Column1.3", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type2", "Values", each if [Column1.3] = null then [Column1.2] else [Column1.2] & ":0" & [Column1.3] & ":" & [Column1.4]),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1.2", "Column1.3", "Column1.4"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Columns1", each ([Column1.1] <> "")),
    #"Added Custom1" = Table.AddColumn(#"Filtered Rows", "Headers", each if [Column1.1] = "Client" then "Client" else if [Column1.1] = "Backup ID" then "Backup ID" else if [Column1.1] = "Policy" then "Policy" else if [Column1.1] = "Policy Type" then "Policy Type" else if [Column1.1] = "Proxy Client" then "Proxy Client" else if [Column1.1] = "Creator" then "Creator" else if [Column1.1] = "Name1" then "Name1" else null),
    #"Filtered Rows1" = Table.SelectRows(#"Added Custom1", each ([Headers] <> null)),
    #"Removed Columns2" = Table.RemoveColumns(#"Filtered Rows1",{"Headers"}),
    #"Added Index" = Table.AddIndexColumn(#"Removed Columns2", "Index", 0, 1),
    #"Pivoted Column" = Table.Pivot(#"Added Index", List.Distinct(#"Added Index"[Column1.1]), "Column1.1", "Values"),
    #"Filled Up" = Table.FillUp(#"Pivoted Column",{"Backup ID", "Policy", "Policy Type", "Proxy Client", "Creator", "Name1"}),
    #"Filtered Rows2" = Table.SelectRows(#"Filled Up", each ([Client] <> null)),
    #"Removed Columns3" = Table.RemoveColumns(#"Filtered Rows2",{"Index"})
in
    #"Removed Columns3"

Open in new window

Watch this short video demo to know that how to insert a blank query.
PowerQueryDemo.mp4

jskfan

ASKER

Subodh Tiwari (Neeraj)

It worked for me . Thank you very much for your Help.
There is one thing I could not figure out. The Spreadsheet will show just 6 columns , as the one you showed in the video. it should show more

Subodh Tiwari (Neeraj)

Ah my bad. Actually I didn't read the line which says etc... and I thought that you are interested only in six columns.

Please replace the existing query with the following one.

To replace the query, follow these steps..

1) On the sheet where the data is extracted, you will see Workbook Queries Pane.
2) Right click on the query and choose Edit.
3) On Query Editor, go to View Tab and click on Advanced Editor and replace the existing query with the query given below.

New Query:

let
    Source = Csv.Document(File.Contents("C:\Users\sktneer\Desktop\texttoexcel.txt"),[Delimiter=";", Columns=6, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", Int64.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column2", "Column3", "Column4", "Column5", "Column6"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns","Column1",Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),{"Column1.1", "Column1.2", "Column1.3", "Column1.4"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}, {"Column1.3", Int64.Type}, {"Column1.4", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type1",{{"Column1.2", Text.Trim}}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Trimmed Text",{{"Column1.3", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type2", "Values", each if [Column1.3] = null then [Column1.2] else [Column1.2] & ":0" & [Column1.3] & ":" & [Column1.4]),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1.2", "Column1.3", "Column1.4"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Columns1", each ([Column1.1] <> "")),
    #"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
    #"Pivoted Column" = Table.Pivot(#"Added Index", List.Distinct(#"Added Index"[Column1.1]), "Column1.1", "Values"),
    #"Filled Up" = Table.FillUp(#"Pivoted Column",{"Backup ID", "Policy", "Policy Type", "Proxy Client", "Creator", "Name1", "Sched Label", "Schedule Type", "Retention Level", "Backup Time", "Elapsed Time", "Expiration Time", "Maximum Expiration Time", "Compressed", "Client Encrypted", "Kilobytes", "Number of Files", "Number of Copies", "Number of Fragments", "Histogram", "DB Compressed", "Files File Name", "Previous Backup Files File Name", "Parent Backup Image File Name", "SW Version", "Options", "MPX", "TIR Info", "TIR Expiration", "Keyword", "Media Descriptor"}),
    #"Filtered Rows1" = Table.SelectRows(#"Filled Up", each ([Client] <> null))
in
    #"Filtered Rows1"

Open in new window

jskfan

ASKER

I did. I am not sure why I get the error shown in the screenshot:

Subodh Tiwari (Neeraj)

That means the data you are trying to extract doesn't contain a column Media Descriptor. Is that true?
Did you run the query on the sample file you provided? It worked when I tested it.

jskfan

ASKER

It is there, if you open the attached file, you can see it

Subodh Tiwari (Neeraj)

But it worked with the data you posted in the rtf file which I saved as a text file.
Are you editing the existing query?
Try opening a blank workbook and insert a blank query using the latest query I posted as I showed in the demo video.

Subodh Tiwari (Neeraj)

Here is the demo.
PowerQueryDemo2.mp4

jskfan

ASKER

texttoexcel2.txt

jskfan

ASKER

I attached clean one

jskfan

ASKER

texttoexcel3.txt

jskfan

ASKER

Base on this attached text file : texttoexcel3.txt

SOLUTION

Subodh Tiwari (Neeraj)

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

aikimark

So, the input is NOT an rtf file?

jskfan

ASKER

I am not sure how it turned out rtf in EE. originally it was txt.

jskfan

ASKER

If you look at the "Media Descriptor" column it shows a number "1" instead of the real Data that is in the text file.
am not sure why

ASKER CERTIFIED SOLUTION

aikimark

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Subodh Tiwari (Neeraj)

You are right. The Media Descriptor column was not having the correct data.

Please try this tweaked query...

let
    Source = Csv.Document(File.Contents("C:\Users\sktneer\Desktop\texttoexcel3.txt"),[Delimiter=";", Columns=6, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Merged Columns" = Table.CombineColumns(Source,{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Merged Columns","Merged",Splitter.SplitTextByEachDelimiter({":"}, QuoteStyle.Csv, false),{"Merged.1", "Merged.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged.1", type text}, {"Merged.2", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Merged.2", Text.Trim}}),
    #"Filtered Rows" = Table.SelectRows(#"Trimmed Text", each ([Merged.1] <> "")),
    #"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
    #"Pivoted Column" = Table.Pivot(#"Added Index", List.Distinct(#"Added Index"[Merged.1]), "Merged.1", "Merged.2"),
    #"Filled Up" = Table.FillUp(#"Pivoted Column",{"Backup ID", "Policy", "Policy Type", "Proxy Client", "Creator", "Name1", "Sched Label", "Schedule Type", "Retention Level", "Backup Time", "Elapsed Time", "Expiration Time", "Maximum Expiration Time", "Compressed", "Client Encrypted", "Kilobytes", "Number of Files", "Number of Copies", "Number of Fragments", "Histogram", "DB Compressed", "Files File Name", "Previous Backup Files File Name", "Parent Backup Image File Name", "SW Version", "Options", "MPX", "TIR Info", "TIR Expiration", "Keyword", "Ext Security Info", "File Restore Raw", "Image Dump Level", "File System Only", "Object Descriptor", "Previous BI Time", "BI Full Time", "Request Pid", "Backup Status", "Stream Number", "Backup Copy", "Files File size", "PFI type", "IMAGE_ATTRIBUTE", "Primary Copy", "Image Type", "Job ID", "Num Resumes", "Resume Expiration", "Data Classification", "Data_Classification_ID", "Storage Lifecycle Policy", "Storage Lifecycle Policy Version", "STL_Completed", "Remote Expiration Time", "Origin Master Server", "Origin Master GUID", "Snap Time", "IR Enabled", "Client Character Set", "Image On Hold", "Kilobytes Data Transferred", "Copy number", " Fragment", " Kilobytes", " Remainder", " Media Type", " Density", " File Num", " ID", " Host", " Block Size", " Offset", " Media Date", " Dev Written On", " Flags", " Media Descriptor", " Expiration Time", " MPX", " retention_lvl", " Try to Keep Time", " Copy Creation Time", " Data Format", " checkpoint", " resume num", " Key tag", " STL tag", " Copy on hold"}),
    #"Filtered Rows1" = Table.SelectRows(#"Filled Up", each ([Client] = "VM1"))
in
    #"Filtered Rows1"

Open in new window

jskfan

ASKER

Wow, this code from :aikimark is wonderful. It worked perfect.

Subodh Tiwari (Neeraj) : Thank you for your Hard work and appreciate your time.
I see your last query. I realized you hard coded it just for VM1. in fact I left it that just to make or info not public. it can be VM1 ,BBB365,etc...

Subodh Tiwari (Neeraj)

No problem. :)

The code offered by Aikimark is a robust one and would take care of all your text files.

Glad we could help and your issue has been resolved. :)

jskfan

ASKER

Thank you very much for your Big Help Guys!

Subodh Tiwari (Neeraj)

You're welcome Jskfan! Glad we could help.