<

The Mystery of the Lost Leading Zeros

Published on
22,637 Points
11,637 Views
15 Endorsements
Last Modified:
Awarded
Editor's Choice
What really happens to your leading zeros when you save your data in .csv format…

Almost everyone today uses Microsoft Excel to collect their contact information in columns and rows called spreadsheets. With this format, important columns can easily be highlighted and complicated calculations can be executed and updated in real time (see Figure 1). In this data-driven age, a fully-functional spreadsheet is not just convenient, it’s necessary.

Figure 1
So why is it, then, that when you go to send your beautifully stylized, precisely calculated spreadsheet to your data handler, they prefer to receive it in .csv format?

The .csv (or comma-separated-values) file format has seemingly been around forever. It is practically the most basic, human-readable format for raw data; second only to the fixed-length format. It is a simple collection of values – those values being the words or numbers that are displayed in each cell of your spreadsheet. Each row of your spread sheet is represented by a line in the .csv file and the values for each cell are separated by commas. There are no colors or type styles and no formulas to be calculated – just values. After all, the program used by the data handler doesn’t need to recalculate each formula when the spreadsheet has already done that for them. And the program doesn’t need different styles to recognize certain columns. It doesn’t prioritize columns; it handles each column of information with equal significance.

So you comply with your data handler by opening your spreadsheet in Excel and executing the “save as” function, selecting the “save as type” of CSV (Comma delimited) (*.csv).

Now you need to convince yourself that everything saved correctly so you double-click the file to see that all of your data is there. After all, you don’t want the data to go to the data handler wrong. What you find there may be ugly—the styles have all disappeared and the formulas have been replaced by their results—but it appears that the data is all there. (see Figure 2)

Figure 2
However, if you look closely enough, you’ll notice that all your New England zip codes and your eight-digit birthdates have lost their leading zeros. Or maybe you have numeric identification codes that originally began with zeros which now start with the first non-zero number instead.

But why are my leading zeros missing? After all, it’s just another Excel format. Right?

Wrong! You’ve been duped…

Actually, .csv files are simple ASCII text files and should be opened with a text editor application such as Notepad or TextEdit. But when you install Microsoft Excel, Excel claims the file format as its very own and changes the double-click behavior of your computer to open this type of file in Excel even though it is NOT a native Excel file format.

What really happens when you double-click a .csv file is that Excel imports it into an empty spreadsheet without displaying any messages or menus to indicate such. Because of the covert nature of the import, most users are misled into believing they are viewing the native file and incorrectly assume that the leading zeros are lost.

The fact is, the leading zeros are included in the .csv file but are eliminated when the .csv file is covertly imported into the empty Excel spreadsheet when you double-clicked the file. You can confirm this by opening the .csv file in a text editor such as Notepad for Windows or TextEdit for MacOS. (see Figure 3)

Figure 3
In Figure 3, the zip codes have been highlighted in yellow, the birth dates in green, to help you see that the leading zeros are actually there. This is what the data handler would see – without the highlighting, of course. The rows of values from the Excel file are on separate lines. The values from each cell are separated, or delimited, by commas.

You can be confident now that the data you’re sending to the data handler is complete, though not so visually appealing and organized as your Excel file.

Beware the middle man...or woman... ah...person…

It seems the person who generates the .csv file is rarely the person who delivers the file to the data handler. All too often, the .csv file is passed off to a purchasing agent or project manager who, wisely, decides to double-check the file to see that is has the correct information. They, too, will double-click the .csv file to peruse it’s contents. Because the data was actually imported, Excel considers the file to be a new file so it will ask if you want to save changes when the file is closed.

Never save changes when opening a .csv file by double-clicking!

Few people will question why they are being asked to save changes. Almost everyone will click the save button and then confirm that when the next screen pops up warning that the file “may contain features that are not compatible with CSV.” The leading zeros are now truly lost. The only way to recover them is to go back to the original Excel file and resave it as .csv.

But what if your data processor presents a .csv file to you for your own project? How do you get your comma-delimited data into Excel with leading zeros intact?

It is actually fairly easy to import your .csv file into Excel without losing your leading zeros. The only real problem with double-clicking the .csv file is that the import that takes place sets the column data format to “General” for all cells. In a cell with a “General” format, if it looks like a number, it’s treated like a number. In order to set the column to a different format before importing the data, you must perform a manual import.

The following instructions are for Excel 2007. Earlier versions of Excel would be very similar, though the menus may be in entirely different locations.

1.

If Excel is not already running, launch Excel from the Start menu.

2.

If you just launched Excel, it will be open with a blank workbook already created so you may skip to Step 3. If Excel was already running it may be necessary to create a blank workbook as follows: (See Figure 4)
Figure 4Select the File tab at the top of the window and select New from the left menu bar.
From the Available Templates window, choose Blank workbook and click the Create button.

3.

Click the Data tab at the top of the window and select From Text from the Get External Data drop-down or section of the top tool bar. (See Figure 5)
Figure 5

4.

When the Import Text File window appears, navigate to and select the .csv file you wish to import.

5.

When the Text Import Wizard window, Step 1 of 3, appears, be sure that Delimited is selected, then click the Next button. (See Figure 6-1)
Figure 6

6.

In the Text Import Wizard window, Step 2 of 3, choose Comma under the list of Delimiters, and choose the double quote (") as the Text qualifier, then click the Next button. (See Figure 6-2)

7.

In the Text Import Wizard window, Step 3 of 3, select the column(s) with the leading zeros and change the Column data format to Text, then click the Next button. (See Figure 6-3)

8.

In the Import Data window that follows, click the cell where you want your data import to begin or leave cell A1 selected and click the OK button. (See Figure 7)
Figure 7Your data appears in the worksheet, complete with leading zeros, and is ready for you to style as you see fit.
15
Comment
Author:Kim Walker
11 Comments
 
LVL 61

Expert Comment

by:mbizup
Very nice article!  Informative and well presented.
0
 
LVL 1

Expert Comment

by:Catherine Burow
Great Article! I had asked this question before, and it was briefly explained that re-opening the .csv file will remove the leading "0", but the background information you've included here really helped me grasp WHY it did that and how I can work with it to prevent losing the leading "0".

Again, GREAT ARTICLE!!
0
 
LVL 38

Expert Comment

by:younghv
Very nicely done article. It is always nice to know the "why" behind some of these computer things.

"Yes" vote above.
0
Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
LVL 50

Expert Comment

by:Ingeborg Hawighorst (Microsoft MVP / EE MVE)
Great article and well crafted. "Yes" above!
0
 

Expert Comment

by:DBDATASOLUTIONS
Yes this is very good article, I am changing the .csv extension on my pc to open with NOTEPAD !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
0
 

Expert Comment

by:DAGray_P6C
With great anticipation, I turned to this article, hoping to find some gem that I had overlooked. Although I did, it wasn't the one I hoped to find. I was disappointed that you hadn't found a better way to get a CSV file into Excel without losing its leading zeros than the method that I've used since I started using Excel (version 5.0, in Office 4.2 for Windows 3.1).

BTW, the import wizard has barely changed since Excel 5.0, and the back end, from Figure 6-1 onward, hasn't changed at all.

Nevertheless, knowing that the leading zeros are preserved by the Save As operation made the reading well worth my time.
0
 

Expert Comment

by:mkwest
Thank you for the great article, especially for the backstory along with the solution.  After all these years I didn't know how it happened, so I've just avoided Excel for my .csv files.  Now I know how to safely use excel for my .csv's without losing the leading zeros.  A very helpful article.  Thanks!
0
 
LVL 37

Expert Comment

by:ValentinoV
Nice article, though I prefer Notepad++ ;)
(clicked +1!)
0
 
LVL 13

Expert Comment

by:Jeff Darling
We can thank Microsoft for trying to help out the casual user. Unfortunately, sometimes Microsoft doesn't have all the answers and guesses incorrectly at the data types for fields in csv files.  A similar kind of problem was happening with phone numbers being converted.  

Now, I just rename the .csv files to .txt then Excel brings up the text import wizard.

Or I use Microsoft Access instead.
0
 
LVL 22

Author Comment

by:Kim Walker
@jeffld: After changing the extension of the the file, do you right-click and then "open with" Excel? Or do you go to Excel and do a File/Open?
0
 

Expert Comment

by:DAGray_P6C
Start Excel, and use File, Open, which automatically starts the Get External Data from Text File wizard. Open With treats it as text, and you must use the Text to Columns wizard to split it into columns.
0

Featured Post

Get your problem seen by more experts

Be seen. Boost your question’s priority for more expert views and faster solutions

Join & Write a Comment

This Micro Tutorial demonstrates how to create Excel charts: column, area, line, bar, and scatter charts. Formatting tips are provided as well.
This Micro Tutorial demonstrate the bugs in Microsoft Excel for Mac with Pivot Charts.

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month