What really happens to your leading zeros when you save your data in .csv format…
Almost everyone today uses Microsoft Excel to collect their contact information in columns and rows called spreadsheets. With this format, important columns can easily be highlighted and complicated calculations can be executed and updated in real time (see Figure 1). In this data-driven age, a fully-functional spreadsheet is not just convenient, it’s necessary.
So why is it, then, that when you go to send your beautifully stylized, precisely calculated spreadsheet to your data handler, they prefer to receive it in .csv format?
The .csv (or comma-separated-values) file format has seemingly been around forever. It is practically the most basic, human-readable format for raw data; second only to the fixed-length format. It is a simple collection of values – those values being the words or numbers that are displayed in each cell of your spreadsheet. Each row of your spread sheet is represented by a line in the .csv file and the values for each cell are separated by commas. There are no colors or type styles and no formulas to be calculated – just values. After all, the program used by the data handler doesn’t need to recalculate each formula when the spreadsheet has already done that for them. And the program doesn’t need different styles to recognize certain columns. It doesn’t prioritize columns; it handles each column of information with equal significance.
So you comply with your data handler by opening your spreadsheet in Excel and executing the “save as” function, selecting the “save as type” of CSV (Comma delimited) (*.csv)
Now you need to convince yourself that everything saved correctly so you double-click the file to see that all of your data is there. After all, you don’t want the data to go to the data handler wrong. What you find there may be ugly—the styles have all disappeared and the formulas have been replaced by their results—but it appears that the data is all there. (see Figure 2)
However, if you look closely enough, you’ll notice that all your New England zip codes and your eight-digit birthdates have lost their leading zeros. Or maybe you have numeric identification codes that originally began with zeros which now start with the first non-zero number instead.
But why are my leading zeros missing? After all, it’s just another Excel format. Right?
Wrong! You’ve been duped…
Actually, .csv files are simple ASCII text files and should be opened with a text editor application such as Notepad or TextEdit. But when you install Microsoft Excel, Excel claims the file format as its very own and changes the double-click behavior of your computer to open this type of file in Excel even though it is NOT a native Excel file format.
What really happens when you double-click a .csv file is that Excel imports it into an empty spreadsheet without displaying any messages or menus to indicate such. Because of the covert nature of the import, most users are misled into believing they are viewing the native file and incorrectly assume that the leading zeros are lost.
The fact is, the leading zeros are included in the .csv file but are eliminated when the .csv file is covertly imported into the empty Excel spreadsheet when you double-clicked the file. You can confirm this by opening the .csv file in a text editor such as Notepad for Windows or TextEdit for MacOS. (see Figure 3)
In Figure 3, the zip codes have been highlighted in yellow, the birth dates in green, to help you see that the leading zeros are actually there. This is what the data handler would see – without the highlighting, of course. The rows of values from the Excel file are on separate lines. The values from each cell are separated, or delimited, by commas.
You can be confident now that the data you’re sending to the data handler is complete, though not so visually appealing and organized as your Excel file.
Beware the middle man...or woman... ah...person…
It seems the person who generates the .csv file is rarely the person who delivers the file to the data handler. All too often, the .csv file is passed off to a purchasing agent or project manager who, wisely, decides to double-check the file to see that is has the correct information. They, too, will double-click the .csv file to peruse it’s contents. Because the data was actually imported, Excel considers the file to be a new file so it will ask if you want to save changes when the file is closed.
Never save changes when opening a .csv file by double-clicking!
Few people will question why they are being asked to save changes. Almost everyone will click the save button and then confirm that when the next screen pops up warning that the file “may contain features that are not compatible with CSV.” The leading zeros are now truly lost. The only way to recover them is to go back to the original Excel file and resave it as .csv.
But what if your data processor presents a .csv file to you for your own project? How do you get your comma-delimited data into Excel with leading zeros intact?
It is actually fairly easy to import your .csv file into Excel without losing your leading zeros. The only real problem with double-clicking the .csv file is that the import that takes place sets the column data format to “General” for all cells. In a cell with a “General” format, if it looks like a number, it’s treated like a number. In order to set the column to a different format before importing the data, you must perform a manual import.
The following instructions are for Excel 2007. Earlier versions of Excel would be very similar, though the menus may be in entirely different locations.
If Excel is not already running, launch Excel from the Start
If you just launched Excel, it will be open with a blank workbook already created so you may skip to Step 3. If Excel was already running it may be necessary to create a blank workbook as follows: (See Figure 4)
Select the File
tab at the top of the window and select New
from the left menu bar.
From the Available Templates
window, choose Blank workbook
and click the Create
Click the Data
tab at the top of the window and select From Text
from the Get External Data
drop-down or section of the top tool bar. (See Figure 5)
When the Import Text File
window appears, navigate to and select the .csv file you wish to import.
When the Text Import Wizard
window, Step 1 of 3
, appears, be sure that Delimited
is selected, then click the Next
button. (See Figure 6-1)
In the Text Import Wizard
window, Step 2 of 3
, choose Comma
under the list of Delimiters, and choose the double quote (") as the Text qualifier, then click the Next
button. (See Figure 6-2)
In the Text Import Wizard
window, Step 3 of 3
, select the column(s) with the leading zeros and change the Column data format
, then click the Next
button. (See Figure 6-3)
In the Import Data
window that follows, click the cell where you want your data import to begin or leave cell A1 selected and click the OK
button. (See Figure 7)
Your data appears in the worksheet, complete with leading zeros, and is ready for you to style as you see fit.