Python unzip only csv files from zip

Leo Torres asked
Last Modified: 2020-07-24
I am trying to extract only the csv files from zip files that I pass through a loop. The code does not raise an error and completes successfully, but the files are not being extracted. I printed some output below to make sure the values are being passed correctly, and they are. I don't know where else to look. Does anyone have any ideas?



import os
import zipfile
from glob import glob

import pandas as pd

#Find all zip files in source location
SourceRoot='//path/datafeeds/CMS_ResearchStatistics/ZipTemp/'
DestinationRoot='//path/datafeeds/CMS_ResearchStatistics/ZipTemp/UnZipTest/'
df2 = pd.DataFrame()
EXT = "*.zip"
all_csv_files = [file
    for path, subdir, files in os.walk(SourceRoot)
    for file in glob(os.path.join(path, EXT))]
all_csv_files = pd.DataFrame(all_csv_files)
df2 = df2.append(all_csv_files)

df2 = df2.rename(columns={0:'source'})

#Create a destination path for unzipped file
df2['destination'] = df2['source'].str.replace(SourceRoot,DestinationRoot).str.replace(".zip","")


#loop through dataframe and extract csv only from zip file and place it in the destination
for index in range(len(df2)):
    source = df2['source'].values[index]
    destination = df2['destination'].values[index]
    #print(source)
    print(destination)
    with zipfile.ZipFile(source,'r') as zip_ref:
        listOfFileNames = zip_ref.namelist()
        #zip_ref.extractall(destination)
        for filename in listOfFileNames:
            #check filename ends with csv
            if filename.endswith('.csv'):
                print(filename)
                #Extract a single file from zip
                zip_ref.extract(filename)

Output
//path/datafeeds/CMS_ResearchStatistics/ZipTemp/UnZipTest/MAContractServiceArea\ma-contract-service-area-statecounty-april-2020
MA_Cnty_SA_2020_04/MA_Cnty_SA_2020_04.csv
//path/datafeeds/CMS_ResearchStatistics/ZipTemp/UnZipTest/MAContractServiceArea\ma-contract-service-area-statecounty-march-2020
MA_Cnty_SA_2020_03/MA_Cnty_SA_2020_03.csv
//path/datafeeds/CMS_ResearchStatistics/ZipTemp/UnZipTest/MAContractServiceArea\ma-contract-service-area-statecounty-february-2020
MA_Cnty_SA_2020_02/MA_Cnty_SA_2020_02.csv
//path/datafeeds/CMS_ResearchStatistics/ZipTemp/UnZipTest/MAContractServiceArea\ma-contract-service-area-statecounty-january-2020
MA_Cnty_SA_2020_01/MA_Cnty_SA_2020_01.csv
//path/datafeeds/CMS_ResearchStatistics/ZipTemp/UnZipTest/MAContractServiceArea\ma-contract-service-area-statecounty-april-2020_org
MA_Cnty_SA_2020_04/MA_Cnty_SA_2020_04.csv

David Johnson, CD (Simple Geek from the '70s)
Distinguished Expert 2019

Commented:
Instead of zip_ref.extract(filename), try zip_ref.extract("*.csv").
Leo Torres, SQL Developer
Author

Commented:
The suggested change does not work. It fails with this error:
KeyError: "There is no item named '*.csv' in the archive"
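The KeyError above is consistent with how zipfile looks members up: ZipFile.extract takes one exact member name (or a ZipInfo object) and does not expand wildcards. A minimal sketch of pattern-based filtering, done on namelist() before extracting, is shown here. The helper name csv_members and the in-memory demo archive are hypothetical and not part of the original script.

```python
import fnmatch
import io
import tempfile
import zipfile


def csv_members(zip_ref, pattern="*.csv"):
    """Return the archive member names that match a shell-style pattern."""
    return [name for name in zip_ref.namelist() if fnmatch.fnmatch(name, pattern)]


# Build a small in-memory archive purely for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("MA_Cnty_SA_2020_04/MA_Cnty_SA_2020_04.csv", "a,b\n1,2\n")
    zf.writestr("MA_Cnty_SA_2020_04/readme.txt", "notes")

with zipfile.ZipFile(buffer) as zf:
    # Passing a wildcard straight to extract() is treated as a literal name.
    wildcard_failed = False
    try:
        zf.extract("*.csv", path=tempfile.mkdtemp())
    except KeyError:
        wildcard_failed = True

    # Filtering the name list first gives the members to extract one by one.
    matches = csv_members(zf)
    print(matches)
```

The filter in the question, filename.endswith('.csv'), does the same job for this case, so the original loop was already selecting the right members.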
skullnobrains
Commented:
Try removing the comment above the extract line.
It is indented differently from the surrounding code, and that is likely your issue.

Leo Torres, SQL Developer
Author

Commented:
@skullnobrains I removed the comment and got the same error.
skullnobrains
Commented:
Try adding some error handling. If the extraction fails, it should raise some kind of error.

What seems likely is that it actually does work but extracts to a different location, probably the current working directory.

I would look for the files there first, and then probably debug with strace.
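A quick way to check the working-directory theory without strace is to print the value that extract() returns, which is the path of the file it created. When no path argument is given, zipfile extracts relative to the current working directory. The sketch below assumes a throwaway archive in a temporary folder; the file names are hypothetical.

```python
import os
import tempfile
import zipfile

# Create a throwaway archive in a temporary folder (demo data only).
work_dir = tempfile.mkdtemp()
archive_path = os.path.join(work_dir, "example.zip")
with zipfile.ZipFile(archive_path, "w") as zf:
    zf.writestr("data/example.csv", "a,b\n1,2\n")

previous_cwd = os.getcwd()
os.chdir(work_dir)
try:
    with zipfile.ZipFile(archive_path) as zf:
        # No path argument: the member is written under the current directory.
        extracted_path = zf.extract("data/example.csv")
    print("Working directory:", os.getcwd())
    print("Extracted to:", os.path.abspath(extracted_path))
    landed_in_cwd = os.path.isfile(os.path.join(os.getcwd(), "data", "example.csv"))
finally:
    # Restore the original working directory.
    os.chdir(previous_cwd)
```

Note that extract() signals a missing member by raising KeyError rather than by returning a false value, so a silent run usually means the file was written somewhere.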
Leo Torres, SQL Developer
Author

Commented:
@skullnobrains you're spot on: the files were being created in the working directory. How do I change it so that each file goes to the destination variable passed in the script?
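One possible approach, not necessarily the solution accepted in this thread, is the path argument of ZipFile.extract, which sets the directory the member is extracted into. The sketch below wraps the loop body from the question in a hypothetical helper, extract_csv_files, and runs it against a demo archive in a temporary folder. The member's internal folder (for example MA_Cnty_SA_2020_04/) is recreated under the destination.

```python
import os
import tempfile
import zipfile


def extract_csv_files(source, destination):
    """Extract only the .csv members of the zip at `source` into `destination`."""
    os.makedirs(destination, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(source, "r") as zip_ref:
        for filename in zip_ref.namelist():
            if filename.endswith(".csv"):
                # path= controls where the member is written.
                extracted.append(zip_ref.extract(filename, path=destination))
    return extracted


# Demo data only: a small archive with one csv and one non-csv member.
root = tempfile.mkdtemp()
source = os.path.join(root, "example.zip")
destination = os.path.join(root, "UnZipTest", "example")
with zipfile.ZipFile(source, "w") as zf:
    zf.writestr("MA_Cnty_SA_2020_04/MA_Cnty_SA_2020_04.csv", "a,b\n1,2\n")
    zf.writestr("MA_Cnty_SA_2020_04/readme.txt", "notes")

paths = extract_csv_files(source, destination)
print(paths)
```

In the original loop this would amount to changing zip_ref.extract(filename) to zip_ref.extract(filename, path=destination).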