We help IT Professionals succeed at work.
Troubleshooting Question

Python Function

Alan Grimes
Alan Grimes asked
on
82 Views
Last Modified: 2020-08-12
I have an excel file I am trying to transform test_file.xlsx

I have got so far

import  pandas  as      pd

in_folder   =   r'C:\test_folder\\'
filename    =   r'test_file.xlsx'

full_name = in_folder + '\\' + filename

df = pd.read_excel(full_name, 'Sheet1', index_col=None, usecols='A:K')

df = df.rename(columns={    df.columns[0]: '1', df.columns[1]: '2', df.columns[2]: '3', df.columns[3]: "4", df.columns[4]: "5", df.columns[5]: "6", df.columns[6]: "7", df.columns[7]: "8",  df.columns[8]: "9", df.columns[9]: "10", df.columns[10]: "11" })


df = df[4:]
Column 1 only has a value in row 6


I need to populate the value "76" down to where the next value begins to row 376


I then need to pick up the value 92 and populate it down until it meets the next row with a value.

The goal is to have transform the spreadsheet to have the number in "column a" in each row


Comment
Watch Question

NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
Alan

You can use ffill for that.
import  pandas  as      pd

in_folder   =   r'C:\test_folder\\'
filename    =   r'test_file.xlsx'

df = pd.read_excel(full_name, 'Sheet1', index_col=None, usecols='A:K', skiprows=3)

df = df.rename(columns={    df.columns[0]: '1', df.columns[1]: '2', df.columns[2]: '3', df.columns[3]: "4", df.columns[4]: "5", df.columns[5]: "6", df.columns[6]: "7", df.columns[7]: "8",  df.columns[8]: "9", df.columns[9]: "10", df.columns[10]: "11" })

print(df.head())
df = df.ffill(axis=0)
print(df.head())

Open in new window

Author

Commented:
Hi Norie

That has worked perfectly in terms of filling column 0 down (also thanks for the skip rows).

The spreadsheet I receive has all the days for this year filled in with blank values from todays date onwards

 
The fill function also fills in all the columns that were previously blank

would I be better off removing any rows that have a date greater than yesterday before applying the fill.
I am sure you know the next question, how do i remove those rows before the fill

Thanks




Analyst Assistant
CERTIFIED EXPERT
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION

Author

Commented:
Hi Norie

I am sure you have been told before buy you are a genius (Python anyway).

Is there any way I can remove records with a date greater than yesterday, I can do it in the SSIS package before I import the records but have been told that to try nd remove before.

Either way, you have provided the solution.

Thank you Sir
NorieAnalyst Assistant
CERTIFIED EXPERT
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION

Author

Commented:
yesterday = datetime.combine(date.today(), datetime.min.time()) - timedelta(days=1)
--------------------------------------------------------------------------- 
NameError                                 Traceback (most recent call last)
<ipython-input-40-2aa01a7bf0a4> in <module> 
----> 1 yesterday = datetime.combine(date.today(), datetime.min.time()) - timedelta(days=1) 
NameError: name 'date' is not defined 

Author

Commented:
I added:

from datetime import date, timedelta
It worked
NorieAnalyst Assistant
CERTIFIED EXPERT
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION
Unlock the solution to this question.
Join our community and discover your potential

Experts Exchange is the only place where you can interact directly with leading experts in the technology field. Become a member today and access the collective knowledge of thousands of technology experts.

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.