Member_2_7966113
asked on
Python Code Challenge to Match Schema
Hello Experts, I have a Python task that involves matching schemas.
Basically, I need to modify the columns in the dataset df to match the provided schema and return the first three rows of the dataset.
The steps to be completed are as follows:
Write a function matchSchema(df) that achieves the following:
1. Converts column active to type Boolean
2. Creates the column ‘price’ by converting the column ‘counts’ to type Double and dividing by 100
3. Drops the ‘counts’ column
4. Returns the first three rows of the resulting Dataframe
I have attempted the task with the following script:
import numpy as np
import pandas as pd

df = pd.read_csv('D:\matchSchema.csv')

def matchSchema(df):
    df['active'] = df['active'].astype('bool')
    df['price'] = df['cents']/100
    df.drop('cents', axis=1, inplace=True)
    return df.head(3)

matchSchema(df)
However, I'm failing to get the following set of results:
Return Array of correct size
Return Array of rows with correct number of rows
ASKER
Hi noci,
When I run your code I get the following error:
KeyError: 'counts'
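A KeyError like this usually means the DataFrame has no column with that name. Below is a minimal sketch for checking which column is actually present. The stand-in frame, its values, and the helper variable `source` are hypothetical, since the real CSV is not shown in this thread.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; the real file's columns are not shown in this thread.
df = pd.DataFrame({"active": [1, 0, 1], "cents": [1250, 399, 10000]})

# List the columns that actually exist before indexing into them.
print(list(df.columns))

# Pick whichever candidate column is present instead of hard-coding one name.
source = next((c for c in ("counts", "cents") if c in df.columns), None)
print(source)
```

If `source` comes back as None, neither name exists and the header row of the CSV is worth a closer look.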
For some silly reason the sign-on button doesn't work, so I can't download from Databricks.
(I block various data trackers quite aggressively; possibly the code behind the button comes from Marketo, Facebook, or one of the four other trackers they use.)
So I can't download it.
Anyway, the question says Boolean and your code had 'bool', so maybe try 'Boolean' instead.
Like in:
import numpy as np
import pandas as pd

df = pd.read_csv('D:\matchSchema.csv')

def matchSchema(df):
    df['active'] = df['active'].astype('Boolean')
    df['price'] = df['counts']/100
    df.drop('counts', axis=1, inplace=True)
    return df,df.head(3)

(dataset, sample) = matchSchema(df)
print(dataset)
print(sample)
I can't determine which types are "used" or not. I can show how to return the data in Python, though (IMHO that is the real problem in your code).
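One way to see which types are in play is to print `df.dtypes` and compare the result of each conversion. The sketch below uses made-up data, not the CSV from the question. Note that pandas spells its nullable boolean dtype `'boolean'` in lowercase, while `'bool'` is the plain NumPy type; as far as I know the capitalized `'Boolean'` is not a recognized alias.

```python
import pandas as pd

# Made-up data for illustration only; the real CSV is not available here.
df = pd.DataFrame({"active": [1, 0, 1, 0]})

# Show the dtype pandas inferred for each column.
print(df.dtypes)

# 'bool' is the NumPy boolean type; 'boolean' is pandas' nullable extension type.
as_numpy_bool = df["active"].astype("bool")
as_nullable_bool = df["active"].astype("boolean")
print(as_numpy_bool.dtype, as_nullable_bool.dtype)
```

Which of the two a grader expects is not stated in the thread, so treat the choice as something to verify against the task's schema.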
The question states 'counts', but your code had 'cents'. I used a CSV that had 'counts' in it, so it balked on 'cents'.
You had everything in place; you were just missing the return of the right items.
To prove it, I added some print statements.
Then (dataset, sample) = .... is needed to fetch the return values.
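Putting the pieces together, here is a hedged sketch of the whole function with the tuple return and unpacking described above. It assumes the source column is named 'cents' (swap in 'counts' if that is what your data has) and uses an invented four-row frame in place of the CSV. Working on a copy is a design choice of this sketch, not something the task requires.

```python
import pandas as pd

def matchSchema(df):
    """Return the converted frame and its first three rows."""
    out = df.copy()  # work on a copy so the caller's frame is left untouched
    out["active"] = out["active"].astype("bool")
    # Assumption: the source column is 'cents'; use 'counts' if your data has that name.
    out["price"] = out["cents"].astype("float64") / 100
    out = out.drop(columns="cents")
    return out, out.head(3)

# Invented sample data standing in for the CSV.
df = pd.DataFrame({"active": [1, 0, 1, 1], "cents": [1250, 399, 10000, 75]})

# Unpack both return values: the full converted frame and the three-row sample.
dataset, sample = matchSchema(df)
print(dataset)
print(sample)
```

If the grader only wants the three rows, returning `out.head(3)` alone (as in the original attempt) would be the variant to try.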