Member_2_7966113
asked on
How to Copy Files using Databricks Utilities
Hello Experts,
This question requires someone with experience with Python and Databricks.
A member from another forum assisted me in copying files to a folder based on date: https://stackoverflow.com/questions/54007074/how-to-truncate-and-or-use-wildcards-with-databrick
I would like to tweak the code to copy files based on certain characters in the filename. In the example that follows, the characters are 1111, 1112, 1113 and 1114.
So, if we have four files as follows:
File_Account_1111_exam1.csv
File_Account_1112_testxx.csv
File_Account_1113_pringle.csv
File_Account_1114_sam34.csv
I would like File_Account_1114_sam34.csv copied to the folder only if File_Account_1113_pringle.csv has already been copied to the folder.
Likewise, I would only want File_Account_1113_pringle.csv copied if File_Account_1112_testxx.csv has already been copied to the folder, and so on.
Therefore, if all files have been copied to a folder it would look like the following:
dbutils.fs.put("/mnt/adls2/demo/files/file_Account_1111_exam1.csv", data, True)
dbutils.fs.put("/mnt/adls2/demo/files/file_Account_1112_testxx.csv", data, True)
dbutils.fs.put("/mnt/adls2/demo/files/file_Account_1113_pringle.csv", data, True)
dbutils.fs.put("/mnt/adls2/demo/files/file_Account_1114_sam34.csv", data, True)
I appreciate there aren't many Experts with experience with Databricks on EE, however any help will be greatly appreciated.
Cheers
ASKER
Hi Norie,
Thanks for reaching out.
The code is probably all Python, but there will be an element of using Databricks dbutils.
ASKER
Can I get some help with this please?
Is it the number and name of the file that matters?
Do you have a list of the numbers/names?
ASKER
Hi Norie,
Thanks for reaching out.
It's the number that matters.
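Since only the number matters, it could be pulled out of each file name with a regular expression and used as a sort key. This is a minimal sketch, assuming the number is always a run of digits sitting between two underscores (as in the four example names); `account_number` and `NUMBER_RE` are hypothetical names introduced only for illustration.

```python
import re

# Assumption: the number is a run of digits between underscores, e.g. "_1113_".
NUMBER_RE = re.compile(r"_(\d+)_")


def account_number(filename):
    """Return the number embedded in a file name, or None if there is none."""
    match = NUMBER_RE.search(filename)
    return int(match.group(1)) if match else None


names = [
    "File_Account_1113_pringle.csv",
    "File_Account_1111_exam1.csv",
    "File_Account_1114_sam34.csv",
    "File_Account_1112_testxx.csv",
]

# Sort by the embedded number so the files can be handled in sequence.
ordered = sorted(names, key=account_number)
for name in ordered:
    print(account_number(name), name)
```

Names that do not contain a number return None, so they would need to be filtered out before sorting a mixed folder listing.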
I know that's probably where you are using this code but it looks to me like it's pure Python.
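If the logic really is pure Python, the "only copy a file when its predecessor is already in the folder" rule might be sketched as below. This is not a tested Databricks solution. It assumes the numbers are consecutive integers (so the predecessor of 1114 is 1113) and that 1111 is the first file and needs no predecessor. `copy_in_sequence` and its `copy` argument are hypothetical names; the actual copy is passed in as a function so the rule can be tried out with plain lists.

```python
import re


def copy_in_sequence(source_names, target_names, copy, first=1111):
    """Copy files in number order, skipping any whose predecessor is missing.

    source_names: file names waiting to be copied
    target_names: file names already in the destination folder
    copy:         function that copies one file, given its name (hypothetical hook)
    first:        number that may be copied without a predecessor (assumption)
    """
    def number(name):
        m = re.search(r"_(\d+)_", name)
        return int(m.group(1)) if m else None

    # Numbers that are already present in the destination folder.
    present = {number(n) for n in target_names} - {None}
    pending = sorted((n for n in source_names if number(n) is not None), key=number)

    copied = []
    for name in pending:
        num = number(name)
        if num in present:
            continue  # already in the destination
        if num != first and (num - 1) not in present:
            continue  # predecessor not copied yet, so this file waits
        copy(name)
        present.add(num)
        copied.append(name)
    return copied


# Plain-Python trial run: 1113 is missing, so 1114 has to wait.
log = []
copy_in_sequence(
    ["File_Account_1114_sam34.csv", "File_Account_1112_testxx.csv"],
    ["File_Account_1111_exam1.csv"],
    log.append,
)
print(log)

# On Databricks this might be wired up roughly as follows (untested;
# "<source-dir>/" is a placeholder, the destination is the folder from the question):
# src, dst = "<source-dir>/", "/mnt/adls2/demo/files/"
# copy_in_sequence(
#     [f.name for f in dbutils.fs.ls(src)],
#     [f.name for f in dbutils.fs.ls(dst)],
#     lambda name: dbutils.fs.cp(src + name, dst + name),
# )
```

If the numbers are not consecutive, the `num - 1` check would have to be replaced by a lookup in an explicit ordered list of numbers.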