rsync

HI

I am using rsync shell script to copy files from one server other.  Content of shell script is below

rsync -e ssh -avzp --remove-source-files $REMOTE_DIR $LOCAL_DIR \
                    --exclude-from=$FILTER > $TRANS_TMP 2>>$RSYNC_ERR_LOG

The shell script runs every 10 mints to copy files between server.  We place files in source directory using sftp tool.  Sometimes rsync copy files which is not ready to move  or we still uploading files to source directory.  Becuase of this reason we sometimes get truncated files pushed from source to target directory.  We want to modify rsync script only rysnc only completed files not partial files. what is option which avoids partial files and only move files which are completed
vadicherlaAsked:
Who is Participating?
 
Ryan SmithConnect With a Mentor Sr. Systems EngineerCommented:
make a temp directory and have the files uploaded there.  In your script have it move *.whatever extension it's looking for to move the whole uploaded files to your rsync directory.
0
 
nemws1Database AdministratorCommented:
I don't think rsync does that (or would be able to know that something is still being uploaded).  What you should do is upload to a different directory on the same server and then move it once its up - there's still a window for this to happen, but much less so.

For example, what if there's a network issue and your upload stops for 20 seconds.  Should 'rsync' consider that 'uploaded'?  (It shouldn't - the file hasn't completed uploading yet).  The logic of "is it done or not" is something that you *can* do - with MD5 checksums or something similar, but it requires more logic than rsync provides.

What you could also do is to use a simple locking mechanism, but this assumes that you are uploading only one file at a time:

1) Remote site connects to your site and changes to the proper directory
2) Remote site creates an 'upload_in_progress' file (contains nothing)
3) Remote site starts uploading new file(s)
4) rsync script runs - checks to see if an 'uploading' file exists
  4a) If *no* 'upload_in_progress' file, run rsync command
  4b) If 'upload_in_progress' file is there, exit without running rsync command
        Next run in 10 minutes will try again
5) Remote site finishes uploading file(s)
6) Remote site removes 'upload_in_progress' file.
0
 
skullnobrainsCommented:
there's still a window for this to happen, but much less so

if the temporary uploads dir is on the same filesystem, there is no window since mv will not rewrite the inode but rather lik it to another name

in scp, you can for example add .part to your filenames, and rename the files once the upload is complete, and exclude those files in the rsync command

--

this is a bit dirty but it may be acceptable to you to use an option like --min-age so recent files will have time to download before they are copied.

--

since you remove the source files, i'm not sure rsync is the best tool.

you probably also can write a shell script that uses lsof to check wether files are completed before they are transferred. this is only meaningful if you cannot change the sftp command which would be way cleaner
0
 
skullnobrainsCommented:
what if the file gets moved while it is still being written to ?
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.