Solved

rsync

Posted on 2013-07-01
Last Modified: 2014-04-01
Hi,

I am using an rsync shell script to copy files from one server to another.  The content of the shell script is below:

rsync -e ssh -avzp --remove-source-files $REMOTE_DIR $LOCAL_DIR \
                    --exclude-from=$FILTER > $TRANS_TMP 2>>$RSYNC_ERR_LOG

The shell script runs every 10 minutes to copy files between the servers.  We place files in the source directory using an sftp tool.  Sometimes rsync copies files that are not ready to be moved, because we are still uploading them to the source directory.  As a result, truncated files sometimes get pushed from the source to the target directory.  We want to modify the script so that rsync moves only completed files, not partial ones.  Which option avoids partial files and moves only files that are complete?
Question by:vadicherla
4 Comments
 
LVL 23

Expert Comment

by:nemws1
I don't think rsync does that (or would be able to know that something is still being uploaded).  What you should do is upload to a different directory on the same server and then move the file once it's up - there's still a window for this to happen, but a much smaller one.

For example, what if there's a network issue and your upload stops for 20 seconds.  Should 'rsync' consider that 'uploaded'?  (It shouldn't - the file hasn't completed uploading yet).  The logic of "is it done or not" is something that you *can* do - with MD5 checksums or something similar, but it requires more logic than rsync provides.

What you could also do is to use a simple locking mechanism, but this assumes that you are uploading only one file at a time:

1) Remote site connects to your site and changes to the proper directory
2) Remote site creates an 'upload_in_progress' file (contains nothing)
3) Remote site starts uploading new file(s)
4) rsync script runs - checks to see if an 'upload_in_progress' file exists
  4a) If *no* 'upload_in_progress' file, run rsync command
  4b) If 'upload_in_progress' file is there, exit without running rsync command
        Next run in 10 minutes will try again
5) Remote site finishes uploading file(s)
6) Remote site removes 'upload_in_progress' file.
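
The steps above could be sketched on the receiving side roughly as follows. This is only a minimal sketch, not something posted in the thread: the function names are made up for illustration, and it assumes the check runs on the machine that holds the upload directory (with a remote source, the same test would have to be run over ssh).

```shell
#!/bin/sh
# Marker file name taken from the steps above.
LOCK_NAME="upload_in_progress"

# Returns 0 (true) when the directory contains no marker file,
# i.e. when no upload has announced itself as running.
safe_to_sync() {
    [ ! -e "$1/$LOCK_NAME" ]
}

# Hypothetical wrapper: runs the given command only if the directory
# has no marker file; otherwise skips this run and waits for the next one.
sync_if_idle() {
    dir=$1
    shift
    if safe_to_sync "$dir"; then
        "$@"
    else
        echo "upload in progress in $dir, skipping this run" >&2
        return 0
    fi
}
```

It would be called with the directory to check followed by the normal rsync command line, for example `sync_if_idle /path/to/uploads rsync ...`, where the path is a placeholder.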
 
LVL 26

Expert Comment

by:skullnobrains
"there's still a window for this to happen, but much less so"

if the temporary uploads dir is on the same filesystem, there is no window, since mv will not rewrite the data but rather link the same inode to another name

in scp, you can for example add .part to your filenames, and rename the files once the upload is complete, and exclude those files in the rsync command
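
A rough sketch of the ".part" idea, using sftp since that is what the question mentions. The helper name is hypothetical, and it assumes filenames without double quotes in them. It only prints an sftp batch; nothing is transferred until the batch is fed to sftp.

```shell
#!/bin/sh
# Print an sftp batch that uploads each file under a temporary ".part"
# name and renames it to its final name only after the put has finished.
make_sftp_batch() {
    for f in "$@"; do
        name=$(basename "$f")
        printf 'put "%s" "%s.part"\n' "$f" "$name"
        printf 'rename "%s.part" "%s"\n' "$name" "$name"
    done
}

# Possible usage (host and directory are placeholders):
#   make_sftp_batch report.csv | sftp -b - user@host:/incoming
#
# On the rsync side, the partial names are then skipped with:
#   rsync -e ssh -avz --remove-source-files --exclude='*.part' \
#         "$REMOTE_DIR" "$LOCAL_DIR"
```

Because excluded files are neither transferred nor removed, a file still named *.part is simply left alone until a later run finds it under its final name.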

--

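this is a bit dirty, but it may be acceptable to you to skip files younger than some minimum age, so recent files will have time to finish uploading before they are copied. rsync itself has no --min-age option, so the age filter has to come from another tool such as find.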
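
One way to approximate a minimum-age filter is to let find build the file list and hand it to rsync with --files-from. This is a sketch under assumptions: the function name is invented, find must support -mmin (GNU and BSD find do, it is not in POSIX), and the listing has to run on the machine where the source directory lives.

```shell
#!/bin/sh
# List regular files under a directory that were last modified more than
# the given number of minutes ago, as paths relative to that directory.
list_settled_files() {
    dir=$1
    min_age_minutes=$2
    ( cd "$dir" && find . -type f -mmin +"$min_age_minutes" -print ) |
        sed 's|^\./||'
}

# Possible usage (paths are placeholders):
#   list_settled_files /data/outgoing 5 > /tmp/ready.list
#   rsync -av --remove-source-files --files-from=/tmp/ready.list \
#         /data/outgoing/ user@host:/data/incoming/
```

As said above, this is still a heuristic: an upload that stalls for longer than the chosen age would be picked up while incomplete.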

--

since you remove the source files, i'm not sure rsync is the best tool.

you can probably also write a shell script that uses lsof to check whether files are completed before they are transferred. this is only meaningful if you cannot change the sftp command, which would be way cleaner
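
A minimal sketch of such an lsof check, with an invented function name. It assumes lsof is installed on the machine holding the files and that the script has enough privileges to see the sftp server's open files (typically root); without them, an open file can look closed.

```shell
#!/bin/sh
# Returns 0 when no process has the file open, 1 when some process does,
# and 2 when the check cannot be made (lsof missing or file not found).
file_is_closed() {
    command -v lsof >/dev/null 2>&1 || return 2
    [ -e "$1" ] || return 2
    # lsof exits with status 0 when it finds the file open somewhere.
    if lsof -- "$1" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Possible usage: keep only files that nothing is writing to.
#   for f in /data/outgoing/*; do
#       file_is_closed "$f" && printf '%s\n' "$f"
#   done
```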
 
LVL 6

Accepted Solution

by:
Ryan Smith earned 500 total points
Make a temp directory and have the files uploaded there.  In your script, have it move the fully uploaded files (*.whatever extension it's looking for) into your rsync directory.
 
LVL 26

Expert Comment

by:skullnobrains
what if the file gets moved while it is still being written to?