Solved

File Transfer - Ascii vs Binary

Posted on 2000-02-10
Last Modified: 2013-12-16
I am setting up a new web server to replace the one we have now.  The current web server runs Windows NT and the new one runs Linux.  I need a way to transfer the files from the NT server to the new Linux server that treats .txt and .html files as ASCII and .gif and .jpg files as binary.  I wanted to find a tool that could probe a file before downloading it and determine which mode is best for that file.  We are talking about 12 GB of data spread across about 1,200 user accounts.  I tried smbclient, but it does only binary transfers.  ncftpget does have ASCII options, but as far as I can tell it doesn't know how to autodetect the file type on the fly.  If worst comes to worst, I'll just do the whole thing in binary, since most of the text files look fine when displayed in a web browser even though they are not strictly correct.  When I use vi to edit text files that I downloaded from the NT server in binary mode, every line has a ^M character at the end.  That is what I'm trying to avoid.
Question by:inet2xtreme
 
LVL 3

Expert Comment

by:RobWMartin
ID: 2509008
Go ahead and do the transfer, then test each file with the file command.  For most ASCII files, file prints the word text somewhere in its output.  Then, if file says it's text, you can safely remove the ^M with tr.  Thus:

if [ -n "`file "$thefile" | grep 'text'`" ]
then
    cp "$thefile" "$thefile.bak"
    tr -d "\r" < "$thefile.bak" > "$thefile"
fi

Hope this helps!

Rob
 
LVL 3

Expert Comment

by:RobWMartin
ID: 2509047
BTW:  This is a bash script, so #!/bin/bash should be the first line of the script file you incorporate the above segment into.  Also, the quote characters in the if condition are very important.  There are double quotes, backticks, and single quotes.

For example, create a file (with your fav editor) called un-nt and put this in it:

#!/bin/bash

# Process every file named on the command line.
for thefile in "$@"
do
  # file describes the file; grep keeps that description only if it mentions text.
  if [ -n "`file "$thefile" | grep 'text'`" ]
  then
      # Keep the original as .bak, then write a copy with the carriage returns removed.
      cp "$thefile" "$thefile.bak"
      tr -d "\r" < "$thefile.bak" > "$thefile"
  fi
done

This will walk through any files you mention on the command line, doing the conversion if necessary.

e.g.

un-nt /home/rob/html/*

would test and convert all files in /home/rob/html

Rob
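
A minimal sketch of a more defensive variant of the same idea, assuming bash and the file utility are available. The function name un_nt, the rule of skipping *.bak files, and the extra check for an actual carriage return are illustrative choices, not part of the script above.

```shell
#!/bin/bash
# Sketch: strip carriage returns from text files named on the command line,
# keeping a .bak copy of each original. un_nt is a hypothetical name.

un_nt() {
  local f cr
  cr=$(printf '\r')                 # a literal carriage return for grep
  for f in "$@"; do
    [ -f "$f" ] || continue         # skip directories and missing paths
    case "$f" in *.bak) continue ;; esac   # skip backups from earlier runs

    # file -b prints only the description, so a file *name* that
    # happens to contain "text" cannot cause a false match.
    if file -b -- "$f" | grep -q 'text'; then
      # Only rewrite files that really contain a carriage return.
      if grep -q "$cr" -- "$f"; then
        cp -p -- "$f" "$f.bak" && tr -d '\r' < "$f.bak" > "$f"
      fi
    fi
  done
}

un_nt "$@"
```

Saved as an executable script, it would be called the same way as before, e.g. with /home/rob/html/* as arguments. Because files without a carriage return are left alone, running it a second time should not overwrite the .bak originals.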

 
LVL 3

Expert Comment

by:RobWMartin
ID: 2509060
Oh, one more thing.  You'll need to do this before the script will execute:

chmod 755 un-nt

Another thing: did you notice that the script retains a backup of each file in case something screwy happens?

That's all, fer real :)

Rob
 
LVL 2

Author Comment

by:inet2xtreme
ID: 2509085
Sorry, I hate to give rejections, but I had to.  Many of the HTML files don't contain the word "text" in them.  I know that FTP clients can detect when I'm transferring binary files in ASCII mode, since they warn me that the file contained some bare end-of-line characters or something like that.  What I really want is an automatic transmission, so to speak, that will get the files and fix them on the fly.  There are probably 10,000 files or more.  If you can come up with a surefire easy way, you can still have the points.  Thanks for trying.
 
LVL 3

Expert Comment

by:RobWMartin
ID: 2509126
You misunderstood (or I didn't explain it right :)

We are not looking for the word text in the file itself; we are looking for the word text in the output of the file command.  IOW, there is a utility called "file" that examines a file and tries to determine its type.  Try running it on one of your HTML files.  Somewhere in the output of this command you should see 'text' if it is an ASCII file.  That's what we're looking for.  Do 'man file' for more info.

Rob
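
To see what the check keys on, it can help to run file by hand on a few samples. A small sketch, assuming the file utility is installed; the sample file names and the helper name is_text are made up for illustration, and the exact wording of the descriptions varies between versions of file.

```shell
#!/bin/bash
# Create three small sample files in a scratch directory.
dir=$(mktemp -d)
printf '<html>\r\n<body>hello</body>\r\n</html>\r\n' > "$dir/index.html"
printf 'just a note\n' > "$dir/notes.txt"
printf 'GIF89a\001\000\001\000\000\000\000' > "$dir/logo.gif"

# Print the description of each file as file sees it.
for f in "$dir"/*; do
  file -- "$f"
done

# The test the script relies on: does the description mention "text"?
is_text() {
  file -b -- "$1" | grep -q 'text'
}

for f in "$dir"/*; do
  if is_text "$f"; then
    echo "text:     $f"
  else
    echo "not text: $f"
  fi
done
```

The two text samples should be reported with a description that contains the word text, while the GIF sample should not, which is the distinction the if condition in un-nt depends on.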
 
LVL 3

Accepted Solution

by:
RobWMartin earned 800 total points
ID: 2509144
To test the solution in a safe manner, create the script I mentioned above (i.e. un-nt).  Grab some good examples of files that should be converted and some that shouldn't.  Put A COPY of those files in a temporary directory, say /var/tmp/un-nt. Then run the command as follows:

/usr/local/bin/un-nt /var/tmp/un-nt/*

This assumes you put the script file in /usr/local/bin and made it executable.

Then, ls /var/tmp/un-nt and look for *.bak files.  Every one of those should have been converted.

Rob
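
One possible way to check the test run mechanically rather than by eye, sketched under the assumption that each NAME.bak sits next to its converted NAME. The function name check_converted is hypothetical.

```shell
#!/bin/bash
# Sketch: compare every NAME.bak in a directory with its converted NAME.
check_converted() {
  local dir="$1" bak new cr status=0
  cr=$(printf '\r')
  for bak in "$dir"/*.bak; do
    [ -f "$bak" ] || continue            # the directory has no .bak files
    new=${bak%.bak}
    if grep -q "$cr" -- "$new"; then
      # The converted file still contains a carriage return.
      echo "STILL HAS CR: $new"
      status=1
    elif tr -d '\r' < "$bak" | cmp -s - "$new"; then
      # The backup minus its carriage returns equals the converted file.
      echo "ok: $new"
    else
      echo "DIFFERS beyond CRs: $new"
      status=1
    fi
  done
  return "$status"
}
```

After the test run described above, calling check_converted /var/tmp/un-nt would print one line per backup and return non-zero if any pair looks wrong.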
 
LVL 3

Expert Comment

by:RobWMartin
ID: 2509152
Actually, the *.bak files are the originals.  The corresponding files (i.e. without .bak) are the converted versions.

Rob
 
LVL 2

Author Comment

by:inet2xtreme
ID: 2509203
I'll give that a try.  Maybe I can incorporate that into the download script.  In pseudocode:

#Login to the server
smbclient //server/resource -U username || exit 1

#load up the file list I saved/processed from the source server

@filenames=<filelist>;

lcd /tmp/un-nt/
get filenames[fileindex]

call your script with /tmp/un-nt/filenames[fileindex] on the command line:

if [ -n "`file "$1" | grep 'text'`" ]
then
    # write to a temporary file first; redirecting straight back to $1 would empty it
    tr -d "\r" < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
fi

Then copy the file, fixed or not, from that location to its final path in the /home tree.

I think this will work.  Thanks for the help.  Sorry for my misunderstanding.  I thought you were simply grepping for the word "text".  I've not used the "file" command before except to test if a file exists in a shell script.
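
One caveat when stripping carriage returns without a .bak copy: the same file cannot be both the input and the output of a single redirection, because the shell truncates the output file before tr reads anything. A minimal sketch of one way around that, going through a temporary file; strip_cr is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: remove carriage returns from one file in place.
strip_cr() {
  local f="$1" tmp status
  tmp=$(mktemp) || return 1
  # Write the cleaned copy to the temporary file first, then copy its
  # contents back over the original, which keeps the original's
  # permissions and owner.
  tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"
  status=$?
  rm -f -- "$tmp"
  return "$status"
}
```

Copying the contents back with cat, rather than moving the temporary file into place, is a deliberate choice here: a file created by mktemp usually has restrictive permissions, which would be wrong for pages served by a web server.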
 
LVL 2

Author Comment

by:inet2xtreme
ID: 2509495
Can you think of any way to make the method you listed above recursive, so that it steps through a directory tree and fixes the broken files?  Instead of checking each file one by one, I'm downloading all of them as you suggested and I'm going to run the un-nt fix on them; however, they all have subdirectories.  If there's no good answer, I can process the output of ls -R >files, cut out only the directories, and hack up a script that runs un-nt on each directory listed.
 
LVL 3

Expert Comment

by:RobWMartin
ID: 2509520
Try this:

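find /root/path -type d -exec sh -c 'un-nt "$1"/*' sh {} \;

I haven't tested this particular invocation, but I've done similar many times before.  The trick is that find does not expand the * itself, so each directory is handed to a small shell command that does the expansion.  This assumes un-nt is on your PATH.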

First try it without the -exec ....

Make sure it lists the subdirectories you're interested in.  Then add the -exec ....  back in.  I would try it on a test directory tree first.

Rob
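
An alternative sketch that lets find select the files directly instead of the directories, assuming a find that supports -exec ... {} + (as POSIX specifies) and that the file utility is installed. The function name un_nt_tree is hypothetical, and it applies the same test-and-convert step as un-nt to every regular file under the given root.

```shell
#!/bin/bash
# Sketch: recursively strip carriage returns from text files under a root,
# keeping a .bak copy of each file that gets converted.
un_nt_tree() {
  local root="$1"
  # -type f selects regular files at any depth; *.bak files are skipped
  # so that backups are never treated as input.
  find "$root" -type f ! -name '*.bak' -exec sh -c '
    for f in "$@"; do
      if file -b -- "$f" | grep -q "text"; then
        cp -p -- "$f" "$f.bak" && tr -d "\r" < "$f.bak" > "$f"
      fi
    done
  ' sh {} +
}
```

Because find passes each path as a separate argument, names containing spaces (common on files that came from an NT server) are handled without extra escaping. As with the directory-based command, trying it on a copy of a small tree first is the safer route.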