Solved

UNIX command to concatenate files

Posted on 2011-09-26
Last Modified: 2012-05-12
How can I write a UNIX command (Solaris 9) that will concatenate files whose names start with a known string?
$cat f1*.csv > result.csv
Above does not work.

But:
$cat f1_20997.csv f12_2022.csv f13_2011.csv > result.csv
Above works. But I do not know the file names when writing the command in a batch file.
Question by:toooki
 
LVL 4

Accepted Solution

by:
klodefactor earned 119 total points
The first example works for me (as expected).  Questions:

What shell are you using?

How does the first example fail?  Does the shell return an error, or are the contents of result.csv incorrect?

Be aware that the source files will be concatenated in the sort order used by your current locale, because that is the order in which the shell expands the pattern.  So for me (ASCII sort order), in the example you give, result.csv will contain the contents of f12_2022.csv, followed by the contents of f13_2011.csv, followed by the contents of f1_20997.csv.
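
To check the expansion order on your own system, here is a minimal sketch. It creates empty files with the names from the question in a scratch directory and prints the order in which the shell expands the pattern under the C (ASCII) locale. The scratch directory name and the variable names are assumptions made for illustration.

```shell
#!/bin/sh
# Scratch directory; the name is arbitrary (hypothetical, for illustration only).
demo_dir="${TMPDIR:-/tmp}/globdemo.$$"
mkdir "$demo_dir" && cd "$demo_dir" || exit 1

# Empty files named like the ones in the question.
touch f1_20997.csv f12_2022.csv f13_2011.csv

# Force the C locale so that glob matches are sorted by byte value.
LC_ALL=C
export LC_ALL

# echo receives the names in the same order that cat would.
glob_order=`echo f1*.csv`
echo "$glob_order"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

In the C locale this prints f12_2022.csv f13_2011.csv f1_20997.csv, because the digits 2 and 3 sort before the underscore.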

--klodefactor
 
LVL 13

Assisted Solution

by:Hugh McCurdy
Hugh McCurdy earned 119 total points
Could you get a simple example to fail?  

file1
Anderson, Jane
Baker, Brian

file2
Daniels, Danforth
Edwards, Eduardo

file3
Michaels, Michelle
Norton, Naomi

If you cat f* > foo.csv, does it work?

If not, please share a directory listing of all your csv files and tell us how it fails.
 

Author Comment

by:toooki
Thanks a lot.

Sorry, the command $cat f1*.csv > result.csv actually works.

$echo $SHELL
/bin/ksh

Thanks for letting me know about the concatenation order.
Is there an easy way to change the cat command so that the files are concatenated in order of their timestamps?

file1
Anderson, Jane
Baker, Brian

file2
Daniels, Danforth
Edwards, Eduardo

cat f* > foo.csv
will create:
Anderson, Jane
Baker, Brian
Daniels, Danforth
Edwards, Eduardo
if file1 was created first (older timestamp).


cat f* > foo.csv
will create:
Daniels, Danforth
Edwards, Eduardo
Anderson, Jane
Baker, Brian
if file2 was created first (older timestamp).

Thanks!


 
LVL 4

Expert Comment

by:klodefactor
For that type of scenario I typically include a timestamp in the filename: YYYY-MM-DD-HH-MM-SS with hours in 24-hour format.

If that's not possible in your case, you could do something sleazy like this:
cat `ls -tr f1*.csv` > result.csv
--klodefactor
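
A sketch of the timestamp-in-filename approach, assuming you control how the data files are named when they are created. The f1_ prefix and the variable names are illustrative and not taken from the original scripts.

```shell
#!/bin/sh
# Build a sortable timestamp: year, month, day, then 24-hour time.
# Every field is zero-padded, so text order matches chronological order.
stamp=`date +%Y-%m-%d-%H-%M-%S`

# Hypothetical naming scheme: known prefix + timestamp + extension.
outfile="f1_${stamp}.csv"
echo "new data file would be: $outfile"
```

With names like these, a plain cat f1_*.csv > result.csv expands oldest first, so no call to ls is needed.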
 
LVL 13

Expert Comment

by:Hugh McCurdy
Klode has good advice.  I'd like to add to it in case you'd like to learn.  Type this at the command prompt

ls -ltr f1*csv


That's a lowercase L before the tr, not the numeral one.

And you can visually see that the files are sorted by timestamp.  

For more learning, I suggest
man ls


You can also type "man ls" into Google to find a manual page.
 

Author Comment

by:toooki
Many thanks.

cat `ls -tr f1*.csv` > result.csv

The above command works for me. But here is one problem:

If there is no such file (file1, file2, and so on), the script stalls: it does not exit and return to the command prompt.

$ ./test.cron
/bin/ls: f*: No such file or directory

The content of test.cron is:
#!/bin/sh
/bin/cat `/bin/ls -tr f*` > 3


If I change the content to:
#!/bin/sh
/bin/cat `/bin/ls -tr f*` > 3 >/dev/null 2>&1

It still gets stuck.

Is there a way to run the command only if a file matching f* exists, or to make the program exit if there is no such file?

Thanks!
 
LVL 4

Expert Comment

by:klodefactor
In the case you describe, the script is stalled because it's waiting for input from stdin.  If you type <CTRL>D you'll get your prompt back :-).  This is because the command within backticks returns an empty list, so your command is effectively: cat > result.csv

There are many things you can test for within a shell script, one of which is the existence of files.  Below, the "[" is actually one version of the test command, and the "-f" checks for the existence of files as specified by the pattern.
#!/bin/sh -
if [ -f f1*csv ]; then
    echo hi
fi

You can extend the script by accepting filenames on the command line as arguments to the script, and changing the test command to use e.g. $1 (the first argument on the command line) as the file spec.  A better version would of course check that arguments were given on the command line, and if not would exit with a short help message (the "usage").

--klodefactor
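
The idea of taking the file names as arguments and printing a usage message could look roughly like the sketch below. It is written as a function so it can be tried at the prompt; the name concat_args and the output name result.csv are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: concatenate the files named as arguments into result.csv.
# concat_args is a hypothetical name, not a standard command.
concat_args() {
    # No arguments: print a usage message and fail instead of
    # letting cat wait on standard input.
    if [ $# -eq 0 ]; then
        echo "usage: concat_args file..." >&2
        return 2
    fi
    # Test one name at a time, so each test sees a single operand.
    for name in "$@"; do
        if [ ! -f "$name" ]; then
            echo "concat_args: $name: not a regular file" >&2
            return 1
        fi
    done
    cat "$@" > result.csv
}
```

Called as concat_args f1*.csv, an unmatched pattern reaches the function as the literal string f1*.csv, fails the -f test, and the function returns without running cat.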
 
LVL 13

Expert Comment

by:Hugh McCurdy
I know how to find if a file exists, if I know the name.  Do we know at least one of the names?  

If there are any f1*.csv files then is it guaranteed that one of them would be called f10001.csv (for instance)?  If so, this is easy.  If not, I haven't found an answer.


To give you an idea of what I'm talking about


if [ -a f10001.csv  ]; then
  echo "yes"
else
  echo "no"
fi


I just don't have much confidence this is your solution.
 
LVL 13

Expert Comment

by:Hugh McCurdy
klode, I tried something like that and it didn't work.  Using (mostly) your example, I get

#!/bin/sh -
if [ -f foo*  ]; then
  echo "yes"
else
  echo "no"
fi


and I get
z: line 2: [: too many arguments
no

I do have foo and foo.cpp in that folder.

Still, if it worked for you it might work for the author.  It really doesn't matter if it works for me.  Only matters if it works for the author.
 
LVL 13

Expert Comment

by:Hugh McCurdy
If the shell won't solve this problem, there is a way to solve the problem -- write a C/C++ program to control the processing.  At this point, if this was my problem, that's what I'd be doing.  But my idea only works for people who know C.

Best idea I have for now.

 
LVL 4

Expert Comment

by:klodefactor
D'oh.  My fault for posting without thinking.  The problem you hit is that if the pattern expands to more then one file, the test fails because it expects only one file (it's a unary test).
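
One way around the unary-test limitation is to test the expanded names one at a time. This is only a sketch; have_match is a hypothetical helper name, not a standard command.

```shell
#!/bin/sh
# Sketch: succeed if at least one regular file matches the pattern.
have_match() {
    # The caller passes the pattern unquoted, so the shell has already
    # expanded it; an unmatched pattern arrives as the literal string.
    for name in "$@"; do
        [ -f "$name" ] && return 0
    done
    return 1
}

if have_match f1*.csv; then
    echo "found at least one f1*.csv file"
fi
```

Because each call to [ sees exactly one name, it does not matter how many files the pattern expands to.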

More sleaze.  The following is stupid because it runs the "ls" command twice.  Not important if this won't run often, but should still be cleaned up.
#!/bin/sh -
ls f1*.csv > /dev/null 2>&1 \
    && cat `ls -tr f1*.csv` \
    > result.csv

The "\" escapes the end of line so that the following line is considered to be part of the same line that contains the "\".  I use it to make complex commands more readable.  The "&&" is a boolean AND: if the command before the "&&" fails (has non-zero exit code), then the command after the "&&" is not run.  So if no files are found by the first ls, the "cat" (and its backticked "ls") won't run.
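
A possible cleanup that runs ls only once is sketched here: keep its output in a variable and call cat only when that variable is non-empty. The function name concat_oldest_first is an assumption, and the sketch assumes that no file name contains spaces, tabs or newlines.

```shell
#!/bin/sh
# Sketch: one ls call, then cat only if ls found something.
# concat_oldest_first is a hypothetical name used for illustration.
concat_oldest_first() {
    # ls -tr lists oldest first; its error for an unmatched pattern is hidden.
    files=`ls -tr f1*.csv 2>/dev/null`

    # Nothing matched: report failure without touching result.csv.
    [ -n "$files" ] || return 1

    # $files is deliberately unquoted so it splits into one word per name.
    cat $files > result.csv
}
```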

Oh and I just realized that if any of the filenames contain spaces, this needs an overhaul anyway.
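
For names that contain spaces, one option is to read the ls output line by line instead of relying on word splitting. This is a sketch under assumptions: it needs a shell whose read supports -r (ksh or any POSIX shell), it still breaks on names containing newlines, and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: concatenate oldest first, one file name per line of ls output.
# concat_with_spaces is a hypothetical name used for illustration.
concat_with_spaces() {
    ls -tr f1*.csv 2>/dev/null |
    while IFS= read -r name; do
        # Quoting keeps a name such as "f1 jan.csv" as a single argument.
        cat "$name"
    done > result.csv
}
```

Note the trade-off: when nothing matches, this version does not stall, but it does leave an empty result.csv behind.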

Hmccurdy: if sh isn't your thing, C/C++ would work but ye gods why not use something like perl or python?

--klodefactor
 
LVL 4

Expert Comment

by:klodefactor
Argh typo "more than one file"
 
LVL 13

Expert Comment

by:Hugh McCurdy
Mostly it's because I can code very quickly in C.  I sometimes write faster in C than in English.... (OK, maybe not.)

In any event, I think you have a solution.  Good night.
 

Author Comment

by:toooki
Thanks a lot.

One question...

if [ -a f*.csv  ]; then
   cat `ls -tr f*.csv` > result.csv
else
  exit 1
fi


The above picks up file1, file2, file3, etc., and also exits the program (which is what I wanted) if there is no such file. So it seemed to work.

Did you mean that it will not work in some cases? I tested with two files and it seemed to work.
Otherwise I will try the other command you mentioned:
ls f1*.csv > /dev/null 2>&1 \
    && cat `ls -tr f1*.csv` \
    > result.csv

Thanks!
 
LVL 4

Expert Comment

by:klodefactor
Weird.  It should have failed if "f*.csv" expands to more than one filename.  You can always check what the script is really doing by running it as "sh -x scriptname".  This tells sh to print the line just before executing the command, i.e. after variable substitution, filename substitution, etc.  If you want to also see the line as it exists in the shell script (before any substitution), use "sh -vx".  This will print the unmodified line, then the line after substitutions.
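
A small sketch of what the trace looks like: it writes a one-line script to a scratch file (the file name is arbitrary and only for illustration), runs it with sh -x, and captures the trace, which sh writes to standard error.

```shell
#!/bin/sh
# Scratch script containing a single command that uses a glob pattern.
script="${TMPDIR:-/tmp}/trace_demo.$$"
echo 'echo f1*.csv' > "$script"

# -x prints each command after expansion, prefixed with "+ ", so the
# trace shows the names the pattern expanded to rather than the pattern.
trace=`sh -x "$script" 2>&1`
echo "$trace"

rm -f "$script"
```

If the trace line shows more than one file name after the test operator in your script, that is the case where the unary test is expected to fail.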

Can I assume you added the "else...exit 1" because you need the exit status to be set?  If not you can leave it out.

Lastly, I'd use "-f" instead of "-a" when working with file names.  "-a" checks whether the name exists, but "-f" checks that it exists and is a file.  So in the unlikely event that "f1aaa.csv" is a folder (or socket or pipe or device special file), the "-f" test won't match that name.
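
The difference can be seen with a directory whose name looks like a data file. This sketch uses -e, the POSIX spelling of the existence test that ksh also accepts as unary -a; the scratch path and variable names are assumptions for illustration.

```shell
#!/bin/sh
# Scratch directory containing a DIRECTORY named like a data file.
demo_dir="${TMPDIR:-/tmp}/ftest.$$"
mkdir -p "$demo_dir/f1aaa.csv"

# -e only asks whether the name exists, whatever its type.
if [ -e "$demo_dir/f1aaa.csv" ]; then exists=yes; else exists=no; fi

# -f additionally requires a regular file, so a directory fails the test.
if [ -f "$demo_dir/f1aaa.csv" ]; then regular=yes; else regular=no; fi

echo "exists=$exists regular=$regular"
rm -rf "$demo_dir"
```

This prints exists=yes regular=no.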

--klodefactor
 
LVL 35

Assisted Solution

by:Robert Schutt
Robert Schutt earned 118 total points
To solve the problem of cat waiting for (standard) input when no files match the pattern you could also use:

cat `ls -tr f*.csv 2>/dev/null` < /dev/null > result.csv


If you want to test for an output file size greater than 0 (when no files match the pattern, a zero-size result.csv will be created), you can use:

if [ ! -s result.csv ]; then exit 1; else echo AOK; fi

 
LVL 19

Expert Comment

by:simon3270
A simpler fix is

   cat `ls -tr f*.csv` /dev/null > result.csv

so that even if there are no f*.csv files, cat still has an (empty) file to list the contents of.
 
LVL 4

Expert Comment

by:klodefactor
robert_schutt, simon3270: that's a neat trick.  However, it ends up creating result.csv and then having to check whether it's empty and if so deleting it.  Better to do the work up front than have to clean up afterwards after needless work.

Remember that such commands might run on heavily-loaded systems or very often, so the less work done by the script the better.  That's why I called my own two-"ls" solution sleazy :-).

--klodefactor
 
LVL 13

Expert Comment

by:Hugh McCurdy
klodefactor, if the system is heavily loaded, wouldn't it be better to write a really tight C program?
 
LVL 19

Assisted Solution

by:simon3270
simon3270 earned 119 total points
The check for emptiness is a shell built-in, so doesn't kick off another process.  You only have to run the extra "rm" process if the file is indeed empty.
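
Put together, the /dev/null approach with cleanup might look like the sketch below. The function name concat_or_cleanup is an assumption made for illustration, and the pattern f*.csv and output name result.csv are taken from the commands earlier in the thread.

```shell
#!/bin/sh
# Sketch: always give cat an operand, then remove an empty result.
# concat_or_cleanup is a hypothetical name used for illustration.
concat_or_cleanup() {
    # /dev/null guarantees cat at least one operand, so it never
    # falls back to reading standard input.
    cat `ls -tr f*.csv 2>/dev/null` /dev/null > result.csv

    # -s is true only for a file larger than zero bytes; rm runs
    # only when the result turned out to be empty.
    if [ ! -s result.csv ]; then
        rm -f result.csv
        return 1
    fi
}
```

One caveat: if files match but all of them are empty, result.csv is also removed and the function reports failure.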
 

Author Comment

by:toooki
Thank you all. My apology for late reply.
Everything worked for me.
Thank you.