Solved

do count and append

Posted on 2011-05-04
Last Modified: 2013-12-26
I have 10 files: test_aa_20110401.txt, test_aa_20110408.txt, and so on.
The data is pipe-delimited and looks like this:

test_aa_20110401.txt

aa|bb|cc|dd||||||||
bb||123||||ss||cc||

and so on.

I want to append all 10 files into one and also get a row count for all of them:

1) get the row count for each of the 10 files
2) append all files into one big file and get the row count for all 10 files combined

Thanks.
Question by:sam2929
21 Comments
 

Expert Comment

by:woolmilkporc
ID: 35688401
for file in $(ls -1 test_aa_*); do
   wc -l $file
   cat $file >> test_aa_full.txt
 done
wc -l test_aa_full.txt

Please note that it's the digit "1" in "ls -1 test_aa_*", not the letter "l"!

wmp

Expert Comment

by:ozo
ID: 35688574
wc test_aa_*

cat test_aa_* > full.txt

Expert Comment

by:stetor
ID: 35688604


cat test_aa_* >test_aa_full.txt
wc -l test_aa*



ozo's solution is missing the -l option, so wc also returns the word and byte counts,
and it does not give the line count of the joined file.


Expert Comment

by:tel2
ID: 35688674
Hi wmp,
I think you'll find that ls's "-1" switch is unnecessary in this case.  Try it without.  I doubt it would be of use in many scripts.

Hi sam,
Does this do what you require:
    wc -l test_aa_*
    # You should now have a row count for each file, and a total
    # And if you then really want to end up with a file which contains all the data, add this line:
    cat test_aa_* >test_all.txt

Expert Comment

by:tel2
ID: 35688686
...but I now see others have already basically covered what I suggested.
 

Author Comment

by:sam2929
ID: 35688918
I need a small modification here.

Besides each data file there is a control file, e.g. test_aa_20110401.cntrl. It contains
just one row holding the row count, like 0000006616. If the wc count for test_aa_20110401.txt matches the count in test_aa_20110401.cntrl, append the file; otherwise don't.

So if
test_aa_20110401.cntrl has the single row count 0000006616
and
test_aa_20110401.txt has a wc count of 6616,
then append it.
If
test_aa_20110402.cntrl contains the count 000000313
and
test_aa_20110402.txt has a wc count of 6616,
don't append it.

Expert Comment

by:woolmilkporc
ID: 35689023
for file in $(ls -1 test_aa_*); do
   wc -l $file
   [[ $(wc -l < $file) -eq $(cat ${file}.cntrl ]] && cat $file >> test_aa_full.txt
 done
wc -l test_aa_full.txt


Expert Comment

by:woolmilkporc
ID: 35689027
Typo - forgot a closing parenthesis, sorry!

for file in $(ls -1 test_aa_*); do
   wc -l $file
   [[ $(wc -l < $file) -eq $(cat ${file}.cntrl) ]] && cat $file >> test_aa_full.txt
 done
wc -l test_aa_full.txt


Author Comment

by:sam2929
ID: 35689086
OK, two things. First, I want to parameterize test_aa and the output name. Second, I have never written a script before.
If I want to write a script on a Unix box, what path names and variables do I need to define?

Author Comment

by:sam2929
ID: 35689186
When I run it I get the errors below:
$ for file in $(ls -1 pureq_*); do
   wc -l $file
   [[ $(wc -l < $file) -eq $(cat ${file}.cntl) ]] && cat $file >> pureq.txt
 done
wc -l pureq.txt> > >        1 pureq_20110419_003002_000.cntl
cat: cannot open pureq_20110419_003002_000.cntl.cntl
cat: cannot open pureq_20110419_003002_000.cntl.cntl
   10000 pureq_20110419_003002_000.txt
cat: cannot open pureq_20110419_003002_000.txt.cntl
cat: cannot open pureq_20110419_003002_000.txt.cntl
       1 pureq_20110419_003002_001.cntl

Accepted Solution

by:
woolmilkporc earned 2000 total points
ID: 35689196
Open an editor of your choice (e.g. vi) and create a file containing this:

#!/bin/sh
PREFIX=$1
OUTPUT=${PREFIX}_FULL.txt
for file in $(ls -1 ${PREFIX}*); do
   wc -l $file
   [[ $(wc -l < $file) -eq $(cat ${file%%.txt}.cntrl) ]] && cat $file >> $OUTPUT
 done
wc -l $OUTPUT
exit


(Please note that I made a small correction - the "cat ${file%%.txt}.cntrl" thing!)

Save the file in a common place you have write access to, under a meaningful name, e.g. "/usr/local/bin/count_append.sh" (the suffix ".sh" is not really needed, but useful to distinguish scripts from other files).

Now issue

chmod +x /usr/local/bin/count_append.sh

From now on you can start the script with

/usr/local/bin/count_append.sh test_aa

If you put /usr/local/bin (or whatever location you choose) in your PATH variable you can start your new script from anywhere (the script is designed to be run from where your input files are) by just typing

count_append.sh test_aa

wmp





Expert Comment

by:woolmilkporc
ID: 35689205
Just saw your last comment - of course the loop must only pick up the .txt files:

#!/bin/sh
PREFIX=$1
OUTPUT=${PREFIX}_FULL.txt
for file in $(ls -1 ${PREFIX}*.txt); do
   wc -l $file
   [[ $(wc -l < $file) -eq $(cat ${file%%.txt}.cntrl) ]] && cat $file >> $OUTPUT
 done
wc -l $OUTPUT
exit
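
A more defensive variant of the loop above, as a minimal sketch rather than a drop-in replacement. It assumes the control file holds the count as the first field of its first line, possibly zero-padded. The function name count_append, the optional third argument for the control-file suffix, and the skip messages are illustrative and not from the thread. It uses a glob instead of ls, quotes the variables, uses [ ] because [[ ]] is not part of POSIX sh, and starts from an empty output file so that a rerun does not append the same data twice (note that ${PREFIX}_FULL.txt itself matches ${PREFIX}*.txt).

```shell
#!/bin/sh
# Sketch: append each PREFIX*.txt file to OUTPUT only when its line count
# matches the number stored in the matching control file.
# Assumptions (not from the thread): the count is the first field on the
# first line of the control file; the suffix defaults to "cntrl".

count_append() {
    prefix=$1
    output=$2
    ctl_ext=${3:-cntrl}

    : > "$output"                              # start with an empty output file
    for file in "$prefix"*.txt; do
        [ -f "$file" ] || continue             # the glob matched nothing
        [ "$file" = "$output" ] && continue    # never append the output to itself
        ctl=${file%.txt}.$ctl_ext
        if [ ! -f "$ctl" ]; then
            echo "skip $file: no control file $ctl" >&2
            continue
        fi

        # Some wc implementations pad the number with blanks; strip them.
        actual=$(wc -l < "$file" | tr -d '[:space:]')
        # "$1 + 0" turns a zero-padded value such as 0000006616 into 6616;
        # tr removes carriage returns in case the file came from Windows.
        expected=$(tr -d '\r' < "$ctl" | awk 'NR == 1 { print $1 + 0 }')

        echo "$actual $file"
        if [ "$actual" -eq "${expected:--1}" ]; then
            cat "$file" >> "$output"
        else
            echo "skip $file: control file says ${expected:-nothing}" >&2
        fi
    done

    total=$(wc -l < "$output" | tr -d '[:space:]')
    echo "$total $output"
}

# Run only when called with at least a prefix and an output name.
if [ "$#" -ge 2 ]; then
    count_append "$@"
fi
```

Saved as a script, it could be called as "count_append.sh test_aa test_aa_FULL.txt", or with a third argument "cntl" for control files named like pureq_20110419_003002_000.cntl.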

Author Comment

by:sam2929
ID: 35689374
It appends just one file, although there are definitely more files, and I also get a syntax error:

$ $ for file in $(ls -1 pureq*.txt); do
   wc -l $file
   [[ $(wc -l < $file) -eq $(cat ${file%%.txt}.cntl) ]] && cat $file >> pureq.txt
 done
wc -l pureq.txtksh: syntax error: `do' unexpected
$     7251 pureq_20110503_003002_002.txt
$ $ ksh: syntax error: `done' unexpected
$ wc pureq.txt
wc: cannot open pureq.txtwc
    7251 pureq.txt
    7251 total

Expert Comment

by:woolmilkporc
ID: 35689427
Which way did you create the script? A Windows editor perhaps?

Write it using Unix tools, or, if you write it on Windows, transfer it via FTP using "binary" mode.
Looks just as if the script contains extra carriage return characters or other messy stuff.

Author Comment

by:sam2929
ID: 35689588
I just ran this in PuTTY using copy and paste. Is that not the right way?

Expert Comment

by:woolmilkporc
ID: 35689717
Copy and paste might transfer too many/wrong characters.

Run

cat -etv filename

and post the results here.
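
In that output a carriage return shows up as ^M just before the $ that marks the end of a line. If that is what you see, one way to count and strip the extra characters is sketched below. The file name is hypothetical, and the first printf only fakes a file with Windows line endings for the demonstration.

```shell
# Hypothetical file name, used only for this demonstration.
script=${TMPDIR:-/tmp}/crlf_demo.sh

# Simulate a script that was saved with Windows (CRLF) line endings.
printf '#!/bin/sh\r\necho hello\r\n' > "$script"

# Count the lines that end in a carriage return (prints 2 for this sample).
cr=$(printf '\r')
grep -c "${cr}\$" "$script"

# Write a copy without carriage returns, then replace the original with it.
tr -d '\r' < "$script" > "$script.tmp" && mv "$script.tmp" "$script"
```

After the tr step the same grep prints 0, and "cat -etv" no longer shows ^M.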

Author Comment

by:sam2929
ID: 35689729
What does this line do?
 [[ $(wc -l < $file) -eq $(cat ${file%%.txt}.cntrl) ]] && cat $file >> $OUTPUT

Expert Comment

by:woolmilkporc
ID: 35689808
"wc -l < $file" outputs the number of lines in $file.

"${file%%.txt}.cntrl" takes the value of $file and replaces the ".txt" at the end with ".cntrl".

"cat" outputs the content of the file with the modified name.

"[[ ... ]] && ..." is a short form of an "if ... then ... fi" construct. "-eq" means "equal" (a numeric comparison).

"cat inputfile >> outputfile" appends the content of inputfile to outputfile.
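
The pieces explained above can be tried on their own, as a small sketch. The variable names ctl, short, long, lines and expected are only for illustration. It uses [ ] instead of [[ ]] so that it also runs in a plain POSIX sh.

```shell
file=test_aa_20110401.txt

# Strip the trailing ".txt" and append ".cntrl" to get the control file name.
ctl=${file%.txt}.cntrl
echo "$ctl"                 # test_aa_20110401.cntrl

# "%" removes the shortest matching suffix, "%%" the longest one.
# With a fixed string such as ".txt" both give the same result;
# they differ only when the pattern contains a wildcard.
short=${file%_*}            # test_aa
long=${file%%_*}            # test
echo "$short $long"

# "test && command" is shorthand for the if/then/fi form below.
lines=3
expected=3
[ "$lines" -eq "$expected" ] && echo "counts match"
if [ "$lines" -eq "$expected" ]; then
    echo "counts match"
fi
```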

Expert Comment

by:tel2
ID: 35693754
Hi woolmilkporc,

To get a bit off the current issues...

Did you see my last comment about "ls -1"?  Why are you using "-1" in this context?  Have you tried just "ls"?

Why do you have "exit" at the end of your script?

Expert Comment

by:woolmilkporc
ID: 35693922
tel2,

please don't forget that not everybody here is such an experienced and wise guru like you obviously are.

ls alone will give one-column output only when used from inside a script, not from commandline.
I often saw people trusting in ls giving a single column and getting upset and clueless when they first used it from commandline and got wrong results. That's why I always recommend ls -1 in such a case. It doesn't do any harm, or does it?

Why should I not use exit? I know there's an implicit exit at end-of-file, but I think it's good for readability having it there, particularly when posting in a forum - you'll recognize the end of the script at first sight.

Don't you think it would be better for our askers and all of us if you tried to help finding solutions or correct obviously wrong suggestions instead of nitpicking?




Expert Comment

by:tel2
ID: 35694448
Hi woolmilkporc,

I'm not trying to nitpick, and I'm sorry if it came across that way.  Even if I was a wiser guru than you, that shouldn't stop me from trying to help you and the asker to understand what is needed and what is not, in case you both didn't already know (you might know, but the asker might not).  Cutting out the unnecessary stuff can save everyone time in the future, including your explanation of it being "-1", not "-l".  Since you didn't respond to my first post, I didn't know if you'd read it, so I repeated myself.  From your perspective, it seems my comments were not useful.  Hopefully the asker will benefit in some way.

I did try to find a solution, but it looks as if you have it in hand now, and I don't have enough spare time at present, so I'm leaving it to you.

Tel2