IT_ETL
How do I concatenate many files into a single file?
I have around 100 files that I would like to concatenate into a single file. They are all text files, and there should be no blank lines or spaces between file1 and file2, between file2 and file3, and so on. The contents of file2, file3, and so on should start on the line immediately after the previous file ends. For example,
file_20140801_1
100 20140101
101 20140102
file_20140802_2
1000 20130101
1001 20130102
file_20140803_3
2000 20140601
2001 20140602
When I concatenate the three files above, the result should look like this:
file
100 20140101
101 20140102
1000 20130101
1001 20130102
2000 20140601
2001 20140602
Given the example above, how do I concatenate around 100 files into a single file in UNIX? Please advise.
ASKER
The files are not sorted and can be concatenated in any order.
Ok, based on your example: are all the filenames named like this:
file_20140801_1
file_20140802_2
file_20140803_3
etc?
Then the 'sort' would just be alphabetical and this should work for you:
cat `ls -1 file_* | sort` > newfile
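As a rough illustration of the command above, here is a minimal sketch that rebuilds the three sample files from the question in a scratch directory and concatenates them. The scratch directory and the use of printf to create the files are assumptions for the demo only. Since the shell already expands file_* in sorted order, the sketch passes the glob straight to cat.

```shell
# Hypothetical scratch directory and sample files, mirroring the question's example.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '100 20140101\n101 20140102\n'   > file_20140801_1
printf '1000 20130101\n1001 20130102\n' > file_20140802_2
printf '2000 20140601\n2001 20140602\n' > file_20140803_3

# The shell expands file_* in sorted order. The output name "newfile"
# does not match the pattern, so it is never read as one of the inputs.
cat file_* > newfile

# Show the combined result.
cat newfile
```

With these three inputs, newfile should hold the six records in file order with nothing between them.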
ASKER
Thanks, Gerwin. The command above concatenated all the files. Now I would like to check a couple of things.
1) How do I count the files whose names start with file in a given directory? I am expecting a total of 111.
2) How do I check that the combined number of records across all the files equals the number of records in the single file? I ran the commands below:
wc -l file*
# shows the number of records in each file, plus the total on the last line
wc -l single_file*
# shows the total number of records in the single file
The record counts matched. Would you suggest any other commands to validate that the record counts are the same?
ls file*|wc
Your command wc -l file* lists the number of lines in each file.
By default, wc outputs three items:
Number of lines
Number of words
Number of bytes (the same as the number of characters for plain ASCII text)
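To make the three counts concrete, the sketch below runs wc with each option separately on a two-line sample shaped like the records in the question. The temporary file and the variable names are assumptions for illustration.

```shell
# Hypothetical two-line sample, same shape as the records in the question.
sample=$(mktemp)
printf '100 20140101\n101 20140102\n' > "$sample"

lines=$(wc -l < "$sample")   # number of newline characters
words=$(wc -w < "$sample")   # number of whitespace-separated words
bytes=$(wc -c < "$sample")   # number of bytes

# Arithmetic expansion drops the leading padding some wc implementations print.
echo "lines=$((lines)) words=$((words)) bytes=$((bytes))"
# prints: lines=2 words=4 bytes=26
```

Reading the file on standard input (wc -l < file) keeps the file name out of the output, which makes the number easier to capture in a variable.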
wc -l file*
# above command will count lines in every file matching file* pattern, with a summary line at the end
sample:
wc -l file*
2 file_20140801_1
2 file_20140802_2
2 file_20140803_3
6 total
wc -l newfile
# above command will count lines in the new file named newfile
sample:
wc -l newfile
6 newfile
Note that both commands above give the same output: 6 lines in total.
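Rather than comparing the two totals by eye, the check could be scripted. The following is a sketch only, using the sample files from the question recreated in a hypothetical scratch directory; the variable names are made up for the demo. It also adds a stricter byte-for-byte comparison with cmp, which goes beyond what was asked but catches cases where the line counts match and the content does not.

```shell
# Hypothetical scratch directory with the question's sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '100 20140101\n101 20140102\n'   > file_20140801_1
printf '1000 20130101\n1001 20130102\n' > file_20140802_2
printf '2000 20140601\n2001 20140602\n' > file_20140803_3
cat file_* > newfile

# Total lines across the parts versus lines in the combined file.
# $(( ... )) normalises any padding in the wc output.
parts_total=$(( $(cat file_* | wc -l) ))
combined_total=$(( $(wc -l < newfile) ))

if [ "$parts_total" -eq "$combined_total" ]; then
    result="OK"
else
    result="MISMATCH"
fi
echo "$result: parts=$parts_total combined=$combined_total"

# Stricter check: compare the content byte for byte, not just the counts.
if cat file_* | cmp -s - newfile; then
    identical=yes
else
    identical=no
fi
echo "identical=$identical"
```

Note that this only works as written because newfile does not match file_*. If the combined file's name matched the pattern, it would be counted among the parts.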
ASKER
ls file*|wc gives me the output below:
$ ls file*|wc
111 111 488
ls file* | wc -l
gives you the number of files. Did you forget the -l at the end?
$ ls file* | wc -l
3
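For completeness, here is a sketch of two ways to count the matching files, run against the three sample files from the question in a hypothetical scratch directory. The second approach, which loads the matches into the positional parameters, is an alternative not mentioned in the thread; it avoids ls and is not confused by unusual file names.

```shell
# Hypothetical scratch directory with the question's sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '100 20140101\n101 20140102\n'   > file_20140801_1
printf '1000 20130101\n1001 20130102\n' > file_20140802_2
printf '2000 20140601\n2001 20140602\n' > file_20140803_3

# ls prints one name per line when its output goes to a pipe,
# so wc -l counts the names.
ls file_* | wc -l

# Alternative without ls: load the matches into the positional parameters.
set -- file_*
# If nothing matches, the pattern is left as a literal string,
# so confirm that the first entry really exists before trusting $#.
if [ -e "$1" ]; then
    count=$#
else
    count=0
fi
echo "$count"
```

Both commands should report 3 here, and 111 in the directory described in the question.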
No need for ls
cat file* >newfile
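One caveat with plain cat: if an input file does not end with a newline, the first record of the next file is glued onto its last line. The question asks for each file to start on a new line, so the loop below is a possible workaround, sketched under the assumption that some inputs may lack the final newline. The sample files and the scratch directory are hypothetical.

```shell
# Hypothetical scratch directory; the first file has no final newline.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '100 20140101\n101 20140102'     > file_20140801_1
printf '1000 20130101\n1001 20130102\n' > file_20140802_2

for f in file_*; do
    cat "$f"
    # Command substitution strips a trailing newline, so the result is
    # non-empty only when the file's last byte is not a newline.
    if [ -n "$(tail -c 1 "$f")" ]; then
        echo
    fi
done > newfile

cat newfile
```

If every input already ends with a newline, the loop produces the same result as cat file* >newfile.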
Or: ls -1 | sort