Cutting and renaming files
I have some files such as:
abc.dat.xyz1234.001
abc.dat.xyz1234.002
abc.dat.xyz1234.003
and so on.
The files are pipe delimited as:
1|abc|hty|!
2|def|hju|!
and so on.
I would like to remove the first column value and the delimiter from all rows in the file. So the files need to look like this:
abc|hty|!
def|hju|!
I know I can do it using cut but I also need to rename the files as:
abc.dat.xyz1234.001 becomes abc.dat.00
abc.dat.xyz1234.002 becomes abc.dat.01
and so on.
Most important, the number of files is not fixed. The only difference between the file names is the last digit. Is there a simple way to do this?
ASKER
I am not clear about your question but the files need to be:
00
01
02
and so on.
The files come in as:
001
002
003
so the output is one less.
The incoming files are numbered sequentially with three digits. After the change they need to be numbered sequentially with two digits, starting at 00.
Does that answer your question?
#!/bin/bash
n=100;
for f in abc.dat.xyz1234.??? ; do cut -d'|' -f 2- $f>${f%.*}.${n:1}; let ++n ; done
ASKER
ozo, thank you for your solution. It almost works but I need to rename the files as:
abc.dat.xyz1234.001 becomes abc.dat.00
abc.dat.xyz1234.002 becomes abc.dat.01
After running your script, the files are becoming:
abc.dat.xyz1234.00
abc.dat.xyz1234.01
They need to be:
abc.dat.00
abc.dat.01
I mentioned this in my first post.
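The stray xyz1234 comes from how much of the name the % expansion strips. ${f%.*} removes only the shortest trailing match of .*, which is the final .001, while the pattern .*.* removes the last two dot-separated components. A small sketch, using the file name from the question as a plain string (no files are touched; the variable names one and two are only for illustration):

```shell
f=abc.dat.xyz1234.001

one="${f%.*}"      # strips ".001"          -> abc.dat.xyz1234
two="${f%.*.*}"    # strips ".xyz1234.001"  -> abc.dat

echo "$one"
echo "$two"
```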
ASKER CERTIFIED SOLUTION
ASKER
Thanks.
ASKER
ozo, can you please explain your solution a little?
for f in abc.dat.xyz1234.??? ; do cut -d'|' -f 2- $f>${f%.*.*}.${n:1}; let ++n ; done
I understand this part:
for f in abc.dat.xyz1234.??? ;
You are asking it to go through all files with the 3 number wildcard at the end.
Can you please explain this part?
do cut -d'|' -f 2- $f>${f%.*.*}.${n:1};
Thanks.
${f%.*.*} is $f with the shortest trailing match of the pattern .*.* removed, i.e. the last two dot-separated components
${n:1} is $n with the first character removed
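The counter trick can be seen in isolation. Starting n at 100 and dropping the first character with ${n:1} yields a zero-padded two-digit suffix. This is a minimal sketch; the variable name suffixes is only for illustration, and ${n:1} is a bash substring expansion rather than plain POSIX sh.

```shell
n=100
suffixes=""
for _ in 1 2 3; do
    # 100 -> 00, 101 -> 01, 102 -> 02
    suffixes="${suffixes:+$suffixes }${n:1}"
    let ++n    # equivalent to n=$((n + 1))
done
echo "$suffixes"
```

Because n starts at 100, this covers suffixes 00 through 99; a 101st file would bring n to 200 and the suffix would wrap around to 00 again.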
ASKER
ozo, thank you very much for your response.
So in this example:
abc.dat.xyz1234.001
$f = abc.dat.xyz1234.001
1) do cut -d'|'
means for every row remove the | and anything to its left, right?
2) What is the
-f 2-
for?
3) ${n:1}
How does this become 00?
4)
$f>${f%.*.*}
means abc.dat.xyz1234.001 becomes abc.dat, right? Does the % remove the .*.*?
5)
let ++n
means increment the file counter, right? What is the counter value that it starts with?
Thanks a lot.
man cut
...
The list option argument is a comma or whitespace separated set of numbers and/or number ranges. Number ranges consist of a number, a dash (`-'), and a second number and select the fields or columns from the first
number to the second, inclusive. Numbers or number ranges may be preceded by a dash, which selects all fields or columns from 1 to the last number. Numbers or number ranges may be followed by a dash, which selects
all fields or columns from the last number to the end of the line.
...
-f list
The list specifies fields, separated in the input by the field delimiter character (see the -d option.) Output fields are separated by a single occurrence of the field delimiter character.
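Applied to the sample rows from the question, -d'|' sets the field delimiter and -f 2- selects field 2 through the end of the line, so the first field and the delimiter after it are dropped. A quick sketch (the variable name out is just for illustration):

```shell
# Feed the two sample rows through cut and capture the result.
out=$(printf '1|abc|hty|!\n2|def|hju|!\n' | cut -d'|' -f 2-)
echo "$out"
```

So cut does not "remove the | and anything to its left" as such; it keeps the selected fields and rejoins them with the same delimiter.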
#!/bin/bash
n=100
for f in abc.dat.xyz1234.??? ; do cut -d'|' -f 2- $f>${f%.*.*}.${n:1}; let ++n ; done
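To try the loop without touching real data, it can be run in a scratch directory. The sketch below is one way to do that, assuming bash; the sample rows in the second file are made up for the test, and the quoting and the existence check are additions that the one-liner above does not have.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample inputs modeled on the question (second file's row is invented).
printf '1|abc|hty|!\n2|def|hju|!\n' > abc.dat.xyz1234.001
printf '3|ghi|klm|!\n' > abc.dat.xyz1234.002

n=100
for f in abc.dat.xyz1234.???; do
    [ -e "$f" ] || continue                      # glob matched nothing
    cut -d'|' -f 2- "$f" > "${f%.*.*}.${n:1}"    # e.g. abc.dat.00
    n=$((n + 1))
done

ls
```

The original files are left in place, since cut writes new files rather than renaming; removing the inputs afterwards would be a separate step.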
The two-digit string you asked for at the end of the new file name, is that simply sequential? Or is it expected to be mathematically related (as in: "one less than the numerical value of") to the final digit sequence of the old file?
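If the new suffix should be derived from the old number rather than from a running counter, one possible sketch is below. It assumes the rule is "old number minus one, zero-padded to two digits", which matches the examples in the question but was not confirmed by the asker. The file name ending in .008 is hypothetical, chosen to show why the 10# prefix is there: it forces base 10 so that bash does not try to parse 008 as octal.

```shell
f=abc.dat.xyz1234.008
old="${f##*.}"                             # last component: "008"
new=$(printf '%02d' "$((10#$old - 1))")    # 8 - 1 = 7, padded to "07"
target="${f%.*.*}.$new"
echo "$target"
```

The two approaches only differ when the incoming numbers have gaps: a running counter always produces consecutive suffixes, while this version preserves the gaps.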