bdhtechnology asked:
Read in data from 100,000+ files via command line
I had originally asked the following question:
https://www.experts-exchange.com/questions/28329553/Read-in-data-from-100-000-files-via-command-line.html
At first this seemed to work exactly the way I needed; however, I just discovered that lines containing spaces are not read in correctly.
Below is the last iteration of the code:
#!/bin/sh
echo `date`
find . -name "*.arf" | while read f; do
    newpath="$(basename $(dirname "$f"))"
    #/$(basename $f)"
    cat "$f" | gawk -v p="$newpath" '{
        attname=substr($1,1,length($1)-1); nlist=nlist"`, `"attname;
        attvalue= substr($2,2,length($2)-2); vlist=vlist", '\''"attvalue"'\''";
    }
    END {
        printf "insert into `mydatabase`.`archives` (`NEWPATH%s`) values ('\''%s'\''%s);\n", nlist, p, vlist;
    }' >> myinsertfile.sql
    #| tee -a myinsertfile.sql
    cnt=$((cnt+1))
    [ $(($cnt%100)) -eq 0 ] && echo "File #$cnt: $f"
done
echo "Total Files: $cnt"
echo `date`
For the following .arf file:
FILEID: "TIF490336"
PATH: "/optical/incoming/TIF490336"
TYPE: "TIF"
SECLEV: "10"
STATID: ""
USRID: "admin"
REQDATE: "08/02/2012"
REQTIME: "09:02:32"
GENDATE: "08/03/2012"
GENTIME: "09:02:32"
PROGID: ""
GROUPID: "Check Stubs"
DESC: "August"
It produced the following SQL statement:
insert into `mydatabase`.`archives` (NEWPATH,FILEID,PATH,TYPE,SECLEV,STATID,USRID,REQDATE,REQTIME,GENDATE,GENTIME,PROGID,GROUPID,`DESC`) values ('TIF18','TIF490336','/optical/incoming/TIF490336','TIF','10','','admin','08/02/2012','09:02:32','08/03/2012','09:02:32','','Chec','August');
As a result, the GROUPID column contains 'Chec' instead of 'Check Stubs'. How can the code above be adjusted to handle values that contain spaces as well?
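
One possible adjustment, offered as a minimal sketch rather than a definitive fix: the truncation appears to come from awk's default whitespace field splitting. For the GROUPID line, $2 is only "Check (with its opening quote), so the substr call that drops one character from each end yields Chec. Splitting each line at its first colon instead keeps the whole value, including spaces and the colons in the time fields. The helper name arf_to_insert and the awk variable q are hypothetical, introduced only for this illustration. The sketch assumes every line has the form NAME: "value" on a single line, and it does not escape quote characters inside values.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script):
#   arf_to_insert FILE NEWPATH
# Prints one INSERT statement for a single .arf file.
arf_to_insert() {
    # q holds a single quote so the awk program needs no shell quote escapes.
    awk -v p="$2" -v q="'" '
    {
        # Split at the first colon only, so spaces in the value survive.
        i = index($0, ":")
        if (i == 0) next                    # skip lines without a colon
        attname  = substr($0, 1, i - 1)
        attvalue = substr($0, i + 1)
        sub(/^[ \t]*"/, "", attvalue)       # drop leading blanks and opening quote
        sub(/"[ \t\r]*$/, "", attvalue)     # drop closing quote and trailing blanks
        nlist = nlist "`, `" attname
        vlist = vlist ", " q attvalue q
    }
    END {
        printf "insert into `mydatabase`.`archives` (`NEWPATH%s`) values (%s%s%s%s);\n", nlist, q, p, q, vlist
    }' "$1"
}

# Possible use inside the existing loop, in place of the cat | gawk pipeline:
#   arf_to_insert "$f" "$newpath" >> myinsertfile.sql
```

Only POSIX awk features are used here, so gawk is not strictly required. If values can contain single quotes or backslashes, they would still need escaping before the statements are loaded.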