Solved

Perl Data Parsing for a NEW Newbie!

Posted on 2010-09-23
Last Modified: 2013-11-18
OK. I am a new newbie and I have been tasked with doing some data parsing for a file.
I have two versions of the script. Neither seems to work. For every station that reports PM2.5 data, I am trying to copy the last numeric value, but only from the set that has 24 hourly averages, not the one with 13 averaging periods (there are always two sets for each variable). I have included the complete file that I have to parse. I have also included both scripts, labeled script 1 and 2, below. Please help!!

This is what the output file should look like:

PM2.5 21-9-2010 22:4:49 (dd-mm-yyyy hrs:min:sec)
KA5 4
OV20 10
DH1 2
PA16 8
MV17 0
HL11 3
KN12 17
PC 4
KH19 0
SI2 8
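
The header line in the sample above can be assembled from Perl's built-in localtime. This is only a minimal sketch, assuming the unpadded day-month-year layout shown in the sample; format_header is a hypothetical helper name and is not part of either script below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original scripts):
# builds the header line from a variable name and the list that
# localtime() returns.
sub format_header {
    my ($variable, @t) = @_;
    # localtime() yields (sec, min, hour, mday, mon, year, ...);
    # mon is 0-based and year counts from 1900.
    my ($sec, $min, $hour, $mday, $mon, $year) = @t;
    return sprintf("%s %d-%d-%d %d:%d:%d (dd-mm-yyyy hrs:min:sec)",
                   $variable, $mday, $mon + 1, $year + 1900,
                   $hour, $min, $sec);
}

print format_header("PM2.5", localtime()), "\n";
```

Passing the time fields in as arguments keeps the formatting separate from the clock, so the same helper can be checked against a fixed date.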


This is what the input file looks like:

BEGIN_FILE
FORMAT_VERSION,2
AGENCY,HI1
FILENAME,090913.HI1
DATA_VERSION,201009091310
TZONE,HST,10
BEGIN_GROUP
VARIABLE,CO
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009080000
END_DTG,201009082359
INTERVAL,60
START_REF,0
NUMSTEPS,24
AVG_TIME,60
UNITS,PPM
STATIONS,2
BEGIN_DATA
KA5,150030010,0.2,0.2,0.2,0.2,0.2,0.2,-999,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2
KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.6,0.6,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5
DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,CO
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009090000
END_DTG,201009091309
INTERVAL,60
START_REF,0
NUMSTEPS,13
AVG_TIME,60
UNITS,PPM
STATIONS,2
BEGIN_DATA
KA5,150030010,0.2,0.2,0.2,0.2,0.2,0.2,-999,0.3,0.2,0.2,0.2,0.2,0.2
KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G
DH1,150031001,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5
DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,NO2
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009080000
END_DTG,201009082359
INTERVAL,60
START_REF,0
NUMSTEPS,24
AVG_TIME,60
UNITS,PPM
STATIONS,2
BEGIN_DATA
KA5,150030010,0.001,0,0,0,0.001,0.004,-999,0.004,0.005,0.003,0.003,0.002,0.002,0.002,0.002,0.002,0.001,0.002,0.001,0.002,0.002,0.002,0.001,0
KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
WB6,150030011,0,0,0,0,0,0,-999,0,0.002,0.002,0,0.001,0,0,0,0,0,0,0,0,0,0,0,0
WB6,150030011,G,G,G,G,G,G,B,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,NO2
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009090000
END_DTG,201009091309
INTERVAL,60
START_REF,0
NUMSTEPS,13
AVG_TIME,60
UNITS,PPM
STATIONS,2
BEGIN_DATA
KA5,150030010,0,0,0.001,0.001,0.003,0.011,-999,0.009,0.004,0.002,0.002,0.002,0.003
KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G
WB6,150030011,0,0,0,0,0,0.002,-999,0.005,0.002,0,0.001,0,0
WB6,150030011,G,G,G,G,G,G,B,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,OZONE
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009080000
END_DTG,201009082359
INTERVAL,60
START_REF,0
NUMSTEPS,24
AVG_TIME,60
UNITS,PPM
STATIONS,1
BEGIN_DATA
SI2,150031004,0.013,0.014,0.013,0.013,0.013,0.01,0.009,0.007,0.011,-999,0.021,0.019,0.019,0.018,0.019,0.017,0.018,0.018,0.016,0.009,0.014,0.017,0.017,0.017
SI2,150031004,G,G,G,G,G,G,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,OZONE
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009090000
END_DTG,201009091309
INTERVAL,60
START_REF,0
NUMSTEPS,13
AVG_TIME,60
UNITS,PPM
STATIONS,1
BEGIN_DATA
SI2,150031004,0.016,0.017,0.016,0.01,0.014,0.011,0.006,0.009,0.017,0.018,0.02,0.022,0.02
SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,PM10
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009080000
END_DTG,201009082359
INTERVAL,60
START_REF,0
NUMSTEPS,24
AVG_TIME,60
UNITS,UG/M3
STATIONS,4
BEGIN_DATA
KA5,150030010,3,5,9,7,4,9,11,24,26,28,22,20,13,18,13,18,11,9,7,3,1,2,5,6
KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
WB6,150030011,3,2,2,6,10,7,3,5,16,9,7,9,16,14,11,8,7,6,7,6,5,4,5,4
WB6,150030011,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,10,11,10,7,8,7,8,6,5,5,-999,-999,6,8,8,8,9,9,9,16,10,8,7,7
DH1,150031001,G,G,G,G,G,G,G,G,G,G,B,B,G,G,G,G,G,G,G,G,G,G,G,G
PC,150032004,19,10,18,12,8,7,12,24,11,12,12,8,18,10,9,8,8,10,11,11,12,13,15,13
PC,150032004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,PM10
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009090000
END_DTG,201009091309
INTERVAL,60
START_REF,0
NUMSTEPS,13
AVG_TIME,60
UNITS,UG/M3
STATIONS,4
BEGIN_DATA
KA5,150030010,7,9,11,9,7,8,22,30,14,26,20,19,15
KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G
WB6,150030011,3,8,5,3,6,6,8,10,18,11,9,9,9
WB6,150030011,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,9,8,5,6,5,5,8,9,8,7,4,-999,5
DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,B,G
PC,150032004,12,12,11,11,17,20,13,20,10,8,9,7,-999
PC,150032004,G,G,G,G,G,G,G,G,G,G,G,G,M
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,PM2.5
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009080000
END_DTG,201009082359
INTERVAL,60
START_REF,0
NUMSTEPS,24
AVG_TIME,60
UNITS,UG/M3
STATIONS,10
BEGIN_DATA
KA5,150030010,0,0,0,0,2,1,0,0,0,3,1,0,1,1,0,5,3,2,3,0,-999,0,0,4
KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,B,G,G,G
OV20,150012020,17,17,18,11,11,6,6,16,9,8,10,13,11,8,7,5,6,6,3,5,6,4,9,10
OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,5,5,3,1,1,4,4,2,2,3,2,1,3,4,3,2,2,5,4,4,5,4,3,2
DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PA16,150012016,8,11,7,6,10,8,6,4,5,6,6,5,3,6,6,3,4,4,6,8,6,6,8,8
PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
MV17,150012017,1,5,3,0,2,1,1,1,4,6,6,4,2,2,4,3,2,1,1,2,3,2,0,0
MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
HL11,150011006,4,2,-1,0,2,1,2,4,3,2,2,0,-1,0,4,4,2,1,1,3,5,5,2,3
HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
KN12,150011012,8,10,7,7,9,11,11,7,4,7,8,6,5,5,6,5,7,10,18,15,13,19,20,17
KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PC,150032004,2,3,2,0,1,0,0,2,3,4,4,1,1,2,0,0,0,0,2,3,4,3,5,4
PC,150032004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
KH19,150090006,1,0,0,4,7,4,1,0,0,1,1,1,3,2,6,11,18,5,3,2,2,0,0,0
KH19,150090006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
SI2,150031004,7,5,4,6,7,6,6,6,6,6,6,6,6,10,8,5,8,9,8,10,13,9,8,8
SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,PM2.5
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009090000
END_DTG,201009091309
INTERVAL,60
START_REF,0
NUMSTEPS,13
AVG_TIME,60
UNITS,UG/M3
STATIONS,10
BEGIN_DATA
KA5,150030010,6,2,1,5,1,1,4,1,4,6,3,3,2
KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G
OV20,150012020,6,5,11,13,11,11,12,16,17,13,17,21,18
OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,5,8,7,2,0,1,3,6,6,4,-999,3,2
DH1,150031001,G,G,G,G,G,G,G,G,G,G,B,G,G
PA16,150012016,8,22,19,14,13,15,13,12,11,6,2,3,4
PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G
MV17,150012017,3,4,4,4,3,2,0,0,1,3,3,2,3
MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G
HL11,150011006,3,1,2,1,0,0,1,2,3,2,1,1,-1
HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G
KN12,150011012,12,11,11,10,10,9,10,11,8,8,9,10,11
KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G
PC,150032004,1,0,2,3,2,1,4,7,6,2,3,6,-999
PC,150032004,G,G,G,G,G,G,G,G,G,G,G,G,M
KH19,150090006,2,0,0,0,1,4,0,1,4,1,0,0,-999
KH19,150090006,G,G,G,G,G,G,G,G,G,G,G,G,M
SI2,150031004,6,6,8,8,7,5,7,8,9,6,5,8,8
SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,SO2
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009080000
END_DTG,201009082359
INTERVAL,60
START_REF,0
NUMSTEPS,24
AVG_TIME,60
UNITS,PPM
STATIONS,9
BEGIN_DATA
KA5,150030010,0,0,0,0,0,0,-999,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0,0,0,0,0,0
KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
WB6,150030011,0,0,0,0,0,0,-999,0.001,0.001,0.002,0.001,0.001,0.001,0,0,0,0,0,0,0,0,0,0,0
WB6,150030011,G,G,G,G,G,G,M,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
OV20,150012020,0,0.001,0.001,-0.001,-0.001,-0.002,-0.001,0.002,0.006,0,0.003,0.008,0.001,-0.001,-0.001,-0.001,-0.001,-0.002,-0.002,-0.002,-0.001,0,-0.001,-0.001
OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001
DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PA16,150012016,0.087,0.036,0.079,0.13,0.105,0.081,0.102,0.069,0.087,-999,0.007,0.004,0.002,0.001,0.001,0.001,0.001,0.003,0.006,-999,0.013,0.011,0.016,0.053
PA16,150012016,G,G,G,G,G,G,G,G,G,K,G,G,G,G,G,G,G,G,G,B,G,G,G,G
MV17,150012017,0.002,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-999,0.004,0.001
MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,B,G,G
HL11,150011006,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.002,-999,0.005,0.002,0.002,0.002
HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,B,G,G,G,G
KN12,150011012,0,0,0,0,0,0,0.001,0.001,0.001,0.001,0.002,0.001,0.001,0.001,0.001,0.001,0.003,0.003,0.002,0.003,0.004,0.004,0.003,0.002
KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PE10,150012010,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001
PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,SO2
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009090000
END_DTG,201009091309
INTERVAL,60
START_REF,0
NUMSTEPS,13
AVG_TIME,60
UNITS,PPM
STATIONS,9
BEGIN_DATA
KA5,150030010,0,0,0,0,0,0,-999,0.002,0.001,0.001,0.001,0.001,0.001
KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G
WB6,150030011,0,0,0,0,0,0,-999,0.002,0.001,0.001,0.001,0.001,0.001
WB6,150030011,G,G,G,G,G,G,M,G,G,G,G,G,G
OV20,150012020,-0.001,0.055,0.007,0,-0.001,-0.001,-0.001,-0.001,-0.001,0.001,0.013,0.007,0.004
OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001
DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G
PA16,150012016,0.16,0.352,0.37,0.328,0.308,0.265,0.224,0.175,0.051,0.008,0.006,0.003,0.002
PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G
MV17,150012017,0,0,0,0,0,0,0,0,0,0,0,0,0
MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G
HL11,150011006,0.002,0.001,0.001,0.001,0.001,0.001,0.001,0.002,0.002,0.002,0.002,0.002,0.002
HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G
KN12,150011012,0.001,0.001,0.001,0.001,0,0,0,0.001,0.001,0.001,0.002,0.003,0.003
KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G
PE10,150012010,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001
PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,WD
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009080000
END_DTG,201009082359
INTERVAL,60
START_REF,0
NUMSTEPS,24
AVG_TIME,60
UNITS,DEGREES
STATIONS,11
BEGIN_DATA
KA5,150030010,58,66,66,47,43,46,59,44,64,66,66,69,71,78,71,64,62,61,71,66,85,53,49,45
KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
WB6,150030011,63,69,67,65,59,56,63,65,75,81,84,81,70,68,65,68,66,71,56,63,62,66,64,52
WB6,150030011,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
OV20,150012020,336,8,343,26,19,28,22,267,229,224,249,256,264,288,239,187,125,207,338,350,34,65,19,36
OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,93,-999,-999,-999,-999,-999,62,84,56,48,57,54,54,54,56,54,56,57,61,76,64,68,66,74
DH1,150031001,G,K,K,K,K,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PA16,150012016,325,331,323,303,300,308,294,229,173,-999,106,106,117,113,128,133,180,257,306,267,287,296,328,311
PA16,150012016,G,G,G,G,G,G,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G
MV17,150012017,246,249,257,289,263,249,264,275,357,29,37,39,49,36,51,52,52,32,12,274,268,262,259,256
MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
HL11,150011006,264,260,264,250,271,271,257,279,351,76,68,67,73,70,68,78,60,36,311,284,282,280,274,265
HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
KN12,150011012,78,95,93,107,104,96,104,122,153,168,189,198,205,232,256,260,239,99,71,64,50,46,78,103
KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PE10,150012010,322,318,314,314,319,316,315,319,332,358,4,10,22,26,25,32,24,17,358,344,335,329,331,328
PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PC,150032004,37,0,338,-999,338,355,335,334,63,55,58,50,50,56,47,54,56,49,51,49,51,51,42,41
PC,150032004,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
SI2,150031004,100,100,23,18,36,38,23,22,16,21,29,32,32,27,28,25,20,26,35,45,42,36,21,24
SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,WD
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009090000
END_DTG,201009091309
INTERVAL,60
START_REF,0
NUMSTEPS,13
AVG_TIME,60
UNITS,DEGREES
STATIONS,11
BEGIN_DATA
KA5,150030010,37,62,65,61,-999,-999,40,49,62,73,37,65,47
KA5,150030010,G,G,G,G,K,K,G,G,G,G,G,G,G
WB6,150030011,57,54,57,50,-999,-999,65,59,83,81,79,78,79
WB6,150030011,G,G,G,G,K,K,G,G,G,G,G,G,G
OV20,150012020,26,19,27,24,15,23,15,314,273,228,232,269,288
OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,64,79,109,-999,112,114,83,94,117,99,84,66,67
DH1,150031001,G,G,G,K,G,G,G,G,G,G,G,G,G
PA16,150012016,311,315,316,316,318,319,306,316,40,99,97,89,93
PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G
MV17,150012017,256,248,278,273,255,240,277,20,26,348,96,26,67
MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G
HL11,150011006,261,264,265,258,269,268,271,335,83,265,222,76,49
HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G
KN12,150011012,-999,100,107,-999,77,89,82,-999,225,226,227,215,239
KN12,150011012,K,G,G,K,G,G,G,K,G,G,G,G,G
PE10,150012010,324,329,328,315,314,315,316,330,329,356,38,81,97
PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G
PC,150032004,46,67,74,60,-999,-999,52,55,76,69,64,57,-999
PC,150032004,G,G,G,G,K,K,G,G,G,G,G,G,M
SI2,150031004,20,51,92,-999,94,107,76,73,77,75,52,34,48
SI2,150031004,G,G,G,K,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,WS
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009080000
END_DTG,201009082359
INTERVAL,60
START_REF,0
NUMSTEPS,24
AVG_TIME,60
UNITS,M/H
STATIONS,11
BEGIN_DATA
KA5,150030010,3.4,3.4,3,3.3,3.9,3.4,2.9,4,5.4,6.2,6.3,6.2,6.5,7.3,6.4,6.5,6.3,5.7,5.5,3.4,2,2.5,3.4,3.9
KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
WB6,150030011,7,6.6,6.3,5.2,4.9,2.8,4.4,5.3,9.3,10.7,10.5,10.6,9.6,9.7,8.2,8.9,9.3,8.8,6.6,5.1,5,5.9,7.4,5.2
WB6,150030011,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
OV20,150012020,1.6,1.9,1.2,3,4.7,4.3,3,1.4,2.8,2.7,3.6,7.2,5.7,5.7,3.4,1.7,4.7,0.7,1.9,1.9,0.8,1.5,4.3,3.3
OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,1.2,-999,-999,-999,-999,-999,1,1.2,2.1,2.3,3.7,4.9,6.1,6.8,7.1,6.8,5.8,5.7,4.5,2.2,4.2,2.9,3.5,2.2
DH1,150031001,G,K,K,K,K,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PA16,150012016,4.9,4.4,3.1,4.3,4.7,4.6,3,1.4,0.9,-999,10.2,9.2,7.3,7,6.6,4.8,2.8,3.5,2.7,1.2,3.8,4.3,5.7,4.1
PA16,150012016,G,G,G,G,G,G,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G
MV17,150012017,2.7,1.7,3.4,1.4,3.7,2,3.6,3,3.2,3.2,3.6,3.9,4.5,4.5,4.8,5.3,4.1,3,1.6,0.9,1.3,1.3,2,2.9
MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
HL11,150011006,3.7,3.5,3.2,2.7,4.7,3.4,3.2,2.7,2,3.3,3.5,4.3,6,4.1,5,5.5,2.7,1.8,1.7,2.9,3.9,3.9,3.8,3.4
HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
KN12,150011012,2.9,4.2,5,4.8,4.7,5.9,5.7,4.2,5.1,5.3,6,6.1,5.6,5.4,4.5,2.9,0.9,1.5,2.6,2.2,2.2,2.2,2.4,1.7
KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PE10,150012010,2.9,3.1,3.4,3.8,3.5,4,4,4.8,4.8,4.5,4.6,4.9,4.9,4.9,4.8,4.2,4.8,4.1,2.5,2.3,3.1,4.6,4.5,5.5
PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
PC,150032004,2.1,2,2.1,-999,2.6,2.4,2.1,2,4,6.3,6,6.3,7.9,6.5,7.4,7.6,7,7,5.4,5.3,5.3,3.4,4.1,3.2
PC,150032004,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
SI2,150031004,2.2,2,1.5,2.4,1.5,1.5,2.4,2.3,2.9,3.4,4.8,5.4,6.6,6.9,7.4,7.6,8,7.2,5.6,2.7,5.4,4.9,3.9,3.5
SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
BEGIN_GROUP
VARIABLE,WS
DATA_TYPE,POINT
MEASUREMENT_TYPE,SAMPLE
CHARACTERISTIC,OBSERVED
START_DTG,201009090000
END_DTG,201009091309
INTERVAL,60
START_REF,0
NUMSTEPS,13
AVG_TIME,60
UNITS,M/H
STATIONS,11
BEGIN_DATA
KA5,150030010,3.9,2.3,1.7,1.6,-999,-999,1.4,2.7,4.2,5,5.5,5.5,6.1
KA5,150030010,G,G,G,G,K,K,G,G,G,G,G,G,G
WB6,150030011,5.3,3.4,2.7,2.3,-999,-999,1.2,3.2,7,8.4,8.3,7.7,8.4
WB6,150030011,G,G,G,G,K,K,G,G,G,G,G,G,G
OV20,150012020,5.5,3.1,3.4,4.6,4.5,3.9,4.8,3.1,3.4,3.2,4.9,5,4.9
OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G
DH1,150031001,4.4,2.4,2,-999,1.8,2.5,2.4,2.1,3,2.9,3,4.4,4.2
DH1,150031001,G,G,G,K,G,G,G,G,G,G,G,G,G
PA16,150012016,4.5,5.4,5.3,5.1,4.7,5.2,3.6,1.9,1.3,6.8,5,3.9,3.7
PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G
MV17,150012017,2.6,2.8,1.8,1.7,2.7,1.9,1.2,1.6,2.3,2.9,2.7,1.9,2.6
MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G
HL11,150011006,3.7,3.7,3.7,3.8,2.8,2.6,2.5,1.5,2.6,1.5,1,1.6,1.9
HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G
KN12,150011012,-999,2.1,2.6,-999,3.4,3.7,3.8,-999,2,3.2,3.6,4.1,3.9
KN12,150011012,K,G,G,K,G,G,G,K,G,G,G,G,G
PE10,150012010,5,3.7,3.6,3.8,4.6,5.1,4,3.5,2.9,2.5,2.4,3.3,3.1
PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G
PC,150032004,3.6,3.5,3.3,2,-999,-999,2.6,2.8,4.1,4.9,5.2,6.6,-999
PC,150032004,G,G,G,G,K,K,G,G,G,G,G,G,M
SI2,150031004,4.9,1.9,2,-999,2.1,3,4,4.8,5.8,7.3,5.9,5.6,5.2
SI2,150031004,G,G,G,K,G,G,G,G,G,G,G,G,G
END_DATA
END_GROUP
END_FILE

For script 1, the command line prints "No. of lines parsed 463uila%" and the output file looks like this:
VARIABLE,CO
VARIABLE,CO
VARIABLE,NO2
VARIABLE,NO2
VARIABLE,OZONE
VARIABLE,OZONE
VARIABLE,PM10
VARIABLE,PM10
VARIABLE,PM2.5
VARIABLE,PM2.5
VARIABLE,SO2
VARIABLE,SO2
VARIABLE,WD
VARIABLE,WD
VARIABLE,WS
VARIABLE,WS

For script 2, the only response was "********************uila%". Nothing else seemed to happen.
#!/usr/local/bin/perl   -w
 
# getting the source code from the file 
#####################################################################
open(IN,"/home/uila3/rhuff/doh/090809.txt")   or die "cannot open file1 for reading\n";
open(OUT,">/home/uila3/rhuff/doh/test.txt") or die "cannot open file2 for writing\n"; 
my $start = 0;
my $count=0;
while(<IN>)
{
   chomp;
  $count++;
  next if( /^\s*$/ );        #skip empty lines
  if( /^END_DATA\s*$/ )  #end if this word found
  {
    $start = 0;
    next;
  }
  
  if( /VARIABLE,\s*(.*)$/ )
  {
     my ($sec,$min,$hour,$day,$month,$yr19,@rest) = localtime(time);
     print OUT $_ ;  
     next;            
  }
  
  if( ($start==0) && ( /^BEGIN_DATA\s*$/ ) )     #starts with this word only
  {    
     $start = 1;
     next;
  }
  print OUT $1," ",$2,"\n" if( (($start==1)) && ( /^([^,]*),.*,\s*([0-9.-]*)$/ ) );  
}
close(OUT);
close(IN);
print "No. of lines parsed $count";


# getting the source code from the file   
  
  
my $target_data;  
 
 
{  
    local $/ = "VARIABLE,PM2.5\n";  
  
    open my $INFILE, '<', '/home/uila3/rhuff/doh/090809.txt'  
        or die "Couldn't open /home/uila3/rhuff/doh/test.txt: $!";  
  
    my $discard = <$INFILE>;  
    $target_data = <$INFILE>;  
  
    close $INFILE;  
}  
  
print $target_data;  
print '*' x 20;  
  
for my $line (split /\n/, $target_data) {  
    if ($line =~ m{  
                     \A   
                     (  
                        \p{Uppercase}{2}   
                        \d+   
                     )  
                     ,  
                     .*  
                     ,  
                     (\d+)  
                  }xms   
        ) {  
  
        print "$1 $2";  
    }  
}


Question by:libertyforall2
 
LVL 84

Expert Comment

by:ozo
ID: 33750745
PM2.5 21-9-2010 22:4:49 (dd-mm-yyyy hrs:min:sec)
KA5 4
OV20 10

Where in the input file does this come from?
 
LVL 16

Expert Comment

by:jmatix
ID: 33755109
This Perl one liner should do it:

perl -n -e '/VARIABLE,PM2\.5/ && ($sec=1);/NUMSTEPS,24/ && ($s24=1);/BEGIN_DATA/ && ($bd=1); /END_GROUP/ && ($sec=$s24=$bd=0); /^([^\,]+).+(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd); END{@l=localtime(); $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);print "$dt\n"; print "$_\n" foreach @dat}'  <INPUT FILE NAME>

 
LVL 16

Assisted Solution

by:jmatix
jmatix earned 500 total points
ID: 33755150
Sorry, slight change:

perl -n -e '/VARIABLE,PM2\.5/ && ($sec=1);/NUMSTEPS,24/ && ($s24=1);/BEGIN_DATA/ && ($bd=1); /END_GROUP/ && ($sec=$s24=$bd=0); /^([^\,]+).+?(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd); END{@l=localtime(); $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);print "$dt\n"; print "$_\n" foreach @dat}'  <INPUT FILE NAME>
 

Author Comment

by:libertyforall2
ID: 33757461
This is what I entered in the script

#!/usr/local/bin/perl   -w
 
# getting the source code from the file


perl -n -e '/VARIABLE,PM2\.5/ && ($sec=1);/NUMSTEPS,24/ && ($s24=1);/BEGIN_DATA/ && ($bd=1); /END_GROUP/ && ($sec=$s24=$bd=0); /^([^\,]+).+?(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd); END{@l=localtime(); $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);print "$dt\n"; print "$_\n" foreach @dat}'  </home/uila3/rhuff/doh/090809.txt>


Bareword found where operator expected at dataparseso4F.pl line 8, near "/home/uila3"
      (Missing operator before uila3?)
syntax error at dataparseso4F.pl line 8, near "n -e "
Illegal octal digit '9' at dataparseso4F.pl line 8, at end of line
Illegal octal digit '8' at dataparseso4F.pl line 8, at end of line
Illegal octal digit '9' at dataparseso4F.pl line 8, at end of line
Execution of dataparseso4F.pl aborted due to compilation errors.
uila%



I then entered this directly on the command line and got the message at the bottom. Is there an output file for this, or does it modify the original file?

perl -n -e '/VARIABLE,PM2\.5/ && ($sec=1);/NUMSTEPS,24/ && ($s24=1);/BEGIN_DATA/ && ($bd=1); /END_GROUP/ && ($sec=$s24=$bd=0); /^([^\,]+).+?(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd); END{@l=localtime(); $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);print "$dt\n"; print "$_\n" foreach @dat}'  </home/uila3/rhuff/doh/090809.txt>

Missing name for redirect.
 
LVL 16

Expert Comment

by:jmatix
ID: 33757844
It should be run from the command line.

The input filename should be given without the angle brackets <>. For example, if the file name is /home/uila3/rhuff/doh/090809.txt:

perl -n -e '/VARIABLE,PM2\.5/ && ($sec=1);/NUMSTEPS,24/ && ($s24=1);/BEGIN_DATA/ && ($bd=1); /END_GROUP/ && ($sec=$s24=$bd=0); /^([^\,]+).+?(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd); END{@l=localtime(); $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);print "$dt\n"; print "$_\n" foreach @dat} /home/uila3/rhuff/doh/090809.txt
 
LVL 16

Expert Comment

by:jmatix
ID: 33757852
The output will be printed to the console.
 

Author Comment

by:libertyforall2
ID: 33758454
I need the output to be in a file. The script will be executed automatically.
 

Author Comment

by:libertyforall2
ID: 33758467
Ok. I entered that and got this.


uila% perl -n -e '/VARIABLE,PM2\.5/ && ($sec=1);/NUMSTEPS,24/ && ($s24=1);/BEGIN_DATA/ && ($bd=1); /END_GROUP/ && ($sec=$s24=$bd=0); /^([^\,]+).+?(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd); END{@l=localtime(); $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);print "$dt\n"; print "$_\n" foreach @dat} /home/uila3/rhuff/doh/090809.txt
Unmatched '.
uila%


Also, I need the date to be put on a file that can be accessed.
 

Author Comment

by:libertyforall2
ID: 33758529
Ok. Almost there. I got the output. Now I just need to modify the command to dump the output into a file, say /home/uila3/rhuff/doh/file.txt

uila% perl -n -e '/VARIABLE,PM2\.5/ && ($sec=1);/NUMSTEPS,24/ && ($s24=1);/BEGIN_DATA/ && ($bd=1); /END_GROUP/ && ($sec=$s24=$bd=0); /^([^\,]+).+?(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd); END{@l=localtime(); $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);print "$dt\n"; print "$_\n" foreach @dat}' /home/uila3/rhuff/doh/090809.txt
09-24-2010 13:30:24
KA5 4
OV20 10
DH1 2
PA16 8
MV17 0
HL11 3
KN12 17
PC 4
KH19 0
SI2 8
uila%
 
LVL 16

Accepted Solution

by:
jmatix earned 500 total points
ID: 33758530
Sorry, the closing quote for the script was lost when I copied and pasted. The following will write the output to a file named out.txt:

perl -n -e '/VARIABLE,PM2\.5/ && ($sec=1);/NUMSTEPS,24/ && ($s24=1);/BEGIN_DATA/ && ($bd=1); /END_GROUP/ && ($sec=$s24=$bd=0); /^([^\,]+).+?(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd); END{@l=localtime(); $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);print "$dt\n"; print "$_\n" foreach @dat}' /home/uila3/rhuff/doh/090809.txt  >out.txt


 

Author Comment

by:libertyforall2
ID: 33758593
Ok. This worked from the command line, but I need to be able to put it in a script. For some reason, when I do that I get this message:


syntax error at dataparseso4F.pl line 5, near "n -e "
Illegal octal digit '9' at dataparseso4F.pl line 5, at end of line
Illegal octal digit '8' at dataparseso4F.pl line 5, at end of line
Illegal octal digit '9' at dataparseso4F.pl line 5, at end of line
Execution of dataparseso4F.pl aborted due to compilation errors.
uila%
 

Author Comment

by:libertyforall2
ID: 33758641
Is there a way to put this in a shell script instead of a perl script and use the echo command in unix to execute the phrase on the command line?
 

Author Comment

by:libertyforall2
ID: 33758652
Got it!!!!!!!!!!

echo `perl -n -e '/VARIABLE,PM2\.5/ && ($sec=1);/NUMSTEPS,24/ && ($s24=1);/BEGIN_DATA/ && ($bd=1); /END_GROUP/ && ($sec=$s24=$bd=0); /^([^\,]+).+?(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd); END{@l=localtime(); $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);print "$dt\n"; print "$_\n" foreach @dat}' /home/uila3/rhuff/doh/090809.txt  >out.txt`
 

Author Closing Comment

by:libertyforall2
ID: 33758672
Solution is answered. I figured out how to put it in a shell script so it could be echoed onto the command line, but they did the hard work!
 
LVL 16

Expert Comment

by:jmatix
ID: 33758694
If you want to run it from a script file, create one, say dataparseso4F.pl, from the following. Then run it as: perl dataparseso4F.pl /home/uila3/rhuff/doh/090809.txt  /home/uila3/rhuff/doh/file.txt


#!/usr/local/bin/perl

my $in = $ARGV[0] or die "Usage: $0 <inputfile> <outputfile>\n";
my $out = $ARGV[1] or die "Usage: $0 <inputfile> <outputfile>\n";

open FILE, "$in";

while (<FILE>)
{
/VARIABLE,PM2\.5/ && ($sec=1);
/NUMSTEPS,24/ && ($s24=1);
/BEGIN_DATA/ && ($bd=1);
/END_GROUP/ && ($sec=$s24=$bd=0);
/^([^\,]+).+?(\d+)\s*$/ && push(@dat, "$1 $2")  if ($sec && $s24 && $bd);
}
close FILE;

@l=localtime();
$dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);

open OUT, ">$out";
print OUT "$dt\n";
print OUT "$_\n" foreach @dat;
 
close OUT;


 

Author Comment

by:libertyforall2
ID: 33759013
Ok. Now I have another request. I was asked to create a separate file for each station. Everything else will remain the same, except that the script will have to create one file per station. I could have one script that handles all stations, or one script per station and just change the station name in each script. The files would look like this:

Original file
09-24-2010 13:30:24
KA5 4
OV20 10
DH1 2
PA16 8
MV17 0
HL11 3
KN12 17
PC 4
KH19 0

New files (example of files for two separate stations.)

KA5.txt
09-24-2010 13:30:24 4


OV20.txt
09-24-2010 13:30:24 10
 
LVL 16

Expert Comment

by:jmatix
ID: 33759082
Try this. It will create both the master file and individual station files:

#!/usr/local/bin/perl

my $in = $ARGV[0] or die "Usage: $0 <inputfile> <outputfile>\n";
my $out = $ARGV[1] or die "Usage: $0 <inputfile> <outputfile>\n";

open FILE, "$in";

@l=localtime();
$dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);

while (<FILE>)
{
      /VARIABLE,PM2\.5/ && ($sec=1);
      /NUMSTEPS,24/ && ($s24=1);
      /BEGIN_DATA/ && ($bd=1);
      /END_GROUP/ && ($sec=$s24=$bd=0);
      if ($sec && $s24 && $bd)
      {
            next if !/^([^\,]+).+?(\d+)\s*$/;
            push(@dat, "$1 $2") ;
            open OUT, ">/home/uila3/rhuff/doh/$1.txt";
            print OUT "$dt $2\n";
            close OUT;
      }
}
close FILE;


open OUT, ">$out";
print OUT "$dt\n";
print OUT "$_\n" foreach @dat;
 
close OUT;
 

Author Comment

by:libertyforall2
ID: 33792990
Ok. I am using the script below. Currently, the new file for each station has the time stamp and the final value. This final value represents the last hour of data. The original file contains 24 hours of data. What I need now is, in addition to the final value (the current hour of data), a check of the second-to-last value (the value immediately preceding the final value) to see whether it is equal to, less than, or greater than the final value. If the preceding value is 999, -999, or equal to the final value, the additional term in the output file should say neutral. If the preceding value is less than the final value it should say increasing, and if it is greater than the final value it should say decreasing.

Currently the final output for each station looks like this:
 
KA5.txt
09-24-2010 13:30:24 4


Five represents the 24th hour of data. The new output file should look like this:

KA5.txt
09-24-2010 13:30:24 4 increasing


In the original input file line below, the number immediately preceding 4 was 0, so the word increasing shows the trend of the data.
KA5,150030010,0,0,0,0,2,1,0,0,0,3,1,0,1,1,0,5,3,2,3,0,-999,0,0,4
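
The rule described above can be sketched as a small function. This is only an illustration under the stated rule; trend is a hypothetical helper name, and the numeric operator <=> is used on the assumption that the values should be compared as numbers (a string comparison would rank "10" before "9").

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the script below):
# classify the final value against the one immediately before it.
sub trend {
    my ($prev, $last) = @_;
    # 999 and -999 mark missing data, so no trend can be stated.
    return "neutral" if abs($prev) == 999;
    # <=> compares numerically and returns -1, 0 or 1.
    my @label = ("increasing", "neutral", "decreasing");
    return $label[ ($prev <=> $last) + 1 ];
}

my $line = "KA5,150030010,0,0,0,0,2,1,0,0,0,3,1,0,1,1,0,5,3,2,3,0,-999,0,0,4";
my @f = split /,/, $line;
# $f[-2] is the second-to-last value, $f[-1] the final one.
print "$f[0] $f[-1] ", trend($f[-2], $f[-1]), "\n";   # prints "KA5 4 increasing"
```

Splitting on commas and indexing from the end avoids having to capture the last two fields with a regular expression, and it keeps the minus sign on values such as -999 and -1.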



[code]
#!/usr/local/bin/perl

my $in = $ARGV[0] or die "Usage: $0 </home/uila3/rhuff/doh/doh.txt> </home/uila3/rhuff/doh/so4table.txt>\n";
my $out = $ARGV[1] or die "Usage: $0 </home/uila3/rhuff/doh/doh.txt> </home/uila3/rhuff/doh/so4table.txt>\n";

open FILE, "$in";

@l=localtime();
$dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);

while (<FILE>)
{
      /VARIABLE,PM2\.5/ && ($sec=1);
      /NUMSTEPS,24/ && ($s24=1);
      /BEGIN_DATA/ && ($bd=1);
      /END_GROUP/ && ($sec=$s24=$bd=0);
      if ($sec && $s24 && $bd)
      {
            next if !/^([^\,]+).+?(\d+)\s*$/;
            push(@dat, "$1 $2") ;
            open OUT, ">/home/uila3/rhuff/doh/$1.txt";
            print OUT "$dt $2\n";
            close OUT;
      }
}
close FILE;


open OUT, ">$out";
print OUT "$dt\n";
print OUT "$_\n" foreach @dat;
 
close OUT;

[/code]
 

Author Comment

by:libertyforall2
ID: 33793006
"Five represents the 24th hour of data. The new output file should look like this:" I meant to say 4 not 5.
 
LVL 16

Expert Comment

by:jmatix
ID: 33793172
#!/usr/local/bin/perl

my $in = $ARGV[0] or die "Usage: $0 </home/uila3/rhuff/doh/doh.txt> </home/uila3/rhuff/doh/so4table.txt>\n";
my $out = $ARGV[1] or die "Usage: $0 </home/uila3/rhuff/doh/doh.txt> </home/uila3/rhuff/doh/so4table.txt>\n";

open FILE, "$in";

@l=localtime();
$dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d", $l[4]+1, $l[3], $l[5]+1900, $l[2], $l[1], $l[0]);

while (<FILE>)
{
      /VARIABLE,PM2\.5/ && ($sec=1);
      /NUMSTEPS,24/ && ($s24=1);
      /BEGIN_DATA/ && ($bd=1);
      /END_GROUP/ && ($sec=$s24=$bd=0);
      if ($sec && $s24 && $bd)
      {
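            # $1 = station, $2 = second-to-last value, $3 = final value
            next if !/^([^\,]+),.*?(-?\d+)\,(-?\d+)\s*$/;
            my ($stn, $prev, $last) = ($1, $2, $3);
            push(@dat, "$stn $last");
            open OUT, ">/home/uila3/rhuff/doh/$stn.txt" or die "Cannot write $stn.txt: $!";
#            open OUT, ">$1.txt";
            my @res = ("increasing", "neutral", "decreasing");
            # compare numerically; a missing (999 or -999) previous value counts as neutral
            my $idx = (abs($prev) == 999) ? 1 : ($prev <=> $last) + 1;
            print OUT "$dt $last $res[$idx]\n";
            close OUT;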
      }
}
close FILE;


open OUT, ">$out";
print OUT "$dt\n";
print OUT "$_\n" foreach @dat;
 
close OUT;
 

Author Comment

by:libertyforall2
ID: 33793601
Ok. This works. Now, how could I modify this to create a separate master file that has all 24 numbers for all locations, as opposed to just the final value? This would be a separate Perl file altogether. It would simply list all locations with all 24 values for each location and would look like this. The infile would still be /home/uila3/rhuff/doh/doh.txt and the outfile would be /home/uila3/rhuff/doh/masterfile.txt

masterfile.txt
09-24-2010 13:30:24
KA5,150030010,0,0,0,0,2,1,0,0,0,3,1,0,1,1,0,5,3,2,3,0,-999,0,0,4
OV20,150012020,17,17,18,11,11,6,6,16,9,8,10,13,11,8,7,5,6,6,3,5,6,4,9,10  
DH1,150031001,5,5,3,1,1,4,4,2,2,3,2,1,3,4,3,2,2,5,4,4,5,4,3,2  
PA16,150012016,8,11,7,6,10,8,6,4,5,6,6,5,3,6,6,3,4,4,6,8,6,6,8,8  
MV17,150012017,1,5,3,0,2,1,1,1,4,6,6,4,2,2,4,3,2,1,1,2,3,2,0,0  
HL11,150011006,4,2,-1,0,2,1,2,4,3,2,2,0,-1,0,4,4,2,1,1,3,5,5,2,3  
KN12,150011012,8,10,7,7,9,11,11,7,4,7,8,6,5,5,6,5,7,10,18,15,13,19,20,17  
PC,150032004,2,3,2,0,1,0,0,2,3,4,4,1,1,2,0,0,0,0,2,3,4,3,5,4  
KH19,150090006,1,0,0,4,7,4,1,0,0,1,1,1,3,2,6,11,18,5,3,2,2,0,0,0  
SI2,150031004,7,5,4,6,7,6,6,6,6,6,6,6,6,10,8,5,8,9,8,10,13,9,8,8  
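
One possible way to build such a master file is sketched below. It is not an answer given in the thread: pm25_rows is a hypothetical helper, and the sketch assumes the group layout shown in the question (VARIABLE, NUMSTEPS, BEGIN_DATA, END_GROUP) and the same mm-dd-yyyy time stamp the earlier scripts print.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption): given the lines of the input file,
# return the value rows of the group with VARIABLE,PM2.5 and NUMSTEPS,24.
sub pm25_rows {
    my @lines = @_;
    my ($is_pm25, $is_24, $in_data) = (0, 0, 0);
    my @rows;
    for my $line (@lines) {
        $line =~ s/\s+$//;                      # drop newline and trailing blanks
        if    ($line =~ /^VARIABLE,(.*)$/)  { $is_pm25 = ($1 eq 'PM2.5') }
        elsif ($line =~ /^NUMSTEPS,(\d+)$/) { $is_24   = ($1 == 24) }
        elsif ($line eq 'BEGIN_DATA')       { $in_data = 1 }
        elsif ($line eq 'END_GROUP')        { ($is_pm25, $is_24, $in_data) = (0, 0, 0) }
        elsif ($is_pm25 && $is_24 && $in_data
               && $line =~ /^[^,]+,\d+(?:,-?\d+(?:\.\d+)?)+$/) {
            push @rows, $line;                  # flag rows (G, B, K, M) fail the match
        }
    }
    return @rows;
}

if (@ARGV == 2) {
    my ($in, $out) = @ARGV;
    open my $fh, '<', $in or die "Cannot read $in: $!";
    my @rows = pm25_rows(<$fh>);
    close $fh;

    # Same time stamp layout as the earlier scripts in this thread.
    my @t  = localtime();
    my $dt = sprintf("%02d-%02d-%04d %02d:%02d:%02d",
                     $t[4] + 1, $t[3], $t[5] + 1900, $t[2], $t[1], $t[0]);

    open my $ofh, '>', $out or die "Cannot write $out: $!";
    print $ofh "$dt\n";
    print $ofh "$_\n" for @rows;
    close $ofh;
}
else {
    print "Usage: $0 <inputfile> <outputfile>\n";
}
```

It would be run the same way as the earlier script, for example: perl masterfile.pl /home/uila3/rhuff/doh/doh.txt /home/uila3/rhuff/doh/masterfile.txt (masterfile.pl is just a placeholder name).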