I'm capturing AWS Cloudwatch Metrics for our EC2 and RDS servers in CSV format using 2 lambda functions.
Everyday at 10 pm, the lambda functions get executed and create a folder in a S3 bucket.
So each new folder created everyday will have several csv files like APP-1CPUUtilization.csv , APP-1MEMUtilization.csv , APP-2CPUUtilization.csv , APP-2MEMUtilization.csv and so on.
If i add a New server, lets say APP-10 for example, 2 files will automatically get created as APP-10CPUUtilization.csv & APP-10MEMUtilization.csv.
Each of these files have 2 Columns, |Date | & |CPU Utilization % | for a CPUUtilization file and |Date | & | MEM Utilization % | for a MEMUtilization file.
The date value in the date column i.e. for example 2018-02-28 07:36:00+00:00 , no. of rows i.e. 289 per day and Column names are same for every file.The file name is what differentiates each file.
So i want to create a one single file for all the instances per day with column names |Date | APP1-CPU Utilization % |APP2-MEM Utilization % |APP2-CPU Utilization %|APP2-MEM Utilization % |...and so on.
The column name APP1-CPU Utilization % should be picked from the source file name.
Now lets say if i add a New server called APP-10 for 2 days, hence 2 New files would be created for each day i.e. APP-10CPUUtilization.csv & APP-10MEMUtilization.csv, now the lambda which gets executed at 11:00 PM to concatenate all the files into one,should add New columns to the file ( for those 2 days ) as |Date | APP1-CPU Utilization % |APP2-MEM Utilization % |APP2-CPU Utilization %|APP2-MEM Utilization % | APP10-CPU Utilization % |APP10-MEM Utilization % |
Is this possible please ?
End goal is to ingest this file into Redshift table.