Venkatesh Nagilla asked:
AWS S3 bucket logging using python

I have an AWS S3 path that contains files in .tsv format. I need to get metrics for each file, such as how many rows and columns it has, its start_time and end_time, the table name, etc.
Devin Becker (United States) replied:
I've mostly used the AWS CLI when working with S3, but these docs give a good, extensive overview of the S3 Python API (boto3):

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html

You would most likely have to download the file first before you can count the rows and columns or work out things like start_time, end_time, and table name, since S3 itself doesn't know anything about the contents of an object.
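As a rough sketch of that approach, the snippet below fetches an object with boto3 and counts rows and columns; the bucket and key names are placeholders, and the "table name" is simply derived from the file name since S3 has no real notion of tables:

```python
import csv
import io


def tsv_metrics(text):
    """Return (row_count, column_count) for TSV content (header counted as a row)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    row_count = len(rows)
    column_count = len(rows[0]) if rows else 0
    return row_count, column_count


def s3_tsv_metrics(bucket, key):
    """Download a .tsv object from S3 and report basic metrics.

    `bucket` and `key` are placeholders -- substitute your own path.
    Requires boto3 and configured AWS credentials.
    """
    import boto3  # imported here so tsv_metrics() stays dependency-free

    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket=bucket, Key=key)
    body = obj["Body"].read().decode("utf-8")
    rows, cols = tsv_metrics(body)
    return {
        # no table metadata exists in S3, so derive a name from the key
        "table_name": key.rsplit("/", 1)[-1].rsplit(".", 1)[0],
        "rows": rows,
        "columns": cols,
        # S3 doesn't store start/end times; the object timestamp is the
        # closest built-in equivalent
        "last_modified": obj["LastModified"],
    }
```

If your actual start_time and end_time live inside the file (e.g. a timestamp column), you'd read them out of the parsed rows in `tsv_metrics` rather than from S3 metadata.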

Hopefully this helps,

Devin Becker
DevOps Associate @ EE
You may want to take a look at Amazon Athena for this: https://aws.amazon.com/athena/

I haven't used it, but it should give you a way to analyze/query data stored in S3 without downloading it yourself. Also, it looks like Athena supports TSV out of the box: https://docs.aws.amazon.com/athena/latest/ug/supported-format.html
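To illustrate the Athena route (hedged, since the suggestion above is untried): you define an external table over the S3 path and then submit SQL against it. The table name, column list, bucket paths, and database below are all made-up placeholders for your own values:

```python
def tsv_table_ddl(table, s3_location, columns):
    """Build a CREATE EXTERNAL TABLE statement for tab-separated data in S3.

    `columns` is a list of (name, athena_type) pairs.
    """
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        "ROW FORMAT DELIMITED\n"
        "FIELDS TERMINATED BY '\\t'\n"
        f"LOCATION '{s3_location}';"
    )


def run_athena_query(sql, database, output_s3):
    """Submit a query to Athena; results land in `output_s3`.

    Requires boto3 and AWS credentials with Athena/S3 permissions.
    """
    import boto3  # imported here so the DDL helper above stays dependency-free

    client = boto3.client("athena")
    resp = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_s3},
    )
    return resp["QueryExecutionId"]


# Example usage (placeholder names throughout):
#   ddl = tsv_table_ddl("my_table", "s3://my-bucket/data/",
#                       [("id", "int"), ("start_time", "timestamp")])
#   run_athena_query(ddl, "my_database", "s3://my-bucket/athena-results/")
#   run_athena_query("SELECT COUNT(*) FROM my_table",
#                    "my_database", "s3://my-bucket/athena-results/")
```

A `SELECT COUNT(*)` then gives the row count, and the column count comes from the table definition, so this avoids downloading files entirely.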