Abhinav Grover
asked on
Cassandra Select Query
I have a 3-node DataStax Cassandra (Community) cluster holding a very large amount of data. A few of the tables contain 3-5 billion records each. I want to delete data older than 90 days from those tables.
The problem is how to run a select query that does not time out. I am currently running the query below:
NOW=$(date -d "-3 month" +"%Y-%m-%d")
select day_ts from table_name where minute_ts < '$NOW' LIMIT 100000 ALLOW FILTERING;
Even if I limit the result of the select query, it still scans all 3-5 billion records and then filters them.
Please suggest an efficient way to do this.
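One direction that might avoid the single cluster-wide scan is to split the token ring into many small sub-ranges and run one bounded query per sub-range. The sketch below only generates the CQL strings and does not connect to a cluster. It assumes the default Murmur3Partitioner (tokens from -2^63 to 2^63-1), and `partition_key` is a hypothetical column name standing in for the table's real partition key, which is not shown in this post. Each generated query still needs ALLOW FILTERING, but it touches only a slice of the data.

```python
# Sketch: build per-token-range SELECT statements instead of one full scan.
# Assumptions: Murmur3Partitioner is in use, and "partition_key" is a
# placeholder for the table's actual partition key column.

MIN_TOKEN = -(2**63)      # lowest Murmur3 token
MAX_TOKEN = 2**63 - 1     # highest Murmur3 token


def token_ranges(splits):
    """Yield contiguous (start, end) token bounds covering the whole ring."""
    step = (MAX_TOKEN - MIN_TOKEN) // splits
    start = MIN_TOKEN
    for i in range(splits):
        # The last range is pinned to MAX_TOKEN to absorb rounding.
        end = MAX_TOKEN if i == splits - 1 else start + step
        yield start, end
        start = end


def build_queries(table, partition_key, cutoff, splits):
    """Return one CQL string per token sub-range, filtered by minute_ts."""
    queries = []
    for i, (start, end) in enumerate(token_ranges(splits)):
        # The first range is inclusive at its lower bound; the rest are
        # exclusive so neighbouring ranges do not overlap.
        lower_op = ">=" if i == 0 else ">"
        queries.append(
            f"SELECT day_ts FROM {table} "
            f"WHERE token({partition_key}) {lower_op} {start} "
            f"AND token({partition_key}) <= {end} "
            f"AND minute_ts < '{cutoff}' ALLOW FILTERING;"
        )
    return queries


if __name__ == "__main__":
    # The cutoff date is an example value; it would normally be computed
    # as "today minus 90 days".
    for q in build_queries("table_name", "partition_key", "2024-01-01", 4):
        print(q)
```

Each statement could then be run separately (for example with cqlsh), with a larger number of splits if individual ranges still time out. For data written from now on, setting a TTL at write time may remove the need for this kind of cleanup altogether.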