Link to home
Start Free TrialLog in
Avatar of Abhinav Grover
Abhinav Grover

asked on

Cassandra Select Query

I have a 3 node datastax cassandra(Community) cluster with huge data. I have few tables which contain 3-5 billion records in them. I want to delete data that is older than 90 days from those tables.

The problem is how do i run a select query which runs without timeout. I am currently running below query

NOW=$(date -d "-3 month" +"%Y-%m-%d")
select day_ts from table_name where minute_ts < '$NOW' LIMIT 100000 ALLOW FILTERING;


Even if i limit the select query result, it will still parse the whole 3-5 billion records and then filter the data.

Please suggest what can be a efficient way to do this.
ASKER CERTIFIED SOLUTION
Avatar of Tomas Helgi Johannsson
Tomas Helgi Johannsson
Flag of Iceland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial