Jerry N
asked on
Converting second timestamp data records into minute records
I have a table that has trading transactions and contains a timestamp field and a ASK price each time the price changes (tick data).
The timestamp has the format:1/1/2009 10:42:00.000000 PM
The records are sequential but do not have a consistent gap. some are the same time, some are 2 seconds apart, some 10 seconds apart.
I wish to load the beginning ask price and the ending ask price for each minute into another table (MIN_Table)
Then I will want to load the beginning and ending ask price for each hour into a third table (Hour_table)
How do I do this in SQL?
I know how to do an Append from one table to another, but How do I have it go through all the records and take the beginning and ending prices for the specific Timestamp time frame?
This is really twisting my head.
The timestamp has the format:1/1/2009 10:42:00.000000 PM
The records are sequential but do not have a consistent gap. some are the same time, some are 2 seconds apart, some 10 seconds apart.
I wish to load the beginning ask price and the ending ask price for each minute into another table (MIN_Table)
Then I will want to load the beginning and ending ask price for each hour into a third table (Hour_table)
How do I do this in SQL?
I know how to do an Append from one table to another, but How do I have it go through all the records and take the beginning and ending prices for the specific Timestamp time frame?
This is really twisting my head.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
You don't need to add a "date" column to the table if it currently has an actual "timestamp" column. Older Oracle database versions only supported four data types: CHAR, VARACHAR2, NUMBER and DATE (and "DATE" columns only stored values to the nearest whole second). But, Oracle added a TIMESTAMP datatype in the Oracle10 database. This stores date/time values with much greater precision. By default, these are to the millionth of a second (6 decimal digits) but these can optionally be down to the billionth of a second (9 decimal digits).
You indicated that your timestamp column "has the format:1/1/2009 10:42:00.000000 PM". That may not be the complete truth. If this really is a TIMESTAMP datatype, that may be the way your system is currently configured to display TIMESTAMP values, but Oracle offers *MANY* different ways to display DATE and/or TIMESTAMP values. The actual values in both cases are stored internally as a proprietary type of number value, but Oracle presents them to us in characters strings that can easily be adjusted.
You indicated that your timestamp column "has the format:1/1/2009 10:42:00.000000 PM". That may not be the complete truth. If this really is a TIMESTAMP datatype, that may be the way your system is currently configured to display TIMESTAMP values, but Oracle offers *MANY* different ways to display DATE and/or TIMESTAMP values. The actual values in both cases are stored internally as a proprietary type of number value, but Oracle presents them to us in characters strings that can easily be adjusted.
Markgeer is correct if the column is really of type timestamp - I interpreted the Gnovak's words as having a timestamp column that is of varchar2 type.
ASKER
sorry for the confusion. The dataype for the column is TIMESTAMP. I copied a sample via TOAD.
I'll give the shajukg query a try as well as the other. Curious to see if there is a speed difference as well.
As an aside, someone mentioned that "Partitioning" may be an answer. I have also heard that partitioning is not a good practice in some cases.
I have never utilized partitioning.
Any comments on that?
I'll give the shajukg query a try as well as the other. Curious to see if there is a speed difference as well.
As an aside, someone mentioned that "Partitioning" may be an answer. I have also heard that partitioning is not a good practice in some cases.
I have never utilized partitioning.
Any comments on that?
Partitioning is an excellent option in some cases. It works especially well with historical data (like transactions) that get written once, and never updated or deleted and that include a date or timestamp column that records the time each record was created.
Tables with data like this can be partitioned on the column that includes this date/time value. I usually like to create a separate partition for each month, but this could be done by day, week, 2-month period, quarter, year, or any other time interval that makes sense for the application. After a particular time period is closed, this partition can then be compressed and the index(es) for the partition rebuilt. This often yields a 20-50% reduction in space required for historical data, and that can translate into almost that same speed increase for queries of this data, since the system has fewer physical blocks of data to read, fit into cache buffers, etc.
Indexes on tables like this should be "locally-partitioned" (that is, partitioned to match the way the table is partitioned) so that when you do the compression of each historical period after it closes, you only need to rebuild the part of each index that corresponds to the partition that was just compressed. You don't want to have to rebuild entire index(es), because that will make the table unusable by the application.
Tables with data like this can be partitioned on the column that includes this date/time value. I usually like to create a separate partition for each month, but this could be done by day, week, 2-month period, quarter, year, or any other time interval that makes sense for the application. After a particular time period is closed, this partition can then be compressed and the index(es) for the partition rebuilt. This often yields a 20-50% reduction in space required for historical data, and that can translate into almost that same speed increase for queries of this data, since the system has fewer physical blocks of data to read, fit into cache buffers, etc.
Indexes on tables like this should be "locally-partitioned" (that is, partitioned to match the way the table is partitioned) so that when you do the compression of each historical period after it closes, you only need to rebuild the part of each index that corresponds to the partition that was just compressed. You don't want to have to rebuild entire index(es), because that will make the table unusable by the application.
ASKER
The query works great for obtain the min/max but how do I obtain the first value and the last value?
Did you try my query? Just ignore the part about converting to date type.
there is no impact MAX function in the query posted by me as it sorts all the records based timestamp and takes the dense rank. if there are multiple rows in the first rank then only group function makes impact.
ASKER
The Data looks like this for 1 minute:
ID TICKTIME ASK
1 1/1/09 10:42 PM 0.7063 <<<open
2 1/1/09 10:42 PM 0.70625
3 1/1/09 10:42 PM 0.70635
4 1/1/09 10:42 PM 0.70635
5 1/1/09 10:42 PM 0.70635
6 1/1/09 10:42 PM 0.70635
7 1/1/09 10:42 PM 0.7063
8 1/1/09 10:42 PM 0.7063
9 1/1/09 10:42 PM 0.70635 <<<close
I'm looking to generate the following for the minute 10:42
TICKTIME HIGH LOW OPEN CLOSE
1/1/09 22:42 0.70635 0.70625 0.7063 0.70635
I run the query:
select to_char(ticktime,'dd/mm/yy yy hh24:mi'),max(ask) KEEP (DENSE_RANK first ORDER BY ticktime),
max(ask) KEEP (DENSE_RANK last ORDER BY ticktime)
from Ticktemp
group by to_char(ticktime,'dd/mm/yy yy hh24:mi');
I get:
TO_CHAR(...) MAX(ASK)KEEP(DENSE_RANKFIR STORDERBYT ICKTIME) MAX(ASK)KEEP(DENSE_RANKLAS TORDERBYTI CKTIME)
01/01/2009 22:42 0.70635 0.70635
suggestions?
ID TICKTIME ASK
1 1/1/09 10:42 PM 0.7063 <<<open
2 1/1/09 10:42 PM 0.70625
3 1/1/09 10:42 PM 0.70635
4 1/1/09 10:42 PM 0.70635
5 1/1/09 10:42 PM 0.70635
6 1/1/09 10:42 PM 0.70635
7 1/1/09 10:42 PM 0.7063
8 1/1/09 10:42 PM 0.7063
9 1/1/09 10:42 PM 0.70635 <<<close
I'm looking to generate the following for the minute 10:42
TICKTIME HIGH LOW OPEN CLOSE
1/1/09 22:42 0.70635 0.70625 0.7063 0.70635
I run the query:
select to_char(ticktime,'dd/mm/yy
max(ask) KEEP (DENSE_RANK last ORDER BY ticktime)
from Ticktemp
group by to_char(ticktime,'dd/mm/yy
I get:
TO_CHAR(...) MAX(ASK)KEEP(DENSE_RANKFIR
01/01/2009 22:42 0.70635 0.70635
suggestions?
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
are u storing "seconds" values also ?
if it is zero then we can not determine which one is the first record
if it is zero then we can not determine which one is the first record
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
or if u r identifieng the first and last transaction using ID then change " ORDER BY " clause from ticktime to ID
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Yes - exactly !
I assign the ID based on the order recieved as the data is aquired in order. The seconds do not appear to be accurate.
So I'll use ID and that should work out great.
Thanks so much for the info - I have to check on what KEEP(DENSE_RANK..) does. Not familiar with that.
I assign the ID based on the order recieved as the data is aquired in order. The seconds do not appear to be accurate.
So I'll use ID and that should work out great.
Thanks so much for the info - I have to check on what KEEP(DENSE_RANK..) does. Not familiar with that.
ASKER
Thanks everyone for helping out. The thread and the solutions were great.
ASKER
How can I do the above for say 15 Min, or 4 hours?
ASKER
The time frames I'm looking for are: 5 min, 15 min, 30 min, 1 hour, 4 hour, 1 day , 1 week.
Are these possible using this type of query?
Are these possible using this type of query?
ASKER
Just found out after doing some research that the above DOESNT work,
It appears that it gives me every minute of every 1st day of the month if I use: GROUP BY TO_CHAR (ticktime, 'dd/mm/yyyy hh24:mi');
I'm going to try the first query tomorrow.
It appears that it gives me every minute of every 1st day of the month if I use: GROUP BY TO_CHAR (ticktime, 'dd/mm/yyyy hh24:mi');
I'm going to try the first query tomorrow.
ASKER
CORRECTION - IT DOES WORK
But still trying to find the 5, 15, 30 min & 4hour, day 1 week solutions
But still trying to find the 5, 15, 30 min & 4hour, day 1 week solutions
max(price) KEEP (DENSE_RANK last ORDER BY time_stamp_column)
from your_table
group by to_char(time_stamp_column,
select to_char(time_stamp_column,
max(price) KEEP (DENSE_RANK last ORDER BY time_stamp_column)
from your_table
group by to_char(time_stamp_column,