Data Organization

Posted on 2010-11-10
Medium Priority
Last Modified: 2012-05-10
I want to create an application that can take a certain data as input and hold it in the data base in a fashion that it is quick to search through it. Also, we want to be able to do 1000s of search simultaneously.

The Data:
The data in question is parking regulation information which varies from block to block in any given city. One block could have 3 hour parking from 8:00am to 8:00pm while the other block could have 2 hour parking and the third can have rush hour no parking from 4:00pm to 6:00pm but back to regular parking after 6:00pm till 8:00pm.
The elements are:
• Location
• Regular Parking Hours
• Rush Hour Time
• Truck Loading Time
• Loading Zones
• Valet Parking


GPS_Location = A:
Reg_Parking = <8:00am to 4:00pm> & <6:01pm to 10:00pm>
Rush_hour = 4:01Pm to 6:00pm
No_of_Meters = 3
Loading_Zone = Yes
Truck_Loading = <7:00am to 7:59am>

We want to organize this data so that one can use their mobile phone to send back a GPS location and receive information if they can park at that location.

I am not a programmer and so looking for some expert advice on how to go about thinking about this application such that we can input a CSV file with all the regulations information into the application and the application can organize the data.
Question by:Nkhanna
  • 2
LVL 46

Expert Comment

by:Kent Olsen
ID: 34101828
Hi Nkhanna,

If the data will accessed primarily by the location, you'll want to structure the table so that the location is the primary key or there is a clustered index on the location/time column(s).

Your data shows that you can have multiple times for an "event", in this case regular parking.  That's a level of complexity that you need to avoid.  Instead, define a row so that it defines the parking options over a time range.

A row in the table may cover 13:30 to 16:00 as "normal time".  Another row shows that "rush hour" starts at :16:00 and continues for 2 hours.  Another row would show "normal time" resuming at 18:00 and continuing until midnight.

There are certainly other ways to code the database, but I'd start with something like this.  Any "point in time" for a location is defined in exactly 1 row.

Good Luck,

--  Record parking options over time for this data:
-- GPS_Location = A:
-- Reg_Parking = <8:00am to 4:00pm> & <6:01pm to 10:00pm>
-- Rush_hour = 4:01Pm to 6:00pm
-- No_of_Meters = 3
-- Loading_Zone = Yes
-- Truck_Loading = <7:00am to 7:59am>

  id auto_increment ...,
  GPS_Location    location,
  StartTime       time,
  duration        integer, -- in seconds OR an EndTime
  NumberOfMeters  integer,
  RegularParking  integer,   -- true/false
  RushHour        integer,   -- true/false
  LoadingZone     integer,   -- true/false
  TruckLoading    integer    -- true/false

Open in new window


Author Comment

ID: 34103486
So I guess it would be a linear table like the one I have attached. Can I reduce the size of the database by only putting in the regular hour such as
• Reg_Parking = 8:00am to 10:00pm
• Rush_Hour = 4:30pm to 6:30pm

and then let the logic figure out based on current time if the person can park here or not. Example:
If I send the GPS coordinate back to the server at 4:40pm then the backend can run the logic and figure out that this coordinate falls in the range of location that is now in rush hour time zone and hence will respond saying that "Rush Hour - Cannot park till 6:30pm"

Is there a better approach?
LVL 46

Expert Comment

by:Kent Olsen
ID: 34104086
Hi Nkhanna,,

You can, but you take out some flexibility.  For example, most urban areas have morning and evening rush hours.  Some also have a noon rush hour.  If you want to track all 3, you'll need to have columns for all 3 in the row.  That's actually a very good way to store this data if it's consistent with your usage.

Based on your description, I think that you'll get the best performance with a smaller granularity to each row.  The table will have 1 or 2 indexes.  If you need to search by these other fields you'll actually increase the storage requirements of the indexes because you'll have more indexes on the same columns.  When searches (even by separate tasks) involve several indexes, more system memory is required to buffer the additional index blocks and more physical I/O is required.


Accepted Solution

dragos_craciun earned 2000 total points
ID: 34110621
Hello Nkhanna,

I would recommend the following approach:
You need to answer very quickly to the following question: I am at this place X at this time T, may I park?
So, you need to have a table that looks like this:

CREATE TABLE Computed_Parking_Info
  GPS_Location    location,
  StartTime       time,
  EndTime time,
  YesNo integer -- 0 = NO, 1 = Yes

In this way you can query the database very fast without the need to make other calculations with a query like this:
SELECT YesNo from Computed_Parking_Info
WHERE GPS_Location = X and T >= StartTime and T < EndTime

Make sure to not use BETWEEN because you may get two answers as BETWEEN is equivalent to <= and >= and you get two answers at 6:00.
Do not use BETWEEN with 5:59 as End Time and 6:00 as Start time as you will get no answer for 5:59.30
So stick with StartTime = EndTime of the next insterval and >= + <

Now, since you will not have an awful lot of places you can do whatever design works for you for the base table or tables that will keep the "source" data that will be managed from the front end.
Since the data does not change every hour you can rebuild the computed table every time you have changes.

Also, I will recommend to test performance of the GIS, spatial indexes, location type.
Such extensions are usually complex as they need to answer complex questions but do not perform very well with simple cases.

So I recommend to test the performance of spatial extensions versus simple indexes

I work with Oracle with databases that heavily use Oracle Spatial and for simple latitude/longitude point search we discovered that is more effective to use latitude and longitude columns and indexes on them.
Of course for area intersection calculus or other things spatial extensions are more that welcome, but in simple cases may be slower that the crude approach.

Regarding the design of the rest of the database .. it's a long discussion and it also depends of your skill as database programmer and the way you will build the computed table

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Lotus Notes has been used since a very long time as an e-mail client and is very popular because of it's unmatched security. In this article we are going to learn about  RRV Bucket corruption and understand various methods to Fix "RRV Bucket Corrupt…
Backups and Disaster RecoveryIn this post, we’ll look at strategies for backups and disaster recovery.
This tutorial demonstrates how to identify and create boundary or building outlines in Google Maps. In this example, I outline the boundaries of an enclosed skatepark within a community park.  Login to your Google Account, then  Google for "Google M…
This lesson discusses how to use a Mainform + Subforms in Microsoft Access to find and enter data for payments on orders. The sample data comes from a custom shop that builds and sells movable storage structures that are delivered to your property. …
Suggested Courses

807 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question