Solved

Data Organization

Posted on 2010-11-10
4
665 Views
Last Modified: 2012-05-10
Hello,
I want to create an application that can take a certain data as input and hold it in the data base in a fashion that it is quick to search through it. Also, we want to be able to do 1000s of search simultaneously.

The Data:
The data in question is parking regulation information which varies from block to block in any given city. One block could have 3 hour parking from 8:00am to 8:00pm while the other block could have 2 hour parking and the third can have rush hour no parking from 4:00pm to 6:00pm but back to regular parking after 6:00pm till 8:00pm.
The elements are:
• Location
• Regular Parking Hours
• Rush Hour Time
• Truck Loading Time
• Loading Zones
• Valet Parking


Example:

GPS_Location = A:
Reg_Parking = <8:00am to 4:00pm> & <6:01pm to 10:00pm>
Rush_hour = 4:01Pm to 6:00pm
No_of_Meters = 3
Loading_Zone = Yes
Truck_Loading = <7:00am to 7:59am>

We want to organize this data so that one can use their mobile phone to send back a GPS location and receive information if they can park at that location.

I am not a programmer and so looking for some expert advice on how to go about thinking about this application such that we can input a CSV file with all the regulations information into the application and the application can organize the data.
0
Comment
Question by:Nkhanna
  • 2
4 Comments
 
LVL 45

Expert Comment

by:Kdo
Comment Utility
Hi Nkhanna,

If the data will accessed primarily by the location, you'll want to structure the table so that the location is the primary key or there is a clustered index on the location/time column(s).

Your data shows that you can have multiple times for an "event", in this case regular parking.  That's a level of complexity that you need to avoid.  Instead, define a row so that it defines the parking options over a time range.

A row in the table may cover 13:30 to 16:00 as "normal time".  Another row shows that "rush hour" starts at :16:00 and continues for 2 hours.  Another row would show "normal time" resuming at 18:00 and continuing until midnight.


There are certainly other ways to code the database, but I'd start with something like this.  Any "point in time" for a location is defined in exactly 1 row.


Good Luck,
Kent

--
--  Record parking options over time for this data:
--
-- GPS_Location = A:
-- Reg_Parking = <8:00am to 4:00pm> & <6:01pm to 10:00pm>
-- Rush_hour = 4:01Pm to 6:00pm
-- No_of_Meters = 3
-- Loading_Zone = Yes
-- Truck_Loading = <7:00am to 7:59am>

CREATE TABLE Parking
(
  id auto_increment ...,
  GPS_Location    location,
  StartTime       time,
  duration        integer, -- in seconds OR an EndTime
  NumberOfMeters  integer,
  RegularParking  integer,   -- true/false
  RushHour        integer,   -- true/false
  LoadingZone     integer,   -- true/false
  TruckLoading    integer    -- true/false
);

Open in new window

0
 

Author Comment

by:Nkhanna
Comment Utility
So I guess it would be a linear table like the one I have attached. Can I reduce the size of the database by only putting in the regular hour such as
• Reg_Parking = 8:00am to 10:00pm
• Rush_Hour = 4:30pm to 6:30pm

and then let the logic figure out based on current time if the person can park here or not. Example:
If I send the GPS coordinate back to the server at 4:40pm then the backend can run the logic and figure out that this coordinate falls in the range of location that is now in rush hour time zone and hence will respond saying that "Rush Hour - Cannot park till 6:30pm"

Is there a better approach?
Parking-Table.xls
0
 
LVL 45

Expert Comment

by:Kdo
Comment Utility
Hi Nkhanna,,

You can, but you take out some flexibility.  For example, most urban areas have morning and evening rush hours.  Some also have a noon rush hour.  If you want to track all 3, you'll need to have columns for all 3 in the row.  That's actually a very good way to store this data if it's consistent with your usage.

Based on your description, I think that you'll get the best performance with a smaller granularity to each row.  The table will have 1 or 2 indexes.  If you need to search by these other fields you'll actually increase the storage requirements of the indexes because you'll have more indexes on the same columns.  When searches (even by separate tasks) involve several indexes, more system memory is required to buffer the additional index blocks and more physical I/O is required.


Kent
0
 
LVL 6

Accepted Solution

by:
dragos_craciun earned 500 total points
Comment Utility
Hello Nkhanna,

I would recommend the following approach:
You need to answer very quickly to the following question: I am at this place X at this time T, may I park?
So, you need to have a table that looks like this:

CREATE TABLE Computed_Parking_Info
(
  GPS_Location    location,
  StartTime       time,
  EndTime time,
  YesNo integer -- 0 = NO, 1 = Yes
);

In this way you can query the database very fast without the need to make other calculations with a query like this:
SELECT YesNo from Computed_Parking_Info
WHERE GPS_Location = X and T >= StartTime and T < EndTime

Make sure to not use BETWEEN because you may get two answers as BETWEEN is equivalent to <= and >= and you get two answers at 6:00.
Do not use BETWEEN with 5:59 as End Time and 6:00 as Start time as you will get no answer for 5:59.30
So stick with StartTime = EndTime of the next insterval and >= + <

Now, since you will not have an awful lot of places you can do whatever design works for you for the base table or tables that will keep the "source" data that will be managed from the front end.
Since the data does not change every hour you can rebuild the computed table every time you have changes.

Also, I will recommend to test performance of the GIS, spatial indexes, location type.
Such extensions are usually complex as they need to answer complex questions but do not perform very well with simple cases.

So I recommend to test the performance of spatial extensions versus simple indexes

I work with Oracle with databases that heavily use Oracle Spatial and for simple latitude/longitude point search we discovered that is more effective to use latitude and longitude columns and indexes on them.
Of course for area intersection calculus or other things spatial extensions are more that welcome, but in simple cases may be slower that the crude approach.

Regarding the design of the rest of the database .. it's a long discussion and it also depends of your skill as database programmer and the way you will build the computed table
0

Featured Post

Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

Join & Write a Comment

Introduction This article is designed to assist GIS (Geographic Information System) and GPS (Global Positioning System) developers using ESRI ArcGIS and other spatial information management systems.   For the uninitiated the concept of projectio…
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
This tutorial walks through the best practices in adding a local business to Google Maps including how to properly search for duplicates, marker placement, and inputing business details. Login to your Google Account, then search for "Google Mapmaker…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now