Link to home
Start Free TrialLog in
Avatar of yairy
yairy

asked on

how lseek work ?

Hi,

I want to build a very fast DB.
I want to fetch records fast by using an entry table at the start of the file, that will give me offsets of all entries ahead.

My question is:
How lseek work ? (not the syntex..., the algorithm behind)
Does lseek work very fast ?


Yair
Avatar of DanRollins
DanRollins
Flag of United States of America image

lseek moves the file pointer to a particular location in the file -- a byte offset, typically from the beginning of the file.  Then the next time you read from the file, the data that was at that location is read from disk into memory.

To satisfy your goal, you will need to maintain a sort of directory of information at the front of the file.  You will read that directory into memory.  Then when you want a particular piece of data, you will look into your "directory" to learn the byte offset where the actual data is stored.  Then you will use lseek() to position the file pointer to that location, then you will use read() to read the data into meory.

>>Does lseek work very fast ?

lseek takes almost exactly 0 time to execute.

-- Dan
Avatar of peterchen092700
peterchen092700

>> lseek takes almost exactly 0 time to execute.
however, you can expect the next read() to take longer because of the lseek
ASKER CERTIFIED SOLUTION
Avatar of nietod
nietod

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
>> >> lseek takes almost exactly 0 time to execute.
>> however, you can expect the next read() to
>> take longer because of the lseek
You can't be sure of either of these statements.  it depends on how the implimenter designed things.  If the implimenter makes lseek() perform the work, then it will take time, otherwise the work will be delayed until the next read/write operation.  (on DOS/Windows its very likely that the work will be delayed until the next read/write).  but no matter what, you will eventually have to "pay the price".
Avatar of yairy

ASKER

>For this project you might consider looking at more advanced file organizations.  Like using external
indexes to speed access, for example.

Can you expand a few words on that.

Yair
>> Can you expand a few words on that.
Unfortunatley the expanded words are called "books".

There are many good books dealing with algorithms, especially sorting and especially external sorting (meaning the sorted material is stored on disk) as well as books on DB design.

I like using seperate index and database files.  The database files stores the actual data, sort of like a disk-based array but in random order.  Well basically in the order in which the records were added as each record is added at the end.  Then a seperate index file is used to track the sort keys and the offsets or record nubmers to the records they sort for.  So the index files is the only file that needs to be kept "sorted"  (its only sorted in a sense) and since it is comparitibly small it can be kept sorted more quickly than the larger Database file.  You can access the database records in sorted order by going through the index in sorted order and for each key in the index, looking up (a quick direct seek) the record in the database file.  There are many different types of indexing algorithms, I like knuth's N-way trees.  They are extradonarily fast for, adding, deleting, exact and near searching,  and sequential operations, they are relatively small, having no more than 50% wasted space.  
>>it depends on how the implimenter designed things.
>> If the implimenter makes lseek() perform the work, then it will take time...

I'm trying to imagine a professional programmer sitting down to write an operating system and or to implement  stream handling and deciding to position the drive head and do other time-consuming operations in the lseek call.  Tyying.... trying... trying... Nope, I just can't imagine it.

-- Dan
>> deciding to position the drive head and do other >> time-consuming operations in the lseek call
Where did I say that fseek() woudl reposition the drive head?  fseek() would never position the drive head.  That makes no sense since there can be multiple files open.  However it certainly might (very likely will) reposition the OS file pointer and that is not likely to be a trivial operation.  It will usually involve searching through file allocation tables and other data structures in order to calculate the location on disk cooresponding to the requested file offset.
Avatar of yairy

ASKER

I'll take nietod answer, and I want to thank DanRollins.

Yair