Solved

C - fast searching for a string in a text file

Posted on 1998-08-22
Last Modified: 2010-04-15
hi folks,

I'm developing with Borland C++ 4.52 under NT 4.0, but the project is compiled as an EasyWin 16-bit application.

Is it possible to open a very large text file (2-4 GB), and if so, how can I find a certain string token (e.g. a filename) as fast as possible? Is this the right way, or are there other approaches that would give better performance?
Question by:tha_incredible_bo

Accepted Solution

by:
alexo earned 50 total points
ID: 1252263
A 16-bit application is limited by the underlying OS architecture.  Unfortunately, files of 2^31 bytes (2 GB) or more are not supported in 16-bit applications, because all the functions that deal with file positions are designed to work with 32-bit signed values.

Finding a string in a file is easier if the file consists of fixed-size records.  In that case, you can read a buffer at a time (each buffer will consist of n records) and search each record using strcmp().  If there are no fixed-size records, then you'll have to deal with the possibility that the string may span two buffers, so you'll have to make sure that the buffers you read overlap by at least the length of the string minus one.
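
The overlapping-buffer idea can be sketched roughly as follows. This is only an illustration in portable C using fread(), not the code from the original project: the function name find_in_stream and the CHUNK size are made up for the example, and the offset is returned as a long, so it shares the 2 GB limit mentioned above.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK 4096  /* bytes read per pass; an arbitrary value for this sketch */

/* Return the byte offset of the first occurrence of token in fp,
   or -1 if it is not found, is empty, or is longer than CHUNK.
   Uses a static buffer, so the function is not reentrant. */
long find_in_stream(FILE *fp, const char *token)
{
    static char buf[2 * CHUNK];
    size_t tlen = strlen(token);
    size_t keep = 0;   /* bytes carried over from the previous pass */
    long base = 0;     /* file offset corresponding to buf[0] */
    size_t got;

    if (tlen == 0 || tlen > CHUNK)
        return -1;

    while ((got = fread(buf + keep, 1, CHUNK, fp)) > 0) {
        size_t have = keep + got;
        size_t i;

        /* Check every position where the whole token still fits. */
        for (i = 0; i + tlen <= have; i++) {
            if (buf[i] == token[0] && memcmp(buf + i, token, tlen) == 0)
                return base + (long)i;
        }

        /* Keep the last tlen-1 bytes: a match that straddles the
           buffer boundary must start somewhere inside them. */
        keep = (have < tlen - 1) ? have : tlen - 1;
        memmove(buf, buf + have - keep, keep);
        base += (long)(have - keep);
    }
    return -1;
}
```

Carrying over tlen-1 bytes rather than the full token length is enough, because any match that fits entirely in the current buffer has already been checked.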

In any case, if you are accessing the file in a purely sequential way, you can save some overhead by using OS commands directly instead of C run-time functions.

Author Comment

by:tha_incredible_bo
ID: 1252264
hi alexo,

Thanks for your answer. Unfortunately the records are not of fixed length, so handling them is a bit more difficult. All right, I think this will take me some time.

If you'd be so kind, could you please explain what you mean by using OS commands directly, and which ones to use (instead of C run-time functions)?

Expert Comment

by:alexo
ID: 1252265
Well, C is a very small language but it comes with an extensive run-time library (it is either linked statically to your program or dynamically - as a DLL).  This library has many functions that deal with files.  Those can be categorized into:
1. High level: fopen(), fread(), fwrite(), etc.
2. Low level: open(), read(), write(), etc.

Now, those functions try to be efficient for the *general case* by using caching, buffering, etc.  However, if you only want to access a file sequentially (open it, read it, close it) you can save some unnecessary (in your case) overhead by using the underlying OS functions instead.  In the case of Win3.x those are _lopen(), _hread(), _hwrite(), etc.
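
To give a rough idea of what such a sequential loop looks like, here is a sketch that reads a file in large blocks through unbuffered calls. Note the assumption: it uses the POSIX-style open()/read()/close() from category 2, because those can be compiled on current systems; on Win3.x the same shape would apply with _lopen(), _hread() and _lclose() swapped in. The function count_byte and the 8192-byte buffer are invented for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Count how many bytes in the file equal `wanted`, reading the file
   sequentially in large blocks with unbuffered read() calls.
   Returns -1 if the file cannot be opened or a read fails. */
long count_byte(const char *path, char wanted)
{
    static char buf[8192];  /* block size chosen arbitrarily */
    long count = 0;
    ssize_t got;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    while ((got = read(fd, buf, sizeof buf)) > 0) {
        ssize_t i;
        for (i = 0; i < got; i++) {
            if (buf[i] == wanted)
                count++;
        }
    }
    if (got < 0)
        count = -1;  /* read error */

    close(fd);
    return count;
}
```

The program supplies its own buffer and asks for one big block per call, so no extra layer of library buffering sits between the file and the search loop.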