Solved

C - fast Searching for a string in a textfile

Posted on 1998-08-22
Last Modified: 2010-04-15
Hi folks,

I'm developing with Borland C++ 4.52 under NT 4.0, but the project is compiled as an EasyWin 16-bit application.

Is it possible to open a very large text file (2-4 GB), and if so, how can I find a certain string token (e.g. a filename) as fast as possible? Is this the right way, or are there other approaches that would give better performance?
Question by:tha_incredible_bo

Accepted Solution

by:
alexo earned 50 total points
ID: 1252263
A 16-bit application is limited by the underlying OS architecture.  Unfortunately, files of 2 GB (2^31 bytes) or more are not supported in 16-bit applications, because all the functions that deal with file positions are designed to work with 32-bit signed values, and the largest such value is 2^31 - 1.
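
To make the arithmetic concrete, here is a minimal sketch of the limit. The function name addressable_with_int32 is made up for illustration, and the code assumes a C99 compiler with <stdint.h>, which Borland C++ 4.52 does not provide.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if every position in a file of `size`
   bytes (including the end-of-file position) fits in a non-negative
   signed 32-bit offset, 0 otherwise. */
int addressable_with_int32(uint64_t size)
{
    /* INT32_MAX is 2^31 - 1 = 2147483647 */
    return size <= (uint64_t)INT32_MAX;
}
```

A file of exactly 2^31 bytes already fails this check, so the 2-4 GB file from the question is out of reach for any API that uses such offsets.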

Finding a string in a file is easier if the file consists of fixed-size records.  In that case, you can read a buffer at a time (each buffer will consist of n records) and search each record using strcmp().  If there are no fixed-size records, then you'll have to deal with the possibility that the string may span two buffers, so you'll have to make sure that the buffers you read overlap by the length of the string.
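
The overlapping-buffer idea for variable-length records could be sketched roughly as follows. This is only an illustration using the standard C library: the names find_in_block and search_stream are made up, the caller supplies the buffer, and the carry-over is the string length minus one, which is the smallest overlap that still catches a match straddling two reads. Offsets are returned as long, so on a compiler with a 32-bit long the 2 GB limit described above still applies.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of needle (n bytes)
   in hay (h bytes), or -1. Compares raw bytes, so NUL bytes in the data
   do not cut the search short. */
static long find_in_block(const char *hay, size_t h,
                          const char *needle, size_t n)
{
    size_t i;
    if (n == 0 || h < n)
        return -1;
    for (i = 0; i + n <= h; i++) {
        if (hay[i] == needle[0] && memcmp(hay + i, needle, n) == 0)
            return (long)i;
    }
    return -1;
}

/* Hypothetical helper: scan fp sequentially for needle, using the
   caller's buffer. Returns the byte offset of the first match relative
   to where reading started, or -1 if not found or if the buffer is
   smaller than the needle. */
long search_stream(FILE *fp, const char *needle, char *buf, size_t bufsize)
{
    size_t n = strlen(needle);
    size_t keep = 0;   /* bytes carried over from the previous read */
    long base = 0;     /* stream offset of buf[0] */
    size_t got;

    if (n == 0 || bufsize < n)
        return -1;

    while ((got = fread(buf + keep, 1, bufsize - keep, fp)) > 0) {
        size_t have = keep + got;
        long pos = find_in_block(buf, have, needle, n);
        if (pos >= 0)
            return base + pos;

        /* Keep the last n-1 bytes so a match split across two reads
           is still seen whole on the next pass. */
        keep = (have >= n - 1) ? n - 1 : have;
        memmove(buf, buf + have - keep, keep);
        base += (long)(have - keep);
    }
    return -1;
}
```

A caller would open the file in binary mode, allocate a buffer much larger than the search string (tens of kilobytes, say), and call search_stream(fp, "report.txt", buf, bufsize). The plain byte-by-byte comparison is chosen for clarity; a faster matching algorithm could replace find_in_block without changing the overlap logic.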

In any case, if you are accessing the file in a purely sequential way, you can save some overhead by using OS commands directly instead of C run-time functions.

Author Comment

by:tha_incredible_bo
ID: 1252264
Hi alexo,

Thanks for your answer. Unfortunately, the records are not of fixed length, so handling them is a little more difficult. All right, I think this will take me some time.

If you'd be so kind, could you please explain what you mean by using OS commands directly, and which ones to use (instead of C run-time functions)?

Expert Comment

by:alexo
ID: 1252265
Well, C is a very small language but it comes with an extensive run-time library (it is either linked statically to your program or dynamically - as a DLL).  This library has many functions that deal with files.  Those can be categorized into:
1. High level: fopen(), fread(), fwrite(), etc.
2. Low level: open(), read(), write(), etc.

Now, those functions try to be efficient for the *general case* by using caching, buffering, etc.  However, if you only want to access a file sequentially (open it, read it, close it), you can save some overhead that is unnecessary in your case by using the underlying OS functions instead.  In the case of Win3.x those are _lopen(), _hread(), _hwrite(), etc.
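
As a rough illustration of the sequential open/read/close pattern, here is a sketch that uses the low-level open() and read() calls from the list above, in their POSIX form. This is an assumption made so the example can be compiled on a current system; it is not the Win3.x API, where _lopen() and _hread() would take their place. The function name count_lines_sequential is made up.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read the file at `path` strictly front to back in
   blocks of up to `bufsize` bytes and count newline characters.
   Returns the count, or -1 if the file cannot be opened or read. */
long count_lines_sequential(const char *path, char *buf, size_t bufsize)
{
    long lines = 0;
    ssize_t got;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* One read() call per block; no library-level buffer sits in between. */
    while ((got = read(fd, buf, bufsize)) > 0) {
        ssize_t i;
        for (i = 0; i < got; i++) {
            if (buf[i] == '\n')
                lines++;
        }
    }

    close(fd);
    return (got < 0) ? -1 : lines;
}
```

Since nothing buffers on the program's behalf here, the block size passed in matters: a large buffer means fewer calls into the OS for the same amount of data.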