Link to home
Start Free TrialLog in
Avatar of zizi21
zizi21

asked on

big o notation for reading data in blocks

Hi,

If there is a file of p bytes (characters), you would take O(p) time to read the file. But, if you read the file in blocks of size o, would the big o notation for the time complexity be O((p/o)*p)?

Thanks a lot
Avatar of ozo
ozo
Flag of United States of America image

Unlikely, but it may be conceivable that some implementation could have been written to behave that way.
Avatar of zizi21
zizi21

ASKER

I did a program that read the first o characters and then the second o characters. But, I am not sure of the complexity. To read the entire file, you would use p/o blocks. So, the complexity is (p/o)*p. However, I am not sure if this is correct for this implementation. Thanks a lot.
SOLUTION
Avatar of ozo
ozo
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of zizi21

ASKER

Thanks for replying.

I only read a block each time (and not the entire file).
So what is the time to read the first o characters?
Avatar of zizi21

ASKER

The time to read the first o characters is O(o).
The time to read the second o characters is O(o)
The time to read the last o characters is O(o).

At first, I thought, it should be added. O(o)+O(o) which gives O(p). But, sometimes, I see people put the number of blocks or the number of rounds that is required to read the entire O(p). So, p/o gives you the number of blocks or rounds. So, you need p/o rounds or blocks to read the entire file. I see, you don't times with p but with o. That is, (p/o)*o. So, it is O(p) I think.

I understand that constant is ignored. p/o looks like a constant but it would given different values for different data. So, it can be considered a variable. That is the reason, I am multipying with  p/o...
That is, (p/o)*o. So, it is O(p/o^2)
Look more carefully.
[Edit: sorry, I see you already did]
Avatar of zizi21

ASKER

Sorry ozo, that is what I wrote and then, I realise that it is o^2.  

This means that the big o notation is very simple. That is, O(p). It doesn't matter whether you read the files in blocks or at once.

Sorry for sounding stupid. p/o is a variable (and not a constant) because it gives a different value for different files. But, it would give similar value for the same input file. Did I understand this wrong ?
What if you try using actual values for demonstration?

Let's say you have a file of length 10 bytes (p). If you read 1 byte at a time, then you have 10 rounds, and that would be O(p). If you now break that up into 2 byte chunks (o), you now have 5 rounds. Since it takes 5 rounds to read the file, should the overall time not be O(p/o)--the overall size of the file divided by the size of the chunk. (I am assuming you're considering the time taken to read a chunk to be constant time, and not factoring in the reading of each byte into the complexity of reading a chunk.) I am not sure why you would want to multiply p or o to the division.
Avatar of zizi21

ASKER

Hi kaufmed,

I am a newbie in big o notations (I am trying to understand myself). This is what I understand (I may be wrong).

Say o is 5 and p is 10
It takes O(5) to read the first 5 characters
Another O(5) to read the second 5 characters

In total, it takes O(5+5) = O(10)
Now, we can get the same number if take the number of rounds which is 2 here and multply it with O(5) which gives O(10).

The confusion is that you need to ignore the constants but you keep the variables..
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of zizi21

ASKER

Thanks very much for the replies..
Avatar of zizi21

ASKER

I think, i understand now. When p doubles, the time doubles but the doubling is never reflected in the big oh notation.
I would say that the doubling is reflected in the big oh notation in the sense that
O(p) implies that when p doubles, the time doubles