Performance of Register variable and Array variable

Posted on 2003-03-08
Medium Priority
Last Modified: 2010-04-21
  I am using Red Hat 8 on an Intel Pentium III 533 MHz Dell PC, with GCC 3.2.
  I have some code that reads data from a file, processes it, and writes it back to a file, and I record the total time taken, as below:

// declaration
  register unsigned long A, B, C, D;
  unsigned long data[2048];

I read data from the file into data[2048],
then loop over the data, processing it with A, B, C, D (4 unsigned long per iteration) and leaving the result in A, B, C, D (it is actually a modified MD5).

  Lastly, I need to XOR the result back into the data:
  data[i++]^=A; data[i++]^=B; data[i++]^=C; data[i++]^=D;

  After processing all 2048 unsigned long values, I write data[2048] to the output file.

I found that with the following XOR
  data[i++]^=A; data[i++]^=B; data[i++]^=C; data[i++]^=D;
the run takes around 3.6 seconds.
But if the statement becomes
  A^=data[i++]; B^=data[i++]; C^=data[i++]; D^=data[i++];

the run takes only 1.9 seconds.
But I can't use that version, because I can't write A, B, C, D to the output file directly. (I even tried putchar, and the result was even worse.)

I need your advice on how to solve the problem.
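
For readers following along, the flow described in the question might look roughly like the sketch below. This is only an illustration: mix_block is a hypothetical stand-in for the modified MD5 step (the poster's actual algorithm is not shown in the thread), and the seed values and the name process_buffer are assumptions as well. The buffer is written with a single fwrite call rather than one putchar per byte.

```c
#include <stdio.h>
#include <stddef.h>

#define NWORDS 2048

/* Hypothetical stand-in for the modified MD5 step; NOT the poster's code. */
static void mix_block(unsigned long *A, unsigned long *B,
                      unsigned long *C, unsigned long *D)
{
    *A += *D;
    *B ^= *A;
    *C += *B;
    *D ^= *C;
}

/* Reads up to NWORDS words from in, XORs each group of four words with
   the running state, and writes the whole buffer to out with one fwrite.
   Returns the number of words written. */
size_t process_buffer(FILE *in, FILE *out)
{
    unsigned long data[NWORDS];
    unsigned long A = 1, B = 2, C = 3, D = 4;  /* arbitrary seeds (assumption) */
    size_t n = fread(data, sizeof data[0], NWORDS, in);
    size_t i = 0;

    while (i + 4 <= n) {
        mix_block(&A, &B, &C, &D);
        data[i++] ^= A;
        data[i++] ^= B;
        data[i++] ^= C;
        data[i++] ^= D;
    }
    return fwrite(data, sizeof data[0], n, out);
}
```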


Question by:siakhooi
Expert Comment

ID: 8124741
>I need your advice how to solve the problem

But you didn't say what the problem is.

Is the problem that you need for the job to be done in 1.9 seconds?  It is apparently not possible on your system to do it that fast.

Is the problem that you don't understand why the 2nd variation takes less time than the first?  Here's why:  The data[i++] ^= A does all the same things as A ^= data[i++], PLUS stores the result in memory.  An exclusive-or operation takes place in registers.  When memory locations (such as data[i]) are involved, the program has to load or store in addition to doing the xor.
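
A minimal sketch of the two variations side by side, to make the difference concrete. The function names xor_store and xor_load_only are made up for this illustration, and the two functions do not compute the same thing; they only mirror the two statements being timed in the question.

```c
#include <stddef.h>

/* Variation 1: XOR into the array. Each element is loaded, XORed,
   and stored back, so every statement costs a load plus a store. */
void xor_store(unsigned long *data, size_t n,
               unsigned long A, unsigned long B,
               unsigned long C, unsigned long D)
{
    size_t i = 0;
    while (i + 4 <= n) {
        data[i++] ^= A;
        data[i++] ^= B;
        data[i++] ^= C;
        data[i++] ^= D;
    }
}

/* Variation 2: XOR into locals. Each element is only loaded; the
   results can stay in registers and nothing is written to the array. */
unsigned long xor_load_only(const unsigned long *data, size_t n,
                            unsigned long A, unsigned long B,
                            unsigned long C, unsigned long D)
{
    size_t i = 0;
    while (i + 4 <= n) {
        A ^= data[i++];
        B ^= data[i++];
        C ^= data[i++];
        D ^= data[i++];
    }
    /* Fold the values together so the compiler cannot discard the work. */
    return A ^ B ^ C ^ D;
}
```

Compiling both with optimization and inspecting the assembly (for example with gcc -O2 -S) should show the extra store instructions in the first loop.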

btw, the "register" keyword is only a hint, and Gcc largely ignores it when optimizing.  Gcc decides for itself what variables should be registers.

Author Comment

ID: 8124927
Oh, my mistake.  :-)

OK. What I actually need to know is how to reduce the time from 3.6s to less than 2s.


Expert Comment

ID: 8129158
You can probably squeeze a little more speed out of it by making it an array of 512 of struct {unsigned long A; unsigned long B; unsigned long C; unsigned long D;} and thus incrementing the index 1/4 as many times.  Gcc's -funroll-loops optimization option may help a little too.

But you really can't get around the time it takes to store 2048 words into memory.
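
A rough sketch of the struct layout suggested above, assuming the 2048 words are regrouped as 512 blocks of four. The names struct quad, NUM_BLOCKS and xor_blocks are invented for this example, and A through D are held constant here only to keep it short; in the real code they would change for every block.

```c
#include <stddef.h>

#define NUM_BLOCKS 512  /* 2048 words / 4 words per block */

/* One block of four words, replacing four consecutive array slots. */
struct quad {
    unsigned long A, B, C, D;
};

/* XORs the four values into every block. The index is incremented once
   per block instead of four times, as the comment above suggests. */
void xor_blocks(struct quad *blocks, size_t nblocks,
                unsigned long A, unsigned long B,
                unsigned long C, unsigned long D)
{
    size_t i;
    for (i = 0; i < nblocks; i++) {
        blocks[i].A ^= A;
        blocks[i].B ^= B;
        blocks[i].C ^= C;
        blocks[i].D ^= D;
    }
}
```

A caller would declare struct quad data[NUM_BLOCKS]; and pass it with NUM_BLOCKS as the count. The memory layout is the same 2048 words, so the buffer can still be read and written in one piece; the stores themselves remain.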

Expert Comment

ID: 8182436
I am not too sure, but would the -O3 flag at compilation help?

Author Comment

ID: 8186098
No, and even adding -mcpu=pentium3 does not help.

Expert Comment

ID: 10102007
No comment has been added lately, so it's time to clean up this TA.
I will leave the following recommendation for this question in the Cleanup topic area:

PAQ with points refunded

Please leave any comments here within the next seven days.

EE Cleanup Volunteer

Accepted Solution

modulo earned 0 total points
ID: 10156958
PAQed, with points refunded (100)

Community Support Moderator
