• Status: Solved
  • Priority: Medium
  • Security: Public

C code optimization

I have attached C code that will run on a DSP.
I need to optimise it for speed (fewer cycles). Please suggest some ways.

Note:
WIDTHX is the width of the image and HEIGHTY is its height. frame_1 is a 1-D array in which the
pixels are stored sequentially. I have attached the original and the modified code.
The original took around 111 million cycles; I modified it to take 55 million. Further modifications are necessary:
I need to bring it below 10 million.

Attachments: modified.c, Original.c
Asked by: srimallikarthik
2 Solutions
 
satsumo (Software Developer) commented:
You could build a 256 by 256 table that says which combinations of gradientX and gradientY pass the edge test. Rather than calling _mpyu twice, just look up the edgemap value (0 or 255) in the table. At 1 byte per combination the table is 64 KB. However, assuming the test treats gradientX and gradientY symmetrically, you only need half of it, and in theory only 1 bit per combination, so 4 KB is the minimum.
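The original code is not shown here, so the following is only a sketch of the table idea under stated assumptions: the edge test is taken to be "gradientX squared plus gradientY squared exceeds a squared threshold", and both gradient magnitudes are taken to fit in 0..255. The names `edge_lut`, `build_edge_lut` and `edge_value` are made up for illustration. It uses the simple 1-byte-per-entry layout, not the packed 4 KB one.

```c
#include <stdint.h>

#define GRAD_LEVELS 256  /* assumption: gradient magnitudes fit in 0..255 */

/* 64 KB table: one byte (0 or 255) per (gradientX, gradientY) pair. */
static uint8_t edge_lut[GRAD_LEVELS][GRAD_LEVELS];

/* Fill the table once, before any frame is processed.
 * threshold_sq is the squared edge threshold (assumed form of the test). */
void build_edge_lut(uint32_t threshold_sq)
{
    for (uint32_t gx = 0; gx < GRAD_LEVELS; gx++) {
        for (uint32_t gy = 0; gy < GRAD_LEVELS; gy++) {
            uint32_t mag_sq = gx * gx + gy * gy;
            edge_lut[gx][gy] = (mag_sq > threshold_sq) ? 255 : 0;
        }
    }
}

/* Per-pixel test: a single load replaces two multiplies, an add and a compare. */
static inline uint8_t edge_value(uint8_t abs_gx, uint8_t abs_gy)
{
    return edge_lut[abs_gx][abs_gy];
}
```

Whether the load is actually cheaper than two multiplies depends on where the table lives; a 64 KB table that does not fit in fast on-chip memory may cost more than it saves.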
 
satsumo (Software Developer) commented:
Will locality of reference make a difference on the DSP? If the image is very wide, will it miss the cache when referring to the line below? If so, you might consider dividing the image into smaller sub-images, if that helps the caching.
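One way to read the sub-image idea is to walk the frame in vertical strips, so that the current row and the row below stay close together in memory within a strip. This is a sketch only: the kernel `edge_at` (forward differences to the right and below) is a stand-in for the unseen original, and `edges_tiled`, `tile_w` and `threshold` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical per-pixel kernel: differences to the right and to the row below. */
static uint8_t edge_at(const uint8_t *src, int width, int x, int y, int threshold)
{
    const uint8_t *p = src + y * width + x;
    int gx = p[1] - p[0];
    int gy = p[width] - p[0];
    return (gx * gx + gy * gy > threshold * threshold) ? 255 : 0;
}

/* Process the frame strip by strip; each strip is tile_w pixels wide.
 * The last row and last column are left untouched because the kernel
 * reads one pixel to the right and one row below. */
void edges_tiled(const uint8_t *src, uint8_t *dst,
                 int width, int height, int tile_w, int threshold)
{
    for (int x0 = 0; x0 < width - 1; x0 += tile_w) {
        int x1 = x0 + tile_w;
        if (x1 > width - 1)
            x1 = width - 1;          /* clip the final strip */
        for (int y = 0; y < height - 1; y++)
            for (int x = x0; x < x1; x++)
                dst[y * width + x] = edge_at(src, width, x, y, threshold);
    }
}
```

The output is the same for any `tile_w`; only the memory access order changes, so the strip width can be tuned by measurement on the target.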
 
satsumo (Software Developer) commented:
You might also try referring to each source pixel through its own pointer. That way there are four increment/decrement operations per pixel but no arithmetic on indexes.
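A rough illustration of the pointer approach, assuming a kernel that reads the current pixel, its right neighbour and the pixel below (the real kernel is in the attachments and may differ). The function name `edges_ptr` and the parameter `threshold_sq` are invented for this sketch.

```c
#include <stdint.h>

/* Each source pixel and the output have their own pointer; the inner loop
 * only increments pointers and never computes y * width + x. */
void edges_ptr(const uint8_t *src, uint8_t *dst,
               int width, int height, int threshold_sq)
{
    for (int y = 0; y < height - 1; y++) {
        const uint8_t *cur   = src + y * width;   /* current pixel   */
        const uint8_t *right = cur + 1;           /* right neighbour */
        const uint8_t *below = cur + width;       /* pixel below     */
        uint8_t *out = dst + y * width;

        for (int x = 0; x < width - 1; x++) {
            int gx = *right++ - *cur;
            int gy = *below++ - *cur++;
            *out++ = (gx * gx + gy * gy > threshold_sq) ? 255 : 0;
        }
    }
}
```

The row-start pointers are still computed with one multiply per row; that could also be replaced by adding `width` each row if the compiler does not already do so.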
 
ozo commented:
If you are squaring gradientX and gradientY, there is no need for the abs operation.
But getting from 55 million to 10 million would probably require better use of the parallelism in the DSP.
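The point about abs follows from (-g) * (-g) == g * g, as long as the multiply is a signed one. A minimal sketch in plain C, with `mag_sq` as a hypothetical helper name:

```c
/* Squared gradient magnitude from signed gradients; no abs() is needed
 * because squaring discards the sign. */
int mag_sq(int gx, int gy)
{
    return gx * gx + gy * gy;
}
```

For example, `mag_sq(-3, 4)` and `mag_sq(3, 4)` both give 25. Note that this relies on a signed multiply: if the code uses an unsigned intrinsic such as _mpyu on the raw signed differences, the result would be wrong, so the signed counterpart would presumably be needed instead. That is an assumption about the attached code, which is not shown here.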
 
srimallikarthik (Author) commented:
Experts,

Thanks for the support and comments.

Firstly, sorry for the delayed response.

I got it down to 10 million cycles by changing the code as per satsumo's suggestion of using pointer arithmetic for incrementing and decrementing.

This is not the whole solution, though. I used satsumo's pointer suggestion only for the index calculation; the rest of the implementation I did in assembly.


~Karthik
 
srimallikarthik (Author) commented:
The suggestions were helpful indeed, but they were not one-shot answers that saved me time, so grade B.