C code optimization

Posted on 2011-04-22
Last Modified: 2012-05-11
I have attached C code which will run on a DSP.
I need to optimise it for speed (fewer cycles). Please suggest some ways.

WIDTHX is the width of the image and HEIGHTY is its height. frame_1 is a 1-D array in which the
pixels are stored sequentially. I have attached the original and the modified code.
The original took around 111 million cycles. I modified it to take 55 million, but further modifications are necessary:
I need to bring it below 10 million. (Attachment: modified.c)
Question by: srimallikarthik
    LVL 12

    Expert Comment

    You could build a 256 by 256 table that says which combinations of gradientX and gradientY pass the edge test.  Rather than doing _mpyu twice, just look up the edgemap value (0 or 255) in the table.  At 1 byte per combination the table is 64 KB, though you only need half of it and theoretically only 1 bit per combination, so 4 KB as a minimum.
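The table idea above can be sketched as follows. The original code is not shown in the thread, so this assumes the edge test compares the squared gradient magnitude against a threshold, and that both gradients have already been reduced to the range 0..255. `EDGE_THRESHOLD_SQ`, `edge_test_direct`, `build_edge_table` and `edge_lookup` are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical threshold on gx*gx + gy*gy; the real value is not given. */
#define EDGE_THRESHOLD_SQ 10000u

/* 256 x 256 entries at 1 byte each = 64 KB. */
static uint8_t edge_table[256][256];

/* Assumed form of the edge test: two multiplies, an add and a compare. */
static uint8_t edge_test_direct(uint32_t gx, uint32_t gy)
{
    return (gx * gx + gy * gy > EDGE_THRESHOLD_SQ) ? 255 : 0;
}

/* Fill the table once, before any frame is processed. */
void build_edge_table(void)
{
    for (uint32_t gx = 0; gx < 256; gx++)
        for (uint32_t gy = 0; gy < 256; gy++)
            edge_table[gx][gy] = edge_test_direct(gx, gy);
}

/* Per-pixel work becomes a single load instead of two multiplies. */
static inline uint8_t edge_lookup(uint8_t gx, uint8_t gy)
{
    return edge_table[gx][gy];
}
```

Under this assumed test the table is symmetric in gx and gy, which is why only half of it is strictly needed. Whether a 64 KB table is faster than two multiplies depends on the memory it lives in, so it is worth measuring on the target.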
    LVL 12

    Expert Comment

    Will locality of reference make a difference on the DSP?  If the image is very wide, will it miss the cache when referring to the line below?  If so, you might consider dividing the image into smaller sub-images, if that helps the caching.
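One way to read the sub-image suggestion is to walk the frame in vertical strips, so the rows above and below the current pixel are only a strip-width apart in the access pattern. This is a rough sketch, assuming a central-difference gradient and a squared-magnitude edge test, neither of which is confirmed by the thread. `edge_at`, `edges_strips` and `EDGE_THRESHOLD_SQ` are hypothetical names, and the strip width would need tuning to the actual cache.

```c
#include <stdint.h>

/* Hypothetical threshold on gx*gx + gy*gy. */
#define EDGE_THRESHOLD_SQ 10000

/* Assumed per-pixel test: central differences in x and y. */
static uint8_t edge_at(const uint8_t *frame, int width, int x, int y)
{
    int gx = frame[y * width + x + 1] - frame[y * width + x - 1];
    int gy = frame[(y + 1) * width + x] - frame[(y - 1) * width + x];
    return (gx * gx + gy * gy > EDGE_THRESHOLD_SQ) ? 255 : 0;
}

/* Process interior pixels one vertical strip at a time.
 * Border pixels of edgemap are left untouched. */
void edges_strips(const uint8_t *frame, uint8_t *edgemap,
                  int width, int height, int strip_width)
{
    for (int x0 = 1; x0 < width - 1; x0 += strip_width) {
        int x1 = x0 + strip_width;
        if (x1 > width - 1)
            x1 = width - 1;              /* clamp the last strip */
        for (int y = 1; y < height - 1; y++)
            for (int x = x0; x < x1; x++)
                edgemap[y * width + x] = edge_at(frame, width, x, y);
    }
}
```

The result is identical for any strip width; only the order of memory accesses changes, so the benefit (if any) has to be confirmed by profiling on the DSP.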
    LVL 12

    Accepted Solution

    You might also try referring to each source pixel with its own pointer.  So there will be 4 increment/decrement operations but no arithmetic on indexes.
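The pointer suggestion can be illustrated by comparing an index-based loop with one that keeps a separate pointer for each of the four neighbouring source pixels. This is only a sketch: it assumes a central-difference gradient and a squared-magnitude threshold, and `edges_indexed`, `edges_pointers` and `EDGE_THRESHOLD_SQ` are hypothetical names, not the code attached to the question.

```c
#include <stdint.h>

/* Hypothetical threshold on gx*gx + gy*gy. */
#define EDGE_THRESHOLD_SQ 10000

/* Index-based version: recomputes y*width + x offsets for every pixel. */
void edges_indexed(const uint8_t *frame, uint8_t *edgemap,
                   int width, int height)
{
    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            int gx = frame[y * width + x + 1] - frame[y * width + x - 1];
            int gy = frame[(y + 1) * width + x] - frame[(y - 1) * width + x];
            edgemap[y * width + x] =
                (gx * gx + gy * gy > EDGE_THRESHOLD_SQ) ? 255 : 0;
        }
    }
}

/* Pointer version: one pointer per source pixel, set up once per row,
 * then only increments inside the inner loop. */
void edges_pointers(const uint8_t *frame, uint8_t *edgemap,
                    int width, int height)
{
    for (int y = 1; y < height - 1; y++) {
        const uint8_t *left  = frame + y * width;           /* (x-1, y) */
        const uint8_t *right = left + 2;                    /* (x+1, y) */
        const uint8_t *up    = frame + (y - 1) * width + 1; /* (x, y-1) */
        const uint8_t *down  = frame + (y + 1) * width + 1; /* (x, y+1) */
        uint8_t *out = edgemap + y * width + 1;

        for (int x = 1; x < width - 1; x++) {
            int gx = *right++ - *left++;
            int gy = *down++ - *up++;
            *out++ = (gx * gx + gy * gy > EDGE_THRESHOLD_SQ) ? 255 : 0;
        }
    }
}
```

Both functions leave the border pixels of `edgemap` untouched and produce the same interior values. A good optimising compiler may already perform this strength reduction, so the gain should be checked in the generated assembly.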
    LVL 84

    Expert Comment

    If you are multiplying gradientX and gradientY each by itself, there is no need for the abs operation.
    But getting from 55 to 10 million would probably require better use of the parallelism in the DSP.
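The point about abs follows from (-g) * (-g) == g * g: squaring already discards the sign. A minimal check, assuming the edge test uses the sum of the squared gradients (`mag_sq_with_abs` and `mag_sq` are hypothetical names):

```c
#include <stdlib.h>

/* Squared magnitude with the redundant abs calls. */
int mag_sq_with_abs(int gx, int gy)
{
    int ax = abs(gx);
    int ay = abs(gy);
    return ax * ax + ay * ay;
}

/* Same result without abs: a signed multiply handles negative inputs. */
int mag_sq(int gx, int gy)
{
    return gx * gx + gy * gy;
}
```

One caveat: if the `_mpyu` mentioned earlier is an unsigned multiply intrinsic (an assumption, since the code is not shown), dropping abs would also mean switching to a signed multiply so that negative gradients are handled correctly.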

    Assisted Solution


    Thanks for the support and comments.

    Firstly, sorry for the delayed response.

    I got it down to 10 million cycles by changing the code as per satsumo's suggestion of using pointer arithmetic for incrementing and decrementing.

    This is not the total solution, though. I used satsumo's pointer suggestion only for the index calculation; the rest of the implementation I did in assembly.


    Author Closing Comment

    The suggestions were helpful indeed, but they were not one-shot answers that saved my time, so grade B.
