SIMD - Advantages/Disadvantages and the way to go ....

Posted on 2006-04-01
Last Modified: 2013-12-26

hi all :)

i recently got a cpu that supports sse/sse2. since my projects are mainly
3d & physics simulations, i started playing around with it a little to
see what advantages and disadvantages come up from implementing sse in my basic
layers. i did a lot of benchmarking and found that basically it is vector normalization
and matrix transformation of vector arrays that show a real time improvement. of course,
both are very important for my kind of libraries ..

so i have an important decision to make .. it affects all my libs and apps, since data
must be prepared for those functions and there would no longer be separate vector3 & vector4
types: each must be replaced with a homogeneous vector4, and 3x3 rotation matrices must be
replaced with 4x4 matrices
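
to make the data-layout change concrete, here is a minimal sketch of what such types could look like. the names Vec4, Mat44, makePoint, makeDirection and isAligned16 are hypothetical, and alignas needs a c++11 compiler (older compilers such as vc71 would need something like __declspec(align(16)) instead):

```cpp
#include <cstdint>

// Hypothetical homogeneous vector: 4 floats, 16-byte aligned, so one vector
// fills exactly one 128-bit SSE register.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

// Hypothetical 4x4 matrix stored as four columns; replaces a 3x3 rotation.
struct alignas(16) Mat44 {
    Vec4 col[4];
};

static_assert(sizeof(Vec4) == 16, "Vec4 must fit one 128-bit register");
static_assert(alignof(Vec4) == 16, "Vec4 must be 16-byte aligned");
static_assert(sizeof(Mat44) == 64, "Mat44 must be four packed columns");

// A former vector3 position becomes (x, y, z, 1) ...
inline Vec4 makePoint(float x, float y, float z) {
    return Vec4{x, y, z, 1.0f};
}

// ... and a former vector3 direction becomes (x, y, z, 0), so translations
// in a 4x4 matrix do not affect it.
inline Vec4 makeDirection(float x, float y, float z) {
    return Vec4{x, y, z, 0.0f};
}

// Helper to check the alignment requirement of aligned SSE loads/stores.
inline bool isAligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

the cost is visible directly: every former vector3 grows from 12 to 16 bytes, and every 3x3 matrix from 36 to 64 bytes.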

here are the pros & cons i see so far:

pros:

* prepared for the future ?!?
* time improvement

cons:

* data must be aligned to 16 bytes and must fit into a 128-bit register
* to have a consistent library, i always have to use vectors with 4 components, even
   if i only need three; the same goes for 3x3 matrices
* code maintenance is more complex at the lower layer, since some functions are
   implemented in 2 ways
* the library has to check for sse at runtime and set function pointers to decide which
   function to use, with or without sse
* pure c/c++ code seems likely to stay valid for longer and is cpu-independent, and fpus
   are getting faster
* increased memory size, but that's not really an issue for me these days ..
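
the runtime check with function pointers could be sketched roughly like this. all names (normalize4Scalar, normalize4Sse, cpuHasSse, g_normalize4, initMathLib, MATHLIB_BUILD_HAS_SSE) are hypothetical. the sketch assumes gcc/clang or msvc on x86, and it only builds the sse path when the compiler itself targets sse; in a real 32-bit library the sse version would probably live in its own translation unit compiled with sse enabled:

```cpp
#include <cmath>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  include <xmmintrin.h>
#  define MATHLIB_BUILD_HAS_SSE 1  // hypothetical macro: SSE code can be compiled
#endif
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#  include <intrin.h>
#endif

// Portable fallback: normalizes 4 floats in place (w is expected to be 0 for
// directions). Zero-length input is not handled in this sketch.
void normalize4Scalar(float* v) {
    const float len = std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2] + v[3] * v[3]);
    for (int i = 0; i < 4; ++i) v[i] /= len;
}

#ifdef MATHLIB_BUILD_HAS_SSE
// SSE version: v must point to 4 floats aligned to 16 bytes.
void normalize4Sse(float* v) {
    const __m128 a  = _mm_load_ps(v);
    const __m128 sq = _mm_mul_ps(a, a);
    // Two shuffle+add steps leave the sum of all four lanes in every lane.
    const __m128 t   = _mm_add_ps(sq, _mm_shuffle_ps(sq, sq, _MM_SHUFFLE(2, 3, 0, 1)));
    const __m128 sum = _mm_add_ps(t, _mm_shuffle_ps(t, t, _MM_SHUFFLE(1, 0, 3, 2)));
    _mm_store_ps(v, _mm_div_ps(a, _mm_sqrt_ps(sum)));
}
#endif

// Asks the CPU at runtime whether SSE is available.
bool cpuHasSse() {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    return __builtin_cpu_supports("sse") != 0;
#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4];
    __cpuid(regs, 1);
    return (regs[3] & (1 << 25)) != 0;  // CPUID leaf 1, EDX bit 25 = SSE
#else
    return false;
#endif
}

typedef void (*Normalize4Fn)(float*);

// Defaults to the scalar version so it is always safe to call.
Normalize4Fn g_normalize4 = normalize4Scalar;

// Call once at library start-up to pick the implementation.
void initMathLib() {
#ifdef MATHLIB_BUILD_HAS_SSE
    if (cpuHasSse()) g_normalize4 = normalize4Sse;
#endif
}
```

callers then only ever go through g_normalize4, which is where the extra maintenance shows up: two implementations per function plus an indirect call on each use.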

here are my benchmarks on windows with vc71 on a pentium 4

V3_NORMALIZE      23%
V3_LENGTH_SQR    -11%
V3_LENGTH          5%
V3_ADD            -3%
V3_SUB            -1%
V3_MUL            -1%
V3_DOTPRODUCT     -3%
V4_DOTPRODUCT     -0%
M_MUL_V            3%
M_BATCH_MUL_V     22%

M_MUL_V       -> vector4    = matrix44 * vector4
M_BATCH_MUL_V -> vector4[n] = matrix44 * vector4[n]
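
for reference, a batch transform like M_BATCH_MUL_V could be sketched as below. this is not the benchmarked code: Vec4, Mat44 and batchMulV are hypothetical names, the matrix is assumed to be stored column by column, and the code falls back to plain c++ when the compiler does not target sse:

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  include <xmmintrin.h>
#  define BATCH_USE_SSE 1  // hypothetical macro name
#endif

struct alignas(16) Vec4  { float x, y, z, w; };
struct alignas(16) Mat44 { Vec4 col[4]; };  // assumed column-major storage

// out[i] = m * in[i] for i in [0, n). in and out must be 16-byte aligned.
void batchMulV(const Mat44& m, const Vec4* in, Vec4* out, std::size_t n) {
#ifdef BATCH_USE_SSE
    // The four columns are loaded once and stay in registers for the whole
    // batch, which is one reason a batch can beat a single M_MUL_V call.
    const __m128 c0 = _mm_load_ps(&m.col[0].x);
    const __m128 c1 = _mm_load_ps(&m.col[1].x);
    const __m128 c2 = _mm_load_ps(&m.col[2].x);
    const __m128 c3 = _mm_load_ps(&m.col[3].x);
    for (std::size_t i = 0; i < n; ++i) {
        const __m128 v = _mm_load_ps(&in[i].x);
        // Broadcast each component and accumulate col0*x + col1*y + col2*z + col3*w.
        __m128 r = _mm_mul_ps(c0, _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0)));
        r = _mm_add_ps(r, _mm_mul_ps(c1, _mm_shuffle_ps(v, v, _MM_SHUFFLE(1, 1, 1, 1))));
        r = _mm_add_ps(r, _mm_mul_ps(c2, _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 2, 2, 2))));
        r = _mm_add_ps(r, _mm_mul_ps(c3, _mm_shuffle_ps(v, v, _MM_SHUFFLE(3, 3, 3, 3))));
        _mm_store_ps(&out[i].x, r);
    }
#else
    for (std::size_t i = 0; i < n; ++i) {
        const Vec4 v = in[i];  // copy first so in == out is safe
        out[i].x = m.col[0].x * v.x + m.col[1].x * v.y + m.col[2].x * v.z + m.col[3].x * v.w;
        out[i].y = m.col[0].y * v.x + m.col[1].y * v.y + m.col[2].y * v.z + m.col[3].y * v.w;
        out[i].z = m.col[0].z * v.x + m.col[1].z * v.y + m.col[2].z * v.z + m.col[3].z * v.w;
        out[i].w = m.col[0].w * v.x + m.col[1].w * v.y + m.col[2].w * v.z + m.col[3].w * v.w;
    }
#endif
}
```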

the target processors are mainly intel & athlon, 32-bit & 64-bit
the platforms are windows & linux

so my questions are:

1. did you face the same question, and how did you decide? what were your pros & cons?

2. i'd like to have a discussion, to see some aspects i haven't seen yet or haven't seen
    that way .. not only regarding time improvement

actually my intuition tells me the cost is too high. but i think it's an important decision,
so i'd like to have as much input as i can

so thanks in advance for your input :)

Question by: ikework

    Accepted Solution

    Try playing with the compiler optimization settings. Compilers can optimize for specific processor types and can generate SSE code, which can improve program performance.
    If you want to use SSE, use compiler intrinsics instead of Assembly if they are available in your compiler.
    In my experience, using Assembly gives minimal advantage over optimized C++ code. I think using SSE and other low-level technologies is important for library developers (like OpenGL or Intel's IPL and PPL), and not so important for application developers.
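
    As a rough illustration of the two approaches, here is a minimal sketch with hypothetical function names (addArrays, addArraysIntrinsics). The first function is plain C++ that an optimizing compiler may turn into SSE code by itself, for example with GCC/Clang flags such as -O3 -msse2 or MSVC's /arch:SSE2; whether a given compiler version actually vectorizes the loop is not guaranteed. The second function does the same work with intrinsics from xmmintrin.h, without any Assembly.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  include <xmmintrin.h>
#  define ADD_USE_SSE 1  // hypothetical macro name
#endif

// Plain C++: simple enough that a compiler may vectorize it on its own.
void addArrays(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// Same operation written with SSE intrinsics: 4 floats per iteration.
void addArraysIntrinsics(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
#ifdef ADD_USE_SSE
    for (; i + 4 <= n; i += 4) {
        // Unaligned loads/stores, so the arrays need no special alignment.
        const __m128 va = _mm_loadu_ps(a + i);
        const __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail for the remaining 0..3 elements (or everything without SSE).
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

    Measuring both versions with your own compiler and settings is the only reliable way to see whether the intrinsics are worth the extra code.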
