Algorithm loops in PIII assembly: Help requested

Asked by dude_1967 in Assembly Programming Language

Tags: assembly, mul, piii

Hello,

Here is a rather long question with several parts. Hopefully there are some patient experts out there...

I'm developing a multiplication algorithm in a mixed C++/ASM project. The critical inner loops are shown in C++ below. I would like to work out a hand-crafted Pentium III assembler implementation of these inner loops.

I have moderate experience with such matters and would be in a position to design such loops using 386 assembler. However, I would very much like to find out whether it is possible to improve the run-time characteristics using MMX instructions and 64-bit registers. I have never programmed MMX assembler. I do not want to go all the way up to the Pentium 4 and SSE2, but would rather stay with MMX, since a lot of clients do not have support for SSE2.

The questions are then:

1) In the algorithm below, can the variables carry and sum be effectively manipulated using 64-bit MMX registers and are there any advantages here?
2) For the multiplication in the inner loop, is there an MMX method for a u32 x u32 --> u64 multiply which is superior to the old mul instruction with its result in edx:eax?

The algorithm in C++ is shown here. Please take a look.

static void test_mul_loop(const UINT32* const pu,
                          const UINT32* const pv,
                                UINT32* const pw)
{
  // Standard Order(N^2) multiplication in C++, taken from
  // an empirical analysis of polynomial multiplication
  // created for several cases using Mathematica 4.1.
  // Constraint on assembler implementation: 'number' (the
  // array length) cannot be unlimited in size, since the division
  // carry = sum / mask should remain a u64 / u32 --> u32 quotient
  // with a u32 remainder.
  UINT64 carry = 0;
  for(INT32 j = number - 1; j >= 0; j--)
  {
    UINT64 sum = carry;
    for(INT32 i = 0; i <= j; i++)
    {
      sum += pu[j - i] * static_cast<UINT64>(pv[i]);
    }
    pw[j + 1] = static_cast<UINT32>(sum % mask);
    carry     = sum / mask;
  }
}
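
The loop above relies on `number`, `mask`, and the `UINT32`/`UINT64`/`INT32` typedefs, none of which are defined in the post. A minimal self-contained sketch, assuming 4 limbs per operand and a limb base of 10^8 (both values are illustrative assumptions, as is the name `mul_loop_sketch`), might look like this. Unlike the original, it returns the final carry, which the posted loop computes but never stores.

```cpp
#include <cstdint>

typedef std::uint32_t UINT32;
typedef std::uint64_t UINT64;
typedef std::int32_t  INT32;

// Assumed values, not given in the post.
static const INT32  number = 4;           // limbs per operand
static const UINT32 mask   = 100000000u;  // limb base 10^8

// Same column-by-column loop as in the post. Limb 0 is the most
// significant one. Returns the carry out of the top column.
static UINT32 mul_loop_sketch(const UINT32* const pu,
                              const UINT32* const pv,
                                    UINT32* const pw)
{
  UINT64 carry = 0;
  for (INT32 j = number - 1; j >= 0; j--)
  {
    UINT64 sum = carry;
    for (INT32 i = 0; i <= j; i++)
    {
      // u32 x u32 --> u64 product, accumulated in 64 bits.
      sum += pu[j - i] * static_cast<UINT64>(pv[i]);
    }
    pw[j + 1] = static_cast<UINT32>(sum % mask);
    carry     = sum / mask;
  }
  return static_cast<UINT32>(carry);
}
```

With base 10^8 each product is below 10^16, so a 64-bit `sum` has room for well over a thousand products per column. The tighter limit is the one the code comment names: the quotient `sum / mask` has to fit in 32 bits.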

Thank you very much for any assistance.

Sincerely, Chris.
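
For reference on questions 1) and 2): to my knowledge, the 64-bit add (`paddq`) and the u32 x u32 --> u64 multiply (`pmuludq`) on MMX registers only arrived with SSE2, so on a Pentium III the baseline to beat is the integer-unit sequence `mul` followed by `add`/`adc`. The sketch below models that sequence in portable C++ by holding the 64-bit `sum` as two 32-bit halves. The names `Acc64`, `mul_accumulate`, and `to_u64` are hypothetical, and this only illustrates the baseline; it is not the accepted solution.

```cpp
#include <cstdint>

// Hypothetical helper: a 64-bit accumulator kept as two 32-bit halves,
// the way a register pair would hold it.
struct Acc64
{
  std::uint32_t lo;
  std::uint32_t hi;
};

// acc += a * b, written the way mul + add/adc would compute it.
inline void mul_accumulate(Acc64& acc, std::uint32_t a, std::uint32_t b)
{
  // mul: full 64-bit product, high half in edx, low half in eax.
  const std::uint64_t product = static_cast<std::uint64_t>(a) * b;
  const std::uint32_t p_lo = static_cast<std::uint32_t>(product);
  const std::uint32_t p_hi = static_cast<std::uint32_t>(product >> 32);

  // add: low halves; unsigned wrap-around signals a carry out.
  acc.lo += p_lo;
  const std::uint32_t carry = (acc.lo < p_lo) ? 1u : 0u;

  // adc: high halves plus the carry from the low add.
  acc.hi += p_hi + carry;
}

// Reassemble the two halves into one 64-bit value.
inline std::uint64_t to_u64(const Acc64& acc)
{
  return (static_cast<std::uint64_t>(acc.hi) << 32) | acc.lo;
}
```

Calling `mul_accumulate` once per inner-loop iteration reproduces `sum += pu[j - i] * static_cast<UINT64>(pv[i])`, including wrap-around modulo 2^64.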
Accepted Solution, 02/25/04 06:32 AM, ID: 10450619

Solution Provided By: Dancie
Participating Experts: 2
Solution Grade: A
 
Assisted Solution, 02/24/04 08:32 AM, ID: 10442218

Author Comment, 02/24/04 09:09 AM, ID: 10442628

Author Comment, 02/25/04 09:16 AM, ID: 10452131

Expert Comment, 02/25/04 09:26 AM, ID: 10452219

Author Comment, 02/26/04 12:27 AM, ID: 10458175
