Solved

An amazing problem using inline assembler in VS.NET

Posted on 2003-11-15
Last Modified: 2007-12-19
The code below uses SSE to compute the product of two 4x4 matrices. I built it with Visual Studio .NET 2003 and it fails with a runtime error, but it works when I change
float m2[4][4],m1[4][4],m3[4][4];
to
float m1[4][4],m2[4][4],m3[4][4];
Is this a bug in the compiler? Can someone try it in another environment?
 

/////////////////////////////

#include<stdio.h>

#define FILE_IN "input.txt"
#define FILE_OUT "output.txt"

void MultiMatrix(float dest[4][4],float src1[4][4],float src2[4][4])
{
    _asm
    {
        mov     ecx,src1;
        mov     edx,src2;
        mov     eax,dest;
        movss   xmm0,[ecx];
        shufps  xmm0,xmm0,00h;
        movss   xmm1,[ecx+4];
        shufps  xmm1,xmm1,00h;
        movss   xmm2,[ecx+8];
        shufps  xmm2,xmm2,00h;
        movss   xmm3,[ecx+12];
        shufps  xmm3,xmm3,00h;
        movaps  xmm4,[edx];
        movaps  xmm5,[edx+16];
        movaps  xmm6,[edx+32];
        movaps  xmm7,[edx+48];
        mulps   xmm0,xmm4;
        mulps   xmm1,xmm5;
        mulps   xmm2,xmm6;
        mulps   xmm3,xmm7;
        addps   xmm0,xmm1;
        addps   xmm0,xmm2;
        addps   xmm0,xmm3;
        movups  [eax],xmm0;
       
        movss   xmm0,[ecx+16];
        shufps  xmm0,xmm0,00h;
        movss   xmm1,[ecx+20];
        shufps  xmm1,xmm1,00h;
        movss   xmm2,[ecx+24];
        shufps  xmm2,xmm2,00h;
        movss   xmm3,[ecx+28];
        shufps  xmm3,xmm3,00h;
        mulps   xmm0,xmm4;
        mulps   xmm1,xmm5;
        mulps   xmm2,xmm6;
        mulps   xmm3,xmm7;
        addps   xmm0,xmm1;
        addps   xmm0,xmm2;
        addps   xmm0,xmm3;
        movups  [eax+16],xmm0;

        movss   xmm0,[ecx+32];
        shufps  xmm0,xmm0,00h;
        movss   xmm1,[ecx+36];
        shufps  xmm1,xmm1,00h;
        movss   xmm2,[ecx+40];
        shufps  xmm2,xmm2,00h;
        movss   xmm3,[ecx+44];
        shufps  xmm3,xmm3,00h;
        mulps   xmm0,xmm4;
        mulps   xmm1,xmm5;
        mulps   xmm2,xmm6;
        mulps   xmm3,xmm7;
        addps   xmm0,xmm1;
        addps   xmm0,xmm2;
        addps   xmm0,xmm3;
        movups  [eax+32],xmm0;
       
        movss   xmm0,[ecx+48];
        shufps  xmm0,xmm0,00h;
        movss   xmm1,[ecx+52];
        shufps  xmm1,xmm1,00h;
        movss   xmm2,[ecx+56];
        shufps  xmm2,xmm2,00h;
        movss   xmm3,[ecx+60];
        shufps  xmm3,xmm3,00h;
        mulps   xmm0,xmm4;
        mulps   xmm1,xmm5;
        mulps   xmm2,xmm6;
        mulps   xmm3,xmm7;
        addps   xmm0,xmm1;
        addps   xmm0,xmm2;
        addps   xmm0,xmm3;
        movups  [eax+48],xmm0;
    }
}

int main()
{
    int i,j;
    FILE *fin,*fout;
    float m2[4][4],m1[4][4],m3[4][4];
   
    fin = fopen(FILE_IN,"r");
    fout = fopen(FILE_OUT,"w");

    for(i=0;i<4;++i)
        for(j=0;j<4;++j)
            fscanf(fin,"%f",&m1[i][j]);
           
    for(i=0;i<4;++i)
        for(j=0;j<4;++j)
            fscanf(fin,"%f",&m2[i][j]);            
           
    MultiMatrix(m3,m1,m2);
   
    for(i=0;i<4;++i)
    {
        for(j=0;j<4;++j)
            fprintf(fout,"%f ",m3[i][j]);
        fprintf(fout,"\n");
    }

    fclose(fin);
    fclose(fout);

    return 0;

}

//////////////////////
input.txt

1.2 2.1 3.2 0
1.0 1.0 1.0 1.0
3.0 4.1 5.2 192.1
2.3 4.3 5.8 6.0

1.2 2.1 1.0 0
11.0 1.0 1.0 1.0
3.0 24.1 1.0 192.1
22.3 4.35 1.0 6.01

/////////////////////
Question by: atlantis13579
1 Comment

Accepted Solution

by: mtmike (earned 250 total points)
ID: 9753852
Is the runtime error a general protection fault?

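The "movaps" instruction can only load or store 16-byte aligned data; given a misaligned address it raises a general protection fault. Your code reads src2 with "movaps", so it fails whenever m2 does not happen to start on a 16-byte boundary. Use "movups" to load and store unaligned data.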
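
For comparison, here is a minimal sketch of the same row-by-row product written with SSE intrinsics instead of inline assembler, using only unaligned loads and stores. The function name MultiMatrixUnaligned is made up for illustration, and it assumes a compiler that ships <xmmintrin.h>. _mm_loadu_ps and _mm_storeu_ps are the intrinsic counterparts of "movups", and _mm_set1_ps does the job of the "movss" plus "shufps 00h" pair.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_set1_ps, ... */

/* Hypothetical name. Row-major 4x4 product dest = src1 * src2,
   touching memory only through unaligned loads and stores. */
void MultiMatrixUnaligned(float dest[4][4], float src1[4][4], float src2[4][4])
{
    /* Unaligned loads of the four rows of src2: safe at any address. */
    __m128 r0 = _mm_loadu_ps(src2[0]);
    __m128 r1 = _mm_loadu_ps(src2[1]);
    __m128 r2 = _mm_loadu_ps(src2[2]);
    __m128 r3 = _mm_loadu_ps(src2[3]);
    int i;

    for (i = 0; i < 4; ++i) {
        /* Broadcast each element of row i of src1 and scale a row of src2. */
        __m128 acc = _mm_mul_ps(_mm_set1_ps(src1[i][0]), r0);
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(src1[i][1]), r1));
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(src1[i][2]), r2));
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(src1[i][3]), r3));
        /* Unaligned store of result row i. */
        _mm_storeu_ps(dest[i], acc);
    }
}
```

Calling MultiMatrixUnaligned(m3, m1, m2) from the original main() should then work regardless of the order in which the three arrays are declared.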

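Floats are only guaranteed to be 4-byte (dword) aligned, so whether m2 starts on a 16-byte boundary depends on where it lands on the stack. Swapping the declaration order changes that layout, which is why one order crashes and the other works. It is not a compiler bug.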
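
One way to confirm this is to test the addresses directly. The helper below is a hypothetical sketch (IsAligned16 is not part of any library) and assumes a compiler that provides <stdint.h>, which VS.NET 2003 may not; there, casting the pointer to size_t should serve the same purpose.

```c
#include <stdint.h>  /* uintptr_t; assumes a C99-capable compiler */

/* Hypothetical helper: returns 1 if p lies on a 16-byte boundary, else 0. */
int IsAligned16(const void *p)
{
    /* A 16-byte aligned address has its low four bits clear. */
    return ((uintptr_t)p & 15u) == 0;
}
```

Printing IsAligned16(m1), IsAligned16(m2) and IsAligned16(m3) in main() for both declaration orders should show which array the "movaps" loads trip over.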

Alternatively, keep "movaps" and ask the compiler to align the float matrices on a 16-byte boundary with __declspec(align(16)):
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/vcrefalign.asp
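
A minimal sketch of that approach follows. The ALIGN16 macro and the AlignedMatricesDemo function are names invented for this example; the GCC/Clang branch is an assumption added only so the snippet builds outside Visual C++.

```c
#include <stddef.h>  /* size_t */

#if defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))         /* Visual C++ */
#else
#define ALIGN16 __attribute__((aligned(16)))  /* GCC/Clang (assumption) */
#endif

/* Hypothetical demo: returns 1 if all three locals start on a
   16-byte boundary, even in the declaration order that crashed. */
int AlignedMatricesDemo(void)
{
    ALIGN16 float m2[4][4];
    ALIGN16 float m1[4][4];
    ALIGN16 float m3[4][4];

    /* Only the addresses are inspected; the contents are never read. */
    return ((size_t)m1 % 16 == 0) &&
           ((size_t)m2 % 16 == 0) &&
           ((size_t)m3 % 16 == 0);
}
```

The alignment has to be requested where the arrays are defined, that is in main(). The parameters of MultiMatrix are plain pointers, so the function simply receives whatever alignment the caller's arrays have.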