Solved

300pt 80x86 ASM question

Posted on 2002-03-21
7
428 Views
Last Modified: 2011-09-20
Actually, more of a challenge.

http://www.experts-exchange.com/jsp/qManageQuestion.jsp?ta=assembly&qid=20272901

Write it in C/C++ and I'll compile with full optimization and test your code against my soon-to-be-posted ASM solution.  Whiner Take All!

-- Dan
0
Comment
Question by:DanRollins
  • 4
  • 2
7 Comments
 
LVL 2

Expert Comment

by:jonnin
ID: 6888435
Hand crafted asm can be over 100 times faster. Even bad asm will get 10% faster, whats the point?
I'll give it a spin, but I know the result already...

0
 
LVL 2

Expert Comment

by:jonnin
ID: 6888705
Here is the best I can do in short order!

#include<memory> //lazy me!        
#include<cstdlib> //rand()
#include<ctime>
using namespace std;

double jtimer;
inline void start_time()
{
     jtimer = (double)(clock());
}

inline void elapsed_time(char *ch)
{
     jtimer = clock() - jtimer;
     jtimer = jtimer/CLOCKS_PER_SEC;
     printf("%s time %f seconds\n",ch, jtimer);    
}


static unsigned char ascii[255]; //map '.' into the required ascii fields
static unsigned char *data;    
//the data. you did not say well written, you said fast!

inline void f()
{
    printf("%x %x %x %x %x %x %x %x-%x %x %x %x %x %x %x %x",
        data[0],data[1],data[2],data[3],data[4],data[5],data[6],
        data[7],data[8],data[9],data[10],data[11],data[12],data[13],
        data[14],data[15]);
    printf("  %c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c\n",
        ascii[data[0]],
        ascii[data[1]],
        ascii[data[2]],
        ascii[data[3]],
        ascii[data[4]],
        ascii[data[5]],
        ascii[data[6]],
        ascii[data[7]],
        ascii[data[8]],
        ascii[data[9]],
        ascii[data[10]],
        ascii[data[11]],
        ascii[data[12]],
        ascii[data[13]],
        ascii[data[14]],
        ascii[data[15]]);
}
int d[4];
int main()
{
    memset(ascii,'.',255);
    short i;
    for(i = 32; i < 128; i++)
        ascii[i] = i;
   
       
    data = (unsigned char *)d;
    i = 1;
    start_time();
    while(i++)
    {
        d[0] = rand();
        d[1] = rand();
        d[2] = rand();
        d[3] = rand();
        f();
    }
    elapsed_time("");
    /*
     I ran re-directed to a file
     b > a.txt
     the result was less than
     0.98 seconds for 65535 of them
     on a 1000 mhz AMD processor, win 2k.
     all optimizations tweaked for this app
     including:
     no c++ exceptions
     intel 386 exe format for speed
     inline expansion
     no debug stuff
     maximize speed chosen
     this can be faster still but for
     30 min is all I can afford for today.
     */
    return(3);
}
0
 
LVL 1

Accepted Solution

by:
Pavlik earned 100 total points
ID: 6889537
Here is quite fast one. But you understand that you can always get fastest C implementation, get its optimized assembly listing and optimize it even more.

I'm only wondering if you can make implementation 20% faster than this one.

=======================================================
void f(const char* dump, char* buf)
{


  static const long digits[256] =
  {
    0x3030, 0x3130,0x3230, 0x3330, 0x3430, 0x3530, 0x3630, 0x3730, 0x3830, 0x3930, 0x6130, 0x6230, 0x6330, 0x6430, 0x6530, 0x6630,
    0x3031, 0x3131,0x3231, 0x3331, 0x3431, 0x3531, 0x3631, 0x3731, 0x3831, 0x3931, 0x6131, 0x6231, 0x6331, 0x6431, 0x6531, 0x6631,
    0x3032, 0x3132,0x3232, 0x3332, 0x3432, 0x3532, 0x3632, 0x3732, 0x3832, 0x3932, 0x6132, 0x6232, 0x6332, 0x6432, 0x6532, 0x6632,
    0x3033, 0x3133,0x3233, 0x3333, 0x3433, 0x3533, 0x3633, 0x3733, 0x3833, 0x3933, 0x6133, 0x6233, 0x6333, 0x6433, 0x6533, 0x6633,
    0x3034, 0x3134,0x3234, 0x3334, 0x3434, 0x3534, 0x3634, 0x3734, 0x3834, 0x3934, 0x6134, 0x6234, 0x6334, 0x6434, 0x6534, 0x6634,
    0x3035, 0x3135,0x3235, 0x3335, 0x3435, 0x3535, 0x3635, 0x3735, 0x3835, 0x3935, 0x6135, 0x6235, 0x6335, 0x6435, 0x6535, 0x6635,
    0x3036, 0x3136,0x3236, 0x3336, 0x3436, 0x3536, 0x3636, 0x3736, 0x3836, 0x3936, 0x6136, 0x6236, 0x6336, 0x6436, 0x6536, 0x6636,
    0x3037, 0x3137,0x3237, 0x3337, 0x3437, 0x3537, 0x3637, 0x3737, 0x3837, 0x3937, 0x6137, 0x6237, 0x6337, 0x6437, 0x6537, 0x6637,
    0x3038, 0x3138,0x3238, 0x3338, 0x3438, 0x3538, 0x3638, 0x3738, 0x3838, 0x3938, 0x6138, 0x6238, 0x6338, 0x6438, 0x6538, 0x6638,
    0x3039, 0x3139,0x3239, 0x3339, 0x3439, 0x3539, 0x3639, 0x3739, 0x3839, 0x3939, 0x6139, 0x6239, 0x6339, 0x6439, 0x6539, 0x6639,
    0x3061, 0x3161,0x3261, 0x3361, 0x3461, 0x3561, 0x3661, 0x3761, 0x3861, 0x3961, 0x6161, 0x6261, 0x6361, 0x6461, 0x6561, 0x6661,
    0x3062, 0x3162,0x3262, 0x3362, 0x3462, 0x3562, 0x3662, 0x3762, 0x3862, 0x3962, 0x6162, 0x6262, 0x6362, 0x6462, 0x6562, 0x6662,
    0x3063, 0x3163,0x3263, 0x3363, 0x3463, 0x3563, 0x3663, 0x3763, 0x3863, 0x3963, 0x6163, 0x6263, 0x6363, 0x6463, 0x6563, 0x6663,
    0x3064, 0x3164,0x3264, 0x3364, 0x3464, 0x3564, 0x3664, 0x3764, 0x3864, 0x3964, 0x6164, 0x6264, 0x6364, 0x6464, 0x6564, 0x6664,
    0x3065, 0x3165,0x3265, 0x3365, 0x3465, 0x3565, 0x3665, 0x3765, 0x3865, 0x3965, 0x6165, 0x6265, 0x6365, 0x6465, 0x6565, 0x6665,
    0x3066, 0x3166,0x3266, 0x3366, 0x3466, 0x3566, 0x3666, 0x3766, 0x3866, 0x3966, 0x6166, 0x6266, 0x6366, 0x6466, 0x6566, 0x6666
  };
  static const char chars[256] =
  {
    '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.',
    '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.',
    '\x20', '\x21','\x22','\x23','\x24','\x25','\x26','\x27','\x28','\x29','\x2a','\x2b','\x2c','\x2d','\x2e','\x2f',
    '\x30', '\x31','\x32','\x33','\x34','\x35','\x36','\x37','\x38','\x39','\x3a','\x3b','\x3c','\x3d','\x3e','\x3f',
    '\x40', '\x41','\x42','\x43','\x44','\x45','\x46','\x47','\x48','\x49','\x4a','\x4b','\x4c','\x4d','\x4e','\x4f',
    '\x50', '\x51','\x52','\x53','\x54','\x55','\x56','\x57','\x58','\x59','\x5a','\x5b','\x5c','\x5d','\x5e','\x5f',
    '\x60', '\x61','\x62','\x63','\x64','\x65','\x66','\x67','\x68','\x69','\x6a','\x6b','\x6c','\x6d','\x6e','\x6f',
    '\x70', '\x71','\x72','\x73','\x74','\x75','\x76','\x77','\x78','\x79','\x7a','\x7b','\x7c','\x7d','\x7e','\x7f',
    '\x80', '\x81','\x82','\x83','\x84','\x85','\x86','\x87','\x88','\x89','\x8a','\x8b','\x8c','\x8d','\x8e','\x8f',
    '\x90', '\x91','\x92','\x93','\x94','\x95','\x96','\x97','\x98','\x99','\x9a','\x9b','\x9c','\x9d','\x9e','\x9f',
    '\xa0', '\xa1','\xa2','\xa3','\xa4','\xa5','\xa6','\xa7','\xa8','\xa9','\xaa','\xab','\xac','\xad','\xae','\xaf',
    '\xb0', '\xb1','\xb2','\xb3','\xb4','\xb5','\xb6','\xb7','\xb8','\xb9','\xba','\xbb','\xbc','\xbd','\xbe','\xbf',
    '\xc0', '\xc1','\xc2','\xc3','\xc4','\xc5','\xc6','\xc7','\xc8','\xc9','\xca','\xcb','\xcc','\xcd','\xce','\xcf',
    '\xd0', '\xd1','\xd2','\xd3','\xd4','\xd5','\xd6','\xd7','\xd8','\xd9','\xda','\xdb','\xdc','\xdd','\xde','\xdf',
    '\xe0', '\xe1','\xe2','\xe3','\xe4','\xe5','\xe6','\xe7','\xe8','\xe9','\xea','\xeb','\xec','\xed','\xee','\xef',
    '\xf0', '\xf1','\xf2','\xf3','\xf4','\xf5','\xf6','\xf7','\xf8','\xf9','\xfa','\xfb','\xfc','\xfd','\xfe','\xff'
  };

  long* pBuf = (long*) buf;

  for (int i = 4; i > 0; --i)
  {
    *pBuf = (digits[(*dump)]) | 0x200000L | ((digits[*(dump+1)] & 0xffL) << 24);
    ++pBuf;
    *pBuf = (digits[*(dump+1)] >> 8 ) | (0x2000L) | (digits[*(dump+2)] << 16);
    ++pBuf;
    *pBuf = (digits[*(dump+3)] << 8) | 0x20000020L;
    ++pBuf;
    dump += 4;
  }

  dump -= 16;

  *pBuf =
    chars[*dump] |
    (chars[*(dump+1)] << 8) |
    (chars[*(dump+2)] << 16) |
    (chars[*(dump+3)] << 24);
  ++pBuf;
  dump += 4;
  *pBuf =
    chars[*dump] |
    (chars[*(dump+1)] << 8) |
    (chars[*(dump+2)] << 16) |
    (chars[*(dump+3)] << 24);
  ++pBuf;
  dump += 4;
  *pBuf =
    chars[*dump] |
    (chars[*(dump+1)] << 8) |
    (chars[*(dump+2)] << 16) |
    (chars[*(dump+3)] << 24);
  ++pBuf;
  dump += 4;
  *pBuf =
    chars[*dump] |
    (chars[*(dump+1)] << 8) |
    (chars[*(dump+2)] << 16) |
    (chars[*(dump+3)] << 24);
  ++pBuf;

  *pBuf = 0;
}
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 49

Author Comment

by:DanRollins
ID: 6889565
hi jonnin!
Thanks for your submission.  I posted the core of it into the ASM question and will post the timimg results tomorrow.  Check it out, and please post in that thread rather than this one.  Thanks!

-- Dan
0
 
LVL 49

Author Comment

by:DanRollins
ID: 6890628
Hi Pavlik,
I posted your code and timing of it on the other thread.  You are looking good.

http://www.experts-exchange.com/jsp/qManageQuestion.jsp?ta=assembly&qid=20272901

You may want to go there and subscribe to that Q so you can get a notif of the final outcome.

-- Dan
0
 
LVL 49

Author Comment

by:DanRollins
ID: 6909254
The original Assembly TA thread has been continued here:
   http:Q.20280946.html
and then here:
   http:Q.20283475.html
Join in the discussion, we are having fun!

-- Dan
0
 
LVL 49

Author Comment

by:DanRollins
ID: 6922043
Hi Pavlik,
I added 100 points to this Q.  Pleaase accept them with my complements for your excellent C++ code.  Yours beet everyone until we started messing around with MMX operations.

You might want to check in on the thread.  'able' built a version with a 143MB lookup table.  It's all too, too, tooo... well what the heck.  

-- Dan
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Templates For Beginners Or How To Encourage The Compiler To Work For You Introduction This tutorial is targeted at the reader who is, perhaps, familiar with the basics of C++ but would prefer a little slower introduction to the more ad…
  Included as part of the C++ Standard Template Library (STL) is a collection of generic containers. Each of these containers serves a different purpose and has different pros and cons. It is often difficult to decide which container to use and …
The viewer will learn additional member functions of the vector class. Specifically, the capacity and swap member functions will be introduced.
The viewer will be introduced to the member functions push_back and pop_back of the vector class. The video will teach the difference between the two as well as how to use each one along with its functionality.

867 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now