D4Ly
HW - pipelined datapath mips - worst performance
I need to take the following code and rearrange it to MINIMIZE performance, that is, make it take as many clock cycles as I can on a pipelined datapath with forwarding and stalls on a use following a load. The result of the code must remain the same.
1 lw $2, 100($6)
2 lw $3, 200($7)
3 add $4, $2, $3
4 add $6, $3, $5
5 sub $8, $4, $6
6 lw $7, 300($8)
7 beq $7, $8, Loop
It is unclear to me how to rearrange my lw's in order to minimize performance. Further, the only change I can see that doesn't ruin the end result is swapping lines 3 and 4, but I am not sure how this would affect performance, if at all.
Suggestions?
ASKER
Yes, I was trying to look at the data dependencies, but so many seem to already be in a mandatory order, i.e.:
3 must be after 1 & 2
7 must be last
5 must come after 4 and 3
6 must come after 5
etc...
so lost on this one :\
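One way to check the ordering constraints listed above is to compute them mechanically. Here is a minimal Python sketch, assuming each instruction is hand-encoded as (line number, destination register, source registers). The names PROGRAM and ordering_constraints are made up for illustration. It reports read-after-write (RAW), write-after-read (WAR), and write-after-write (WAW) pairs, since any of the three pins the relative order of two instructions. Keeping the branch last is a control constraint that this register scan does not model.

```python
# Each entry: (line number, destination register or None, source registers).
# Hand-encoded from the seven-instruction sequence in the question.
PROGRAM = [
    (1, "$2", ["$6"]),        # lw  $2, 100($6)
    (2, "$3", ["$7"]),        # lw  $3, 200($7)
    (3, "$4", ["$2", "$3"]),  # add $4, $2, $3
    (4, "$6", ["$3", "$5"]),  # add $6, $3, $5
    (5, "$8", ["$4", "$6"]),  # sub $8, $4, $6
    (6, "$7", ["$8"]),        # lw  $7, 300($8)
    (7, None, ["$7", "$8"]),  # beq $7, $8, Loop
]


def ordering_constraints(program):
    """Return (earlier, later, kind) tuples that a reordering must respect."""
    constraints = []
    for i, (a, a_dst, a_src) in enumerate(program):
        for b, b_dst, b_src in program[i + 1:]:
            if a_dst is not None and a_dst in b_src:
                constraints.append((a, b, "RAW"))  # b reads what a writes
            if b_dst is not None and b_dst in a_src:
                constraints.append((a, b, "WAR"))  # b overwrites what a reads
            if a_dst is not None and a_dst == b_dst:
                constraints.append((a, b, "WAW"))  # both write the same register
    return constraints


for earlier, later, kind in ordering_constraints(PROGRAM):
    print(f"{later} must stay after {earlier} ({kind})")
```

Besides the read-after-write pairs, the scan also flags two write-after-read pairs: line 4 writes $6, which line 1 reads as its base address, and line 6 writes $7, which line 2 reads as its base address.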
Could it be a question written by someone like my EE lab TA, who somehow managed to put several NOTs into each question, until you were never sure what they meant?
"Your goal is to find the resonant frequency that is not passing the signal not at all and therefore not indicating nothing on the null meter."
ASKER
haha wow what a bad question. null meter makes it hysterical!
Here it is, word for word, out of the book:
Rewrite the following code to _minimize_ performance on this datapath - that is, reorder the instructions so that this sequence takes the _most_ clock cycles to execute while still obtaining the same result.
1 lw $2, 100($6)
2 lw $3, 200($7)
3 add $4, $2, $3
4 add $6, $3, $5
5 sub $8, $4, $6
6 lw $7, 300($8)
7 beq $7, $8, Loop
The datapath has forwarding and stalls on a use following a load (so it stalls when an add follows directly after a lw, I assume...)
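To compare candidate orderings under the "stall on a use following a load" rule, a small counter can help. This is only a sketch under stated assumptions: a classic five-stage pipeline with full forwarding, exactly one stall cycle whenever an instruction reads a register loaded by the lw immediately before it, and no extra branch penalty. The names ORIGINAL and load_use_stalls are hypothetical, and the book's datapath may differ, for example in where the branch is resolved.

```python
# Each entry: (opcode, destination register or None, source registers).
ORIGINAL = [
    ("lw", "$2", ["$6"]),          # 1 lw  $2, 100($6)
    ("lw", "$3", ["$7"]),          # 2 lw  $3, 200($7)
    ("add", "$4", ["$2", "$3"]),   # 3 add $4, $2, $3
    ("add", "$6", ["$3", "$5"]),   # 4 add $6, $3, $5
    ("sub", "$8", ["$4", "$6"]),   # 5 sub $8, $4, $6
    ("lw", "$7", ["$8"]),          # 6 lw  $7, 300($8)
    ("beq", None, ["$7", "$8"]),   # 7 beq $7, $8, Loop
]


def load_use_stalls(order):
    """Count one-cycle load-use stalls between adjacent instructions."""
    stalls = 0
    for prev, curr in zip(order, order[1:]):
        prev_op, prev_dst, _ = prev
        _, _, curr_srcs = curr
        # Assumed model: forwarding covers every hazard except a lw result
        # needed by the very next instruction.
        if prev_op == "lw" and prev_dst in curr_srcs:
            stalls += 1
    return stalls


print(load_use_stalls(ORIGINAL))
```

Passing a reordered list to load_use_stalls gives the count for that ordering, so any ordering that still produces the same result can be scored against the original.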
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
Yes! I stared at this for 2 hours before class and came up with exactly that before submitting it. On that model, the data dependencies between lines 1 and 3 and between lines 2 and 4 force stalls, creating two more stall points than the original. Thanks VERY much for confirming my thoughts! I will post back when I receive the graded solutions.
You are welcome.
If the goal was to improve things a bit, you could load $3 first, then you can move the add of $3 and $5 up a notch, and the lw of $7 could go anywhere higher. But all those are improvements, so no good.