wood Posted November 5, 2003 Share Posted November 5, 2003 Hey... some of them are damn easy... but that's what I'm afraid of... get all the difcicult ones and make hasty decision on easy ones. So, let's go. Q1) What are the limits on how much a processor's performance can be improved using pipelining?? Q2) Given an unpipelined processor with a 10ns cycle time and pipeline latches with 0.5ns latency: (a) what are the cycle times of pipelined versions of the processor with 2, 4, 8, and 16 stages (b) how many stages of pipelining are required to achieve a cycle time of 2ns? 1ns? © what is the minimum cycle time achievable with a 4-stage pipeline if additional logic is assigned to the final stage to balance the additional latency of the pipeline latches in the other stages? Q3) Consider this instruction sequence: DIV r2, r5, r8 SUB r9, r2, r7 ASH r5, r14, r6 MUL r11, r9, r5 BEQ r10, #0, r12 OR r8, r15, r2(a) Identy all of the RAW hazards (b) Identify all of the WAR hazards © Identify all of the WAW hazards © Identify all of the control hazards Q4) Suppose the branch frequencies (as percentages of all instructions) are as follows: 15% - Conditional branches 1% - Jumps and calls 60% - Taken condiitional branches We are examining a four-deep pipeline where the branch is resolved at the end of the second cycle for unconditional branches and at the end of the third cycle for conditional branches. Assuming that only the first pipe stage can always be done independent of whether the branch goes and ignoring other pipeline stalls, how much faster would the machine be without any branch hazards? Q5) Here is an unusual loop. First, list the dependences and then rewrite the loop so that it is parallel. for( i=1; i a[i] = b[i] + c[i]; /* S1 */ b[i] = a[i] + d[i]; /* S2 */ a[i+1] = a[i] + e[i]; /* S3 */ } Quote Link to comment Share on other sites More sharing options...
AlbaLed Posted November 6, 2003 Share Posted November 6, 2003 Q1) What are the limits on how much a processor's performance can be improved using pipelining?? Hey, if latches were free (i.e. no latency at all) we could have a processor with googol stage pipeline, but that is not the case on earth so here is the equation Cycle time = longest stage time + latch time Hazards, branches are other barriers, not physical though, logical. Q2) Given an unpipelined processor with a 10ns cycle time and pipeline latches with 0.5ns latency: (a) what are the cycle times of pipelined versions of the processor with 2, 4, 8, and 16 stages 2 stages Cycle time = (unpiped cycl. tyme)/stage # + latch time = 10/2 + 0.5 = 5.5ns Similarly 4 stage pipe. cycle time = 3ns 8 stage pipe. cycle time = 1.75ns 16 stage pipe. cycle time = 1.125ns (b) how many stages of pipelining are required to achieve a cycle time of 2ns? 1ns? cycle time = 2ns Cycle time = (unpiped cycl. tyme)/stage # + latch time 2 = 10/stages # + 0.5 stages = 10/1.5 = 7 stages (I think we should round up) 7 stages -> 1.9 ns cycle time © what is the minimum cycle time achievable with a 4-stage pipeline if additional logic is assigned to the final stage to balance the additional latency of the pipeline latches in the other stages? minumum cycle time = (total time thru pipeline)/(stage number) min cycl. = [10 + (latches * latch latency)]/4 = 1 4 stages = 3 latches ( stage(-), latch(|) -|-|-|-) min cycl. = 11.5/4 = 2.87 ns Not really sure tho!!! Q3) Consider this instruction sequence: DIV r2, r5, r8 SUB r9, r2, r7 ASH r5, r14, r6 MUL r11, r9, r5 BEQ r10, #0, r12 OR r8, r15, r2 (a) Identy all of the RAW hazards (b) Identify all of the WAR hazards © Identify all of the WAW hazards © Identify all of the control hazards assumming operation result, operand1, operand2 DIV r2, r5, r8 SUB r9, r2, r7 RAW hazard ASH r5, r14, r6 MUL r11, r9, r5 RAW hazard SUB r9, r2, r7 ASH r5, r14, r6 MUL r11, r9, r5 RAW hazard (depending on how deep is the pipeline) DIV r2, r5, r8 SUB r9, r2, r7 ASH r5, r14, r6 WAR hazard Don't think there is any WAW hazards BEQ r10, #0, r12 OR r8, r15, r2 Control hazard (if BEQ = branch on equal) Quote Link to comment Share on other sites More sharing options...
wood Posted November 6, 2003 Author Share Posted November 6, 2003 Good job AlbaLed. All of them are right !! Yes, you should round up. On Q2) b), you'll get 20 stages for 1ns. I just posted the missing instruction sequence! :) Wood Quote Link to comment Share on other sites More sharing options...
nonevent99 Posted November 6, 2003 Share Posted November 6, 2003 Wow, you guys know a lot. What's a latch frequency, and how does it figure in? Quote Link to comment Share on other sites More sharing options...
AlbaLed Posted November 6, 2003 Share Posted November 6, 2003 latch frequency ?? where did you see that ??? Latch latency, maybe ?? Between each stage in the pipeline there is latches (small buffers) that hold the output of stage n-1 for a fraction of time untill stage n reads it in. The reading and writing of these latches consumes time pure overhead. (no free lunch !!!! eeeeveeeeeerrrrr!!!!!!!! ) Quote Link to comment Share on other sites More sharing options...
wood Posted November 6, 2003 Author Share Posted November 6, 2003 I believe you meant latch latency, right? Between each stage of the pipeline, there are latches, which generate latency. These latencies are usually small, but can become significant as the number of stages grows. Quote Link to comment Share on other sites More sharing options...
wood Posted November 6, 2003 Author Share Posted November 6, 2003 Originally posted by AlbaLed Q3) Consider this instruction sequence: DIV r2, r5, r8 SUB r9, r2, r7 ASH r5, r14, r6 MUL r11, r9, r5 BEQ r10, #0, r12 OR r8, r15, r2 (a) Identy all of the RAW hazards (b) Identify all of the WAR hazards © Identify all of the WAW hazards © Identify all of the control hazards assumming operation result, operand1, operand2 DIV r2, r5, r8 SUB r9, r2, r7 RAW hazard ASH r5, r14, r6 MUL r11, r9, r5 RAW hazard SUB r9, r2, r7 ASH r5, r14, r6 MUL r11, r9, r5 RAW hazard (depending on how deep is the pipeline) DIV r2, r5, r8 SUB r9, r2, r7 ASH r5, r14, r6 WAR hazard Don't think there is any WAW hazards BEQ r10, #0, r12 OR r8, r15, r2 Control hazard (if BEQ = branch on equal) Right !! You missed these though: DIV r2, r5, r8 OR r8, r15, r2 RAW ! You don't know how deep the pipeline is... DIV r2, r5, r8 OR r8, r15, r2 WAR ! Quote Link to comment Share on other sites More sharing options...
AlbaLed Posted November 6, 2003 Share Posted November 6, 2003 Very much correct wood, thank you !!!!!!!!!! Quote Link to comment Share on other sites More sharing options...
nonevent99 Posted November 6, 2003 Share Posted November 6, 2003 Oh, those things are called "latches". Ok. Now your guys earlier posts make a lot more sense. Originally posted by AlbaLed latch frequency ?? where did you see that ??? Latch latency, maybe ?? Between each stage in the pipeline there is latches (small buffers) that hold the output of stage n-1 for a fraction of time untill stage n reads it in. The reading and writing of these latches consumes time pure overhead. (no free lunch !!!! eeeeveeeeeerrrrr!!!!!!!! ) Quote Link to comment Share on other sites More sharing options...
ic10503 Posted April 7, 2010 Share Posted April 7, 2010 Can anyone post the answers to problems 4 and 5 please? Quote Link to comment Share on other sites More sharing options...
eng muna Posted May 11, 2011 Share Posted May 11, 2011 file:///C:/Users/user/AppData/Local/Temp/moz-screenshot-5.png[ATTACH]6065[/ATTACH] Good Luck:grad: Quote Link to comment Share on other sites More sharing options...
eng muna Posted May 11, 2011 Share Posted May 11, 2011 solution for Q5 the solution in the picture and i upload the file that contain the solution you can find it file:///C:/Users/user/AppData/Local/Temp/moz-screenshot-6.pngfile:///C:/Users/user/AppData/Local/Temp/moz-screenshot-7.pngfile:///C:/Users/user/AppData/Local/Temp/moz-screenshot-8.pngfile:///C:/Users/user/AppData/Local/Temp/moz-screenshot-9.pngfile:///C:/Users/user/AppData/Local/Temp/moz-screenshot-10.pngfile:///C:/Users/user/AppData/Local/Temp/moz-screenshot-11.pngfile:///C:/Users/user/AppData/Local/Temp/moz-screenshot-12.png[ATTACH=CONFIG]6066[/ATTACH] good luck Quote Link to comment Share on other sites More sharing options...
eng muna Posted May 11, 2011 Share Posted May 11, 2011 please I need a solution for this question Determine the improvement from branch folding for unconditional branches. Assume a 90% hit rate, a base CPI without unconditional branch stalls of 1, and an unconditional branch frequency of 5%. How much improvement is gained by this enhancement versus a processor whose effective CPI is 1.1? Quote Link to comment Share on other sites More sharing options...
eng muna Posted May 12, 2011 Share Posted May 12, 2011 please I need a solution for this problem Determine the improvement from branch folding for unconditional branches. Assume a 90% hit rate, a base CPI without unconditional branch stalls of 1, and an unconditional branch frequency of 5%. How much improvement is gained by this enhancement versus a processor whose effective CPI is 1.1? Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.