Jump to content
Urch Forums

driller for hardware


nonevent99

Recommended Posts

Ok. Time to drill some more.

 

Q1) How many NAND gates are required to implement (x+yz)(x'+y'z')? How about with NOR?

 

Q2) Are your solutions to Q1 glitch-free? If not, then how long is/are the glitches, assuming a propagation delay of 10ns for all gates.

 

Q3) Gimme a SOP and a POS for

x y z f
0 1 1 1
0 0 1 1
1 1 0 1
1 1 1 1

 

Q4) Functionally speaking what's the difference between an SR flip flop and a JK flip flop? How about between a D flip flop and an SR flip flop?

 

Q5) Classify as sequential or combinational: mux, demux, SR flip flop, decoder, magic memory, ripple counter, barrel shifter, half-adder

 

Q6) Represent in 8 bits as 2's complement, 1's complement, bias 2^7, bias 2^7-1, and sign magnitude. Indicate if out of range: 110, -20, -1

 

Q7) Represent in IEEE-754: -145.1875

 

Q8) What's the difference between a synchronous bus and an asynchronous bus (if any), and who cares?

 

Q9) What are page mode and interleaving (in memory hardware), and who cares?

 

Q10) If my disk has a seek time of 3ms, rotates at 20000 RPM, and has a transfer rate of 10MB/s, how long does it take to transfer two sectors (each of 0.5 MB) on different cylinders? Assume that they are requested sequentially.

 

Q11) List the three or four main impediments to aggressive pipelining.

 

Q12) Where in the OSI model do these protocols belong? What are their purposes? udp, smtp, icmp, csma/cd, ethernet, ssl

 

Q13) Under what conditions does MRU give a very poor fault rate? Under what conditions does LRU give a very poor fault rate?

 

Q14) Which of the following, if any, are true? (Assume a fully-associative TLB, a set-associative cache, and a fully-associative virtual memory.)

 

I. If a memory reference generates a TLB miss, then it must also generate a cache miss.

II. If a memory reference generates a cache miss, then it must also generate a TLB miss.

III. If a memory reference generates a cache miss, then it must also generate a page fault.

IV. If a memory reference generates a page miss, then it must also generate a TLB miss.

 

Q15) Suppose 10 1GHz processors each have a base CPI of 1, plus a miss penalty of 5 cycles per cache miss. Each processor is running a piece of software which, every 10 instructions, reads a 4B word from memory and writes it to a shared network card that wastes 20% of bandwidth on overhead. Each of these instructions has a 10% chance of generating a cache miss. Assuming that this is a split cache system with negligible instruction cache miss, and assuming that the network card's bandwidth is the critical system resource, how much bandwidth must be available through the card to satisfy the needs of these processors?

 

 

Link to comment
Share on other sites

Q1) How many NAND gates are required to implement (x+yz)(x'+y'z')? How about with NOR?

 

I believe 8 ??

 

Q2) Are your solutions to Q1 glitch-free? If not, then how long is/are the glitches, assuming a propagation delay of 10ns for all gates.

 

It is not glitch-free. 10ns?

 

Q3) Gimme a SOP and a POS for

x y z f
0 1 1 1
0 0 1 1
1 1 0 1
1 1 1 1

SOP = xy + x'z.

POS = (x'+y)(x+z)

 

 

Link to comment
Share on other sites

Q14) Which of the following, if any, are true? (Assume a fully-associative TLB, a set-associative cache, and a fully-associative virtual memory.)

 

I. If a memory reference generates a TLB miss, then it must also generate a cache miss.

 

True. False. (corrected by randhawa. thanks!)

 

II. If a memory reference generates a cache miss, then it must also generate a TLB miss.

 

False.

 

III. If a memory reference generates a cache miss, then it must also generate a page fault.

 

False.

 

IV. If a memory reference generates a page miss, then it must also generate a TLB miss.

 

True.

 

Link to comment
Share on other sites

Q4) Functionally speaking what's the difference between an SR flip flop and a JK flip flop? How about between a D flip flop and an SR flip flop?

SR action for R = S = 1 is not defined for JK J = K = 1 take the complement

D either sets or resets whereas SR also enables no change of the state.

 

Q5) Classify as sequential or combinational: mux, demux, SR flip flop, decoder, magic memory, ripple counter, barrel shifter, half-adder

mux, demux, decoder, half-adder, barrel shifter == combinational

SR, Ripple counter = sequential

What the hell is magic memory? I guess it is probably sequential.

Link to comment
Share on other sites

Q2) wood - did you find it by subtracting the number of gates in longest path to the shortest path and then multiplying the diff with the propogation delay ??

 

Q6) What is bias 2^7 representation - is it similar to Excess-3... every o/p is added an excess of 2^7=10000000

 

Q11) impediments to aggressive pipelining:-

Unconditional/Conditional Branch

Memory Read/Write stalls

Data hazard (instruction depends on result of previous instruction)

 

Q13) MRU gives a poor fault rate in multiple-Loop mem access (arrays, etc)

LRU gives a poor fault rate in sequencial instructions access

 

Q14) d (a is false since TLB miss does not mean that the page in not in memory)

 

 

Someone on Q7 ??

Link to comment
Share on other sites

Q7) Represent in IEEE-754: -145.1875

 

I assume you mean single format (32 bits). So, you have 1 bit sign, 8 bits exponent, 23 bits fraction.

 

Number: -145.1875

Integer part: 145 is 1001 0001

Fraction part: 0.1875 is 0.0011 (1/8+1/16)

 

So the number is 1001 0001.0011 in binary representation.

 

IEEE-754 defines the floating-point representation as +/- 1*{mantissa} x 2^({exponent}-127). It however requires that the number is normalized.

 

Normalizing it, we get 1.00100010011 * 2^7.

 

Exponent:

 

2^7 = 2^(134-127), so the exponent is 134. In binary, it is 10000110.

 

Mantissa:

 

The mantissa is 00100010011 as normalized above. We need to add zeros at the end to fill out the 23 bit fraction space we have. So the mantissa is

 

0010 0010 0110 0000 0000 000

 

Result:

signed  exponent            mantissa
  1  	10000110    0010 0010 0110 0000 0000 000

See http://www.dgp.toronto.edu/~ajr/270/notes/representation.html

 

Hope that helps.

 

Wood

Link to comment
Share on other sites

Here is an answer key, compiled from all your answers, and my own. (In some cases, your answers were so much better than anything I could come up with that I just plagiarized your work. Good job, guys!)

 

Items that could really use more discussion will be highlighted in orange. I will update this entry until no orange items remain.

 

Q1) How many NAND gates are required to implement (x+yz)(x'+y'z')? How about with NOR?

Can anybody give an algorithm for finding the minimal number of gates? I posted it, but don’t know how to solve it!

 

Q2) Are your solutions to Q1 glitch-free? If not, then how long is/are the glitches, assuming a propagation delay of 10ns for all gates.

 

Q3) Gimme a SOP and a POS for

x y z f
0 1 1 1
0 0 1 1
1 1 0 1
1 1 1 1

 

SOP = xy + x'z.

POS = (x'+y)(x+z)

 

Q4) Functionally speaking what's the difference between an SR flip flop and a JK flip flop? How about between a D flip flop and an SR flip flop?

SR action for R = S = 1 is not defined for JK J = K = 1 take the complement

 

D flip flop protects you from accidentally putting in R=S=1 by taking the complement of your input and feeding it into R. D either sets or resets whereas SR also enables no change of the state.

 

 

Q5) Classify as sequential or combinational: mux, demux, SR flip flop, decoder, magic memory, ripple counter, barrel shifter, half-adder

mux, demux, decoder, barrel shifter, half-adder: combinational

SR, magic memory, ripple counter: sequential

 

“Magic memory” is a hardware implementation of test-and-set memory.

 

Q6) Represent in 8 bits as 2's complement, 1's complement, bias 2^7, bias 2^7-1, and sign magnitude. Indicate if out of range: 110, -20, -1

Any mistakes?

 

110 -> 55R0 -> 27R1 -> 13R1 -> 6R1 -> 3R0 -> 1R1 -> 0R1 => 1101110

2’s complement: 01101110

1’s complement: 01101110

bias-128: 11101110

bias-127: 11101101

sign magnitude: 01101110

 

20 -> 10R0 -> 5R0 -> 2R1 -> 1R0 -> 0R1 => 00010100

2’s complement: 11101100

1’s complement: 11101011

bias 128: 01101100

bias 127: 01101011

sign-mag: 10010100

 

1 => 00000001

2’s complement: 11111111

1’s complement: 11111110

bias 128: 01111111

bias 127: 01111110

 

Q7) Represent in IEEE-754: -145.1875

IEEE has a sign bit (set here), 8 bits for a bias-127 exponent, and 23 bits for a mantissa with implied 1.

 

 

145 -> 72R1 -> 36R0 -> 18R0 -> 9R0 -> 4R1 -> 2R0 -> 1R0 -> 0R1 => 10010001

.1875 = 3/16 => .0011

 

-145.1875 = 10010001.0011 = 1.00100010011 * 2^7

 

exponent = 7, which in bias 127 means we need to represent 127+7 = 134

134 -> 67R0 -> 33R1 -> 16R1 -> 8R0 -> 4R0 -> 2R0 -> 1R0 -> 0R1 => 10000110

 

So our number in IEEE-754 is 1 10000110 001000100110000000000

 

(Thanks for the correction, Wood. My bad.)

 

 

Q8) What's the difference between a synchronous bus and an asynchronous bus (if any), and who cares?

A synchronous bus is a bus where all devices on the bus are controlled by a clock. An asynchronous bus is not controlled by a central clock.

 

Synchronous busses are generally a good thing because they are easy to implement, generally backward compatible with almost all existing devices, and best of all, they support bus pipelining.

 

However, on a synchronous bus, the clock speed cannot be any higher than what the slowest device on the bus can support. That is, all bus elements are held back by the slowest device.

 

Anything to add?

 

Q9) What are page mode and interleaving (in memory hardware), and who cares?

The difference between these escapes me. Does anybody know?

 

Q10) If my disk has a seek time of 3ms, rotates at 20000 RPM, and has a transfer rate of 10MB/s, how long does it take to transfer two sectors (each of 0.5 MB) on different cylinders? Assume that they are requested sequentially.

At 20000 RPM, our latency is 30000/20000 ms = 1.5 ms. So seek plus latency is 4.5ms. We also need .5/10 = .05 seconds = 50ms for the transfer. (Darn slow.) So each sector costs 54.5 ms, and two such transfers cost 109 ms.

 

Q11) List the three or four main impediments to aggressive pipelining.

Expanding opcodes and/or instructions of widely varying lengths

Lots of instructions that set condition codes / flags

Different instructions take very different amounts of time to execute

Lots of memory read/write stalls (branch/control hazards)

Lots of instructions that depend on results of previous instructions (data hazards)

 

Q12) Where in the OSI model do these protocols belong? What are their purposes? udp, smtp, icmp, csma/cd, ethernet, ssl

Going from memory:

udp = transport

smtp = application

icmp = network (part of ip)

csma/cd = physical

ethernet = physical (does csma/cd always imply Ethernet? Does Ethernet always imply csma/cd?)

SSL = session

 

Q13) Under what conditions does MRU give a very poor fault rate? Under what conditions does LRU give a very poor fault rate?

 

MRU is usually the algorithm of choice (vs LRU), since the principle of temporal locality suggests that if you’ve used a piece of data recently, that you’ll probably need it again. MRU performs much better than LRU when you have a construct like

 

array X[10000000000], Y[1000000000000];
for (int j = 0 to X.length) do something; // this effectively fills up the cache

// now we don’t care about X anymore, but MRU refuses to kick X out of cache
for (int k = 0 to 1000)
   for (int m = 0 to Y.length) do something;   

 

Once in a while, MRU will perform better than LRU, pretty much any time you loop over the same array repeatedly, and no preceding code has already filled up the cache…

 

array Z[100000000000];
for (int n = 0 to 1000)
    for (int p = 0 to Z.length) do something;

 

Q14) Which of the following, if any, are true? (Assume a fully-associative TLB, a set-associative cache, and a fully-associative virtual memory.)

 

I. If a memory reference generates a TLB miss, then it must also generate a cache miss.

Not true. If your TLB has 10 slots and your cache has 100 sets, you could access 1 byte from 90 different pages (thereby blowing out your TLB but not your cache), and then come back to the first location that you read. Said location’s page lookup will no longer be in the TLB, but the byte will still be in the cache.

 

II. If a memory reference generates a cache miss, then it must also generate a TLB miss.

Not true. Suppose your pages could contain 1000 cache lines, but your cache has only space for 50 items. If you loop through the entire page, you’ll blow out your cache fairly quickly, but every single request will have a TLB hit.

 

III. If a memory reference generates a cache miss, then it must also generate a page fault.

Not true. Use same example as in II.

 

IV. If a memory reference generates a page miss, then it must also generate a TLB miss.

True. The TLB is a strict subset of the pages in memory. The only way an entry can get into the TLB is if the page was also requested into physical memory.

 

Q15) Suppose 10 1GHz processors each have a base CPI of 1, plus a miss penalty of 5 cycles per cache miss. Each processor is running a piece of software which, every 10 instructions, reads a 4B word from memory and writes it to a shared network card that wastes 20% of bandwidth on overhead. Each of these instructions has a 10% chance of generating a cache miss. Assuming that this is a split cache system with negligible instruction cache miss, and assuming that the network card's bandwidth is the critical system resource, how much bandwidth must be available through the card to satisfy the needs of these processors?

 

Please check.

 

Each processor ordinarily executes one instruction every cycle of 1 ns. But every 10 instructions, there’s a 10% chance of a cache miss, which costs 5 cycles. So in fact, we average one instruction every 1ns + (1/10)*.10*5 = 1.05 ns.

 

Now, every 10 instructions (10.5ns), the processor outputs 4B to the card. That’s 4B every 10.5ns. Since this is a fairly beefy system with 10 such processors, we are actually outputting 40B every 10.5ns, or about 38 B/ns = 38 GB/s.

 

Now, 20% of the card is wasted on overhead, so actually we need to divide by 0.8 as well. So the card actually has to be connected to a network of 38/0.8 GB/s = 47.5 GB/sec.

 

That’s a pretty awesome network.

 

Link to comment
Share on other sites

http://www.lans.ece.utexas.edu/course/ee360n/sol4_su02.pdf

 

In a memory hierarchy that includes a TLB and a cache (typical translation flow diagram shown on pg. 242 of notes), a memory reference can encounter three different types of misses: a cache miss, a TLB miss and a page miss (i.e. page fault). Consider all the combinations of these three events with one or more occurring (seven possibilities). For each possibility, state whether this event can actually occur, and under what circumstances.

 

Answer: 1. Only TLB miss ¡V possible when the entry of the virtual address is not in TLB. 2. Only cache miss ¡V possible when TLB hit and the page is not in cache but in main memory. 3. Only page miss ¡V impossible because a page miss can only happen after a cache miss. 4. TLB and cache miss ¡V possible when the entry is not in TLB and the data with the address is not in cache but in main memory. 5. TLB and page miss ¡V impossible due to the same reason in 3. 6. Cache miss and page miss ¡V impossible since a hit in TLB indicates the data must be in main memory. 7. All miss ¡V possible when the data is not in main memory but on secondary disk.

 

Link to comment
Share on other sites

Originally posted by wood

 

Nonevent99, why did you use 1/10 and then .10 on (1/10)*.10*5 = 1.05 ns ??

 

Because one tenth of the instructions are the instructions that interest us (the ones that read data from memory and write to network). And there's only a 10% chance of a miss when said instruction happens. So there's two factors of .1 here.

Link to comment
Share on other sites

Originally posted by wood

 

Q1) How many NAND gates are required to implement (x+yz)(x'+y'z')? How about with NOR?

 

I believe 8 ??

 

Q2) Are your solutions to Q1 glitch-free? If not, then how long is/are the glitches, assuming a propagation delay of 10ns for all gates.

 

It is not glitch-free. 10ns?

 

 

 

Wood, I'm getting 9, though I haven't a clue how to prove that it's minimal. Here's my formula:

 

x+yz = ((x+yz)')' = (x' * (yz)')' = x' nand (yz)' = x' nand (y nand z)) =a

x'+y'z' = ((x'+y'z')')' = (x*(y'z')')' = x nand (y'z')' = x nand (y' nand z') =b

 

Then I nand a and b, and invert w/ another nand. I also use nands to invert x, y, and z before feeding them into the function.

 

Q2) My inverters and the black nand change state at t=10 ns, the green nand changes state at t=20, the red nand changes state at t=20, whereas the orange one changes state at t=10 and then again at t=30. Thus, the blue nand changes state at t=20, then again at 30, and again at 40. The final yellow nand changes state at t=30, again at t=40, and again at t=50. The duration of the glitch is from t=30 to t=50 ie 20 ns.

 

 

Link to comment
Share on other sites

I've seen problems like this in many other books but they seem to take the inversions for granted i.e a and a' are supplied. I know you can get a converter with NAND or NOR but the problem is what about if a question like this appears in the test and there are two choices, one for ONLY NAND/NOR and the other with NAND/NOR and NOT, and that magic word ONLY does not appear what do you guys think we should shoot for? I think ETS is not only testing prob. solving but also reading :D (as you guys might have seen I usually skip reading parts of a problem, I guess I have to be more concentrated on the reading AND solving, give 'em what they want :D)
Link to comment
Share on other sites

Originally posted by AlbaLed

 

I've seen problems like this in many other books but they seem to take the inversions for granted i.e a and a' are supplied. I know you can get a converter with NAND or NOR but the problem is what about if a question like this appears in the test and there are two choices, one for ONLY NAND/NOR and the other with NAND/NOR and NOT, and that magic word ONLY does not appear what do you guys think we should shoot for? I think ETS is not only testing prob. solving but also reading :D (as you guys might have seen I usually skip reading parts of a problem, I guess I have to be more concentrated on the reading AND solving, give 'em what they want :D)

 

In real life, NOT operators are apparently free. You just wire up a different type of transistor backward (in CMOS), I think. So they don't impose much additional timing delay. In real life, you definitely wouldn't waste an entire NAND gate just inverting an input.

 

But as for the GRE, the instructions at the beginning of every test clearly indicate that NOT is a separate gate. Moreover, on question 51, that gate is actually used, and it has a timing delay, which is similar to the problem under discussion above. (Number 56 omits the gate and instead uses inverter bubbles, but the point of that problem is different.)

 

So I would have to guess that we should assume that "only NAND" means "only NAND," ie "not NOT". Such is the Way of ETS.

 

Link to comment
Share on other sites

Alba, I guess they will be clear on that part if a question comes out. On question 56, they spent the time to put into parentesis that the bubble is a logical NOT.

 

But, if they don't [}:)]... since ETS is testing more our theoretical knowledge, I would definitely use NAND to get a NOT if the question is "only NAND". It doesn't matter what's the technology today that makes a NOT almost free.

 

Wood

Link to comment
Share on other sites

Guys, I took the time today to try doing Q1 by first representing the function in SOP form (rather than POS) since it's generally more amenable to a two-level NAND implementation using gates with an arbitrary number of inputs. I found that I needed 10 gates!!!!!

 

Do you know of any general rule whereby we can ascertain whether the POS or the SOP will involve fewer NAND or NOR gates???????

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...