|
1
|
|
|
2
|
- System:
- Collection of “Processes” linked by Channels
- Channels pass messages with guaranteed delivery
- Processes synchronize
- Processes can be decomposed into smaller processes
|
|
3
|
- In case of edge triggered stages
- During the cycle: Process
- At the edge of the clock: Pass to successor
|
|
4
|
|
|
5
|
|
|
6
|
- Distributed Synchronization
- Sender
- Receiver
|
|
7
|
- Channel: A bundle of wires and a protocol for communicating
a unit of data/control, called a token
- Data/control encoding: Dual-rail or single-rail
- Communication protocol: Specific form of handshaking over request and
acknowledgement wires
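As a minimal sketch of the two ideas above, the following Python models one token transfer on a four-phase (return-to-zero) channel and a dual-rail encoding of a single bit. The function names, the wire-naming scheme, and the choice of the all-zero state as the "no data" spacer are illustrative assumptions, not taken from the slides.

```python
def four_phase(channel):
    """Ordered wire events for one token on a four-phase handshake channel."""
    return [
        f"{channel}req+",  # sender raises request (token offered)
        f"{channel}ack+",  # receiver raises acknowledge (token consumed)
        f"{channel}req-",  # sender resets request
        f"{channel}ack-",  # receiver resets acknowledge (channel idle again)
    ]


def dual_rail_encode(bit):
    """Encode one bit on two wires (d0, d1); exactly one rail goes high."""
    return (1, 0) if bit == 0 else (0, 1)


def dual_rail_decode(d0, d1):
    """Decode a dual-rail pair; (0, 0) is assumed to mean 'no token present'."""
    if (d0, d1) == (0, 0):
        return None
    if (d0, d1) == (1, 0):
        return 0
    if (d0, d1) == (0, 1):
        return 1
    raise ValueError("both rails high is not a legal dual-rail code")
```

With dual-rail encoding the data wires themselves signal validity, whereas a single-rail channel would rely on a separate request wire alongside the data.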
|
|
8
|
|
|
9
|
|
|
10
|
|
|
11
|
|
|
12
|
|
|
13
|
|
|
14
|
- Enclosed Handshaking
- B completes handshake w/ C before starting handshake w/ D
- Operation associated with C occurs before operation associated with D
- B can enclose both handshakes in handshake w/ A
- Completion of handshake w/ A is the ack that C’s and D’s tasks are done
- Pipelining Handshake
- B overlaps handshake w/ C and handshake w/ A
- Creates pipeline behavior
- Increases throughput
|
|
15
|
- Internal handshake on R represents the completion of some operation
- The enclosed handshake represents a “function call”
- Initiated by the request on L
- Terminated by the acknowledgement on L
|
|
16
|
- Handshakes on S1 first then S2
- Used to sequence operations associated with S1 and S2
- Both handshakes enclosed in handshake on R
- Initiated by request on R
- Terminated by acknowledgement on R
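The enclosure described above can be sketched as an event trace. This is a simplified model that assumes four-phase handshakes and assumes each enclosed handshake fully returns to zero before the next event; real implementations may overlap the reset phases differently. The helper names are hypothetical.

```python
def handshake(channel, body=()):
    """Four-phase handshake on `channel` that encloses the events in `body`."""
    return [
        f"{channel}.req+",
        *body,               # enclosed activity happens between req+ and ack+
        f"{channel}.ack+",
        f"{channel}.req-",
        f"{channel}.ack-",
    ]


def sequencer_trace():
    """Handshake on S1, then on S2, both enclosed in the handshake on R."""
    s1 = handshake("S1")
    s2 = handshake("S2")
    return handshake("R", body=s1 + s2)


trace = sequencer_trace()
```

In this trace the acknowledgement on R only appears after both S1 and S2 have completed, which is what lets the environment treat the handshake on R as a completed "call".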
|
|
17
|
|
|
18
|
- Purpose
- Pulls data from its input channel and pushes it onto its output channel
- All enclosed in handshake on request channel R
- Operation
- Waits for a request on its passive nonput port
- Then initiates a handshake on its pull input port
- The handshake on the pull input channel is relayed to the push output
channel
- Finally, it completes the handshaking on the passive nonput channel
|
|
19
|
- Pipeline handshaking enables multiple tokens to exist in pipeline
- Each token represents intermediate result of different problem instance
- Increases throughput of system
- No tokens lost despite relative speed of stages – has implicit flow control
- Two types
- Full buffers can hold distinct tokens on their input and output channels at the same time
- Half buffers cannot hold distinct tokens on their input and output channels at the same time
- N-stage pipeline of half-buffers can support a maximum of N/2 tokens
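The capacity difference can be illustrated with a small occupancy model. This is an abstraction under stated assumptions, not a circuit-level simulation: each stage is a slot holding at most one token, the receiver at the end is stalled, and a half-buffer is modeled as a stage that cannot accept a token while its successor still holds one. The function name is hypothetical.

```python
def fill_pipeline(n_stages, half_buffer):
    """Push tokens into a stalled pipeline until no more fit; return occupancy."""
    slots = [False] * n_stages  # slots[i] is True when stage i holds a token

    def can_enter(i):
        if slots[i]:
            return False
        # Half-buffer assumption: stage i cannot take a new token while
        # stage i + 1 still holds a distinct token.
        if half_buffer and i + 1 < n_stages and slots[i + 1]:
            return False
        return True

    progress = True
    while progress:
        progress = False
        # Advance existing tokens, rightmost first. The last stage never
        # drains because the receiver is assumed to be stalled.
        for i in range(n_stages - 2, -1, -1):
            if slots[i] and can_enter(i + 1):
                slots[i], slots[i + 1] = False, True
                progress = True
        # The sender injects a new token whenever the first stage accepts it.
        if can_enter(0):
            slots[0] = True
            progress = True
    return slots


half = fill_pipeline(6, half_buffer=True)
full = fill_pipeline(6, half_buffer=False)
```

For an even number of stages, this model settles with every other stage occupied in the half-buffer case and every stage occupied in the full-buffer case.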
|
|
20
|
|
|
21
|
- Handshaking constraint that leads to a half-buffer
- The output channel must be acknowledged (e.g., c2ack+),
- indicating that the output token (e.g., on channel c2) has been
consumed and has therefore moved to the subsequent channel (e.g., c3),
- before the reset phase of the input channel can complete (e.g., before
c1ack-),
- which in turn must happen before a new token can be generated on the input channel (e.g., c1).
|
|
22
|
- MERGE
- Wait for token on S.
- Depending on its value,
wait for a token on either A or B and send it onto O
- SPLIT
- Wait for tokens on S and A.
- Depending on the value of S,
send a copy of the token on A to O1 or O2
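A behavioral sketch of the two elements, with channels modeled as FIFO queues. The mapping of select value 0 to A (for MERGE) and to O1 (for SPLIT) is an assumption made for illustration, as are the function names; each call attempts one firing and returns whether it happened.

```python
from collections import deque


def merge(S, A, B, O):
    """Fire MERGE once: route a token from A or B to O, as chosen by S."""
    if not S:
        return False                  # no select token yet
    source = A if S[0] == 0 else B    # assumption: 0 selects A, 1 selects B
    if not source:
        return False                  # selected input is empty, so stall
    S.popleft()
    O.append(source.popleft())
    return True


def split(S, A, O1, O2):
    """Fire SPLIT once: route the token on A to O1 or O2, as chosen by S."""
    if not S or not A:
        return False                  # needs a token on both S and A
    select = S.popleft()
    target = O1 if select == 0 else O2  # assumption: 0 selects O1
    target.append(A.popleft())
    return True
```

Note that MERGE leaves the select token in place when the chosen input is empty, so a token waiting on the other input is stalled rather than lost.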
|
|
23
|
- Assumptions (in this example)
- full-buffer two-phase handshaking
- dual-rail select signal
- Functionality
- Token on A consumed first
- After token on S = 0
- I.e., S0 changes
- Token on B stalled until consumed second
- After token on S = 1
- I.e., Once S1 changes
- Result: two tokens on O
- First = Oreq+
- Second = Oreq-
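The result above follows from two-phase signaling, where every transition on the request wire, rising or falling, carries one token. A minimal sketch, assuming the wire starts low and using a hypothetical helper name:

```python
def two_phase_requests(n_tokens, wire="Oreq"):
    """Request-wire transitions for n tokens under two-phase signaling.

    The wire is assumed to start low, so tokens alternate between a
    rising (+) and a falling (-) transition.
    """
    return [f"{wire}{'+' if i % 2 == 0 else '-'}" for i in range(n_tokens)]


events = two_phase_requests(2)
```

Under a four-phase protocol the same two tokens would instead need two full rise-and-fall cycles on the request wire.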
|
|
24
|
|
|
25
|
|
|
26
|
|
|
27
|
- Purpose
- Used to control access to shared resource
- Approach
- Acknowledge handshake on request port that arrives first, granting
access
- Requires four-phase protocol
- Winner maintains mutually exclusive access to the resource until it resets
its request
- Caveat
- May take an exponential amount of time to determine who came first
when requests arrive very close together
- Sometimes called slackless arbiter
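The grant-and-hold behavior can be sketched with a small behavioral model. It deliberately ignores the resolution-time caveat above (requests are simply served in call order), and the class and method names are hypothetical.

```python
class Arbiter:
    """Behavioral mutual-exclusion arbiter; resolution delay is not modeled."""

    def __init__(self):
        self.owner = None   # port whose request is currently acknowledged
        self.pending = []   # ports with a raised but unacknowledged request

    def request(self, port):
        """Raise the request on `port`; return True if acknowledged at once."""
        if self.owner is None:
            self.owner = port
            return True
        self.pending.append(port)
        return False

    def release(self, port):
        """The winner resets its request; grant the next pending port, if any."""
        if port != self.owner:
            raise ValueError("only the current owner can release the resource")
        self.owner = self.pending.pop(0) if self.pending else None
        return self.owner


arb = Arbiter()
first = arb.request("A")    # A arrives first and is granted access
second = arb.request("B")   # B must wait until A resets its request
next_owner = arb.release("A")
```

The losing request stays raised and unacknowledged until the winner releases, which corresponds to the four-phase requirement on the slide.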
|
|
28
|
- Slackless Tree Arbiter cell
- Used to build multi-way arbiters
- Approach
- Add T channel to normal arbiter
- Delay ack of request channels until T channel acknowledged
- Can send request on T channel as soon as any request arrives
|
|
29
|
- Purpose
- Used to control access to shared resource in a pipelined design
- Approach
- Acknowledgement of a request is not used to signify the winner
- Instead, additional W channel used to identify who won
- Handshake on W simultaneously with acknowledging winning request
- Note
- Still may take exponential time
- In principle, this can use a two or four-phase protocol
|
|
30
|
- Tree Arbiter cell
- Used to build multi-way pipelined arbiters
- Approach
- Add synchronization channel O to 2-way (pipelined) arbiter
- Send request on O channel as soon as any request arrives
- Question
- How can we use this cell as the basis of a 4-way pipelined arbiter?
|
|
31
|
- The Problem Scenario
- All 4 requests arrive at same time
- Output generates 1-bit output
- Which of the 4 requests does this 1-bit output identify?
- Need notion of addresses to distinguish between 4 requests
|
|
32
|
- Features
- Provides any-sender-to-any-receiver communication
- Concurrent communication enabled
- S0 -> R0 && S1 -> R1
- S0 -> R1 && S1 -> R0
- No packets lost
- Implicit flow control
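These features can be illustrated with a behavioral sketch of a 2x2 crossbar. It models only the routing outcome, not the split/merge structure, and it resolves a conflict by giving priority to the lower-indexed sender, which is an assumption standing in for whatever arbitration the real design uses. The function name and packet format `(destination, payload)` are hypothetical.

```python
from collections import deque


def crossbar_round(senders, receivers):
    """Deliver at most one packet per receiver this round; return the count."""
    busy = set()      # receivers that already accepted a packet this round
    delivered = 0
    for queue in senders:   # assumption: lower-indexed sender wins a conflict
        if not queue:
            continue
        dest, payload = queue[0]
        if dest in busy:
            continue        # stall: the packet stays queued, nothing is lost
        queue.popleft()
        receivers[dest].append(payload)
        busy.add(dest)
        delivered += 1
    return delivered


# Two senders addressing different receivers proceed concurrently.
senders = [deque([(0, "a")]), deque([(1, "b")])]
receivers = [[], []]
concurrent = crossbar_round(senders, receivers)
```

When both senders address the same receiver, one packet is delivered per round and the other waits, which is the flow-control behavior listed above.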
|
|
33
|
- Design notes
- Addresses are 1-bit wide
- Special Splits
- Splits 1-bit input into one of two synchronization tokens
- Not slack elastic
- Adding pipeline buffers can cause design to deadlock
- Occurs when control path has more slack than data channel
|
|
34
|
- Design notes
- Each half of the pipeline is controlled by a repeater (denoted ‘*’)
- Repeatedly handshakes on its active nonput channel with a sequencer
- The sequencer (denoted ‘SEQ’) is responsible for
- first transferring the input to the corresponding variable
- and then transferring it on to the next stage.
- The join element (denoted ‘·’) is responsible for synchronizing the transfer between the
two variables.
|
|
35
|
- Control-driven 2-place FIFO
- Transfer of data is control-driven and transfer elements have active
(pull) inputs.
- Data stored in designated variable elements “x” and “y”
- BUF-based linear pipeline
- Data stored in channels
- Can store two tokens (on C1 and C2) assuming BUF is a full-buffer
- Data-driven: consists of pipeline buffers that have passive (push)
inputs.
|