TileLink is a protocol designed to be a substrate for cache coherence transactions implementing a particular cache coherence policy within an on-chip memory hierarchy. Its purpose is to orthogonalize the design of the on-chip network and the implementation of the cache controllers from the design of the coherence protocol itself. Any cache coherence protocol that conforms to TileLink’s transaction structure can be used interchangeably with the physical networks and cache controllers we provide.
TileLink is roughly analogous to the data link layer in the IP network protocol stack, but exposes some details of the physical link necessary for efficient controller implementation. It also codifies some transaction types that are common to all protocols, particularly the transactions servicing memory accesses made by agents that do not themselves have caches with coherence policy metadata.
TileLink assumes a Client/Manager architecture in which each agent participating in the coherence protocol acts either as a client, which initiates transactions to gain access to or permissions on cache blocks, or as a manager, which responds to those transactions and tracks permissions on the blocks it manages.
A client may be a cache, a DMA engine, or any other component that would like to participate in the coherent memory domain, regardless of whether or not it actually keeps a copy of the data locally. A manager may be an outer-level cache controller, a directory, or a broadcast medium such as a bus. In a multi-level memory hierarchy, a particular cache controller can function as both a client (wrt. caches further out in the hierarchy) and a manager (wrt. caches closer to the processors).
TileLink defines five independent transaction channels: Acquire, Probe, Release, Grant, and Finish. These channels may be multiplexed over the same physical link, but to avoid deadlock TileLink specifies a priority amongst the channels that must be strictly enforced. Channels may contain both metadata and data components.
At present, all channels are routed either from clients to managers or from managers to clients. (In the future, client-to-client Grants may be added.)
The prioritization of the channels is Finish >> Grant >> Release >> Probe >> Acquire. To avoid deadlock, messages of lower priority must never block messages of higher priority from being sent or received.
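As a rough illustration of this strict priority rule, the sketch below shows one way a shared physical link might gate lower-priority channels. The module name, port names, and the reduction of each channel to a single pending/allowed wire are assumptions for illustration only, not the arbitration logic of any actual TileLink implementation.

```scala
import chisel3._

// Hypothetical sketch: five channels multiplexed over one link, where a
// channel may send only if no strictly higher-priority channel is waiting.
// Index 0 = Finish (highest priority) ... index 4 = Acquire (lowest).
class ExampleChannelPriority extends Module {
  val io = IO(new Bundle {
    val wantToSend = Input(Vec(5, Bool()))   // channel has a message pending
    val maySend    = Output(Vec(5, Bool()))  // channel is allowed to use the link
  })
  // Prefix-OR: higherWants(i) is true if any channel 0..i-1 wants to send.
  val higherWants = io.wantToSend.scanLeft(false.B)(_ || _)
  for (i <- 0 until 5) {
    io.maySend(i) := io.wantToSend(i) && !higherWants(i)
  }
}
```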
There are two types of transaction that can occur on a cache block managed by TileLink. The first type enables a client to acquire a copy of, or permissions on, a cache block: the client sends an Acquire, the manager sends Probes to any other clients that may need to release the block or downgrade their permissions, those clients respond with Releases, the manager then sends a Grant of data or permissions to the original client, and that client completes the transaction with a Finish.
The second type of transaction supports clients voluntarily releasing a cache block: the client sends a voluntary Release, writing back any dirty data and ceding its permissions on the block, and the manager acknowledges it with a Grant.
TileLink does not make any assumptions about the ordering of messages sent point-to-point over particular channels. Therefore, concurrency must be managed by agents at several points in the system.
Transactions can be merged in certain situations. One specific situation that must be handled by all manager agents is receiving a voluntary Release for a block which another client is currently attempting to Acquire. The manager must accept the voluntary Release as well as any Releases resulting from Probe messages, and provide Grant messages to both clients before the transaction can be considered complete.
When running on networks that provide guaranteed ordering of messages between any client/manager pair, the Finish acknowledgment of a Grant (and the Grant acknowledgement of a voluntary Release) can be omitted.
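The sketch below illustrates how a client might specialize on this property at elaboration time; the module, parameter, and signal names are hypothetical stand-ins, assuming only that a Boolean such as TLNetworkIsOrderedP2P (see the parameters section) reaches the client logic.

```scala
import chisel3._

// Hypothetical sketch: a client only raises a Finish acknowledgement when the
// underlying network does not guarantee point-to-point ordering.
class ExampleFinisher(networkIsOrderedP2P: Boolean) extends Module {
  val io = IO(new Bundle {
    val grantReceived = Input(Bool())   // a Grant was accepted this cycle
    val finishValid   = Output(Bool())  // request to send a Finish message
  })
  io.finishValid := (if (networkIsOrderedP2P) false.B else io.grantReceived)
}
```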
This section details the specific signals contained in each channel of the TileLink protocol.
Every channel is wrapped in the Chisel.DecoupledIO interface, meaning that each contains ready and valid signals in addition to the per-channel fields listed below. Channels with data may send the data over multiple beats; the width of the underlying network is exposed to improve the efficiency of refilling data into caches whose data array rows are of a matching size.
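As a concrete, purely illustrative sketch of that wrapping, the Chisel code below shows a channel payload carried inside a Decoupled interface, with a block's data split across TLDataBeats beats of TLDataBits each. All widths, names, and the ExampleParams object are assumptions, not the actual declarations.

```scala
import chisel3._
import chisel3.util._

// Illustrative parameter values; the real ones come from the TileLink
// parameters described at the end of this section.
object ExampleParams {
  val tlBlockAddrBits  = 26   // width of a block address
  val tlClientXactBits = 4    // width of a client transaction id
  val tlDataBits       = 128  // bits of block data sent per beat
  val tlDataBeats      = 4    // beats per cache block
  val tlBeatAddrBits   = log2Ceil(tlDataBeats)
}
import ExampleParams._

// One beat's worth of a data-carrying channel.
class ExampleBeatChannel extends Bundle {
  val addr_block = UInt(tlBlockAddrBits.W)  // block address, offset removed
  val addr_beat  = UInt(tlBeatAddrBits.W)   // which beat of the block this is
  val data       = UInt(tlDataBits.W)       // one beat's worth of data
}

// Decoupled adds the ready/valid handshake around the payload fields.
class ExampleChannelIO extends Bundle {
  val chan = Decoupled(new ExampleBeatChannel)
}
```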
Initiates a transaction to acquire access to a cache block with proper permissions. Also used to write data without caching it (acquiring permissions for the write as it does so).
Name | Type | Description |
addr_block | UInt | Physical address of the cache block, with block offset removed |
client_xact_id | UInt | Client’s id for the transaction |
data | UInt | Client-sent data, used for Put transactions |
addr_beat | UInt | Offset of this beat’s worth of data within the cache block |
built_in_type | Bool | Whether the transaction is a built-in or custom type |
a_type | UInt | Type of the transaction. For built-in transactions, one of: [UncachedRead, UncachedWrite, UncachedAtomic, UncachedReadBlock, UncachedWriteBlock]; otherwise defined by the coherence protocol |
union | Union | Used to derive the following subfields: |
allocate | Bool | R/W: Hints whether to allocate data in outer caches when servicing this request |
op_code | UInt | Memory op code (Read, Write, or AMO ALU op) |
op_size | UInt | A: Size of the AMO operands (Byte, Half, Word, Double) |
addr_byte | UInt | A: Address of the AMO operands within the block |
amo_shift_bits | UInt | A: Number of bits to shift block to extract AMO operands |
wmask | UInt | W: Byte write mask for Write op’s data |
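A hedged sketch of how the fields above might be declared as a Chisel Bundle follows; the widths, and the flattening of the union subfields into a single UInt, are illustrative assumptions rather than the definitions used by any particular implementation.

```scala
import chisel3._

// Hypothetical Acquire payload mirroring the table above.
class ExampleAcquire extends Bundle {
  val addr_block     = UInt(26.W)   // physical block address, offset removed
  val client_xact_id = UInt(4.W)    // client's id for the transaction
  val addr_beat      = UInt(2.W)    // beat offset within the cache block
  val data           = UInt(128.W)  // client-sent data for Put transactions
  val built_in_type  = Bool()       // built-in vs. coherence-policy-defined type
  val a_type         = UInt(3.W)    // transaction type
  val union          = UInt(17.W)   // packed allocate/op_code/op_size/... subfields
}
```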
There are seven built-in types of Acquire that are available to all clients that want to participate in the coherence protocol, even if they themselves will not keep cached copies of the data. Because these transactions do not create a new private copy of the targeted cache block, they are termed “uncached” transactions. The available uncached transactions are as follows:
The PutBlock message is unique in that it may contain multiple beats of data (if the cache block size is larger than TLDataBits). The client controller that generates this message is responsible for generating multiple sequential PutBlock messages and incrementing the addr_beat field as it does so.
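A minimal sketch of that beat sequencing is shown below, reusing the hypothetical ExampleAcquire bundle and ExampleParams values from the earlier sketches; a real controller would also drive the remaining fields and gate valid on an actual pending request.

```scala
import chisel3._
import chisel3.util._
import ExampleParams._  // reuses the illustrative tlDataBeats value

// Hypothetical client logic streaming a multi-beat PutBlock: addr_beat is
// incremented each time a beat is accepted on the Acquire channel.
class ExamplePutBlockSender extends Module {
  val io = IO(new Bundle {
    val acquire = Decoupled(new ExampleAcquire)
    val done    = Output(Bool())
  })

  // A beat is accepted when both valid and ready are high.
  val beatAccepted = io.acquire.valid && io.acquire.ready

  // Counts accepted beats, wrapping after tlDataBeats beats.
  val (beat, lastBeat) = Counter(beatAccepted, tlDataBeats)

  io.acquire.valid          := true.B   // real logic would gate this on a request
  io.acquire.bits           := DontCare // other fields omitted in this sketch
  io.acquire.bits.addr_beat := beat     // incremented for each sequential beat
  io.done                   := lastBeat // high when the final beat is accepted
}
```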
Queries an agent to determine whether it has a cache block or to revoke its permissions on that cache block.
Name | Type | Description |
addr_block | UInt | Physical address of the cache block, with block offset removed |
p_type | UInt | Transaction type, defined by coherence protocol |
Acknowledgement of probe receipt, releasing permissions on the line along with any dirty data. Also used to voluntarily write back data or cede permissions on the block.
Name | Type | Description |
addr_block | UInt | Physical address of the cache block, with block offset removed |
client_xact_id | UInt | Client’s id for the transaction |
data | UInt | Used to writeback dirty data |
addr_beat | UInt | Offset of this beat’s worth of data within the cache block |
r_type | UInt | Transaction type, defined by coherence protocol |
voluntary | Bool | Whether this release is voluntary or in response to a Probe |
Release messages may contain multiple beats of data (if the cache block size is larger than TLDataBits). The client controller that generates this message is responsible for generating multiple sequential Release messages and incrementing the addr_beat field as it does so.
Provides data or permissions to the original requestor, granting access to the cache block. Also used to acknowledge voluntary Releases.
Name | Type | Description |
built_in_type | Bool | Whether transaction type is built-in or custom |
g_type | UInt | Type of the transaction. For built-in transactions, one of: [VoluntaryAck, PrefetchAck, PutAck, GetDataBeat, GetDataBlock]; otherwise defined by the coherence protocol |
client_xact_id | UInt | Client’s id for the transaction |
manager_xact_id | UInt | Manager’s id for the transaction, passed to Finish |
data | UInt | Used to supply data to original requestor |
addr_beat | UInt | Offset of this beat’s worth of data within the cache block |
There are five built-in types of Grant that are available to all managers that want to participate in the coherence protocol: VoluntaryAck, PrefetchAck, PutAck, GetDataBeat, and GetDataBlock. Because “uncached” transactions do not create a new private copy of the targeted cache block, we use these Grant types mostly as acknowledgements.
The GetDataBlock message may contain multiple beats of data (if the cache block size is larger than TLDataBits). The manager controller that generates this message is responsible for generating multiple sequential GetDataBlock messages and incrementing the addr_beat field as it does so.
Final acknowledgement of transaction completion from requestor, used for transaction ordering.
Name | Type | Description |
manager_xact_id | UInt | Manager’s id for the transaction |
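For symmetry with the Acquire sketch earlier, hedged Chisel sketches of the remaining channels' payloads follow. Again, the class names and field widths are illustrative assumptions that merely mirror the tables in this section.

```scala
import chisel3._

// Hypothetical Probe payload (manager -> client).
class ExampleProbe extends Bundle {
  val addr_block = UInt(26.W)  // physical block address, offset removed
  val p_type     = UInt(2.W)   // type, defined by the coherence protocol
}

// Hypothetical Release payload (client -> manager).
class ExampleRelease extends Bundle {
  val addr_block     = UInt(26.W)
  val client_xact_id = UInt(4.W)
  val addr_beat      = UInt(2.W)
  val data           = UInt(128.W)  // dirty data being written back
  val r_type         = UInt(3.W)    // type, defined by the coherence protocol
  val voluntary      = Bool()       // voluntary vs. Probe-triggered release
}

// Hypothetical Grant payload (manager -> client).
class ExampleGrant extends Bundle {
  val addr_beat       = UInt(2.W)
  val client_xact_id  = UInt(4.W)
  val manager_xact_id = UInt(4.W)   // echoed back in the Finish message
  val data            = UInt(128.W) // data supplied to the original requestor
  val built_in_type   = Bool()      // built-in vs. coherence-policy-defined type
  val g_type          = UInt(4.W)   // transaction type
}

// Hypothetical Finish payload (client -> manager).
class ExampleFinish extends Bundle {
  val manager_xact_id = UInt(4.W)
}
```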
For the convenience of designers implementing Client and Manager agents, we provide TileLinkNetworkPort modules which abstract away the details of the on-chip network implementation. These network ports automatically generate networking headers, perform SerDes for narrower physical network channels, and generate appropriate control flow logic. The ports then expose simplified subsets of the TileLink channels to the agent modules.
ClientTileLinkIO consists of standard Acquire, Probe, Release, and Grant message channels. It does not include the Finish channel as generating those acknowledgements is handled by the ClientTileLinkNetworkPort. ManagerTileLinkIO consists of Acquire, Probe, Release, and Grant message channels that have additional data appended about the source or destination of messages, expressed in terms of the Client’s id. It does include a Finish channel so that the manager knows when to register the transaction as complete.
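A sketch of what those simplified interfaces might look like is given below, reusing the hypothetical Example* payload bundles from the earlier sketches. The actual ClientTileLinkIO and ManagerTileLinkIO are defined in the implementation sources, and the manager-side version additionally carries client-id information not shown here.

```scala
import chisel3._
import chisel3.util._

// Hypothetical client-side interface: directions are from the client's point
// of view, and there is no Finish channel because the network port generates
// Finish acknowledgements on the client's behalf.
class ExampleClientTileLinkIO extends Bundle {
  val acquire = Decoupled(new ExampleAcquire)         // client -> manager
  val probe   = Flipped(Decoupled(new ExampleProbe))  // manager -> client
  val release = Decoupled(new ExampleRelease)         // client -> manager
  val grant   = Flipped(Decoupled(new ExampleGrant))  // manager -> client
}

// Hypothetical manager-side interface: channel directions are reversed, and a
// Finish channel is present so the manager can register transaction completion.
class ExampleManagerTileLinkIO extends Bundle {
  val acquire = Flipped(Decoupled(new ExampleAcquire))  // client -> manager
  val probe   = Decoupled(new ExampleProbe)             // manager -> client
  val release = Flipped(Decoupled(new ExampleRelease))  // client -> manager
  val grant   = Decoupled(new ExampleGrant)             // manager -> client
  val finish  = Flipped(Decoupled(new ExampleFinish))   // client -> manager
}
```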
This section defines the parameters that are exposed by TileLink to the top-level design. All agents that implement TileLink should either work for all values of these parameters within the specified ranges, or should add Chisel.Constraints to the design to define functional limits on them.
Name | Type | Function |
TLId | String | Identifies a particular TileLink in a multi-level hierarchy
TLCoherencePolicy | CoherencePolicy | Coherency policy used on this TileLink |
TLNManagers | Int | Number of manager agents |
TLNClients | Int | Number of client agents |
TLNCachingClients | Int | Number of client agents that cache data |
TLNCachelessClients | Int | Number of client agents that do not cache data |
TLMaxClientXacts | Int | Max number of concurrent transactions per client |
TLMaxClientsPerPort | Int | Max number of clients sharing a single network port |
TLMaxManagerXacts | Int | Max number of concurrent transactions per manager |
TLBlockAddrBits | Int | Width in bits of a cache block address (addr_block)
TLDataBits | Int | Amount of block data sent per beat, must be > 64b |
TLDataBeats | Int | Number of beats per cache block |
TLNetworkIsOrderedP2P | Boolean | Whether the underlying physical network preserves point-to-point ordering of messages
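As an illustration only, and not the actual parameter plumbing, the parameters above could be gathered into a single elaboration-time object and handed to each agent; the case class name and field names below are hypothetical.

```scala
// Hypothetical container for the TileLink parameters listed above.
// TLCoherencePolicy is omitted because its CoherencePolicy type is defined
// elsewhere in the design.
case class ExampleTileLinkParameters(
  tlId:                  String,  // TLId
  tlNManagers:           Int,     // TLNManagers
  tlNClients:            Int,     // TLNClients
  tlNCachingClients:     Int,     // TLNCachingClients
  tlNCachelessClients:   Int,     // TLNCachelessClients
  tlMaxClientXacts:      Int,     // TLMaxClientXacts
  tlMaxClientsPerPort:   Int,     // TLMaxClientsPerPort
  tlMaxManagerXacts:     Int,     // TLMaxManagerXacts
  tlBlockAddrBits:       Int,     // TLBlockAddrBits
  tlDataBits:            Int,     // TLDataBits
  tlDataBeats:           Int,     // TLDataBeats
  tlNetworkIsOrderedP2P: Boolean  // TLNetworkIsOrderedP2P
) {
  require(tlDataBits > 64, "TLDataBits must be greater than 64 bits per beat")
}
```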