pub unsafe fn sched_group_barrier<const MASK: u32, const SIZE: u32, const SYNC_ID: u32>()
(stdarch_amdgpu #149988)
Creates schedule groups with specific properties to create custom scheduling pipelines.
The ordering between groups is enforced by the instruction scheduler. The intrinsic applies to the code that precedes the intrinsic. The intrinsic takes three values that control the behavior of the schedule groups.
- mask: Classify instruction groups using the sched_barrier mask values.
- size: The number of instructions that are in the group.
- sync_id: Order is enforced between groups with matching values.
The mask can include multiple instruction types. It is undefined behavior to set values beyond the range of valid masks.
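As a rough illustration of how a mask is built, the sketch below names a few mask bits and combines two of them. The bit values are taken from LLVM's AMDGPU documentation for sched_barrier and should be checked against the toolchain in use. The constant names are hypothetical and are not part of this intrinsic's API.

```rust
// Mask bits as listed in LLVM's AMDGPU documentation for sched_barrier
// (an assumption here; verify against your LLVM version).
// The constant names are hypothetical, chosen only for this sketch.
pub const VALU: u32 = 0x002; // vector ALU instructions
pub const SALU: u32 = 0x004; // scalar ALU instructions
pub const MFMA: u32 = 0x008; // matrix (MFMA/WMMA) instructions
pub const VMEM_READ: u32 = 0x020; // vector memory reads
pub const VMEM_WRITE: u32 = 0x040; // vector memory writes
pub const DS_READ: u32 = 0x100; // data share (LDS) reads
pub const DS_WRITE: u32 = 0x200; // data share (LDS) writes

// A mask may name several instruction types at once by OR-ing bits together.
pub const VMEM_READ_OR_WRITE: u32 = VMEM_READ | VMEM_WRITE;
pub const DS_READ_OR_WRITE: u32 = DS_READ | DS_WRITE;
```

With constants like these, a call could be written as sched_group_barrier::<{ VMEM_READ }, 1, 0>() rather than with the literal 32, which may make longer pipelines easier to read.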
Combining multiple sched_group_barrier intrinsics enables an ordering of specific instruction types during instruction scheduling.
For example, the following enforces a sequence of 1 VMEM read, followed by 1 VALU instruction, followed by 5 MFMA instructions.
unsafe {
    // 1 VMEM read
    sched_group_barrier::<32, 1, 0>();
    // 1 VALU
    sched_group_barrier::<2, 1, 0>();
    // 5 MFMA
    sched_group_barrier::<8, 5, 0>();
}

This intrinsic does not behave like a normal function call; it is a “convergent” operation and as such has non-standard control-flow effects which need special treatment by the language. Rust currently does not properly support convergent operations. This operation is hence provided on a best-effort basis. Using it may result in incorrect code under some circumstances.