Something something, 256-wide compute shader groups run efficiently on all IHVs' GPUs. Nvidia has this weird situation where 512- or 768-invocation-wide workgroups will get you optimal occupancy despite the max size being 1024, because some archs are weird and have 1536 max co-resident invocations. One 1024-wide group leaves 512 of those slots idle, while 2×768 or 3×512 fills them exactly.
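The arithmetic behind that is easy to check. Here is a rough sketch, assuming the only limit is the 1536 co-resident invocations mentioned in the post. Real occupancy also depends on registers, shared memory, and per-unit group count limits, which this ignores. The function name and parameters are made up for illustration and are not any vendor's API.

```python
def thread_limited_occupancy(group_size, max_resident=1536, max_group_size=1024):
    """Fraction of co-resident invocation slots filled, counting only the
    invocation limit (hypothetical helper; registers and shared memory ignored)."""
    if not 0 < group_size <= max_group_size:
        raise ValueError("group_size must be in 1..max_group_size")
    # Only whole workgroups can be resident at once.
    resident_groups = max_resident // group_size
    return resident_groups * group_size / max_resident


if __name__ == "__main__":
    for size in (256, 512, 768, 1024):
        print(f"{size:>4} wide: {thread_limited_occupancy(size):.0%}")
```

With a 1536 limit, 256, 512 and 768 all divide it evenly and come out at 100%, while 1024 fits only once and lands at about 67%.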