This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

【DONT MERGE】 test softmax speed #1326

Open
wants to merge 24 commits into
base: develop

Conversation


@phlrain phlrain commented Apr 3, 2023

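In this PR, `cinn/ir/fuse_block_model_fp16_test.cc` is the test case for softmax in fp16.
The kernel takes 86 µs, which is close to the phi kernel's 82 µs
but still behind torch's 77.47 µs.

The cause is that some of the for loops are not merged yet and need further merging. After merging them by hand, the measured kernel time is 75 µs, which is on par with the torch implementation.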
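As a rough illustration of what merging the for loops means for softmax, here is a minimal CPU sketch in plain C++. It is not the kernel generated in this PR (that one runs on the GPU in fp16); the function names `SoftmaxSeparateLoops` and `SoftmaxMergedLoops` are hypothetical and only chosen for this example. The first version runs each step as its own loop with temporary buffers, the second merges the subtract, exp, and sum steps into a single pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Unmerged version: every step of softmax is a separate loop over the row,
// and intermediate results are written to temporary buffers.
std::vector<float> SoftmaxSeparateLoops(const std::vector<float>& x) {
  assert(!x.empty());
  const std::size_t n = x.size();
  std::vector<float> shifted(n), exps(n), out(n);

  float max_val = x[0];
  for (std::size_t i = 1; i < n; ++i) max_val = std::max(max_val, x[i]);  // loop 1: row max

  for (std::size_t i = 0; i < n; ++i) shifted[i] = x[i] - max_val;     // loop 2: subtract max
  for (std::size_t i = 0; i < n; ++i) exps[i] = std::exp(shifted[i]);  // loop 3: exp

  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) sum += exps[i];                  // loop 4: row sum

  for (std::size_t i = 0; i < n; ++i) out[i] = exps[i] / sum;          // loop 5: normalize
  return out;
}

// Merged version: loops 2-4 are fused into one pass, so the temporaries
// disappear and the row is traversed three times instead of five.
std::vector<float> SoftmaxMergedLoops(const std::vector<float>& x) {
  assert(!x.empty());
  const std::size_t n = x.size();
  std::vector<float> out(n);

  float max_val = x[0];
  for (std::size_t i = 1; i < n; ++i) max_val = std::max(max_val, x[i]);  // row max

  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {  // subtract max, exp, and sum in one loop
    const float e = std::exp(x[i] - max_val);
    out[i] = e;
    sum += e;
  }

  for (std::size_t i = 0; i < n; ++i) out[i] /= sum;  // normalize
  return out;
}
```

Both functions compute the same result. The merged one only saves loop overhead and intermediate memory traffic, which is the kind of saving the manual merge is after.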


paddle-bot bot commented Apr 3, 2023

Thanks for your contribution!
