fix graph on ascend #1345
Open
ShaneWoof wants to merge 1 commit into
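Implements the graph feature on Ascend (昇腾) on top of ACL ModelRI.
Changes:
1. /src/infinirt/ascend/infinirt_ascend.cc: replaces the stub functions that previously returned DEVICE_TYPE_NOT_SUPPORTED with a full implementation based on aclmdlRICaptureBegin/aclmdlRICaptureEnd/aclmdlRIDestroy/aclmdlRIExecuteAsync.
2. /src/infinicore/graph/ascend/graph.cc: the DeviceGraph constructor now explicitly initializes graph/exec/node to nullptr, preventing undefined behavior caused by uninitialized pointers.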
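For reviewers, here is a minimal sketch of why the explicit nullptr initialization in change 2 matters. It is not the code in graph.cc: `DeviceGraphSketch`, `FakeHandle`, `adopt`, `ready`, `launch`, and `reset` are hypothetical names, and the handles are plain heap objects standing in for the opaque runtime handles, so the example runs without the Ascend toolkit.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for an opaque runtime handle. The real member types
// of DeviceGraph are not shown in this PR description.
struct FakeHandle {
    int id;
};

class DeviceGraphSketch {
public:
    // Explicitly initialize every handle. Without this, the destructor and
    // launch() would read indeterminate pointers on a never-captured graph.
    DeviceGraphSketch() : graph_(nullptr), exec_(nullptr), node_(nullptr) {}

    ~DeviceGraphSketch() { reset(); }

    // Handles are owned exclusively, so copying is disabled.
    DeviceGraphSketch(const DeviceGraphSketch &) = delete;
    DeviceGraphSketch &operator=(const DeviceGraphSketch &) = delete;

    // Stand-in for the end of a capture: take ownership of the captured handle.
    void adopt(FakeHandle *captured) {
        assert(captured != nullptr);
        reset();
        exec_ = captured;
    }

    // A null exec handle reliably means "nothing captured yet".
    bool ready() const { return exec_ != nullptr; }

    // Refuse to launch instead of dereferencing a null handle.
    bool launch() const {
        if (!ready()) {
            return false;
        }
        std::printf("launch graph %d\n", exec_->id);
        return true;
    }

    // Deleting a null pointer is a no-op, so reset() is safe in any state.
    void reset() {
        delete graph_;
        delete exec_;
        delete node_;
        graph_ = exec_ = node_ = nullptr;
    }

private:
    FakeHandle *graph_;
    FakeHandle *exec_;
    FakeHandle *node_;
};
```

With the members zeroed in the constructor, destroying or launching an object that never completed a capture is well defined. In the real backend the release step would presumably go through the runtime's destroy call rather than `delete`.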
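Current status:

1. The current implementation passes the /test/infinicore/graph/attention.py test case.
2. It is incompatible with static attn, so it cannot be enabled on its own for performance testing; it depends on paged-attn or flash-attn. Enabling it together with the current unoptimized paged-attn causes an OOM and is unusable, so this is pending a flash-attn implementation.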