Comments (1)
Thinking about this more, we can reuse our memory_coloring pass and then just do some post-processing.
So first we would lower the pointwise operators to an inner_pointwise that takes an allocate instruction as its last input. The size of the allocation can just be the shape of the pointwise instruction (the exact size is not really important as long as each one is the same size).
Then we run memory_coloring. Afterwards, the first inner_pointwise that uses a given load is replaced with a pointwise, and every other inner_pointwise that uses the same load is rewritten to reference that first pointwise:
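// Maps a buffer offset to the first instruction that writes to it
std::unordered_map<std::size_t, instruction_ref> load2ins;
for(auto ins : iterator_for(m))
{
    if(ins->name() != "gpu::inner_pointwise")
        continue;
    // The last input is the output buffer (a load after memory_coloring)
    auto out    = ins->inputs().back();
    auto inputs = ins->inputs();
    // The offset is a field of the load operator, so read it from the operator
    auto offset = out->get_operator().to_value()["offset"].to<std::size_t>();
    if(contains(load2ins, offset))
    {
        // Buffer already written: alias the first pointwise at this offset
        auto i        = load2ins[offset];
        inputs.back() = i;
        m.replace_instruction(ins, ins->get_operator(), inputs, ins->module_inputs());
    }
    else
    {
        // First use of this buffer: drop the buffer and emit a plain pointwise
        load2ins[offset] = ins;
        inputs.pop_back();
        m.replace_instruction(ins, make_op("pointwise"), inputs, ins->module_inputs());
    }
}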
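To make the offset bookkeeping easier to follow outside of MIGraphX, here is a minimal standalone sketch of the same idea. The `toy_ins` struct and `resolve_owners` function are hypothetical names made up for illustration; they only model "which instruction owns the buffer at this offset" and do not touch the real IR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a gpu::inner_pointwise whose output buffer
// was assigned `offset` by memory coloring.
struct toy_ins
{
    std::string name;
    std::size_t offset;
};

// For each instruction, return the name of the instruction that owns its
// buffer: itself if it is the first one seen at that offset, otherwise the
// first instruction that was seen at that offset.
std::vector<std::string> resolve_owners(const std::vector<toy_ins>& inss)
{
    std::unordered_map<std::size_t, std::string> first_at_offset;
    std::vector<std::string> owners;
    owners.reserve(inss.size());
    for(const auto& ins : inss)
    {
        assert(!ins.name.empty());
        // emplace only inserts when the offset has not been seen yet
        auto it = first_at_offset.emplace(ins.offset, ins.name).first;
        owners.push_back(it->second);
    }
    return owners;
}
```

With inputs `z0` at offset 0, `z1` at offset 64 and `z2` at offset 0, the owners come out as `z0`, `z1`, `z0`: the third instruction would be the one rewritten to reference the first.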
We will need to update the codegen to handle aliased variables. When an instruction aliases another, instead of generating a return variable (i.e. auto zn = f(...)), it would generate the call as a standalone statement (i.e. f(x)) and then update the mapping to refer to the original variable that is aliased.
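A rough sketch of what that codegen change could look like, written as a standalone toy rather than against the actual codegen. The `toy_call` struct, its fields, and `generate` are all hypothetical; the only point is to show the name mapping being redirected when a call aliases an earlier variable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical description of one call in the generated body.
struct toy_call
{
    std::string result;            // name of this instruction's result
    std::string function;          // function to call
    std::vector<std::string> args; // names of the arguments
    std::string alias_of;          // non-empty if the result aliases an earlier result
};

std::string generate(const std::vector<toy_call>& calls)
{
    // Maps an instruction's result name to the variable actually emitted for it
    std::unordered_map<std::string, std::string> names;
    std::string out;
    for(const auto& c : calls)
    {
        assert(!c.function.empty());
        std::string call = c.function + "(";
        for(std::size_t i = 0; i < c.args.size(); i++)
        {
            if(i > 0)
                call += ", ";
            // Use the mapped variable if the argument was produced earlier
            auto it = names.find(c.args[i]);
            call += (it == names.end()) ? c.args[i] : it->second;
        }
        call += ")";
        if(c.alias_of.empty())
        {
            // Normal case: bind the result to a new variable
            names[c.result] = c.result;
            out += "auto " + c.result + " = " + call + ";\n";
        }
        else
        {
            // Aliased case: emit the call standalone and point the result
            // at the variable that is being aliased
            auto it            = names.find(c.alias_of);
            std::string target = (it == names.end()) ? c.alias_of : it->second;
            names[c.result]    = target;
            out += call + ";\n";
        }
    }
    return out;
}
```

For example, if `z1 = g(y, z0)` aliases `z0` and a later `z2 = h(z1)` consumes it, this toy emits `g(y, z0);` as a bare statement and then `auto z2 = h(z0);`, since `z1` now resolves to `z0`.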
There is still one issue that would need to be solved: if two inner_pointwise instructions use different data types, then we shouldn't reuse the buffers. We could possibly fix this by running memory_coloring multiple times, once for each type.
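One way to picture the per-type idea is to partition the buffers by data type first, so that each coloring run only ever sees buffers of a single type. The sketch below is only that partitioning step, with a hypothetical `toy_buffer` struct and type names as plain strings; how memory_coloring would actually be restricted to one group is left open.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of an inner_pointwise output buffer and its data type.
struct toy_buffer
{
    std::string name;
    std::string type;
};

// Group buffer names by data type, preserving the original order within
// each group. Each group could then be colored independently so buffers of
// different types never share an offset.
std::map<std::string, std::vector<std::string>>
group_by_type(const std::vector<toy_buffer>& buffers)
{
    std::map<std::string, std::vector<std::string>> groups;
    for(const auto& b : buffers)
    {
        assert(!b.type.empty());
        groups[b.type].push_back(b.name);
    }
    return groups;
}
```

The trade-off is that buffers of different types can no longer share memory at all, even when their lifetimes do not overlap.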
from amdmigraphx.