Comments (17)
The exact latency depends on the microarchitecture. For now, let's consider the following:
Control flow: execution (generation of condition) + memory (generation of result)
- beq bne blez bgtz beql bnel blezl bgtzl bltz bgez bltzl bgezl bltzal bgezal bltzall bgezall
- j jal jalx jr jalr
- movci movn movz
- syscall break tge tgeu tlt tltu teq tne tgei tlti tltiu teqi tnei
ALU-only: execution
- add addu sub subu and or xor nor dadd daddu dsub dsubu
- addi addiu slti sltiu andi ori xori lui daddi daddiu
- sll srl sra srlv srav slt sltu sllv dsllv dsrlv dsrav
- movz movn syscall
- mfhi mthi mflo mfh
Multiplication/Division: 3 stages of execution
- mult multu div divu dmult dmultu ddiv ddivu
Store/Load: execution + memory:
- ldl ldr lb lh lwl lw lbu lhu lwr ll lwu lwc1 lwc2 pref lld ldc1 ldc2 ld
- sb sh swl sw sdl sdr swr sc swc1 swc2 scd sdc1 sdc2 sd
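The grouping above could be captured in a small lookup, sketched here under stated assumptions. The names InstrClass, StageLatency, get_latency and cycles_until_result are hypothetical and are not taken from the mipt-mips sources; the cycle counts are the placeholder values from the list, not measured latencies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical instruction classes, one per group in the list above.
enum class InstrClass : std::uint8_t { ControlFlow, Alu, MulDiv, LoadStore };

struct StageLatency {
    unsigned execute_cycles;  // cycles spent in the execute stage(s)
    bool uses_memory_stage;   // true if the result is produced at the memory stage
};

// Placeholder values from the list; real ones depend on the microarchitecture.
StageLatency get_latency(InstrClass c) {
    switch (c) {
    case InstrClass::ControlFlow: return {1, true};   // condition in execute, result in memory
    case InstrClass::Alu:         return {1, false};  // execute only
    case InstrClass::MulDiv:      return {3, false};  // three execute stages
    case InstrClass::LoadStore:   return {1, true};   // execute + memory
    }
    assert(false && "unknown instruction class");
    return {0, false};
}

// Number of cycles after decode until the result is available.
unsigned cycles_until_result(InstrClass c) {
    const StageLatency l = get_latency(c);
    return l.execute_cycles + (l.uses_memory_stage ? 1u : 0u);
}
```

Keeping the table in one function means a different CPU model only needs a different get_latency, while the rest of the pipeline code stays unchanged.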
from mipt-mips.
I think I should start here by splitting the remaining pipeline stages.
from mipt-mips.
Could you please explain?
from mipt-mips.
It would be nice to have the decode, execute and memory stages in separate classes.
from mipt-mips.
Do you have anything in progress? I would like to reorganize the directories in the repository a little, and I don't want to interfere with your changes.
from mipt-mips.
Yes, I have implemented an infrastructure of the complex pipeline. I will endeavour to describe my current progress:
- I have added new substages to the Execute stage (class) to simulate an internally pipelined ALU for complex arithmetic. They share access to the datapath ports (DECODE_2_EXECUTE, EXECUTE_2_MEMORY, EXECUTE_2_WRITEBACK), which are used to communicate with the outside world. Intersubstage datapath ports are used when an access to the complex arithmetic unit is required.
- I have introduced a concept of the pipeline route to DataBypass class in order to update current stage of the traced instructions. I have also added cycles_till_writeback to RegisterInfo.
- Furthermore, taking into account Hennessy & Patterson, I made a questionable assumption on the way Writeback stage would be simulated. An idea was to have a writeback stage with a single writeport to the register file but with multiple routes so as to get several instructions completed at the same cycle in the writeback stage when all conditions for performed checks (these checks are performed in is_stall method of DataBypass class) on RAW, WAW dependencies and structural hazards are satisfied.
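To make the RAW and WAW checks concrete, here is a minimal sketch of a scoreboard-style stall check. It only borrows the names cycles_till_writeback and is_stall from the description above; ScoreboardSketch, RegisterInfoSketch, issue and tick are hypothetical, bypassing is ignored, and this is not the actual DataBypass code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical per-register tracking record.
struct RegisterInfoSketch {
    bool pending = false;                // an in-flight instruction will write this register
    unsigned cycles_till_writeback = 0;  // cycles left until that write happens
};

class ScoreboardSketch {
public:
    static constexpr std::size_t NUM_REGS = 32;

    // RAW: a source register is still being produced (no bypassing in this sketch).
    // WAW: an older write to dst would land at or after the new instruction's write.
    bool is_stall(std::size_t src1, std::size_t src2, std::size_t dst, unsigned latency) const {
        assert(src1 < NUM_REGS && src2 < NUM_REGS && dst < NUM_REGS);
        const bool raw = regs[src1].pending || regs[src2].pending;
        const bool waw = regs[dst].pending && regs[dst].cycles_till_writeback >= latency;
        return raw || waw;
    }

    // Record that an instruction with the given latency will write dst.
    void issue(std::size_t dst, unsigned latency) {
        assert(dst < NUM_REGS && latency > 0);
        if (dst == 0) return;  // register 0 is hardwired to zero in MIPS, never pending
        regs[dst].pending = true;
        regs[dst].cycles_till_writeback = latency;
    }

    // Advance one cycle; registers whose counter reaches zero become available.
    void tick() {
        for (auto& r : regs)
            if (r.pending && --r.cycles_till_writeback == 0)
                r.pending = false;
    }

private:
    std::array<RegisterInfoSketch, NUM_REGS> regs{};
};
```

For example, after issue(5, 3) a later instruction that reads register 5 stalls until tick() has been called three times.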
However, I ran into some difficulties with the functional simulator. I have not yet come up with a good solution to this problem, only some bad and obvious ones, but I hope to work it out as soon as possible. I am sorry for such a long delay.
I think I won't have any problems integrating the changes to the repository structure.
from mipt-mips.
Intersubstage datapath ports are used when an access to the complex arithmetic unit is required.
Is it possible to use data ports with increased latency instead of 'intersubstage' ports? They would allow changing the latency dynamically to simulate different CPUs.
An idea was to have a writeback stage with a single writeport to the register file but with multiple routes so as to get several instructions completed at the same cycle in the writeback stage when all conditions for performed checks
I think I'm missing something. If instructions complete at the same cycle, there cannot be a single write port to register file, right?
If you have some code that can be checked in independently of your other changes, it would be nice to have it committed to the main repository, so that it is protected from being broken by other developers' contributions.
from mipt-mips.
Is it possible to use data ports with increased latency instead of 'intersubstage' ports? They would allow changing the latency dynamically to simulate different CPUs.
Yes, but discarding invalid instructions (and also bypassed data in some possible implementations) with a flush signal would become non-trivial. In my opinion, it would also lead to the reduced information value of disassembly. I will try to come up with a clean way to handle mispredictions here.
I think I'm missing something. If instructions complete at the same cycle, there cannot be a single write port to register file, right?
For example, I think store and add can approach the Writeback stage at the same cycle despite the fact that store writes to zero register in our simulator.
from mipt-mips.
For example, I think store and add can approach the Writeback stage at the same cycle
It is completely wrong. Please recall the latest lecture on memory disambiguation. In real HW stores write data after retirement. We do it on mem stage just because in-order execution is guaranteed, so no younger load can read old data, and no older load reads updated data.
Additionally, a store has many opportunities to trap: unaligned address, segmentation fault, page permission fault, etc. Traps should be applied in order; otherwise the architectural state may also be updated by an instruction that follows the trapping one.
There are only two solutions:
- Add a ROB to get a real out-of-order engine (but not superscalar). That is over-engineering and dramatically increases maintenance complexity.
- Keep in-order execution with scoreboarding.
from mipt-mips.
Yes, but discarding invalid instructions (and also bypassed data in some possible implementations)
I do not see a problem here; there are two options:
- Keep the flush signal active for N cycles, where N is the latency of the port it affects
- Add a flushing interface to the writing port (worse as it complicates code)
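The first option can be sketched as a port that delays data by a fixed latency and whose reader drops everything it receives while a flush window of that same length is open. LatencyPortSketch and its methods are hypothetical names for illustration and do not reflect the real port classes in the simulator; the sketch also assumes the reader polls the port every cycle.

```cpp
#include <cassert>
#include <deque>
#include <utility>

template <typename T>
class LatencyPortSketch {
public:
    explicit LatencyPortSketch(unsigned long latency) : latency_(latency) {
        assert(latency > 0);
    }

    // Data written at `cycle` becomes readable at `cycle + latency`.
    void write(T data, unsigned long cycle) {
        queue_.emplace_back(cycle + latency_, std::move(data));
    }

    // Open a flush window covering the next `latency` cycles, which is
    // exactly long enough to drain everything written up to `cycle`.
    void flush(unsigned long cycle) {
        flushed_ = true;
        flush_until_ = cycle + latency_;
    }

    // Returns true and fills `out` if valid data arrives at `cycle`.
    // Data arriving inside the flush window is consumed and dropped.
    bool read(unsigned long cycle, T& out) {
        if (queue_.empty() || queue_.front().first > cycle) return false;
        T data = std::move(queue_.front().second);
        queue_.pop_front();
        if (flushed_ && cycle <= flush_until_) return false;
        out = std::move(data);
        return true;
    }

private:
    unsigned long latency_;
    bool flushed_ = false;
    unsigned long flush_until_ = 0;
    std::deque<std::pair<unsigned long, T>> queue_;
};
```

Because the window length is derived from the port latency, changing the latency to model another CPU does not require any change to the flush logic.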
In my opinion, it would also lead to the reduced information value of disassembly.
How?
from mipt-mips.
It is completely wrong. Please recall the latest lecture on memory disambiguation. In real HW stores write data after retirement. We do it on mem stage just because in-order execution is guaranteed, so no younger load can read old data, and no older load reads updated data.
Thank you, I have realized the dullness of my proposal.
How?
There would not be the information about the specific execute stage (execute-0, execute-1, execute-2, etc.) which operates on a complex arithmetic instruction at the current moment.
from mipt-mips.
There would not be the information about the specific execute stage
I see. A better term is 'logging', since disassembly is the string output of an instruction's opcode.
Logging is required only for debugging, and if we manage to extract Cycle Analysis from EduMIPS64 or make our own, we will not need it, as latencies will be observable in the tool.
from mipt-mips.
@denislos What is the estimated time of arrival for the next pull request? I'm not rushing you, I just want to know.
from mipt-mips.
I am sorry, I will open a pull request by the end of the week.
from mipt-mips.
Please proceed with documentation.
from mipt-mips.
Link to the page: https://github.com/MIPT-ILab/mipt-mips/wiki/Data-Bypass-and-Scoreboard
Please use this page as an example: https://github.com/MIPT-ILab/mipt-mips/wiki/BPU-model
from mipt-mips.
Thank you, I will endeavour to complete it as soon as possible.
from mipt-mips.