Comments (3)
Hi @rajrana22,
Some of the files you listed, like ec_base_aliases.c and ec_base_vxs.c, only run on a system with no arch-specific optimizations or no modern instruction sets. This is likely why your print statements are never called. The other versions do split up the operations, balancing the amount of calculation against how many temporaries can be kept in registers. For example, we may opt to load the sources and calculate 6 parity at a time before looping through the sources again for the next 6 parity calculations. Within the inner loop, loads from each source really only "slice" based on the size of the vector registers. In the file gf_6vect_dot_prod_avx512.asm you can see we load 64 bytes at a time. Any other slicing or blocking is really up to the user, who can send in chunks of whatever size they see benefit from. The term slice is usually used in RAID or EC for how a single source is split into sources, and this is done before passing the data to the ISA-L EC functions.
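The 6-at-a-time blocking described above can be sketched in scalar C. This is only an illustration under assumptions, not ISA-L code: gf_mul_model and encode_grouped are hypothetical names, the coefficients are kept as a plain row-major matrix rather than ISA-L's expanded tables, and 0x11d is assumed as the field polynomial. The point is that each source byte is loaded once and reused for up to 6 parity accumulators before the sources are walked again for the next group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GROUP 6 /* parity outputs computed per pass over the sources */

/* Hypothetical helper: GF(2^8) multiply, assuming polynomial 0x11d. */
static uint8_t gf_mul_model(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b) {
        if (b & 1)
            p ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return p;
}

/*
 * coeffs is a rows x k row-major coefficient matrix (assumed layout).
 * data holds k source buffers, parity holds rows output buffers,
 * each len bytes long.
 */
static void encode_grouped(size_t len, int k, int rows, const uint8_t *coeffs,
                           uint8_t **data, uint8_t **parity)
{
    assert(k > 0 && rows > 0);
    for (int r0 = 0; r0 < rows; r0 += GROUP) {
        int n = (rows - r0 < GROUP) ? rows - r0 : GROUP;
        for (size_t pos = 0; pos < len; pos++) {
            uint8_t acc[GROUP] = {0}; /* stands in for the vector accumulators */
            for (int j = 0; j < k; j++) {
                uint8_t x = data[j][pos]; /* one load feeds n parity outputs */
                for (int r = 0; r < n; r++)
                    acc[r] ^= gf_mul_model(coeffs[(r0 + r) * k + j], x);
            }
            for (int r = 0; r < n; r++)
                parity[r0 + r][pos] = acc[r];
        }
    }
}
```

With 8 parity outputs, for instance, this makes one pass over the sources for outputs 0 to 5 and a second pass for outputs 6 and 7.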
from isa-l.
Hi @gbtucker,
Thank you for your reply! That was very helpful for me.
Unfortunately, I am not very experienced with assembly, so I was wondering if you could help answer some other questions that I have, which are particular to the erasure coding assembly files.
For reference, I am looking specifically at the gf_2vect_dot_prod_avx512.asm file.
- What is the control flow of func(gf_2vect_dot_prod_avx512)? What instruction does it execute after mov dest1, [dest1]? How does it get into the .next_vect loop?
- What exactly does the bulk of the .next_vect loop do? Specifically, I'm confused about what all of the masks and nibbles are and what they do.
- Upon calling the assembly file, apart from the function arguments, what values are loaded into registers? I'm a bit unsure about what values ptr, vec_i, dest2, and pos have, and specifically what they are for.
- Overall, I think a high-level understanding of how the assembly files work will help me out a lot, because I'm trying to make modifications to them for my purposes.
- What is the control flow of func(gf_2vect_dot_prod_avx512)? What instruction does it execute after mov dest1, [dest1]? How does it get into the .next_vect loop?
The vpxorq xp1, xp1, xp1 instruction just zeroes out the first accumulator. After that, execution simply falls through from the outer loop into the inner loop.
- What exactly does the bulk of the .next_vect loop do? Specifically, I'm confused about what all of the masks and nibbles are and what they do.
The flow is simply two loops. The inner loop, .next_vect, goes through each coefficient and source to multiply and accumulate. The outer loop writes out the parity and resets the accumulators for the next pass of the inner loop.
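As a rough scalar model of those two loops (a sketch, not the actual assembly): each source byte is split into a low and a high nibble with a 0x0f mask and a 4-bit shift, and each nibble indexes a 16-entry table of precomputed GF(2^8) products. The AVX-512 code does the same thing for 64 bytes at once using byte-shuffle table lookups. The names gf_mul_model, build_nibble_table, and dot_prod_2vect_model are hypothetical, and the 32-bytes-per-coefficient table layout and the 0x11d polynomial are assumptions about what ec_init_tables produces. The variables pos, vec_i, and ptr are named after the assembly temporaries to show the role each one plays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: GF(2^8) multiply, assuming polynomial 0x11d. */
static uint8_t gf_mul_model(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b) {
        if (b & 1)
            p ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return p;
}

/* Assumed layout: 16 products for the low nibble, then 16 for the high. */
static void build_nibble_table(uint8_t c, uint8_t tbl[32])
{
    for (int i = 0; i < 16; i++) {
        tbl[i] = gf_mul_model(c, (uint8_t)i);             /* c * 0x0i */
        tbl[16 + i] = gf_mul_model(c, (uint8_t)(i << 4)); /* c * 0xi0 */
    }
}

/* gftbls: vlen tables for dest[0], followed by vlen tables for dest[1]. */
static void dot_prod_2vect_model(size_t len, int vlen, const uint8_t *gftbls,
                                 uint8_t **src, uint8_t **dest)
{
    assert(vlen > 0);
    uint8_t *dest1 = dest[0];
    uint8_t *dest2 = dest[1];

    /* Outer loop: the assembly advances pos by 64 bytes per iteration. */
    for (size_t pos = 0; pos < len; pos++) {
        uint8_t xp1 = 0, xp2 = 0; /* accumulators, zeroed like vpxorq */

        /* Inner loop (.next_vect): one source and coefficient pair per pass. */
        for (int vec_i = 0; vec_i < vlen; vec_i++) {
            const uint8_t *ptr = src[vec_i];
            uint8_t x = ptr[pos];
            uint8_t lo = x & 0x0f; /* low-nibble mask */
            uint8_t hi = x >> 4;   /* high nibble shifted down */
            const uint8_t *t1 = gftbls + 32 * vec_i;
            const uint8_t *t2 = gftbls + 32 * (vlen + vec_i);
            xp1 ^= t1[lo] ^ t1[16 + hi]; /* c1 * x, accumulated */
            xp2 ^= t2[lo] ^ t2[16 + hi]; /* c2 * x, accumulated */
        }
        dest1[pos] = xp1; /* write out the parity bytes */
        dest2[pos] = xp2;
    }
}
```

The two lookups work because multiplication distributes over XOR: c * x equals (c * low nibble) XOR (c * high nibble shifted left by 4), so two 16-entry tables replace one 256-entry table and fit in vector registers.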
- Upon calling the assembly file, apart from the function arguments, what values are loaded into registers? I'm a bit unsure about what values ptr, vec_i, dest2, and pos have, and specifically what they are for.
Only the function arguments are passed to these functions. ptr, vec_i, dest2, and pos are temporary variables that help index the proper offsets into the arrays passed to the functions.
- Overall, I think a high-level understanding of how the assembly files work will help me out a lot, because I'm trying to make modifications to them for my purposes.
I hope this helps.