bondhugula / pluto
Pluto: An automatic polyhedral parallelizer and locality optimizer
Home Page: http://pluto-compiler.sourceforge.net
License: MIT License
On the pluto+ branch, in examples/heat-2dp, 'make par' fails.
This is due to the pet-based extraction picking up undesirable statements (ones that should have been eliminated as dead code).
heat-2dp.par.c: In function ‘main’:
heat-2dp.par.c:96:7: error: lvalue required as left operand of assignment
96 | 0 = (0 + 1);;
| ^
heat-2dp.par.c:98:7: error: lvalue required as left operand of assignment
98 | 0 = (0 + 1);;
| ^
heat-2dp.par.c:106:9: error: lvalue required as left operand of assignment
106 | 0 = (0 + 1);;
| ^
make: *** [../common.mk:79: par] Error 1
When I run ./configure, I get this error:
configure: error: llvm-config not found
If I use --with-clang-prefix=/path_to_my_llvm-build, I get this error:
configure: error: clang header file not found
I tried this with several versions: 3.4, 5.0, and 7.0.0.
The temporary files .regtile, .srcfilename, and .outfilename remain in the current working directory after polycc runs.
Pluto+: ILPs for test/matmul-seq3 and test/tce-4index... hang with the ISL solver. They work with --pipsolve or --glpk. These used to work with an older version of ISL (0.12.1) -- prior to the recent major update to submodules.
Reproduce with the master branch on test/wavefront.c or any stencil like test/heat-2d.c or heat-3d.c. For example, for test/wavefront.c, (j, i) is a bad loop permutation. Similarly, for heat-3d.c, t+k is the best innermost intra-tile loop (as opposed to t).
./polycc test/wavefront.c --notile --noparallel
[pluto] compute_deps (isl)
[pluto] Number of statements: 1
[pluto] Total number of loops: 2
[pluto] Number of deps: 2
[pluto] Maximum domain dimensionality: 2
[pluto] Number of parameters: 1
[pluto] Affine transformations [<iter coeff's> ]
T(S1): (i, j)
loop types (loop, loop)
[pluto] After intra-tile optimize
T(S1): (j, i)
loop types (loop, loop)
[pluto] using Cloog -f/-l options: 1 2
[Pluto] Output written to wavefront.pluto.c
[pluto] Timing statistics
[pluto] SCoP extraction + dependence analysis time: 0.005205s
[pluto] Auto-transformation time: 0.001592s
[pluto] Total constraint solving time (LP/MIP/ILP) time: 0.000928s
[pluto] Code generation time: 0.001203s
[pluto] Other/Misc time: 0.014638s
[pluto] Total time: 0.022638s
This error shows up on:
./configure
The last few lines of the error message are as follows:
checking for llvm-config... yes
checking clang/Basic/SourceLocation.h usability... no
checking clang/Basic/SourceLocation.h presence... no
checking for clang/Basic/SourceLocation.h... no
configure: error: clang header file not found
checking for pet/Makefile... no
configure: error: configure in pet/ failed
This is on Ubuntu 16.04 LTS with clang version 3.8 and LLVM version 3.8.
The master branch works cleanly.
Cheers.
On the distmem branch, examples/heat-2d, heat-3d output all zeros during verification testing! Fix this.
Since diamond tiling was turned on by default, test/multi-stmt-lazy-lin-ind.c asserts when run with --tile; earlier, this failure would have required --tile --lbtile. It passes with --tile --nodiamond-tile.
With the libpluto branch, the warning below appears on a run of ./test.sh --silent:
isl_ctx.c:253: isl_ctx freed, but some objects still reference it
configure currently checks only in /usr/lib64/llvm. It should also check based on the --with-clang-prefix value.
I followed the instructions and got the following error when I ran make (relevant part of the output):
CC pet_codegen-pet_codegen.o
CCLD pet_codegen
CXX dummy.o
CC pet_check_code-pet_check_code.o
CXXLD pet_check_code
/usr/bin/x86_64-linux-gnu-ld: pet_check_code-pet_check_code.o: relocation R_X86_64_32S against `.rodata' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/x86_64-linux-gnu-ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
Makefile:889: recipe for target 'pet_check_code' failed
make[4]: *** [pet_check_code] Error 1
make[4]: Leaving directory '/work/pluto-distmem/pluto/pet'
Makefile:1085: recipe for target 'all-recursive' failed
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory '/work/pluto-distmem/pluto/pet'
I tried to fix this by adding -fPIC to CCFLAGS, then I got this different error:
CXX dummy.o
CC pet_check_code-pet_check_code.o
CXX scan.lo
CXXLD libpet.la
ar: `u' modifier ignored since `D' is the default (see `U')
CCLD pet_codegen
CXXLD pet_check_code
./.libs/libpet.so: undefined reference to `clang::DeclarationName::getAsString[abi:cxx11]() const'
./.libs/libpet.so: undefined reference to `llvm::sys::getDefaultTargetTriple[abi:cxx11]()'
./.libs/libpet.so: undefined reference to `clang::getClangFullVersion[abi:cxx11]()'
./.libs/libpet.so: undefined reference to `clang::DeclarationNameInfo::getAsString[abi:cxx11]() const'
./.libs/libpet.so: undefined reference to `clang::QualType::getAsString[abi:cxx11](clang::Type const*, clang::Qualifiers)'
collect2: error: ld returned 1 exit status
Makefile:889: recipe for target 'pet_check_code' failed
make[4]: *** [pet_check_code] Error 1
make[4]: Leaving directory '/work/pluto-distmem/pluto/pet'
Makefile:1085: recipe for target 'all-recursive' failed
llvm-config and clang are both version 3.4 (a prebuilt one; I got an error when I tried to build it from the tarball), gcc (Ubuntu 4.8.5-4ubuntu8) 4.8.5, on Ubuntu 18.04.
I tried to find a solution; what I found is that this is a linker error which might be caused by compiling the libraries with different versions of gcc, but I still don't know how to resolve it.
(LLVM build output:
[ 94%] Building CXX object tools/compliler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.x86_64.dir/sanitizer_stacktrace_libcdep.cc.o
[ 94%] Building CXX object tools/compliler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.x86_64.dir/sanitizer_stoptheworld_linux_libcdep.cc.o
/opt/llvm/llvm/tools/compliler-rt/lib/sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cc: In function ‘int __sanitizer::TracerThread(void*)’:
/opt/llvm/llvm/tools/compliler-rt/lib/sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cc:243:22: error: aggregate ‘sigaltstack handler_stack’ has incomplete type and cannot be defined
struct sigaltstack handler_stack;
^~~~~~~~~~~~~
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-c99-extensions’
cc1plus: warning: unrecognized command line option ‘-Wno-gnu’
tools/compliler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.x86_64.dir/build.make:158: recipe for target 'tools/compliler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.x86_64.dir/sanitizer_stoptheworld_linux_libcdep.cc.o' failed
make[2]: *** [tools/compliler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.x86_64.dir/sanitizer_stoptheworld_linux_libcdep.cc.o] Error 1
CMakeFiles/Makefile2:18819: recipe for target 'tools/compliler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.x86_64.dir/all' failed
make[1]: *** [tools/compliler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommonLibc.x86_64.dir/all] Error 2
Makefile:151: recipe for target 'all' failed
make: *** [all] Error 2
)
Got any idea what should I do?
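For what it's worth, undefined references to symbols with [abi:cxx11] in their names typically indicate a libstdc++ ABI mismatch: the prebuilt LLVM/Clang 3.4 binaries predate the GCC 5 std::string ABI change, while pet is being compiled with the new ABI. A possible workaround (a sketch; the clang prefix path is a placeholder) is to force the old ABI:

```shell
# Build pet/Pluto against the pre-GCC-5 libstdc++ ABI so that the
# std::string-returning symbols match the prebuilt Clang 3.4 libraries.
./configure CXXFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0" \
            --with-clang-prefix=/path/to/clang+llvm-3.4
make
```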
I built Pluto on Ubuntu 16.04 with gcc 8. When I ran the "multi-stmt-lazy-lin-ind" test in the test folder, the case failed at the assertion "hyp_search_mode == LAZY || num_sols_left == num_ind_sols_req - num_ind_sols_found".
I printed some info at the fail point:
num_sols_left=1, num_ind_sols_req=2, num_ind_sols_found=2
Could you help me with this case?
Hi,
I met the following error when I compiled the pet branch:
program.c: In function ‘pet_to_pluto_stmts’:
program.c:4556:18: error: ‘struct pet_stmt’ has no member named ‘stmt_text’
if (pstmt->stmt_text) {
^~
Then I checked the definition of struct pet_stmt in pet.h:
struct pet_stmt {
int line;
isl_set *domain;
isl_map *schedule;
struct pet_expr *body;
unsigned n_arg;
struct pet_expr **args;
};
and found that stmt_text is not a member of struct pet_stmt.
Is this problem caused by different versions of pet? I tried from pet 0.03 to pet 0.11 and got the same error.
When I mark the code of interest like this:
void mat_mul0(int n_a_rows, int n_a_cols, const float *a, int n_b_cols, const float *b, float *m)
{
  int i, j, k;
#pragma scop
  for (i = 0; i < n_a_rows; ++i) {
    for (j = 0; j < n_b_cols; ++j) {
      float t = 0.0;
      for (k = 0; k < n_a_cols; ++k)
        t += a[i*n_a_rows+k] * b[k*n_a_cols+j];
      m[i*n_a_rows+j] = t;
    }
  }
#pragma endscop
}
and run polycc matmul.c --tile --parallel -o matmul, all I get is a syntax error:
[Clan] Error: syntax error at line 129, column 8.
Error extracting polyhedra from source file: 'matmul.c'
Line 129 corresponds to float t = 0.0;.
GCC version 6.3.0
make
...
scan.cc: At global scope:
scan.cc:5073:35: error: ‘pet_scop* PetScan::extract’ is not a static data member of ‘struct PetScan’
struct pet_scop *PetScan::extract(StmtRange stmt_range, bool block,
^~~~~~~~~
scan.cc:5073:35: error: ‘StmtRange’ was not declared in this scope
scan.cc:5073:57: error: expected primary-expression before ‘bool’
struct pet_scop *PetScan::extract(StmtRange stmt_range, bool block,
^~~~
scan.cc:5074:2: error: expected primary-expression before ‘bool’
bool skip_declarations)
^~~~
scan.cc:5074:24: error: expression list treated as compound expression in initializer [-fpermissive]
bool skip_declarations)
^
Hello,
I am trying to install Pluto on macOS with Intel 19.0.0.117 compilers. I am using the latest commit on the master branch.
Following the installation steps, I successfully configure Pluto with ./configure CC=icc CXX=icpc. I then get the following warnings and errors during compilation with make -j4:
CCLD cloog
CCLD test/generate_test_advanced
ld: warning: directory not found for option '-L/lib'
ld: warning: directory not found for option '-L/lib'
Making all in clan
YACC source/parser.c
/Users/whj/Dev/pluto/clan/source/parser.y:226.27-34: syntax error, unexpected type, expecting string or identifier
Did I miss something? Thank you!
I have an ICCG solver with data dependences between iterations. For example, for a 3D matrix, the (i, j, k) value depends on (i-1, j-1, k-1), (i, j, k), and (i+1, j+1, k+1). I used the Gauss-Seidel test case as a model and modified it for my loop. The original loop looks like this:
#pragma scop
for (i=1; i<=nx; i++) {
  for (j=1; j<=ny; j++) {
    for (k=1; k<=nz; k++) {
      dummy = COEFF6[i][j][k] * p_sparse_s[i][j][k];
      if (PeriodicBoundaryX && i == 1) dummy += COEFF0[i][j][k] * p_sparse_s[nx][j][k];
      else dummy += COEFF0[i][j][k] * p_sparse_s[i-1][j][k];
      if (PeriodicBoundaryX && i == nx) dummy += COEFF1[i][j][k] * p_sparse_s[1][j][k];
      else dummy += COEFF1[i][j][k] * p_sparse_s[i+1][j][k];
      if (PeriodicBoundaryY && j == 1) dummy += COEFF2[i][j][k] * p_sparse_s[i][ny][k];
      else dummy += COEFF2[i][j][k] * p_sparse_s[i][j-1][k];
      if (PeriodicBoundaryY && j == ny) dummy += COEFF3[i][j][k] * p_sparse_s[i][1][k];
      else dummy += COEFF3[i][j][k] * p_sparse_s[i][j+1][k];
      if (PeriodicBoundaryZ && k == 1) dummy += COEFF4[i][j][k] * p_sparse_s[i][j][nz];
      else dummy += COEFF4[i][j][k] * p_sparse_s[i][j][k-1];
      if (PeriodicBoundaryZ && k == nz) dummy += COEFF5[i][j][k] * p_sparse_s[i][j][1];
      else dummy += COEFF5[i][j][k] * p_sparse_s[i][j][k+1];
      ap_sparse_s[i][j][k] = dummy;
      pipi_sparse += p_sparse_s[i][j][k] * ap_sparse_s[i][j][k];
    }
  }
}
#pragma endscop
For this, Pluto fails with a syntax error pointing to the first if statement. If I remove all the if-else clauses (giving up the periodic boundaries in question) and write the loop as
#pragma scop
for (i=1; i<=nx; i++) {
  for (j=1; j<=ny; j++) {
    for (k=1; k<=nz; k++) {
      ap_sparse_s[i][j][k] = COEFF0[i][j][k] * p_sparse_s[i-1][j][k]
                           + COEFF1[i][j][k] * p_sparse_s[i+1][j][k]
                           + COEFF2[i][j][k] * p_sparse_s[i][j-1][k]
                           + COEFF3[i][j][k] * p_sparse_s[i][j+1][k]
                           + COEFF4[i][j][k] * p_sparse_s[i][j][k-1]
                           + COEFF5[i][j][k] * p_sparse_s[i][j][k+1]
                           + COEFF6[i][j][k] * p_sparse_s[i][j][k];
      pipi_sparse += p_sparse_s[i][j][k] * ap_sparse_s[i][j][k];
    }
  }
}
#pragma endscop
Then I am able to run this with Pluto, and I expect some red-black ordering or a wavefront transformation of the loop, but Pluto emits the optimized loop as:
int t1, t2, t3, t4;
int lb, ub, lbp, ubp, lb2, ub2;
register int lbv, ubv;
/* Start of CLooG code */
if ((nx >= 1) && (ny >= 1) && (nz >= 1)) {
for (t1=1;t1<=nx;t1++) {
for (t2=1;t2<=ny;t2++) {
for (t3=1;t3<=nz;t3++) {
ap_sparse_s[t1][t2][t3]= COEFF0[t1][t2][t3] * p_sparse_s[t1-1][t2][t3] + COEFF1[t1][t2][t3] * p_sparse_s[t1+1][t2][t3] + COEFF2[t1][t2][t3] * p_sparse_s[t1][t2-1][t3] + COEFF3[t1][t2][t3] * p_sparse_s[t1][t2+1][t3] + COEFF4[t1][t2][t3] * p_sparse_s[t1][t2][t3-1] + COEFF5[t1][t2][t3] * p_sparse_s[t1][t2][t3+1] + COEFF6[t1][t2][t3] * p_sparse_s[t1][t2][t3] ;;
pipi_sparse += p_sparse_s[t1][t2][t3] * ap_sparse_s[t1][t2][t3];;
}
}
}
}
The output loop is exactly the same as the input loop, with the loop dependences still present.
Could you see what is wrong?
I want to run the executable created by Pluto's MPI code generation on several nodes of a cluster, but I don't know what I should do. Can you please help me?
I have Pluto built from the master branch.
polycc --tile --l2tile --intratileopt --unroll 2mm.c
generates a 2mm.pluto.c that does not compile due to missing closing braces.
The attached 2mm.c is from PolyBench 4.2:
2mm.zip
test/multi-stmt-2d-periodic.c test/multi-stmt-2d-periodic.c:6:15: error: CHECK-DAG: expected string not found in input
// CHECK-DAG: T(S{{[0-9]+}}): (t, t+i, t+j)
^
&lt;stdin&gt;:26:1: note: scanning from here
[pluto] Affine transformations [<iter coeff's> ]
^
&lt;stdin&gt;:31:1: note: possible intended match here
T(S2): (t, t+j, t+i)
^
[Failed] test/multi-stmt-2d-periodic.c!
This could be due to the recent change to the objective -- t+j is being preferred over t+i.
Reproduce on the pluto-rlp branch with:
$ ./polycc test/multi-stmt-stencil-seq.c --typedfuse
I compiled the example covcol.c located in examples/covcol after modifying the common.mk file and setting NPROCS=4, using make dist. After that, I tried running it on a cluster using the command
mpirun -np 4 -host host1,host2,host3,host4
but when I check the rank number by printing it with
printf("rank number is %d", my_rank);
it's always equal to zero. How can I check whether the program uses all the nodes or not?
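A sketch of how such a run can be checked (host names are placeholders, and flag spellings vary between MPI implementations): give mpirun an explicit host list, and make each rank print its host name (e.g. via MPI_Get_processor_name) as well as its rank, ending the printf format with \n so the line is actually flushed:

```shell
# Expect one line per rank. If only rank 0 prints, or every line shows
# the same host, the job is not actually spread across the nodes.
mpirun -np 4 -host host1,host2,host3,host4 ./covcol
```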
make[1]: Entering directory '/home/uday/git/pluto/examples/ssymm'
touch .test
./orig 2> out_orig
12.579868s
./tiled 2> out_tiled
3.264796s
diff -q out_orig out_tiled
Files out_orig and out_tiled differ
make[1]: *** [../common.mk:106: test] Error 1
On the pluto+ branch, test/fusion10.c hangs with GLPK. It works fine with ISL or PIP as the solver:
./polycc test/fusion10.c --glpk --moredebug
Constructing initial basis...
Size of triangular part is 32
0: obj = -9.600000000e+01 inf = 4.614e+01 (6)
10: obj = 9.074000000e+02 inf = 4.330e-15 (0)
On macOS, it seems that -fopenmp is not accepted as an option by clang 9.0. On the forums I found that its OpenMP support is not good, which may be an explanation.
However, I can't see why the compilation of Pluto uses -fopenmp, or how to disable this feature.
By the way, if someone is interested, I have compilation instructions for Fedora, including some hacks needed to fully compile.
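For what it's worth, if this is Apple's bundled clang rather than an upstream LLVM build, it rejects -fopenmp outright and does not ship the OpenMP runtime. A commonly used workaround (a sketch assuming Homebrew's libomp package; the paths come from brew --prefix and may differ on your machine) is:

```shell
brew install libomp
./configure CFLAGS="-Xpreprocessor -fopenmp -I$(brew --prefix libomp)/include" \
            LDFLAGS="-L$(brew --prefix libomp)/lib -lomp"
```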
Running ./test_libpluto on the current master tip (6118e7f) yields:
...
*** TEST CASE 6
[Pluto] Number of statements: 1
[Pluto] Total number of loops: 3
[Pluto] Number of deps: 1
[Pluto] Maximum domain dimensionality: 3
[Pluto] Number of parameters: 2
[pluto] Diamond tiling not possible/useful
[Pluto] Affine transformations [<iter coeff's> ]
T(S1): (i0, i1+i2, i1)
loop types (loop, loop, loop)
[Pluto] After tiling:
T(S1): (i0/32, (i1+i2)/32, i1/32, i0, i1+i2, i1)
loop types (loop, loop, loop, loop, loop, loop)
[pluto_mark_parallel] 1 parallel loops
t1 {loop with stmts: S1, }
[R, T] -> { S_0[i0, i1, i2] -> [o0, o1, o2, i0, i1 + i2, i1] : -31 + i0 <= 32o0 <= i0 and -31 + i1 + i2 <= 32o1 <= i1 + i2 and -31 + i1 <= 32o2 <= i1 }
[Pluto] Number of statements: 1
[Pluto] Total number of loops: 3
[Pluto] Number of deps: 1
[Pluto] Maximum domain dimensionality: 3
[Pluto] Number of parameters: 2
[pluto] Diamond tiling not possible/useful
[Pluto] Affine transformations [<iter coeff's> ]
T(S1): (i0, i1+i2, i1)
loop types (loop, loop, loop)
[Pluto] After tiling:
T(S1): (i0/32, (i1+i2)/32, i1/32, i0, i1+i2, i1)
loop types (loop, loop, loop, loop, loop, loop)
[pluto_mark_parallel] 1 parallel loops
t1 {loop with stmts: S1, }
*** TEST CASE test_lib_pluto_schedule ***
lt-test_libpluto: program.cpp:1997: isl_stat basic_map_extract_dep(isl_basic_map*, void*): Assertion `0' failed.
Aborted (core dumped)
--iss doesn't do any splitting on heat-1dp, heat-2dp, or heat-3dp from examples/, in either the pluto branch or pluto+.
After a slightly modified version of get_polly_libpluto.sh has run for about 10 hours, the GCC process that tries to build LLVM gets killed because the Raspberry Pi 3 runs out of RAM, despite having about 900 MiB of RAM available for the build job as a whole. Cross-compilation on a more powerful machine would be very tedious, because the configure scripts (probably; I do not know for sure) need access to the actual Raspberry Pi 3 package collection.
The directory, with all logs, ARM binaries, and the rest in it, MIGHT be available from
https://temporary.softf1.com/2017/bugs/2017_05_19_GCC_Crash_on_Raspberry_Pi_3_Raspbian.tar.xz
test_libpluto hangs on the third test case.
$ ./test_libpluto
*** TEST CASE 3 ***
With debug output, it appears that the call to isl hangs.
[pluto] (Band 1) Solving for hyperplane #2
[pluto] pluto_prog_constraints_lexmin (11 variables, 29 constraints)
[pluto] pluto_constraints_lexmin_isl (11 variables, 29 constraints)
With GLPK, the second test case itself hangs.
Making all in pet
make[2]: Entering directory '/home/chriselrod/Documents/libraries/pluto/pet'
make all-recursive
make[3]: Entering directory '/home/chriselrod/Documents/libraries/pluto/pet'
Making all in .
make[4]: Entering directory '/home/chriselrod/Documents/libraries/pluto/pet'
CXX pet.lo
CC pet_check_code-pet_check_code.o
CC libdep_a-all.o
AR libdep.a
ar: `u' modifier ignored since `D' is the default (see `U')
pet.cc: In function ‘isl_stat foreach_scop_in_C_source(isl_ctx*, const char*, const char*, pet_options*, isl_stat (*)(pet_scop*, void*), void*)’:
pet.cc:1147:19: error: invalid use of incomplete type ‘class clang::Builtin::Context’
1147 | PP.getBuiltinInfo().initializeBuiltins(PP.getIdentifierTable(),
| ~~~~~~~~~~~~~~~~~^~
In file included from pet.cc:87:
/usr/local/include/clang/Lex/Preprocessor.h:84:7: note: forward declaration of ‘class clang::Builtin::Context’
84 | class Context;
| ^~~~~~~
make[4]: *** [Makefile:1207: pet.lo] Error 1
make[4]: Leaving directory '/home/chriselrod/Documents/libraries/pluto/pet'
make[3]: *** [Makefile:1250: all-recursive] Error 1
make[3]: Leaving directory '/home/chriselrod/Documents/libraries/pluto/pet'
make[2]: *** [Makefile:843: all] Error 2
make[2]: Leaving directory '/home/chriselrod/Documents/libraries/pluto/pet'
make[1]: *** [Makefile:521: all-recursive] Error 1
make[1]: Leaving directory '/home/chriselrod/Documents/libraries/pluto'
make: *** [Makefile:391: all] Error 2
This is with clang --version:
clang version 10.0.0 (https://github.com/llvm/llvm-project.git b406eab888021ade8b4e680d2cf45b82fca17a98)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
I don't have this problem with clang 9.0. While clang 10 hasn't been released yet, it seems unlikely that there will be many changes between the current 10 (rc4) and the final release.
This issue probably should have been opened at pet-for-pluto.
On the master branch, examples/heat-1d emits no output for testing (on a make test, out_orig is empty).
I tried to compile the 'pet' branch, but I get the following errors:
CXXLD pet_check_code
./.libs/libpet.so: undefined reference to `clang::DeclarationName::getAsString[abi:cxx11]() const'
./.libs/libpet.so: undefined reference to `llvm::sys::getDefaultTargetTriple[abi:cxx11]()'
./.libs/libpet.so: undefined reference to `clang::getClangFullVersion[abi:cxx11]()'
./.libs/libpet.so: undefined reference to `clang::DeclarationNameInfo::getAsString[abi:cxx11]() const'
./.libs/libpet.so: undefined reference to `clang::QualType::getAsString[abi:cxx11](clang::Type const*, clang::Qualifiers)'
collect2: error: ld returned 1 exit status
Makefile:888: recipe for target 'pet_check_code' failed
I tried both Clang 3.4 and Clang 3.3, as the front-page README says it only works with versions up to 3.4.
Should I use a different version of Clang?
Which one?
Thanks in advance!
I got the following error when I run polycc with heat-3d:
$ ../../polycc heat-3d.c --distmem --mpiomp --commopt_fop --tile --isldep --lastwriter --cloogsh --timereport -o heat-3d.distopt_fop.c
[pluto] Turning on lastwriter
[pluto] Assuming data arrays are declared globally; turn on variables_not_global (and include macro definitions) otherwise
[Clan] Error: syntax error at line 95, column 28.
Error extracting polyhedra from source file: 'heat-3d.c'
heat-1d and heat-2d report similar errors (both point to the modulo (%) operator).
The pluto+ branch no longer works after the merge. This was known at merge time, but it was committed so that it could be fixed in steps. All test cases on a 'make test' fail.
[Failed] test/fusion10.c!
test/negparam.c free(): double free detected in tcache 2
FileCheck error: '-' is empty.
FileCheck command line: FileCheck test/negparam.c
[Failed] test/negparam.c!
test/nodep.c free(): double free detected in tcache 2
FileCheck error: '-' is empty.
FileCheck command line: FileCheck test/nodep.c
[Failed] test/nodep.c!
test/noloop.c [Passed]
test/seidel.c free(): double free detected in tcache 2
^C[Failed] test/seidel.c!
test/seq.c free(): double free detected in tcache 2
^C^C[Failed] test/seq.c!
test/shift.c free(): double free detected in tcache 2
^C^C[Failed] test/shift.c!
test/simple.c ^C^C[Failed] test/simple.c!
test/tricky1.c free(): double free detected in tcache 2
FileCheck error: '-' is empty.
FileCheck command line: FileCheck test/tricky1.c
[Failed] test/tricky1.c!
test/tricky2.c FileCheck error: '-' is empty.
FileCheck command line: FileCheck test/tricky2.c
[Failed] test/tricky2.c!
test/tricky3.c free(): double free detected in tcache 2
FileCheck error: '-' is empty.
FileCheck command line: FileCheck test/tricky3.c
[Failed] test/tricky3.c!
test/tricky4.c FileCheck error: '-' is empty.
FileCheck command line: FileCheck test/tricky4.c
[Failed] test/tricky4.c!
test/tce-4index-transform.c free(): double free detected in tcache 2
^C^C[Failed] test/tce-4index-transform.c!
test/wavefront.c free(): double free detected in tcache 2
^C^C[Failed] test/wavefront.c!
test/pluto+/dep-1,1.c free(): double free detected in tcache 2
^C
^C[Failed] test/pluto+/dep-1,1.c!
Hello,
I'm on a pretty recent version of Ubuntu (18.04.3 LTS).
I installed clang-9 (which installs llvm-9). The same problem appears
with clang-6.0.
I followed the build process for the devel branch, as described in
https://github.com/bondhugula/pluto
The process stops while running ./configure (with no arguments). Here are
the last lines it prints:
checking which clang to use... system
checking for llvm-config... yes
checking for main in -lLLVM-9.0.0... yes
checking clang/Basic/SourceLocation.h usability... no
checking clang/Basic/SourceLocation.h presence... no
checking for clang/Basic/SourceLocation.h... no
configure: error: clang header file not found
checking for pet/Makefile... no
configure: error: configure in pet/ failed
To get to this point, I also had to make symbolic links to FileCheck-9 and
llvm-config-9 to get rid of the "-9" (otherwise, the installation build blocks
earlier, as it requires FileCheck and llvm-config without the version number).
Addendum: the stable version compiles with no problems.
Best,
dpotop
$ valgrind --leak-check=full ./src/pluto test/gemver.c --dfp --hybridfuse --notile --noparallel
[pluto] Auto-transformation time: 0.525126s
[pluto] Total FCG Construction Time: 0.262769s
[pluto] Total FCG Colouring Time: 0.052208s
[pluto] Total Scaling + Shifting time: 0.040810s
[pluto] Total Skewing time: 0.046699s
[pluto] Total constraint solving time (LP/MIP/ILP) time: 0.267857s
[pluto] Code generation time: 0.418973s
[pluto] Other/Misc time: 1.653294s
[pluto] Total time: 2.733726s
[pluto] All times: 0.136333 0.525126 0.418973 1.653294
==25470==
==25470== HEAP SUMMARY:
==25470== in use at exit: 106,495 bytes in 52 blocks
==25470== total heap usage: 200,042 allocs, 199,990 frees, 28,838,540 bytes allocated
==25470==
==25470== 64 bytes in 4 blocks are definitely lost in loss record 40 of 49
==25470== at 0x483880B: malloc (vg_replace_malloc.c:309)
==25470== by 0x42716D: dims_to_be_skewed (in /home/uday/git/pluto/src/pluto)
==25470== by 0x42797A: introduce_skew (in /home/uday/git/pluto/src/pluto)
==25470== by 0x43AAF8: pluto_auto_transform (in /home/uday/git/pluto/src/pluto)
==25470== by 0x42C2B5: main (in /home/uday/git/pluto/src/pluto)
==25470==
==25470== LEAK SUMMARY:
==25470== definitely lost: 64 bytes in 4 blocks
==25470== indirectly lost: 0 bytes in 0 blocks
==25470== possibly lost: 0 bytes in 0 blocks
==25470== still reachable: 106,431 bytes in 48 blocks
==25470== suppressed: 0 bytes in 0 blocks
==25470== Reachable blocks (those to which a pointer was found) are not shown.
==25470== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==25470==
==25470== For counts of detected and suppressed errors, rerun with: -v
Problem linking pet and clang apparently.
CXXLD libpet.la
/usr/bin/ld: /home/thiago/llvm-3.4/lib/libclangFrontend.a(CompilerInstance.cpp.o): unrecognized relocation (0x2a) in section `.text'
/usr/bin/ld: final link failed: Bad value
LLVM-3.4 and Clang-3.4 were compiled with gcc-4.9.3.
configure finishes without errors.
Any ideas?
Thanks,
Thiago
The current Pluto, using PET as the parser, fails to extract access functions during dependence analysis for code such as the following:
S1: C[i][j] = 0;
S2: C[i][j] += A[i][k] * B[k][j]
The RAW dependence between S1 and S2 has both its source and destination access functions set to NULL, when they should be C[i][j].
The reason is that, when setting access functions for each dependence, Pluto uses the following condition (program.cpp, lines 422-423):
if (options->isldepaccesswise &&
    (stmts[dep->src]->reads != NULL && stmts[dep->dest]->reads != NULL))
When using Clan as the parser, for S1 with nreads = 0, it still mallocs 0 bytes and assigns stmt->reads a unique (possibly non-NULL) pointer (osl_pluto.c, line 833):
stmt->reads = (PlutoAccess **)malloc(stmt->nreads * sizeof(PlutoAccess *));
However, when using PET, stmt->reads is currently left as NULL (program.cpp, lines 555-561):
if (stmt->nreads > 0) {
  stmt->reads = (PlutoAccess **)malloc(stmt->nreads * sizeof(PlutoAccess *));
}
Therefore, when PET is the parser, the dependence analysis skips the access-function assignment for such dependences.
I tried to perform some experiments with Pluto (pet branch) and a Red-Black Gauss-Seidel stencil, but the resulting code is invalid.
For the input code:
#pragma scop
for (t = 0; t < T; t++) {
  for (i = 1; i < N+1; i++)
    for (j = 1; j < N+1; j++)
      for (k = 1; k < N+1; k++)
        if ((i+j+k)%2 == 0)
          A[i][j][k] = 0.2 * A[i][j][k]
                     + 0.16 * R[i][j][k]
                     + 0.13 * (A[i - 1][j][k] + A[i][j - 1][k] + A[i][j][k - 1] +
                               A[i + 1][j][k] + A[i][j + 1][k] + A[i][j][k + 1]);
  for (i = 1; i < N+1; i++)
    for (j = 1; j < N+1; j++)
      for (k = 1; k < N+1; k++)
        if ((i+j+k)%2 == 1)
          A[i][j][k] = 0.2 * A[i][j][k]
                     + 0.16 * R[i][j][k]
                     + 0.13 * (A[i - 1][j][k] + A[i][j - 1][k] + A[i][j][k - 1] +
                               A[i + 1][j][k] + A[i][j + 1][k] + A[i][j][k + 1]);
}
#pragma endscop
Pluto generates (using the call "polycc kernel.c --pet -o test.c") the
following code, which does not take the conditions into account:
for (t2=0;t2<=3;t2++) {
  for (t4=1;t4<=128;t4++) {
    for (t5=1;t5<=128;t5++) {
      lbv=1;
      ubv=128;
#pragma ivdep
#pragma vector always
      for (t6=lbv;t6<=ubv;t6++) {
        A[ t4][ t5][ t6] = (((0.2 * A[ t4][ t5][ t6]) + (0.16 * R[ t4][ t5][ t6])) + (0.13 * (((((A[ t4 - 1][ t5][ t6] + A[ t4][ t5 - 1][ t6]) + A[ t4][ t5][ t6 - 1]) + A[ t4 + 1][ t5][ t6]) + A[ t4][ t5 + 1][ t6]) + A[ t4][ t5][ t6 + 1])));;
      }
    }
  }
  for (t4=1;t4<=128;t4++) {
    for (t5=1;t5<=128;t5++) {
      lbv=1;
      ubv=128;
#pragma ivdep
#pragma vector always
      for (t6=lbv;t6<=ubv;t6++) {
        A[ t4][ t5][ t6] = (((0.2 * A[ t4][ t5][ t6]) + (0.16 * R[ t4][ t5][ t6])) + (0.13 * (((((A[ t4 - 1][ t5][ t6] + A[ t4][ t5 - 1][ t6]) + A[ t4][ t5][ t6 - 1]) + A[ t4 + 1][ t5][ t6]) + A[ t4][ t5 + 1][ t6]) + A[ t4][ t5][ t6 + 1])));;
      }
    }
  }
}
The full example can be found here: https://pastebin.com/yi9SLCqD
For the following SCoP, a call to polycc (polycc kernel.c --lastwriter --pet -o kernel.out.c) results in a segmentation fault (output: polycc: line 54: 26951 Segmentation fault):
#pragma scop
for (i = 1; i < N+1; i++)
  for (j = 1; j < N+1; j++)
    for (k = 1; k < N+1; k++)
      A2[i][j][k] = 0.2 * A1[i][j][k]
                  + 0.16 * R[i][j][k]
                  + 0.13 * (A1[i - 1][j][k] + A1[i][j - 1][k] + A1[i][j][k - 1] +
                            A1[i + 1][j][k] + A1[i][j + 1][k] + A1[i][j][k + 1]);
for (i = 1; i < N+1; i++)
  for (j = 1; j < N+1; j++)
    for (k = 1; k < N+1; k++)
      A1[i][j][k] = 0.2 * A2[i][j][k]
                  + 0.16 * R[i][j][k]
                  + 0.13 * (A2[i - 1][j][k] + A2[i][j - 1][k] + A2[i][j][k - 1] +
                            A2[i + 1][j][k] + A2[i][j + 1][k] + A2[i][j][k + 1]);
#pragma endscop
Yet another attempt to compile Pluto, this time on Fedora. The environment seems saner than on macOS and Ubuntu (for instance, FileCheck is part of llvm-devel). However, the configuration process fails on pet, after a few issues that I was able to work around.
The issues:
It seems that the pet configuration process is looking for a file called clang/Basic/SourceLocation.h that does not exist in clang-9.0.0.
Although diamond tiling is clearly possible here, it's not being performed (the scalar dimension at the top likely confuses the detection).
$ ./test_libpluto
[...]
*** TEST CASE 6
[pluto] Diamond tiling not possible/useful
[pluto] Affine transformations
T(S1): (i0, i1+i2, i1)
loop types (loop, loop, loop)
Outermost tilable bands: 1 bands
(t1, t2, t3, ) with stmts {S1, }
Innermost tilable bands: 1 bands
(t1, t2, t3, ) with stmts {S1, }
[Pluto] After tiling:
T(S1): (i0/32, (i1+i2)/32, i1/32, i0, i1+i2, i1)
loop types (loop, loop, loop, loop, loop, loop)
[Pluto] After intra-tile optimize
T(S1): (i0/32, (i1+i2)/32, i1/32, i1+i2, i1, i0)
loop types (loop, loop, loop, loop, loop, loop)
[pluto_mark_parallel] 1 parallel loops
t1 {loop with stmts: S1, }
[pluto] Auto-transformation time: 0.002858s
[pluto] Other/Misc time: 0.008955s
[pluto] Total time: 0.011813s
[R, T] -> { S_0[i0, i1, i2] -> [o0, o1, o2, i1 + i2, i1, i0] : -31 + i0 <= 32o0 <= i0 and -31 + i1 + i2 <= 32o1 <= i1 + i2 and -31 + i1 <= 32o2 <= i1 }
Diamond tiling isn't working with pluto+. Reproduce, for example, with test/heat-2d.c.
[pluto] Diamond tiling not possible/useful
[pluto] Affine transformations [<iter coeff's> ]
T(S1): (t, t+i, t+j)
loop types (loop, loop, loop)
[Pluto] After tiling:
T(S1): (t/32, (t+i)/32, (t+j)/32, t, t+i, t+j)
loop types (loop, loop, loop, loop, loop, loop)
[Pluto] After tile scheduling:
T(S1): (t/32+(t+i)/32, (t+i)/32, (t+j)/32, t, t+i, t+j)
loop types (loop, loop, loop, loop, loop, loop)
[pluto] using statement-wise -fs/-ls options: S1(4,6),
[Pluto] Output written to heat-2d.pluto.c
I am automating the Pluto build process with my new Alien::Pluto software for Perl:
https://github.com/wbraswell/alien-pluto
However, it fails to build due to a missing makeinfo command, which is not right, because I have not changed any documentation files (or any other files whatsoever) in the Pluto tarball...
https://api.travis-ci.org/v3/job/316614841/log.txt
Making all in candl
make[3]: Entering directory `/home/travis/build/wbraswell/alien-pluto/_alien/build_73oc/pluto-0.11.4/candl'
Making all in doc
make[4]: Entering directory `/home/travis/build/wbraswell/alien-pluto/_alien/build_73oc/pluto-0.11.4/candl/doc'
make[5]: Entering directory `/home/travis/build/wbraswell/alien-pluto/_alien/build_73oc/pluto-0.11.4/candl/doc'
MAKEINFO candl.info
/home/travis/build/wbraswell/alien-pluto/_alien/build_73oc/pluto-0.11.4/candl/autoconf/missing: line 81: makeinfo: command not found
WARNING: 'makeinfo' is missing on your system.
You should only need it if you modified a '.texi' file, or
any other file indirectly affecting the aspect of the manual.
You might want to install the Texinfo package:
<http://www.gnu.org/software/texinfo/>
The spurious makeinfo call might also be the consequence of
using a buggy 'make' (AIX, DU, IRIX), in which case you might
want to install GNU make:
<http://www.gnu.org/software/make/>
make[5]: *** [candl.info] Error 127
make[5]: Leaving directory `/home/travis/build/wbraswell/alien-pluto/_alien/build_73oc/pluto-0.11.4/candl/doc'
make[4]: *** [all-recursive] Error 1
make[4]: Leaving directory `/home/travis/build/wbraswell/alien-pluto/_alien/build_73oc/pluto-0.11.4/candl/doc'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory `/home/travis/build/wbraswell/alien-pluto/_alien/build_73oc/pluto-0.11.4/candl'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/travis/build/wbraswell/alien-pluto/_alien/build_73oc/pluto-0.11.4'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/travis/build/wbraswell/alien-pluto/_alien/build_73oc/pluto-0.11.4'
external command failed at /home/travis/perl5/perlbrew/perls/5.10/lib/site_perl/5.10.1/Alien/Build/CommandSequence.pm line 87.
make: *** [_alien/mm/build] Error 2
...
The command "make" exited with 2.
I am NOT running a buggy make on Travis:
$ make -v
GNU Make 3.81
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
This program built for x86_64-pc-linux-gnu
The bottom line is that no part of Pluto should be trying to call makeinfo
during the build process.
Please let me know as soon as this issue has been resolved so that I can test it again and then publish the Alien::Pluto software.
Thanks in advance! :-)
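Until that is fixed, a standard autotools workaround (my assumption; not verified against this exact tarball) is to override MAKEINFO so the candl/doc rule becomes a no-op instead of invoking the missing tool:

```shell
# Tell the build to use 'true' in place of makeinfo; the candl.info rule
# then succeeds without Texinfo installed. Paths follow the Travis log above.
cd pluto-0.11.4
./configure MAKEINFO=true
make MAKEINFO=true
```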
None of the distmem targets in examples/ work, starting with missing declarations for the timing variables (t_comp_start, etc...). These declarations are emitted at first but later disappear from the generated code. We also lack test cases to catch such regressions; they could be added to the FileCheck-based tests we now have (filecheck-test.sh).
cd examples/seidel
make distopt
[pluto] using statement-wise -fs/-ls options: S1(2,3), S2(2,3), S3(2,3),
[CLooG] INFO: 1 dimensions (over 7) are scalar.
[pluto] using statement-wise -fs/-ls options: S1(5,7), S2(6,7), S3(6,7), S4(6,7), S5(6,7), S6(6,7),
[Pluto] Output written to seidel.distopt.c
[pluto] using statement-wise -fs/-ls options: S1(5,7), S2(6,7), S3(6,7), S4(6,7), S5(6,7), S6(6,7),
[Pluto] Output written to seidel.distopt.c
OMPI_CC=gcc mpicc -D__MPI -O3 -march=native -mtune=native -ftree-vectorize -fopenmp -DTIME seidel.distopt.c sigma_seidel.distopt.c pi_seidel.distopt.c
../../polyrt/polyrt.c -o distopt -I ../../polyrt -lm
seidel.distopt.c: In function ‘main’:
seidel.distopt.c:118:15: error: ‘t_comp_start’ undeclared (first use in this function); did you mean ‘t_start’?
IF_TIME(t_comp_start = rtclock());
^~~~~~~~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c:118:15: note: each undeclared identifier is reported only once for each function it appears in
IF_TIME(t_comp_start = rtclock());
^~~~~~~~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c:119:7: error: ‘_lb_dist’ undeclared (first use in this function); did you mean ‘va_list’?
_lb_dist = max(ceild(2 * t1 - 2, 5), ceild(10 * t1 - T + 1, 10));
^~~~~~~~
va_list
seidel.distopt.c:120:7: error: ‘_ub_dist’ undeclared (first use in this function); did you mean ‘u_int’?
_ub_dist =
^~~~~~~~
u_int
seidel.distopt.c:122:7: warning: implicit declaration of function ‘polyrt_loop_dist’ [-Wimplicit-function-declaration]
polyrt_loop_dist(_lb_dist, _ub_dist, nprocs, my_rank, &lbd_t3, &ubd_t3);
^~~~~~~~~~~~~~~~
seidel.distopt.c:122:44: error: ‘nprocs’ undeclared (first use in this function)
polyrt_loop_dist(_lb_dist, _ub_dist, nprocs, my_rank, &lbd_t3, &ubd_t3);
^~~~~~
seidel.distopt.c:122:52: error: ‘my_rank’ undeclared (first use in this function)
polyrt_loop_dist(_lb_dist, _ub_dist, nprocs, my_rank, &lbd_t3, &ubd_t3);
^~~~~~~
seidel.distopt.c:122:62: error: ‘lbd_t3’ undeclared (first use in this function); did you mean ‘lbd’?
polyrt_loop_dist(_lb_dist, _ub_dist, nprocs, my_rank, &lbd_t3, &ubd_t3);
^~~~~~
lbd
seidel.distopt.c:122:71: error: ‘ubd_t3’ undeclared (first use in this function); did you mean ‘uid_t’?
polyrt_loop_dist(_lb_dist, _ub_dist, nprocs, my_rank, &lbd_t3, &ubd_t3);
^~~~~~
uid_t
seidel.distopt.c:123:64: error: ‘lbd_t4’ undeclared (first use in this function); did you mean ‘lbd’?
#pragma omp parallel for private(lbv, ubv, _lb_dist, _ub_dist, lbd_t4, ubd_t4,
^~~~~~
lbd
seidel.distopt.c:123:72: error: ‘ubd_t4’ undeclared (first use in this function); did you mean ‘uid_t’?
#pragma omp parallel for private(lbv, ubv, _lb_dist, _ub_dist, lbd_t4, ubd_t4,
^~~~~~
uid_t
seidel.distopt.c:124:38: error: ‘lbd_t5’ undeclared (first use in this function); did you mean ‘lbd’?
t4, lbd_t5, ubd_t5, t5, lbd_t6, ubd_t6, t6,
^~~~~~
lbd
seidel.distopt.c:124:46: error: ‘ubd_t5’ undeclared (first use in this function); did you mean ‘uid_t’?
t4, lbd_t5, ubd_t5, t5, lbd_t6, ubd_t6, t6,
^~~~~~
uid_t
seidel.distopt.c:124:58: error: ‘lbd_t6’ undeclared (first use in this function); did you mean ‘lbd’?
t4, lbd_t5, ubd_t5, t5, lbd_t6, ubd_t6, t6,
^~~~~~
lbd
seidel.distopt.c:124:66: error: ‘ubd_t6’ undeclared (first use in this function); did you mean ‘uid_t’?
t4, lbd_t5, ubd_t5, t5, lbd_t6, ubd_t6, t6,
^~~~~~
uid_t
seidel.distopt.c:125:34: error: ‘lbd_t7’ undeclared (first use in this function); did you mean ‘lbd’?
lbd_t7, ubd_t7, t7)
^~~~~~
lbd
seidel.distopt.c:125:42: error: ‘ubd_t7’ undeclared (first use in this function); did you mean ‘uid_t’?
lbd_t7, ubd_t7, t7)
^~~~~~
uid_t
seidel.distopt.c:167:15: error: ‘t_comp’ undeclared (first use in this function)
IF_TIME(t_comp += rtclock() - t_comp_start);
^~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c:168:15: error: ‘t_pack_start’ undeclared (first use in this function); did you mean ‘t_start’?
IF_TIME(t_pack_start = rtclock());
^~~~~~~~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c:174:14: error: ‘__p’ undeclared (first use in this function)
for (__p = 0; __p < nprocs; __p++) {
^~~
seidel.distopt.c:175:11: error: ‘receiver_list’ undeclared (first use in this function)
receiver_list[__p] = 0;
^~~~~~~~~~~~~
seidel.distopt.c:177:9: warning: implicit declaration of function ‘sigma_a_0’ [-Wimplicit-function-declaration]
sigma_a_0(t1, t3, T, N, my_rank, nprocs, receiver_list);
^~~~~~~~~
seidel.distopt.c:180:13: error: ‘send_count_a’ undeclared (first use in this function)
send_count_a = pack_a_0(t1, t3, send_buf_a, send_count_a);
^~~~~~~~~~~~
seidel.distopt.c:180:28: warning: implicit declaration of function ‘pack_a_0’ [-Wimplicit-function-declaration]
send_count_a = pack_a_0(t1, t3, send_buf_a, send_count_a);
^~~~~~~~
seidel.distopt.c:180:45: error: ‘send_buf_a’ undeclared (first use in this function); did you mean ‘setvbuf’?
send_count_a = pack_a_0(t1, t3, send_buf_a, send_count_a);
^~~~~~~~~~
setvbuf
seidel.distopt.c:185:15: error: ‘t_pack’ undeclared (first use in this function)
IF_TIME(t_pack += rtclock() - t_pack_start);
^~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c:189:15: error: ‘t_comm_start’ undeclared (first use in this function); did you mean ‘t_start’?
IF_TIME(t_comm_start = rtclock());
^~~~~~~~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c:197:15: error: ‘t_comm’ undeclared (first use in this function)
IF_TIME(t_comm += rtclock() - t_comm_start);
^~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c:200:9: error: ‘send_counts_a’ undeclared (first use in this function)
send_counts_a[__p] = receiver_list[__p] ? send_count_a : 0;
^~~~~~~~~~~~~
seidel.distopt.c:202:7: warning: implicit declaration of function ‘MPI_Alltoall’ [-Wimplicit-function-declaration]
MPI_Alltoall(send_counts_a, 1, MPI_INT, recv_counts_a, 1, MPI_INT,
^~~~~~~~~~~~
seidel.distopt.c:202:38: error: ‘MPI_INT’ undeclared (first use in this function)
MPI_Alltoall(send_counts_a, 1, MPI_INT, recv_counts_a, 1, MPI_INT,
^~~~~~~
seidel.distopt.c:202:47: error: ‘recv_counts_a’ undeclared (first use in this function)
MPI_Alltoall(send_counts_a, 1, MPI_INT, recv_counts_a, 1, MPI_INT,
^~~~~~~~~~~~~
seidel.distopt.c:203:20: error: ‘MPI_COMM_WORLD’ undeclared (first use in this function)
MPI_COMM_WORLD);
^~~~~~~~~~~~~~
seidel.distopt.c:204:7: error: ‘req_count’ undeclared (first use in this function)
req_count = 0;
^~~~~~~~~
seidel.distopt.c:207:19: error: ‘__total_count’ undeclared (first use in this function)
IF_TIME(__total_count += send_count_a);
^~~~~~~~~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c:208:11: warning: implicit declaration of function ‘MPI_Isend’ [-Wimplicit-function-declaration]
MPI_Isend(send_buf_a, send_count_a, MPI_DOUBLE, __p, 123,
^~~~~~~~~
seidel.distopt.c:208:47: error: ‘MPI_DOUBLE’ undeclared (first use in this function)
MPI_Isend(send_buf_a, send_count_a, MPI_DOUBLE, __p, 123,
^~~~~~~~~~
seidel.distopt.c:209:38: error: ‘reqs’ undeclared (first use in this function); did you mean ‘read’?
MPI_COMM_WORLD, &reqs[req_count++]);
^~~~
read
seidel.distopt.c:214:11: warning: implicit declaration of function ‘MPI_Irecv’ [-Wimplicit-function-declaration]
MPI_Irecv(recv_buf_a + displs_a[__p], recv_counts_a[__p], MPI_DOUBLE,
^~~~~~~~~
seidel.distopt.c:214:21: error: ‘recv_buf_a’ undeclared (first use in this function); did you mean ‘setvbuf’?
MPI_Irecv(recv_buf_a + displs_a[__p], recv_counts_a[__p], MPI_DOUBLE,
^~~~~~~~~~
setvbuf
seidel.distopt.c:214:34: error: ‘displs_a’ undeclared (first use in this function)
MPI_Irecv(recv_buf_a + displs_a[__p], recv_counts_a[__p], MPI_DOUBLE,
^~~~~~~~
seidel.distopt.c:218:7: warning: implicit declaration of function ‘MPI_Waitall’ [-Wimplicit-function-declaration]
MPI_Waitall(req_count, reqs, stats);
^~~~~~~~~~~
seidel.distopt.c:218:36: error: ‘stats’ undeclared (first use in this function)
MPI_Waitall(req_count, reqs, stats);
^~~~~
seidel.distopt.c:221:9: error: ‘curr_displs_a’ undeclared (first use in this function)
curr_displs_a[__p] = 0;
^~~~~~~~~~~~~
seidel.distopt.c:224:15: error: ‘t_unpack_start’ undeclared (first use in this function); did you mean ‘t_start’?
IF_TIME(t_unpack_start = rtclock());
^~~~~~~~~~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c:229:9: error: ‘proc’ undeclared (first use in this function); did you mean ‘putc’?
proc = pi_0(t1, t3, T, N, nprocs);
^~~~
putc
seidel.distopt.c:229:16: warning: implicit declaration of function ‘pi_0’; did you mean ‘pipe’? [-Wimplicit-function-declaration]
proc = pi_0(t1, t3, T, N, nprocs);
^~~~
pipe
seidel.distopt.c:237:37: warning: implicit declaration of function ‘unpack_a_0’ [-Wimplicit-function-declaration]
curr_displs_a[proc] = unpack_a_0(
^~~~~~~~~~
seidel.distopt.c:244:15: error: ‘t_unpack’ undeclared (first use in this function); did you mean ‘truncate’?
IF_TIME(t_unpack += rtclock() - t_unpack_start);
^~~~~~~~
seidel.distopt.c:24:22: note: in definition of macro ‘IF_TIME’
#define IF_TIME(foo) foo;
^~~
seidel.distopt.c: In function ‘write_out’:
seidel.distopt.c:485:7: warning: implicit declaration of function ‘MPI_Gather’ [-Wimplicit-function-declaration]
MPI_Gather(&lw_count_a, 1, MPI_INT, lw_recv_counts_a, 1, MPI_INT, 0,
^~~~~~~~~~
seidel.distopt.c:485:34: error: ‘MPI_INT’ undeclared (first use in this function)
MPI_Gather(&lw_count_a, 1, MPI_INT, lw_recv_counts_a, 1, MPI_INT, 0,
^~~~~~~
seidel.distopt.c:486:18: error: ‘MPI_COMM_WORLD’ undeclared (first use in this function)
MPI_COMM_WORLD);
^~~~~~~~~~~~~~
seidel.distopt.c:487:7: warning: implicit declaration of function ‘MPI_Gatherv’ [-Wimplicit-function-declaration]
MPI_Gatherv(lw_buf_a, lw_count_a, MPI_DOUBLE, lw_recv_buf_a,
^~~~~~~~~~~
seidel.distopt.c:487:41: error: ‘MPI_DOUBLE’ undeclared (first use in this function)
MPI_Gatherv(lw_buf_a, lw_count_a, MPI_DOUBLE, lw_recv_buf_a,
^~~~~~~~~~
make: *** [../common.mk:284: distopt] Error 1
I tried compiling the fdtd-2d program in the examples folder. I only changed the matrix sizes by setting the constants nx and ny to 10240, and the compilation failed.
This doesn't occur when the problem size is small.
Also, in my case, running the command
mpirun -np 16 -hostfile ../hosts ./dist
results in 16 executions with one rank each.
The command I used:
../../polycc fdtd-2d.c --distmem --timereport --nocommopt --tile --isldep --lastwriter --indent -o fdtd-2d.dist.c
OMPI_CC=gcc mpicc -D__MPI -O3 -march=native -mtune=native -ftree-vectorize -fopenmp -DTIME fdtd-2d.dist.c pi_fdtd-2d.dist.c sigma_fdtd-2d.dist.c \
../../polyrt/polyrt.c -o dist -I ../../polyrt -lm
The error I got:
In file included from fdtd-2d.dist.c:4:0:
pi_defs.h:6:0: warning: "_UB_REPLACE_ME_DISTLOOG0t3" redefined [enabled by default]
#define _UB_REPLACE_ME_DISTLOOG0t3 min(min(floord(tmax+ny-2,32),floord(32*t1+ny+30,64)),t1)
^
pi_defs.h:4:0: note: this is the location of the previous definition
#define _UB_REPLACE_ME_DISTLOOG0t3 min(min(floord(tmax+ny-1,32),floord(32*t1+ny+31,64)),t1)
^
In file included from pi_fdtd-2d.dist.c:4:0:
pi_defs.h:6:0: warning: "_UB_REPLACE_ME_DISTLOOG0t3" redefined [enabled by default]
#define _UB_REPLACE_ME_DISTLOOG0t3 min(min(floord(tmax+ny-2,32),floord(32*t1+ny+30,64)),t1)
^
pi_defs.h:4:0: note: this is the location of the previous definition
#define _UB_REPLACE_ME_DISTLOOG0t3 min(min(floord(tmax+ny-1,32),floord(32*t1+ny+31,64)),t1)
^
/tmp/cc7wD9xO.o: In function `read_grid_size':
polyrt.c:(.text+0xa2): relocation truncated to fit: R_X86_64_32S against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
/tmp/cc7wD9xO.o: In function `polyrt_init_grid_size':
polyrt.c:(.text+0x459): relocation truncated to fit: R_X86_64_PC32 against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
polyrt.c:(.text+0x463): relocation truncated to fit: R_X86_64_PC32 against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
polyrt.c:(.text+0x46d): relocation truncated to fit: R_X86_64_PC32 against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
polyrt.c:(.text+0x477): relocation truncated to fit: R_X86_64_PC32 against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
polyrt.c:(.text+0x481): relocation truncated to fit: R_X86_64_PC32 against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
polyrt.c:(.text+0x48b): relocation truncated to fit: R_X86_64_PC32 against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
polyrt.c:(.text+0x495): relocation truncated to fit: R_X86_64_PC32 against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
polyrt.c:(.text+0x49f): relocation truncated to fit: R_X86_64_PC32 against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
polyrt.c:(.text+0x4a9): relocation truncated to fit: R_X86_64_PC32 against symbol `grid_size' defined in COMMON section in /tmp/cc7wD9xO.o
polyrt.c:(.text+0x4b3): additional relocation overflows omitted from the output
collect2: error: ld returned 1 exit status
make: *** [dist] Error 1
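The "relocation truncated to fit" errors at nx = ny = 10240 are the classic symptom of static data outgrowing the 2 GiB that the default x86-64 small code model can address. A possible workaround (my assumption; I have not verified it against this Makefile) is to rebuild everything with a larger code model:

```shell
# -mcmodel=medium keeps code in the small model but lets large data
# objects (such as the grid_size-related COMMON arrays in polyrt.c)
# use 64-bit relocations. The rest of the line matches the Makefile above.
OMPI_CC=gcc mpicc -D__MPI -O3 -march=native -mtune=native -ftree-vectorize \
  -fopenmp -DTIME -mcmodel=medium \
  fdtd-2d.dist.c pi_fdtd-2d.dist.c sigma_fdtd-2d.dist.c \
  ../../polyrt/polyrt.c -o dist -I ../../polyrt -lm
```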
On the distmem branch, 'make distomp' fails in every one of the examples/ directories.
I tried to distribute this implementation of warp_affine, which I could parallelize with Pluto:
float A00, A01, A10, A11;
int coord_00_c, coord_00_r, coord_01_c, coord_01_r;
int coord_10_c, coord_10_r, coord_11_c, coord_11_r;
float o_r, o_c, r, c;
double t_start, t_end;
init_array();
IF_TIME(t_start = rtclock());
int n_r, n_c;
int i, j;
const float a00 = 0.1f;
const float a01 = 0.1f;
const float a10 = 0.1f;
const float a11 = 0.1f;
const float b00 = 0.1f;
const float b10 = 0.1f;
#pragma scop
for (n_r = 0; n_r < ROWS; n_r++) {
  for (n_c = 0; n_c < COLS; n_c++) {
    o_r = a11 * n_r + a10 * n_c + b00;
    o_c = a01 * n_r + a00 * n_c + b10;
    r = o_r - floorf(o_r);
    c = o_c - floorf(o_c);
    coord_00_r = floorf(o_r);
    coord_00_c = floorf(o_c);
    coord_01_r = coord_00_r;
    coord_01_c = coord_00_c + 1;
    coord_10_r = coord_00_r + 1;
    coord_10_c = coord_00_c;
    coord_11_r = coord_00_r + 1;
    coord_11_c = coord_00_c + 1;
    coord_00_r = clamp(coord_00_r, 0, ROWS - 1);
    coord_00_c = clamp(coord_00_c, 0, COLS - 1);
    coord_01_r = clamp(coord_01_r, 0, ROWS - 1);
    coord_01_c = clamp(coord_01_c, 0, COLS - 1);
    coord_10_r = clamp(coord_10_r, 0, ROWS - 1);
    coord_10_c = clamp(coord_10_c, 0, COLS - 1);
    coord_11_r = clamp(coord_11_r, 0, ROWS - 1);
    coord_11_c = clamp(coord_11_c, 0, COLS - 1);
    A00 = src[coord_00_r][coord_00_c];
    A10 = src[coord_10_r][coord_10_c];
    A01 = src[coord_01_r][coord_01_c];
    A11 = src[coord_11_r][coord_11_c];
    dst[n_r][n_c] = mixf(mixf(A00, A10, r), mixf(A01, A11, r), c);
  }
}
#pragma endscop
and it failed with this error:
[CLooG] INFO: 1 dimensions (over 3) are scalar.
pluto: ast_transform.c:193: pluto_mark_parallel: Assertion `stmt->ploop_id == i' failed.
../../polycc: line 60: 5016 Aborted (core dumped) $pluto $*
Here is the folder that contains the whole code and all the required files; it also contains the full output: warp_affine.zip
I get the same error when I try to use a scalar variable with a reduction; here is an example:
int q, w, cc, e, r;
init_array();
float prod1;
#pragma scop
for (int q = 0; q < ROWS - 5; q++) {
  for (int w = 0; w < COLS - 5; w++) {
    for (int cc = 0; cc < 3; cc++) {
      prod1 = 0.0f;
      for (int r = 0; r < 5; r++) {
        prod1 += src[q + r][w][cc] * kernelX[r];
      }
      temp[q][w][cc] = prod1;
    }
  }
}
#pragma endscop
I also get './isl_list_templ.c:246: index out of bounds' when running 'make par'; I couldn't understand why.
Thanks.