
tmLQCD is a freely available software suite providing a set of tools to be used in lattice QCD simulations. It is mainly an HMC implementation (including PHMC and RHMC) for Wilson, Wilson Clover and Wilson twisted mass fermions, together with an inverter for different versions of the Dirac operator. The code is fully parallelised and ships with optimisations for various modern architectures, such as commodity PC clusters and the Blue Gene family.

Home Page: http://www.itkp.uni-bonn.de/~urbach/software.html

License: GNU General Public License v3.0


tmlqcd's Introduction

Here are some remarks on how to configure, compile and
install the tmLQCD programme suite. For more information, including how to run
the code, please read the documentation in the doc sub-directory.

CONFIGURE and COMPILE

It is recommended to build the code not in the source directory but in
a separate directory.

The lime library (tested with version 1.2.3) is needed to compile the
program. Please download it at

http://usqcd.jlab.org/usqcd-software/c-lime/

Configure and compile lime (for documentation see
http://usqcd.jlab.org/usqcd-docs/c-lime/) first.
Then use the configure option --with-limedir=<path-to-lime> (as in the
examples below) to tell tmLQCD where to find lime.

For more documentation please change into the doc directory and type
latex main.tex
and see the sections for configuring, installing and testing the code.

Here we have gathered some examples for some standard architectures.
Building the tmLQCD executables is a three-step procedure:

****************************************************************************

1) configure:

In your build directory type

path-to-the-sources/configure --help

to get an overview of the available options and switches. In
particular check out the prefix option for your installation path. 
What follows now are some examples for a few standard architectures.

- a scalar build on a P4 machine would look like:

path-to-the-sources/configure --disable-mpi --enable-sse2 --enable-p4 \
  --enable-gaugecopy --disable-newdiracop --with-limedir=<path-to-lime> \
  --with-lapack="<linker options needed for lapack>" \
  CC=<cc>

- Opteron with SSE2:

path-to-the-sources/configure --disable-mpi --enable-sse2 --enable-opteron \
  --enable-gaugecopy --disable-newdiracop --with-limedir=<path-to-lime> \
  --with-lapack="<linker options needed for lapack>" \
  CC=<cc>

- An MPI parallel (4dims) build on a P4 cluster:

path-to-the-sources/configure --enable-mpi --enable-sse2 --enable-p4 \
  --with-mpidimension=4 --enable-gaugecopy --disable-newdiracop \
  --with-limedir=<path-to-lime> --with-lapack="<linker options needed for lapack>" \
  CC=<mpicc>

- on the Munich Altix machine:

path-to-the-sources/configure --enable-mpi --with-mpidimension=4 \
  --with-limedir=<path-to-lime> --enable-newdiracop \
  --disable-shmem --with-lapack="<linker options needed for lapack>" \
  CC=mpicc CFLAGS="-mcpu=itanium2 -O3 -g -c99 -mtune=itanium2" 

For lapack on this machine please type
module load mkl


- on the HLRB ice installation use

path-to-the-sources/configure --enable-mpi --with-mpidimension=4 \
   --disable-sse2 --disable-p4  --with-limedir=<path-to-lime> \
   --enable-newdiracop --with-lapack="<linker options needed for lapack>" \
   CC="mpicc -std=c99" CFLAGS="-g"

where it is again important to use the Intel C compiler!

For lapack, first load the module mkl and then use

--with-lapack="-L$LIBRARY_PATH -llapack -lblas"

- on Blue Gene installations

For the Blue Gene L and P see the README.bg? files

For BG/Q you can enable QPX intrinsics with --enable-qpx, which takes
effect only with the XLC compiler.

You may enable or disable other configure options as needed. See the
documentation for more details.

****************************************************************************

2) make

Type `make` in your build directory.

If no error message appears during compilation, you should end up
with a few executables in the build directory, namely `hmc_tm`,
`invert` and `invert_doublet`.

****************************************************************************

3) make install

Type `make install`

to get the executables installed.



****************************************************************************
****************************************************************************

In the following we provide a "codemap", giving a short explanation
for the contents of each c-file:

****************************************************************************
top directory: apart from the main routines all routines are compiled into
	       the run-time library libhmc.

DML_crc32.c: invert, invert_doublet, hmc_tm
	     some helper functions to compute the SCIDAC 
	     checksum
D_psi.c:     invert, invert_doublet, hmc_tm
	     Wilson twisted mass Dirac operator, not even/odd 
	     preconditioned 
Hopping_Matrix.c: invert, invert_doublet, hmc_tm
	     Hopping matrix for the even/odd preconditioned 
	     Dirac operator
Hopping_Matrix_nocom.c: benchmark
	     Hopping matrix for the even/odd preconditioned 
	     Dirac operator, communication switched off
Nondegenerate_Matrix.c: invert_doublet, hmc_tm
	     operators needed for even/odd preconditioning 
	     the non-degenerate flavour doublet Dirac operator
Ptilde_nd.c: hmc_tm
	     the more precise polynomial $\tilde P$ needed for 
	     the PHMC for the non-degenerate flavour doublet
benchmark.c: main routine
	     benchmark code for D_psi and Hopping_Matrix
block.c:     experimental
boundary.c:  invert, invert_doublet, hmc_tm
	     implements the twisted boundary conditions for the
	     spinor fields
chebyshev_polynomial.c: experimental
chebyshev_polynomial_nd.c: hmc_tm
	     implements the generation of coefficients for the 
	     Chebyshev polynomial using the Clenshaw recursion 
	     relation
deriv_Sb.c:  hmc_tm
	     the variation of Q=gamma_5 D with respect to the 
	     gauge fields in the even/odd case 
deriv_Sb_D_psi.c: hmc_tm
	     the variation of Q=gamma_5 D with respect to the 
	     gauge fields in the non even/odd case 
det_monomial.c: hmc_tm
	     implements the functions needed for a det monomial
detratio_monomial.c: hmc_tm
	     implements the functions needed for a detratio monomial
poly_monomial.c: hmc_tm
             implements the functions needed for a POLY monomial 
             (PHMC for light degenerate quarks)
dml.c:       invert, invert_doublet, hmc_tm
	     some helper functions to compute the SCIDAC 
	     checksum
double2single.c: main routine
	     can convert a gauge field from double to single precision
single2double.c: main routine
	     can convert a gauge field from single to double precision
eigenvalues_bi.c: hmc_tm
	     computes eigenvalues of the mass non-degenerate two flavour 
	     Dirac operator
expo.c:      hmc_tm
	     implements the exponential function of an su(3) element
gamma.c:     invert, invert_doublet, hmc_tm
	     implements multiplication of gamma matrices and some useful
	     combination of those with a spinor field
gauge_io.c:  invert, invert_doublet, hmc_tm
	     IO routines for gauge fields 
gauge_monomial.c: hmc_tm
	     implements the functions needed for a gauge monomial
gen_sources.c: invert, invert_doublet, hmc_tm
	     implements the generation of source spinor fields
geometry_eo.c: invert, invert_doublet, hmc_tm
	     anything related to gauge and spinor field geometry
get_rectangle_staples.c: hmc_tm
             computes rectangular staples of gauge links as needed for
	     e.g. the Iwasaki gauge action and its derivative
get_staples.c: hmc_tm
             computes plaquette staples of gauge links as needed for
	     all gauge actions and their derivatives
getopt.c:    invert, invert_doublet, hmc_tm
	     needed for command line options
hmc_tm.c:    main routine
	     hmc_tm executable
hybrid_update.c: hmc_tm
	     implements the functions for the gauge field update and
	     the momenta update
init_bispinor_field.c 
init_chi_copy.c
init_chi_spinor_field.c
init_dirac_halfspinor.c
init_gauge_field.c
init_gauge_tmp.c
init_geometry_indices.c
init_moment_field.c
init_spinor_field.c
init_stout_smear_vars.c: invert, invert_doublet, hmc_tm
	     provide routines to allocate memory for the corresponding
	     objects
integrator.c: hmc_tm
	     implements the routines needed for the integrator in the
	     MD update
invert.c:    main routine
	     invert executable
invert_doublet.c: main routine
	     invert_doublet executable
invert_doublet_eo.c: invert_doublet
	     performs an inversion of the flavour doublet operator using
	     even/odd preconditioning and the CG solver
invert_eo.c: invert
	     performs an inversion of the Wilson twisted mass Dirac operator
	     using a solver as specified in the input file. Depending on the 
	     input file even/odd preconditioning is used or not
io.c:        invert, invert_doublet, hmc_tm
	     helper routines: some deprecated IO routines for gauge and
	     spinor fields, and the routine writing the initial stdout message
	     of the executables
io_utils.c:  invert, invert_doublet, hmc_tm
	     IO helper routines related to swap endian and checksums
linsolve.c:  hmc_tm
	     CG and bicgstab solvers as used only in the HMC
little_D.c:  experimental
measure_rectangles.c: hmc_tm
	     computes the gauge action related to the rectangular part
monomial.c:  hmc_tm
             provides the definition for monomials and initialisation functions
mpi_init.c:  invert, invert_doublet, hmc_tm, benchmark
	     MPI initialisation routine
ndpoly_monomial.c: hmc_tm
	     implements the functions needed for a ndpoly monomial
observables.c: hmc_tm, invert, invert_doublet
	     computes the gauge action related to the Wilson plaquette part
online_measurement.c: hmc_tm
	     anything related to online measurements
phmc.c:      hmc_tm
	     functions and variables as needed for the PHMC
polyakov_loop.c: hmc_tm
	     measures the Polyakov loop
propagator_io.c: invert, invert_doublet, hmc_tm
	     functions related to spinor field IO
ranlxd.c:    invert, invert_doublet, hmc_tm
	     RANLUX random number generator (64 Bit)
ranlxs.c:    invert, invert_doublet, hmc_tm
	     RANLUX random number generator (32 Bit)
read_input.l: invert, invert_doublet, hmc_tm
             definition of the input file parser (flex)
reweighting_factor.c: experimental
reweighting_factor_nd.c: experimental
sighandler.c: invert, invert_doublet, hmc_tm
	     handles signals related to illegal instructions
start.c:     invert, invert_doublet, hmc_tm
	     functions needed to give initial values to gauge and spinor fields
stout_smear.c: invert, invert_doublet
	     functions to stout smear a given gauge configuration
stout_smear_force.c: experimental
tm_operators.c: invert, invert_doublet, hmc_tm
	     operators needed for even/odd preconditioning the Wilson
	     twisted mass Dirac operator
update_backward_gauge.c: invert, invert_doublet, hmc_tm
	     functions to update the gauge copy
update_momenta.c: hmc_tm
	     function to update the momenta in the HMC MD part
update_tm.c: hmc_tm
	     the HMC MD part
xchange_2fields.c: invert, invert_doublet, hmc_tm
	     implements the MPI communication of two even/odd spinor fields
	     at once
xchange_deri.c: hmc_tm
	     implements the MPI communication of derivatives
xchange_field.c: invert, invert_doublet, hmc_tm
	     implements the MPI communication of a single even/odd spinor
	     field
xchange_gauge.c: invert, invert_doublet, hmc_tm
	     implements the MPI communication of the gauge field
xchange_halffield.c: invert, invert_doublet, hmc_tm
	     implements the MPI communication of a half spinor field
xchange_lexicfield.c: invert, invert_doublet, hmc_tm
	     implements the MPI communication of a single (full) spinor
	     field

****************************************************************************
the linalg directory: all routines here are compiled into the liblinalg
                      runtime library
                      capital letters are spinor fields, others scalars
add.c:                Q = R + S
assign.c:             R = S
assign_add_mul.c:     P = P + c Q with c complex
assign_add_mul_r.c:   P = P + c Q with c real
assign_add_mul_add_mul.c:   R = R + c1*S + c2*U with c1 and c2 complex variables
assign_add_mul_add_mul_r.c: R = R + c1*S + c2*U with c1 and c2 real variables
assign_diff_mul.c:    S=S-c*Q
assign_mul_add_mul_add_mul_add_mul_r.c: R = c1*R + c2*S + c3*U + c4*V
			 		with c1, c2, c3, c4 real variables
assign_mul_add_mul_add_mul_r.c:         R = c1*R + c2*S + c3*U 
					with c1, c2 and c3 real variables
assign_mul_add_mul_r.c:     R = c1*R + c2*S , c1 and c2 are real constants 
assign_mul_add_r.c:         R = c*R + S  c is a real constant
assign_mul_bra_add_mul_ket_add.c:       R = c2*(R + c1*S) + (*U)
					with c1 and c2 complex variables
assign_mul_bra_add_mul_ket_add_r.c:     R = c2*(R + c1*S) + (*U)
					with c1 and c2 real variables
assign_mul_bra_add_mul_r.c:             R = c1*(R + c2*S)
					with c1 and c2 real variables
comp_decomp.c:                          Splits the Bi-spinor R in the spinors S and T 
convert_eo_to_lexic.c:                  convert two even/odd spinors to one full spinor
diff.c:                 Q = R - S
diff_and_square_norm.c: Q = R - S and ||Q||^2
mattimesvec.c:          w = M*v for complex vectors w,v and complex square matrix M
mul.c:                  R = c*S, for complex c
mul_r.c:                R = c*S, for real c
mul_add_mul.c:          R = c1*S + c2*U , c1 and c2 are complex constants
mul_add_mul_r.c:        R = c1*S + c2*U , c1 and c2 are real constants
mul_diff_mul.c:         R = c1*S - c2*U , c1 and c2 are complex constants
mul_diff_mul_r.c:       R = c1*S - c2*U , c1 and c2 are real constants
mul_diff_r.c:           R = c1*S - U , c1 is a real constant 
scalar_prod.c:          c = (R, S)
scalar_prod_i.c:        c = Im(R, S)
scalar_prod_r.c:        c = Re(R, S)
square_and_prod_r.c:    Returns Re(R,S) and the square norm of S
square_norm.c:          c = ||Q||^2
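
As an illustration of the notation used in this list, here is a minimal sketch of three of the operations. It is not the tmLQCD code: plain arrays of C99 `double complex` stand in for spinor fields, the `_sketch` function names are hypothetical, and the convention (R, S) = sum_i conj(R_i) S_i for the scalar product is an assumption.

```c
#include <complex.h>
#include <stddef.h>

/* P = P + c*Q with real c (compare assign_add_mul_r.c). */
void assign_add_mul_r_sketch(double complex *P, double c,
                             const double complex *Q, size_t n) {
  for (size_t i = 0; i < n; i++) {
    P[i] += c * Q[i];
  }
}

/* c = Re(R, S), assuming (R, S) = sum_i conj(R_i) * S_i
 * (compare scalar_prod_r.c). */
double scalar_prod_r_sketch(const double complex *R,
                            const double complex *S, size_t n) {
  double acc = 0.0;
  for (size_t i = 0; i < n; i++) {
    acc += creal(conj(R[i]) * S[i]);
  }
  return acc;
}

/* c = ||Q||^2 (compare square_norm.c); plain summation, no compensation. */
double square_norm_sketch(const double complex *Q, size_t n) {
  double acc = 0.0;
  for (size_t i = 0; i < n; i++) {
    acc += creal(Q[i]) * creal(Q[i]) + cimag(Q[i]) * cimag(Q[i]);
  }
  return acc;
}
```

In the real library the loops run over structured spinor types rather than flat complex arrays, so this only mirrors the arithmetic, not the data layout.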

****************************************************************************
solver directory: all routines here are compiled into the libsolver
                  runtime library
		  the solvers are for spinor fields, if not indicated
		  otherwise.

Msap.c:                 experimental SAP preconditioner
bicgstab_complex.c:     BiCGstab for complex fields
bicgstabell.c:          experimental
cg_her.c:               CG solver for hermitian operators
cg_her_nd.c:            CG solver for hermitian heavy doublet operators
cgs_real.c:             CGS solver
chrono_guess.c:         routines for the chronological solver
dfl_projector.c:        experimental
diagonalise_general_matrix.c:  subroutine to diagonalise a complex n times n
                               matrix. Input is a complex matrix in _C_ like
                               order. Output is again _C_ like. Uses lapack
eigenvalues.c:          compute the nr_of_eigenvalues lowest eigenvalues
                        of (gamma5*D)^2
fgmres.c:               FGMRES (flexible GMRES) solver
gcr.c:                  GCR solver
gcr4complex.c:          GCR solver for complex fields
generate_dfl_subspace.c: experimental
gmres.c:                GMRES solver
gmres_dr.c:             GMRES-DR solver
gmres_precon.c:         GMRES usable for preconditioning other solvers (experimental)
gram-schmidt.c:         Gram-Schmidt orthonormalisation routines
jdher.c:                Jacobi Davidson for hermitian matrices (to compute EVs)
lu_solve.c:             compute the inverse of a matrix with LU decomposition
mr.c:                   MR solver
pcg_her.c:              PCG solver
poly_precon.c:          polynomial preconditioner using Chebyshev polynomials
			with complex argument
quicksort.c:            a quicksort routine
sub_low_ev.c:           routines to subtract exactly computed eigenvectors from
			a given spinor field

tmlqcd's People

Contributors

kostrzewa urbach deuzeman palao annube florian-burger uwenger ggscorzato kcichy sunpho84 opene alexandrou kpetrov meiyisi franzdirenzo knippsch vincentdrach chjost metivett accordini wiesechr grodid bknippsch elenagr jvolmer visviv m-schroeck lorenzoriggio nikela g-koutsou robfre21 finkenrath marcogarofalo gbergner digideskio id2359 marcuspetschlies rqzhang0 gnodvi aniketsen mfkiwl

tmlqcd's Issues

util/io.c contains gauge read functionality, can that be dropped

Inside util/io.c there exists a function

int read_lime_gauge_field_doubleprec(double * config, char * filename,
const int T, const int LX, const int LY, const int LZ) {

and the function:

int read_lime_gauge_field_singleprec(float * config, char * filename,
const int T, const int LX, const int LY, const int LZ){

I suspect both are to be removed, because this functionality can be found in the io/ directory. So unless I hear protests, I will do so.

define debug levels

Currently we have the DebugLevel option, but it was never written down which type of message we want at which debug level. Siebren, you thought about this already, didn't you? Maybe we can have some sort of list defining this, but right now I don't have a very good idea how.

Some solvers try to read a source even when ReadSource = no

BiCGstab, GCR and maybe also MR try to read a source, or end up in utils_parse_propagator_type.c when the ReadSource parameter is set to no. This should not happen, either:

- If a source must be present and ReadSource = no, an error should be returned.
- If a source should not be read, utils_parse_propagator_type should not be entered at all.

Configuration file parser debugging is not switchable in inverter

It seems like the debugging information of the configuration file parser (which depends on g_proc_id and verbose through myverbose) cannot be switched on and off without changing invert.c. Is everyone OK with me adding a -v flag to the inverter that switches verbose on? If not, why not?

I'm working on reading the CGMMS masses directly from the configuration file and need to see if my debugging messages are correct.

configuration file parser

FLEX is a pain to use, XML is a pain to write, how about finding a usable alternative?

Add in CGMMSEO solver (Xining Du)

Xining Du has done some work to the CGMMS solver to make it compatible with EO preconditioning:

"Basically I was using the CG solver with evenodd preconditioning, and I did some modifications (I call it CGMMSEO) to let it solve multiple masses one by one. The reason I did this was that I found it is faster than the CGMMS solver without evenodd preconditioning."
"The code I have been using is the tmLQCD version 5.1.6. However, I did some modifications on top of it, mainly reorganizing the CGeo solvers for multiple masses in a single solver. The basic solver routines are not changed."

It would be good to merge these changes back into the general code.

Move from own complex to C99 complex

CGMMS source+propagator format struct confusion

operator.c defines the following structs for use

paramsSourceFormat *sourceFormat = NULL;
paramsPropagatorFormat *propagatorFormat = NULL;
paramsInverterInfo *inverterInfo = NULL;

io/params.h declares
extern paramsGaugeInfo GaugeInfo;
extern paramsPropInfo PropInfo;
extern paramsSourceInfo SourceInfo;

These structs are fairly similar, and their mixing is partly the source of issue #29.
Some cleaning up, or at least a clearly defined overarching idea, would help here.

P_M_eta.c cleanup

P_M_eta.c needs several cleanups:

- Multiple functions need their own files
- printf commands need to depend on the debug level and processor id
- Check_Approximation is probably not compiled now (not used), and is also broken in its current form
- comments and indentation are in different formats throughout the code; make them uniform
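
The second point could be handled by a small wrapper around printf. The following is only a sketch: `debug_printf_sketch` is a hypothetical name, `g_debug_level` is assumed here as the variable behind the DebugLevel option, and both globals are declared locally so that the example compiles on its own.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-ins for the global variables of the real code (assumption). */
static int g_proc_id = 0;     /* rank of this process */
static int g_debug_level = 0; /* value of the DebugLevel option */

/* Print only on process 0 and only if the current debug level is at least
 * `level`. Returns the number of characters printed, 0 if suppressed. */
int debug_printf_sketch(int level, const char *fmt, ...) {
  va_list ap;
  int n;
  if (g_proc_id != 0 || g_debug_level < level) {
    return 0;
  }
  va_start(ap, fmt);
  n = vprintf(fmt, ap);
  va_end(ap);
  return n;
}
```

A call such as `debug_printf_sketch(2, "# degree = %d\n", n)` would then replace a bare printf.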

Tag 5-1-6 is missing

The tmLQCD version currently describes itself as 5.1.6, yet there is no tag for this version in the repository. There should be a version tagged as 5.1.6, and we are currently developing 5.1.7, 5.2 or 6.

automatic detection of source timeslice

When one wants to treat more than one gauge configuration in a single invert run, reading sources from files that depend on the timeslice, automatic detection of this value is required. I have implemented this in the branch

AutomaticTimesliceDetect

in my fork of tmLQCD in commit urbach@14ab210

Comments would be helpful!

Parallel I/O for propagators needs to be better protected against errors

Similar to what was done for gauge I/O, propagator I/O needs to be protected against I/O errors through readbacks or other checks. Part of this is already implemented, but it needs to be everywhere.

Elaborate on collaboration process

In my opinion the section on the website describing the collaboration process for tmLQCD could be a bit more explicit to give some pointers:

git branch branchname
git checkout branchname
# work
git add [...]
git commit
# more work
git add [...]
git commit
# ...
git push origin branchname    # send work to own fork on github

In addition, before submitting a pull request the person should resolve any conflicts that might have developed with the current state of the code. (this can also be done by the integrator of the pull request, but it will be additional work for us, which is not necessarily a bad thing). Doing this regularly will keep the codes in sync and make integration easier.

git remote add upstream git@github.com:etmc/tmLQCD.git
git fetch upstream -v
git merge upstream/master   # resolve any conflicts

square_norm question

I don't really understand what's going on in the linalg/square_norm function (at least the unoptimized one). It seems to me like there are four empty operations taking place.

At the third iteration of the loop we have:

tr = ds_3
ts = ds_1 + ds_2 + ds_3
tt = ds_3
ks = ds_1 + ds_2 + ds_3
kc = 0

Since all the variables are overwritten from one call to the next I presume they were declared static for performance purposes rather than data persistence, so I don't really know why the additions and subtractions are carried out and then discarded.
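
The intermediate values listed above are what a compensated (Kahan) summation produces in exact arithmetic, where the correction term is always zero. A minimal sketch, assuming that this is what the routine implements: the variable names follow the question, while the function names and the plain double input are hypothetical.

```c
#include <stddef.h>

/* Compensated summation of ds[0..n-1]. In floating point, kc collects the
 * low-order bits that are lost when a small term is added to a large sum. */
double kahan_sum_sketch(const double *ds, size_t n) {
  double ks = 0.0; /* running sum */
  double kc = 0.0; /* running correction */
  for (size_t i = 0; i < n; i++) {
    double tr = ds[i] + kc; /* next term plus the carried correction */
    double ts = tr + ks;    /* new sum, rounded */
    double tt = ts - ks;    /* the part of tr that actually arrived in ts */
    ks = ts;
    kc = tr - tt;           /* the part that was rounded away */
  }
  return ks + kc;
}

/* Plain summation, for comparison. */
double naive_sum_sketch(const double *ds, size_t n) {
  double s = 0.0;
  for (size_t i = 0; i < n; i++) {
    s += ds[i];
  }
  return s;
}
```

Under this reading the additions and subtractions are not discarded: `kc` is nonzero whenever rounding occurs, and it is fed back into the next iteration through `tr`. With the input {1.0, 1e-16, 1e-16, 1e-16, 1e-16} the naive sum stays at exactly 1.0 in IEEE double precision, while the compensated sum ends up above 1.0.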

clean README file

DUM_SOLVER needs removal

- DUM_SOLVER was removed almost completely from the solver sub-directory
- checks are needed
- the same must be done for bispinors in solver
- then it can also be removed from hmc_tm
- in invert it is no longer needed

Hosting of related code

There is some code related to tmLQCD, the repositories of which are currently hosted elsewhere. The obvious example is Lemon, which is even an optional dependency of tmLQCD. I'd say it makes sense to move (or at least mirror) these repositories here as well. It makes for a convenient one-stop process for those who just want to use the code. As for development, there are equally good reasons for moving away from SVN in these cases as there were for tmLQCD itself.

The only significant downside I can think of would be that the current addresses may be provided in publications and presentations. We could fairly easily take care of this by providing tarballs there and/or merging patches back into public SVN repositories. In the case of Lemon, we should actually still be able to update the address.

Are there any objections that I am missing?

can linsolve.c|h be removed from suite

solve_cg is not used any longer, so can it be removed?

modenumber computation happens inline in invert.c, subfunction needed

invert.c has most of its functionality handed off to subfunctions, such as eigenvalue computation or plaquette measurement. The modenumber computation, however, is done inline inside invert.c, at around lines 360-400 in revision 1783. This code should be moved to a subfunction, both to clean up and to factorize.

gauge_input_filename is too short!

This is quite a dangerous problem which goes back to read_input. When your GaugeInputFilename is longer than 100 characters, read_input WILL write into unspecified memory (strcpy!!) as the length of gauge_input_filename is hardcoded to be 100 characters. This is also true for the ranlux input filename.
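
One way to avoid the overflow is a bounded copy that reports truncation instead of writing past the end of the buffer. This is a minimal sketch: the helper name is hypothetical, and the call shown afterwards assumes that gauge_input_filename is a fixed-size char array.

```c
#include <stdio.h>

/* Copy src into dst, which has room for dst_size bytes including the
 * terminating '\0'. Returns 0 on success and -1 if src did not fit; in that
 * case dst holds a truncated but still terminated string. */
int copy_filename_sketch(char *dst, size_t dst_size, const char *src) {
  int n = snprintf(dst, dst_size, "%s", src);
  if (n < 0 || (size_t)n >= dst_size) {
    return -1;
  }
  return 0;
}
```

A caller in the parser could then use `copy_filename_sketch(gauge_input_filename, sizeof gauge_input_filename, value)` and abort with a clear message on -1, rather than silently corrupting memory.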

Remove APEnext define

CGMMS documentation and sample input

Currently, the CGMMS solver is not listed in section 1.4.2 or section 2.7 of the documentation as one of the possible solvers. This should be fixed.
No sample input files are provided with the CGMMS solver either, and since particularly the extra_masses.input file is needed, a sample for this should be provided.

config.h inclusion everywhere

Every .c file needs to start with

#ifdef HAVE_CONFIG_H
# include<config.h>
#endif

before any other includes, otherwise certain defines do not take effect.

I do not know any case where this is happening right now, reported by Carsten.

remove LAPACK dependency

Definition of ALIGN and ALIGN_BASE

These preprocessor defines are currently done in sse.h, if some level of explicit SSE optimization is requested. This means, however, that any code that declares memory for potential use with SSE routines needs to include some #ifdef checks. Those could be skipped if these defines were done in some central location and were simply set to 0 if no SSE optimization was requested.

A corollary to this is the special case of wanting alignment without wanting to use the manual SSE routines. With increasing compiler sophistication, there are indications that automatically generated SSE code is starting to outperform our current implementations. While this is in fact quite awesome, it does make a case for allowing alignment -- still needed for optimal performance -- without necessarily activating the SSE routines themselves. This is currently impossible.

One possible location for alignment definitions is global.h. If we want alignment to be a separate option, inclusion in the configuration script and config.h might be more appropriate.
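
A minimal sketch of such a central definition follows. The switch `WANT_ALIGNMENT_SKETCH` is hypothetical and would come from configure in this scheme; the 16-byte alignment, the mask value 0x0f and the GCC/Clang attribute syntax are assumptions, not values taken from sse.h.

```c
#include <stdint.h>

#ifdef WANT_ALIGNMENT_SKETCH
#  define ALIGN_BASE 0x0f                    /* mask for 16-byte alignment */
#  define ALIGN __attribute__((aligned(16))) /* for static declarations */
#else
#  define ALIGN_BASE 0x00                    /* no alignment requested */
#  define ALIGN
#endif

/* Round a raw pointer up to the next (ALIGN_BASE + 1)-byte boundary.
 * The caller has to allocate ALIGN_BASE extra bytes for this to be safe.
 * With ALIGN_BASE == 0 the pointer is returned unchanged. */
static inline void *align_up_sketch(void *raw) {
  uintptr_t p = (uintptr_t)raw;
  return (void *)((p + ALIGN_BASE) & ~(uintptr_t)ALIGN_BASE);
}
```

With definitions like these in one place, allocation code can always write `malloc(n + ALIGN_BASE)` followed by `align_up_sketch(...)`, and declarations can always carry `ALIGN`, without any #ifdef at the point of use.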

Graceful exit

Is there already some centralised way to force a graceful exit in case of a fatal error such as a failure to malloc? If not, shouldn't we add one? It doesn't have to be complicated, but it would be nice to be able to write something like if (result == very_bad) fatal_error( "Error_message_goes_here"); and have the system take care of dumping the error message, flushing buffers and finalizing MPI before quitting. I think this would encourage error checking and result in better code, as well as remove ugly preprocessor checks for MPI in the middle of random functions.
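
A minimal sketch of such a helper, under stated assumptions: all names are hypothetical, `HAVE_MPI_SKETCH` stands in for whatever define marks an MPI build, and MPI_Abort is used instead of MPI_Finalize because MPI_Finalize is collective and a single failing process cannot complete it on its own.

```c
#include <stdio.h>
#include <stdlib.h>
#ifdef HAVE_MPI_SKETCH /* hypothetical switch for MPI builds */
#include <mpi.h>
#endif

/* Build the message text; returns the length snprintf reports. */
int format_fatal_sketch(char *buf, size_t size, const char *file, int line,
                        const char *msg) {
  return snprintf(buf, size, "FATAL (%s:%d): %s", file, line, msg);
}

/* Print the message, flush the standard streams and terminate. */
void fatal_error_at_sketch(const char *file, int line, const char *msg) {
  char buf[512];
  format_fatal_sketch(buf, sizeof buf, file, line, msg);
  fprintf(stderr, "%s\n", buf);
  fflush(stdout);
  fflush(stderr);
#ifdef HAVE_MPI_SKETCH
  MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
#endif
  exit(EXIT_FAILURE);
}

/* Usage: if (ptr == NULL) fatal_error("malloc failed"); */
#define fatal_error(msg) fatal_error_at_sketch(__FILE__, __LINE__, (msg))
```

The only preprocessor check for MPI then lives inside this one function.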

rebase / merge discussion

I wanted to discuss this outside of Carsten's pull request that created the flurry of comments (mostly by me, sorry about the number!)

This blog post describes succinctly the difference between merge and rebase in day to day git usage. Now as long as work is not shared, rebase certainly keeps the noise down in the repository because there will only be one merge message, that of the pull request once it is merged into the master branch in etmc/tmLQCD.

On the other hand, those extra commits preserve some handy information about the development process which can be seen in the network graph if you compare my three rebased branches (read_input, cgmms_input and urbachFixAutomaticTSDetect) to a branch like Albert's c99_complex or Carsten's AutomaticTSDetect. In the rebased branches it looks as though they have been split off the etmc master branch today (24.01.12) even though in truth they have been split off a week ago.

What are your opinions on merge and rebase? Personally I am quite happy to drop the extra temporal information given by the merge messages for a more streamlined commit history.

Remove global spinor field g_spinor_field

Move to something like what is now used for gauge fields.

configure is buggy

The current configure version does not work on all platforms. On my Ubuntu 10.04 I have to run autoconf, but I do not understand the problem yet. The very same version seems to work on, e.g., jugene.

Source location information is not always available in propagators

It would be good to have the source location (and perhaps more info) added in the inverter info lime message or be present in propagator files in some other way.

Send notes of Bonn meeting

Make and send the notes of this meeting around to the larger ETMC group.

Benchmark segfaults

It seems to have issues in the xchange_deri routine, probably related to the following when compiling that file:
./xchange_deri.h:33:6: note: expected ‘struct su3adj ** const’ but argument is of type ‘struct su3adj ***’. I think we can just remove the ampersand before the argument at both locations where the function is called, but I'm not sure if it's that simple.
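
To illustrate the pointer-depth mismatch in that compiler note, here is a minimal sketch with hypothetical stand-ins. `su3adj_sketch` replaces struct su3adj, and the function body is a dummy, not the real communication routine.

```c
#include <stdlib.h>

typedef struct { double d[8]; } su3adj_sketch; /* stand-in for struct su3adj */

/* Same pointer depth as the prototype quoted in the note: su3adj ** const. */
void xchange_deri_sketch(su3adj_sketch ** const df, int nsites, int ndir) {
  for (int i = 0; i < nsites; i++) {
    for (int mu = 0; mu < ndir; mu++) {
      df[i][mu].d[0] = (double)(i * ndir + mu); /* dummy work */
    }
  }
}

/* Allocate an nsites x ndir field as one block plus a table of row pointers. */
su3adj_sketch **alloc_deri_sketch(int nsites, int ndir) {
  su3adj_sketch **df = malloc((size_t)nsites * sizeof *df);
  if (df == NULL) return NULL;
  df[0] = calloc((size_t)nsites * (size_t)ndir, sizeof **df);
  if (df[0] == NULL) {
    free(df);
    return NULL;
  }
  for (int i = 1; i < nsites; i++) {
    df[i] = df[0] + (size_t)i * (size_t)ndir;
  }
  return df;
}

void free_deri_sketch(su3adj_sketch **df) {
  if (df != NULL) {
    free(df[0]);
    free(df);
  }
}
```

With `su3adj_sketch **df`, the call `xchange_deri_sketch(df, nsites, ndir)` matches the prototype. Writing `&df` instead passes a `su3adj_sketch ***`, which is the mismatch the note reports, and the function would then index the wrong level of indirection.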

Schroedinger Functional cleanup

The Schroedinger functional code that currently exists inside the tmLQCD package has several issues.

1) It is not included in existing code very well
   a) For example, sf_get_staples.c replicates the code for get_staples.c 8 times, with only very minor changes (copy the same 13 lines over and over again, modifying only a single line). This is really bad, because the code becomes hard to read (if clause is far separated from its actual effect), and very hard to maintain. What if a bug is found in one of these if clauses, it needs to be fixed everywhere, but where is everywhere?
   b) sf_get_rectangle_staples.c does the same to get_rectangle_staples.c, but then on a whole different scale, one that is SO bad that the compilation time of the ENTIRE tmLQCD package noticeably increases due to this single file. Also here a proper if construction would solve everything in a clean way, and the same story about bug fixing.
   c) Inside hmc_tm, there are a few places where the SF code gets called. Also here a lot of replication of code is present, particularly for the output. Similar things are true for update_tm.c.
2) Debug output appears to be still present.
   a) There are instances of #if 1 or #if 0 preprocessor directives (sf_gauge_monomial.c, sf_observables.c)
   b) printf ("hola"); in sf_gauge_monomial.c
3) Many functions in the same file, mainly in sf_calc_action.c
4) Output not prepared for parallel running, so no preparation for many cores repeating the same statement
5) Inefficiencies in recomputing the same value
   For example, inside hmc_tm.c
     if(g_proc_id==0){
       fprintf(parameterfile,"# First plaquette value for SF: %14.12f \n", plaquette_energy/(6.*VOLUME*g_nproc));
       printf("# First plaquette value for SF: %14.12f \n", plaquette_energy/(6.*VOLUME*g_nproc));
       fprintf(parameterfile,"# First rectangle value for SF: %14.12f \n", rectangle_energy/(12.*VOLUME*g_nproc));
       printf("# First rectangle value for SF: %14.12f \n", rectangle_energy/(12.*VOLUME*g_nproc));
     }
   calls both functions twice without first storing intermediate result. Other examples exist.
6) No tests are supplied, and the sample input file currently does not work. And even when fixed (online measurement error) it is hard to see if the code is working, as all trajectories from the test get rejected (tested the first 250, not going to wait more).

Unless someone steps up claiming to currently use the SF code, and provides usable test cases, applying fixes for these issues is likely to introduce undetected bugs, and unlikely to lead to any clear benefit.
Since the code will exist in the repository anyway, I'd propose to just remove it for now, especially since that does have at least some benefits: faster compilation, smaller files, and cleaner code.

LEMON writer fails

This is using LEMON from the git repository.

# Trajectory is accepted.
# Writing gauge field to .conf.tmp.
# Constructing LEMON writer for file .conf.tmp for append = 0
[LEMON] Node 0 reports in lemonWriteLatticeParallel:
    Could not write the required amount of data.
[LEMON] Node 1 reports in lemonWriteLatticeParallel:
    Could not write the required amount of data.
LEMON write error occurred with status = -5, while writing in gauge_write_binary.c!
[LEMON] Node 1 reports in lemonWriteRecordHeader:
    Writer not ready for header.
KILL_WITH_ERROR on node 1: Header writing error. Aborting
LEMON write error occurred with status = -5, while writing in gauge_write_binary.c!
[LEMON] Node 0 reports in lemonWriteRecordHeader:
    Writer not ready for header.
KILL_WITH_ERROR on node 0: Header writing error. Aborting
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD 
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 10128 on
node artemis exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[artemis:10126] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[artemis:10126] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
zsh: exit 1     mpirun -np 2 ./hmc_tm -f ../sample-input/sample-hmc0.input

phase_N <-> kaN "redundancy"

There is a certain redundancy from boundary.[c,h] in D_psi.c now that C99 complex is being integrated, but I don't know how much of a performance impact, if any, it would make to simply replace instances of phase_N by -kaN (where N = {0,1,2,3}). Any ideas?

Add link to all pages on wiki

As discussed in Bonn, we should probably have a link in the menu to display all pages on the wiki. This is achieved in MoinMoin by linking to

https://znwiki3.ifh.de/ETMC/TitleIndex

commit 9cec2dfa has timestamp in the future

I don't know what happened there and whether it will continue to be a problem... It seems like commit 9cec2df was timestamped with Sat Jan 21 03:15:17 2012 +0000, I don't know why or how this happened. I will close this issue if it goes away on its own.

Minutes archive on wiki

I've started an archive of minutes on the wiki to make it possible to easily recover discussions even in the far past.

https://znwiki3.ifh.de/ETMC/Minutes

Just creating this issue here in case there are any objections to this.

Improved version reporting

For debugging purposes it is often necessary to find out the exact version of a compiled executable. We have version numbers such as 5.1.5 and 5.1.6, but each of these encompasses quite a few SVN revision numbers. SVN provides some revision information through keywords such as $Id$, but these only refer to the last change of a file. So, if the backend of the IO changes but the interface inside hmc_tm.c does not, there is no way to know this from within hmc_tm.c, and $Id$ will still report an older revision even though the newer version of the IO backend is already compiled in. There is no way inside SVN to get global revision numbers; outside tools such as svnversion are needed for this. It would be useful to have a workaround that gives more detailed version information.

A proposal: make the version calls for hmc_tm and invert go to a separate usage function, which is modified in a pre-commit hook script upon every commit.
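
One possible shape for such a separate version function is sketched below. It assumes that a hook or build script supplies the release and revision as preprocessor macros; the macro names TM_RELEASE and TM_REVISION and the functions format_version and print_version are hypothetical and are not part of tmLQCD.

```c
#include <stdio.h>

/* Hypothetical macros: a hook or build script would define these, for
 * example through compiler -D flags. The fallbacks keep the file
 * compilable when nothing is supplied. */
#ifndef TM_RELEASE
#define TM_RELEASE "unknown"
#endif
#ifndef TM_REVISION
#define TM_REVISION "unknown"
#endif

/* Write "<program> release <release>, revision <revision>" into buf.
 * Returns the value of snprintf, i.e. the length the full string needs. */
int format_version(char *buf, size_t len, const char *program)
{
  return snprintf(buf, len, "%s release %s, revision %s",
                  program, TM_RELEASE, TM_REVISION);
}

/* Single place that both hmc_tm and invert could call to report the
 * version, so only this file changes when the revision changes. */
void print_version(FILE *out, const char *program)
{
  char buf[256];
  int n = format_version(buf, sizeof buf, program);

  if (n > 0 && (size_t)n < sizeof buf) {
    fprintf(out, "%s\n", buf);
  }
}
```

Because the revision lives in one translation unit, a change anywhere in the tree (for instance in the IO backend) shows up in the reported version without touching hmc_tm.c itself.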

cgmms solver always writes in single precision

Output propagator precision is hardcoded in solver/cg_mms_tm.c: it is set to 32

install some unit testing system

Check CUnit.

Add in gradient flow

Remove legacy SVN keywords

The $Id SVN keyword has been used extensively throughout the code to allow automatic documenting of last changes to a file. It and other SVN keywords (Date, Revision, Author, HeadURL) need to be removed everywhere.

Gauge fixing

It would be good to have a gauge fixing routine in the tmLQCD code, for example for Landau/Coulomb gauge fixing. There are several independently written versions circulated around, but this seems like a specific functionality that would benefit greatly from all the parallel functionality existing in the tmLQCD package. Checks can be made against existing gwc, Zpackage and other codes.

SSE2 and SSE3 for smearing and clover_leaf.c

Compiling with --enable-sse3 or --enable-sse2 gives errors:

../../tmLQCD/smearing/stout_stout_smear.c:39: error: can't find a register in class ‘GENERAL_REGS’ while reloading ‘asm’
../../tmLQCD/smearing/stout_stout_smear.c:30: error: ‘asm’ operand has impossible constraints

../tmLQCD/clover_leaf.c:719: error: can't find a register in class ‘GENERAL_REGS’ while reloading ‘asm’
../tmLQCD/clover_leaf.c:611: error: ‘asm’ operand has impossible constraints

which is due to problems in the inline assembly implementation of the su3 etc. macros.

Need to either rework the routines or undef SSE macros in those files.
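
The second option, undefining the SSE macros in the affected files, could look roughly like the sketch below. It assumes the configure switches result in preprocessor macros named SSE2 and SSE3 and that the inline assembly paths are selected by testing them; the function sse_path_enabled is hypothetical and only makes the effect visible.

```c
/* Stand-in for a build configured with --enable-sse3; in a real build
 * this definition would come from the configure machinery (assumed). */
#define SSE3 1

/* Local opt-out, placed at the top of the affected file before any
 * header that tests these macros. #undef on a macro that is not
 * defined is allowed, so no #ifdef guard is needed. */
#undef SSE2
#undef SSE3

/* Hypothetical check: returns 1 if an SSE macro is still visible to
 * the code below the opt-out, 0 if the plain C path would be used. */
int sse_path_enabled(void)
{
#if defined(SSE2) || defined(SSE3)
  return 1;
#else
  return 0;
#endif
}
```

This keeps the assembly macros available to the rest of the code while the two problem files fall back to the portable implementation.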

Remove configure script from repository

I have tentatively removed the configure script from the repository in my unit testing branch. A change in configure.in results in thousands of lines of changes in configure. While the configure script should remain in the tarball distribution I don't think a git repository is the right place for it. Are there any objections to this?

Supply sample input files for all available solvers

Source types affect writing behaviour

When the input file for the inverter is given the following input line:
SourceType = VOLUME

Inversion works and proceeds without problems, however, the propagator is not written to disk.
If, however,
SourceType = POINT
is selected, the propagator is written to disk. This discrepancy needs to be cleaned up, either with an additional input parameter that governs propagator writing, or just by fixing the VOLUME case.

Why .NOTPARALLEL: in Makefile.in?

Is there a reason why we have .NOTPARALLEL in the Makefile? It really slows down the compilation, especially on machines with many cores where one could potentially compile 6 to 8 modules in parallel.

enabling gprof is currently broken

Enabling gprof is currently broken because both the compiler and the linker need to be called with the -pg flag.