Comments (5)

hzone3898 commented on July 2, 2024

After some unsuccessful debugging, it turns out @anjohan was right: my input .yaml file was simply wrong.
My training data did include stresses, but I forgot to change "ForceOutput" to "StressForceOutput" in the .yaml!

from pair_allegro.

Linux-cpp-lisp commented on July 2, 2024

It looks like you are trying to compile the development version of pair_allegro. As its README explains, it no longer requires stable_29Sep2021_update2 and is now compatible with LAMMPS versions released after LAMMPS made the breaking change to its neighbor lists. You should be able to use any stable LAMMPS version from after that change, including the latest, so there is no need to specify a tag when you pull. See the README at https://github.com/mir-group/pair_allegro/tree/stress

hzone3898 commented on July 2, 2024

Thank you! Using the latest LAMMPS version made it compile.

However, when I try to run a simple NVT simulation or a minimization on a test system (water molecules), I get an "std::out_of_range" error:

KOKKOS mode is enabled (src/KOKKOS/kokkos.cpp:106)
  will use up to 1 GPU(s) per node
  using 1 OpenMP thread(s) per MPI task
Allegro is using input precision f and output precision f
Allegro is using device cuda
Reading data file ...
  orthogonal box = (-50.378 -49.142 -3.651) to (77.723 98.154 39.909)
  1 by 1 by 1 MPI processor grid
  reading atoms ...
  1002 atoms
  read_data CPU = 0.072 seconds
Allegro: Loading model from deployed.pth
Allegro: Freezing TorchScript model...
Type mapping:
Allegro type | Allegro name | LAMMPS type | LAMMPS name
0 | H | 1 | H
1 | O | 2 | O
### Equilibration NVT ###
Neighbor list info ...
  update: every = 1 steps, delay = 0 steps, check = yes
  max neighbors/atom: 2000, page size: 100000
  master list distance cutoff = 4.3
  ghost atom cutoff = 4.3
  binsize = 4.3, bins = 30 35 11
  1 neighbor lists, perpetual/occasional/extra = 1 0 0
  (1) pair allegro3232/kk, perpetual
      attributes: full, newton on, ghost, kokkos_device
      pair build: full/bin/ghost/kk/device
      stencil: full/ghost/bin/3d
      bin: kk/device
Setting up Verlet run ...
  Unit style    : lj
  Current step  : 0
  Time step     : 1
terminate called after throwing an instance of 'std::out_of_range'
  what():  Argument passed to at() was not in the map.
[g1101:2612852] *** Process received signal ***
[g1101:2612852] Signal: Aborted (6)
[g1101:2612852] Signal code:  (-6)
[g1101:2612852] [ 0] /lib64/libpthread.so.0(+0x12ce0)[0x7fffbfac1ce0]
[g1101:2612852] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7fff618e3a9f]
[g1101:2612852] [ 2] /lib64/libc.so.6(abort+0x127)[0x7fff618b6e05]
[g1101:2612852] [ 3] /appl/spack/v017/install-tree/gcc-8.5.0/gcc-11.2.0-zshp2k/lib64/libstdc++.so.6(+0xa998a)[0x7fff6208598a]
[g1101:2612852] [ 4] /appl/spack/v017/install-tree/gcc-8.5.0/gcc-11.2.0-zshp2k/lib64/libstdc++.so.6(+0xb51ea)[0x7fff620911ea]
[g1101:2612852] [ 5] /appl/spack/v017/install-tree/gcc-8.5.0/gcc-11.2.0-zshp2k/lib64/libstdc++.so.6(+0xb5255)[0x7fff62091255]
[g1101:2612852] [ 6] /appl/spack/v017/install-tree/gcc-8.5.0/gcc-11.2.0-zshp2k/lib64/libstdc++.so.6(+0xb54e9)[0x7fff620914e9]
[g1101:2612852] [ 7] /projappl/lammps/build/lmp[0xb752b9]
[g1101:2612852] [ 8] /projappl/lammps/build/lmp[0xcbc995]
[g1101:2612852] [ 9] /projappl//lammps/build/lmp[0x8c9042]
[g1101:2612852] [10] /projappl/lammps/build/lmp[0x58542c]
[g1101:2612852] [11] /projappl/peptides/lammps/build/lmp[0x48fbb6]
[g1101:2612852] [12] /projappl/peptides/lammps/build/lmp[0x48fe9e]
[g1101:2612852] [13] /projappl/lammps/build/lmp[0x44fdad]
[g1101:2612852] [14] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7fff618cfcf3]
[g1101:2612852] [15] /projappl/lammps/build/lmp[0x47102e]
[g1101:2612852] *** End of error message ***

I'm using a deployed.pth Allegro model trained with nequip==0.5.6, so I use "pair_style allegro3232", as seen in #12.

anjohan commented on July 2, 2024

Hi,

With gdb you can pinpoint the line where it fails. Something like gdb -ex=r -ex=where --args /path/to/lammps/build/lmp -in in.script.

Keep in mind that if you use the stress branch of pair_allegro and ask LAMMPS for pressure/stress, the model needs to be trained with stress support, because the pair style then reads a "virial" entry from the model output:

if (vflag) {
  torch::Tensor v_tensor = output.at("virial").toTensor().cpu();
  auto v = v_tensor.accessor<outputtype, 3>();
  // Convert from 3x3 symmetric tensor format, which NequIP outputs, to the flattened form LAMMPS expects
  // First [0] index on v is batch
  this->virial[0] = v[0][0][0];
  this->virial[1] = v[0][1][1];
  this->virial[2] = v[0][2][2];
  this->virial[3] = v[0][0][1];
  this->virial[4] = v[0][0][2];
  this->virial[5] = v[0][1][2];
}
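
To make the index mapping in the snippet above easier to follow, here is a minimal standalone sketch of the same flattening without the batch index. It is not code from pair_allegro: the `Tensor3x3` alias and the `flatten_virial` name are hypothetical, and a plain `std::array` stands in for the torch tensor accessor.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for one (un-batched) 3x3 virial tensor.
using Tensor3x3 = std::array<std::array<double, 3>, 3>;

// Flatten a symmetric 3x3 tensor into the six-component order used in the
// snippet above: xx, yy, zz, xy, xz, yz. Only the upper triangle is read,
// so the lower triangle is assumed to mirror it.
std::array<double, 6> flatten_virial(const Tensor3x3& v) {
    return {v[0][0], v[1][1], v[2][2], v[0][1], v[0][2], v[1][2]};
}
```

The three diagonal entries come first, followed by the three off-diagonal entries of the upper triangle, which is why indices 3 to 5 read `[0][1]`, `[0][2]` and `[1][2]`.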

hzone3898 commented on July 2, 2024

This is the error when running with gdb:

Setting up Verlet run ...
  Unit style    : lj
  Current step  : 0
  Time step     : 1
[New Thread 0x7ffd25fff000 (LWP 3156788)]
terminate called after throwing an instance of 'std::out_of_range'
  what():  Argument passed to at() was not in the map.

Thread 1 "lmp" received signal SIGABRT, Aborted.
0x00007fff618e3a9f in raise () from /lib64/libc.so.6
#0  0x00007fff618e3a9f in raise () from /lib64/libc.so.6
#1  0x00007fff618b6e05 in abort () from /lib64/libc.so.6
#2  0x00007fff6208598a in ?? ()
   from /appl/spack/v017/install-tree/gcc-8.5.0/gcc-11.2.0-zshp2k/lib64/libstdc++.so.6
#3  0x00007fff620911ea in ?? ()
   from /appl/spack/v017/install-tree/gcc-8.5.0/gcc-11.2.0-zshp2k/lib64/libstdc++.so.6
#4  0x00007fff62091255 in std::terminate() ()
   from /appl/spack/v017/install-tree/gcc-8.5.0/gcc-11.2.0-zshp2k/lib64/libstdc++.so.6
#5  0x00007fff620914e9 in __cxa_throw ()
   from /appl/spack/v017/install-tree/gcc-8.5.0/gcc-11.2.0-zshp2k/lib64/libstdc++.so.6
#6  0x0000000000b752b9 in ska_ordered::order_preserving_flat_hash_map<c10::IValue, c10::IValue, c10::detail::DictKeyHash, c10::detail::DictKeyEqualTo, std::allocator<std::pair<c10::IValue, c10::IValue> > >::at (key=...,
    this=<optimized out>) at /local_scratch/tuple:510
#7  c10::Dict<c10::IValue, c10::IValue>::at (this=this@entry=0x7fffffffaae8,
    key=...) at /projappl/lammps/src/ios_base.h:152
#8  0x0000000000b7c65d in LAMMPS_NS::PairAllegro<(Precision)0>::compute (
    this=0x6a5ada0, eflag=<optimized out>, vflag=2)
    at /projappl/lammps/src/stl_uninitialized.h:1144
#9  0x00000000005f571a in LAMMPS_NS::Verlet::setup (this=0x6c38ed0, flag=1)
    at /projappl/lammps/build/atom_vec.h:140
#10 0x000000000058542c in LAMMPS_NS::Run::command (this=0x4008e020, narg=1,
    arg=0xd713ee0)
    at /appl/spack/v017/install-tree/gcc-11.2.0/cuda-11.5.0-mg4ztb/include/crt/basic_string.tcc:171
#11 0x000000000048fbb6 in LAMMPS_NS::Input::execute_command (this=0x6a18420)
    at /projappl/lammps/build/kspace.h:853
#12 0x000000000048fe9e in LAMMPS_NS::Input::file (this=0x6a18420)
    at /projappl/lammps/build/kspace.h:302
#13 0x000000000044fdad in main (
    argc=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
    argv=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at /projappl/lammps/src/main.cpp:105
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-189.5.el8_6.x86_64 hwloc-libs-2.2.0-3.el8.x86_64 libibverbs-56mlnx40-1.56103.x86_64 libjpeg-turbo-1.5.3-12.el8.x86_64 libnl3-3.5.0-1.el8.x86_64 libpng-1.6.34-5.el8.x86_64 librdmacm-56mlnx40-1.56103.x86_64 nvidia-driver-cuda-libs-525.85.12-1.el8.x86_64 openssl-libs-1.1.1k-7.el8_6.x86_64 zlib-1.2.11-19.el8_6.x86_64
(gdb) quit
A debugging session is active.

        Inferior 1 [process 3156674] will be killed.
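
The backtrace shows the exception coming from a dictionary `at()` call inside `PairAllegro::compute` with `vflag=2`, which is consistent with a lookup of a key that the model output does not contain. The sketch below illustrates that failure mode and a more descriptive guard. It is only an illustration under stated assumptions: `std::unordered_map` stands in for `c10::Dict`, and the `OutputMap` alias, the `require_output` helper and its error message are hypothetical, not part of pair_allegro.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the model's output dictionary.
using OutputMap = std::unordered_map<std::string, std::vector<double>>;

// Look up a required output and fail with a message that names the missing
// key, instead of the generic std::out_of_range thrown by at().
const std::vector<double>& require_output(const OutputMap& outputs,
                                          const std::string& key) {
    auto it = outputs.find(key);
    if (it == outputs.end()) {
        throw std::runtime_error("model output has no \"" + key +
                                 "\" entry; was the model trained with stress support?");
    }
    return it->second;
}
```

Calling `at("virial")` directly on a map that only holds, say, "forces" throws `std::out_of_range`, which matches the symptom above. The guard produces the same abort but points at the missing key.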
