
hydragnn's People

Contributors

allaffa, erdemcaliskan, justinbakermath, jychoi-hpc, kshitij-v-mehta, lemonandrabbit, markoburcul, pzhanggit, sauravmaheshkar, seheracer, streeve

hydragnn's Issues

Load serialized *pkl files directly

Currently, dataset loading supports raw data files, e.g., in LSMS format. Every run reads the raw data files, converts them to a serialized format, and generates *pkl files. This process can be time-consuming and is sometimes unnecessary. We should provide an option to load *pkl files directly.
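One possible shape for such an option, as a minimal sketch: reuse the serialized file when it already exists and fall back to the raw conversion only otherwise. The names `load_or_build_dataset` and `convert_raw` are hypothetical and do not come from the HydraGNN code base; `fake_convert` is a stand-in for the real raw-data conversion.

```python
import os
import pickle
import tempfile


def load_or_build_dataset(pkl_path, convert_raw):
    """Return the dataset stored at pkl_path, building and caching it if absent.

    `convert_raw` is a hypothetical zero-argument callable standing in for
    the raw-file (e.g. LSMS) conversion step.
    """
    if os.path.exists(pkl_path):
        # Serialized data already exists: skip the raw conversion entirely.
        with open(pkl_path, "rb") as f:
            return pickle.load(f)
    dataset = convert_raw()
    with open(pkl_path, "wb") as f:
        pickle.dump(dataset, f)
    return dataset


conversion_calls = []


def fake_convert():
    # Stand-in for the expensive raw-data conversion.
    conversion_calls.append(1)
    return [{"x": [1.0, 2.0]}]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.pkl")
    first = load_or_build_dataset(path, fake_convert)   # converts and caches
    second = load_or_build_dataset(path, fake_convert)  # loads the cached file
```

In this sketch the second call never touches the raw data, which is the behavior the issue asks for.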

Directional graphs

Metallic and covalent bonds call for undirected graphs because the electrons are shared between atoms and have no exclusive owner.
Ionic bonds, however, form between atom pairs with a clear electron donor and a clear electron receiver. We can therefore inject physics information into the adjacency matrix by turning the graph from undirected (as in the current implementation) into directed.

Currently, the undirected graph relies on a routine inside the "GCNN/data_utils/helper_functions.py" file:

remove_collinear_candidate
This function makes sure that if A is a neighbour of B, then B is also a neighbour of A. The routine also helps keep the connectivity between atoms local (the adjacency matrix is only locally dense, not globally dense), so that the number of connections does not "explode" and turn the adjacency matrix into something globally dense.

I think that for the directional graph we can avoid calling this function, because A can be a neighbour of B without B being a neighbour of A.
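To make the asymmetry concrete, here is a small sketch in plain Python. It only illustrates how a neighbour relation can be one-sided; it uses a k-nearest-neighbour rule as an assumed example and says nothing about donor/receiver physics or about how HydraGNN builds its graphs. The function names are hypothetical.

```python
import math


def knn_directed_edges(positions, k):
    """Directed edges (i, j) where j is one of the k nearest neighbours of i."""
    edges = set()
    for i, p in enumerate(positions):
        others = sorted(
            (j for j in range(len(positions)) if j != i),
            key=lambda j: math.dist(p, positions[j]),
        )
        for j in others[:k]:
            edges.add((i, j))
    return edges


def symmetrize(edges):
    """Add the reverse of every edge, giving an undirected (mutual) graph."""
    return edges | {(j, i) for (i, j) in edges}


# Three atoms on a line at x = 0, 1 and 3.
positions = [(0.0,), (1.0,), (3.0,)]
directed = knn_directed_edges(positions, k=1)
undirected = symmetrize(directed)
```

Here atom 2 picks atom 1 as its nearest neighbour, but atom 1 picks atom 0, so the edge (2, 1) has no reverse edge until the graph is symmetrized.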

Error when running the prediction

After I trained a model, the log directory has been created with the following contents:

(open-ce-1.4.0-py38-0) bash-4.4$ ls -l logs/PNAStack-r-7-mnnn-5-ncl-6-hd-5-ne-2-lr-0.001-bs-64-data-FePt_32atoms-node_ft-0-task_weights-1.0-1.0-1.0-/
total 2561
-rw------- 1 sacer sacer 575017 Oct 28 12:34 PNAStack-r-7-mnnn-5-ncl-6-hd-5-ne-2-lr-0.001-bs-64-data-FePt_32atoms-node_ft-0-task_weights-1.0-1.0-1.0-.pk
-rw------- 1 sacer sacer 155644 Oct 28 12:34 charge_density.png
-rw------- 1 sacer sacer 137315 Oct 28 12:34 charge_density_-001.png
-rw------- 1 sacer sacer 182418 Oct 28 12:34 charge_density_error_hist1d.png
-rw------- 1 sacer sacer 153256 Oct 28 12:34 charge_density_error_hist1d_-001.png
-rw------- 1 sacer sacer 145924 Oct 28 12:34 charge_density_scatter_condm_err.png
-rw------- 1 sacer sacer   4130 Oct 28 12:34 config.json
-rw------- 1 sacer sacer    674 Oct 28 12:34 events.out.tfevents.1635438862.h50n07.1263424.0
-rw------- 1 sacer sacer  38818 Oct 28 12:34 free_energy.png
-rw------- 1 sacer sacer  34485 Oct 28 12:34 free_energy_-001.png
-rw------- 1 sacer sacer  61311 Oct 28 12:34 free_energy_scatter_condm_err.png
-rw------- 1 sacer sacer    943 Oct 28 12:34 history_loss.pckl
-rw------- 1 sacer sacer 178170 Oct 28 12:34 history_loss.png
-rw------- 1 sacer sacer 157442 Oct 28 12:34 magnetic_moment.png
-rw------- 1 sacer sacer 167461 Oct 28 12:34 magnetic_moment_-001.png
-rw------- 1 sacer sacer 171006 Oct 28 12:34 magnetic_moment_error_hist1d.png
-rw------- 1 sacer sacer 175293 Oct 28 12:34 magnetic_moment_error_hist1d_-001.png
-rw------- 1 sacer sacer 141539 Oct 28 12:34 magnetic_moment_scatter_condm_err.png

Here, I used ./examples/configuration.json as is except for "num_epoch": 2 to finish early.

Then I changed example.py as follows and ran it:

import hydragnn

hydragnn.run_prediction("./examples/configuration.json")

I got the following error:

Traceback (most recent call last):
  File "example.py", line 3, in <module>
    hydragnn.run_prediction("./examples/configuration.json")
  File "/gpfs/alpine/stf008/scratch/sacer/allGNN/HydraGNN/hydragnn/run_prediction.py", line 48, in run_prediction
    output_type = config["NeuralNetwork"]["Variables_of_interest"]["type"]
TypeError: string indices must be integers

Run command on Summit:
jsrun -n24 -a1 -g1 -c7 -r6 -b rs --smpiargs="off" python example.py
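The `TypeError: string indices must be integers` is what Python raises when a `str` is indexed with a string key, so one plausible reading of the traceback is that `config` is still the file path at that point rather than the parsed JSON. That is an inference from the error message, not a confirmed diagnosis. A minimal sketch of a loader that accepts either form (the helper `load_config` and the example config values are made up for illustration):

```python
import json
import os
import tempfile


def load_config(config):
    """Accept either an already-parsed dict or a path to a JSON file."""
    if isinstance(config, dict):
        return config
    with open(config, "r") as f:
        return json.load(f)


# Made-up configuration content, only to exercise the helper.
example = {"NeuralNetwork": {"Variables_of_interest": {"type": ["graph"]}}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "configuration.json")
    with open(path, "w") as f:
        json.dump(example, f)
    config = load_config(path)

# Indexing now works because `config` is a dict, not a path string.
output_type = config["NeuralNetwork"]["Variables_of_interest"]["type"]
```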

Include templating infrastructure for Hierarchical Community-aware Graph Neural Network (HC-GNN)

This paper describes a very elegant way to improve the performance of localized (short-range) message passing neural networks (MPNNs): it adds global attention mechanisms that model long-range interactions through hierarchical clustering of nodes.
https://arxiv.org/pdf/2009.03717.pdf

Looking at the original implementation of HC-GNN
https://github.com/zhiqiangzhongddu/HC-GNN/blob/master/model.py
it seems like the inclusion of hierarchical MPNN can be easily templated over the underlying localized MPNN.
Since we are already templating HydraGNN with respect to MPNNs, including the hierarchical MPNN as an additional level of templating may be very doable.

Map data from CPU to GPU batch by batch if the total dataset is too large to fit onto the GPU

When I run the Ising model and I create data using the current set-up

number_atoms_per_dimension = 5

configurational_histogram_cutoff = 1000

The following line crashes:

HydraGNN/hydragnn/preprocess/serialized_dataset_loader.py", line 88, in load_serialized_data
    data.to(device)

This happens because the total volume of the dataset is too large and we map all the data at once on the GPU.

  File "/root/HydraGNN/hydragnn/preprocess/serialized_dataset_loader.py", line 88, in load_serialized_data
    data.to(device)
  File "/opt/conda/lib/python3.8/site-packages/torch_geometric/data/data.py", line 216, in to
    return self.apply(
  File "/opt/conda/lib/python3.8/site-packages/torch_geometric/data/data.py", line 204, in apply
    store.apply(func, *args)
  File "/opt/conda/lib/python3.8/site-packages/torch_geometric/data/storage.py", line 146, in apply
    self[key] = recursive_apply(value, func)
  File "/opt/conda/lib/python3.8/site-packages/torch_geometric/data/storage.py", line 495, in recursive_apply
    return func(data)
  File "/opt/conda/lib/python3.8/site-packages/torch_geometric/data/data.py", line 217, in <lambda>
    lambda x: x.to(device=device, non_blocking=non_blocking), *args)
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 6; 31.75 GiB total capacity; 30.10 GiB already allocated; 2.25 MiB free; 30.48 GiB reserved in total by PyTorch)

I greatly appreciate the effort you all made to ensure that the data is mapped to the GPU only once, but that is not possible when the data is too big. We may need to keep the data on the CPU and move only the current batch to the GPU when it is needed.
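A minimal sketch of the batch-by-batch approach, assuming a plain `torch.utils.data.DataLoader` with synthetic tensors rather than HydraGNN's actual loaders or PyTorch Geometric data objects. The dataset stays in CPU memory and only the batch in flight is copied to the device:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Falls back to the CPU when no GPU is available, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic stand-in for the real dataset; these tensors stay in CPU memory.
features = torch.randn(256, 8)
targets = torch.randn(256, 1)
loader = DataLoader(TensorDataset(features, targets), batch_size=64)

model = torch.nn.Linear(8, 1).to(device)

batches_seen = 0
for x, y in loader:
    # Only the current batch is copied to the device.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    loss = torch.nn.functional.mse_loss(model(x), y)
    batches_seen += 1
```

The trade-off is one host-to-device copy per batch on every epoch instead of a single copy up front, in exchange for a GPU memory footprint that no longer grows with the dataset size.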

Implement skip connections using the JumpingKnowledge pyg module

  • modify Base class to instantiate pyg.nn.models.JumpingKnowledge
  • modify the layer/dimensionality logic to account for concatenated aggregation scheme within JumpingKnowledge
  • within Base class include a reset_parameters() to reset skip connection choices
  • modify test cases and log the performance of the model on CI test case.
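The dimensionality point in the second bullet can be sketched as follows. JumpingKnowledge in PyTorch Geometric supports the modes "cat", "max" and "lstm"; concatenation multiplies the feature width by the number of layers, while the other two modes keep it unchanged. The helper name `jk_output_dim` is hypothetical and only illustrates the bookkeeping the layers after the skip connections would need.

```python
def jk_output_dim(mode, hidden_dim, num_layers):
    """Feature width after aggregating `num_layers` outputs of width `hidden_dim`."""
    if mode == "cat":
        # Layer outputs are concatenated along the feature dimension.
        return hidden_dim * num_layers
    if mode in ("max", "lstm"):
        # Element-wise max or attention-weighted sum keeps the width unchanged.
        return hidden_dim
    raise ValueError(f"unknown JumpingKnowledge mode: {mode!r}")


cat_dim = jk_output_dim("cat", hidden_dim=5, num_layers=6)
max_dim = jk_output_dim("max", hidden_dim=5, num_layers=6)
```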

previous issues/PRS

Add linting

  • Decide on pylint, flake, etc.
  • Fix issues (imports, unused variables, etc.)
  • Decide on what else to enforce
  • Add to CI
  • Add to pre-commit hooks

CPU mode for profiling tools

One thing observed in the PR is that something needs to be fixed about using profiling in CPU mode for the md17 and qm9 CI tests.
_____________________________ pytest_examples[qm9] _____________________________
Traceback (most recent call last):
File "/home/runner/work/HydraGNN/HydraGNN/tests/test_examples.py", line 26, in pytest_examples
assert return_code == 0
AssertionError: assert -11 == 0
----------------------------- Captured stderr call -----------------------------
Downloading https://data.pyg.org/datasets/qm9_v3.zip
Extracting dataset/qm9/raw/qm9_v3.zip
Processing...
Using a pre-processed version of the dataset. Please install 'rdkit' to alternatively process the raw data.
Done!
0: Using CPU
0: Using CPU

0%| | 0/11 [00:00<?, ?it/s]
36%|███▋ | 4/11 [00:00<00:00, 32.53it/s]ERROR:2022-05-12 13:46:01 3754:3754 CudaDeviceProperties.cpp:26] cudaGetDeviceCount failed with code 35
____________________________ pytest_examples[md17] _____________________________
Traceback (most recent call last):
File "/home/runner/work/HydraGNN/HydraGNN/tests/test_examples.py", line 26, in pytest_examples
assert return_code == 0
AssertionError: assert -11 == 0
----------------------------- Captured stderr call -----------------------------

ALIGNN model implementation using built-in LineGraph Data Transformation capabilities

We have discussed for quite a while the idea of supporting the Atomistic Line Graph Neural Network (ALIGNN) model described in the following paper:
https://arxiv.org/abs/2106.01829

There are built-in PyTorch Geometric capabilities that would make that easier to implement. For instance, the construction of the line graph is already supported:
https://pytorch-geometric.readthedocs.io/en/latest/_modules/torch_geometric/transforms/line_graph.html#LineGraph
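For readers unfamiliar with the construction, here is a plain-Python sketch of a directed line graph: each edge of the original graph becomes a node, and edge e1 points to edge e2 when the target of e1 is the source of e2. This is an illustration of the idea only, not the PyTorch Geometric implementation, and the function name is hypothetical.

```python
def directed_line_graph(edges):
    """Return (e1, e2) index pairs such that edges[e1] ends where edges[e2] starts."""
    by_source = {}
    for idx, (src, _) in enumerate(edges):
        by_source.setdefault(src, []).append(idx)

    line_edges = []
    for idx, (_, dst) in enumerate(edges):
        for nxt in by_source.get(dst, []):
            line_edges.append((idx, nxt))
    return line_edges


# Original graph with three directed edges: 0->1, 1->2 and 1->0.
edges = [(0, 1), (1, 2), (1, 0)]
line_edges = directed_line_graph(edges)
```

Each pair in `line_edges` corresponds to two bonds meeting at a shared atom, which is the kind of triplet information a line-graph model can pass messages over.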

Skip connection for ResNet type of GNN

As the name suggests, skip connections in a deep architecture bypass some of the neural network layers and feed the output of one layer directly as input to later layers. They are a standard building block and provide an alternative path for the gradient during backpropagation.

Skip connections were introduced even before residual networks to tackle training difficulties in various architectures. In residual networks (ResNets), skip connections are used to address the degradation problem (e.g., vanishing gradients); in dense networks (DenseNets), they ensure feature reusability.
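The residual form of a skip connection can be written as y = x + f(x). The sketch below uses plain Python lists instead of tensors to keep it dependency-free; `zero_layer` and `double_layer` are made-up stand-ins for a learned layer.

```python
def residual_block(x, layer):
    """Return x + layer(x) element-wise; the `+ x` term is the skip connection."""
    fx = layer(x)
    return [xi + fi for xi, fi in zip(x, fx)]


def zero_layer(x):
    # A layer that has learned nothing: the block reduces to the identity.
    return [0.0 for _ in x]


def double_layer(x):
    return [2.0 * v for v in x]


identity_out = residual_block([1.0, -2.0], zero_layer)
tripled_out = residual_block([1.0, -2.0], double_layer)
```

Because the input is added back unchanged, the block passes the signal (and its gradient) through even when the wrapped layer contributes nothing.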

Enable interactive plots

Most of the options to display figures rather than save them are already there, but a few changes are still needed throughout the viz class.

Project contribution

I want to use HydraGNN's multi-head attention implementation in my research on predicting the reproducibility of scholarly articles. I appreciate the contributions of this project in the GNN space.

I checked good-first-issue and found no open issues. I previously commented on #164 about a potential contribution. Does this repository support external contributions/collaboration?

Allow HydraGNN models to be built with embedding layers for node and edge features

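import torch
from torch.nn import Embedding, Linear, ModuleList, ReLU, Sequential
from torch_geometric.nn import BatchNorm, PNAConv

# NOTE: `deg` is not defined in this snippet. PNAConv expects it to be a
# tensor holding the in-degree histogram of the training graphs, computed
# beforehand.


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()

        # Embedding layers for categorical node and edge features
        self.node_emb = Embedding(21, 21)
        self.edge_emb = Embedding(4, 4)

        aggregators = ['mean', 'min', 'max', 'std']
        scalers = ['identity', 'amplification', 'attenuation']

        # Two PNA convolution layers, each followed by batch normalization
        self.convs = ModuleList()
        self.batch_norms = ModuleList()
        for _ in range(2):
            conv = PNAConv(in_channels=21, out_channels=21,
                           aggregators=aggregators, scalers=scalers, deg=deg,
                           edge_dim=4, towers=1, pre_layers=1, post_layers=1,
                           divide_input=False)
            self.convs.append(conv)
            self.batch_norms.append(BatchNorm(21))

        # Output MLP mapping the 21 features to a single scalar
        self.mlp = Sequential(Linear(21, 21), ReLU(), Linear(21, 21), ReLU(),
                              Linear(21, 1))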

[Feature Request 🚀] Add a `CITATION.cff`

GitHub recently released a new feature where repository owners can add a CITATION.cff file, making it easy for others to cite the repository.

Currently the information in the README isn't very helpful because it doesn't provide a BibTeX reference. Adding a CITATION.cff would make the attribution process very easy.

Unit test for data loading

Check that loading of graph/node scalar/vector data works as intended: create synthetic torch_geometric data and verify the values returned.

Clean up visualizer

Some functions in visualizer.py have not been used for a while and are likely to break the code since they are outdated. We need to either delete these unused functions or update them to be consistent with the rest of the code.

Also, create plots only on rank 0.
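A minimal sketch of a rank-0 guard for plotting. It assumes the process rank can be read from an environment variable set by the launcher (`RANK` for torchrun, `OMPI_COMM_WORLD_RANK` for Open MPI based launchers, `PMI_RANK` for PMI based ones); which of these is actually set depends on the system, and the helper names are hypothetical rather than taken from HydraGNN.

```python
import os


def get_rank(environ=os.environ):
    """Best-effort process rank from common launcher variables; 0 if none is set."""
    for name in ("RANK", "OMPI_COMM_WORLD_RANK", "PMI_RANK"):
        if name in environ:
            return int(environ[name])
    return 0


def plot_on_rank_zero(plot_fn, environ=os.environ):
    """Call plot_fn only on rank 0; return whether it was called."""
    if get_rank(environ) == 0:
        plot_fn()
        return True
    return False


created = []
ran_on_rank0 = plot_on_rank_zero(lambda: created.append("fig"), {"RANK": "0"})
ran_on_rank3 = plot_on_rank_zero(lambda: created.append("fig"), {"RANK": "3"})
```

If the code already initializes `torch.distributed`, querying the rank from it would be a natural alternative to reading environment variables.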

Repository sanitization and pathway for automatic deployment of docs

repo sanitization

  • assess the structure of the repo to:
    • include docstring style documentation for all models, utils
    • include style guidelines, automated tests for linting
  • include git conventions, branch creation strategies, recommended good practices for contribution
  • include ISSUE_TEMPLATE and guidelines for contributors on creating PRs

docs and auto-deployment

  • modify the README to provide a high-level overview, a description, and an example script to get started
  • include the necessary documentation resources within the README and point to the docs site for more elaborate usage/explanations of HydraGNN
  • mention the HPC component of running HydraGNN on exascale compute clusters like Frontier, etc.
  • test the feasibility of using either pdoc3 or readthedocs so that the documentation is automatically pushed to a connected web endpoint

Test with external datasets

  • torch_geometric.datasets
    • QM9
    • MD17
  • [ x ] Materials Project
  • [ x ] Open Catalyst 2020
  • [ x ] Open Catalyst 2022
    etc.

Support MPI backend

This is mostly motivated by the use case of running one instance of Hydra per GPU on a node. I was only able to do so with the mpi backend and iterating through the LSB_HOSTS list.

@jychoi-hpc if there's a better way, please let me know; otherwise, I'll plan to add something like what I describe

tutorial questions

Can you please clarify where "within HydraGNN" is? Is it the JSON file? Can you please describe in detail the steps to run an example using the FePt.zip dataset?

Set the path to the selected dataset within HydraGNN and run

Is the ADIOSData file required if users are not on Summit?

Thanks
