singroup / matid

MatID is a python package for identifying and analyzing atomistic systems based on their structure. MatID is designed to help researchers in the automated analysis and labeling of atomistic datasets.

Home Page: https://singroup.github.io/matid/

License: Apache License 2.0

Python 99.99% Shell 0.01%
data-analysis high-throughput materials-science classification python

matid's People

Contributors

jan-janssen, lauri-codes

matid's Issues

Classifier does not scale with system size

The execution time of matid.Classifier.classify grows faster than linearly with system size. For systems with more than 1,000 atoms, classification takes over an hour on a decent CPU.

I'll try to look into it myself, but maybe you already have an idea or know about this limitation? Or is there a way to tune the behaviour? The Classifier class seems to have a plethora of thresholds and similar parameters.
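To put a number on "faster than linear", one option is to time the call for a few system sizes and fit a slope on a log-log scale. The following is a minimal sketch using only the standard library; the helper names `time_call` and `scaling_exponent` are hypothetical and not part of matid, and the timings in the demo are synthetic rather than measured.

```python
import math
import time


def time_call(fn, *args, repeats=1):
    """Return the best wall-clock time in seconds of fn(*args) over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 suggests linear scaling, near 2 quadratic, and so on.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Synthetic timings for a routine whose cost grows quadratically
    sizes = [100, 200, 400, 800]
    times = [1e-6 * n ** 2 for n in sizes]
    print(scaling_exponent(sizes, times))
```

In practice the synthetic `times` list would be replaced by values from `time_call(classifier.classify, atoms)` measured on structures of increasing atom count, for example repeated copies of the same unit cell.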

ValueError: Negative values in data passed to `pairwise_distances`

Dear Sir

I have recently been trying to identify the dimensionality of some crystal structures.
MatID is very powerful for this, so I tried to use it. However, I could not analyze the results because of an error.

The following is my input Python script (several *.cif files obtained from the Materials Project are located in the path D:/test):

import os
import json
import ase.io
from matid import Classifier

inpath = "D:/test"

# Collect every .cif file under inpath as a (filename, ase.Atoms) pair
geometries = []
for root, dirs, files in os.walk(inpath):
    for i_file in files:
        if i_file.endswith("cif"):
            i_atoms = ase.io.read("{}/{}".format(root, i_file))
            geometries.append((i_file, i_atoms))

# Classify each structure in turn
classifier = Classifier()
classifications = []
for i_file, i_geom in geometries:
    print("Classifying")
    i_cls = classifier.classify(i_geom)
    print("Done")
    classifications.append(i_cls)

After running the script, the following error is raised:
ValueError: Negative values in data passed to `pairwise_distances`. Precomputed distance need to have non-negative values.
Could you please help me solve this problem? Thank you very much!

Attachment:
(1) the .cif file

# generated using pymatgen
data_Na8SnSb4
_symmetry_space_group_name_H-M 'P 1'
_cell_length_a 10.55930848
_cell_length_b 10.55930848
_cell_length_c 10.55930848
_cell_angle_alpha 60.00000000
_cell_angle_beta 60.00000000
_cell_angle_gamma 59.99999998
_symmetry_Int_Tables_number 1
_chemical_formula_structural Na8SnSb4
_chemical_formula_sum 'Na16 Sn2 Sb8'
_cell_volume 832.51378766
_cell_formula_units_Z 2
loop_
 _symmetry_equiv_pos_site_id
 _symmetry_equiv_pos_as_xyz
  1 'x, y, z'
loop_
 _atom_site_type_symbol
 _atom_site_label
 _atom_site_symmetry_multiplicity
 _atom_site_fract_x
 _atom_site_fract_y
 _atom_site_fract_z
 _atom_site_occupancy
  Sn Sn16 1 0.125000 0.125000 0.125000 1
  Sn Sn17 1 0.875000 0.875000 0.875000 1
  Sb Sb18 1 0.209603 0.763466 0.763466 1
  Sb Sb19 1 0.763466 0.209603 0.763466 1
  Sb Sb20 1 0.790397 0.236534 0.236534 1
  Sb Sb21 1 0.236534 0.790397 0.236534 1
  Sb Sb22 1 0.236534 0.236534 0.790397 1
  Sb Sb23 1 0.763466 0.763466 0.763466 1
  Sb Sb24 1 0.763466 0.763466 0.209603 1
  Sb Sb25 1 0.236534 0.236534 0.236534 1

(2) .cif file

# generated using pymatgen
data_Na3UO4
_symmetry_space_group_name_H-M 'P 1'
_cell_length_a 6.78243980
_cell_length_b 6.78243901
_cell_length_c 6.99866654
_cell_angle_alpha 58.93968265
_cell_angle_beta 58.93968585
_cell_angle_gamma 60.60349503
_symmetry_Int_Tables_number 1
_chemical_formula_structural Na3UO4
_chemical_formula_sum 'Na6 U2 O8'
_cell_volume 224.90403452
_cell_formula_units_Z 2
loop_
 _symmetry_equiv_pos_site_id
 _symmetry_equiv_pos_as_xyz
  1 'x, y, z'
loop_
 _atom_site_type_symbol
 _atom_site_label
 _atom_site_symmetry_multiplicity
 _atom_site_fract_x
 _atom_site_fract_y
 _atom_site_fract_z
 _atom_site_occupancy
  O O6 1 0.216007 0.216007 0.283639 1
  O O7 1 0.213157 0.786660 0.250091 1
  O O8 1 0.216007 0.216007 0.784347 1
  O O9 1 0.213340 0.786843 0.749909 1
  O O10 1 0.786660 0.213157 0.250091 1
  O O11 1 0.783993 0.783993 0.215653 1
  O O12 1 0.786843 0.213340 0.749909 1
  O O13 1 0.783993 0.783993 0.716361 1
  U U14 1 0.000000 0.000000 0.000000 1
  U U15 1 0.000000 0.000000 0.500000 1
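When a batch loop like the one in this report stops at the first exception, it is hard to tell which input triggered the ValueError. A possible way to narrow it down is to wrap each call and record failures per file. The sketch below is only a triage aid and does not fix the underlying error; `classify_all` is a hypothetical helper, and the stand-in classifier in the demo merely imitates a failing call.

```python
def classify_all(geometries, classify):
    """Classify each (name, structure) pair, recording failures instead of stopping.

    `classify` is any callable taking one structure, for example the bound
    method `Classifier().classify` from the script above.
    Returns (results, failures), two dicts keyed by name.
    """
    results = {}
    failures = {}
    for name, structure in geometries:
        try:
            results[name] = classify(structure)
        except Exception as exc:  # broad on purpose: we only want to triage
            failures[name] = "{}: {}".format(type(exc).__name__, exc)
    return results, failures


if __name__ == "__main__":
    # Stand-in for a real classifier: fails on negative input
    def fake_classify(value):
        if value < 0:
            raise ValueError("Negative values in data")
        return "classified"

    results, failures = classify_all(
        [("good.cif", 1), ("bad.cif", -1)], fake_classify
    )
    print(results)
    print(failures)
```

With the original script, the call would be `classify_all(geometries, classifier.classify)`, and the keys of `failures` would name the .cif files worth attaching to the report.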

Conda package

Hi @lauri-codes

I created a conda-forge package for matid. Would you be interested in joining me in maintaining it?

Best,

Jan

Misclassification of stacked system

The single surface is correctly classified as <Classification.Surface: '2D Surface'>, but the same surface twice as a stacked system is classified as <Classification.Class2D: '2D Generic'>.

For the stacked system, the command clusters[1].cell() raises AttributeError: 'NoneType' object has no attribute 'cell'.

Further development and plans?

Good day Lauri,

It was very interesting for me to learn about matid! Many years ago I actually did something similar, although on a less serious level. In the future I would definitely be willing to re-use matid. Here are some questions I'd like to ask:

  • What is your general vision for further development? For example, do you plan to interface with and support other Python materials libraries such as AiiDA, pymatgen etc.?
  • Do you plan to consider more concrete structural classes, i.e. detection of perovskites, spinels, schorls, Friauf-Laves types and adamantanes? This would of course be very useful and scientifically interesting, but it is a difficult task.
  • In my experience I've encountered some really philosophical questions about classification. Consider e.g. a VASP calculation of a molecule in a box, where the distance between the replicas is just a couple of Å. Should it be considered 3D or 0D? There are other, more complicated examples involving 2D adsorption and so on. What is your approach?
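The molecule-in-a-box question in the last item can be made concrete with a deliberately naive heuristic: count the cell axes along which the vacuum gap between a structure and its periodic image falls below a cutoff. This is only an illustration of the ambiguity, not matid's approach. It assumes an orthorhombic cell, Cartesian positions in Å, a structure that is not wrapped across a cell boundary, and an arbitrary 3 Å cutoff; the function name `periodic_directions` is hypothetical.

```python
def periodic_directions(positions, cell_lengths, max_gap=3.0):
    """Count the axes along which neighbouring replicas are closer than max_gap.

    positions    : list of (x, y, z) Cartesian coordinates in angstroms
    cell_lengths : (a, b, c) of an orthorhombic cell in angstroms (assumption)
    max_gap      : vacuum gap in angstroms below which replicas count as
                   connected (arbitrary illustrative cutoff)
    """
    count = 0
    for axis, length in enumerate(cell_lengths):
        coords = [p[axis] for p in positions]
        extent = max(coords) - min(coords)
        gap = length - extent  # vacuum between the structure and its image
        if gap <= max_gap:
            count += 1
    return count


if __name__ == "__main__":
    # Rough three-atom molecule; coordinates are illustrative only
    molecule = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
    print(periodic_directions(molecule, (10.0, 10.0, 10.0)))  # large box
    print(periodic_directions(molecule, (2.8, 2.8, 2.8)))     # tight box
```

The same atoms come out as 0D in the large box and 3D in the tight one, so the answer depends entirely on the chosen cutoff, which is the ambiguity the question raises.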

Overall, thanks for your efforts, they are really appreciated!
