barbagroup / bempp_exafmm_paper

Manuscript repository for our research paper, including reproducibility packages for all results, and latex source files.

Jupyter Notebook 94.50% Shell 0.42% Python 0.96% TeX 4.12%

bem biophysics electrostatics fmm

bempp_exafmm_paper's Issues

Round 2, Reviewer 6 comments

The authors present a Python PB solver based on a fast multipole method library and a Galerkin boundary element method package. The method integrates a Python interface with optimized computational kernels, and can be run interactively via Jupyter notebooks for ease of prototyping. Besides describing the mathematical frameworks, they further present results demonstrating the capability of the software, confirming code correctness, and assessing performance with between 8,000 and 2 million elements. To illustrate the interactive computing platform, they studied the conditioning of two variants of the boundary integral formulation with just a few lines of code. Mesh-refinement studies confirm convergence as 1/N, for N boundary elements, and a comparison with results from the APBS code using various proteins shows agreement. Performance results include timings, breakdowns, and computational complexity. To highlight the high performance of the method, they illustrate the computation of the solvation free energy of a Zika virus structure, represented by 1.6 million atoms and 10 million boundary elements.

A. On the intended audience, I think the intentions of the authors to provide a new platform for “high productivity and high performance” are reasonable for a journal interested in publishing articles reporting new resources.

B. On the intended functions, the authors are proposing a Python-notebook-based workflow for “high productivity and high performance”. The innovation is not in the underlying algorithms, which are already published elsewhere, but in the development of a workflow, which is a reasonable effort worth reporting.

C. On the agreement with existing software packages. There are some numerical disagreements between this implementation and those in the literature. It is possible that these are due to the different treatments of the salt effect. The boundary-element methods model the salt term all the way to the molecular surface in the solvent region, while finite-difference methods usually model the salt term a single ion layer (roughly 2 Angstrom) away from the molecular surface. If the salt concentration is not very low and the solute net charges are high enough, the differences in their solvation energies can be large. As the detail on the salt term is not given for the molecular calculations, it is unclear whether this could account for the observed differences. Another possibility lies in the different numerical surfaces used in different methods. Unless the exact numerical surfaces are used, it is unlikely to have very high numerical consistency even if the same atomic parameters are used. Overall, PB solvation energies are very sensitive to where the molecular surface is located.

D. Other comments.

This reviewer's major issue is that the manuscript does not appear to be one reporting/showcasing computing resources. The presentation of the manuscript is not consistent with the claimed goals in providing a Python workflow. Instead, it is more like a regular manuscript reporting a new PB solver, even after multiple rounds of revisions. A big chunk of the manuscript is on validation of the algorithm/implementation. These materials dilute the main theme of “high productivity and high performance” and would be better moved to appendices. It is recommended that the authors refer to resource articles in Bioinformatics and Nucleic Acids Research on how to showcase their software functionalities.

The bottom line is that the authors need to talk more about the workflow functionalities. You do not just say your software has certain functionalities without showing. Specifically,

• The authors need to show the ease of using the Python Jupyter notebooks by using figures/screenshots that show how prototyping is done, with actual lines of code in the text, for playing with different methods as the authors mentioned. Of course, more samples can be posted online.

• The authors need to show how to use their Jupyter notebooks to compute energies for multiple structures from MD, as claimed, since they agree that a single solvation energy does not make much sense for a virus particle. Note that they need to conduct statistics as well. If possible, visualization of the data would be great too.
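
A minimal sketch of the kind of statistics this bullet asks for, assuming one solvation energy per MD snapshot has already been computed by some solver. The function name and the sample values are hypothetical placeholders, not results from the paper, and the standard error shown here assumes statistically independent snapshots, which consecutive MD frames usually are not.

```python
import math
import statistics


def summarize_energies(energies_kcal_mol):
    """Return mean, sample standard deviation, and standard error of the mean."""
    n = len(energies_kcal_mol)
    if n < 2:
        raise ValueError("need at least two snapshots for a standard deviation")
    mean = statistics.mean(energies_kcal_mol)
    stdev = statistics.stdev(energies_kcal_mol)  # sample (n - 1) estimator
    sem = stdev / math.sqrt(n)  # assumes independent snapshots
    return mean, stdev, sem


# Placeholder values only: one energy (kcal/mol) per hypothetical snapshot.
placeholder_energies = [-1.0, -2.0, -3.0]
mean, stdev, sem = summarize_energies(placeholder_energies)
print(f"mean = {mean:.2f}, stdev = {stdev:.2f}, sem = {sem:.2f} kcal/mol")
```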

• To highlight the scalability of their PB solver, a single virus particle is not enough. Instead, time scaling for a range of smaller to larger proteins/nucleic acids and molecular complexes, maybe all the way to a virus, will be very helpful. If they can compare with APBS with its default recommended setting, it will be great.

• Section “Mesh refinement study using a spherical molecule” is better labeled as comparison with analytical solutions. Otherwise, it is more appropriate to combine it with the next section on “Mesh refinement study using 5PTI”.

• Section “Comparison with trusted community software” partially overlaps with the Mesh Refinement sections, as both sections address convergence. The authors can just show the extrapolated energy values from both methods to save space and stay focused on the main point. If they want to compare with MIBPB, all three values can be put into the same table, saving space for more important topics.

• In section “Comparison with trusted community software” it is difficult to assess the quality of the numerical calculations, as the authors do not provide the actual grid spacing used, as they do for their BEM method. Please describe the finite-difference resolution in terms of grid spacing (Angstrom) or grid points per Angstrom. The number of grid points is not very useful.

• It is also surprising to see that the APBS convergence scaling is far from second order, which should be the case for most widely used finite-difference PB solvers. It is not clear whether this is because the grid spacing is still too large or for some other reason. They may want to refer to the original APBS papers to see why this is the case.

• There should be a section on how the authors prepared all molecules. Only the section on 5PTI has such a discussion, but it does not mention the dielectric constants (solute/solvent regions) or the ionic strength/kappa for the salt term.

• There are no citations for PDB2PQR and the CHARMM force field. There are no citations for the Bempp and Exafmm packages.
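
The conversion from number of grid points to grid spacing requested above can be sketched as follows. This assumes a uniform grid with points on both ends of the box, so that n points span n - 1 intervals; the function names are hypothetical, and whether this matches the exact grid definition of a given finite-difference solver should be checked against that solver's documentation.

```python
def grid_spacing(box_length_angstrom, n_points):
    """Spacing of a uniform grid with points on both box boundaries."""
    if n_points < 2:
        raise ValueError("need at least two grid points")
    return box_length_angstrom / (n_points - 1)


def points_per_angstrom(box_length_angstrom, n_points):
    """Reciprocal of the spacing: grid points per Angstrom along one axis."""
    return 1.0 / grid_spacing(box_length_angstrom, n_points)


# Illustrative box only: 64 Angstrom per side, two hypothetical resolutions.
for n in (65, 129):
    h = grid_spacing(64.0, n)
    print(f"{n} points -> h = {h} Angstrom")
```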

Round 3, Editor's comments (decision: reject)

Received: February 15, 2022.

Your manuscript "High-productivity, high-performance workflow for virus-scale electrostatic simulations with Bempp-Exafmm" has now been seen by 3 referees (Reviewers # 3, # 5, and # 6), whose comments appear below. In the light of their advice, we have decided that we cannot offer to publish your manuscript in Nature Computational Science. [emphasis added]

I would like to briefly explain the rationale of our decision to you. As you know, your paper had originally 4 reviewers; Reviewer # 3 was enlisted specifically to give a more technical perspective on the paper, as this reviewer is an expert in Poisson-Boltzmann and numerical analysis. Given the initial disagreements between this reviewer and your team, and in order for us to make a decision as fair as possible, we decided to enlist 2 extra reviewers (Reviewers # 5 and # 6), who are also experts in Poisson-Boltzmann and numerical analysis. Based on the initial assessment given by Reviewers # 5 and # 6, we then decided to send a Revise decision.

The reviews are now back, and none of the reviewers (Reviewers # 3, # 5 and # 6) are satisfied with the revision. Given the previous disagreements between your team and Reviewer # 3, we initially focused our attention on Reviewers # 5 and # 6. Unfortunately, Reviewer # 5 only added their comments to the section that is visible to the editors, and they haven't replied to us about moving their comments to the section that is visible to the authors. But to summarize their comments, this reviewer mentioned that providing more evidence for the accuracy of the software is important, but that their suggestion was not entirely taken into account. Overall, they mentioned that they are not enthusiastic about the paper, and that they think that the paper does not present proper benchmarking and comparison. Regarding Reviewer # 6, this reviewer was also not entirely satisfied with the revision, pointing out that their validation suggestion was not properly addressed. This reviewer also highlights similar issues as mentioned by Reviewer # 3 regarding the performance study and comparisons.

Therefore, given that none of these reviewers are supportive of the paper, we unfortunately will have to decline publication of your manuscript. I understand that this is disappointing, given all the effort and time that you and your team have put in this paper (which we really appreciated), but we need to make difficult decisions based on the reviews, and also based on the different expertise that we enlist.

PS: We did not understand Reviewer # 3's comments on the code (that the license prevents anonymous downloading and verification), but I just wanted to clarify that this particular comment was not taken into account in our decision.

I am sorry that we cannot be more positive on this occasion, but hope that you find the referees' comments helpful when preparing your paper for resubmission elsewhere.

First revision, 26 May 2021

(Links below will open the PDF with the Google Docs viewer. Default GitHub behavior would be to download the file instead.)

Article file diff, using latexdiff
revised_manuscript_marked_up.pdf

Response to reviewers
response-to-reviewers.pdf

Cover letter to the editor, to accompany the first revision, submitted 26 May 2021:
Cover_letter_to_the_editor.pdf

Add DOIs for all references

All refs need DOI

Round 2, Reviewer 3 comments

Reviewer #3 (Remarks to the Author):

In Table 6, why are the computational domains chosen as cubes for all proteins for APBS? It is unfair to APBS in terms of efficiency.

In Table 7, the solvation free energy comparison shows very large differences (i.e., up to 1.9%) between MIBPB and the proposed method for some molecules. This is a serious problem. The convergence of MIBPB is known in the literature (see for example: Nguyen et al. "Accurate, robust, and reliable calculations of Poisson–Boltzmann binding energies." Journal of computational chemistry 38.13 (2017): 941-948.). As shown in Figure 3 of this reference, the averaged relative absolute error of the electrostatic solvation free energies for all the 153 molecules with mesh size refinements from 1.1 to 0.2 Å is under 0.3%. I suggest that the authors carry out the same calculations as Nguyen et al. to find the averaged relative absolute errors of the present method and plot them against those of MIBPB. Please report the numbers of elements at all resolutions.

As noticed by Fenley and coworkers (Influence of Grid Spacing in Poisson-Boltzmann Equation Binding Energy Estimation, J Chem Theory Comput. 2013 August 13; 9(8): 3677–3685), the calculation of Poisson–Boltzmann binding energies is a challenging task. The authors need to produce reliable Poisson–Boltzmann binding energies, as shown by Nguyen et al. (see Figure 6 of the above-mentioned JCC reference), for the challenging task proposed by Fenley and coworkers. This test will reveal the level of performance of the present method.

Reviewer #3 (Remarks to the Author After Authors' Reply):

The authors appear to deflect an essential problem for their software: It does not converge as the state-of-the-art schemes do. There is indeed no exact solution for electrostatic solvation free energies for biomolecules. However, as pointed out by Fenley and coworkers, grid independence is essential. Fenley and Amaro (https://doi.org/10.1007/978-3-319-12211-3_3) pointed out many years ago that APBS has a convergence problem. The same problem was pointed out by Geng and Krasny. Although APBS is one of the most popular PB solvers in the user community, it is well known in the community of PB and GB developers that APBS is not the most trusted solver. For example, the generalized Born (GB) solver in Amber is calibrated with MIBPB, rather than APBS (see the work of Onufriev). It is a bad idea to compare convergence patterns with APBS.
While APBS has not resolved this problem, researchers have put much effort into improving the convergence of DelPhi, PBFD used in Amber, and MIBPB in the past decade. It is clear to me, judging by Table 2, that Bempp has the same convergence problem as APBS does, which has been criticized by Amaro, Fenley and coworkers, and many others in the literature. If the authors do want to admit this problem, they should compare the convergence of Bempp with that of MIBPB for the molecules reported by Nguyen et al. I am quite sure that Bempp is not convergent as DelPhi, PBFD used in Amber, and MIBPB are. It is inappropriate for the authors to make unverified statements as PB method developers.
Three tables (i.e., Table 2, Table 6, and Table 7) appear to be designed to mislead. They should be merged into one table in which APBS, DelPhi, PBFD, MIBPB, and Bempp are compared at mesh sizes 0.25, 0.5, and 1.0 to analyse their convergence rates.
It is well known that MIBPB is not as fast as DelPhi and APBS at a given mesh for a given protein. However, it might outperform all other methods in terms of efficiency, in the sense that at a given convergence level it requires the smallest amount of time. It is unfair to compare the execution time of MIBPB in the paper. But the authors should compare efficiency after they have established the convergence characteristics of their method.
I thank the authors for bringing my attention to the work of Geng et al. in 2007. Can the authors use the designed solutions in that paper to validate their method?
It is not valid to use different boundary settings and different formulations to skip necessary comparisons as suggested in my earlier comments. All methods should give essentially the same solvation free energy for a given protein with a given interface definition.
The Galerkin formulation does not automatically make Bempp immune to the problem of geometric singularity. The geometric singularity from protein solvent excluded surfaces is much worse than Lipschitz, as shown by Geng and Krasny in their work. Krasny and Geng have been working on this issue for more than ten years. The authors need to define and construct high-order elements to achieve desirable convergence. I cannot find much related information about this aspect in this manuscript.
It might be misleading to use the Zika virus (PDB ID 6C08) as an example. This protein complex is highly symmetric. It would be silly to not make use of its symmetry in computations. It would be deceiving if symmetry is used. I would suggest the authors use the HIV viral capsid (1E6J), which is far less symmetric than the Zika viral capsid. Frankly, with the help of GPU and parallel architectures, it is quite easy for any of the above mentioned PB solvers to produce electrostatic analysis of these viruses.

Mention accuracy of FMM on timing results

Fig 13 shows time to do one FMM evaluation (Laplace and Helmholtz) versus N. It shows 1 second for 10^6 elements.

how many points? i.e., if plot shows elements (triangles), caption should say how many quad points
what FMM accuracy? often people compare these O(N) plots from one paper to another, and don't realize they are comparing timings for results with different accuracies: always a good idea to state the FMM expansion order and the result's expected digits of accuracy

Reviewer 1 comments

This is an interesting article that describes a well-engineered, robust/reproducible, and very accessible (Jupyter notebooks) Poisson-Boltzmann solver. The authors have done an excellent job describing the rationale for the software, making it straightforward to install via Conda, and ensuring that the results of the paper are reproducible. Overall, the paper is sound and should be of interest to the journal audience. However, there are several issues that detract from the overall accessibility/impact of the article:

1. The article focuses on a use case of questionable value: the solvation energies of very large macromolecules. While this quantity makes sense as a metric for evaluating accuracy/convergence with respect to mesh discretization, the notion of a "[polar] solvation energy" for a virus is not particularly meaningful. The authors should clarify why this quantity is being used and provide readers with a discussion of how the code is meant to be used for physically meaningful quantities of interest.
2. It is unclear who the target audience is for either the paper or the software. The paper spends a significant amount of space discussing issues such as interior/exterior versions of the derivative formulation and eigen-spectra of the operators. This could be interesting to developers of BEM solvers but is unlikely to be meaningful to many consumers of biomolecular electrostatics software.
3. The discussion of mass matrix preconditioning was difficult to follow. In particular, it was unclear whether inversion of the mass matrix was a necessary step for solution of the problem or a way to improve conditioning. Therefore, it was unclear whether approximation of the matrix led to any issues with accuracy of the solution.
4. Was Nanoshaper used for all meshes in the paper? This was unclear and is important for generalizing the results of the paper to other molecules.
5. /usr/bin/time -v is not a standard option for all flavors of Linux; the authors should clarify.
6. The "Laplace" and "Helmholtz" FMM variants were introduced without sufficient description. Do these refer to the Poisson and linearized Poisson-Boltzmann formulations, respectively? More generally, the paper does a poor job of describing which version of the Poisson-Boltzmann equation it is solving (this isn't introduced until Section 4).
7. There appears to be a typographical error on page 8: "Richard extrapolation" instead of "Richardson extrapolation".
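
For readers unfamiliar with the Richardson extrapolation named in item 7, here is a minimal sketch under stated assumptions: three values computed on meshes whose element count N grows by a constant ratio, and an error that behaves like C / N^p. The function name and the synthetic data are illustrative only and are not taken from the paper.

```python
import math


def richardson_extrapolate(f_coarse, f_medium, f_fine, refinement_ratio):
    """Estimate the observed order p and the extrapolated limit value.

    Assumes f(N) = f_exact + C / N**p, with N multiplied by
    `refinement_ratio` between consecutive meshes.
    """
    if f_medium == f_fine:
        raise ValueError("medium and fine values are identical")
    ratio = (f_coarse - f_medium) / (f_medium - f_fine)
    if ratio <= 1.0:
        raise ValueError("values are not converging monotonically")
    p = math.log(ratio) / math.log(refinement_ratio)
    f_extrap = f_fine + (f_fine - f_medium) / (refinement_ratio ** p - 1.0)
    return f_extrap, p


# Synthetic data that follows f(N) = -100 + 50 / N exactly (not real results).
values = [-100.0 + 50.0 / n for n in (1000, 4000, 16000)]
f_extrap, p = richardson_extrapolate(*values, refinement_ratio=4.0)
print(f"observed order p = {p:.3f}, extrapolated value = {f_extrap:.6f}")
```

The guard on `ratio` matters in practice: the formula is only meaningful when the three values approach the limit monotonically.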

Round 3, Reviewer 3

The authors stated that ``The sphere has a radius of 1 Å, and 100 charges are placed randomly inside, representing the atoms in the solute.'' This is unphysical. There cannot be 100 atoms in a sphere of a radius of 1 Å. It is also unphysical to place charges randomly because some charges can be very close to each other and the associated electrostatic potential and force become singular or divergent. The PB model becomes invalid under this situation.
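
The concern about closely spaced charges can be probed numerically. The sketch below is an illustration only: it samples 100 points uniformly inside a sphere of radius 1 (by rejection from the enclosing cube, which is an assumption about how "randomly" is meant) and reports the smallest pairwise distance. All names and the seed are hypothetical.

```python
import itertools
import math
import random


def random_points_in_sphere(n, radius, rng):
    """Draw n points uniformly inside a sphere by rejection sampling."""
    points = []
    while len(points) < n:
        x, y, z = (rng.uniform(-radius, radius) for _ in range(3))
        if x * x + y * y + z * z <= radius * radius:
            points.append((x, y, z))
    return points


def min_pair_distance(points):
    """Smallest Euclidean distance over all pairs of points."""
    return min(math.dist(a, b) for a, b in itertools.combinations(points, 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
charges = random_points_in_sphere(100, 1.0, rng)
print(f"closest pair distance: {min_pair_distance(charges):.4f} Angstrom")
```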

The authors cherry-pick one protein (1RCX) to validate their method. In the PB community, researchers typically use tens of proteins. The resulting convergence claim is not meaningful.

Unfortunately, there is a large difference between the extrapolated solutions of Bempp and `trusted' APBS (i.e., 320 kcal/mol) for the cherry-picked protein 1RCX. Note that the energy of a hydrogen bond in the solvent is typically only a few kcal/mol. Protein-protein binding affinity and protein-ligand binding affinity are typically only in the range of 0 to -20 kcal/mol. The huge difference of 320 kcal/mol indicates that Bempp is neither reliable nor meaningful for practical applications. In the literature, it is well known that APBS, MIBPB, DELPHI, and PBSA have similar free energy predictions for over a hundred molecules (Accurate, robust, and reliable calculations of Poisson–Boltzmann binding energies, Journal of Computational Chemistry 38 (13), 941-948, 2017). This deep level of inconsistency with existing PB solvers may be the real reason for the authors' rejection of my suggestion of carrying out a comparison on Marcia Fenley and coauthors' 51 complexes (J Chem Theor Comput 9(8):3677–3685, 2013.).

The authors state that MIBPB does not converge on 1A63 because its solvation energies are -586.50, -585.52, and -587.43 kcal/mol at three grid spacings: 1.0, 0.75, and 0.5 Angstrom. They seem not to understand that in the CHARMM force field, the radius of hydrogen is less than 1 Angstrom, which means some hydrogen atoms were not resolved at grid size 1.0 Angstrom, leading to oscillations. Note that there is no theoretical proof or theorem that indicates numerical solutions must be monotonic to be convergent. The authors should also test Bempp on under-resolved meshes of 1RCX, such as N = 118708 and 59354.

The authors concluded that MIBPB is not convergent because its solutions vary from -586.50 to -585.52 to -587.43 kcal/mol in their test. Please note that the total energy difference is only about 2 kcal/mol across the three meshes. In comparison, Bempp's free energy varied by over 400 kcal/mol during the mesh refinement for 1RCX in Table 5, which is 200 times larger (Bempp is also inferior in terms of relative errors). As mentioned earlier, the discrepancy between the extrapolated solutions of APBS and Bempp is 320 kcal/mol, rendering the results unphysical for computational biophysics. How can anyone believe Bempp is doing anything meaningful?
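
The "about 2 kcal/mol" figure can be checked directly from the three MIBPB values quoted in this comment. The sketch below is arithmetic only; the helper name is hypothetical, and the Bempp values are not reproduced here because the comment does not list them.

```python
def energy_spread(energies):
    """Difference between the largest and smallest value in the list."""
    return max(energies) - min(energies)


# MIBPB solvation energies for 1A63 (kcal/mol) as quoted by the reviewer
# at grid spacings of 1.0, 0.75, and 0.5 Angstrom.
mibpb_energies = [-586.50, -585.52, -587.43]

spread = energy_spread(mibpb_energies)
mean_magnitude = abs(sum(mibpb_energies) / len(mibpb_energies))
print(f"spread = {spread:.2f} kcal/mol")
print(f"relative spread = {100.0 * spread / mean_magnitude:.2f} %")
```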

Marcia Fenley and coauthors' set of 51 complexes (J Chem Theor Comput 9(8):3677–3685, 2013.) is an important benchmark test. Note that MIBPB is not the only method that has been tested for the set. Marcia Fenley and coauthors tested many PB solvers. DELPHI was also tested (see, for example, Accurate estimation of electrostatic binding energy with Poisson-Boltzmann equation solver DelPhi program, Journal of Theoretical and Computational Chemistry 15 (08), 1650071, 2016). If Bempp is as robust as the authors claimed, it only takes a couple of days to finish the recommended test. Why do the authors not just do it, rather than choosing to fight over a constructive suggestion? At this point, I am quite convinced that Bempp must behave badly for this benchmark test.

The comparison of time and memory cost with respect to error for APBS and Bempp in Figure 4 is designed to mislead. The two methods do not use the same parallel and GPU settings and cannot be compared for time and memory, not to mention that the reference solutions from the two methods differ by over 320 kcal/mol. If the solution is incorrect or unphysical, it is completely irrelevant how fast a method can generate it.

It is known that the biggest advantage for BEM on PB is they have much less memory usage. Discretization on the surface is O(N^2) while the discretization on the 3D domain is O(N^3). However, as shown in Tables 4 and 5, they have almost the same amount of memory usage compared to APBS. This makes their method less competitive.

The authors claim ``The workflow integrates an easy-to-use Python interface with optimized computational kernels, and can be run interactively via Jupyter notebooks, for faster prototyping''. Unfortunately, there is no independent verification for their claim. The authors have placed a license requirement on their software which prevents anonymous downloading and verification of their claims.

Explain assembly time

In section 3.5, we have "Table 4 presents the assembly time, the solution time, and the number of iterations to converge…"

Should we explain here what is assembly time?

Fig 3 illegible

For Figure 3: you can stack the two square sub-images vertically, to fit in column width while making each larger. At this size, they are illegible.

I also don't like the line colors (red and blue). We need a nice color scheme. I assume these were simply made on a Keynote slide?

Round 4, Reviewer 7 (Dec. 2022)

Reviewer #7 (Remarks to the Author):

This manuscript provides a pipeline and its software implementation to use Bempp-Exafmm to conduct virus-scale electrostatics simulations. After reading through the manuscript and a few related references, the reviewer had many concerns about the product, particularly its novelty, numerical accuracy, memory, and practical usage, as listed below. Based on these concerns, the reviewer rejects the publication of the manuscript in Nature Computational Science.

Novelty: the work described is mostly a reassembly or insignificant increment of the authors’ previous work [11][15][39][53][60]. The comparisons between direct [26] and derivative [27] boundary integral methods, or the interior and exterior forms, have been clearly stated in many previous works [8][9][29], such that the derivative method has an obvious advantage in convergence rate and the exterior form is faster. There is no point in reporting test results or providing options in the code to let the user choose different formulations when there is obviously a winner already (e.g. figure 1-2 and table 4).

Numerical Accuracy: This is the main concern. The only case with an analytical solution that the authors reported for assessing accuracy is in figure 2. It is not correct to measure the convergence of the numerical algorithm using the solvation energy, which is a single number, a weighted average of the reaction potential at the charge locations. They should instead use the norm of the surface potential error. In fact, from Steinbach’s argument (978-0-387-31312-2), when Galerkin boundary integral with singularity removal is used, the solution can be of O(h), which is O(1/(N^2)) as opposed to the O(1/N) reported in this manuscript. The comparison in Table 2 also raises concern: the difference seems large, and tests should be done between the proposed method and the most accurate method (maybe MIBPB) with repeatedly refined meshes.

Memory: The memory usage reported in table 6 is surprisingly large compared with previously reported boundary integral PB solvers [9][29]. The reviewer suspects that the authors might use storage extensively in exchange for efficiency.

Practical Usage: The Python code, as wrappers, should provide users from the greater computational biophysics community with convenient interfaces to the potential biological applications of the PB solver, rather than showcasing how fast the solver can be or how large a target protein the solver can handle. The authors have access to very advanced supercomputers, while most potential users do not. The wrappers developed by APBS are good examples.

Describe what is in the Jupyter notebooks

We need to describe in the paper what is provided as supplementary materials on this repository, in particular the Jupyter notebooks: what do they teach the reader?

Round 3, Reviewer 6

The authors addressed most of my minor concerns but failed to address my major critique, simply because they do not know how to produce a molecular dynamics (MD) trajectory for testing. It is within their reach to set up and collect an MD trajectory for at least one of the two tested proteins to showcase the versatility of their computational resource, as trajectory processing is a major application for PB solvers.

Considering this and their pushback on other reviewers’ comments asking for validation of their method on a larger set of real molecules, this reviewer’s enthusiasm for the manuscript is diminished. As it stands, the manuscript appears to be just one reporting preliminary efforts to develop a PB solver. Unfortunately, even for the narrower scope, the authors cannot show that their method agrees with other widely used methods under identical testing conditions.

Specifically, in subsection “Performance comparison with APBS”, the difference between the limiting values of APBS and their method is too large: ~320 kcal/mol out of ~10650 kcal/mol. The ~3% difference cannot be ignored in a comparison of limiting values, given that identical parameters (coordinates, charges, & radii) are used. It is suggested that the authors use the same surface generation routine in their analysis of this and additional real molecules to make sure the two methods do agree.
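
The ~3% figure follows from the two approximate numbers quoted in this comment. A short check, with a hypothetical helper name:

```python
def relative_difference_percent(difference, reference):
    """Absolute difference expressed as a percentage of the reference magnitude."""
    return 100.0 * abs(difference) / abs(reference)


# Approximate values quoted by the reviewer (kcal/mol).
pct = relative_difference_percent(320.0, 10650.0)
print(f"relative difference = {pct:.2f} %")
```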

Another troublesome point can be found in subsection “Performance study with direct and derivative formulations”, where the authors spend considerable effort discussing an unphysical scenario, 100 random charges inside a sphere of 1 Angstrom. It is unclear why the authors address a scenario so unrelated to computational structural biology.

In summary, it is unacceptable that the authors cannot work on additional molecules or MD trajectories as suggested. These extra data points can be used to support the validity of their implementation of a new PB solver. Researchers with working knowledge in computational structural biology can easily get these done.

How many points? (sources/targets)

In the performance section of results, Fig. 13 (timing vs. N, showing linear complexity) says "number of elements" in the x-axis label. The data point at largest N shows 2x10^6.

In the text, it says "in the largest case, with over 10 million quadrature points." It's implicit, then, that the figure plots w.r.t. the number of elements, as it says. It would be good to be explicit in the caption by mentioning the number of quadrature points per triangle, and stating the largest N is so-many points.

Elsewhere, too: captions of Table 4, Figs. 14, 15. Also, section 3.5: the Zika virus is discretized with 10 million elements. Is this 6 quadrature points per triangle, leading to 60 million FMM points? Say that.
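
The element-to-point conversion being asked about is simple arithmetic. The sketch below assumes 6 quadrature points per triangle, which is the value this issue asks about and is not confirmed here; the function name is hypothetical.

```python
def fmm_point_count(n_elements, quad_points_per_triangle):
    """Total number of FMM source/target points for a triangulated surface."""
    return n_elements * quad_points_per_triangle


# Assumption: 6 quadrature points per triangle (to be confirmed in the paper).
QUAD_POINTS = 6

print(fmm_point_count(2_000_000, QUAD_POINTS))   # largest N in Fig. 13
print(fmm_point_count(10_000_000, QUAD_POINTS))  # Zika virus mesh
```

Under that assumption, the largest case in Fig. 13 would have 12 million points, consistent with "over 10 million quadrature points" in the text.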

Reviewer 2 comments

The authors present an implementation of a numerical solver specifically designed to tackle continuum electrostatics models. Specifically, the authors present a solver for the Poisson-Boltzmann equation for a set of charges found in biomolecules. The manuscript is well written, and the technical and scientific hypotheses are sound; therefore it is worthy of publication. However, a few concerns were raised by its current form:

Major concerns

1. In the introduction, the authors claim that simulations of viruses are limited to a few elite researchers; this is simply not true. While large-scale simulations of entire viruses at the atomistic level remain a niche field, several groups have been able to conduct state-of-the-art virus research that has revealed novel biology using XSEDE resources and campus clusters. There seems to be a fine line in the computational virology community between studies that constitute computational benchmarks and studies that reveal novel biology. There are several reviews available on this topic from various authors.
2. The simulations presented in the introduction serve as computational benchmarks (e.g., Arkhipov 2007, Durrant 2020), but the biological impact of these papers is questionable. There are several computational studies on the biology of viruses that are far more relevant to the present manuscript and to the virology field at large, none of which are cited.
3. To the casual reader the present manuscript does not communicate why another Poisson-Boltzmann solver is needed by the community – it is clear to this reviewer that this is a useful and much needed implementation. However, the computational biophysics field has trusted APBS for several years due to its ease of use, reliability, and performance. There is not a direct comparison between the results from the presented software and APBS. Furthermore, the authors seem to use the APBS suite themselves as they use the PDB2PQR tool to generate the partial charges as the input in their software.
4. Like the previous point the determination of the molecular surface in the proposed approach is dependent on another external package, namely NanoShaper. Although NanoShaper is poised for FEM calculations of electrostatics; APBS includes its own molecular surface determination package that is widely used. More on this below.
5. To address both 3 and 4, the authors should make sure that the needed binding to these external packages is part of their software.
6. The authors use crystallographic and cryoEM structures as benchmarks. Another important technique for structure determination is NMR. The latter has the advantage of yielding ensembles of structures that are statistically independent. The authors should perform solvation free energy calculations on each structure of the ensemble, to establish whether their formulation is sensitive enough to small changes in the structure of the protein, which often result in large changes in the free energy of solvation.

Minor concerns:

1. Figure 4 uses two different size scales for the same particle. It would be more visually appealing if the two renders of the virus had the same size.
2. The units in the scale of the potential in Figure 4b are missing.
3. Details about the molecular surface are lacking. These are important as these parameters determine the solvent exposed area of the system.

Round 2, Reviewer 5 comments

While the manuscript reports an interesting development, it lacks two important components, listed below:

1) Acknowledging previous work on modeling the electrostatic potential and electric field of viruses. See, for example, Fig. 3 in J. Comput. Chem. 2019, 40, 2502–2508.
2) Better assessment of the accuracy of the delivered electrostatic energy by comparing with analytical solutions. For example, see the several cases in Figs. 1, 2, and 3 in BMC Biophysics 2012, 5:9.
Furthermore, the abstract and the entire paper should make clear that the number of atoms and the dimensions of the molecule/virus are not the crucial factors behind computational complexity; rather, the mesh size determines the computational effort. In addition, it should be clearly stated that the report is for the linearized PB equation, not the nonlinear one.
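
One commonly used analytical reference for this kind of accuracy check is the Born ion: a single point charge at the center of a spherical cavity, with no salt in the solvent. The sketch below evaluates its closed-form solvation energy. The function name, the default permittivities, and the choice of units (kcal/mol, with the radius in angstroms and the charge in elementary charges) are assumptions made for illustration; they are not taken from the manuscript or from the cited BMC Biophysics paper.

```python
# Coulomb constant in kcal*angstrom/(mol*e^2); a conventional value,
# used here as an assumption for unit conversion.
COULOMB_KCAL = 332.0636


def born_solvation_energy(charge, radius, eps_in=1.0, eps_out=80.0):
    """Closed-form Born solvation energy in kcal/mol.

    charge  : ion charge in elementary charges
    radius  : cavity radius in angstroms
    eps_in  : relative permittivity inside the cavity (assumed default)
    eps_out : relative permittivity of the solvent (assumed default)
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    # Energy of moving the ion from a medium with eps_in to one with eps_out.
    return COULOMB_KCAL * charge**2 / (2.0 * radius) * (1.0 / eps_out - 1.0 / eps_in)


def relative_error(computed, reference):
    """Relative error of a numerical result against the analytical value."""
    return abs(computed - reference) / abs(reference)


# Analytical reference for a unit charge in a 2-angstrom cavity.
reference = born_solvation_energy(charge=1.0, radius=2.0)
print(f"Born solvation energy: {reference:.3f} kcal/mol")
```

A numerical solvation energy computed on a sphere mesh of the same radius could then be passed to `relative_error` together with `reference`, once per mesh density, to show how the error shrinks as the mesh is refined.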

Reviewer 3 comments

There are a number of concerns regarding this work.

1) There is little innovation in this work. The formulation is well-known. There have been numerous papers on this approach.
2) It is not clear what the convergence of the method is. Note that high-order boundary-element-based Poisson-Boltzmann (PB) solvers have been developed by many authors, including Krasny and Geng.
3) The convergence of the software was not carefully validated with problems of known solutions. This validation is required to publish in a normal journal, not to mention a Nature one.
4) There is no systematic comparison with other methods in the literature for either simple geometry or complex geometry.
5) How does the package handle geometric singularities, which are common in protein surfaces?
6) I also have a concern about the paper's perspective on the field. Many important advances were not mentioned, for example: Improvements to the APBS biomolecular solvation software suite, E Jurrus, D Engel, K Star, K Monson, J Brandi, LE Felberg, DH Brookes, Protein Science 27 (1), 112-128.
7) Many online PBE solvers are available. A user just needs to give a PDB ID to get a result. However, the speed of the solver is important. I am not sure that Python code offers the best speed.
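
Concerns 2) and 3) both come down to measuring an observed order of convergence against a problem with a known solution. A minimal sketch follows, assuming the error behaves like C / N^p for N boundary elements. The function name and the sample numbers are hypothetical and are not results from the paper.

```python
import math


def observed_order(n_coarse, err_coarse, n_fine, err_fine):
    """Estimate p in err ~ C / N**p from the errors on two meshes.

    n_coarse, n_fine     : number of boundary elements on each mesh
    err_coarse, err_fine : absolute errors against the known solution
    """
    if n_fine <= n_coarse:
        raise ValueError("n_fine must be larger than n_coarse")
    if err_coarse <= 0 or err_fine <= 0:
        raise ValueError("errors must be positive")
    # Taking logs of err ~ C / N**p on both meshes and subtracting removes C.
    return math.log(err_coarse / err_fine) / math.log(n_fine / n_coarse)


# Hypothetical errors: halving the error while doubling N corresponds to p = 1.
p = observed_order(n_coarse=1000, err_coarse=0.10, n_fine=2000, err_fine=0.05)
print(f"observed order in N: {p:.2f}")
```

Reporting this estimate for each pair of successive meshes makes a convergence claim checkable without the reader having to read slopes off a log-log plot.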

Eigenvalue plots hard to read

Figs. 6 and 7 are hard to read because the markers are small. However, if we make them larger, they will overlap and obscure the pattern. What about using + markers instead?
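
A minimal Matplotlib sketch of the `+` marker idea, assuming the figures show eigenvalues in the complex plane. The helper name, marker size, and synthetic data are illustrative only and do not come from the paper's notebooks.

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_eigenvalues(eigenvalues, marker="+", size=30):
    """Scatter complex eigenvalues using thin, unfilled markers."""
    fig, ax = plt.subplots(figsize=(4, 4))
    ax.scatter(
        [z.real for z in eigenvalues],
        [z.imag for z in eigenvalues],
        marker=marker,
        s=size,           # marker area in points^2
        linewidths=0.6,   # thin strokes keep overlapping markers legible
        color="C0",
    )
    ax.set_xlabel("Re")
    ax.set_ylabel("Im")
    return fig, ax


# Synthetic eigenvalues clustered near 1, for illustration only.
rng = random.Random(0)
eigs = [complex(rng.gauss(1.0, 0.2), rng.gauss(0.0, 0.05)) for _ in range(200)]
fig, ax = plot_eigenvalues(eigs)
```

Because `+` is an unfilled marker, it can be drawn larger than a filled dot before neighboring points merge into a blob. Calling `fig.savefig(...)` on the returned figure writes it to disk.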

Editor's comments

While we ask you to address all of the points raised, the following points need to be substantially worked on:

Consider the missing references pointed out by Referees 2 and 3 (see below).
Compare your software tool with other methods in the literature, as suggested by Referees 2 and 3.
Validate your approach with problems of known solutions, as suggested by Referee 3.
Explain your use case in more detail, including why this is meaningful to the community. Provide other use cases, if possible and feasible.

Referee 2 provided the following references via email (regarding their major concerns 1 and 2):

Major concern 1) Examples of reviews available on this topic:

https://pubmed.ncbi.nlm.nih.gov/26874202/ -- Biochim Biophys Acta. 2016 Jul;1858(7 Pt B):1610-8. doi: 10.1016/j.bbamem.2016.02.007. Epub 2016 Feb 10.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7278654/
https://pubs.acs.org/doi/10.1021/acs.jpclett.8b02298
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6456034/

Major concern 2) Examples of computational studies on the biology of viruses that are more relevant:

https://science.sciencemag.org/content/370/6513/203
https://www.nature.com/articles/s41586-018-0396-4

Referee 3 provided the following information/references via email (regarding their concerns 3 and 4):

Concern 3) Grid refinement validation should be carried out for a large set of realistic biomolecules in terms of various evaluation metrics.
Concern 4) Examples of other methods:

Li, A. and Gao, K., 2016. Accurate estimation of electrostatic binding energy with Poisson-Boltzmann equation solver DelPhi program. Journal of Theoretical and Computational Chemistry, 15(08), p.1650071.
Nguyen, D.D., Wang, B. and Wei, G.W., 2017. Accurate, robust, and reliable calculations of Poisson–Boltzmann binding energies. Journal of computational chemistry, 38(13), pp.941-948.

You will also need to make some editorial changes so that it complies with our Guide to Authors at https://www.nature.com/natcomputsci/for-authors .

In particular, I would like to highlight the following points of our style:

To improve the accessibility of your paper to readers from other research areas, please pay particular attention to the wording of the paper’s abstract, which serves both as an introduction and as a brief, non-technical summary in up to 150 words. It should include the background and context of the work, ‘Here we show’ or an equivalent phrase, and then the major results and conclusions of the paper. Because researchers from other sub-disciplines will be interested in your results and their implications, it is important to explain essential but specialized terms concisely. We suggest you show your summary paragraph to colleagues in other fields to uncover any problematic concepts. We discourage having references, links, and detailed code/hardware information in the abstract, as this information will come in the Code Availability statement.

[…]

To aid in the review process, we would appreciate it if you could also provide a copy of your manuscript files that indicates your revisions by making use of Track Changes or similar mark-up tools.

Round 2, Editor's comments

Received: Sept. 21, 2021

Your manuscript "High-productivity, high-performance workflow for virus-scale electrostatic simulations with Bempp-Exafmm" was seen by the original 4 referees, and as I mentioned, 2 extra referees, whose comments are appended below. These new referees are experts in Poisson-Boltzmann and numerical analysis.

The new referees (particularly Referee #6) raised important points that perhaps will help better clarify the main goal of the paper.

The main goal of the paper is to provide a usable and reusable workflow for virus-scale electrostatic simulations; I think the goal of the paper is clear, but as Referee #6 mentions, it's not entirely clear what the workflow features are and how this compares with other PB solvers. We would recommend having a subsection under the Results section (perhaps the first subsection of this section) clearly explaining all of these features, and how they make your workflow different from standard PB solvers. It is important to make it very clear (in this subsection and in the paper in general) that you are not trying to propose a new PB solver, but rather an interactive, reusable workflow for ease of prototyping, etc.
You can move main methodological information to the Methods section, and in the Result section, just provide a brief overview of the methods.
While you are not proposing a new PB solver, it is important to demonstrate the main benefits of using your resource rather than a standard PB solver. Therefore, we think comparisons in terms of performance/convergence against other PB solvers are fair, as requested by Reviewer #3, since a resource must also show that it has comparable performance. However, we also agree with Reviewer #6 that this should not be the center of attention. Our suggestion is then to still present the comparison, but perhaps discuss these results in more detail in the Supplementary Information, and summarize the discussion in the main text, in the Results section. We would also recommend comparing with at least one more PB solver, if possible.
Please also address the other comments made by Referees #5 and #6, and provide a reply to the latest review by Referee #3.

Reviewer `#1`

The authors have done an adequate job addressing my concerns and the additions in response to the other reviewers' comments are also adequate. It still seems like the paper falls in the limbo between a computational biophysics and scientific computing audience. This is more of a concern for the editor than this reviewer: I am unsure whether to focus my comments on a computational biophysics reader's concerns (MM-PBSA should not be used to compute solvation energies of viruses) or scientific computing reader's concerns (more detailed analysis of scaling, convergence, etc.).

In summary, this paper is publishable as-is although I remain confused about the audience.

Reviewer `#2`

The authors have addressed my concerns.

Reviewer `#4`

My comments have been addressed satisfactorily.

Reviewers 3, 5, and 6 on separate issue threads.

Reviewer 4 comments

This paper presents a computational workflow combining a fast multipole method library and a Galerkin boundary element method package for solving the Poisson-Boltzmann (PB) equation for large systems such as viruses. In terms of performance, the code is among the best state-of-the-art fast BEM PB solvers. A further special contribution of the workflow is its high productivity, achieved by integrating an easy-to-use, interactive, open-source platform for coding, computing, and analyzing FMM BEM PB-type calculations. These contributions are significant to the PB developer community.

Minor comments:

1, I suggest the authors describe how Bempp treats the singularities that appear in the surface integrals. The paper only mentions “The singularity of the Green's function needs to be accounted for in the quadrature rules for integration over adjacent or identical test/trial triangles” on page 10.
2, The work only reports the CPU performance of the FMM and BEM parts. To give readers more information about a complete BEM calculation, please also provide some CPU information for the meshing part, especially for large systems.
3, In Figure 4, what is the unit of the potential?
