austekan / fiscore

Fiscore package development, updates, and the newest version release

License: GNU General Public License v3.0

R 100.00%
bioinformatics machine-learning structural-bioinformatics structural-biology

fiscore's Issues

Refactoring of PDB_Prepare

Function PDB_Prepare

This function can benefit a lot from refactoring:

  • split sub-functions into separate functions: these can be useful on their own;
  • reduce the complexity of the code;
  • optimize the code and remove dependency on dplyr;
  • correct an ugly bug;

I will address all these topics below. The refactored code is also available on GitHub:

PDB_Prepare Main Function

The refactored main function is in file Proteins.Structure.FiScore.R (this is only a convenience name for my script files; the initial name can be retained).

  • most of the code has been moved to external helper functions;
  • the lower limit on the number of amino acids (aa) can be set explicitly (default = 5);
  • the last for-loop should run more efficiently and the dependence on dplyr has also been removed;

Features

Features are extracted by the helper function features.pdb (see file Proteins.Structure.R). This function can be used on its own and should be exported by the package.

  • the function also uses the helper functions: as.type.helix and as.type.sheet;
  • the structure name is stored as a factor (for efficiency); it therefore requires an explicit as.character() when used in the main function;
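The need for as.character() can be illustrated with a minimal sketch. The names type and lookup are hypothetical and are not taken from the package; the point is only that indexing with a factor uses its integer codes, not its labels.

```r
# Hypothetical structure names stored as a factor;
# levels are sorted alphabetically: helix = 1, sheet = 2;
type = factor(c("helix", "sheet", "helix"));

# Hypothetical lookup table, deliberately NOT in level order;
lookup = c(sheet = "S", helix = "H");

# Indexing with the factor uses the integer codes (1, 2, 1),
# so the values are picked by position and come out wrong;
byCode = unname(lookup[type]);

# Indexing with the labels picks the values by name;
byName = unname(lookup[as.character(type)]);

print(byCode);
print(byName);
```

Converting once with as.character() at the point of use keeps the compact factor storage while avoiding accidental indexing by code.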

Torsions & B-Factor

Torsions and B-factors are computed by separate functions (see file Proteins.Structure.R). These functions can be used on their own as well.

  • string extraction: the vectorized version is used directly and should run far more efficiently, e.g.:
    df_resno = as.numeric(stringr::str_extract(rownames(pdb_df), "[0-9]{1,}"));
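The same vectorized extraction can be sketched in base R, without stringr. This is only an illustration: the row-name format in ids and the helper name extract_resno are assumptions, not part of the refactored code. The pattern "[0-9]+" is equivalent to "[0-9]{1,}".

```r
# Hypothetical row names; the real format may differ;
ids = c("12.A.ALA", "13.A.GLY", "105.B.SER", "X.B.UNK");

# Extract the first run of digits from each string in one vectorized pass;
# strings without digits yield NA (mirroring the behaviour of str_extract);
extract_resno = function(x) {
	m = regexpr("[0-9]+", x);
	out = rep(NA_character_, length(x));
	hasMatch = (m > 0);
	# regmatches returns only the matched substrings, in order;
	out[hasMatch] = regmatches(x, m);
	return(as.numeric(out));
}

resno = extract_resno(ids);
print(resno);
```

No explicit loop over the rows is needed: the whole character vector is processed in a single call.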

An ugly bug was also corrected:

  • the torsions function now stores an attribute with the complete cases (as a logical vector):
    attr(pdb_df, "complete") = isComplete;
  • the BFactor function explicitly uses this information to select only the complete cases;
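The attribute mechanism described above can be sketched as follows. The functions torsions_sketch and bfactor_sketch and the toy data are hypothetical stand-ins, assuming that incomplete rows are those with missing torsion angles; they are not the refactored functions themselves.

```r
# Hypothetical stand-in for the torsions helper: keeps the complete rows
# and records which of the ORIGINAL rows were complete;
torsions_sketch = function(tors_df) {
	isComplete = complete.cases(tors_df);
	out = tors_df[isComplete, , drop = FALSE];
	# set the attribute after subsetting, so that it is not lost;
	attr(out, "complete") = isComplete;
	return(out);
}

# Hypothetical stand-in for the B-factor helper: selects the values
# that correspond to the complete rows of the torsions data frame;
bfactor_sketch = function(b, tors_df) {
	isComplete = attr(tors_df, "complete");
	if(is.null(isComplete)) stop("Attribute 'complete' is missing!");
	if(length(isComplete) != length(b)) stop("Length mismatch!");
	return(b[isComplete]);
}

# Toy data: the first and the last residue lack one angle;
tors = data.frame(phi = c(NA, -60, -120, 55), psi = c(140, -45, 130, NA));
b = c(10, 20, 30, 40);

tors_ok = torsions_sketch(tors);
b_ok = bfactor_sketch(b, tors_ok);
print(b_ok);
```

Passing the logical vector along with the data keeps the two results aligned row by row, instead of relying on each function to recompute the same filter.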

Other

  • read.pdb is a minor helper function that is not actually used in the code;

The refactored code should be faster and more robust. The function names are provisional and may be changed or adapted to better suit various workflows.

Note:

  • the refactored code has NOT been thoroughly tested!

[PDB_prepare.R] Optimize helper function MINMAX_normalisation_func

Helper function MINMAX_normalisation_func

The range (min, max) is computed multiple times. It should be computed only once. Part of the condition also seems redundant: if (r[1] == 0) && (r[2] == 0) holds, then min == max, so the additional test r[2] - r[1] == 0 adds nothing.

Also, inserting a space between "#" and the comments greatly increases readability.

### Helper functions for the analysis

# MIN-MAX normalisation based on the input array
MINMAX_normalisation_func = function(array) {
	# input = numeric array;
	# returns normalised array values;
	
	# check for cases where all B-factor values are 0;
	# the range is computed only once; min == 0 and max == 0 already imply max - min == 0;
	r = range(array);
	if((r[1] == 0) && (r[2] == 0)) { return (0); }
	
	return ((array - r[1]) / (r[2] - r[1]));
}
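A possible further variant is sketched below, for illustration only. The name minmax_sketch is hypothetical, and returning a vector of zeros for any constant input is an assumption that differs from the function above, which returns a single 0 only when all values are 0 and would otherwise divide by zero (yielding NaN) for a constant non-zero array.

```r
# Hypothetical variant: the range is computed once and any constant input
# (not only all zeros) is mapped to zeros of the same length;
minmax_sketch = function(x) {
	r = range(x);
	d = r[2] - r[1];
	if(d == 0) { return(rep(0, length(x))); }
	return((x - r[1]) / d);
}

res_norm  = minmax_sketch(c(10, 20, 30));
res_zero  = minmax_sketch(c(0, 0, 0));
res_const = minmax_sketch(c(5, 5));
print(res_norm);
```

Whether a constant array should map to zeros, to NA, or raise an error depends on how the normalised B-factors are used downstream; this is a design decision for the package author.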
