dunnkers / fseval

Benchmarking framework for Feature Selection and Feature Ranking algorithms 🚀

Home Page: https://dunnkers.com/fseval

License: MIT License

Python 91.04% JavaScript 7.03% CSS 1.04% TypeScript 0.60% Dockerfile 0.19% Shell 0.09%
benchmarks feature-rankers feature-selection wandb hydra machine-learning python feature-ranking benchmarking benchmarking-framework scikit-learn automl

fseval's Introduction

fseval

Benchmarking framework for Feature Selection and Feature Ranking algorithms 🚀

Demo

Open In Colab

Install

  1. Installation through PyPI ⭐️ RECOMMENDED OPTION

    pip install fseval
  2. Installation from source

    git clone https://github.com/dunnkers/fseval.git
    cd fseval
    pip install -r requirements.txt
    pip install .

You can now use import fseval in your Python code, or use the fseval command in your terminal. For an example, run fseval --help. For more information, see the documentation link below.

Documentation

See the documentation.

About

Built at the University of Groningen and published in The Journal of Open Source Software (JOSS).

The project has early roots in another project: a feature selection algorithm called FeatBoost (see the full citation below).

A. Alsahaf, N. Petkov, V. Shenoy, G. Azzopardi, "A framework for feature selection through boosting", Expert Systems with Applications, Volume 187, 2022, 115895, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2021.115895.

The open source Python code of FeatBoost is available at https://github.com/amjams/FeatBoost.


2023 - Jeroen Overschie

fseval's People

Contributors

amjams, dependabot[bot], dunnkers, geazzo

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

fseval's Issues

Pipeline: compatibility with sklearn

Can we still make the pipeline compatible with sklearn?

  • The Pipeline class now takes Dataset, CrossValidator and friends.
  • -> Can we pass these via cfg instead? See the sketch below.
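
A minimal sketch of what this could look like (hypothetical, not fseval's actual API): accept plain config in the constructor and instantiate Dataset / CrossValidator lazily inside fit(), so that get_params()/set_params() keep working the way sklearn expects.

# hypothetical sketch of an sklearn-compatible Pipeline that takes config instead of objects
from hydra.utils import instantiate
from sklearn.base import BaseEstimator

class Pipeline(BaseEstimator):
    def __init__(self, dataset_cfg=None, cv_cfg=None):
        # storing constructor args unmodified keeps get_params()/set_params() intact
        self.dataset_cfg = dataset_cfg
        self.cv_cfg = cv_cfg

    def fit(self, X, y):
        # build the heavier objects only when fitting
        dataset = instantiate(self.dataset_cfg)
        cv = instantiate(self.cv_cfg)
        # ... run the feature ranking / validation steps using `dataset` and `cv` ...
        return self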

Multiprocessing support - caching

To support caching:

  • Make wandb.run.id, and everything else we need to save pickled files to the filesystem, available early on in the _fit_estimator context. See the caching sketch below.
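
One way the caching itself could look (a sketch; fit_estimator_cached, cache_dir and fold are made-up names, not fseval's API): key pickled estimators by the wandb run id, so a worker can restore a fitted estimator instead of re-fitting it.

import os
import pickle

def fit_estimator_cached(estimator, X, y, cache_dir, run_id, fold):
    """Fit an estimator, or restore it from a pickle cache if one already exists."""
    os.makedirs(cache_dir, exist_ok=True)
    cache_file = os.path.join(cache_dir, f"{run_id}_fold_{fold}.pickle")

    if os.path.exists(cache_file):
        # cache hit: skip re-fitting
        with open(cache_file, "rb") as handle:
            return pickle.load(handle)

    estimator.fit(X, y)
    with open(cache_file, "wb") as handle:
        pickle.dump(estimator, handle)
    return estimator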

Allow decoupling metric calculation

This way, users can re-run runs and compute new metrics by themselves.

  • ⚠ score["importance/r2_score"].astype(float) was removed: we need to make sure the DataFrame dtypes are always set explicitly, even if all values are null (see the sketch below).
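
For illustration (not fseval code; only the column name is taken from the metric list below), explicitly setting the dtype keeps the column well-typed even when every value is null:

import pandas as pd

score = pd.DataFrame({"importance/r2_score": [None, None]})
print(score.dtypes)  # object: ambiguous when all values are null
score = score.astype({"importance/r2_score": float})
print(score.dtypes)  # float64, even though the column is entirely NaN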

Moving metric calculation to new API:

  • importance/r2_score
  • importance/log_loss
  • support/accuracy
  • ranking/r2_score
  • Upload validation tables
  • Upload ranking tables
  • Upload charts

Allow custom metrics:

  • Subset validator

Incorrect feature importance ground-truths

Bootstrap sampling also appears to shuffle the dataset dimensions, e.g.:

(screenshot: Screen Shot 2021-06-16 at 23 27 51)

.. where, presumably, the relevant dimensions should be the first 4. This might not be the case in practice.

-> Does the Resample bootstrap reshuffle the dataset dimensions, i.e. make different dimensions relevant than the ones defined in the ground truth? See also the quick check after the code below.

def _score_with_feature_importances(self, score, X_importances):
    """Scores this feature ranker with the available dataset ground-truth relevant
    features, which are to be known apriori. Supports three types of feature rankings:
    - a real-valued feature importance vector
    - a boolean-valued feature support vector
    - an integer-valued feature ranking vector."""

    ### Feature importances
    if self.ranker.estimates_feature_importances:
        # predicted feature importances, normalized.
        y_pred = np.asarray(self.ranker.feature_importances_)
        y_pred = y_pred / sum(y_pred)

        # r2 score
        y_true = X_importances
        score["importance.r2_score"] = r2_score(y_true, y_pred)

        # log loss
        y_true = X_importances > 0
        score["importance.log_loss"] = log_loss(y_true, y_pred, labels=[0, 1])

    ### Feature support
    if self.ranker.estimates_feature_support:
        # predicted feature support
        y_pred = np.asarray(self.ranker.feature_support_, dtype=bool)

        # accuracy
        y_true = X_importances > 0
        score["support.accuracy"] = accuracy_score(y_true, y_pred)

    ### Feature ranking
    # grab ranking through either (1) `ranking_` or (2) `feature_importances_`
    ranking = None
    if self.ranker.estimates_feature_ranking:
        ranking = self.ranker.feature_ranking_
    elif self.ranker.estimates_feature_importances:
        ranking = self.ranker.feature_importances_

    # compute ranking r2 score
    if ranking is not None:
        # predicted feature ranking, re-ordered and normalized.
        y_pred = self._scores_to_ranking(ranking)
        y_pred = y_pred / sum(y_pred)

        # convert ground-truth to a ranking as well.
        y_true = self._scores_to_ranking(X_importances)
        y_true = y_true / sum(y_true)

        # in r2 score, only consider **relevant** features, not irrelevant ones. in
        # this way, when `X_importances = [0, 2, 4, 0, 0]` we do not get misleadingly
        # high scores because the ranking also orders the irrelevant features.
        sample_weight = np.ones_like(X_importances)
        sample_weight[X_importances == 0] = 0.0

        # r2 score
        score["ranking.r2_score"] = r2_score(
            y_true, y_pred, sample_weight=sample_weight
        )
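
As a quick check on the question above (a sketch, assuming the Resample bootstrap is based on sklearn.utils.resample): resample only draws samples (rows) with replacement and does not permute columns, so any reshuffling of the feature dimensions would have to happen elsewhere.

import numpy as np
from sklearn.utils import resample

X = np.arange(20).reshape(5, 4)       # 5 samples, 4 features
X_boot = resample(X, random_state=0)  # bootstrap over samples (rows)

# every bootstrapped row equals an original row, so the feature columns are not permuted
print(all((row == X).all(axis=1).any() for row in X_boot))  # True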

Add diagram in README

In README, add a diagram explaining the pipeline. Where CV is applied, bootstraps, etc.

By default, return results as DataFrames

Return:

  1. The experiment config as a DataFrame (this is not strictly necessary; the user already has access to the config at this point)
  2. The on_table columns - each as one DataFrame

e.g.

@hydra.main(config_path="conf", config_name="my_config")
def main(cfg: PipelineConfig) -> None:
    results: dict = run_pipeline(cfg)

    # append to all these results
    results["feature_importance"].to_csv("my_results.csv")


if __name__ == "__main__":
    main()

tabnet + xor error

[2021-05-08 18:34:31,725][fseval.experiment][INFO] - TabNet feature ranking: [0.25795756, 0.31239626, 0.00081796, 0.077292, 0.05039946, 0.13356205, 0.00071954, 0.00776358, 0.00080948, 0.15828211]
wandb: WARNING feature_importances_ or coef_ attribute not in classifier. Cannot plot feature importances.
Error executing job with overrides: ['ranker=tabnet', 'dataset=xor']
Traceback (most recent call last):
  File "/Users/dunnkers/git/fseval_2.0/fseval/main.py", line 10, in main
    experiment.run()
  File "/Users/dunnkers/git/fseval_2.0/fseval/experiment.py", line 71, in run
    ranker_log["ranker_log_loss"] = log_loss(
  File "/Users/dunnkers/.pyenv/versions/3.9.2/envs/fseval/lib/python3.9/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/Users/dunnkers/.pyenv/versions/3.9.2/envs/fseval/lib/python3.9/site-packages/sklearn/metrics/_classification.py", line 2249, in log_loss
    transformed_labels = lb.transform(y_true)
  File "/Users/dunnkers/.pyenv/versions/3.9.2/envs/fseval/lib/python3.9/site-packages/sklearn/preprocessing/_label.py", line 350, in transform
    return label_binarize(y, classes=self.classes_,
  File "/Users/dunnkers/.pyenv/versions/3.9.2/envs/fseval/lib/python3.9/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/Users/dunnkers/.pyenv/versions/3.9.2/envs/fseval/lib/python3.9/site-packages/sklearn/preprocessing/_label.py", line 543, in label_binarize
    raise ValueError("%s target data is not supported with label "
ValueError: continuous target data is not supported with label binarization
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
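
The error is raised because log_loss receives a continuous ground-truth vector, and continuous targets cannot be label-binarized. A minimal sketch of the likely fix (variable names are illustrative), mirroring the y_true = X_importances > 0 step used in the scoring code above:

import numpy as np
from sklearn.metrics import log_loss

X_importances = np.array([0.0, 0.5, 0.5, 0.0])  # continuous ground-truth importances
y_pred = np.array([0.26, 0.31, 0.08, 0.05])     # ranker's predicted importances

# binarize the ground truth (relevant vs. irrelevant) before computing log loss
y_true = X_importances > 0
print(log_loss(y_true, y_pred, labels=[0, 1]))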

Revamp

Config

  • Put all config that should actually be built-in inside config.py, e.g. using cs.store (see the sketch below). Remove all yaml files: this is a library.
  • Expose all .py config modules in one file, e.g. StorageConfig, EstimatorConfig. This way, the library can be configured much more easily.
  • Move all .yaml config that should not be built-in to the testing environment.
  • Put everything from my_config.yaml into config.py.
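
A minimal sketch of what registering built-in config through Hydra's ConfigStore could look like (the dataclass fields are placeholders; only the class names StorageConfig and PipelineConfig are taken from this repository):

# config.py (sketch): register structured configs instead of shipping .yaml files
from dataclasses import dataclass, field

from hydra.core.config_store import ConfigStore

@dataclass
class StorageConfig:
    load_dir: str = ""  # placeholder fields, for illustration only
    save_dir: str = ""

@dataclass
class PipelineConfig:
    storage: StorageConfig = field(default_factory=StorageConfig)

cs = ConfigStore.instance()
cs.store(name="base_pipeline_config", node=PipelineConfig)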

Documentation

  • The main example should be 3 steps: (1) a code file, (2) executing a multirun from the command line, and (3) showing a basic comparison plot.
  • Show a pipeline diagram

Incompatibility between local- and wandb storage providers

When using local storage, files are stored in:

multirun/2021-05-31/09-27-35/0

When using wandb, files are stored in the same directory, but wandb will presumably not pick them up. This should be tested. This is the directory they should then be in:

multirun/2021-05-31/09-27-35/0/wandb/run-20210621_022823-2xb0n6rz/files

..OR does it all work, and is it only the script in msc-thesis that does not pick up the 'new' directories?
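
One possible way to reconcile the two (a sketch, not fseval's implementation): resolve the save directory from wandb.run.dir whenever a wandb run is active, so the local and wandb storage providers both write to the directory that wandb syncs.

import wandb

def resolve_save_dir(default_dir: str) -> str:
    """Return the active wandb run's files directory, or the local default."""
    if wandb.run is not None:
        # e.g. multirun/2021-05-31/09-27-35/0/wandb/run-20210621_022823-2xb0n6rz/files
        return wandb.run.dir
    return default_dir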

Metrics run

  1. Re-run k-NN cohort
  2. Re-run cohort-1
  • normalize feature importances
  • validate feature subset
  • compute feature subset stability

Make OpenML columns configurable

-> Currently, the adapter assumes that only the quantitative columns are relevant. But we might want to apply one-hot encoding to transform discrete columns into quantitative ones (see the sketch below).
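
A minimal sketch of what that could look like with the openml Python package (dataset id 31 / credit-g is just an example; this is not the adapter's actual code):

import openml
import pandas as pd

# fetch a dataset and find out which columns are categorical (discrete)
dataset = openml.datasets.get_dataset(31)  # example: credit-g
X, y, categorical_mask, names = dataset.get_data(
    target=dataset.default_target_attribute, dataset_format="dataframe"
)
discrete_columns = [name for name, is_cat in zip(names, categorical_mask) if is_cat]

# one-hot encode the discrete columns so they become quantitative as well
X_encoded = pd.get_dummies(X, columns=discrete_columns)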

OpenML / wandb always need to be both installed

(fseval-readme-example) ➜  fseval-readme-example python somebenchmark.py --help
Traceback (most recent call last):
  File "/Users/dunnkers/git/fseval-readme-example/somebenchmark.py", line 4, in <module>
    from fseval.adapters import OpenML
  File "/Users/dunnkers/git/fseval/fseval/adapters/__init__.py", line 2, in <module>
    from .wandb import Wandb
  File "/Users/dunnkers/git/fseval/fseval/adapters/wandb.py", line 4, in <module>
    import wandb
ModuleNotFoundError: No module named 'wandb'

Because we are always importing both modules:

from .openml import OpenML
from .wandb import Wandb
__all__ = ["OpenML", "Wandb"]
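
One possible fix (a sketch, not necessarily the route fseval should take): import the adapters lazily through a module-level __getattr__ (PEP 562), so wandb is only imported when the Wandb adapter is actually requested.

# fseval/adapters/__init__.py (sketch)
def __getattr__(name):
    # resolve adapters on first attribute access instead of at import time
    if name == "OpenML":
        from .openml import OpenML
        return OpenML
    if name == "Wandb":
        from .wandb import Wandb
        return Wandb
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

__all__ = ["OpenML", "Wandb"]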
