Comments (11)
I just did :)
from scikit-lego.
Hey @koaning,
I stumbled upon your talk a few days ago and really enjoyed many of your talking points. I was curious about the RBF kernel trick so I decided to implement it in an online learning library some friends and I are working on. From what I understand, the idea is simply to compute the distance between, say, a month and each of the 12 months of the year using an RBF. This way September is closer to August than it is to March, which isn't taken into account if one simply one-hot encodes the month. Is this correct? If you're interested I coded it at the end of this notebook.
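For readers following along, the idea described above can be sketched in a few lines of plain Python. This is only an illustration, not the notebook's code or any library's API: the function name `cyclic_rbf_features`, its parameters, the Gaussian form of the RBF, and the wrap-around distance (so that December and January are neighbours) are all assumptions made for the example.

```python
import math


def cyclic_rbf_features(value, period=12, n_centers=12, width=1.0):
    """Encode a periodic value as RBF similarities to evenly spaced centers.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    features = []
    for k in range(n_centers):
        center = k * period / n_centers
        # Wrap-around distance, so that month 11 and month 0 are 1 apart.
        diff = abs(value - center) % period
        dist = min(diff, period - diff)
        # Gaussian RBF: 1.0 at the center, decaying with distance.
        features.append(math.exp(-(dist ** 2) / (2 * width ** 2)))
    return features


# Months are indexed 0 (January) to 11 (December).
september = cyclic_rbf_features(8)
december = cyclic_rbf_features(11)
```

With this encoding the feature for August (`september[7]`) is larger than the feature for March (`september[2]`), which is the property a one-hot encoding does not have. The `width` parameter controls how quickly similarity falls off between neighbouring months.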
from scikit-lego.
- That creme stuff sounds cool beans. I'll give it a spin. Also: PyData Amsterdam has a CFP open at the moment. I'm still in the committee and that creme library sounds like something we'd love to host.
- The goal here is to make an `sklearn`-compatible transformer that is general. Your example is good but our goal is to be very general; like being able to supply a date column and the number of RBFs you'd like per year. Or a column that you specify that will denote the time window. There's going to be a sprint this Wednesday so I'll keep this thread up to date.
from scikit-lego.
- Sounds great! I've started making some slides (written in English) for an upcoming data science Meetup back here in Toulouse, so maybe I can reuse them.
- Okay good to know, I just wanted to make sure the maths were right. Indeed I think that having a transformer to extract date features would be nice because it could then pipeline into a `RBFTransformer`.
Good stuff!
Edit: if you're going to try creme I suggest you install the latest version from GitHub using `pip install git+https://github.com/creme-ml/creme`, as there is a lot of stuff that isn't on PyPI yet.
from scikit-lego.
Question about creme: for most of the learning that occurs, is that just a small SGD step per datapoint or is there something more happening? sklearn has some passive-aggressive things in its API here, but creme is not doing that atm?
I like the idea of doing a rolling mean on an intercept by the way.
from scikit-lego.
I'm not 100% sure what you mean but here goes: you can provide an optimizer to `LinearRegression` and `LogisticRegression`. The default optimizer for both is called `VanillaSGD` and simply performs textbook online gradient descent. There are many optimizers you can use, such as `PassiveAggressiveI`, `PassiveAggressiveII`, `Adam`, etc. `sklearn`'s `SGDClassifier` and `SGDRegressor` can only use plain gradient descent because they use a special trick for the intercept that isn't generic. Because we use a running statistic to compute the intercept we're "allowed" to use any optimizer we wish.
I hope I'm clear! I'm going to write an explanatory notebook when I get some time!
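Until that notebook exists, here is a rough sketch of the design described above: the weights are updated by a pluggable optimizer, while the intercept is tracked separately as a running mean of the target. This is not creme's actual code. The class and method names below (`VanillaSGD`, `OnlineLinearRegression`, `update`, `fit_one`, `predict_one`) are chosen for illustration, and using the running mean of `y` as the "running statistic" is an assumption.

```python
class VanillaSGD:
    """Textbook online gradient descent (illustrative, not creme's class)."""

    def __init__(self, lr=0.1):
        self.lr = lr

    def update(self, weights, gradient):
        for name, g in gradient.items():
            weights[name] = weights.get(name, 0.0) - self.lr * g
        return weights


class OnlineLinearRegression:
    """Linear model whose intercept is a running mean of the target.

    Because the intercept never goes through the optimizer, any object
    with an `update(weights, gradient)` method can be plugged in.
    """

    def __init__(self, optimizer):
        self.optimizer = optimizer
        self.weights = {}
        self.intercept = 0.0
        self.n = 0

    def predict_one(self, x):
        return self.intercept + sum(self.weights.get(k, 0.0) * v for k, v in x.items())

    def fit_one(self, x, y):
        # Gradient of the squared loss 0.5 * (prediction - y) ** 2 w.r.t. the weights.
        error = self.predict_one(x) - y
        gradient = {k: error * v for k, v in x.items()}
        self.weights = self.optimizer.update(self.weights, gradient)
        # Intercept: incremental running mean of the targets seen so far.
        self.n += 1
        self.intercept += (y - self.intercept) / self.n
        return self


# Toy stream: y = 3 * x + 5, with x alternating between -1 and 1 (zero mean).
model = OnlineLinearRegression(optimizer=VanillaSGD(lr=0.1))
for i in range(2000):
    x_val = -1.0 if i % 2 == 0 else 1.0
    model.fit_one({"x": x_val}, 3 * x_val + 5)
```

Note that a running mean of the target only matches the true intercept when the features are centered, which is why the toy stream uses zero-mean inputs.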
from scikit-lego.
Yep. This is all I wanted to know. Thanks!
Do consider sending that cfp tho: https://pydata.org/amsterdam2019/cfp/
from scikit-lego.
@koaning when are the speakers for PyData Amsterdam announced? I have to book a plane ticket early if I come.
from scikit-lego.
@MaxHalford tomorrow, but you're in! We're looking forward to seeing your talk!
from scikit-lego.
Cheers @MBrouns, I'm really excited! I'll book my ticket ASAP :)
from scikit-lego.
This feature has now been implemented. Documentation will follow.
from scikit-lego.