Comments (1)
Thanks for the compliments on the implementation. I admit that the automatic cluster selection may not be to everyone's taste, but it is a good default for a large number of cases. Since `single_linkage_tree_` and `condensed_tree_` are both exposed as attributes of the model after fitting, those who wish to do something else can do so.
On the other hand, if the question is one of sensitivity to the `min_samples` parameter (rather than `min_cluster_size`), I may have some answers there. I have been working on a different algorithm that essentially operates over all (or potentially just many) `min_samples` values and computes a total stability over the combined `epsilon` and `min_samples` space. This requires some significant rethinking of how to interpret the algorithm, and I've been drawing heavily from persistent homology (and, more accurately, persistent homotopy) theory to get something workable. There are still a number of details to hammer out, and some work remains to ensure the resulting algorithm really does return useful clusterings, but I believe it has significant promise.
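To make the `min_samples` sensitivity concrete, here is a small pure-Python sketch of how the core distance and mutual reachability distance change as `min_samples` varies. It is not the multi-parameter algorithm described above, whose details are not given here. It only illustrates why each `min_samples` value induces a different distance structure. The helper names and the toy points are hypothetical, and the core distance is assumed to exclude the point itself when counting neighbours (conventions differ between implementations).

```python
import math

def core_distances(points, min_samples):
    """Distance from each point to its min_samples-th nearest neighbour.

    Assumption for this sketch: the point itself is not counted as a neighbour.
    """
    result = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        # dists[0] is the distance from p to itself (0.0), so index
        # min_samples is the min_samples-th nearest *other* point.
        result.append(dists[min_samples])
    return result

def mutual_reachability(points, min_samples):
    """Pairwise matrix of max(core(a), core(b), d(a, b))."""
    core = core_distances(points, min_samples)
    n = len(points)
    return [
        [
            0.0 if i == j else max(core[i], core[j], math.dist(points[i], points[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]

# Toy data: three close points and one far-away point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]

# Sweep a few min_samples values and watch the distances change.
for k in (1, 2, 3):
    print(k, core_distances(points, k), mutual_reachability(points, k)[0][1])
```

Larger `min_samples` values inflate the distances around sparse points, which is what makes the resulting hierarchy, and hence the selected clusters, depend on the parameter.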