Comments (6)
Hey @sfd99, thanks for raising the issue, and sorry for the long delay in responding. Although it looks a bit strange once visualised, I think this is the expected behaviour: the histogram binning is carried out by base R's hist() function, and those bins won't necessarily center over the values when there are only a small number of unique values. For example, hist(mtcars$carb, breaks = 20) and hist(mtcars$cyl, breaks = 20) both generate images that are consistent with inspectdf.
However, I do see the problem, and I think there should be a feature that makes it obvious when the histogram doesn't make much sense. I'll have a think and get back to you if there is an update. Thanks again :)
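To see the behaviour described above without drawing anything, the bins that hist() computes can be inspected directly. This is only a small illustrative sketch using the same call as in the comment; plot = FALSE is added so that the bin edges, midpoints and counts are returned instead of plotted.

```r
# Compute the histogram for mtcars$cyl without plotting it
h <- hist(mtcars$cyl, breaks = 20, plot = FALSE)

h$breaks                 # bin edges chosen by hist()
h$mids[h$counts > 0]     # midpoints of the non-empty bins
h$counts[h$counts > 0]   # how many cars fall in each non-empty bin

# The distinct data values, for comparison with the midpoints above
sort(unique(mtcars$cyl))
```

The non-empty midpoints do not coincide with the distinct values 4, 6 and 8, which is why the bars look shifted relative to the axis labels.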
from inspectdf.
Thanks! I think a general solution to this is more complex than it sounds. It's not uncommon to have int columns with very many unique values, where it wouldn't make sense to treat them as categories. Perhaps there could be a simple rule, e.g. if a column is int and has fewer than 10 unique values, count frequencies rather than using buckets. I'm not sure.
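A rough sketch of how such a rule might look in base R is below. The function summarise_column() and its arguments max_unique and breaks are hypothetical names made up for illustration; they are not part of inspectdf. It also assumes "int" can be read as "whole-number values", since columns such as mtcars$cyl are stored as doubles.

```r
# Hypothetical helper, not part of inspectdf.
# max_unique = 10 mirrors the "fewer than 10 unique values" idea above.
summarise_column <- function(x, max_unique = 10, breaks = 20) {
  x <- x[!is.na(x)]
  # Assumption: treat any numeric column of whole numbers as "int"
  is_whole <- is.numeric(x) && all(x == round(x))
  if (is_whole && length(unique(x)) < max_unique) {
    # Few distinct whole-number values: count each value directly
    counts <- table(x)
    data.frame(value = as.numeric(names(counts)),
               count = as.integer(counts))
  } else {
    # Otherwise fall back to histogram buckets
    h <- hist(x, breaks = breaks, plot = FALSE)
    data.frame(value = h$mids, count = h$counts)
  }
}

summarise_column(mtcars$cyl)   # one row per distinct value
summarise_column(mtcars$disp)  # bucketed, one row per bin
```

The threshold is arbitrary, and a whole-number column with many distinct values still goes through the bucketed path, so this only addresses the small-cardinality case.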
Hi Alastair,
Thanks for the response.
Yes, please do let us know if you find a solution to this weird hist quirk.
Looking forward to it!
Great package.
Best,
sfd99
San Francisco
Latest RStudio / R / Ubuntu Linux
inspectdf 0.0.11
I found a starting point for this histogram misalignment problem:
https://stackoverflow.com/questions/41486027/how-to-align-the-bars-of-a-histogram-with-the-x-axis
Take a look!
I found it with this Google query:
how to align histogram bars with the corresponding values in R
One of the solutions quoted there applies when the x-values are integers. It centers each histogram bar directly over its x-axis value:
library(ggplot2)
data <- data.frame(number = c(5, 10, 11, 12, 12, 12, 13, 15, 15))
ggplot(data, aes(x = number)) + geom_histogram(binwidth = 0.5)
But there must be better ways in R...
SFd99
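For comparison, one way to get a similar alignment in base R, which is what the earlier comment says performs the binning, is to pass explicit breaks at half-integers. This is just a sketch on the same toy data, not something taken from inspectdf; the variable name edges is made up here.

```r
x <- c(5, 10, 11, 12, 12, 12, 13, 15, 15)

# Breaks at half-integers, so each unit-wide bin is centred on a whole number
edges <- seq(min(x) - 0.5, max(x) + 0.5, by = 1)
h <- hist(x, breaks = edges, plot = FALSE)

h$mids    # bin centres: the whole numbers from min(x) to max(x)
h$counts  # number of observations at each centre
```

Dropping plot = FALSE draws the histogram with each bar sitting over its value. This only works cleanly when the values are whole numbers, and the number of bins grows with the range of the data.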
And finally, one last possible solution that looks good:
library(ggplot2)
ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()
Via:
https://www.guru99.com/r-bar-chart-histogram.html
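One thing to keep in mind with the factor approach is that the axis becomes categorical, so values that never occur (5 and 7 cylinders here) get no slot at all. The base R sketch below shows the counts both ways; supplying the levels explicitly is an assumption about what the axis should show, not something the linked page suggests.

```r
# One count per distinct value, as the factor() conversion gives
table(factor(mtcars$cyl))

# Supplying every whole number in the range as a level keeps the
# missing values (5 and 7) as zero counts, preserving the spacing
lv <- seq(min(mtcars$cyl), max(mtcars$cyl))
counts <- table(factor(mtcars$cyl, levels = lv))
counts
```

Passing counts to barplot() would then draw one bar per level, including the empty ones.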
Hi Alastair,
Yes, I agree.
The rule in your comment above could be a possible solution to the hist alignment quirk.
If it works in a generic way, inspectdf's show_plot() would be the first to solve it. :-)
Thanks!
SFd99