tidyx's People
Forkers
han-tun meersel katesports gmednick carlospumar-debug alex040604 niklasisaiah sauravg94 dataunirio liston aito123 imarin79 jpzhaoo hammao kumo121 rubinasaf whlying obgeneralao anhnguyendepocen manny-ma lyledanley hrocha sidy2015 abdulrasheedisah pgg1309 hypdoctor ssmoot1 ekt-dar elijahrona brandonlstone ukrcherry krzbanas amanojas jennifer95 lhdjung oyogo tenfnan dar4datascience grandslam77 woodstck jmcoyro jenniferlopes artlesshao yagowin jmromerop
tidyx's Issues
customising a full report
I actually need this myself, and I think it would be a good idea to show how to customise an R Markdown report to follow branding guidelines. This would cover the colour palette, setting headers and footers, section pictures, referencing charts and tables, and so on. I have been researching the topic and still haven't figured it all out.
tidymodels versions
I was inspired by your pitch f/x videos to make several tidymodels versions. If interested, you can take a look here:
https://github.com/datadavidz/pitch_fx
moneyball_part2 is the KNN/UMAP
moneyball_part3 is the Decision Tree/Random Forest
moneyball_part4 is the xgboost
moneyball_part5 is the Naive Bayes
Disclaimer: I am a novice at tidymodels.
I found the results similar/same as what you showed using caret or the native packages. I like the consistency in the tidymodels framework when switching across different models.
-Dave
data cleaning
Data cleaning is super important. We should do a series on data cleaning for some messy data set.
Question on Episode 124
I probably didn't understand the problem completely, but after watching Episode 124 I was curious: why not just make a two-column tibble, then group_by fruit and summarize with toString?
library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 4.2.1
#> Warning: package 'tibble' was built under R version 4.2.1
#> Warning: package 'tidyr' was built under R version 4.2.1
#> Warning: package 'dplyr' was built under R version 4.2.1
lst1 <- list(winter = c("apple", "orange", "banana"),
             summer = c("pear", "kiwi", "grape", "apple"),
             spring = c("cherry", "apple", "grape", "banana"))
lst1 |>
  enframe(name = "season", value = "fruit") |>
  unnest(fruit) |>
  group_by(fruit) |>
  summarize(season = toString(season))
#> # A tibble: 7 × 2
#> fruit season
#> <chr> <chr>
#> 1 apple winter, summer, spring
#> 2 banana winter, spring
#> 3 cherry spring
#> 4 grape summer, spring
#> 5 kiwi summer
#> 6 orange winter
#> 7 pear summer
Created on 2022-11-15 by the reprex package (v2.0.0)
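For comparison, the same idea can be sketched in base R without the tidyverse. This only illustrates the approach asked about in the question, not the method used in the episode; `stack()` and `tapply()` are standard base R functions.

```r
# Same list as in the reprex above
lst1 <- list(winter = c("apple", "orange", "banana"),
             summer = c("pear", "kiwi", "grape", "apple"),
             spring = c("cherry", "apple", "grape", "banana"))

# stack() turns the named list into a two-column data frame:
# `values` holds the fruit, `ind` holds the season (as a factor)
long <- stack(lst1)

# Collapse the seasons for each fruit into one comma-separated string
seasons_by_fruit <- tapply(as.character(long$ind), long$values, toString)

# Back to a plain two-column data frame, one row per fruit
result <- data.frame(fruit = names(seasons_by_fruit),
                     season = as.vector(seasons_by_fruit))
result
```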
Databases
Using databases, accessing, writing, joins, etc.
Feature engineering
A cool segment would be to go over feature engineering approaches and variable selection.
Multi User DB editing
Hi Patrick and Ellis,
Loving the series on using databases. If possible, could you cover a situation where multiple people are simultaneously editing data in the same database?
Using the example from episode 75:
Two different coaches, in different Shiny sessions, happen to be going through and commenting on the same game at the same time. How would the following situation resolve?
(events happen in order)
- Both coaches (on separate machines) open up the shiny app.
- Both coaches are presented with pbp DT in the app that has no comments.
- Coach 1 adds a comment to play 1 and commits it to the db.
- Coach 2 then also wants to give a comment to play 1. Keeping in mind, when this coach pulled the data from the DB, there was no comment for play 1.
- Coach 2 then writes their comment, and presses commit.
In the above situation, I think your code as of episode 75 would mean that any of Coach 1's comments made between Coach 2 pulling the data from the database and Coach 2 pressing their commit button get overwritten by Coach 2's commit.
There are a few different solutions that could work for this specific situation: give each coach a specific column, store comments made by different coaches in different tables, etc.
I'm interested to see if there's a reasonable way, similar to episode 73, to do some sort of reactive polling so that interacting with the database works in a similar vein to multiple people editing a Google Sheet at once. I understand it isn't a realistic use case to see what other users are typing character by character, but at least something to show whether the data in your session of the app is "out of date", as in someone else has edited, commented on, or added to that data.
Thanks again for all your time and effort with the TidyX series!
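One common way to avoid the silent overwrite described above is optimistic locking: each row carries a version counter, and a commit only succeeds if the version is still the one the session originally read. The sketch below is a minimal illustration and not the code from episode 75. The table name `pbp`, its columns, the `commit_comment()` helper, and the example comments are all hypothetical, and an in-memory SQLite database (via the DBI and RSQLite packages) stands in for the real one.

```r
library(DBI)

# In-memory SQLite database standing in for the shared database
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table: one row per play, plus a version counter
dbWriteTable(con, "pbp", data.frame(
  play_id = 1:2,
  comment = NA_character_,
  version = 0L
))

# Write a comment only if the row is unchanged since this session read it.
# Returns TRUE on success, FALSE if someone else committed first.
commit_comment <- function(con, play_id, comment, version_seen) {
  n_updated <- dbExecute(
    con,
    "UPDATE pbp
        SET comment = ?, version = version + 1
      WHERE play_id = ? AND version = ?",
    params = list(comment, play_id, version_seen)
  )
  n_updated == 1
}

# Both coaches load play 1 while it is still at version 0
seen_by_coach1 <- dbGetQuery(con, "SELECT version FROM pbp WHERE play_id = 1")$version
seen_by_coach2 <- seen_by_coach1

# Coach 1 commits first; Coach 2's commit is then based on stale data
coach1_ok <- commit_comment(con, 1, "Good spacing", seen_by_coach1)
coach2_ok <- commit_comment(con, 1, "Late rotation", seen_by_coach2)

final <- dbGetQuery(con, "SELECT comment, version FROM pbp WHERE play_id = 1")
dbDisconnect(con)
```

When a commit is rejected, the app could reload the row and show Coach 2 the newer comment before they try again. The same version column could also serve as the cheap check in a `shiny::reactivePoll()` to flag that a session's data is out of date.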
REF: Episode 132 - How to include AND, OR, NOT ops in the [SEARCH] field of the Output Shiny Form?
Hi Ellis & Pat,
Episode 132: EXCELLENT!!!.
Q:
How can I include AND, OR, NOT operators in a query in the [SEARCH] field (shown at the top of the Output Results Shiny Form)?
I saw that a space between two words in [SEARCH] acts as an AND operator, e.g. [ fran big ]. Great!
But what about a more complex SEARCH query, e.g.:
[ bu OR han ] ## search and show only rows with "bulls" OR with "Hanson"?
SFd99
latest R, Rstudio + Ubuntu Linux 20.04
San Francisco
============
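One possible route is regular expressions, where `|` expresses OR. Assuming the results table is a DT `datatable` (not confirmed here), the DT documentation describes an `options = list(search = list(regex = TRUE))` setting that makes the search box treat its input as a regex, so `bu|han` would behave as an OR. The sketch below shows the same logic in base R on a made-up table; the `teams` data and the `matches()` helper are hypothetical.

```r
# Hypothetical table, loosely modelled on the search terms in the question
teams <- data.frame(
  name = c("Hanson", "Francis", "Bulls", "Bigfoot"),
  city = c("Boston", "Big Falls", "Chicago", "Denver"),
  stringsAsFactors = FALSE
)

# TRUE for each row where any column matches the regular expression
matches <- function(df, pattern) {
  hits <- lapply(df, function(col) {
    grepl(pattern, as.character(col), ignore.case = TRUE)
  })
  Reduce(`|`, hits)
}

# OR: regex alternation
or_hits <- teams[matches(teams, "bu|han"), ]

# AND: both terms must match somewhere in the row
and_hits <- teams[matches(teams, "fran") & matches(teams, "big"), ]

# NOT: keep rows that do not match
not_hits <- teams[!matches(teams, "bu"), ]
```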
Models of p(hit) given throw type
like in episode 45
Real messy data
This is a classic example of what the UK Government's statistics organisation, the Office for National Statistics (ONS), produces. Great data, but in a very "untidy" spreadsheet. For example, look at the weekly figures 2021 sheet. The first column holds both age groups (combined and by gender) and regions. The other column headings are dates.
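A rough sketch of how a sheet with this layout could be tidied with tidyr and dplyr is shown below. The miniature table, its numbers, the column names, and the `regions` lookup are all made up for illustration; the real ONS sheet would first need to be read in and would have many more rows and header lines to deal with.

```r
library(tidyr)
library(dplyr)

# Hypothetical miniature of the layout described above; the numbers are made up
raw <- tibble::tribble(
  ~group,       ~`2021-01-08`, ~`2021-01-15`,
  "Under 1",    1,             2,
  "1-14",       3,             4,
  "North East", 5,             6,
  "London",     7,             8
)

# Assumed lookup of which labels in the first column are regions
regions <- c("North East", "London")

tidy <- raw |>
  # Record what kind of thing the overloaded first column holds
  mutate(group_type = if_else(group %in% regions, "region", "age_group")) |>
  # One row per group per date instead of one column per date
  pivot_longer(
    cols = -c(group, group_type),
    names_to = "date",
    values_to = "value"
  ) |>
  mutate(date = as.Date(date))
```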
line tracing
Shiny data dictionary creator
An idea you might want to tackle sometime is a Shiny app that generates a data dictionary (name, description, data type) from a data file uploaded by the user. Maybe something like the data dictionaries that come with TidyTuesday datasets? As an option, a user would also be able to add a description to the variables in the dictionary, or, if labels were assigned, these would serve as the description field. For factor variables, the different values could also be shown. Such a tool would be very useful for researchers who are not data scientists to document their data and facilitate data re-use via open data repositories. Keep up the good work!
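The core of such an app could be a small function that builds the dictionary from a data frame. The sketch below uses only base R; `make_dictionary()` is a hypothetical helper, and it assumes variable labels are stored in a `"label"` attribute, as several labelling packages do. In a Shiny app, the data frame would come from the uploaded file rather than from `iris`.

```r
# Build a simple data dictionary from a data frame:
# one row per variable with its name, type, label, and factor levels
make_dictionary <- function(df) {
  data.frame(
    name = names(df),
    type = vapply(df, function(col) class(col)[1], character(1)),
    description = vapply(df, function(col) {
      lab <- attr(col, "label", exact = TRUE)
      if (is.null(lab)) "" else as.character(lab)[1]
    }, character(1)),
    values = vapply(df, function(col) {
      if (is.factor(col)) toString(levels(col)) else ""
    }, character(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Example with a built-in dataset and one label added by hand
df <- iris
attr(df$Sepal.Length, "label") <- "Sepal length in cm"
dict <- make_dictionary(df)
dict
```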
produce excel report with formatted fill colors
Hi Ellis and Patrick! I have a question that might be interesting for a future episode :)
It relates to Excel files that we produce on a weekly basis, where cells whose value differs from the previous week are highlighted with Excel's fill format. I am going through the openxlsx documentation but can't seem to find anything that addresses this question. Perhaps you guys can help :)
Cheers and thanks!
Adrian
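One way this might be approached with openxlsx is to compare the two weeks cell by cell and apply a fill style to the positions that differ. The sketch below assumes both weeks have the same columns and the same row order; the two small data frames are made up for illustration.

```r
library(openxlsx)

# Hypothetical reports for two consecutive weeks (same shape, same row order)
last_week <- data.frame(player = c("A", "B", "C"),
                        minutes = c(30, 25, 12),
                        points = c(10, 8, 2))
this_week <- data.frame(player = c("A", "B", "C"),
                        minutes = c(30, 28, 12),
                        points = c(12, 8, 2))

# Compare column by column, then get row/column positions of changed cells
is_changed <- mapply(function(new, old) new != old, this_week, last_week)
changed <- which(is_changed, arr.ind = TRUE)

# Write this week's data to a workbook
wb <- createWorkbook()
addWorksheet(wb, "report")
writeData(wb, "report", this_week)

# Yellow fill for every changed cell
highlight <- createStyle(fgFill = "#FFFF00")
addStyle(wb, "report", highlight,
         rows = changed[, "row"] + 1,  # +1 skips the header row
         cols = changed[, "col"],
         gridExpand = FALSE)           # treat rows/cols as pairs, not a grid

out_file <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out_file, overwrite = TRUE)
```

If rows can be added, removed, or reordered between weeks, the two tables would need to be joined on a key column before comparing.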