tidyx's Issues

customising a full report

I need this myself, and I think it would be a good idea to show how to customise an R Markdown report to follow branding guidelines. This would cover the colour palette, headers and footers, section pictures, referencing charts and tables, and so on. I have been researching the topic and still haven't figured it all out.

tidymodels versions

I was inspired by your pitch f/x videos to make several tidymodels versions. If interested, you can take a look here:
https://github.com/datadavidz/pitch_fx

moneyball_part2 is the KNN/UMAP
moneyball_part3 is the Decision Tree/Random Forest
moneyball_part4 is the xgboost
moneyball_part5 is the Naive Bayes

Disclaimer: I am a novice at tidymodels.

I found the results similar/same as what you showed using caret or the native packages. I like the consistency in the tidymodels framework when switching across different models.

-Dave

data cleaning

Data cleaning is super important. We should do a series on data cleaning for some messy data set.

Question on Episode 124

I probably didn't understand the problem completely, but after watching Episode 124 I was curious: why not just make a two-column tibble, then group_by fruit and summarize with toString?

library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 4.2.1
#> Warning: package 'tibble' was built under R version 4.2.1
#> Warning: package 'tidyr' was built under R version 4.2.1
#> Warning: package 'dplyr' was built under R version 4.2.1

lst1 <- list(winter = c("apple", "orange", "banana"), 
            summer = c("pear", "kiwi", "grape", "apple"),
            spring = c("cherry", "apple", "grape", "banana"))

lst1 |> 
 enframe(name = "season", value = "fruit") |>
 unnest(fruit) |>
 group_by(fruit) |>
 summarize(season = toString(season))
#> # A tibble: 7 × 2
#>   fruit  season                
#>   <chr>  <chr>                 
#> 1 apple  winter, summer, spring
#> 2 banana winter, spring        
#> 3 cherry spring                
#> 4 grape  summer, spring        
#> 5 kiwi   summer                
#> 6 orange winter                
#> 7 pear   summer

Created on 2022-11-15 by the reprex package (v2.0.0)

Databases

Using databases, accessing, writing, joins, etc.

Feature engineering

A cool segment would be to go over feature engineering approaches and variable selection.

Multi User DB editing

Hi Patrick and Ellis,

Loving the series on using databases. If possible, could you cover a situation where multiple people are simultaneously editing data in the same database?

Using the example from episode 75.

Two different coaches, in different Shiny sessions, happen to be going through and commenting on the same game at the same time. How would the following situation resolve?

(events happen in order)

  1. Both coaches (on separate machines) open up the shiny app.
  2. Both coaches are presented with the pbp DT in the app, which has no comments.
  3. Coach 1 adds a comment to play 1 and commits it to the db.
  4. Coach 2 then also wants to give a comment to play 1. Keeping in mind, when this coach pulled the data from the DB, there was no comment for play 1.
  5. Coach 2 then writes their comment, and presses commit.

In the above situation, I think your code as of episode 75 would mean that any of Coach 1's comments made between Coach 2 pulling the data from the database and Coach 2 pressing their commit button get overwritten by Coach 2's commit.

There are a few different solutions that could work for this specific situation: give each coach a specific column, store comments made by different coaches in different tables, etc.

I'm interested to see if there's a reasonable way, similar to episode 73, to do some sort of reactive polling so that working with the database feels more like multiple people editing a Google Sheet at once. I understand it isn't realistic to see what other users are typing character by character, but it would help to at least see whether the data in your session of the app is "out of date", i.e. whether someone else has edited, commented on, or added to that data.

Thanks again for all your time and effort with the TidyX series!
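
One common way to handle the scenario above is optimistic locking: each row carries a version counter, and a commit only succeeds if the version is still the one the session originally read. The following is a minimal sketch, not the code from episode 75. The table name `pbp`, its columns, the helper functions `read_play()` and `commit_comment()`, and the comments themselves are all assumptions made for illustration, and an in-memory SQLite database (via the DBI and RSQLite packages) stands in for the real one.

```r
library(DBI)

# In-memory SQLite database standing in for the real one (assumption).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE pbp (
  play_id INTEGER PRIMARY KEY,
  comment TEXT,
  version INTEGER NOT NULL DEFAULT 0
)")
dbExecute(con, "INSERT INTO pbp (play_id, comment, version) VALUES (1, NULL, 0)")

# Hypothetical helper: each session reads the row together with its version.
read_play <- function(con, play_id) {
  dbGetQuery(
    con,
    "SELECT play_id, comment, version FROM pbp WHERE play_id = ?",
    params = list(play_id)
  )
}

# Hypothetical helper: the UPDATE only matches if nobody has bumped the
# version since this session read the row. dbExecute() returns the number
# of rows affected, so 0 means the session's copy is out of date.
commit_comment <- function(con, play_id, comment, version_seen) {
  n <- dbExecute(
    con,
    "UPDATE pbp SET comment = ?, version = version + 1
     WHERE play_id = ? AND version = ?",
    params = list(comment, play_id, version_seen)
  )
  n == 1
}

# Both coaches open the app and read play 1 before anyone has commented.
coach1 <- read_play(con, 1)
coach2 <- read_play(con, 1)

ok1 <- commit_comment(con, 1, "Coach 1 comment", coach1$version)  # succeeds
ok2 <- commit_comment(con, 1, "Coach 2 comment", coach2$version)  # rejected as stale

final <- read_play(con, 1)
dbDisconnect(con)
```

In a Shiny app, a rejected commit could trigger a message asking the coach to reload the play. The same `version` column could also be what a polling function checks to decide whether the session's data is out of date.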

REF: Episode 132 - How to include AND, OR, NOT ops in the [SEARCH] field of the Output Shiny Form?

Hi Ellis & Pat,

Episode 132: EXCELLENT!!!

Q: How can I include AND, OR, and NOT operators in a query in the [SEARCH] field (shown at the top of the Output Results Shiny Form)?

I saw that a space between two words in [SEARCH] acts as an AND operator, e.g. [ fran big ]. Great!

But what about a more complex SEARCH query, e.g.:
[ bu OR han ] ## search and show only rows with "bulls" OR with "Hanson"?

SFd99
latest R, Rstudio + Ubuntu Linux 20.04
San Francisco
============
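
The search box itself may not support these operators directly, but the same logic can be applied to the underlying data frame before it is displayed. The sketch below uses base R only. The function `search_rows()`, its arguments, and the demo data are made up for illustration and are not taken from Episode 132. If the table is rendered with the DT package, its `search = list(regex = TRUE)` option may be another route to OR-style queries such as `bu|han`, though whether that fits the episode's app is an assumption.

```r
# Hypothetical helper: case-insensitive substring search across all columns.
#   any_of  - keep rows containing at least one of these terms (OR)
#   all_of  - keep rows containing every one of these terms (AND)
#   none_of - drop rows containing any of these terms (NOT)
search_rows <- function(df, any_of = NULL, all_of = NULL, none_of = NULL) {
  # Collapse each row into a single lower-case string
  row_text <- tolower(do.call(paste, c(df, sep = " ")))
  has <- function(term) grepl(tolower(term), row_text, fixed = TRUE)

  keep <- rep(TRUE, nrow(df))
  if (length(any_of) > 0) {
    keep <- keep & Reduce(`|`, lapply(any_of, has))
  }
  for (term in all_of) keep <- keep & has(term)
  for (term in none_of) keep <- keep & !has(term)

  df[keep, , drop = FALSE]
}

# Made-up rows, for illustration only
demo <- data.frame(
  team   = c("Bulls", "Giants", "Bulls", "Lakers"),
  player = c("Big Fran", "Hanson", "Hanson", "Smith"),
  stringsAsFactors = FALSE
)

or_hits  <- search_rows(demo, any_of = c("bu", "han"))    # "bu" OR "han"
and_hits <- search_rows(demo, all_of = c("fran", "big"))  # "fran" AND "big"
not_hits <- search_rows(demo, none_of = "bu")             # NOT "bu"
```

In a Shiny app, the three arguments could be filled from separate text inputs, or parsed out of a single query string.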

Real messy data

publishedweek2420212.xlsx

This is a classic example of what the UK Government's statistics organisation, the Office for National Statistics (ONS), produces: great data, but in a very "untidy" spreadsheet. For example, look at the "weekly figures 2021" sheet. The first column holds both age groups (combined and by gender) and regions, and the other column headings are dates.
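
As a rough sketch of the kind of reshaping this layout calls for, the example below builds a tiny stand-in for the sheet and tidies it with dplyr and tidyr. The rows, dates, numbers, and column names are all invented for illustration and do not come from the ONS file. The real workbook would also need header rows skipped and the gender blocks separated first.

```r
library(dplyr)
library(tidyr)

# Made-up miniature of the layout: the first column mixes age groups and
# regions, and every other column heading is a date.
raw <- tibble::tribble(
  ~category,    ~`2021-01-08`, ~`2021-01-15`,
  "Under 1",     1,             2,
  "1-14",        3,             4,
  "North East", 10,            12,
  "London",     20,            25
)

# Assumed lookup of which labels are regions; anything else is an age group.
regions <- c("North East", "London")

tidy <- raw |>
  mutate(category_type = if_else(category %in% regions, "region", "age_group")) |>
  pivot_longer(
    cols = -c(category, category_type),
    names_to = "week_ending",
    values_to = "value"
  ) |>
  mutate(week_ending = as.Date(week_ending))
```

The result has one row per category and week, with the mixed first column split into a label and a type.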

Shiny data dictionary creator

An idea you might want to tackle sometime is a Shiny app that generates a data dictionary (name, description, data type) from a data file uploaded by the user. Maybe something like the data dictionaries that come with TidyTuesday datasets? As an option, a user would also be able to add a description to the variables in the dictionary, or, if labels were assigned, these would serve as the description field. For factor variables, the different values could also be shown. Such a tool would be very useful for researchers who are not data scientists to document their data and so facilitate data re-use via open data repositories. Keep up the good work!
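
The core of such an app could be a single function that turns a data frame into a dictionary table, which Shiny would then display and let the user edit. The sketch below uses base R only. The function name `make_data_dictionary()`, the output column names, and the demo data are assumptions for illustration. It reads a `"label"` attribute on a column if one exists, which is the convention some labelling packages follow.

```r
# Hypothetical helper: one dictionary row per variable in the data frame.
make_data_dictionary <- function(df) {
  data.frame(
    name = names(df),
    # First class of each column, e.g. "numeric", "factor", "character"
    type = vapply(df, function(x) class(x)[1], character(1)),
    # Use the "label" attribute as the description when one is set
    description = vapply(df, function(x) {
      lbl <- attr(x, "label", exact = TRUE)
      if (is.null(lbl)) "" else as.character(lbl)[1]
    }, character(1)),
    # List the levels of factor variables
    values = vapply(df, function(x) {
      if (is.factor(x)) paste(levels(x), collapse = ", ") else ""
    }, character(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Made-up example data
demo <- data.frame(
  height = c(180, 175, 190),
  position = factor(c("guard", "forward", "guard"))
)
attr(demo$height, "label") <- "Height in cm"

dict <- make_data_dictionary(demo)
```

In the app, `df` would come from the uploaded file, and the `description` column could be made editable before the dictionary is downloaded.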

produce excel report with formatted fill colors

Hi Ellis and Patrick! I have a question, that might be interesting for a future episode? :)

This relates to Excel files that we produce on a weekly basis, where cells whose value differs from the previous week are highlighted with Excel's fill format. I have been going through the openxlsx documentation but cannot find anything that specifically addresses this. Perhaps you guys can help :)

Cheers and thanks!
Adrian
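
One possible approach is to compare this week's data frame with last week's in R, then apply a fill style only to the cells that changed. The sketch below assumes both weeks have the same rows and columns in the same order. The data and the sheet name are made up for illustration. It uses `createStyle()`, `addStyle()`, and `saveWorkbook()` from openxlsx.

```r
library(openxlsx)

# Made-up data for two consecutive weeks (same shape, same row order)
previous <- data.frame(
  player  = c("A", "B", "C"),
  minutes = c(30, 25, 18),
  rating  = c(7.1, 6.4, 8.0)
)
current <- data.frame(
  player  = c("A", "B", "C"),
  minutes = c(30, 28, 18),
  rating  = c(7.1, 6.4, 7.5)
)

# Row/column positions of cells whose value differs from last week.
# Comparisons involving NA yield NA and are ignored by which().
changed <- which(current != previous, arr.ind = TRUE)

wb <- createWorkbook()
addWorksheet(wb, "report")
writeData(wb, "report", current)

# Yellow fill for changed cells; +1 on rows because row 1 holds the header.
highlight <- createStyle(fgFill = "#FFFF00")
addStyle(
  wb, "report", highlight,
  rows = changed[, "row"] + 1,
  cols = changed[, "col"],
  gridExpand = FALSE,  # treat rows/cols as pairs, not as a grid
  stack = TRUE         # keep any existing cell styling
)

out_file <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, out_file, overwrite = TRUE)
```

If the rows can change order between weeks, the two data frames would need to be joined on a key before comparing.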
