sussmanbu / final-project-group-10

This project forked from sussmanbu/ma415_final_project

JavaScript 11.89% R 0.54% CSS 0.01% HTML 87.57%

final-project-group-10's Introduction

ma4615-final-project-quarto

Video for Group 10

final-project-group-10's Issues

Final Project Feedback and Grade

Total
125/125 (100%)

Data Page

Describe where/how to find data.
You must include a link to the original data source(s). From what you can tell, why was the data collected/curated? Who collected the data?

Evaluation: E

Describe the different data files used and what each variable means.
If you have many variables then only describe the most relevant ones and summarize the rest. Bulleted lists or tables are recommended.

Evaluation: E

Describe any cleaning you had to do for your data.
You must include a link to your load_and_clean_data.R file.
Also, describe any additional R packages you used outside of those covered in class.
Describe how you combined multiple data files and any cleaning that was necessary for that.
Some repetition of what you do in your load_and_clean_data.R file is fine and encouraged if it helps explain what you did.

Evaluation: M

How much data was removed due to NAs?
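
One way to answer this question is to count rows before and after dropping missing values. The sketch below is written in Python for illustration only (the project itself uses R); the field names and sample rows are hypothetical and are not taken from the project's data.

```python
def na_report(rows, required):
    """Summarize how many rows would be dropped for missing required fields."""
    total = len(rows)
    # A row is kept only if every required field is present and not None.
    kept = sum(1 for r in rows if all(r.get(k) is not None for k in required))
    dropped = total - kept
    share = dropped / total if total else 0.0
    return {"total": total, "kept": kept, "dropped": dropped, "share_dropped": share}


# Hypothetical district rows; None stands in for R's NA.
rows = [
    {"district": "A", "mean_score": 0.42, "poverty_rate": 0.08},
    {"district": "B", "mean_score": None, "poverty_rate": 0.21},
    {"district": "C", "mean_score": -0.35, "poverty_rate": None},
    {"district": "D", "mean_score": -0.10, "poverty_rate": 0.17},
]
report = na_report(rows, required=["mean_score", "poverty_rate"])
print(report)
```

Reporting the dropped share alongside the cleaning description lets a reader judge whether the removed rows could bias the results.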

Organization, clarity, cleanliness of the page
Make sure to remove excessive warnings, use clean easy-to-read code (without side scrolling), organize with sections, use bullets and other tools, etc.

Evaluation: E

Analysis Page(s)

Introduce what motivates your Data Analysis (DA)
Which variables and relationships are you most interested in?
What questions are you interested in answering?
Provide context for the rest of the page. This will include figures/tables that illustrate aspects of the data of your question.

Evaluation: E

Modeling and Inference
The page will include some kind of formal statistical model. This could be a linear regression, logistic regression, or another modeling framework.
Explain the ideas and techniques you used to choose the predictors for your model. (Think about including interaction terms and other transformations of your variables.)
Describe the results of your modelling and make sure to give a sense of the uncertainty in your estimates and conclusions.

Evaluation: M

Did the weighting make much of a difference? How could you evaluate that?

Good but I'd like to see more concise model descriptions and some comparisons between MA/MS.
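
One possible way to evaluate whether weighting matters is to fit the same model with and without weights and compare the coefficients. The sketch below does this for a single predictor in plain Python (the project itself uses R); the data values, the choice of enrollment as the weight, and the helper name `fit_line` are all assumptions made for illustration.

```python
def fit_line(x, y, w=None):
    """Fit y = a + b*x by (weighted) least squares and return (a, b)."""
    if w is None:
        w = [1.0] * len(x)  # equal weights reduce to ordinary least squares
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical districts: poverty rate, standardized mean score, enrollment.
poverty = [0.05, 0.10, 0.15, 0.20, 0.30]
score = [0.60, 0.35, 0.20, -0.10, -0.45]
enrollment = [12000, 800, 4500, 300, 9000]

ols_a, ols_b = fit_line(poverty, score)
wls_a, wls_b = fit_line(poverty, score, enrollment)
print(f"OLS slope: {ols_b:.3f}, WLS slope: {wls_b:.3f}")
print(f"Slope shift from weighting: {wls_b - ols_b:.3f}")
```

If the shift is small relative to the uncertainty in the estimates, the weighting probably made little practical difference; a larger shift would suggest that a few heavily weighted districts drive the fit.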

Explain the flaws and limitations of your analysis
Are there some assumptions that you needed to make that might not hold? Is there other data that would help to answer your questions?

Evaluation: E

Clarity Figures
Are your figures/tables/results easy to read, informative, without problems like overplotting, hard-to-read labels, etc?
Each figure should provide a key insight. Too many figures or other data summaries can detract from this. (While not a hard limit, around 5 total figures is probably a good target.)
Default lm output and plots are typically not acceptable.

Evaluation: M

Don't show me things that you don't explain and/or are not relevant.

Clarity of Explanations
How well do you explain each figure/result?
Do you provide interpretations that suggest further analysis or explanations for observed phenomena?

Evaluation: E

Organization and cleanliness.
Make sure to remove excessive warnings, use clean easy-to-read code, organize with sections or multiple pages, use bullets, etc.
This page should be self-contained.

Evaluation: M

Too much overall.

Repeated paragraph.

Big Picture Page

Clarity of Explanation
You should have a clear thesis/goal for this page. What are you trying to show? Provide details that support your thesis but don't go into too much mathematics or statistics. The audience for this page is the general public (to the extent possible).

Evaluation: M

Overall good but it would be useful to see more quantitative descriptions of your findings.

Quality of Figures
Each figure should be very polished and also not too complicated. There should be a clear interpretation of each figure and each figure should have a clear purpose.

Evaluation: R

Are those figures the best way to make the comparisons between MA and MS?

Generally, I'd say these figures are too complicated for the Big Picture page.

Unpolished tables.

Creativity
Do your best to make things interesting. Think of a story. Think of how each part of your analysis supports the previous part or provides a different perspective.

Evaluation: M

OK, but you took the analogizing too far. Also, not much of a flow.

Video

Video Recording
Make a video recording (probably using Zoom) providing a quick explanation of your data and demonstrate some of the conclusions from your EDA.
This video should be no longer than 4 minutes.
Include a link to your video (and password if needed) in your README.md file on your Github repository. You are not required to provide a link on the website.
This can be presented by any subset of the team members.

Evaluation: E

Rest of the Site

General organization and cleanliness of website
The main title of your page is informative.
Each post has an author/description/informative title.
All lab required posts are present.
Each page (including the home page) has a nice featured image associated with it.
Your about page is up to date and clean.
You have removed the generic posts from the initial site template.

Evaluation: E

Project Sharing Feedback

Feedback from presentation by Aruzhan Bektemirova

1. Describe the main idea of the project

what variables in different states affect academic scores

The main topic is educational background, using data on standardized tests (across different poverty levels).

Education level by state of Massachusetts and Mississippi.

Mississippi has lower standardized test scores, due to a weaker economy and a higher unemployment rate.

The difference in educational background among families due to the poverty rate, etc.

2. What was the best part of the team's work?

The graphics were very clean and easy to understand. Lots of exploring their variables and seeing what works for them in a very organized way. Graphics were very informative

I like the three variables they chose to merge into the first dataset (parents' income, etc.). They have really comprehensive data covering 2008 to 2019.

The different models were insightful in describing the effect of education on income in each state. The different graphs explored how different mathematical transformations could be used to compare each state.

I think the models the team used were very impressive, they used weighted least squares and robust regression to predict MA scores

This team uses a lot of graphs and charts, as well as a robust regression model, which together strongly support their thesis statement.

3. How would you suggest improving the team's work?

Everything seemed very clean and organized, maybe more written analysis (but also mainly focused on the graphics so maybe they had written analyses)

They chose two states randomly to compare the data and the standardized test scores. I wonder if there is some bias. Their residual plot looks abnormal.

I would suggest comparing states across different regions (i.e. New England vs West Coast states). Exploring more states may reveal patterns across regions.

They used the models on the state of Massachusetts, but I believe trying with more states would give a better view

Maybe include more detail, discover more relationships, and find more patterns related to education in specific areas.

Feedback from presentation by Chase Stephens

1. Describe the main idea of the project

They try to measure the education disparity of Mass vs Mississippi.

the education level of Massachusetts versus Mississippi

Studying education disparity in the US - focused on MA and MS. These disparities seem to be related to economic status.

education disparity between Massachusetts and Mississippi.

seeing what factors affect academic achievement k-12 such as demographics and economic factors

Education disparity in the US. Looking specifically at Massachusetts and Mississippi.

2. What was the best part of the team's work?

I like how they focus on two states. Many predictor variables. I also like how they structure their big picture page - graphs are intuitive to understand.

On the data page, they compare the education factors of Massachusetts and Mississippi in 9 different fields, which is very intuitive.

Their figures are very nice and look well put together. The project site overall looks nicely polished.

had a clear and detailed goal that they wanted to accomplish with the data.

good exploratory data analysis for factors that might affect the researched metric and also cleaning

Their Big Picture page looks good so far, featuring the distributions of the important features. Good consistency when examining both states.

3. How would you suggest improving the team's work?

Explanation should be more technical, like the process, cleaned data. Explain the interrelationships between factors more sufficiently.

They could work more on the analysis part; a prediction model might be developed, and it might be possible to connect it with state factors other than education.

I think their project looks well developed and it seems like they just need to add some finishing touches. I don't have any suggestions.

Label graphs, and maybe make them easier to see (larger). There were only two chunks of graphs to visualize the data, it was difficult to discern what they meant, and there was no discussion of what they meant.

finding transformations and a model that fits the data to see how the different factors are actually affecting academic achievement

They are missing models for their data, which is still a work in progress. Mentioned that they are going back and forth between what models to use.

4. Do you have any other comments or ideas?

n/a

Feedback from presentation by Morgan Fleming

1. Describe the main idea of the project

different factors that affect test scores across two states (Massachusetts and Mississippi)

Looking at what different factors influence test scores using 2 different states: MA and MS.

They collected data on the test scores of students at the district level. Their main research question involves factors influencing test scores. They are looking at Massachusetts and Mississippi, where the former has high test scores and the latter has lower test scores.

Group 10 studies what factors affect test scores in Massachusetts and Mississippi. They use standardized test scores in districts to compare the test scores between states. They look at poverty rate, unemployment rate, parents' education level, and percentages of students from each race. They use the weighted regression model.

2. What was the best part of the team's work?

interesting research, and drew meaningful conclusions. the charts were also a good way to visualize findings.

The team's work seems organized; they implemented a regression model to determine coefficients for their project.

Firstly, they seem to have a strong dataset. They have data from a Stanford dataset, with standardized test scores (3.5 to -3.5). This is the best way to directly compare test score data between states. They're also researching some great variables such as percentage of economically disadvantaged, unemployment data, and percent of racial groups. They also used a great weighted regression model as it had the best R squared and lowest mean squared error.

The graphs of the regression results very clearly show the pattern of each factor affecting test scores.

3. How would you suggest improving the team's work?

I think it'd be interesting if you could assess whether the highest test scores in MA are from a few certain schools or spread across the majority of the schools, and the same for Mississippi.

I think one possible improvement could involve creating more visualizations to demonstrate different parts of their project.

They definitely need better data visualizations. Their graphs aren't as easy to follow and don't pop out and grab the attention of the viewer. Moreover, their analysis section is incomplete. They should also definitely include tables about assessing usefulness of models to make their overall project more complete.

This team already has a good statistical model and the direction of analyzing the results is clear enough, so maybe they want to make their page prettier.

Feedback from presentation by Yixiao Li

1. Describe the main idea of the project

focus on MA most educated state, predict educational outcome based on economic level and racial level

Focus on Massachusetts and Mississippi where MA is the most educated state and Mississippi is the least educated state. The study is based on economic level and racial level

This group focuses on educational outcomes based on economic level and racial level across different states.

focus on the education outcome based on economic level and racial level in the states MA and MS

MA and Mississippi: MA is the most educated state and MS the least educated. Analysis based on economic and racial level (advantage and disadvantage).

2. What was the best part of the team's work?

Robust regression model and weighted regression model to resolve the limitation of data constraints.

Visualization of the dataset that clearly shows the distribution of the data. Good model to fit the dataset.

Their group visualized their data by group; some of the distributions are normal and some are not. Their model choice was a robust model and a weighted regression model. I think this is the best part of the team's work because I think it is the most effective model.

the robust model really did a good job; it clearly shows the effects of the different race and economic variables on the scores.

The structure of the project is very clear, which is based on 3 factors: 1) economic factor, 2) educational background, and 3) racial demographics. The result is clear, economic lower, mean score low. racial: African, score low.

3. How would you suggest improving the team's work?

They only focus on two states, so the result might be limited to only these two states. And their statistical model could use more advanced transformations.

Expand to more states to compare the data with; find more models to fit some of the skewed data; use more advanced algebra to process the data.

I think they can add more datasets to avoid OVB (omitted variable bias), and at the same time they can also choose to add some interaction terms to make sure their prediction is accurate.

Maybe it could be improved with more variables related to education. Besides that, you could use more advanced models for a better fit.

You could try to incorporate more datasets, and try to build more models to figure out if there is any other trend.

Feedback from presentation by Taelor Anderson

1. Describe the main idea of the project

2008-2018 data from Stanford education, compared the education disparity between MA and Mississippi

Education disparity across the United States - test scores against socioeconomic factors like race and income

The main idea of the project is to find disparity in education across the US and look specifically at Massachusetts and Mississippi.

Education level in the USA and the difference between Massachusetts and Mississippi.

2. What was the best part of the team's work?

organized all data and figures very well & have limitations presented & I think the topic is really worth analyzing & have robust regression, which is good

They have a really excellent data set to start off, and the way they approach analysis is very thorough. Their choice of case studies for the states is very effective instead of trying to focus on all the states at once.

Great visualizations (they were very clean and pleasing to look at)
Loved how everything was organized

There are lots of diagrams and it's very clear. The data is sufficient, and the logic of the speech is clear. It intuitively reflects the different levels of education in the United States. Especially the comparison of education between Massachusetts and Mississippi is very intuitive.

3. How would you suggest improving the team's work?

maybe just focus on the several variables that are most related; go deeper on 1-2 variables for the regression analysis

I would suggest narrowing in a bit more in the analysis. Some of the graphs or representations of the regression have a lot going on, so in the explanations I'd suggest focusing in.

I think the team could improve on performing different models and finding more in depth analysis on why certain variables are important. I think they could consider making predictions and seeing how those models perform.

It would be better to add some more analysis and linear models. If possible, please add some explanations about distribution or new models. This makes the analysis more specific. Easier to understand and demonstrate.

4. Do you have any other comments or ideas?

good overall

Looks great!!

Blog Post 1: Data Proposal Feedback

DS1 seems great.

Data set 1

There would definitely be a lot to explore and unpack here.

Data set 2

Could be interesting but other groups are looking at very similar data.

Data set 3

We'll be talking more about the Census later. This might be an interesting data set to combine with another.

Data page feedback

Data Page

Describe where/how to find data.
You must include a link to the original data source(s). From what you can tell, why was the data collected/curated? Who collected the data?

Evaluation: R

Describe the different data files used and what each variable means.
If you have many variables then only describe the most relevant ones and summarize the rest. Bulleted lists or tables are recommended.

Evaluation: N

Describe any cleaning you had to do for your data.
You must include a link to your load_and_clean_data.R file.
Also, describe any additional R packages you used outside of those covered in class.
Describe how you combined multiple data files and any cleaning that was necessary for that.
Some repetition of what you do in your load_and_clean_data.R file is fine and encouraged if it helps explain what you did.

Evaluation: N

Organization, clarity, cleanliness of the page
Make sure to remove excessive warnings, use clean easy-to-read code (without side scrolling), organize with sections, use bullets and other tools, etc.

Evaluation: N
