
final-project-team6's People


yokios622, saisriram2003, mosherosenstock, wanqingsz, dpmcsuss

final-project-team6's Issues

Project Sharing Feedback

Feedback from presentation by Yuqi Peng

1. Describe the main idea of the project

They are analyzing shooting data and payroll information to find correlations between income and violence rates.

Intersection of race, poverty, and policing in NYC, using shooting cases.

How the income level of different areas influences shooting cases.

Find the intersection of police discrimination cases in NYC by merging three datasets.

This project analyses the occurrences of shootings in New York City and evaluates the relationship between the number of occurrences, the race of the victim, and the income of the neighborhood where the shooting happened. They are also looking at policing in each neighborhood.

2. What was the best part of the team's work?

The map was really informative and a nice visual to include. Their other graphs were also informative and well done.

Brings in a lot of different datasets, with detailed information on each shooting case as well as demographic, income, and criminal data.

Created great visualizations, used different regions as examples, and built a great map graph.

They built many map graphs that successfully show NYC's different racial and wealth groups alongside shooting cases.

The maps are very clear and well presented. I think that they do a good job of showing the relationship between multiple variables using colors and heatmaps.

3. How would you suggest improving the team's work?

The team should explain their results clearly, since they seem to be working with a lot of variables.

I think the team has done wonderful work, especially with the mapping data. Perhaps more statistical models could be used to further support their conclusions.

Can further polish the visualizations and give a better interpretation of the multilinear regression model.

I am hoping to see a visualization of the model, as I believe it will be a strong and significant model for showing the racial disparity in NYC suggested by their map graph.

I think that visualizations besides maps might be useful to know more numerical information. I think that the team has a good idea of next steps, including evaluating the model that they've built.

4. Do you have any other comments or ideas?

Can interpret other variables in the model.


Feedback from presentation by Saisriram Gunturu

1. Describe the main idea of the project

Arrests in NYC: areas with lower income and a larger Black population will have more shootings.

Intersection of income and poverty in relation to crime rates.

Big picture: the intersection of poverty and crime rate, studying crime and shootings in New York City. Thesis: areas with higher income have fewer crimes and shootings.

In lower-income neighborhoods across New York City, crime rates tend to be higher compared to more affluent areas. This trend is often attributed to various socio-economic factors. The study concluded that Black people are likely to be the victims of crimes and that the most common crime committed is assault.

They're trying to find racial disparities in relation to the crime rate.

2. What was the best part of the team's work?

Fun and different graphics (the map was very informative). Very detailed analyses describing the graphics.

There was great analysis of the pattern of crime rates in New York City. There were clear correlations between income regions and crime rates.

They discovered that lower-income areas have higher crime rates. They mapped out the areas where crimes happen, and their data is good enough that it doesn't really require any cleaning.

The group included a lot of graphs and visual plots, maps, etc., which was incredibly helpful when understanding the research.

They used a good model like linear regression to visualize the data, and their bigger picture is clear to the audience.

3. How would you suggest improving the team's work?

The team looks like they are on track and have put a lot of effort into their visualizations and analyses, but maybe add a graphic for the linear regressions.

I would suggest including more analysis to support the strong thesis on the relation between income level and crime levels

Team 6 chose to merge income into the first dataset. They mainly focused on New York City and concluded that income level has a huge impact on crime rate. I feel there could be more to say, and they could try repeating the steps for another city or state.

Implement measures to ensure the quality and integrity of the data collected. This may involve data validation procedures, data cleaning techniques, and addressing missing or erroneous data.

They could explore their dataset further and try to combine different datasets, which might lead to different results.

4. Do you have any other comments or ideas?

The project looks in great shape, including lots of information and analysis. The group did a great job comparing crime levels in different regions of New York City and demonstrating which racial groups are more likely to commit different types of crimes as well as their socio-economic background.

Feedback from presentation by Shiyu Wang

1. Describe the main idea of the project

The project focuses on analyzing shooting incidents in NYC, combined with perpetrator and victim racial information and other incident-related information.

Exploration of the shooting rate in New York City. Predict which areas are likely to have a higher shooting rate.

NYC shooting crime rate. Mainly focuses on race, age, sex, and crime description to predict which area of NYC a shooting will happen in.

This project explores the relationship between shootings and demographic factors.

They investigate NYC shooting cases. They predict where the shooter will be based on witness testimony.

2. What was the best part of the team's work?

We used the census map to predict which borough (BORO) a shooting incident will happen in, which can help the NYPD allocate its resources.

Uses different income levels to separate the districts and regression models to find the relationship between dependent and independent variables. Also shows the income level on the map, using red dots to show the possibility of a shooting.

The best part is the map. It distinguishes areas by salary and marks where the shootings happened. Shooting crime is high in low-salary areas. The crime rate was high for the Black racial group.

Their maps are really good. They show not only the distribution of shootings across boroughs but also the distribution of income, so that we can see some relationship between income and crime.

Their best part is also the map. They divided the different parts of NYC by salary and race. People can clearly see whether there is a strong correlation between the frequency of shooting cases and income level.

3. How would you suggest improving the team's work?

Since all the variables are categorical, you can use variable transformations to improve the model's predictive power.

Improve the distribution of the independent variables and try to improve the R-squared of the regression model.

Could add more variables, and adjust the model's performance by increasing the R squared. To add more variables, you could try to incorporate other datasets.

Maybe you could add more numerical variables to the dataset so that the model can show more relationships.

Their model selection is problematic because its R-squared value is low. I recommend changing the model, for example to a polynomial or log-log regression.
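
As a rough sketch of the transformation suggestions above (plain Python, made-up numbers, not the team's data or model): when the true relationship follows a power law, fitting a straight line to log(x) and log(y) can give a higher R-squared than fitting the raw values. The helper r_squared is hypothetical and only handles a single numeric predictor.

```python
import math

def r_squared(xs, ys):
    """R^2 of a simple least-squares line fitted to the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up data following y = 3 * x^2, a power law rather than a straight line.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3 * x ** 2 for x in xs]

# Fit on the raw scale, then on the log-log scale.
r2_raw = r_squared(xs, ys)
r2_loglog = r_squared([math.log(x) for x in xs], [math.log(y) for y in ys])
print(r2_raw, r2_loglog)
```

Note that the R-squared on the log scale measures the fit to log(y), so the two values are not strictly comparable, and a log transform only applies to positive numeric variables, not to categorical ones.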

Feedback from presentation by Wanqing Zhao

1. Describe the main idea of the project

Study patterns of crime in particular parts of NYC, find where shooting incidents are most likely to happen.

Patterns of crime in NYC, studied by looking at shooting cases.

Trying to predict how likely a shooting is at a given location in NYC.

The pattern of crime in New York City, with a focus on shooting incidents.

Crime in New York City with focus on shooting incidents in different boroughs

2. What was the best part of the team's work?

They have an important dataset on an important topic: analyzing crime patterns and determining the probability of a shooting happening.

The regression line and graph they are going to use are very good, as is their idea of looking at how dangerous the most popular city in the world is.

Using a lot of independent variables that seem to be very relevant for the prediction. I think looking at a borough level makes sense.

They find out which kinds of places have the most shooting incidents, such as parking lots, restaurants, or anywhere else a shooting could occur. This is very useful to know because people could be more careful when they enter these places. The analysis might reduce the number of shooting incidents in New York.

The analysis is very thorough and they came up with a clear outcome that the location of shootings matches the low income areas.

3. How would you suggest improving the team's work?

The group is still developing the model. Creating a multinomial logistic regression would be useful for the group.

Maybe this topic can be extended to the national level to discuss other cities that have higher crime rates.

Having more concrete results to share is obviously important. It was slightly unclear why they are using a multinomial logit over other kinds of models (not sure if the dependent variable should actually be binary).

I think their project is perfect for now. Maybe they could compare shooting incidents with other crimes and do some analysis on it.

Although the finding is very clear, it would be brilliant if they could embed the results onto a map to make a shooting crime map.

Final Project Feedback and Grade

105/125 (84%)

Data Page

  • Describe where/how to find data.
    You must include a link to the original data source(s). From what you can tell, why was the data collected/curated? Who collected the data?

Evaluation: M

Description of arrest data talks about shootings but shouldn't it be arrests?

  • Describe the different data files used and what each variable means.
    If you have many variables then only describe the most relevant ones and summarize the rest. Bulleted lists or tables are recommended.

Evaluation: R

Very little detail on many of the variables.

  • Describe any cleaning you had to do for your data.
    You must include a link to your load_and_clean_data.R file.
    Also, describe any additional R packages you used outside of those covered in class.
    Describe how you combined multiple data files and any cleaning that was necessary for that.
    Some repetition of what you do in your load_and_clean_data.R file is fine and encouraged if it helps explain what you did.

Evaluation: M

Merging (why not _join :-( ) by lat and long means you'll only match rows whose locations are exactly the same. That doesn't seem like what you'd want. I think instead you'd just want to bind_rows on the appropriate columns.
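
To illustrate the point about exact matches, here is a minimal sketch in plain Python (not the team's R code). The records, column names, and helper functions are invented for illustration; in dplyr the roughly analogous calls would be inner_join and bind_rows. An exact join on lat/long keeps only rows whose coordinates are identical, while stacking rows keeps every record.

```python
# Toy records; the coordinates and column names are made up for illustration.
shootings = [
    {"lat": 40.7128, "lon": -74.0060, "event": "shooting"},
    {"lat": 40.6782, "lon": -73.9442, "event": "shooting"},
]
arrests = [
    {"lat": 40.7128, "lon": -74.0060, "event": "arrest"},  # identical coordinates
    {"lat": 40.6783, "lon": -73.9441, "event": "arrest"},  # nearby, but not identical
]

def join_on_location(left, right):
    """Inner join on exact (lat, lon): keeps only pairs with identical coordinates."""
    index = {}
    for row in right:
        index.setdefault((row["lat"], row["lon"]), []).append(row)
    joined = []
    for row in left:
        for match in index.get((row["lat"], row["lon"]), []):
            joined.append({
                "lat": row["lat"],
                "lon": row["lon"],
                "left_event": row["event"],
                "right_event": match["event"],
            })
    return joined

def stack_rows(*tables, columns):
    """Stack tables on their shared columns, keeping every row from every table."""
    return [{c: row[c] for c in columns} for table in tables for row in table]

joined = join_on_location(shootings, arrests)
stacked = stack_rows(shootings, arrests, columns=["lat", "lon", "event"])
print(len(joined), len(stacked))  # prints: 1 4
```

The join silently drops the nearby arrest because its coordinates differ in the fourth decimal place, whereas the stacked table keeps all four records and can later be grouped by borough or another shared key.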

  • Organization, clarity, cleanliness of the page
    Make sure to remove excessive warnings, use clean easy-to-read code (without side scrolling), organize with sections, use bullets and other tools, etc.

Evaluation: E

Analysis Page(s)

  • Introduce what motivates your Data Analysis (DA)
    Which variables and relationships are you most interested in?
    What questions are you interested in answering?
    Provide context for the rest of the page. This will include figures/tables that illustrate aspects of the data of your question.

Evaluation: E

  • Modeling and Inference
    The page will include some kind of formal statistical model. This could be a linear regression, logistic regression, or another modeling framework.
    Explain the ideas and techniques you used to choose the predictors for your model. (Think about including interaction terms and other transformations of your variables.)
    Describe the results of your modelling and make sure to give a sense of the uncertainty in your estimates and conclusions.

Evaluation: R

Why do we care about predicting the borough?

No discussion of how you chose which predictors to include in your model.

No interpretation of what factors are important in your model.

  • Explain the flaws and limitations of your analysis
    Are there some assumptions that you needed to make that might not hold? Is there other data that would help to answer your questions?

Evaluation: M

OK, but limited discussion about model assumptions.

  • Clarity of Figures
    Are your figures/tables/results easy to read, informative, without problems like overplotting, hard-to-read labels, etc?
    Each figure should provide a key insight. Too many figures or other data summaries can detract from this. (While not a hard limit, around 5 total figures is probably a good target.)
    Default lm output and plots are typically not acceptable.

Evaluation: M

Showing the average salaries as big uniform colors across a borough suggests much more uniformity than there really is. (I would have recommended trying to find higher-resolution salary information.)

Murder rate percentage of what?

Some default code output.

  • Clarity of Explanations
    How well do you explain each figure/result?
    Do you provide interpretations that suggest further analysis or explanations for observed phenomena?

Evaluation: M

OK, I just don't really know what your main findings are.

  • Organization and cleanliness.
    Make sure to remove excessive warnings, use clean easy-to-read code, organize with sections or multiple pages, use bullets, etc.
    This page should be self-contained.

Evaluation: E

Big Picture Page

  • Clarity of Explanation
    You should have a clear thesis/goal for this page. What are you trying to show? Provide details that support your thesis but don't go into too much mathematics or statistics. The audience for this page is the general public (to the extent possible).

Evaluation: M

Your thesis just describes what you want to do, not what your answers are.

Do you really think the answer is to model education systems like those in Europe? Or are there other underlying factors?

  • Quality of Figures
    Each figure should be very polished and also not too complicated. There should be a clear interpretation of each figure and each figure should have a clear purpose.

Evaluation: R

These are the same as from the analysis page which were not very polished and had some issues.

  • Creativity
    Do your best to make things interesting. Think of a story. Think of how each part of your analysis supports the previous part or provides a different perspective.

Evaluation: M

OK, but not much of a story.


  • Video Recording
    Make a video recording (probably using Zoom) providing a quick explanation of your data and demonstrate some of the conclusions from your EDA.
    This video should be no longer than 4 minutes.
    Include a link to your video (and password if needed) in your file on your GitHub repository. You are not required to provide a link on the website.
    This can be presented by any subset of the team members.

Evaluation: E

Rest of the Site

  • General organization and cleanliness of website
    The main title of your page is informative.
    Each post has an author/description/informative title.
    All lab required posts are present.
    Each page (including the home page) has a nice featured image associated with it.
    Your about page is up to date and clean.
    You have removed the generic posts from the initial site template.

Evaluation: M

No title on home page.

Blog Post 1: Data Proposal Feedback

Data set 1

There are effectively only 5 variables in this data set so that is pretty limiting. It could be interesting if you have good ideas about what other data to combine it with. You might also look into the original sources for this data to find a more detailed data set.

Data set 2

This sounds promising but I'd like to know a little more about the different variables in the data set.

Data set 3

Since this data has been used for this project by a few previous teams, I'd recommend against using it unless you feel you have some novel questions to ask.

Data page feedback

Data Page

  • Describe where/how to find data.
    You must include a link to the original data source(s). From what you can tell, why was the data collected/curated? Who collected the data?

Evaluation: M

Data just being from NY would make it biased if you were trying to generalize but not if you are trying to make statements about NY. Think about this as limiting the scope of your conclusions rather than bias.

  • Describe the different data files used and what each variable means.
    If you have many variables then only describe the most relevant ones and summarize the rest. Bulleted lists or tables are recommended.

Evaluation: M

Include more details on the variables you'll be focusing on.

  • Describe any cleaning you had to do for your data.
    You must include a link to your load_and_clean_data.R file.
    Also, describe any additional R packages you used outside of those covered in class.
    Describe how you combined multiple data files and any cleaning that was necessary for that.
    Some repetition of what you do in your load_and_clean_data.R file is fine and encouraged if it helps explain what you did.

Evaluation: R

Explain what you are doing, and only show code if it helps those explanations.

  • Organization, clarity, cleanliness of the page
    Make sure to remove excessive warnings, use clean easy-to-read code (without side scrolling), organize with sections, use bullets and other tools, etc.

Evaluation: R

Remove content from original page.
