https://drive.google.com/drive/folders/1VjAMAWWPs-8v2p5-oEltpz3KynAvJIKn?usp=drive_link
sussmanbu / final-project-team1
This project forked from sussmanbu/ma415_final_project
Evaluation: R
Evaluation: R
Evaluation: R
Evaluation: R
Shooting data for Boston PD shooting cases, using tidycensus
Examine shooting cases in Boston and link them with demographic data
The team is trying to find the correlation of shooting cases with race
The main idea of the project is to analyze occurrences of shootings in Boston and use census data to determine correlation between employment status, single parent households, household ownership, sex, race, and geographic information and the number of occurrences.
They explored the correlation with each variable to decide which is the best predictor, and gave very clear graphs to visualize it.
Rich dataset that brings in tidycensus data to enrich the analysis comparing different regions; the demographic data for each region in Boston is a really good combination.
The team is doing great with the tidycensus package by showing the correlation of shooting cases with race in each area of Boston.
I think that the logistic modeling looked well done and they were able to determine the best predictor for the frequency of shootings, which was location.
Maybe merge more data to support the original dataset, so it might be easier to write the analysis and the big picture.
Maybe some other forms of statistical modeling can be done to improve the project; they could think about a region-level regression analysis.
The team did some preliminary data analysis; I think the team should include a more high-level model.
Similar projects had maps for the occurrences; I think that would be a good thing to include in order to visualize their results.
N/A
Overall it is good.
In general, income per capita and shootings have a negative correlation.
Income per capita and shooting frequencies have a negative correlation, with a focus on race.
They are focusing on the relationship between income and crime rate/shooting frequency.
The analysis of crime rates per race, particularly focusing on shooting frequencies, reveals significant disparities, especially in certain districts of Boston where higher shooting frequencies are observed, with data segmented by race and gender. This underscores the complex interplay of socio-economic factors such as poverty and education, which potentially impact the frequency of shootings.
What are the factors behind shooting incidents within the Boston area?
The visual graphs were very informative, such as finding that Brighton and Roxbury had the highest frequency of shootings. Looking at unemployment rate was also an informative metric to use.
The graphics are very nice and I like how the color of the bars indicates a certain level on the scale. Lots of good analyses.
They explored the data from many different aspects, such as gender, income, race, poverty, education, etc.
The team's collaborative effort yielded valuable insights, notably identifying areas such as Brighton and Roxbury with the highest shooting frequencies. Through thorough analysis, the group concluded that individuals involved in these shootings are predominantly Black, often residing in lower-income households.
They did a good job at visualizing the data and stats, using different models to indicate different areas in Boston, especially the Roxbury and Brighton areas.
It would be interesting if they used a model to predict the frequency of shootings in other districts to see how the graph would look.
I am unsure if they have a prediction model, but incorporating one would be the biggest improvement.
I think overall it's going pretty well. Eric mainly focused on explaining how they found and cleaned the data. He didn't go into much detail, but it sounds fun!
Other variables such as neighborhood demographics, police presence, and access to social services may also contribute to variations in crime rates and shooting incidents, and could be included.
They could look into more variables that could affect the shooting incident percentage, which would make the data more convincing.
In addition to the analysis of shooting frequencies and demographic patterns, it could be beneficial for the team to explore the underlying factors contributing to these disparities. For example, conducting qualitative research or community engagement sessions to understand residents' perspectives on crime, policing, and neighborhood dynamics could provide valuable context. Additionally, investigating the impact of structural inequalities, such as housing segregation, access to healthcare, and employment opportunities, can further enrich the analysis.
The shooting data in Boston, based on region as well as race, income, and education.
Boston shooting data 2015-2024
In their exploratory analysis, they looked at relationships between shooting incidents vs different variables such as education, employment, race, and median income. They found correlations for race and median income with shooting incidents by district.
This project explores shooting data in Boston. They explore where shootings occur and what factors may lead to them. They explore factors such as high school completion, race, employment, etc.
Boston shooting data. Police data. Comes with when each shooting happened, the victims, and whether it was fatal. Race, incident numbers, etc.
The distribution of the Boston shooting data 2021-2024.
Relationship between shooting incidents and variables such as race, age, and district. 1. Average income and shooting incidents (negative correlation). 2. High school completion (not a clear trend).
They try to explain the shootings based on various aspects. This provides a general view of how each aspect does or does not have an impact on the subject.
They did a lot of exploratory data analysis, with some well-put-together graphs. I liked the use of color in their graphs.
I like how their project is very clear and concise. It is very easy to tell what this group's thesis is, and every visualization and explanation they had clearly helped explain their thesis.
Nice graphs and interesting findings, well explained. The finding about fatal shootings was especially interesting.
The combination of geographical and educational data is clearly reasoned. The data is divided into smaller categories of more specific situations, such as fatal and non-fatal.
The team really captures good variables when drawing conclusions, and they also use many regression models.
Some of the data is incomplete. The heat map is incomplete. They could explain more about why they chose such a model to interpret the data.
He mentioned working on a heatmap that would be really helpful for visualizing data. They haven't gotten to any regression analysis yet, so no specific conclusions about any variables and how that relates to number of shooting incidents.
The project is a bit repetitive. The first 5-6 graphs all look exactly the same. I feel that they could have used some sort of time series data, or some other kind of data to make their visualizations more exciting.
Working on developing the project for submission. No current suggestions, just continue working.
More regression analysis is needed in future steps. Perhaps the line of best fit and the correlation coefficient need to be explained?
They could do more explanatory data analysis, which is useful when presenting, and conduct more research.
I think a geographic data visualization would fit really well with this project.
The project is about analyzing shooting data (victims and perpetrators) in the Boston area
Analyzing shooting data in Boston area by neighborhoods
They're investigating shootings in the Boston area separated by district.
Analyze the shooting incidents across Boston using the location of the district, and the education level of the
Seeing which demographic factors the shooting data in the Boston area are related to
Frequency of shootings with income level, parent background, and educational level as predictor variables
I like how they analyze the relationship of median income vs total cases over all years. Their analysis on avg median income vs total cases is interesting.
The graphs they have right now tell good information. For example, they have a bar graph that shows the average income by the different neighborhoods and have the bars shaded to fit the number of shootings in the area.
I liked the variety of figures and how they were able to combine multiple variables onto one figure.
The bar graph of cases in different areas visualizes the data well, and I think their topic is interesting and helpful in the real world.
Choosing good variables to run regressions on and making sure they had good correlation coefficients.
I think they had a great background on their data and how they merged datasets together. They also have a good understanding of how to go forward with their project and to add more to make it more compelling
Make sure the trends are clearer. The strength of the trends should also be explained more. The reliability of the trends must also be analyzed.
I would suggest changing the color of the bars so that the darker colors indicate more shootings. Also explore a wider variety of graphs, as it's just bar graphs right now. More information is needed for the modeling.
Their project looks good so far, I don't have any suggestions. I think they just need to put everything together and polish the site.
They can improve their regression model by changing which variables are used, to make the fitted model better.
Clearer charts in terms of coloring and axis labels, as well as more exploratory data analysis, would be good.
More work on the analysis on the website itself and more of a comprehensive understanding of what the data is saying.
Excellent project overall!
Change the color scale of the figures so that darker colors are greater values.
I think they could change some of the graphs to be more visually appealing
Gun violence across different races, interacting with residence in different districts. Employment status, education level, household weight.
They investigated gun violence in the Boston area. They were interested in socioeconomic factors, as well as other variables that would influence the number of incidents in a district. These included education, number of people below the poverty line, median income and more.
Social and economic factors in gun violence in the Boston area, looking at different factors such as income and education.
Social aspects of incidents in Boston, using income, district, and household information as features. Roxbury has the most incidents and Brighton has the fewest.
Gun issues in MA. The analysis includes the number of incidents in different districts and the poverty level. Roxbury has the highest number of incidents, while Brighton has the fewest.
This data includes both fatal and non-fatal shootings, and they analyze if a victim was struck by a bullet within the City of Boston.
Plotted a bar chart to present income by race.
Have a scatter plot: outliers cleaned out.
Linear regression model: has p-value and confidence interval for each coefficient.
The graphs were well-labeled and color-coded clearly. They also had linear regressions. It was great to see a variety of graphs being used appropriate to the data being presented.
The topic was really interesting to get into and the data discovery was in depth. I would say their visualization looks good too.
Correlation analysis on their dataset was pretty interesting
Median income has a negative correlation.
District has a positive correlation.
Poverty has a positive correlation with the number of incidents.
They use linear regression models, where district is one of the variables. They evaluate the p-value and the coefficient given by the model.
They have an analysis of the noninstitutional population vs. distribution by district and cases; they showed that some districts have very low population.
Including a map might be more interesting. Also, the district of Roxbury is unclear, because it is separated into west and east, but this group did not account for that.
We did not get to discuss data equity but given the nature of this information it would be important to highlight any factors that can skew this, over-reporting or under-reporting etc.
Making the blog post presentable would be great! I would love to see more exploration of different locations.
Because the team works with a dataset of 10 years, it can be difficult to analyze. They could compare previous years with recent years in a plot.
Maybe include more variables in the analysis. Right now the group only has district and income level as their main variables. If they include some more variables, they can reach a better conclusion for their topic.
Everything is good, but a more detailed analysis would be better; I think their current analysis is not quite enough.
It is pretty informative and interesting. Good job on this project.
no
I would try to find some kind of data set that provides the spatial regions of the different districts since this would allow you to incorporate other location level information.
There would be some challenges because I think this data is aggregated and it might be hard to get cross tabulations like the number of white, male, smokers, aged 26-30. Maybe look around for other tobacco related data.
Many possibilities but this also has the challenges of aggregated data. Read about how the data is suppressed as well.
http://edsight.ct.gov/relatedreports/BDCRE%20Data%20Suppression%20Rules.pdf
Total
121/125 (96.8%)
Evaluation: E
Evaluation: E
Evaluation: M
There seem to be issues with how you did the district assignments in the code you are showing: GEOID == c(2136,2137).
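A likely source of the issue, sketched in base R under assumptions: in R, `==` against a length-two vector compares position by position and recycles the shorter vector, so it is not a membership test. The toy table below is made up (2136 and 2137 come from the comment above, 2140 and the district labels are hypothetical) and is not the project's data.

```r
# Hypothetical tract table; only 2136 and 2137 appear in the feedback above.
tracts <- data.frame(GEOID = c(2137, 2136, 2140, 2136))

# `==` recycles c(2136, 2137) down the column, so row 1 is compared with 2136,
# row 2 with 2137, row 3 with 2136, and so on. A row matches only by position.
eq_match <- tracts$GEOID == c(2136, 2137)

# `%in%` tests each row for membership in the set, independent of position.
in_match <- tracts$GEOID %in% c(2136, 2137)

# Placeholder labels, assumed for illustration only.
tracts$district <- ifelse(in_match, "District A", "Other")

print(data.frame(GEOID = tracts$GEOID, eq_match, in_match))
```

In this toy table `==` matches no rows while `%in%` matches three, which is the kind of silent mis-assignment the comment seems to point at. The same `%in%` test works inside a tidyverse `filter()` or `mutate()` call.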
Evaluation: M
Code is OK but not the easiest to follow. Use the tidyverse!
Evaluation: E
Evaluation: R
Polynomial doesn't really make sense for categorical.
No interpretation of coefficients or explanation of results.
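One way to act on both points, sketched in base R under assumptions: district is a label rather than a quantity, so it can enter the model as a factor, and each coefficient then reads as a difference from the baseline district. The district names and case counts below are made up for illustration and are not the project's data.

```r
# Made-up data: two observations (e.g. two years) per hypothetical district.
shootings <- data.frame(
  district = c("A", "A", "B", "B", "C", "C"),
  cases    = c(10, 12, 30, 34, 5, 7)
)

# Treat district as a categorical predictor; "A" becomes the baseline level.
shootings$district <- factor(shootings$district)

fit <- lm(cases ~ district, data = shootings)

# (Intercept) is the mean number of cases in district A.
# districtB and districtC are the mean differences from district A.
print(coef(fit))
```

Read this way, the interpretation is direct: the intercept is the baseline district's average, and each remaining coefficient says how many more or fewer cases that district averages compared with the baseline.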
Evaluation: E
lm output and plots are typically not acceptable.
Evaluation: M
Why not just show correlations with total cases? Do you care about the other correlations?
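A sketch of what that could look like in base R: compute the full correlation matrix, then keep only the column for total cases. The column names and numbers are hypothetical placeholders, not the project's data.

```r
# Made-up district-level table; every name and value here is an assumption.
districts <- data.frame(
  total_cases   = c(40, 25, 10, 5),
  median_income = c(30, 45, 70, 90),
  poverty_rate  = c(28, 20, 10, 6)
)

# Correlation of every variable with total_cases only.
cors <- cor(districts)[, "total_cases"]

# Drop the trivial correlation of total_cases with itself.
cors <- cors[names(cors) != "total_cases"]

# Sorted so the strongest negative and positive relationships sit at the ends.
print(sort(cors))
```

This keeps one number per predictor, which is easier to present than a full matrix when total cases is the only outcome of interest.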
Figures are interesting and suggest future questions but those questions remain unanswered.
Tables are just one step beyond standard lm output.
Evaluation: R
Not a lot of explanation of figures/results.
Evaluation: E
Evaluation: E
Evaluation: M
Don't include 2024 in plot.
Colors to indicate total cases are hard to parse.
Don't include correlation coefficient.
Evaluation: M
Not much of a flow/story.
Evaluation: E
Evaluation: E