
prospect-scraper-dt2021's Introduction

leagify

Leagify - complete with new name!

Most of the code is in the repository with the old name, Leagueify.

prospect-scraper-dt2021's People

Contributors

alesandrof, christian-oleson, jeromechrist, robsmithw, rollerss, scottburfieldmills, zo0o0ot

prospect-scraper-dt2021's Issues

Update ProspectRanking and MockDraftPick objects to add new COVID changes related info

This year, prospect rankings and mock draft picks should have information about whether or not their schools have decided to participate in games this year.

This information should match the property names that are set up in #14 .

For example, ProspectRanking looks like this:

public class ProspectRanking
{
    public int rank;
    public string change;
    public string playerName;
    public string school;
    public string position1;
    public string height;
    public int weight;
    public string collegeClass;
    public DateTime rankingDate;
    public string rankingDateString;
    public string draftStatus;
}

Additional properties will likely be:

  • fallFootball
  • springFootball

The ProspectRankingMap will likely need the additional properties as well.
Here's how it looks so far:

public ProspectRankingMap()
{
    //AutoMap();
    // public int rank;
    // public string change;
    // public string playerName;
    // public string school;
    // public string position1;
    // public string height;
    // public int weight;
    // public string position2;
    // public DateTime rankingDate;
    Map(m => m.rank).Index(0).Name("Rank");
    Map(m => m.change).Index(1).Name("Change");
    Map(m => m.playerName).Index(2).Name("Player");
    Map(m => m.school).Index(3).Name("School");
    Map(m => m.position1).Index(4).Name("Position");
    Map(m => m.height).Index(5).Name("Height");
    Map(m => m.weight).Index(6).Name("Weight");
    Map(m => m.collegeClass).Index(7).Name("CollegeClass");
    Map(m => m.rankingDateString).Index(8).Name("Date");
    Map(m => m.draftStatus).Index(9).Name("DraftStatus");
}
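
A rough sketch of how the two new properties might be added to both the class and the map. This assumes the map derives from CsvHelper's ClassMap<ProspectRanking> (as SchoolCsvMap does for School), that the new values are booleans, and that the new columns are appended at indexes 10 and 11 with the header names "FallFootball" and "SpringFootball". None of that is settled yet, so treat the names and types as placeholders.

```csharp
using System;
using CsvHelper.Configuration;

namespace prospectScraper
{
    public class ProspectRanking
    {
        public int rank;
        public string change;
        public string playerName;
        public string school;
        public string position1;
        public string height;
        public int weight;
        public string collegeClass;
        public DateTime rankingDate;
        public string rankingDateString;
        public string draftStatus;
        // Assumed additions: names follow the issue text, the bool type is a guess.
        public bool fallFootball;
        public bool springFootball;
    }

    public sealed class ProspectRankingMap : ClassMap<ProspectRanking>
    {
        public ProspectRankingMap()
        {
            Map(m => m.rank).Index(0).Name("Rank");
            Map(m => m.change).Index(1).Name("Change");
            Map(m => m.playerName).Index(2).Name("Player");
            Map(m => m.school).Index(3).Name("School");
            Map(m => m.position1).Index(4).Name("Position");
            Map(m => m.height).Index(5).Name("Height");
            Map(m => m.weight).Index(6).Name("Weight");
            Map(m => m.collegeClass).Index(7).Name("CollegeClass");
            Map(m => m.rankingDateString).Index(8).Name("Date");
            Map(m => m.draftStatus).Index(9).Name("DraftStatus");
            // Assumed column positions and header names for the new fields.
            Map(m => m.fallFootball).Index(10).Name("FallFootball");
            Map(m => m.springFootball).Index(11).Name("SpringFootball");
        }
    }
}
```

MockDraftPick would presumably get the same two fields and two extra Map lines.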

htmlagilitypack - Operation times out in GitPod

Unhandled exception. System.Net.WebException: The operation has timed out.
   at System.Net.HttpWebRequest.GetResponse()
   at HtmlAgilityPack.HtmlWeb.Get(Uri uri, String method, String path, HtmlDocument doc, IWebProxy proxy, ICredentials creds)
   at HtmlAgilityPack.HtmlWeb.LoadUrl(Uri uri, String method, WebProxy proxy, NetworkCredential creds)
   at HtmlAgilityPack.HtmlWeb.Load(Uri uri, String method)
   at HtmlAgilityPack.HtmlWeb.Load(String url, String method)
   at HtmlAgilityPack.HtmlWeb.Load(String url)
   at prospectScraper.ProspectScraper.RunTheMockDraft(Boolean parseDate) in /workspace/prospect-scraper-dt2021/src/prospectScraper/ProspectScraper.cs:line 89
   at prospectScraper.Program.Main(String[] args) in /workspace/prospect-scraper-dt2021/src/prospectScraper/Program.cs:line 42

Attempted to run a scrape for a new mock draft, but both the big board and the mock draft hit this timeout tonight. I hope it resolves itself, but if not, I'll need to figure out why requests would suddenly start timing out when the web page loads fine in a browser.
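
One thing that might help while diagnosing this is fetching the page with an explicit timeout and a few retries instead of a single HtmlWeb.Load call. The sketch below is only an assumption about a possible workaround, not a confirmed fix for the GitPod problem. RetryingFetcher, its method names, the 30 second timeout, and the attempt count are all made up for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class RetryingFetcher
{
    // Runs fetch up to maxAttempts times, waiting a little longer after each failure.
    // The last failure is allowed to propagate to the caller.
    public static async Task<string> FetchAsync(Func<Task<string>> fetch, int maxAttempts, TimeSpan baseDelay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await fetch();
            }
            catch (Exception ex) when (attempt < maxAttempts &&
                (ex is HttpRequestException || ex is TaskCanceledException))
            {
                // HttpClient reports its own timeout as a TaskCanceledException.
                Console.WriteLine($"Attempt {attempt} failed: {ex.Message}");
                await Task.Delay(TimeSpan.FromMilliseconds(baseDelay.TotalMilliseconds * attempt));
            }
        }
    }

    // Downloads the page HTML with a 30 second timeout and up to 3 attempts.
    public static async Task<string> DownloadAsync(string url)
    {
        using (var client = new HttpClient { Timeout = TimeSpan.FromSeconds(30) })
        {
            return await FetchAsync(() => client.GetStringAsync(url), 3, TimeSpan.FromSeconds(2));
        }
    }
}
```

The returned string could then be handed to HtmlAgilityPack's HtmlDocument.LoadHtml, so the rest of the parsing code would not need to change.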

Add command line parameter to ignore date and use today instead

Sometimes, the site gets updated, but the date on the page does not.

New page (same date):
[screenshot: compare]

Old page:
[screenshot: original]

Create a second command line parameter to let the program ignore the parsed date and use today instead.

Since command line parameters are currently optional, this parameter should take that into account.

I assume we should name this variable something like ignorePageDate.

RunTheBigBoards() and RunTheMockDraft() should both take this variable as a parameter.

Applicable code is here:

static void Main(string[] args)
{
    // Initial attempt to handle command line arguments.
    // "bb" for the big boards, "md" for mock drafts, "all" for everything
    if (args.Length == 0)
    {
        Console.WriteLine("No Arguments- Type bb for big board, md for mock draft, all for both. Running both by default.....");
        RunTheBigBoards();
    }
    else
    {
        string s = args[0].ToString().ToLower();
        switch (s)
        {
            case "bb":
                Console.WriteLine("Running Big Board");
                RunTheBigBoards();
                break;
            case "md":
                Console.WriteLine("Running Mock Draft");
                RunTheMockDraft();
                break;
            case "all":
                Console.WriteLine("Running Big Board and Mock Draft");
                RunTheBigBoards();
                RunTheMockDraft();
                break;
            default:
                Console.WriteLine("Input argument of " + s + " not recognized. Please try running again.");
                RunTheBigBoards();
                break;
        }
    }
    //RunTheBigBoards();
}
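
A minimal sketch of how the optional second parameter could be parsed, assuming it is passed as a second positional argument. The flag text "ignoredate", the ScraperOptions class, and the ChooseDate helper are hypothetical names, not anything that exists in the repository.

```csharp
using System;

public static class ScraperOptions
{
    // Hypothetical flag text; the issue only suggests a variable named ignorePageDate.
    public const string IgnoreDateFlag = "ignoredate";

    // Both arguments stay optional: a missing mode falls back to "all" (an assumption
    // based on the "Running both by default" message), and a missing flag means the
    // date parsed from the page is used.
    public static (string Mode, bool IgnorePageDate) Parse(string[] args)
    {
        string mode = args.Length > 0 ? args[0].ToLowerInvariant() : "all";
        bool ignorePageDate = args.Length > 1 &&
            string.Equals(args[1], IgnoreDateFlag, StringComparison.OrdinalIgnoreCase);
        return (mode, ignorePageDate);
    }

    // Picks today's date when the page date should be ignored, otherwise the parsed date.
    public static DateTime ChooseDate(DateTime parsedPageDate, bool ignorePageDate, DateTime today)
    {
        return ignorePageDate ? today : parsedPageDate;
    }
}
```

Main could then call Parse(args) once and pass IgnorePageDate into RunTheBigBoards() and RunTheMockDraft(), for example when run as "dotnet run md ignoredate".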

Create C# representation of CovidOptOut.cs

Data regarding NCAA conference participation in football games is represented in CovidOptOuts.csv.

Using CsvHelper, create a way to get this data into the program. In this case, we only need to read the information in. We do not need to write this information back out to a file.

Probable field names:

  • conference
  • fallFootball
  • springFootball

There are existing implementations of this pattern, for example in School.cs:

using CsvHelper.Configuration;

namespace prospectScraper
{
    public class School
    {
        public string schoolName;
        public string conference;
        public string state;

        public School () {}

        public School (string schoolName, string conference, string state)
        {
            this.schoolName = schoolName;
            this.conference = conference;
            this.state = state;
        }
    }

    public sealed class SchoolCsvMap : ClassMap<School>
    {
        public SchoolCsvMap()
        {
            Map(m => m.schoolName).Name("School");
            Map(m => m.conference).Name("Conference");
            Map(m => m.state).Name("State");
        }
    }
}
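
Following the School.cs pattern, a read-only version for the opt-out data might look roughly like the sketch below. The class names CovidOptOut, CovidOptOutCsvMap, and CovidOptOutReader are assumptions, as is treating both football columns as booleans stored as true/false text. The csv.Context.RegisterClassMap call is the form used by recent CsvHelper versions; older versions register maps through csv.Configuration instead.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;

namespace prospectScraper
{
    public class CovidOptOut
    {
        public string conference;
        public bool fallFootball;   // assumed to be stored as true/false in the CSV
        public bool springFootball; // assumed to be stored as true/false in the CSV
    }

    public sealed class CovidOptOutCsvMap : ClassMap<CovidOptOut>
    {
        public CovidOptOutCsvMap()
        {
            // Header names taken from the current CovidOptOuts.csv header.
            Map(m => m.conference).Name("Conference");
            Map(m => m.fallFootball).Name("FallFootball");
            Map(m => m.springFootball).Name("SpringFootball");
        }
    }

    public static class CovidOptOutReader
    {
        // Reads every row from the supplied reader; nothing is written back.
        public static List<CovidOptOut> Read(TextReader reader)
        {
            using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
            {
                csv.Context.RegisterClassMap<CovidOptOutCsvMap>();
                return csv.GetRecords<CovidOptOut>().ToList();
            }
        }
    }
}
```

Taking a TextReader rather than a file path keeps the reader easy to exercise from a test with a StringReader, while production code can pass a StreamReader opened on the file in the info directory.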

Create a more accurate version of CovidOptOuts.csv

This file was originally created when entire football conferences started cancelling or postponing their seasons. Some of those conferences have now backtracked on their decisions, so it might be better to change this CSV file to reflect that.

Original file is here:
https://github.com/Leagify/prospect-scraper-dt2021/blob/master/src/prospectScraper/info/CovidOptOuts.csv

Current Header is:
Conference,FallFootball,SpringFootball

A more accurate header would probably be:
School,Conference,2020ScheduledGames

Using SchoolStatesAndConferences.csv would be a good start:
https://github.com/Leagify/prospect-scraper-dt2021/blob/master/src/prospectScraper/info/SchoolStatesAndConferences.csv

The state column would not be needed.

Also, D2 and D3 schools are not necessary. The only exception would be schools that have prospects actually appear in either the mock draft or big board, and those players tend to be few and far between. The argument could also be made for FCS schools to be omitted until a player is mentioned, but I know that Trey Lance from NDSU is already a hotly discussed prospect.

The annotation for nullable reference types should only be used in code within a '#nullable' annotations context.

When building, there's a warning message:

Warning: /home/runner/work/prospect-scraper-dt2021/prospect-scraper-dt2021/src/prospectScraper/MockDraftPick.cs(149,19): warning CS8632: The annotation for nullable reference types should only be used in code within a '#nullable' annotations context. [/home/runner/work/prospect-scraper-dt2021/prospect-scraper-dt2021/src/prospectScraper/ProspectScraper.csproj]

It's annoying, and I'd like to fix that so it doesn't complain any more.
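
CS8632 appears when a reference type is written with a trailing ? (such as string?) in code where nullable annotations are not enabled. The usual options are to remove the ?, to add <Nullable>enable</Nullable> to the project file, or to enable the annotation context only around the affected code. A small sketch of the last option follows; the class and member names are hypothetical, since the actual contents of MockDraftPick.cs line 149 are not shown here.

```csharp
#nullable enable
public class NullableExample
{
    // Hypothetical member: inside a '#nullable enable' region, 'string?' no longer triggers CS8632.
    public string? teamCity;

    // Returns a fallback label when the value is missing.
    public static string Describe(string? value)
    {
        return value ?? "unknown";
    }
}
#nullable restore
```

Enabling the context for the whole project is the broader change and may surface additional nullable warnings elsewhere, so the file-level directive could be the smaller first step.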

Test project not included in the main solution

Is there any reason prospectScraperTest.csproj (the test project) is not included in prospectScraper.sln (the main solution)?
I noticed this while making a change for another issue. If the test project were part of the main solution, contributors developing in Visual Studio would see it regardless of what they are changing, which would make them more likely to run the tests before committing changes (I would hope 😄).

Add a new "fall football" boolean csv.

In the info directory, add a new CSV file to track whether a school is going to be playing football in the fall.

Conferences that aren't playing in the fall (AFAIK):

  • Big Ten
  • Pac-12
  • Ivy League
  • Division 2

It seems like doing this by conference will be a good way to go, as the conference is listed in SchoolStatesAndConferences.csv - I'm not 100% sure that information is in the prospect object, though, so there may need to be some work in integrating all this together afterward.

First things first.
Create the CSV with two fields:
Conference,FallFootball
string, bool

If schools start trying to have seasons in the spring or doing something else, I'm assuming that I'll just add another column to track that as a boolean as well.

Once this has been created, I'll need to make some additional issues to integrate this into the actual process of scraping, as (at this moment) this is information that is not being scraped from the site, but added to the obtained data afterwards.
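
To illustrate the later integration step, here is a small sketch of looking up a school's fall status through its conference. It parses the two-column layout by hand only to stay dependency-free; the real project would more likely read the file with CsvHelper. FallFootballLookup and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FallFootballLookup
{
    // Builds a conference -> playing-in-fall table from lines shaped like "Conference,FallFootball".
    // The naive Split does not handle quoted commas, which is fine for this illustration only.
    public static Dictionary<string, bool> Parse(IEnumerable<string> csvLines)
    {
        var result = new Dictionary<string, bool>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in csvLines.Skip(1)) // skip the header row
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            string[] parts = line.Split(',');
            result[parts[0].Trim()] = bool.Parse(parts[1].Trim());
        }
        return result;
    }

    // Returns null when the conference is not listed, so callers can tell "unknown" from "not playing".
    public static bool? IsPlayingInFall(Dictionary<string, bool> byConference, string conference)
    {
        return byConference.TryGetValue(conference, out bool playing) ? playing : (bool?)null;
    }
}
```

The conference for each prospect would come from matching the prospect's school against SchoolStatesAndConferences.csv, after which IsPlayingInFall could fill in the value on the scraped record.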
