
docsite's People

Contributors

abilityguy, ajaits, beets, chadwyck242, chejennifer, clincoln8, dependabot[bot], donaldrgosselin, dwnoble, iancostello, jehangiramjad, juliawu, kilimannejaro, kmoscoe, lucy-kind, mvashishtha, n-h-diaz, paras-jain, pdurbin, pradh, pulkit-s, rvguha, sharadshriram, shifucun, spaceenter, tjann

docsite's Issues

Add a helper function for getting the parent location (inverse of get_places_in)

Currently there's no way to get the parent location or area for a given place, although you can find the places within a place.

The actual problem I faced was using the CDC 500 City list and trying to find the state that each city is in.

Here's my rather hacky workaround for doing it from the other direction, which involves finding another list of places and merging:

import datacommons as dc
import pandas as pd

# Get the 500 cities in the CDC500 list
cdc500_dcids = pd.DataFrame(
    dc.get_property_values(["CDC500_City"], "member", limit=500)["CDC500_City"]
).rename(columns={0: "DCID"})

# Get the list of 50 US states by dcid
states = dc.get_property_values(
    ["PlacePagesComparisonStateCohort"], "member"
)["PlacePagesComparisonStateCohort"]

# Get all the cities in those states by dcid (returns {state_dcid: [city_dcid, ...]})
cities = dc.get_places_in(states, "City")

# Flatten the mapping into (state, city) pairs and convert it into a dataframe
cities_list = [(state, city) for state, city_dcids in cities.items() for city in city_dcids]
cities_df = pd.DataFrame(cities_list, columns=["State_DCID", "City_DCID"])

# Merge with our original data so each CDC 500 city carries its state
cities_cdc500 = pd.merge(cdc500_dcids, cities_df, left_on="DCID", right_on="City_DCID")
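
The merge above amounts to inverting the state-to-cities mapping that get_places_in returns. Below is a minimal sketch of that inversion as a standalone helper. The name invert_places_in is hypothetical and not part of the datacommons client, the sample input is illustrative rather than fetched from the API, and the sketch assumes each child place is listed under a single parent.

```python
from typing import Dict, List


def invert_places_in(places_in: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each child DCID to the parent DCID it was listed under."""
    parent_of: Dict[str, str] = {}
    for parent, children in places_in.items():
        for child in children:
            # If a child shows up under several parents, the first one seen wins.
            parent_of.setdefault(child, parent)
    return parent_of


# Illustrative input shaped like a get_places_in result; not real API output.
sample = {
    "geoId/06": ["geoId/0649670"],
    "state/X": ["city/X1", "city/X2"],  # hypothetical placeholder DCIDs
}
parent_of = invert_places_in(sample)
```

With the real data, parent_of could be built from the cities dictionary above and looked up per CDC 500 city, which avoids the intermediate dataframe and merge.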

[v1-rest] add links to browser nodes

Across the docs, there are many references to DCIDs (in the examples, for instance). It would be nice to link them directly to the graph nodes, so users can learn how to navigate the graph too. (In the image below, all the text in "code" could be linked.)

image

API doc for new REST endpoint: /stat/collection

This endpoint allows a user to get the value associated with a statistical variable for a set of child places of a certain type, for a given date.

Endpoint: /stat/collection
Parameters:

  • parent_place
  • child_type
  • date
  • stat_vars (list)

Available as GET

Example:
https://api.datacommons.org/stat/collection?parent_place=country/USA&child_type=State&date=2013&stat_vars=Count_Person

Note: a direct parent/child place relationship is required here as well. For example, country/USA with State works, but country/USA with County does not; counties have to be requested through their state, as in the first URL below:
https://api.datacommons.org/stat/collection?parent_place=geoId/06&child_type=County&date=2013&stat_vars=Count_Person
https://api.datacommons.org/stat/collection?parent_place=country/USA&child_type=County&date=2013&stat_vars=Count_Person
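
For the docs, a short example that assembles the GET URL from the four parameters may help. The sketch below only builds the URL and sends nothing. The function name stat_collection_url is hypothetical, and repeating the stat_vars key for each list item is an assumption about how the endpoint reads a list in a query string.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.datacommons.org/stat/collection"


def stat_collection_url(parent_place, child_type, date, stat_vars):
    """Build a /stat/collection GET URL from the documented parameters."""
    params = {
        "parent_place": parent_place,
        "child_type": child_type,
        "date": date,
        "stat_vars": stat_vars,
    }
    # doseq=True emits one stat_vars=... pair per list item (assumed format);
    # safe="/" leaves the slash in DCIDs such as country/USA unescaped.
    return BASE_URL + "?" + urlencode(params, doseq=True, safe="/")


url = stat_collection_url("country/USA", "State", "2013", ["Count_Person"])
```

Here url reproduces the first example request above.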

Consolidate background common to contributors and users

Some of our docs can be combined and reused in a separate Background section, giving this top-level structure:

Background
API
Contributing

E.g. we have https://docs.datacommons.org/data_model.html AND a link to schema.org data model at https://docs.datacommons.org/contributing/background/background.html

StatVars are important to users too: https://docs.datacommons.org/api/python/stat_value.html, so information about StatVars (currently only here) will be relevant.

For this issue, feel free to take large liberties. As Guha put it, the current Representing Statistics documentation is not written with sensitivity to those who have not been in the space for years.

API doc for new REST endpoint: /stat/set

This endpoint allows a user to get the value for a set of statistical variables, for a single date, across a set of places.

Endpoint: /stat/set (code)

Params:

  • places (list)
  • stat_vars (list)
  • date

Available as POST

Example: curl -X POST https://api.datacommons.org/stat/set -d '{ "places": ["geoId/06", "geoId/0649670", "country/FRA", "country/USA"], "stat_vars": ["Count_Person", "Count_CriminalActivities_CombinedCrime"], "date": "2017"}'
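
A Python counterpart to the curl example could look like the sketch below. It builds the POST request without sending it. The function name stat_set_request is hypothetical, and the Content-Type header is an assumption, since the curl example above sets none explicitly.

```python
import json
from urllib.request import Request

STAT_SET_URL = "https://api.datacommons.org/stat/set"


def stat_set_request(places, stat_vars, date):
    """Build (but do not send) a POST request for /stat/set."""
    body = json.dumps({"places": places, "stat_vars": stat_vars, "date": date})
    return Request(
        STAT_SET_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


request = stat_set_request(
    ["geoId/06", "geoId/0649670", "country/FRA", "country/USA"],
    ["Count_Person", "Count_CriminalActivities_CombinedCrime"],
    "2017",
)
```

Passing request to urllib.request.urlopen would send it; the response format is not described in this issue, so it is left out here.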

Create SPARQL tutorial

Most SPARQL tutorials "out there" are rather confusing. It would be nice to have something more in depth attached to our docs.

[v1-rest property/value] more descriptions

Some nits:

  • Can we add more description or examples to property?
  • Example 1: value should come first in the response example
  • Example 2: can we add more description, and especially highlight in or out?

API doc for new REST endpoint: /place/stat/date/within-place

This endpoint allows a user to retrieve dates with data available for each statistical variable specified. The set of places to query is specified by an ancestor place, and the place type of the child places to consider (similar to our places-in API). This is helpful for building an interactive app to explore our data, e.g. https://staging.datacommons.org/tools/scatter2

Endpoint: /place/stat/date/within-place (code)

Params:

  • ancestor_place
  • place_type
  • stat_vars (repeated list of stat_vars accepted)

Available as both GET and POST

Request: curl -X POST https://api.datacommons.org/place/stat/date/within-place -d '{"ancestor_place": "geoId/06", "place_type": "City", "stat_vars": ["Count_Person"]}'
Response: {"Count_Person": ["2017", "2018", ...]}

or

https://api.datacommons.org/place/stat/date/within-place?ancestor_place=geoId/06&place_type=County&stat_vars=Count_Person&stat_vars=Count_Person_Female
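
For the interactive-app use case, a client would typically want the dates on which every requested variable has data. The sketch below computes that from a response shaped like the one shown above. The function name common_dates is hypothetical and the sample response is made up for illustration, not real API output.

```python
def common_dates(dates_by_stat_var):
    """Return the sorted dates present for every statistical variable."""
    date_sets = [set(dates) for dates in dates_by_stat_var.values()]
    if not date_sets:
        return []
    # Plain string sorting is chronological as long as all dates share one format.
    return sorted(set.intersection(*date_sets))


# Made-up response in the {stat_var: [dates]} shape described above.
sample_response = {
    "Count_Person": ["2017", "2018"],
    "Count_Person_Female": ["2018", "2019"],
}
shared = common_dates(sample_response)
```

A scatter plot of two variables can only use dates in shared, which is why the intersection rather than the union is taken.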

/cc: @shifucun

Limits 0-indexed in Python triple endpoint

When I run the Python code datacommons.get_triples(['dc/c3j78rpyssdmf','dc/7hfhd2ek8ppd2'],limit=2), the result contains three triples per DCID, rather than the two I would expect:

{'dc/c3j78rpyssdmf': [('dc/zn6l0flenf3m6', 'biosampleOntology', 'dc/c3j78rpyssdmf'), ('dc/tkcknpfwxfrhf', 'biosampleOntology', 'dc/c3j78rpyssdmf'), ('dc/c3j78rpyssdmf', 'provenance', 'dc/h2lkz1')], 'dc/7hfhd2ek8ppd2': [('dc/7hfhd2ek8ppd2', 'provenance', 'dc/h2lkz1'), ('dc/4mjs95b1meh1h', 'biosampleOntology', 'dc/7hfhd2ek8ppd2'), ('dc/13xcyzcr819cb', 'biosampleOntology', 'dc/7hfhd2ek8ppd2')]}

Likewise for limit=1:

>>> datacommons.get_triples(['dc/c3j78rpyssdmf','dc/7hfhd2ek8ppd2'],limit=1)
{'dc/c3j78rpyssdmf': [('dc/c3j78rpyssdmf', 'provenance', 'dc/h2lkz1'), ('dc/zn6l0flenf3m6', 'biosampleOntology', 'dc/c3j78rpyssdmf')], 'dc/7hfhd2ek8ppd2': [('dc/7hfhd2ek8ppd2', 'provenance', 'dc/h2lkz1'), ('dc/4mjs95b1meh1h', 'biosampleOntology', 'dc/7hfhd2ek8ppd2')]}
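
Until the cause is found, callers can cap the result on the client side. The sketch below is a workaround only and does not explain the off-by-one behavior. The name cap_triples is hypothetical, and which triples survive the cut simply follows the order the API returned them in.

```python
def cap_triples(triples_by_dcid, limit):
    """Keep at most `limit` triples for each DCID."""
    return {dcid: triples[:limit] for dcid, triples in triples_by_dcid.items()}


# Copied from the limit=1 result reported above, for one of the two DCIDs.
reported = {
    "dc/c3j78rpyssdmf": [
        ("dc/c3j78rpyssdmf", "provenance", "dc/h2lkz1"),
        ("dc/zn6l0flenf3m6", "biosampleOntology", "dc/c3j78rpyssdmf"),
    ]
}
capped = cap_triples(reported, limit=1)
```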

Bulk tags show up in page titles

We should perhaps add a property to the page (similar to page-order, etc), that can be used in the sidebar generator to add the tag there. Optionally could be used to add the word Bulk to the page title as well.

[v1-rest] Make a clearer note about returning the latest observation

https://juliawu.github.io/datacommons-docsite/api/rest/v1/observations/point

It is a little hidden that the latest observation is returned if a date isn't specified, especially since the main description says "Retrieve a specific observation at a set date from a variable for an entity." There should be a quick follow-up sentence there saying that the latest observation is returned; otherwise you have to hunt for it in the query params. Users could be quickly scanning through APIs and might miss this.

[v1-rest] Maintain old v0 API links

Since we have published links to our v0 API (which could be bookmarked, etc), please maintain the old link structure, or add redirects.

https://docs.datacommons.org/api/rest/place_in.html -- content
https://juliawu.github.io/datacommons-docsite/api/rest/place_in.html -- 404

Suggested fix: move old files back to the existing, prod, structure (so first link above continues to work)

As a follow-up, when we are ready to deprecate v0, we can move the old pages to a v0 subfolder with redirects and pointers to the new versions of the API.
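
One common way to add redirects on a static site is a small stub page left at the old path. The sketch below generates such a stub with a meta refresh. It is not the mechanism this site uses; the function name redirect_stub and the target path /api/rest/v0/place_in.html are both hypothetical.

```python
from html import escape


def redirect_stub(new_url):
    """Return a minimal HTML page that forwards visitors to new_url."""
    target = escape(new_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        f'<meta http-equiv="refresh" content="0; url={target}">\n'
        f'<link rel="canonical" href="{target}">\n'
        f'<a href="{target}">This page has moved.</a>\n'
    )


# Hypothetical new location for the old place_in page.
stub = redirect_stub("/api/rest/v0/place_in.html")
```

Writing stub to api/rest/place_in.html would keep the published v0 link working after the move.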

[v1-rest] Increase font size of "important" information

Especially since it's important, can we increase the size to at least match the rest of the body text?

Current:
image

Suggestion:
image

css for the icon:

    background: var(--dc-red-lite);
    color: white;
    padding-right: 0;
    margin-right: 0.5em;

I also removed font-size: 0.8rem on the containing div.

This applies to all divs with class alert (e.g. API key).

API requires long-form decimal as parameter

When running the cURL command curl --request GET --url 'https://api.datacommons.org/stat/value?place=country%2FGMB&stat_var=Amount_EconomicActivity_ExpenditureActivity_EducationExpenditure_Government_AsFractionOf_Amount_EconomicActivity_GrossDomesticProduction_Nominal&scalingFactor=100.0000000000', if I try to shorten the scalingFactor to 100.0, I get back a response with a 404 status code telling me that no stat data has been found.
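
As a client-side workaround, the scaling factor can be formatted with ten decimal places before it goes into the URL. The sketch below does that. The function name stat_value_url is hypothetical, and ten decimals is simply the form this issue reports as working, not a documented requirement.

```python
from urllib.parse import urlencode

STAT_VALUE_URL = "https://api.datacommons.org/stat/value"


def stat_value_url(place, stat_var, scaling_factor):
    """Build a /stat/value URL with the scaling factor in long decimal form."""
    params = {
        "place": place,
        "stat_var": stat_var,
        # Ten decimal places, matching the value that returned data in the report.
        "scalingFactor": f"{scaling_factor:.10f}",
    }
    return STAT_VALUE_URL + "?" + urlencode(params)


url = stat_value_url(
    "country/GMB",
    "Amount_EconomicActivity_ExpenditureActivity_EducationExpenditure_Government"
    "_AsFractionOf_Amount_EconomicActivity_GrossDomesticProduction_Nominal",
    100,
)
```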

Source files should be in their own directory

Issue

The markdown files that are used to build the documentation are in the top-level directory of this repository. This is an issue because non-documentation .md files also get built into production.

For example, the README of this repository is served at this link, which seems unintentional. LICENSE is also affected.

Also, since the source files are mixed with repository files in the top-level directory, it is hard to scan the contents and look for the file one is interested in!

Current Solution

Normally, pages that are not intended to be deployed are added to _config.yml under exclude. (see #140 )

However, this process is prone to human error (as evidenced by the README and LICENSE being exposed). The exclude list requires manual upkeep to stay up to date.

Proposed Change

I'd like to move all source files to a dedicated directory, perhaps called src or source. Then the Jekyll configuration can be changed in one line to look at that directory for files to build. This Jekyll option is described in the official documentation.

This is a straightforward code change, and therefore it should be suitable for me as a first-time contributor.

Benefits

  1. Remove human error and effort from keeping irrelevant .md files out of the production deployment.
  2. Better code organization, making it easier to understand the code structure and locate specific files.

Index page is empty

index.html only shows a "Welcome to Data Commons" title.
Instead, it should explain a little about Data Commons and also show the ToC.
