The docsite from datacommonsorg

[v1-rest] Update ordering of endpoints in sidebar

Move /info down (and /observations up) -- mostly ordering these in terms of expected usage.

Getting Started Guide
Triples
Properties
Property Values
Property Values (linked)
Single Observation
Single Observation (linked)
Series of Observations
Series of Observations (linked)
Place Info
Variable Info
Variable Group Info
Variables

Add a helper function for getting the parent location (inverse of get_places_in)

Currently there's no way to get the parent location or area for a given place, although you can find the places within a place.

The actual problem I faced was using the CDC 500 City list and trying to find the state that each city is in.

Here's my rather hacky workaround for doing it from the other direction, which involves finding another list of places and merging:

import datacommons as dc
import pandas as pd

# Get the 500 cities in the CDC500 list
cdc500_dcids = pd.DataFrame(
    dc.get_property_values(["CDC500_City"], "member", limit=500)["CDC500_City"]
).rename(columns={0: "DCID"})

# Get the list of 50 US states by dcid
states = dc.get_property_values(["PlacePagesComparisonStateCohort"], "member")["PlacePagesComparisonStateCohort"]

# Get all the cities in those states by dcid (returns {state_dcid: [city_dcid, ...]})
cities = dc.get_places_in(states, "City")

# Flatten the mapping into (state, city) rows and convert it into a dataframe
cities_list = [(key, x) for key, val in cities.items() for x in val]
cities_df = pd.DataFrame(cities_list, columns=["State_DCID", "City_DCID"])

# Merge with our original data so each CDC 500 city carries its state
cities_cdc500 = pd.merge(cdc500_dcids, cities_df, left_on="DCID", right_on="City_DCID")
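
Until a real helper exists, the merge step can be avoided by inverting the mapping that get_places_in returns. The sketch below is only an illustration: invert_places_in is a hypothetical name, not part of the datacommons package, and it assumes the input is a dict of parent DCID to a list of child DCIDs, as the list comprehension in the workaround implies.

```python
def invert_places_in(places_in):
    """Map each child DCID to the parent DCID it was listed under.

    Hypothetical helper, not part of the datacommons package. Assumes
    `places_in` looks like {parent_dcid: [child_dcid, ...]}.
    """
    parent_of = {}
    for parent, children in places_in.items():
        for child in children:
            # Keep the first parent seen if a child appears under several.
            parent_of.setdefault(child, parent)
    return parent_of


# Illustrative input only; the child DCIDs are placeholders, not real lookups.
sample = {"geoId/06": ["city/a", "city/b"], "geoId/36": ["city/c"]}
parent_of = invert_places_in(sample)
```

With the real output of get_places_in, parent_of[city_dcid] would give the state for each CDC 500 city directly.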

Python get_property_labels example for invalid dcid has valid dcids

http://docs.datacommons.org/api/python/property_label.html

The example should probably use "foo" and "bar" instead.

Fix broken img under Template MCF

https://docs.datacommons.org/contributing/background/mcf_format.html last section has a broken img

[v1-rest] add links to browser nodes

Across the docs, there are many references to DCIDs (in the examples, etc.). It would be nice to link directly to the graph nodes, so users can learn how to navigate those too. (In the image below, all the text in "code" could be linked.)

Python value errors are copy pasted

many are "get_property_labels"

Standardize notes notation across all API docs

Make notes look like this one:
https://docs.datacommons.org/api/sheets/get_variable.html

API doc for new REST endpoint: /stat/collection

This endpoint allows a user to get the value associated with a statistical variable for a set of child places of a certain type, for a given date.

Endpoint: /stat/collection
Parameters:

parent_place
child_type
date
stat_vars (list)

Available as GET

Example:
https://api.datacommons.org/stat/collection?parent_place=country/USA&child_type=State&date=2013&stat_vars=Count_Person

Note that a direct parent/child place relationship is required here as well: country/USA with State works, but country/USA with County does not. Compare these two requests (the first uses a direct relationship, the second does not):
https://api.datacommons.org/stat/collection?parent_place=geoId/06&child_type=County&date=2013&stat_vars=Count_Person
https://api.datacommons.org/stat/collection?parent_place=country/USA&child_type=County&date=2013&stat_vars=Count_Person
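
For the doc page, a small Python sketch of how such a URL could be assembled may help readers. The helper name stat_collection_url is hypothetical, and repeating stat_vars once per variable is an assumption (the issue only shows a single variable).

```python
from urllib.parse import urlencode

API_ROOT = "https://api.datacommons.org"


def stat_collection_url(parent_place, child_type, date, stat_vars):
    """Build a GET URL for /stat/collection (hypothetical helper)."""
    params = [
        ("parent_place", parent_place),
        ("child_type", child_type),
        ("date", date),
    ]
    # Assumption: a list is sent by repeating the stat_vars parameter.
    params += [("stat_vars", sv) for sv in stat_vars]
    # safe="/" keeps DCIDs such as country/USA readable in the URL.
    return API_ROOT + "/stat/collection?" + urlencode(params, safe="/")


url = stat_collection_url("country/USA", "State", "2013", ["Count_Person"])
```

This reproduces the first example URL in this issue.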

Differentiate the contributing pages

The docsite https://docs.datacommons.org/contributing/contributing_to_datacommons.html probably should be tailored to more practical how-to's rather than the standard Google https://github.com/datacommonsorg/docsite/blob/master/CONTRIBUTING.md

[v1-rest variables] use better example

https://juliawu.github.io/datacommons-docsite/api/rest/v1/variables

Can we include an example for a place (of a different type) with much more data, e.g. geoId/06, India, or USA? The example can use ellipses, but if users try it out, they'll see how much data we have (and they will most likely just copy/paste the example).

Python get_triples and get_property_labels does not mention series

Seems it accepts both list and series. get_property_values mentions series.

Consolidate background common to contributors and users

Some of our docs can be combined and reused-- a separate section:

Background
API
Contributing

E.g. we have https://docs.datacommons.org/data_model.html AND a link to schema.org data model at https://docs.datacommons.org/contributing/background/background.html

StatVars are important to users too: https://docs.datacommons.org/api/python/stat_value.html, so information about StatVars (currently only here) will be relevant.

For this issue, feel free to take large freedoms. As Guha put it, the current Representing Statistics documentation is not written with sensitivity to those who have not been in the space for years.

[v1-rest bulk/observations] make it clear in docs that more than 2 entities are possible

https://juliawu.github.io/datacommons-docsite/api/rest/v1/bulk/observations/series

At the moment, judging from the examples and request format at the top of the page, it looks as if exactly 2 entities & variables are required. Can we make it clearer that it's 1 or more? Perhaps with ellipses, and an example that uses a different number of arguments.

Clarify that get_populations will only return one pop per dcid

Will reflect this in R's PR, then subsequent PR to fix for Python/REST

e.g.:

"Given a list of Place DCID’s, return the DCID of StatisticalPopulation’s for these places, constrained by the given property values."

Add overview of sheets api to /api

datacommonsorg/website#324

Python get_property_values return contains info about get_property_labels

Probably a copy-paste error.

Add custom markdown to link browser nodes

It gets tedious to keep linking to browser nodes. We should add custom markdown to automatically create these links.

A starting point: https://jekyllrb.com/docs/configuration/markdown/
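
The transformation itself is small. The sketch below shows it in Python purely as an illustration; an actual fix would more likely be a Jekyll plugin or Liquid tag. The function name is hypothetical, and the URL pattern is assumed from the browser link quoted in the Tanzania issue (https://browser.datacommons.org/browser/country/TZA).

```python
# Assumption: browser pages live at this root followed by the DCID.
BROWSER_ROOT = "https://browser.datacommons.org/browser/"


def browser_link(dcid):
    """Return a Markdown link from a DCID to its graph browser page."""
    return f"[`{dcid}`]({BROWSER_ROOT}{dcid})"


link = browser_link("country/TZA")
```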

API doc for new REST endpoint: /stat/set

This endpoint allows a user to get the value for a set of statistical variables, for a single date, across a set of places.

Endpoint: /stat/set (code)

Params:

places (list)
stat_vars (list)
date

Available as POST

Example: curl -X POST https://api.datacommons.org/stat/set -d '{ "places": ["geoId/06", "geoId/0649670", "country/FRA", "country/USA"], "stat_vars": ["Count_Person", "Count_CriminalActivities_CombinedCrime"], "date": "2017"}'
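
A Python equivalent of the curl call might look like the sketch below. It only prepares the request and does not send it; stat_set_request is a hypothetical helper, and the Content-Type header is an assumption, since the curl example does not set one.

```python
import json
import urllib.request


def stat_set_request(places, stat_vars, date):
    """Prepare (but do not send) a POST request for /stat/set."""
    body = json.dumps({"places": places, "stat_vars": stat_vars, "date": date})
    return urllib.request.Request(
        "https://api.datacommons.org/stat/set",
        data=body.encode("utf-8"),
        # Assumption: the curl example sets no header; JSON is declared here.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = stat_set_request(["geoId/06", "country/USA"], ["Count_Person"], "2017")
# To send it: urllib.request.urlopen(req).read()
```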

[v1-rest observations/series] note what preferred facet means

https://juliawu.github.io/datacommons-docsite/api/rest/v1/observations/series

Main byline says: Retrieve series of observations from a specific variable for an entity from the preferred facet.

but it's unclear what "preferred facet" means.

It also seems as if we should allow users to select a facet? /cc @shifucun

Add cohort definition to glossary

Broken Courseware Link on Web site

https://docs.datacommons.org/contributing/#add-data points to https://docs.datacommons.org/courseware.html instead of https://docs.datacommons.org/courseware/

Getting a 404 error when clicking on link for 'the courseware page'

[v1-rest info/place] typo

https://juliawu.github.io/datacommons-docsite/api/rest/v1/info/place

Get information on a single place (or city)

Create SPARQL tutorial

Most SPARQL tutorials "out there" are rather confusing. It would be nice to have something more in depth attached to our docs.

[v1-rest property/value] more descriptions

Some nits:

Can we add more description or examples to property?
Example 1: value should come first in the response example.
Example 2: can we add more description? In particular, highlight in or out.

Our API docs should probably elaborate how to find DCIDs of places

Here is an illustration from this drawing.

Feature request: endpoint that tells me which entities have which properties

When searching for specific properties like "scalingFactor" or "measurementMethod", it would be useful to have an endpoint that returns a list of entities possessing these specific properties.

API doc for new REST endpoint: /place/stat-vars

This endpoint allows a user to explore statistical variables which are available for a set of places. Available as both GET and POST

Endpoint: /place/stat-vars (code)
Parameters: dcids (list)
Request: https://api.datacommons.org/place/stat-vars?dcids=country/USA
Response: {"places":{"country/USA":{"statVars":["dc/zl73qp466bs28","dc/z6jy58c28k7zh"]}}}

Example: http://api.datacommons.org/place/stat-vars?dcids=geoId/06&dcids=zip/94025
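
To show how the response is meant to be read, the doc could include a short parsing sketch. The one below assumes the response shape shown in this issue ({"places": {dcid: {"statVars": [...]}}}); the function name is hypothetical.

```python
import json


def stat_vars_by_place(response_text):
    """Return {place_dcid: [stat_var_dcid, ...]} from a /place/stat-vars response.

    Assumes the response shape shown in the issue:
    {"places": {<dcid>: {"statVars": [...]}}}
    """
    payload = json.loads(response_text)
    return {
        place: info.get("statVars", [])
        for place, info in payload.get("places", {}).items()
    }


# The sample response quoted in this issue.
sample = '{"places":{"country/USA":{"statVars":["dc/zl73qp466bs28","dc/z6jy58c28k7zh"]}}}'
result = stat_vars_by_place(sample)
```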

Sheets API includes Egypt as an Asian country but not Russia

When running the command =DCPLACESIN(A1, "Country") (where A1 = asia as a DCID) and then =DCGETNAME(B1) on the output, the list of names output includes Egypt but not Russia. This strikes me as a little strange.

API doc for new REST endpoint: /place/stat/date/within-place

This endpoint allows a user to retrieve dates with data available for each statistical variable specified. The set of places to query is specified by an ancestor place, and the place type of the child places to consider (similar to our places-in API). This is helpful for building an interactive app to explore our data, e.g. https://staging.datacommons.org/tools/scatter2

Endpoint: /place/stat/date/within-place (code)

Params:

ancestor_place
place_type
stat_vars (repeated list of stat_vars accepted)

Available as both GET and POST

Request: curl -X POST https://api.datacommons.org/place/stat/date/within-place -d '{ "ancestor_place": "geoId/06", "place_type": "City", "stat_vars": ["Count_Person"]}'
Response: {"Count_Person": ["2017", "2018", ...]}

or

https://api.datacommons.org/place/stat/date/within-place?ancestor_place=geoId/06&place_type=County&stat_vars=Count_Person&stat_vars=Count_Person_Female
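
For an interactive tool such as the scatter plot, the useful follow-up step is probably finding the dates shared by all requested variables. A minimal sketch, assuming the response is a mapping of statistical variable to a list of date strings as in the response shown in this issue (common_dates is a hypothetical name):

```python
def common_dates(dates_by_stat_var):
    """Return the sorted dates for which every statistical variable has data.

    Assumes the response shape {stat_var: [date, ...]}.
    """
    date_sets = [set(dates) for dates in dates_by_stat_var.values()]
    if not date_sets:
        return []
    return sorted(set.intersection(*date_sets))


# Illustrative values only, not real API output.
sample = {
    "Count_Person": ["2016", "2017", "2018"],
    "Count_Person_Female": ["2017", "2018", "2019"],
}
shared = common_dates(sample)
```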

/cc: @shifucun

Glossary: Look into the paragraph formatting

https://docs.datacommons.org/contributing/background/glossary.html

E.g. under Statistical variable, I see some interesting line breakages with indents

Alternate name for Tanzania is just a flag

https://browser.datacommons.org/browser/country/TZA

The alternateName for Tanzania is a Tanzanian flag emoji.

Refine API documentation

The whole section of https://docs.datacommons.org/api/ is up for a refinement pass. Feel free to create new, smaller-scoped bugs.

Last python pop example is missing a bracket

http://docs.datacommons.org/api/python/population.html

Limits 0-indexed in Python triple endpoint

When I run the Python code datacommons.get_triples(['dc/c3j78rpyssdmf','dc/7hfhd2ek8ppd2'],limit=2), the result limits me to three triples for the endpoint, rather than two as I would expect:

{'dc/c3j78rpyssdmf': [('dc/zn6l0flenf3m6', 'biosampleOntology', 'dc/c3j78rpyssdmf'), ('dc/tkcknpfwxfrhf', 'biosampleOntology', 'dc/c3j78rpyssdmf'), ('dc/c3j78rpyssdmf', 'provenance', 'dc/h2lkz1')], 'dc/7hfhd2ek8ppd2': [('dc/7hfhd2ek8ppd2', 'provenance', 'dc/h2lkz1'), ('dc/4mjs95b1meh1h', 'biosampleOntology', 'dc/7hfhd2ek8ppd2'), ('dc/13xcyzcr819cb', 'biosampleOntology', 'dc/7hfhd2ek8ppd2')]}

Likewise for limit=1:

>>> datacommons.get_triples(['dc/c3j78rpyssdmf','dc/7hfhd2ek8ppd2'],limit=1)
{'dc/c3j78rpyssdmf': [('dc/c3j78rpyssdmf', 'provenance', 'dc/h2lkz1'), ('dc/zn6l0flenf3m6', 'biosampleOntology', 'dc/c3j78rpyssdmf')], 'dc/7hfhd2ek8ppd2': [('dc/7hfhd2ek8ppd2', 'provenance', 'dc/h2lkz1'), ('dc/4mjs95b1meh1h', 'biosampleOntology', 'dc/7hfhd2ek8ppd2')]}
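
Until the limit behaviour is fixed or documented, callers can cap the result on the client side. The sketch below is a workaround only and does not explain the extra triple; cap_triples is a hypothetical helper.

```python
def cap_triples(triples_by_dcid, limit):
    """Keep at most `limit` triples per DCID (client-side workaround)."""
    return {dcid: triples[:limit] for dcid, triples in triples_by_dcid.items()}


# Shortened copy of the limit=1 output quoted in this issue.
raw = {
    "dc/c3j78rpyssdmf": [
        ("dc/c3j78rpyssdmf", "provenance", "dc/h2lkz1"),
        ("dc/zn6l0flenf3m6", "biosampleOntology", "dc/c3j78rpyssdmf"),
    ],
}
capped = cap_triples(raw, 1)
```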

Update documentation to reflect PopObs -> StatVar migration

Bulk tags show up in page titles

We should perhaps add a property to the page (similar to page-order, etc), that can be used in the sidebar generator to add the tag there. Optionally could be used to add the word Bulk to the page title as well.

[v1-rest] Make a clearer note about returning the latest observation

https://juliawu.github.io/datacommons-docsite/api/rest/v1/observations/point

The information is a little hidden that the latest observation is returned if a date isn't specified, especially since the main description says "Retrieve a specific observation at a set date from a variable for an entity." There should be a quick follow there that the latest is returned (otherwise you have to hunt for it in the query params). Users could be quickly scanning through APIs, and might miss this.

Move tabs plugin in-house

As discussed in the chat, the tabs plug-in might benefit from being moved in-house.

[v1-rest] Maintain old v0 API links

Since we have published links to our v0 API (which could be bookmarked, etc), please maintain the old link structure, or add redirects.

https://docs.datacommons.org/api/rest/place_in.html -- content
https://juliawu.github.io/datacommons-docsite/api/rest/place_in.html -- 404

Suggested fix: move old files back to the existing, prod, structure (so first link above continues to work)

As a follow on when we are ready to deprecate v0, we can move to a v0 subfolder with redirects & pointers to the new versions of the API.

Documentation site is missing nav bar items

We should make sure https://docs.datacommons.org/statistical_variables.html, etc. have the nav bar items.

[v1-rest] Increase font size of "important" information

Especially since it's important, can we increase the size to at least match the rest of the body text?

Current:

Suggestion:

css for the icon:

    background: var(--dc-red-lite);
    color: white;
    padding-right: 0;
    margin-right: 0.5em;

and removed font-size: 0.8rem on the containing div

This applies to all divs with class alert (e.g. API key).

docs.datacommons.org/tutorials.html 404

From the site-wide Documentation toolbar -

docsite/_layouts/default.html

Line 53 in 02d6a46

<a class="dropdown-item" href="/tutorials.html">Tutorials</a>

https://docs.datacommons.org/tutorials.html is 404 currently.
https://docs.datacommons.org/tutorials/ seems to work.

API requires long-form decimal as parameter

When running the cURL command curl --request GET --url 'https://api.datacommons.org/stat/value?place=country%2FGMB&stat_var=Amount_EconomicActivity_ExpenditureActivity_EducationExpenditure_Government_AsFractionOf_Amount_EconomicActivity_GrossDomesticProduction_Nominal&scalingFactor=100.0000000000', shortening the scalingFactor to 100.0 returns a 404 response telling me that no stat data has been found.
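
A possible client-side workaround is to always send the scaling factor in the long form. This sketch assumes the API matches scalingFactor as an exact string with ten decimal places; the issue only shows that this form works, not why. Both helper names are hypothetical.

```python
from urllib.parse import urlencode

# The statistical variable from the request in this issue.
STAT_VAR = (
    "Amount_EconomicActivity_ExpenditureActivity_EducationExpenditure_Government"
    "_AsFractionOf_Amount_EconomicActivity_GrossDomesticProduction_Nominal"
)


def long_form(value, places=10):
    """Format a number with a fixed count of decimal places."""
    return f"{value:.{places}f}"


def stat_value_url(place, stat_var, scaling_factor):
    """Build a /stat/value URL with the scaling factor written in long form."""
    query = urlencode(
        {
            "place": place,
            "stat_var": stat_var,
            # Assumption: ten decimals, matching the request that works.
            "scalingFactor": long_form(scaling_factor),
        }
    )
    return "https://api.datacommons.org/stat/value?" + query


url = stat_value_url("country/GMB", STAT_VAR, 100.0)
```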

Source files should be in their own directory

Issue

The markdown files that are used to build the documentation are in the top-level directory of this repository. This is an issue because non-documentation .md files also get built into production.

For example, the README of this repository is served at this link, which seems unintentional. LICENSE is also affected.

Also, since the source files are mixed with repository files in the top-level directory, it is hard to scan the contents and look for the file one is interested in!

Current Solution

Normally, pages that are not intended to be deployed are added to _config.yml under exclude. (see #140 )

However, this process is prone to human error (as evidenced by the README and LICENSE being exposed). The exclude list requires manual upkeep to stay up to date.

Proposed Change

I'd like to move all source files to a dedicated directory, perhaps called src or source. The Jekyll configuration can then be changed in one line to build from that directory. This Jekyll option is described in the official documentation.

This is a straightforward code change, and therefore it should be suitable for me as a first-time contributor.

Benefits

Removes the human error and effort involved in excluding irrelevant .md files from the production deployment.
Better code organization, making it easier to understand the code structure and locate specific files.

API doc for new REST endpoint: /place/stat-vars/union

This endpoint allows a user to explore the union of statistical variables which are available for a set of places. Only available as POST

Endpoint: /place/stat-vars/union (code)
Parameters:

dcids (list)
Request: curl -X POST https://api.datacommons.org/place/stat-vars/union -d '{ "dcids": ["geoId/06", "geoId/05"]}'
Response: {"statVars":["dc/zl73qp466bs28","dc/z6jy58c28k7zh"]}
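
To make "union" concrete in the doc, a short sketch of the equivalent client-side computation could help. It assumes a per-place mapping of statistical variables as input and says nothing about the ordering the real endpoint returns; union_of_stat_vars is a hypothetical name.

```python
def union_of_stat_vars(stat_vars_by_place):
    """Return the sorted union of statistical variables across places."""
    union = set()
    for stat_vars in stat_vars_by_place.values():
        union.update(stat_vars)
    return sorted(union)


# Placeholder variable lists for illustration; not real API output.
sample = {
    "geoId/06": ["Count_Person", "Count_Person_Female"],
    "geoId/05": ["Count_Person"],
}
union = union_of_stat_vars(sample)
```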

please improve/lighten the color for the selected page in API docs

Update python docs on api-python

Want to match the docs we've been creating on the docsite.

Index page is empty

index.html only shows a "Welcome to Data Commons" title.
Instead, it should explain a little about Data Commons and also show the ToC.

Update api homepage

Update https://docs.datacommons.org/api/ with new get_stat_* related functions.

Sheets API returns incorrect DCID for the US

When I used the DCID getter tool documented in https://docs.datacommons.org/api/sheets/get_dcid.html to get the DCID for the United States, it returned the value geoId/72127. When I tried to use some functions in the Sheets API on this DCID, they consistently returned errors. Only when I manually replaced geoId/72127 with country/USA was I able to return a list of state DCIDs from the endpoint.
