Comments (9)
Note: see #132 (comment), we could use cdd to auto-detect style and parse accordingly.
from griffe.
Instead of inferring the style of the whole docstring at once, it could be easier to infer the style of each section independently. The rationale is that the style of a section is usually clear, while the style of a whole docstring can be ambiguous. In fact, I've seen mixed-style docstrings.
This approach could actually simplify the way griffe parses docstrings as a lot of the parsing is common to all styles. It would also become easier to add parsing rules for custom sections.
For the inference itself, instead of choosing a single style in mkdocs.yml, you could rank the styles in order of preference and try each of them until one matches.
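To make the ranking idea concrete, here is a minimal sketch, not griffe's actual code. The `SECTION_MATCHERS` table, its regexes, and `infer_section_style` are all hypothetical names made up for illustration: each section is tested against the styles in the user's order of preference, and the first match wins.

```python
import re

# Hypothetical, deliberately simple matchers: one regex per style.
# Real detection rules would need to be much more careful.
SECTION_MATCHERS = {
    # "Title:" followed by an indented line.
    "google": re.compile(r"^[A-Z][A-Za-z ]*:\n[ \t]+\S", re.MULTILINE),
    # "Title" underlined with dashes.
    "numpy": re.compile(r"^[A-Z][A-Za-z ]*\n-{3,}[ \t]*$", re.MULTILINE),
    # Field lists such as ":param x:" or ":returns:".
    "sphinx": re.compile(r"^:(param|type|returns?|rtype|raises?)\b", re.MULTILINE),
}


def infer_section_style(section_text, preference):
    """Return the first style in `preference` that matches, or None."""
    for style in preference:
        if SECTION_MATCHERS[style].search(section_text):
            return style
    return None


# The preference order would come from the user's configuration.
print(infer_section_style("Args:\n    x: The value.", ["numpy", "google", "sphinx"]))
```

The order only matters when a section matches several styles, in which case the preferred one is picked.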
from griffe.
Allow me to move this over to griffe 🙂
EDIT: oh, it seems the issue cannot go out of the organization. Then I'll move griffe into it.
from griffe.
So, nice idea! Currently, each object can have its own docstring style attached. But you listed some cases that would still not be supported by this. I'll think about it 🙂
from griffe.
This approach could actually simplify the way griffe parses docstrings as a lot of the parsing is common to all styles. It would also become easier to add parsing rules for custom sections.
Could you expand on that? It's true that the Google and Numpy parsers have a lot in common, though subtle differences remain. What do you mean by "parsing rules for custom sections"?
from griffe.
This approach could actually simplify the way griffe parses docstrings as a lot of the parsing is common to all styles. It would also become easier to add parsing rules for custom sections.
I was thinking of having a docstring parse function common to all styles, which calls the "section" parsers implemented for each style.
Thinking back on that, it would work only if the spacing policy between sections is the same across all styles. But for similar enough styles you could "share" that docstring-level parse function and arbitrarily change the parsing style for each section.
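Roughly, I picture something like the sketch below. It is only an illustration under a strong assumption: sections are separated by blank lines in every style, which is exactly the "spacing policy" caveat above. All names (`SECTION_PARSERS`, `parse_docstring`, and the section parsers) are hypothetical and not part of griffe.

```python
import re
import textwrap


def parse_google_section(block):
    # "Title:" on the first line, indented body below.
    title, _, body = block.partition(":\n")
    return {"style": "google", "title": title.strip(), "body": textwrap.dedent(body).strip()}


def parse_numpy_section(block):
    # "Title", then a dashed underline, then the body.
    title, _, rest = block.partition("\n")
    _underline, _, body = rest.partition("\n")
    return {"style": "numpy", "title": title.strip(), "body": body.strip()}


def parse_text(block):
    # Fallback for blocks that match no known section style.
    return {"style": None, "title": None, "body": block.strip()}


# Each style contributes a "does this block look like mine?" pattern and a parser.
SECTION_PARSERS = [
    (re.compile(r"^\w[\w ]*:\n[ \t]+\S"), parse_google_section),
    (re.compile(r"^\w[\w ]*\n-{3,}[ \t]*\n"), parse_numpy_section),
]


def parse_docstring(text):
    """Shared docstring-level loop: split into blocks, dispatch each block."""
    sections = []
    for block in re.split(r"\n\s*\n", text.strip()):
        for pattern, parser in SECTION_PARSERS:
            if pattern.match(block):
                sections.append(parser(block))
                break
        else:
            sections.append(parse_text(block))
    return sections


mixed = "Summary line.\n\nArgs:\n    x: The value.\n\nReturns\n-------\nint\n    The result.\n"
for section in parse_docstring(mixed):
    print(section["style"], section["title"])
```

The docstring-level loop knows nothing about any style. A Numpy section containing a blank line would be split in two by this sketch, which is where the spacing assumption breaks down.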
What do you mean by "parsing rules for custom sections"?
If I understand correctly, the available sections are limited to the DocstringSection classes currently implemented (Parameters, Returns, ...), which are mapped to section titles. My idea was to allow parameters-like, returns-like, examples-like, note-like, ... section parsing with different section titles by letting the user add "section title -> parser" rules. sphinx.ext.napoleon allows this.
Also, maybe some of the current sections could be merged? I am thinking of Returns and Yields.
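As a rough sketch of what I mean by "section title -> parser" rules (hypothetical names throughout, not griffe's API; it is similar in spirit to napoleon's `napoleon_custom_sections` setting): default rules map known titles to parsers, and the user can add their own titles on top.

```python
def parse_parameters_like(lines):
    """Parse 'name: description' items, one per line."""
    items = []
    for line in lines:
        name, _, description = line.partition(":")
        items.append((name.strip(), description.strip()))
    return {"kind": "parameters", "items": items}


def parse_note_like(lines):
    """Join the lines into a single free-text note."""
    return {"kind": "note", "text": " ".join(line.strip() for line in lines)}


# Built-in rules: section title (lowercased) -> parser.
DEFAULT_RULES = {
    "args": parse_parameters_like,
    "parameters": parse_parameters_like,
    "note": parse_note_like,
}


def parse_section(title, lines, custom_rules=None):
    """Pick a parser from the title; user rules override the defaults."""
    rules = {**DEFAULT_RULES, **(custom_rules or {})}
    # Unknown titles fall back to note-like parsing in this sketch.
    parser = rules.get(title.strip().lower(), parse_note_like)
    return parser(lines)


# A user-defined title reusing the parameters-like parser.
user_rules = {"keyword arguments": parse_parameters_like}
print(parse_section("Keyword Arguments", ["timeout: Seconds to wait."], user_rules))
```

Without `user_rules`, the same section would fall back to note-like parsing and keep its text unstructured.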
from griffe.
I'm not sure supporting multiple-style docstrings is worth the effort. Even if you're switching from one style to another in a big codebase, docstrings themselves are usually short enough that you can stick to a single style, and not just switch one or two sections in the docstring. And if it's not about switching styles, then I'd consider it a very niche use-case, again not worth the effort.
Your argument about factorizing the code still stands though. It's not the prettiest, DRYest code I've written, and it can surely be refactored into something more elegant and reusable 🙂
allow parameters-like, returns-like, examples-like
Custom titles for sections are already supported! There's an example here in Griffe's own docs (unfold the source code). I don't think more customization is welcome in docstring parsers. If you deviate from the spec (Google, Numpy, Sphinx, etc.), then all the tooling around docstrings stops working (linters, IDEs, ...). Just my opinion of course, happy to read counter-examples!
from griffe.
Support for docstring styles is not so great in many tools (e.g. PyCharm still doesn't support Google style very well). Obviously no one would recommend intentionally using multiple styles in the same docstring, yet it does happen sometimes.
It makes sense for linters and the like to be more strict about doc-string styles, but people want their doc generator to just magically do the "right thing" if at all possible.
I think that it would be fantastic if inference was just at the docstring level, but it would be even nicer if it could be done per section.
However, maybe anyone who wants this should provide some real-world examples of docstrings using multiple styles.
from griffe.
Here's a very rough/ugly script to count the styles used in package docstrings:
```python
import os
import re
from contextlib import suppress

from griffe.exceptions import AliasResolutionError
from griffe.loader import GriffeLoader

# build manually or dynamically, from PDM/Poetry/pip cache for example
packages_paths = []

# One global counter per detected style.
numpy_docstrings = 0
google_docstrings = 0
sphinx_docstrings = 0
unknown_docstrings = 0
markdown_docstrings = 0
rst_docstrings = 0

md_inline_code = re.compile(r"`[^`]+`")
rst_inline_code = re.compile(r"``[^`]+``")


def prompt_docstring(docstring):
    """Classify a docstring by the first markup found and bump its counter."""
    global numpy_docstrings
    global google_docstrings
    global sphinx_docstrings
    global unknown_docstrings
    global markdown_docstrings
    global rst_docstrings
    if docstring:
        # Sphinx roles and field lists.
        for markup in (
            ":meth:",
            ":func:",
            ":attr:",
            ":mod:",
            ":class:",
            ":raises:",
            ":raise ",
            ":note:",
            ":param ",
            ":return:",
            ":returns:",
            ":rtype:",
        ):
            if markup in docstring.value:
                sphinx_docstrings += 1
                return
        # Google-style section headers.
        for markup in (
            "Args:\n ",
            "Arguments:\n ",
            "Attributes:\n ",
            "Raises:\n ",
            "Returns:\n ",
            "Yields:\n ",
            "Example:\n ",
            "Examples:\n ",
        ):
            if markup in docstring.value:
                google_docstrings += 1
                return
        # Numpy-style underlined section headers.
        for markup in (
            "Args\n----",
            "Arguments\n---------",
            "Attributes\n----------",
            "Parameters\n----------",
            "Raises\n------",
            "Returns\n-------",
            "Yields\n------",
            "Methods\n-------",
        ):
            if markup in docstring.value:
                numpy_docstrings += 1
                return
        # Markdown fenced code blocks.
        for markup in (
            "```\n",
        ):
            if markup in docstring.value:
                markdown_docstrings += 1
                return
        # reStructuredText directives and literal blocks.
        for markup in (
            ".. autofunction::",
            ".. code::",
            ".. todo::",
            ".. note::",
            ".. warning::",
            ".. versionchanged::",
            ".. versionadded::",
            "::\n\n ",
        ):
            if markup in docstring.value:
                rst_docstrings += 1
                return
        # Inline code: check double backticks (RST) before single ones (Markdown).
        if rst_inline_code.search(docstring.value):
            rst_docstrings += 1
            return
        if md_inline_code.search(docstring.value):
            markdown_docstrings += 1
            return
        unknown_docstrings += 1


def iter_docstrings(obj):
    """Recursively classify the docstrings of an object and its members."""
    try:
        prompt_docstring(obj.docstring)
    except AliasResolutionError:
        return
    for member in obj.members.values():
        if not member.is_alias:
            iter_docstrings(member)


if __name__ == "__main__":
    loader = GriffeLoader(allow_inspection=False)
    for index, package_path in enumerate(packages_paths, 1):
        print(f"\r{index}/{len(packages_paths)}", end="")
        # Skip packages that fail to load or to be traversed.
        with suppress(Exception):
            package = loader.load_module(package_path)
            iter_docstrings(package)
    print()  # end the progress line
    print("Google", google_docstrings)
    print("Numpy", numpy_docstrings)
    print("Sphinx", sphinx_docstrings)
    print("Markdown", markdown_docstrings)
    print("RST", rst_docstrings)
    print("Unknown", unknown_docstrings)
```
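It does not search for multi-style docstrings: it stops at the first markup it finds. To detect them, the substring searches should probably be replaced with regexes, and every regex should be tested against each docstring instead of returning on the first match.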
It outputs something like this:
Google 2948
Numpy 150
Sphinx 8665
Markdown 5544
RST 4883
Unknown 58256
Note that most of the packages installed on my machine are not data-science libraries, which is probably why there are so few Numpy-style docstrings. Unknown docstrings are docstrings in which the script didn't detect any specific markup. They could be either Markdown or reStructuredText. Also, the numbers here should really not be taken seriously, as multiple versions of the same packages were scanned.
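For the multi-style search, a minimal sketch could look like the following. It assumes one regex per style, all tested against every docstring. The patterns and the names `STYLE_PATTERNS`, `detect_styles` and `is_multi_style` are hypothetical and only cover section markup (Sphinx inline roles such as `:class:` are ignored on purpose).

```python
import re

# Hypothetical patterns: one per style, limited to section-level markup.
STYLE_PATTERNS = {
    "google": re.compile(
        r"^[ \t]*(Args|Arguments|Attributes|Raises|Returns|Yields|Examples?):[ \t]*\n[ \t]+\S",
        re.MULTILINE,
    ),
    "numpy": re.compile(
        r"^[ \t]*(Args|Arguments|Attributes|Parameters|Raises|Returns|Yields|Methods)[ \t]*\n[ \t]*-{3,}[ \t]*$",
        re.MULTILINE,
    ),
    "sphinx": re.compile(r"^[ \t]*:(param|type|raises?|returns?|rtype)\b", re.MULTILINE),
}


def detect_styles(text):
    """Return the set of all styles whose pattern matches the docstring text."""
    return {style for style, pattern in STYLE_PATTERNS.items() if pattern.search(text)}


def is_multi_style(text):
    """A docstring is multi-style if more than one style is detected."""
    return len(detect_styles(text)) > 1


mixed = "Do something.\n\nArgs:\n    x: The value.\n\nReturns\n-------\nint\n    The result.\n"
print(sorted(detect_styles(mixed)), is_multi_style(mixed))
```

Unlike the counting script, nothing returns early here, so a docstring mixing a Google "Args:" section with a Numpy "Returns" section is reported under both styles.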
from griffe.