Comments (7)
I remember using the typing.get_type_hints function to resolve string literal type-hints to type objects. If an annotation is given as a string, the function interprets it as a forward reference. Therefore, if the string literal includes a custom object, e.g. ArticleBody, that object has to be loaded within the global or local namespace or passed to the function as well.
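A minimal sketch of this behaviour, assuming a hypothetical Article class with string annotations and a stand-in ArticleBody class (neither is taken from the Fundus code base). The names used inside the strings are handed to get_type_hints explicitly through globalns and localns:

```python
from typing import List, Optional, get_type_hints


class ArticleBody:
    """Hypothetical stand-in for the project's ArticleBody class."""


class Article:
    """Hypothetical class whose annotations are string literals."""

    title: "Optional[str]"
    body: "ArticleBody"
    authors: "List[str]"


# Every name used inside the strings must be resolvable. Here they are
# passed explicitly instead of relying on the module namespace.
hints = get_type_hints(
    Article,
    globalns={"Optional": Optional, "List": List},
    localns={"ArticleBody": ArticleBody},
)
```

If ArticleBody were missing from both namespaces, the call would raise a NameError instead of returning the hint.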
from fundus.
No way, that would be huge! Sadly I already discarded my progress 😅
Never ever discard any progress made ;).
I've had some fun looking into the internal code of the typing.get_type_hints function. It does not do exactly what we want, as stated in the documentation:
Return a dictionary containing type hints for a function, method, module or class object.
We want to do something simpler: just resolve the string forward reference to a type object. Since the get_type_hints function internally needs to do the same at some point, we can get an idea from there. It uses the _eval_type function. This function checks for forward references of the specific wrapper type ForwardRef. Now, the ForwardRef object has an internal _evaluate function. This is exactly what we need. In the end, this function relies on the standard eval function, applied recursively. So for us, we could use the internal function of an object that is not meant to be instantiated by the "user" or use eval as long as it works.
Examples:
from datetime import datetime
from typing import Dict, ForwardRef, List, Optional

from src.parser.html_parser import ArticleBody

# Reference annotations as real type objects.
attribute_annotations: Dict[str, object] = {
    "title": Optional[str],
    "body": ArticleBody,
    "authors": List[str],
    "publishing_date": Optional[datetime],
    "topics": List[str],
}

# The same annotations as string literals (forward references).
attribute_string_annotations: Dict[str, str] = {
    "title": "Optional[str]",
    "body": "ArticleBody",
    "authors": "List[str]",
    "publishing_date": "Optional[datetime]",
    "topics": "List[str]",
}

# Variant 1: the private ForwardRef._evaluate. Its signature differs between
# Python versions; recursive_guard is passed by keyword because newer
# versions make it keyword-only.
resolved_attribute_annotations: Dict[str, object] = {
    attribute: ForwardRef(annotation)._evaluate(globals(), locals(), recursive_guard=frozenset())
    for attribute, annotation in attribute_string_annotations.items()
}
assert attribute_annotations == resolved_attribute_annotations

# Variant 2: plain eval, which needs every referenced name imported here.
resolved_attribute_annotations = {
    attribute: eval(annotation) for attribute, annotation in attribute_string_annotations.items()
}
assert attribute_annotations == resolved_attribute_annotations
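Since _evaluate is private, a similar result might also be reachable through the public API alone. The following is only a sketch under that assumption: resolve_annotation and carrier are hypothetical helper names, and the annotation string is attached to a throwaway function so that get_type_hints can resolve it.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional, get_type_hints


def resolve_annotation(annotation: str, namespace: Dict[str, Any]) -> Any:
    """Resolve one annotation string using only the public get_type_hints."""

    def carrier() -> None:
        """Throwaway function that only exists to hold the annotation."""

    # Attach the string as if it had been written as an annotation.
    carrier.__annotations__ = {"value": annotation}
    return get_type_hints(carrier, globalns=namespace)["value"]


# Names that the annotation strings are allowed to refer to.
namespace = {"Optional": Optional, "List": List, "datetime": datetime}
resolved = {
    text: resolve_annotation(text, namespace)
    for text in ("Optional[str]", "List[str]", "Optional[datetime]")
}
```

This keeps the code independent of the private signature, at the cost of a small indirection.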
Update:
I gave it a try using string comparisons between the types parsed from the table in the guidelines and the actual annotations cast to strings, but this completely fell apart for most types from the typing module, especially for Optional types, since Optional[X] itself is nothing but an alias for a Union of X and None.
Maybe someone has a better idea how to keep both things synced.
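A small illustration of why the string comparison is fragile. The documented value is a hypothetical example of what a guideline table might contain; the exact text produced by str() depends on the Python version, so no particular output is assumed here:

```python
from typing import List, Optional, Union

# Text as it might appear in a documentation table (hypothetical value).
documented = "Optional[str]"

# The string form of the real annotation is not the bare spelling: it
# carries a module prefix or a rewritten form, depending on the version.
strings_match = str(Optional[str]) == documented

# The type objects themselves compare equal regardless of spelling.
objects_match = Optional[str] == Union[str, None]

# Plain generics from typing carry a module prefix as well.
list_matches = str(List[str]) == "List[str]"
```

Comparing type objects sidesteps the spelling problem, which is what resolving the strings first would allow.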
I remember using the typing.get_type_hints function to resolve string literal type-hints to type objects.
No way, that would be huge! Sadly I already discarded my progress 😅
Another note: eval() is a dangerous and unsafe function because it may execute arbitrary code. Since we only use this for our very selected use case in the testing environment and not in the code that the actual user receives, I think this is OK.
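If the risk ever did matter, one possible mitigation is to check the string's syntax tree before evaluating it. The sketch below is an assumption, not something used in the project: is_plain_annotation and the set of allowed node types are hypothetical, it assumes Python 3.9 or newer, and it narrows what eval would see rather than making eval safe.

```python
import ast

# Node types expected in simple annotations such as "Dict[str, List[int]]".
# Calls and attribute access are deliberately not included.
_ALLOWED_NODES = (
    ast.Expression,
    ast.Name,
    ast.Load,
    ast.Subscript,
    ast.Tuple,
    ast.Constant,
)


def is_plain_annotation(source: str) -> bool:
    """Return True if the string only contains names, subscripts and constants."""
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return all(isinstance(node, _ALLOWED_NODES) for node in ast.walk(tree))


checks = {
    text: is_plain_annotation(text)
    for text in ("Optional[str]", "Dict[str, List[int]]", "__import__('os')")
}
```

A string that passes this check could then be handed to eval with an explicit namespace.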
I ended up with eval() but ultimately abandoned it. Not because of security concerns (imo it doesn't matter, anyone can execute code through the CI as long as they open a PR) but because I didn't want to maintain all the unused imports. That's why I switched to comparing the strings in the first place. Maybe living with the imports is the way to go.
That's a trade-off, I guess. So far it would only be the article body. My guess for the future is also that we will have way more built-in types than custom objects that need an import. Even with imports, there is the gain of not having to maintain the actual annotation guideline list. Also, one would immediately be alerted if an import is missing. I don't have a strong opinion, just listing some thoughts.
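One way the import burden could be reduced is to collect the names in a single namespace and pass it to eval explicitly. This is only a sketch: ANNOTATION_NAMESPACE and resolve_annotation are hypothetical names, and ArticleBody is a stand-in class rather than the real import.

```python
import typing
from datetime import datetime


class ArticleBody:
    """Hypothetical stand-in for the project's ArticleBody class."""


# All public names from typing plus the few extra types the annotations use.
ANNOTATION_NAMESPACE = {
    **{name: value for name, value in vars(typing).items() if not name.startswith("_")},
    "datetime": datetime,
    "ArticleBody": ArticleBody,
}


def resolve_annotation(annotation: str) -> object:
    """Evaluate an annotation string against the shared namespace only."""
    # A copy is passed because eval adds __builtins__ to the globals dict.
    return eval(annotation, dict(ANNOTATION_NAMESPACE))


try:
    resolve_annotation("UnknownType")
    missing_name_detected = False
except NameError:
    # A name absent from the namespace surfaces immediately.
    missing_name_detected = True
```

Only custom objects need a manual entry here, and a forgotten one fails loudly with a NameError.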