Comments (14)
There's one problem here: it looks like usedforsecurity=False was added in Python 3.9, but Datasette still supports Python 3.8 until at least its EOL in October 2024: https://devguide.python.org/versions/
I'm happy to add it with a Python version check or similar, but would the FIPS scanning system identify something like the following?
import hashlib
import sys

def non_security_md5(text):
    try:
        return hashlib.md5(text.encode("utf8"), usedforsecurity=False).hexdigest()
    except TypeError:
        # usedforsecurity is not supported
        return hashlib.md5(text.encode("utf8")).hexdigest()
If that still trips the security filter I'm not sure what to do here - I want to be able to support Python 3.8.
from datasette.
Since this is purely a cosmetic thing and we're pre-Datasette-1.0 I'd be OK swapping MD5 for SHA256 here to please the filter.
Code in question:
datasette/datasette/database.py
Lines 73 to 77 in 5d79974
datasette/datasette/utils/__init__.py
Lines 705 to 725 in 5d79974
The CSS bit is actually a bigger problem, because changing that will change the CSS class name (and the name used for custom templates) for existing Datasette instances - which could result in customized templates or CSS breaking in ways that people might not easily notice.
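To make the compatibility concern concrete, here is a minimal sketch of how the six-character suffix changes if MD5 is swapped for SHA256. The helper names `md5_suffix` and `sha256_suffix` and the sample table name are illustrative assumptions, not Datasette code, and the `usedforsecurity=False` argument assumes Python 3.9 or later.

```python
import hashlib


def md5_suffix(s):
    # Hypothetical helper: first six hex characters of the MD5 digest.
    # usedforsecurity=False requires Python 3.9+.
    return hashlib.md5(s.encode("utf8"), usedforsecurity=False).hexdigest()[:6]


def sha256_suffix(s):
    # Hypothetical helper: first six hex characters of the SHA256 digest.
    return hashlib.sha256(s.encode("utf8")).hexdigest()[:6]


# Illustrative name only; any string that needs a hashed suffix would do.
name = "table with spaces"
old_suffix = md5_suffix(name)
new_suffix = sha256_suffix(name)
print(old_suffix, new_suffix, old_suffix == new_suffix)
```

The two suffixes come from unrelated digests, so in practice they will differ, and any CSS class or template name built from the old suffix would no longer match.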
Here's the origin of that usedforsecurity= flag, back in 2010:
Oh wait... is the issue here that in a FIPS-enabled system simply calling hashlib.md5(...) triggers a runtime error?
If so, the fix could well be to use usedforsecurity=False in Python 3.9+, avoid that parameter in Python 3.8 and document that Datasette on Python 3.8 is incompatible with FIPS, which I imagine is completely fine.
I had originally assumed that this was about a FIPS scanner that statically analyzes Python code looking for insecure uses of hashlib, but I now see that a runtime error is a more likely mechanism here.
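A minimal sketch of that version-dependent approach, reusing the non_security_md5 name from the earlier comment. This is one possible shape for the helper under those assumptions, not necessarily the code that ends up in Datasette.

```python
import hashlib
import sys


def non_security_md5(text):
    data = text.encode("utf8")
    if sys.version_info >= (3, 9):
        # Python 3.9+ accepts the flag, which lets FIPS builds allow the call.
        return hashlib.md5(data, usedforsecurity=False).hexdigest()
    # Python 3.8 has no such parameter, so this path still fails under FIPS.
    return hashlib.md5(data).hexdigest()
```

Compared with catching TypeError, the explicit version check makes it obvious which interpreters get the FIPS-compatible path.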
@simonw exactly: calling hashlib.md5(...) triggers a runtime error, and adding the flag apparently disables the check and prevents that error. Adding the flag for 3.9+ and documenting the limitation for 3.8 seems quite reasonable.
Not sure if you can do anything about the pint issue, but just as a heads-up, that also exhibits a similar runtime error in its use of hashlib.blake2b, which overall prevents datasette from running on FIPS systems (we're patching around that for now).
OK, I figured out how to replicate this problem using Docker.
I'm using this image: https://hub.docker.com/r/cyberark/ubuntu-ruby-fips
The hardest part was finding an actively maintained FIPS Docker image! I eventually found it via this search for most recently updated images on Docker Hub matching "fips": https://hub.docker.com/search?q=fips&sort=updated_at&order=desc
So I can start the container with:
docker run -it --rm cyberark/ubuntu-ruby-fips /bin/bash
Then I can install stuff I need with:
apt-get update && apt-get install -y python3 git python3.10-venv
Then:
cd /tmp
git clone https://github.com/simonw/datasette
cd datasette
python3 -m venv venv
source venv/bin/activate
pip install -e '.[test]'
pytest -n auto
This fails a bunch of tests thanks to the FIPS issue - errors like this:
  File "/tmp/datasette/datasette/database.py", line 77, in color
    return hashlib.md5(self.name.encode("utf8")).hexdigest()[:6]
ValueError: [digital envelope routines] unsupported
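As a quick way to check whether a given environment behaves like this container, a small probe can attempt a plain MD5 call and catch the ValueError shown above. The function name `md5_allowed` is a hypothetical one chosen for illustration, and the sketch assumes the failure always surfaces as ValueError, as it did here.

```python
import hashlib


def md5_allowed():
    """Return True if a plain hashlib.md5() call succeeds in this environment."""
    try:
        hashlib.md5(b"probe")
    except ValueError:
        # Raised in the FIPS container when MD5 is requested without
        # usedforsecurity=False.
        return False
    return True


print("plain md5 allowed:", md5_allowed())
```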
Applying this patch causes the test suite to pass in that FIPS Docker container:
diff --git a/datasette/database.py b/datasette/database.py
index becb552c..94225c47 100644
--- a/datasette/database.py
+++ b/datasette/database.py
@@ -74,7 +74,7 @@ class Database:
     def color(self):
         if self.hash:
             return self.hash[:6]
-        return hashlib.md5(self.name.encode("utf8")).hexdigest()[:6]
+        return hashlib.md5(self.name.encode("utf8"), usedforsecurity=False).hexdigest()[:6]
 
     def suggest_name(self):
         if self.path:
diff --git a/datasette/utils/__init__.py b/datasette/utils/__init__.py
index f2cd7eb0..d8d187ea 100644
--- a/datasette/utils/__init__.py
+++ b/datasette/utils/__init__.py
@@ -713,7 +713,7 @@ def to_css_class(s):
     """
     if css_class_re.match(s):
         return s
-    md5_suffix = hashlib.md5(s.encode("utf8")).hexdigest()[:6]
+    md5_suffix = hashlib.md5(s.encode("utf8"), usedforsecurity=False).hexdigest()[:6]
     # Strip leading _, -
     s = s.lstrip("_").lstrip("-")
     # Replace any whitespace with hyphens
@darugar do you think it's worth releasing a 0.64 with this fix, or can I leave it for a Datasette 1.0 alpha and then Datasette 1.0?
"Not sure if you can do anything about the pint issue, but just as a heads-up that also exhibits a similar runtime error in their use of hashlib.blake2b, which overall prevents datasette from running in FIPS systems (we're patching around that for now)."
How are you patching that?
I'm considering moving Pint out of Datasette core and trying to get it to work as a plugin instead before 1.0 - this may be just the push I need to make that decision.
@simonw yes we're patching Pint. Since we need to patch that anyway doing the patch for datasette is not much more effort, but of course it'd be nicer if datasette didn't need a patch :-) Completely up to you depending on timing for 1.0 alpha. Ideal scenario for us would be a datasette release with Pint as a plugin (we don't use it) and the md5 issue fixed in the release, but not a huge deal.
Wrote this up as a TIL: https://til.simonwillison.net/python/md5-fips
I filed an issue with Pint here:
And a PR against Pint too, which was easy because they've already dropped support for Python 3.8: