sernst / cauldron
Interactive computing for complex data processing, modeling and analysis in Python 3
Home Page: http://www.unnotebook.com
License: MIT License
When something is writing a progress bar out to the terminal an error message comes up from cauldron.
cauldron_1 | [SYNCED]: File chunk 0 /tmp/cd-remote-project-p10dqizn/plan-update-feature/S05-shap.py
cauldron_1 |
cauldron_1 | === RUNNING ===
cauldron_1 | Traceback (most recent call last):
cauldron_1 | File "/cauldron_local/cauldron/cli/threads.py", line 62, in run_command
cauldron_1 | **self.kwargs
cauldron_1 | File "/cauldron_local/cauldron/cli/commands/run/__init__.py", line 225, in execute
cauldron_1 | skip_library_reload=skip_library_reload
cauldron_1 | File "/cauldron_local/cauldron/cli/commands/run/__init__.py", line 303, in run_local
cauldron_1 | skips=steps_run + []
cauldron_1 | File "/cauldron_local/cauldron/runner/__init__.py", line 172, in section
cauldron_1 | if not source.run_step(response, project, ps, force=force):
cauldron_1 | File "/cauldron_local/cauldron/runner/source.py", line 169, in run_step
cauldron_1 | ).console_raw(result['message'])
cauldron_1 | KeyError: 'message'
100%|██████████| 200/200 [06:43<00:00, 2.02s/it]
This specific trace originated while getting shap_values from the shap Python library using a KernelExplainer.
The home view in the GUI closes projects out of sync, which causes fast opening of projects to fail.
Joining the values of multiple indexes with ', ' and calling the column index conceals some potentially important information and makes it harder to copy and paste from the output to Excel (props for the fact that this is possible at all). Would it be possible, and desirable for others, for each of the MultiIndex levels to be written to its own column with its respective name?
import cauldron as cd
import pandas as pd
import numpy as np
rng = pd.date_range('1/1/2017', periods=8760, freq='H')
df = pd.DataFrame(
    {'value': np.random.rand(8760)},
    index=rng
)
df = df.groupby([df.index.month, df.index.day]).agg(['mean', 'std'])
df.index.set_names(['month', 'day'], inplace=True)
df.to_excel('foo.xlsx') # I like this formatting
print(df) # output unpleasant
cd.display.table(df, include_index=True) # Conceals too much information
cd.display.table()
Also it feels like I've been complaining quite a lot. This was absolutely not my intention. I really like the idea behind cauldron and think the work already put in is amazing. Just trying to make it even better ;)
I recently came across your Cauldron project, which looks to be a good alternative to Jupyter notebook. However, I did notice that when sharing a notebook the filename extension is 'CDF'. This is a very common filename extension in some scientific data circles (https://cdf.gsfc.nasa.gov). So it becomes a bit of a nuisance if one wants to work with cauldron files and CDF scientific data files on the same platform. Is there any chance the cauldron filename extension could be something else (how about .cau)?
Improve the file sync handling in the cauldron.cli.sync subpackage.
When I try to import a file in the project folder (case in point, a file of data arranged in dicts, pandas Series and lists), it throws a ModuleNotFoundError. When doing the same in a normal python or IPython console or Jupyter kernel, it works fine.
Am I doing anything wrong, or is this a bug?
Cauldron with Anaconda, Python 3.6 on Ubuntu 17.04.
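One possible cause, offered only as an assumption since the report does not confirm it, is that the project folder is not on `sys.path` when the step runs. The sketch below builds a throwaway folder containing a hypothetical `cd_example_data.py` module and shows that the import succeeds once that folder is added to `sys.path`. The helper name `import_from_folder` is made up for illustration and is not part of Cauldron.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for a project folder that holds a data module.
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, 'cd_example_data.py'), 'w') as handle:
    handle.write("VALUES = {'a': [1, 2, 3]}\n")


def import_from_folder(folder, module_name):
    """Import module_name after making sure folder is on sys.path."""
    if folder not in sys.path:
        sys.path.insert(0, folder)
    # Make the import system notice files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


example_data = import_from_folder(project_dir, 'cd_example_data')
print(example_data.VALUES)
```

If the import only works after a step like this, the search path is the likely culprit rather than the data file itself.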
There's a typo in the PR template in the word reproduce in the Steps to Reproduce section:
https://github.com/sernst/cauldron/blob/master/PULL_REQUEST_TEMPLATE.md#steps-to-test-or-reproduce
CC: @DanMayhew
There are multiple issues installing on Linux Mint 18.3, which is basically an Ubuntu 16.04 ("Xenial") system.
By following the instructions on the home page:
$ sudo apt-get install python3-pip
$ sudo pip3 install --upgrade cauldron-notebook
This installs /usr/local/bin/cauldron, which gives problems later...
$ sudo updatedb && locate -i "bin/cauldron"
/usr/local/bin/cauldron
Apart from that, Cauldron's Python package is correctly installed, together with all the other pip packages:
$ sudo updatedb && locate -i "cauldron" | grep __init__.py | head -n1
/usr/local/lib/python3.5/dist-packages/cauldron/__init__.py
$ sudo dpkg -i cauldron_3.221.532_amd64.deb
# Install finishes without errors
This installs /usr/bin/cauldron:
$ sudo updatedb && locate -i "bin/cauldron"
/usr/bin/cauldron
/usr/local/bin/cauldron
Calling cauldron will run /usr/local/bin/cauldron due to how binaries are searched at execution time (user local binaries are preferred over system default ones). Also, it breaks:
$ cauldron
Traceback (most recent call last):
File "/usr/local/bin/cauldron", line 7, in <module>
from cauldron.invoke import run
File "/usr/local/lib/python3.5/dist-packages/cauldron/__init__.py", line 1, in <module>
from cauldron import session as _session
File "/usr/local/lib/python3.5/dist-packages/cauldron/session/__init__.py", line 6, in <module>
from cauldron.session.exposed import ExposedProject
File "/usr/local/lib/python3.5/dist-packages/cauldron/session/exposed.py", line 9, in <module>
from cauldron.render import stack as render_stack
File "/usr/local/lib/python3.5/dist-packages/cauldron/render/stack.py", line 5, in <module>
from cauldron.session import projects
File "/usr/local/lib/python3.5/dist-packages/cauldron/session/projects/__init__.py", line 1, in <module>
from cauldron.session.projects.project import DEFAULT_SCHEME # noqa
File "/usr/local/lib/python3.5/dist-packages/cauldron/session/projects/project.py", line 14, in <module>
from cauldron.session.projects import steps
File "/usr/local/lib/python3.5/dist-packages/cauldron/session/projects/steps.py", line 12, in <module>
from cauldron.session.report import Report
File "/usr/local/lib/python3.5/dist-packages/cauldron/session/report.py", line 11, in <module>
class Report(object):
File "/usr/local/lib/python3.5/dist-packages/cauldron/session/report.py", line 45, in Report
def project(self) -> typing.Union['projects.Project', None]:
File "/usr/lib/python3.5/typing.py", line 552, in __getitem__
dict(self.__dict__), parameters, _root=True)
File "/usr/lib/python3.5/typing.py", line 512, in __new__
for t2 in all_params - {t1} if not isinstance(t2, TypeVar)):
File "/usr/lib/python3.5/typing.py", line 512, in <genexpr>
for t2 in all_params - {t1} if not isinstance(t2, TypeVar)):
File "/usr/lib/python3.5/typing.py", line 190, in __subclasscheck__
self._eval_type(globalns, localns)
File "/usr/lib/python3.5/typing.py", line 177, in _eval_type
eval(self.__forward_code__, globalns, localns),
File "<string>", line 1, in <module>
AttributeError: module 'cauldron.session.projects' has no attribute 'Project'
Giving the full path does actually run the correct program, but it doesn't find the cauldron Python package:
$ /usr/bin/cauldron
First screen:
To Start Cauldron...
Enter your Python 3.5+ executable path
or a remote Cauldron connection URL
> /usr/bin/python3
Second screen:
Missing Cauldron
Unable to import cauldron python package. Has it been installed?
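The search-order behaviour described in this report can be reproduced in isolation. The sketch below is for POSIX systems and uses two temporary directories as hypothetical stand-ins for /usr/local/bin and /usr/bin; `shutil.which` returns the match from whichever directory comes first on the search path.

```python
import os
import shutil
import stat
import tempfile


def make_fake_executable(directory, name):
    """Create a small executable file and return its path."""
    path = os.path.join(directory, name)
    with open(path, 'w') as handle:
        handle.write('#!/bin/sh\n')
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


# Hypothetical stand-ins for /usr/local/bin and /usr/bin.
local_bin = tempfile.mkdtemp(prefix='local-bin-')
system_bin = tempfile.mkdtemp(prefix='system-bin-')
local_copy = make_fake_executable(local_bin, 'cauldron')
system_copy = make_fake_executable(system_bin, 'cauldron')

# The first directory on the search path that has a match wins.
search_path = os.pathsep.join([local_bin, system_bin])
resolved = shutil.which('cauldron', path=search_path)
print(resolved == local_copy)
```

Running `shutil.which('cauldron')` without the `path` argument on the affected machine would show which of the two installed launchers the shell actually picks.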
To handle non-containerized environments where gunicorn+nginx is not an option,
it's time to move from the Flask debug server to a waitress wsgi server that
will more robustly handle serving the kernel and UI.
The button only gets enabled after typing on the line. Or maybe I'd selected something wrong first and fixed it, but it didn't realize that until I typed on the line.
Also, it's not clear whether you should pick the "python.app" or not.
Well, best thing would be to be able to pick from a list that defaults to the interpreter you actually use, like jupyter notebook does. Maybe just use its list?
Why do you recommend using an @ instead of a \ for displaying LaTeX equations? In my opinion this reduces readability if you're already used to LaTeX and adds an extra step when copying and pasting equations from another source or equation generator. What already works is using a raw string. At the least, I suggest adding this information to the documentation:
cd.display.latex(r'\alpha @delta')
btw, I'm running Windows 10 and Python 3.5.3. I know of some issues between Linux and Windows when it comes to backslashes, but I guess this is only true for everything touching the filesystem (?)
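Backslash handling inside string literals is done by Python itself and is the same on every operating system. The short sketch below shows why a raw string is needed: in a regular literal, `\a` is the escape sequence for the ASCII bell character, so the LaTeX command is silently altered. It uses plain Python only and makes no Cauldron calls.

```python
# '\a' is the escape sequence for the ASCII bell character, so a regular
# string literal silently corrupts the LaTeX command.
regular = '\alpha'
raw = r'\alpha'
escaped = '\\alpha'

print(len(regular))    # 5: one bell character followed by 'lpha'
print(len(raw))        # 6: a backslash followed by 'alpha'
print(raw == escaped)  # the raw and double-backslash forms are identical
```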
Right now when not auto-reloading libraries, the library directories do not always get reliably added during notebook initialization.
I guess you didn't implement this function yet. Sometimes I'm not aware whether an operation on a dataframe returns a dataframe or a table. When I try to display the result using the cd.display.table function, it yields the AttributeError 'Series' object has no attribute 'columns'.
Since a Series is basically a DataFrame with one column, it would make some sense to use the same function for displaying them. And when I just want to see the output of an operation, I usually don't think in advance about what type it is, so I would also like it to be the same function. However, having a separate function feels more right because it forces you to be explicit.
Some example code:
import cauldron as cd
import pandas as pd
series = pd.Series(range(1000))
print(type(series))
cd.display.table(pd.DataFrame(series)) # a possible hack would be to convert series' to dataframes first
print(series) # this works but isn't as nicely formatted
cd.display.table(series) # this fails
Hey, first thanks for cauldron, seems quite interesting!
One thing I'm missing, which would seem logical to include, is some kind of watchdog/watchgod integration. The use case is that I simply do not want to leave my editor.
I have seen that there is the possibility to run a step from the cauldron ui, but it would be great to maybe have some --reload option when starting the ui, which then just uses watchgod underneath to reload when it detects a filesystem write.
What do you think? Maybe I could do a pull request even, don't know yet, haven't looked inside the source code of cauldron yet ;)
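As a rough illustration of the reload idea, here is a minimal polling sketch that uses only the standard library in place of watchgod. The function names `snapshot`, `changed_files` and `watch` are made up for this example, and nothing here reflects how Cauldron is implemented. A real integration would pass a callback that re-runs the affected steps.

```python
import os
import tempfile
import time


def snapshot(directory):
    """Map each .py file under directory to its last-modified time."""
    result = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith('.py'):
                path = os.path.join(root, name)
                result[path] = os.stat(path).st_mtime_ns
    return result


def changed_files(before, after):
    """Return paths that were added, removed, or modified."""
    return sorted(
        path
        for path in set(before) | set(after)
        if before.get(path) != after.get(path)
    )


def watch(directory, on_change, interval=1.0, max_polls=None):
    """Poll directory and call on_change(paths) whenever files change."""
    previous = snapshot(directory)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = snapshot(directory)
        paths = changed_files(previous, current)
        if paths:
            on_change(paths)
        previous = current
        polls += 1


# Demo on a throwaway directory: one new step file is detected.
demo_dir = tempfile.mkdtemp()
before = snapshot(demo_dir)
step_path = os.path.join(demo_dir, 'S01-example.py')
with open(step_path, 'w') as handle:
    handle.write('print("hello")\n')
after = snapshot(demo_dir)
print(changed_files(before, after) == [step_path])
```

Polling is simpler than OS-level file notifications but reacts only once per interval, which is usually acceptable for an edit-and-rerun workflow.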
Seeing import errors because contextfilter has been removed from jinja2. It's been replaced by jinja2.pass_context.
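One way to stay compatible with both older and newer Jinja2 releases is to look the decorator up by name and fall back to the old one. The sketch below is an illustration only: `resolve_context_decorator` is a hypothetical helper, and the two `SimpleNamespace` objects stand in for old and new versions of the jinja2 module so the example runs without Jinja2 installed.

```python
import types


def resolve_context_decorator(jinja2_module):
    """Return pass_context when available, else the older contextfilter."""
    for name in ('pass_context', 'contextfilter'):
        decorator = getattr(jinja2_module, name, None)
        if decorator is not None:
            return decorator
    raise ImportError('No context-passing decorator found in jinja2')


# Stand-ins for an old and a new jinja2 module, used only for illustration.
old_jinja2 = types.SimpleNamespace(contextfilter=lambda func: func)
new_jinja2 = types.SimpleNamespace(pass_context=lambda func: func)

print(resolve_context_decorator(old_jinja2) is old_jinja2.contextfilter)
print(resolve_context_decorator(new_jinja2) is new_jinja2.pass_context)
```

In real code the argument would be the imported `jinja2` module itself.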
In addition to the cd.step.write_to_console() function, it would be nice to have a cd.step.render_to_console() that included Jinja2 templating and textwrap.dedent() internally to make complex console messages easier to write.
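To make the request concrete, here is a rough sketch of the dedent-then-render flow. Two assumptions apply: the standalone function below is hypothetical and not part of Cauldron, and `string.Template` stands in for Jinja2 so that the example needs only the standard library. With Jinja2 the placeholders would use its own `{{ name }}` syntax.

```python
import string
import textwrap


def render_to_console(template, **values):
    """Dedent a template, fill in its placeholders, and print the result."""
    dedented = textwrap.dedent(template).strip('\n')
    message = string.Template(dedented).substitute(**values)
    print(message)
    return message


# The template can be indented to match the surrounding code.
message = render_to_console(
    """
    Loaded $count rows
      from $source
    """,
    count=200,
    source='data.csv'
)
```

Dedenting first means relative indentation inside the message is preserved while the indentation that only exists to match the code is dropped.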
Are you planning to provide input possibilities for certain parameters or even raw data like csv-files? I could think of some use cases where this might be quite handy.
Is there a way to make Jupyter/IPython magic commands work?
If you make a bokeh plot, it will not show the bottom portion or the x-axis unless you put the window in fullscreen mode. Width is okay.
Is it possible to have the ability to save CSVs using pandas' to_csv function added to Cauldron?
Or simply a "download CSV or Google Sheets" link right in the dataframe?
Is it possible to save a notebook to an HTML file without including the code? I can think of many cases where you want to show only the results but keep how they were obtained private.
When multiple tests are configured in a StepTestCase instance, only one is allowed to succeed; the others error out. Here is a minimal example:
from cauldron.steptest import StepTestCase

class TestNotebook(StepTestCase):

    def test_one(self):
        self.run_step('S01-Initialize.py')

    def test_two(self):
        self.run_step('S01-Initialize.py')
The following error results:
__________________________________________________________________________________________ TestNotebook.test_two ___________________________________________________________________________________________
self = <test.test_notebook.TestNotebook testMethod=test_two>
def open_project(self) -> 'exposed.ExposedProject':
"""
Opens the project associated with this test and returns the public
(exposed) project after it has been loaded. If the project cannot be
opened the test will fail.
"""
try:
project_path = self.make_project_path()
> return support.open_project(project_path)
/usr/local/lib/python3.7/site-packages/cauldron/steptest/__init__.py:80:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
project_path = '/Users/kevinschiroo/path/to/project/notebook'
def open_project(project_path: str) -> 'exposed.ExposedProject':
"""
Opens the project located at the specified path and returns the public
(exposed) project after it has been loaded. If the project cannot be
opened a `RuntimeError` will be raised instead.
:param project_path:
The path to the Cauldron project to open. It should be either a
directory that contains a `cauldron.json` file or a file path to the
`cauldron.json` file for the project to load.
"""
res = commander.execute(
name='open',
raw_args='{} --forget'.format(project_path),
response=environ.Response()
)
res.thread.join()
# Prevent threading race conditions
project = (
cd.project.get_internal_project()
if res.success else
None
)
if res.failed or not project:
raise RuntimeError(
> 'Unable to open project at path "{}"'.format(project_path)
)
E RuntimeError: Unable to open project at path "/Users/kevinschiroo/path/to/project/notebook"
/usr/local/lib/python3.7/site-packages/cauldron/steptest/support.py:66: RuntimeError
During handling of the above exception, another exception occurred:
/usr/local/lib/python3.7/site-packages/cauldron/steptest/__init__.py:50: in setUp
self.open_project()
/usr/local/lib/python3.7/site-packages/cauldron/steptest/__init__.py:82: in open_project
self.fail('{}'.format(error))
E AssertionError: Unable to open project at path "/Users/kevinschiroo/path/to/project/notebook"
------------------------------------------------------------------------------------------- Captured stdout call -------------------------------------------------------------------------------------------
spec not found for the module 'test.test_notebook'
[ERROR]: Failed to execute command due to internal error
------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/cauldron/cli/threads.py", line 62, in run_command
**self.kwargs
File "/usr/local/lib/python3.7/site-packages/cauldron/cli/commands/open/__init__.py", line 147, in execute
results_path=results_path
File "/usr/local/lib/python3.7/site-packages/cauldron/cli/commands/open/opener.py", line 177, in open_project
runner.reload_libraries(project.library_directories)
File "/usr/local/lib/python3.7/site-packages/cauldron/runner/__init__.py", line 120, in reload_libraries
for directory in directories
File "/usr/local/lib/python3.7/site-packages/cauldron/runner/__init__.py", line 121, in <listcomp>
for reloaded_module in reload_library(directory)
File "/usr/local/lib/python3.7/site-packages/cauldron/runner/__init__.py", line 115, in reload_library
for path in glob.glob(glob_path, recursive=True)
File "/usr/local/lib/python3.7/site-packages/cauldron/runner/__init__.py", line 115, in <listcomp>
for path in glob.glob(glob_path, recursive=True)
File "/usr/local/lib/python3.7/site-packages/cauldron/runner/__init__.py", line 102, in reload_module
return importlib.reload(module) if module is not None else None
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/importlib/__init__.py", line 168, in reload
raise ModuleNotFoundError(f"spec not found for the module {name!r}", name=name)
ModuleNotFoundError: spec not found for the module 'test.test_notebook'
When the tests are commented out one at a time, the remaining uncommented test succeeds.
This appears to be a change from 0.4.6 to 0.4.7. In 0.4.6 these tests pass, in 0.4.7 one fails.
Please note that pytest-runner will not be supported anymore.
A window menu that would allow switching between different notebooks easily would be very helpful. For example, in chrome I can go to the Window menu and easily see what is open and switch between different open tabs by name.
I have some issues when it comes to displaying characters that are not ASCII (e.g. äöü or €) (actually I'm not sure about the encoding).
Nowhere in the Cauldron app can I find an option to define the encoding. If I change the file encoding from UTF-8 to windows-1252, these characters are displayed correctly. But this introduces some other issues. For example, error tracebacks are not shown; instead I get:
Failed with encoding: utf-8
'utf-8' codec can't decode byte 0xe4 in position 145: invalid continuation byte
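The error message above is what Python reports when bytes written as windows-1252 are read back as UTF-8. The sketch below reproduces that mismatch with plain Python. The sample word is arbitrary, and the example says nothing about where Cauldron performs its decoding.

```python
text = 'K\u00e4se'  # 'Kaese' spelled with an a-umlaut

# windows-1252 stores the a-umlaut as the single byte 0xe4.
cp1252_bytes = text.encode('cp1252')
print(cp1252_bytes)

# UTF-8 treats 0xe4 as the start of a multi-byte sequence, so decoding fails.
failure = None
try:
    cp1252_bytes.decode('utf-8')
except UnicodeDecodeError as error:
    failure = error
    print(error)

# Decoding with the encoding the bytes were written in restores the text.
print(cp1252_bytes.decode('cp1252') == text)
```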
The whitespace added by the embedded step:
__cauldron__.display.whitespace(0)
should not be added to the console display.
In an ideal world, everything you see can be inspected and manipulated (a la Bret Victor). In Spyder or a Jupyter notebook, you can manipulate objects by typing some code (with autocompletes sometimes based on the live object) and pressing Cmd-Enter. In Cauldron, poking at something requires switching to a separate code editor and taking what feels like a shot in the dark.
How could interacting with Cauldron feel more "live"?
Would it be possible to have a simple command in cauldron that would print out how long it took for a step to run? Something like cd.track_time()? I find myself regularly wanting to know how long a given step runs. I can add the code to each step pretty easily, but it would be a nice feature if that was something I could just turn on in Cauldron.
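Until something like this exists in Cauldron, the timing code can at least be kept to one line per step with a context manager. The sketch below is a stopgap under that assumption. The name `track_time` is borrowed from the suggestion above and is not an existing Cauldron function.

```python
import contextlib
import time


@contextlib.contextmanager
def track_time(label, report=print):
    """Measure the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        report('{} took {:.3f}s'.format(label, elapsed))


# Collect the message in a list here; the default is to print it.
messages = []
with track_time('S01-example', report=messages.append):
    total = sum(range(100000))
print(messages[0])
```

The `finally` clause means the elapsed time is still reported when the step body raises an exception.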
Step testing currently supports the unittest library, and adding pytest support would allow more flexibility in how tests are defined and handled.
I'm able to scroll up and down, but not left and right, with the tables output by cd.display.table.
Plotly scale is not being properly applied on non-iOS devices that have support for touch interactivity.
Add support for writing messages out only to the console from the notebook so that they appear in log output but not in the display of the running notebook. The capability for doing this already exists internally in the RedirectBuffer, but should be exposed to make it possible to use externally.
Version Info:
cauldron-notebook (0.0.16, /Users/cswarth/Development/cauldron)
Cauldron-1.270.509.dmg
$ python3 --version
Python 3.5.2
I don't appear to be able to restart the python kernel from the Cauldron MacOS UI.
Even if the UI is freshly started, I end up at a page that says "Applying Changes" with a spinning icon. Is the kernel even running at this point, before I have actually created a notebook?
Reproduce with:
A great feature would be the option to include your own CSS files to provide some custom styling. Is that something that's on your roadmap?
It would be great, especially during early development, to have a button on the UI that collects relevant version information in one place that can be copy-pasted into a bug report.
The button would present a popup with copyable text containing the version or settings you feel are most important to include in issues. You could even pre-populate the GitHub issue form with a request to include that config info, along with instructions for how to retrieve it.
[ Also could we get a version number included in cauldron-notebook? cauldron.__version__ isn't defined, and there doesn't seem to be any other internal storage of version info.
# jinja has version info
$ python -c 'import jinja2; print(jinja2.__version__)'
2.8
# cauldron does not :-(
$ python -c 'import cauldron; print(cauldron.__version__)'
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: module 'cauldron' has no attribute '__version__'
]
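On Python 3.8 or newer (later than the 3.5.2 used in this report), the installed version can be read from package metadata even when a package defines no `__version__` attribute. The sketch below assumes the distribution name `cauldron-notebook` from the pip listing in this report. The helper `installed_version` is hypothetical.

```python
from importlib import metadata  # available from Python 3.8


def installed_version(distribution, default='unknown'):
    """Return the installed version of a distribution, or a default."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return default


print(installed_version('cauldron-notebook'))

# A name that is not installed falls back to the default value.
missing = installed_version('surely-not-an-installed-package-name')
print(missing)
```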
The config info available from the UI might include,
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G1004
Versions:
python: Python 3.5.2
cauldron UI: 1.270.509
beautifulsoup4 (4.5.1)
cauldron-notebook (0.0.16)
certifi (2016.8.31)
click (6.6)
Flask (0.11.1)
itsdangerous (0.24)
Jinja2 (2.8)
Markdown (2.6.6)
MarkupSafe (0.23)
numpy (1.11.1)
pandas (0.18.1)
pip (8.1.2)
Pygments (2.1.3)
python-dateutil (2.3)
pytz (2016.6.1)
setuptools (26.1.1.post20160901)
six (1.10.0)
Werkzeug (0.11.10)
wheel (0.29.0)
I noticed some unexpected behavior when setting the name of a step.
There is an asterisk above filename that I don't understand. Is that providing some sort of warning?
If one includes a filename extension in the name of the step, the UI appears to incompletely remove the extension before creating the filename. I also have to ask why not let me call the filename anything I want, instead of prepending "S0-" onto the front? Is the S0 used to order the steps in the notebook?