Comments (20)
I started experimenting with relaxing the constraint. There are more edge cases than I originally anticipated. As well as some usability gotchas: imagine redefining a variable (without knowing it) in a very expensive cell. Only to find that you can't read it because it was defined elsewhere. That would make me very sad.
from marimo.
Mock up example:
from marimo.
Allow multiple definitions of a variable if it is never read before being assigned to in any cell.
This is a really interesting suggestion. It's certainly possible technically. And it also seems correct ...
I need to think through if there are any edge cases or gotchas. Are there any you can think of?
from marimo.
I was thinking about this earlier when I did for i in range(x)
for what felt like the millionth time.
More suggestions:
Using a with
e.g.
with mo.local_scope():
ax, fig
Use type hints (which can be statically parsed)
ax # mo.local
catching the error and prompting to rename with the _ prefix (that would be my preference, var defaults are just muscle memory)
from marimo.
I was thinking about this earlier when I did for i in range(x) for what felt like the millionth time.
Yes, exactly! Currently, you can not have for i in range(n):
in two cells, even if i
is never read outside the for loop scope. Perhaps I should have mentioned this as well since it is very common!
from marimo.
I need to think through if there are any edge cases or gotchas. Are there any you can think of?
I don't know any. I have been meaning to use Marimo more seriously but I keep running into this when converting existing notebooks. It seems like it would be possible to relax the constraints a bit and still be safe
from marimo.
Maybe have a conversion wizard for Jupyter? Prompt the user for variables that are used like ax or fig and automatically replace during the conversion process.
marimo could even suggest variables based on def frequency.
Doing this in conjunction with the "rename variable" option on error would go a long way.
I can only speak for myself, but I do like the fact that I know that all variables are global and not previously defined. I think relaxation starts to introduce the same issues - but I absolutely get the frustration. I think the biggest sticking for me is the actual conversion though, and aside from that- marimo forces me to write better code
from marimo.
I don't know any. I have been meaning to use Marimo more seriously but I keep running into this when converting existing notebooks. It seems like it would be possible to relax the constraints a bit and still be safe
Yes, that makes sense. I thought about it and don't foresee any correctness issues either. Relaxation does make multiple definition errors more confusing when they do happen, since the relaxed rule is more complicated to explain.
In the following sequence,
- Create a cell defining
x
. - Read
x
in another cell. - Create another cell defining
x
.
Cell 3 would not be run, and instead would display the error:
MultipleDefinitionError: Variables defined in multiple cells cannot be read before assignment within a cell.
This error message would link to the cells from steps 1 and 2.
Whereas in the following sequence:
- Create a cell defining
x
. - Create another cell defining
x
. - Read
x
in another cell.
The cell in step 3 cannot be run. I suppose it would show the same error, and also link to the cells from steps 1 and 2.
It may be confusing for a read to trigger a MultipleDefinitionError
. marimo already has a learning curve, so I am hesitant to do things that make that learning curve steeper.
Maybe have a conversion wizard for Jupyter? ...
If it's just for marimo convert
— marimo could uniquify variable names that are never read before being defined in a cell. It wouldn't be pretty, but it would at least make the conversion work in more cases without user intervention. So perhaps this could be a compromise we settle on for now?
from marimo.
marimo already has a learning curve, so I am hesitant to do things that make that learning curve steeper.
Fair enough, this is important but the advantages are also pretty big in this case since a quite common pattern would "just work" instead of raising errors
from marimo.
MultipleDefinitionError: Variables defined in multiple cells cannot be read before assignment within a cell.
The error could be more verbose than this.
Caution
InconsistentUsageError: Variables may be declared in one cell and referenced in multiple; or declared in multiple cells and used as a referenced nowhere. Cells x, y, z
declare variable
, and Cells a, b, c
reference variable
. Remove all references to variable
or reduce declarations to one.
from marimo.
But because execution is dynamic I think this is not super straightforward. Back to Akshay's example:
- Create a cell defining x.
- Create another cell defining x.
- Read x in another cell.
Receive error
4 . Delete the definition in cell 2
Cell 1 gets promoted to definition
- Change cell 3 to definition
Cell 1 gets demoted.
I think there's a lot more bookkeeping than currently takes place. All repeated cell-level variables will have to be kept in memory and then swapped out from hidden context to global context.
from marimo.
Assigned to myself — to either relax the constraint or improve marimo convert
.
from marimo.
MultipleDefinitionError: Variables defined in multiple cells cannot be read before assignment within a cell.
The error could be more verbose than this.
Caution
InconsistentUsageError: Variables may be declared in one cell and referenced in multiple; or declared in multiple cells and used as a referenced nowhere. Cells
x, y, z
declarevariable
, and Cellsa, b, c
referencevariable
. Remove all references tovariable
or reduce declarations to one.
Okay, yes, that's much better than my concise error message :)
from marimo.
All repeated cell-level variables will have to be kept in memory and then swapped out from hidden context to global context.
@dmadisetti Or we could re-run ancestors as needed. If a cell that has been requested for execution has a reference whose definition isn't in globals, run the cell that defines it (as well as its descendants) before running the requested cell. Thoughts?
from marimo.
Certainly a solution, but could also lead to a number of unintentional reruns.
Another suggestion: app = App(local=("ax", "fig", "i"))
might be the easiest from an engineering perspective. Also a couple of user benefits:
- outside of runtime context, which is important for initial graph construction
- explicit behavior. You might trigger one of the modes and not even know it otherwise until you try to use a 3rd cell.
- no runtime penalty
- can be tuned for jupyter conversion (cells may utilize "axis" or "figure" instead of "ax" or "fig", and they might be a keyword worth marking as local)
As for implementation:
- propagate information to visitor.is_local and just use the current local mechanism
- Raise custom error if one of the special locals is a ref
- Ensure exporter saves the information
- (maybe) UI means of editing the values
I think runtime implementation could put notebooks in an ambiguous state, which is not a great user experience. This is a compromise with the downside of slightly hiding the information that some values will treated as private (but partially mitigated with a custom error).
from marimo.
How would one specify the locals from inside a notebook? Or is the idea that locals would only be specified by editing the notebook file?
I think with some careful bookkeeping, one could minimize unintentional re-runs. Your solution of keeping variables in memory would also work, and would have no unintentional reruns, at the cost of memory consumption.
from marimo.
The variables panel is a great place to mark variables as being private:
Maybe a checkbox? I was only mentioning saving inside the notebook for persistence (same way layout file reference is saved behind the scenes, or show_code
state on the cell level).
The book-keeping approach would have to account for additional kernels (thinking the strict execute pr)- but still could work
from marimo.
I think runtime implementation could put notebooks in an ambiguous state, which is not a great user experience.
Yeah, you're totally right. That's what I concluded too after experimentation
... This is a compromise with the downside of slightly hiding the information that some values will treated as private (but partially mitigated with a custom error).
The hiding could be confusing ... but otherwise I agree with your pros, especially with the explicitness.
Mock up example:
Nice mockup! If we wanted to use the existing mechanism for local variables, "change to private read-only variable" could just prepend an underscore to the variable name.
Another solution that would be explict, without having to use underscores, would be to define a custom type annotation that marimo understood. Something like:
fig: "local"
ax: "local"
fig, ax = plt.subplots()
...
That's something I would be open to. But maybe underscores is fine for now ...
from marimo.
Closing as not planned for now — in favor of the marimo convert improvement — given how this can get the runtime in an ambiguous state. Still happy to revist relaxing the constraint later, if anyone has further thoughts, and still happy to riff on other ways we can improve this experience
from marimo.
Thanks for seriously considering it! Maybe it's possible to improve this somehow....
from marimo.
Related Issues (20)
- Content created through JS bypasses user defined CSS HOT 4
- Configuration option to display dataframes using plain formatter
- Create a Copy Notebook Option
- Interruptible RPCs
- use external editor (e.g. neovim) for writing code, hot-reload it in the marimo editor HOT 2
- docstring popup for functions when in brackets HOT 3
- Some Firefox keyboard shortcuts are captured, while some not HOT 4
- Headings with the same name are highlighted multiple time in the table of contents
- showing a polars dataframe with array column is slow HOT 3
- mo.mpl.interactive() leaves blank figures if you're not looking HOT 13
- lazy cells (re-work of disabled cells) HOT 5
- Show Python results in Python notation HOT 2
- ipyaladin viewer style sheet issue in marimo notebooks HOT 2
- "no module named pandas" when plotting Polars DataFrame with Altair HOT 2
- Assertion errors when adding a custom logging handler
- ability to disable `import *` error HOT 2
- support for spyder-like view of dataframes and arrays HOT 2
- cell output location is not preserved when exporting HOT 1
- Inquiry about a jupytext-like tool for Marimo notebooks. HOT 9
- Font enlargement HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from marimo.