First of all thanks for the new Prompt RegEx feature โค๏ธ
I did just try it out and after some fiddeling around with the escaping, it works like a charm.
Expected behaviour
First of all I think in some edge cases it might make a difference to mention that the RegEx you type in your config.py
is a JS RegEx and not a python one.
So that said IMHO it should work similar to this:
copybutton_prompt_text = r"<valid_js_reg_ex> "
copybutton_prompt_is_regexp = True
๐ ๐ฆ
What I did
In my case the prompt I wanted to get rid of was of the form
which is the jupyter console
style and pretty much the same as the iPython style from the docs
"\\[\\d*\\]: |\\.\\.\\.: "
So it should be straight forward
"In \\[\\d*\\]: |\\.\\.\\.: "
as in the docs and I would be done?
But it didn't work, so I did some digging and ended up with:
"In \\[\\d*\\]: |\\.\\.\\.: |\\$ "
What internally happens
In my config.py
, I set copybutton_prompt_text
copybutton_prompt_text = r"In \[\d*\]: |\.\.\.: "
which is the same as:
copybutton_prompt_text = "In \\[\\d*\\]: |\\.\\.\\.: "
Sphinx now replaces the context variable copybutton_prompt_text
in copybutton.js_t
and generates copybutton.js
.
But when we look at that string which should have just been passed as is, it is missing backslashes.
'In \[\d*\]: |\.\.\.: |\$ '
Well this looks like a totaly valid JS RegEx, so this is fine?
And it would be if written like this (JS just wants keep people on their toes, so why not a different character for RegEx strings?):
/In \[\d*\]: |\.\.\.: |\$ /
Now the string gets passed to the RegExp constuctor, which expects strings to be escaped like python, if you don't use raw strings.
But it doesn't match.
> "In [1]: bar".match(new RegExp('^(In \[\d*\]: |\.\.\.: )(.*)'))
null
Why is this going wrong and why didn't the tests catch this?
TLDR
IMHO this all comes down ro JS RegExp 'helping' you in some cases when it can interprete what you might have ment, which gives you false security (false positiv match in the tests).
Details
Disclaimer I'm in no way a JS expert with insight into the internals, I just found this due to WTF moments + trail and error.
> let proper_regex = new RegExp('^(>>> |\\$ |In \\[\\d*\\]: |\\[\\d*\\]: |\\.\\.\\.: )(.*)')
undefined
> let false_positive_regex = new RegExp('^(>>> |\\$ |In \[\d*\]: |\[\d*\]: |\.\.\.: )(.*)')
undefined
>"$ bar".match(proper_regex)
(3)ย ["$ bar", "$ ", "bar", index: 0, input: "$ bar", groups: undefined]
> "[1]: bar".match(proper_regex)
(3)ย ["[1]: bar", "[1]: ", "bar", index: 0, input: "[1]: bar", groups: undefined]
> "[1]: bar".match(false_positive_regex)
(3)ย ["[1]: bar", "[1]: ", "bar", index: 0, input: "[1]: bar", groups: undefined]
> "...: bar".match(proper_regex)
(3)ย ["...: bar", "...: ", "bar", index: 0, input: "...: bar", groups: undefined]
> "...: bar".match(false_positive_regex)
(3)ย ["...: bar", "...: ", "bar", index: 0, input: "...: bar", groups: undefined]
So far so good, all testcases work fine, as we would expect since they pass.
> "In [1]: bar".match(proper_regex)
(3)ย ["In [1]: bar", "In [1]: ", "bar", index: 0, input: "In [1]: bar", groups: undefined]
> "In [1]: bar".match(false_positive_regex)
null
The only explaination I got for this behaviour, is that JS wants to "help" the user by guessing it's intention, like the famous adding thingy:
but at some points it is like "I don't get it", and the pattern that worked before fails.
Possible fix
Use repr
in add_to_context
If this line was changed to:
{"copybutton_prompt_text": r"{}".format(repr(config.copybutton_prompt_text))[1: -1]}
I know it looks ugly, but would add extra backslashes for escaping (maybe someone else has a more elegant solution).
I'm not sure how much escapeRegExp
would need to be adjusted to those changes.
I didn't have luck with backslash replacing in JS:
> 'In \[\d*\]: |\.\.\.: '.replace(/\\/, '\\\\');
"In [d*]: |...: "
> 'In \[\d*\]: |\.\.\.: '
"In [d*]: |...: "
So in conclusion:
Python โค๏ธ
JS ๐
P.S.: Wish we had the sphinx-copybutton
on github, for the JS blobs I posted ๐ข