Comments (9)

protesilaos avatar protesilaos commented on July 28, 2024

Currently denote-faces--file-name-regexp only recognizes ASCII titles and keywords. Can we change it so that it recognizes non-ASCII characters?

Yes, sure! Is [:word:] the correct class? I am reviewing the manual now and get the impression that [:alpha:] is what we need (not tested and not sure). Evaluate:

(info "(elisp) Char Classes")

And maybe also set the value of denote--punctuation-regexp to [[:punct:]] to make denote aware of non-ASCII punctuation?

Again, no problem. We just need to be certain that the current set is not affected and we don't get false positives. For example, we allow - but not _.

from denote.

 avatar commented on July 28, 2024

OK, I did some reading and testing. I think both [:word:] and [:alnum:] will suffice, though [:word:] might be the safer option.
The manual's description of [:word:] (or \w) is a little unclear:

Word constituents: ‘w’
Parts of words in human languages. These are typically used in
variable and command names in programs. All upper- and lower-case
letters, and the digits, are typically word constituents.

But when I invoke describe-char on an ASCII letter or a multi-byte letter, it shows "syntax: w which means: word", so I guess Emacs might prefer [:word:]...

[:alpha:], however, won't match digits; it only matches letters.
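
For anyone who wants to reproduce the comparison, here is a minimal sketch. The helper name is made up for illustration and is not part of Denote. It assumes the standard syntax table, since [:word:] depends on whichever syntax table is active:

```elisp
;; Hypothetical helper for experimenting; not part of Denote.
(defun my-char-class-matches-p (class str)
  "Return t if every character of STR belongs to CLASS.
CLASS is a bracket class name such as \"[:alpha:]\"."
  ;; [:word:] consults the syntax table, so pin it to the standard one.
  (with-syntax-table (standard-syntax-table)
    (and (string-match-p (concat "\\`[" class "]+\\'") str) t)))

(my-char-class-matches-p "[:alpha:]" "笔记")    ; => t   (letters only)
(my-char-class-matches-p "[:alpha:]" "笔记2")   ; => nil (a digit is not a letter)
(my-char-class-matches-p "[:alnum:]" "笔记2")   ; => t
(my-char-class-matches-p "[:word:]" "笔记2")    ; => t
(my-char-class-matches-p "[:word:]" "my-note")  ; => nil (hyphen is not a word constituent)
```

In a buffer whose major mode changes the syntax table, the [:word:] results may differ, which is one reason to test before committing to either class.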

Punctuation is trickier. Emacs regexps consume what they match, so the same character cannot be tested twice within one regexp; as far as I can tell, "all punctuation except -" cannot be expressed as a single regexp.

My suggestion is to either make denote--punctuation-regexp customizable, so that users can append any characters they want, or change the logic so that it matches all punctuation except for the characters listed in, say, denote--punctuation-reserved-regexp.
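
The second option could look roughly like the sketch below. Both names are hypothetical (nothing here exists in Denote), and the reserved characters are kept in a list rather than a regexp to keep the example short. It matches every [[:punct:]] character and lets a replacement function decide whether to keep it:

```elisp
;; Hypothetical names; a sketch of the "reserved punctuation" idea only.
(defvar my-denote-punctuation-reserved '(?-)
  "Punctuation characters that should survive in file names.")

(defun my-denote-slug-no-punct (str)
  "Remove punctuation from STR, except reserved characters."
  (replace-regexp-in-string
   "[[:punct:]]"
   (lambda (match)
     ;; MATCH is a one-character string; keep it only if it is reserved.
     (if (memq (aref match 0) my-denote-punctuation-reserved)
         match
       ""))
   str t t))

(my-denote-slug-no-punct "Hello, world-wide_web!")
;; => "Hello world-wideweb"
```

For non-ASCII characters, [:punct:] matches anything that is not a word constituent, so the result for full-width punctuation depends on the active syntax table.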

protesilaos avatar protesilaos commented on July 28, 2024

OK, I did some reading and testing. I think both [:word:] and [:alnum:] will suffice, though [:word:] might be the safer option.

I am fine either way. The key is to not break what we have.

My suggestion is to either make denote--punctuation-regexp customizable, so that users can append any characters they want

Maybe we can do with a defvar instead of a defcustom? I try not to add defcustoms too early, because we may come up with a better solution.

Would this proof-of-concept diff work for you?

 denote.el | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/denote.el b/denote.el
index 920a707..2cdeaec 100644
--- a/denote.el
+++ b/denote.el
@@ -327,6 +327,8 @@ (defconst denote--punctuation-regexp "[][{}!@#$%^&*()_=+'\"?,.\|;:~`‘’“”
   "Regular expression of punctionation that should be removed.
 We consider those characters illigal for our purposes.")
 
+(defvar denote-punctuation-excluded nil)
+
 (defvar denote-last-path nil "Store last path.")
 (defvar denote-last-title nil "Store last title.")
 (defvar denote-last-keywords nil "Store last keywords.")
@@ -353,7 +355,7 @@ (defun denote-directory ()
 
 (defun denote--slug-no-punct (str)
   "Convert STR to a file name slug."
-  (replace-regexp-in-string denote--punctuation-regexp "" str))
+  (replace-regexp-in-string (concat denote--punctuation-regexp denote-punctuation-excluded) "" str))
 
 (defun denote--slug-hyphenate (str)
   "Replace spaces with hyphens in STR.

or change the logic so that it matches all punctuation except for the characters listed in, say, denote--punctuation-reserved-regexp

Yes, this is another possibility.

 avatar commented on July 28, 2024

Yes, the patch works. I can now remove full-width punctuation with something like this:

(setq denote-punctuation-excluded
      "[][『』〖〗{}「」【】〔〕()《》〈〉«»!#¥%…&\"'*,。;:、?—]*")

protesilaos avatar protesilaos commented on July 28, 2024

Thank you! I just pushed the change:

  commit fb534987cef774a6e8b1584630bf019556db3d8f
  Author: Protesilaos Stavrou <[email protected]>
  Date:   Wed Jul 20 07:16:32 2022 +0300
  
      Add variable to extend "illegal" punctuation
      
      This is to allow users to omit non-ascii characters from file names.
      
      Thanks to pRot0ta1p for the feedback in issue 57 over at the GitHub
      mirror: <https://github.com/protesilaos/denote/issues/57>.
      
      Example based on the input of pRot0ta1p:
      
          (setq denote-punctuation-excluded-extra-regexp
                "[『』〖〗{}「」【】〔〕[]()《》〈〉«»!#¥%…&"'*,。;:、?—]*")
  
   denote.el | 11 ++++++++++-
   1 file changed, 10 insertions(+), 1 deletion(-)

I wrote this for the variable, so that we keep it in mind if a robust
solution comes up:

(defvar denote-punctuation-excluded-extra-regexp nil
  "Additional punctuation that is removed from file names.
This variable is for advanced users who need to extend the
`denote--punctuation-regexp'.  Once we have a better
understanding of what we should be omitting, we will update
things accordingly.")

 avatar commented on July 28, 2024

Thanks! That's one problem solved. What should we do with denote-faces--file-name-regexp? Change it to a defvar so we can test it out before we figure out the perfect regexp?

protesilaos avatar protesilaos commented on July 28, 2024

It already is a defvar. But maybe you mean we should follow the same approach of concatenating what we have with a user-defined variable? I am okay with that. Just asking to be sure.

 avatar commented on July 28, 2024

Oops, sorry, I didn't check the source code before replying.
I'm OK with it as it is now. The definition of denote-faces--file-name-regexp is clear and simple, so users who want to hack it should have no problem.

I'll close this issue for now. If anything breaks, I will report it to you. Thanks again.
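
As a rough idea of what such a hack might involve, the sketch below compares an ASCII-only pattern with a [:word:]-based one. Both regexps are invented for illustration and assume file names of the form DATE--TITLE__KEYWORDS.org; they are not the actual value of denote-faces--file-name-regexp:

```elisp
;; Illustrative regexps only; not Denote's real definitions.
(defconst my-file-name-regexp-ascii
  "\\`[0-9]\\{8\\}T[0-9]\\{6\\}--\\([a-z0-9-]+\\)__\\([a-z0-9_]+\\)\\.org\\'"
  "Title and keywords limited to ASCII letters and digits.")

(defconst my-file-name-regexp-word
  "\\`[0-9]\\{8\\}T[0-9]\\{6\\}--\\([[:word:]-]+\\)__\\([[:word:]_]+\\)\\.org\\'"
  "Title and keywords may contain any word-constituent character.")

(defun my-file-name-matches-p (regexp file)
  "Return t if FILE matches REGEXP under the standard syntax table."
  (with-syntax-table (standard-syntax-table)
    (and (string-match-p regexp file) t)))

(my-file-name-matches-p my-file-name-regexp-ascii "20220720T071632--笔记__emacs.org")
;; => nil
(my-file-name-matches-p my-file-name-regexp-word "20220720T071632--笔记__emacs.org")
;; => t
```

The underscore is a symbol constituent rather than a word constituent in the standard syntax table, so the title group cannot run past the `__` separator.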

protesilaos avatar protesilaos commented on July 28, 2024

You are welcome! Please let me know how it goes.
