Feature suggestion: use custom filter to split sections about tw-section HOT 8 CLOSED

Gk0Wk commented on June 1, 2024 2

Feature suggestion: use custom filter to split sections

from tw-section.

Comments (8)

pmario commented on June 1, 2024 2

if ((options.all || options.h1) && /^!\s+[^\n]+$/.test(line)) {

That's not how the TW parser tests for headings. See: !!.myStyle heading text comes here and the whitespace after !! is optional.
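To illustrate the point, here is a minimal sketch of a heading test that accepts an optional class suffix and optional whitespace after the marker. `headingLevel` is a hypothetical helper written for this example only; it is not part of TiddlyWiki or tw-section, and it only approximates the behaviour described above.

```javascript
// Hypothetical helper, not part of TiddlyWiki or tw-section.
// Returns the heading level (1-6) of a wikitext line, or 0 if the line
// does not start with a heading marker.
function headingLevel(line) {
    // 1-6 "!" markers, optional ".class" names, optional spaces, then the text.
    var match = /^(!{1,6})((?:\.[\w\-]+)*)[ \t]*(.*)$/.exec(line);
    return match ? match[1].length : 0;
}

// The stricter pattern requires whitespace right after the marker,
// so it rejects a heading that carries a class.
var strictH2 = /^!!\s+[^\n]+$/;
console.log(strictH2.test("!!.myStyle heading text comes here"));
console.log(headingLevel("!!.myStyle heading text comes here"));
console.log(headingLevel("!!heading without a space"));
```

The strict pattern rejects the first line, while the looser helper reports level 2 for both headings.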

from tw-section.

kookma commented on June 1, 2024 2

Hello! Earlier in the TiddlyWiki Talk, I mentioned that I would like to be able to segment tiddlers not only based on first and second-level headers, but also in a more flexible way. After that, I tried to write a custom filter to achieve this.
Then, change the filter expression in line 11 of source/section/macros/main.tid to [<sourceText>sectionsplit[h1+h2+h3]!is[blank]regexp<nonWhitespace>].

Thank you, I welcome your contribution! Unfortunately, I know little JS, but this is a nice filter if it can accept different rules!
Would you please also add rules to split on <tag ...> contents </tag>? The tag can be section, article, or another semantic HTML5 tag.

Then TW-Section can split a tiddler by h1, h2 and h3. You can change it to [h1+h2+h3+h4] and so on to add more split rules. The filter currently supports:

h1 ~ h6

hr

blank, means blank line

all, means all above

Does this mean I can pass sectionsplit[h1+h2] and split only on heading levels one and two?

Also, as @pmario said, TiddlyWiki allows !!.myStyle heading text comes here, which means you can apply styles through CSS classes. Would you please correct the filter to handle such cases?

If you think this filter is useful, I can add more rules to it. 😄

Yes, this is absolutely useful!

Thank you

from tw-section.

Gk0Wk commented on June 1, 2024 1

Hi @kookma !

I suddenly had a better idea: use TiddlyWiki's internal syntax parser to generate a parse tree of the document instead of writing a parser myself. This approach should allow easier and more flexible segmentation (including the features mentioned above). I'll try it when I have time.

from tw-section.

Gk0Wk commented on June 1, 2024

That's not how the TW parser tests for headings. See: !!.myStyle heading text comes here and the whitespace after !! is optional.

@pmario Thank you! I have changed the heading-matching expressions.

Does this mean I can pass sectionsplit[h1+h2] and split only on heading levels one and two?

@kookma Yes. sectionsplit[] (valid but useless), sectionsplit[h2], sectionsplit[h1+h2] and sectionsplit[hr+h3+h1] all work. The order and number of rules do not matter.
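As a small illustration of why the order of the rules does not matter, the operand can be reduced to a set of flags before any line is examined. `parseRules` is a hypothetical name used only for this sketch.

```javascript
// Hypothetical helper: turn an operand such as "hr+h3+h1" into a set of flags.
function parseRules(operand) {
    var rules = {};
    operand.split("+").forEach(function(name) {
        name = name.trim();
        // Skip empty names so that an empty operand enables no rule at all.
        if (name) {
            rules[name] = true;
        }
    });
    return rules;
}

// Both operands enable the same three rules, whatever their order.
console.log(Object.keys(parseRules("hr+h3+h1")).sort());
console.log(Object.keys(parseRules("h1+h3+hr")).sort());
// An empty operand enables nothing, so the text stays in one section.
console.log(Object.keys(parseRules("")).length);
```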

It is possible to split a tiddler on HTML tags, but there are some problems:

I would need to write a state machine to find opening and closing tags. The regex </[^>]+>/gim works on most HTML text, but not on special cases like <tag attr=">"></tag>.
The HTML section may not start at the beginning of a line, and may not end at the end of a line. For example:

* Hello <div>
World
</div>!

The HTML sections may not be closed, or may be closed incorrectly. How should those cases be handled?
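For the first problem, a minimal sketch of the state-machine idea is shown here. `findTagEnd` is a hypothetical helper, not existing plugin code; it only tracks quoted attribute values and does not deal with unclosed or wrongly nested elements.

```javascript
// Hypothetical helper: given the index of a "<", return the index just past
// the ">" that ends the tag, ignoring any ">" inside quoted attribute values.
// Returns -1 if the tag is never closed.
function findTagEnd(text, start) {
    var quote = null; // the quote character we are currently inside, if any
    for (var i = start + 1; i < text.length; i++) {
        var ch = text[i];
        if (quote) {
            if (ch === quote) {
                quote = null; // closing quote found
            }
        } else if (ch === '"' || ch === "'") {
            quote = ch; // entering a quoted attribute value
        } else if (ch === ">") {
            return i + 1;
        }
    }
    return -1;
}

var html = '<tag attr=">"></tag>';
// The scanner returns the whole opening tag, including the quoted ">".
console.log(html.slice(0, findTagEnd(html, 0)));
// A plain regex stops at the first ">" instead.
console.log(/<[^>]+>/.exec(html)[0]);
```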

So, I think there are some questions for us to discuss before adding this important rule.

I also have another question. How does the plugin work? If I remove some blank lines and trim the returned text (for example, \n123\n -> 123), will the plugin still work properly?

Thanks all.

Updated js code:


/*\
title: $:/core/modules/filters/sectionsplit.js
type: application/javascript
module-type: filteroperator
creator: Gk0WK(nmg_wk @yeah.net)

Split sections of wikitext

\*/
(function() {

    /*jslint node: true, browser: true */
    /*global $tw: false */
    "use strict";

    /*
    Export our filter function
    */
    exports.sectionsplit = function(source, operator, options) {
        var splitOptions = {};
        operator.operand.split('+').forEach(option => {
            splitOptions[option] = true;
        })
        var results = [];
        source(function(tiddler, title) {
            var section = [];
            title.split('\n').forEach(line => {
                var newSection = false;

                // Heading rules: 1-6 "!" markers, an optional ".class" suffix, then the text.
                // Whitespace after the marker is optional, and (?!!) keeps a shorter
                // marker from matching a deeper heading level.
                if ((splitOptions.all || splitOptions.h1) && /^!(?!!)(\.[\w\-]+)?\s*\S.*$/.test(line)) {
                    newSection = true;
                } else if ((splitOptions.all || splitOptions.h2) && /^!!(?!!)(\.[\w\-]+)?\s*\S.*$/.test(line)) {
                    newSection = true;
                } else if ((splitOptions.all || splitOptions.h3) && /^!!!(?!!)(\.[\w\-]+)?\s*\S.*$/.test(line)) {
                    newSection = true;
                } else if ((splitOptions.all || splitOptions.h4) && /^!!!!(?!!)(\.[\w\-]+)?\s*\S.*$/.test(line)) {
                    newSection = true;
                } else if ((splitOptions.all || splitOptions.h5) && /^!!!!!(?!!)(\.[\w\-]+)?\s*\S.*$/.test(line)) {
                    newSection = true;
                } else if ((splitOptions.all || splitOptions.h6) && /^!!!!!!(\.[\w\-]+)?\s*\S.*$/.test(line)) {
                    newSection = true;
                } else if ((splitOptions.all || splitOptions.hr) && /^---+$/.test(line)) {
                    newSection = true;
                } else if ((splitOptions.all || splitOptions.blank) && /^\s*$/.test(line)) {
                    newSection = true;
                }

                if (newSection && section.length > 0) {
                    results.push(section.join('\n').trim());
                    section = [];
                }
                section.push(line);
            });
            results.push(section.join('\n').trim());
        });
        return results;
    };

})();

from tw-section.

kookma commented on June 1, 2024

@Gk0Wk thank you!
I decided to keep the simple version as a wikitext-only plugin (without JS). Then we can have a subplugin or an advanced SE that includes the JS and more sophisticated parsing!

I think for now

only sectionize the parts that start with a heading (e.g. !, !!, etc.)
ignore the part at the beginning of the tiddler that has no heading

support for <section>...</section>
OR some custom tag like <chapter>....</chapter>

but not any tag!

from tw-section.

kookma commented on June 1, 2024

That would be great! I would appreciate it if you could give me a short working example!

from tw-section.

Gk0Wk commented on June 1, 2024

@kookma Hi. I'm sorry to inform you that the TiddlyWiki core parser does not include substring location information in its parse results, which makes the syntax tree unusable for splitting sections. I think a better way is still to write a separate syntax parser. As it happens, I am implementing such a parser in my TW5-CodeMirror-Enhanced project, and if I finish a simple parser I will try to use it in TW_Section.

BTW, I encountered some problems in using TW_Section, which I suspect are caused by overly simple regular expressions. TW_Section does not correctly divide and render the following two cases:

case 1:

<<<
! a
<<<

case 2:

```
! a
```
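One possible way to handle both cases is to track whether the current line sits inside a quote or code block and to ignore heading markers there. The sketch below is only an assumption about how that could look: `splitOutsideBlocks` is a hypothetical function, it splits on any heading level, and it does not handle nested blocks.

```javascript
// Hypothetical helper: split text into sections at headings, but ignore
// heading markers inside ``` code blocks and <<< quote blocks.
function splitOutsideBlocks(text) {
    var sections = [];
    var current = [];
    var open = null; // the block delimiter we are currently inside, if any
    text.split("\n").forEach(function(line) {
        var fence = /^(```|<<<)/.exec(line);
        if (open) {
            // Inside a block: only the matching delimiter closes it.
            if (fence && fence[1] === open) {
                open = null;
            }
        } else if (fence) {
            open = fence[1];
        } else if (/^!{1,6}/.test(line) && current.length > 0) {
            // A heading outside any block starts a new section.
            sections.push(current.join("\n"));
            current = [];
        }
        current.push(line);
    });
    if (current.length > 0) {
        sections.push(current.join("\n"));
    }
    return sections;
}

// The heading inside the quote block stays with its surrounding text.
console.log(splitOutsideBlocks("intro\n<<<\n! a\n<<<\n! real heading\nbody"));
```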

from tw-section.

kookma commented on June 1, 2024

I am implementing such a parser in my TW5-CodeMirror-Enhanced project, and if I finish a simple parser I will try to use it in TW_Section.

Lovely, I am sure TW5-CodeMirror-Enhanced will receive a lot of attention, as it modernizes the editor and simplifies writing in TW.

BTW, I encountered some problems in using TW_Section, which I suspect are caused by overly simple regular expressions. TW_Section does not correctly divide and render the following two cases:

Yes, I am aware of this and will add it to the documentation!
Another issue is with local macros! If you sectionize a tiddler that has a local macro, the macro does not work in the other sections!

That is why I raised the definition of tiddlers on Talk TiddlyWiki! Section Editor works best for plain tiddlers!

<<<
! a
<<<

Do you have a workaround for this? In the documentation I recommend using <h1>..., but this is a temporary solution.

from tw-section.
