Have a search bar (mdbook, closed)

rust-lang avatar rust-lang commented on August 15, 2024 6
Have a search bar

from mdbook.

Comments (53)

olivernn avatar olivernn commented on August 15, 2024 3

So, I've put together a basic implementation of a Lunr backend in Rust. It works: the included example generates an index that can be loaded by lunr.js.

There is still a lot to do, and I've opened issues to cover the major items. Also, as previously mentioned, I'm still new to Rust, so I'm positive there are a thousand ways the implementation could be improved. Is anyone willing to lend a hand?

azerupi avatar azerupi commented on August 15, 2024 1

Thanks for the input!

I think that tantivy would provide more performant results than lunr js but would no longer be a purely static output.

Yeah, I think that losing the static aspect is a really big drawback. I am not sure I would be willing to accept that. Gitbook uses Lunr, and I think that is the way to go, at least for now.

mattico avatar mattico commented on August 15, 2024 1

I have a port of elasticlunr to rust which may be useful. It's pretty early, but I plan on fleshing it out over the coming weeks.

olivernn avatar olivernn commented on August 15, 2024 1

@mattico looks like a good start, I'm curious though why you went with elasticlunr instead of Lunr?

I might take a stab at putting something similar together for Lunr. I'm an enthusiastic amateur (at best) when it comes to Rust, so I will probably take it slow.

Phaiax avatar Phaiax commented on August 15, 2024 1

@mattico I could also bring some of my time to this project, so I'll just go and do the UI and client-side stuff. (Hope you did not start that as well :D). Branch: https://github.com/Phaiax/mdBook/tree/search-eljs

Phaiax avatar Phaiax commented on August 15, 2024 1

I moved the search results to the center, added history functionality, and fixed mobile, at least for Chrome on Android. You probably need to flush the browser cache (Ctrl+Shift+R): https://phaiax.github.io/mdBook/cli/init.html

I also tried to integrate elasticlunr-rs, but I did not use the resulting index on the browser side: Phaiax@ac236e8. @mattico I'll wait until you have made progress with your crate.

More Todos:

  • Quit search on ESC
  • Colors of search bar on other themes

mattico avatar mattico commented on August 15, 2024 1

Quick status update about elasticlunr-rs.

The json index looks correct except that the leaf nodes of the index don't actually link to any documents. Current plan is to port over tests from the JS version, which should uncover the issue. Then get it to work on stable, polish, docs, etc. Likely won't have time to work on this until the weekend.

Phaiax avatar Phaiax commented on August 15, 2024 1

The search is usable now.
Currently the generated index does not work properly, but I have not investigated yet. As a workaround, I use the documents in the index to regenerate it in the browser :D.

I'm not sure about the design of the results and the preview text yet.

Does anyone have an idea for an algorithm to select a better snippet from the unformatted paragraph string?

  • Input:
    • The search terms: "link previous"
    • The document, but it is possible that it does not contain any of the search terms or contains one more than once.

It would be best to pick a snippet that contains the area of the document where the search terms occur the most.
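
One possible approach, sketched in Rust under assumptions that are not from this thread: slide a fixed-size window of words over the body text and keep the window with the most search-term hits, falling back to the start of the document when nothing matches. The name `best_snippet` and the `window` parameter are hypothetical, and this is not necessarily the algorithm used in the branch.

```rust
/// Returns the `window`-word slice of `body` that contains the most search-term hits.
/// Falls back to the first `window` words when no term occurs.
/// Matching is case-insensitive and by prefix, so "link" also hits "links".
/// Simplification: leading punctuation is not stripped, so "(link" does not match.
fn best_snippet(body: &str, terms: &[&str], window: usize) -> String {
    let words: Vec<&str> = body.split_whitespace().collect();
    if words.is_empty() || window == 0 {
        return String::new();
    }
    let terms: Vec<String> = terms.iter().map(|t| t.to_lowercase()).collect();
    // 1 where a word starts with any search term, 0 otherwise.
    let hits: Vec<usize> = words
        .iter()
        .map(|w| {
            let w = w.to_lowercase();
            terms.iter().any(|t| !t.is_empty() && w.starts_with(t.as_str())) as usize
        })
        .collect();
    let window = window.min(words.len());
    // Score the first window, then slide it one word at a time.
    let mut score: usize = hits[..window].iter().sum();
    let (mut best_score, mut best_start) = (score, 0);
    for start in 1..=(words.len() - window) {
        score = score + hits[start + window - 1] - hits[start - 1];
        if score > best_score {
            best_score = score;
            best_start = start;
        }
    }
    words[best_start..best_start + window].join(" ")
}
```

Ties go to the earliest window, so a document that mentions a term only once still yields a snippet that begins near the start of the matching region.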

And for testing purposes, this is the new search with the second edition of the Rust book. The searchindex.json file is 2.3 MB, but it is loaded asynchronously.

Phaiax avatar Phaiax commented on August 15, 2024 1

@budziq

  • Limiting the number of search results is possible, but I'll wait until the config overhaul is done
  • Search highlight is now removed when Esc is pressed. Is that enough?
  • Yes, the different fields of a search document can be 'boosted', that means prioritized. Just a number, but I will make that configurable.
  • For the custom metadata: It is always possible to add a <metadata>Crates: rayon</metadata> to the md files and style those tags as display: none.

azerupi avatar azerupi commented on August 15, 2024

A search bar would indeed be very useful, but I don't think it's trivial to implement.

Ideally there would be some priority, like this:

Sidebar title > h1 > h2 > h3 > h4 > ... > bold > other text

That would require parsing the Markdown a second time. Maybe we can hook into the parser from pulldown-cmark without the renderer?
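
As a rough illustration of the ordering above, here is a minimal Rust sketch that maps the place a piece of text came from to a ranking weight. The `Context` enum and the numeric weights are hypothetical placeholders; how the context would be derived from parser events is left out entirely.

```rust
/// Where a piece of text came from in a chapter (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Context {
    SidebarTitle,
    Heading(u8), // 1 for h1, 2 for h2, ... up to 6
    Bold,
    Body,
}

/// Maps a context to a ranking weight. The numbers are placeholders chosen
/// only to preserve the ordering sidebar title > h1 > ... > h6 > bold > body.
fn weight(ctx: Context) -> u32 {
    match ctx {
        Context::SidebarTitle => 10,
        // Out-of-range heading levels are clamped into 1..=6.
        Context::Heading(level) => 9 - u32::from(level.clamp(1, 6)),
        Context::Bold => 2,
        Context::Body => 1,
    }
}
```

A match in a sidebar title would then count for more than the same match in an h3 or in body text when scores are summed.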

Being able to search for programming keywords would actually make it very unusual and better than most too (outside the scope of a regular search bar though)

A more general approach could be to specify high priority keywords in book.json. I am not sure how difficult that would be and how well that would work though.

steveklabnik avatar steveklabnik commented on August 15, 2024

👍 this is something people have asked of rustbook forever, see rust-lang/rust#22786

asolove avatar asolove commented on August 15, 2024

Is anyone else interested in working on this? If not, I might take a crack at generating a naive search index and seeing how it works on the rust book.

azerupi avatar azerupi commented on August 15, 2024

I have exams until end of January, so.. be my guest! 😉

I think using lunr.js has been considered in the past, so maybe take a look. But if you have other ideas, go ahead. I haven't done a lot of research about the subject.

If you have any questions about the project, or you want to share progress on this matter, feel free to post here! Good luck 👍

 avatar commented on August 15, 2024

So I just noticed I can't search the Rust reference anymore, which isn't really helpful.

Looking at already existing search frameworks has resulted in these:

Using Tantivy would require compiling to asm.js (I guess around 10 MB), or rewriting the search functionality in JS. And the search index needs to be deployed with the book.

With the two JS variants the situation is a bit different: The index could be generated during compilation with node.js, or when the book is opened.

I personally prefer the last option. It saves space for the Index, does not introduce additional build dependencies and the search framework can be easily updated.

azerupi avatar azerupi commented on August 15, 2024

So I just noticed I can't search the Rust reference anymore

I wasn't aware there was a search functionality in the old reference? Or was it just the browser's Ctrl-F? Because in that case there is a workaround: the 'print' icon in the upper right corner redirects to a page where all chapters are displayed on one page that can be Ctrl-F'ed.

Granted, it's not intuitive and a real search functionality would be better :)

steveklabnik avatar steveklabnik commented on August 15, 2024

@azerupi you are correct on all counts.

crumblingstatue avatar crumblingstatue commented on August 15, 2024

Just here because I also ran into rust-lang/reference#73.

lilianmoraru avatar lilianmoraru commented on August 15, 2024

I wanted to open this same issue...
The reference has become very hard to use (trying to abstain from harsher words). I just use the Rust 1.12 reference manual.
I did not think of the print icon (usually this creates a PDF file or something like this), but it is definitely not intuitive, and I expect most people will not reach for it.
I use the reference manual very often to find what kind of configs I can use (target_os, etc.).

budziq avatar budziq commented on August 15, 2024

Hi @crumblingstatue and @lilianmoraru, search is certainly somewhere on the wishlist, but currently we have no ETA for such a feature. If anyone is willing to start working on it (or even start a discussion about a possible implementation), they are very welcome!

cetra3 avatar cetra3 commented on August 15, 2024

There are two paths I can see with this implementation.

Use Lunr JS in the front end

The first is simply using lunr.js and doing everything in the frontend (pretty inefficient as the site gets bigger). This is something that mkdocs does already.

Process would be to generate a JSON document that lunr.js can load and use to provide search capabilities. This would just be an extra output type.

Use Tantivy in the back end

The second would be to support a server-based search implementation. This would mean though that mdbook wouldn't be a purely static output, rather it would require a server to serve requests. Something like tantivy would be a perfect fit. Taking it a step further maybe compiling to WASM so you could run it on the front end, although I don't know how feasible that would be.

Process here would be maybe to switch it on as a flag, but have it running in the backend as a server, and have tantivy reindex on build.

I think that tantivy would provide more performant results than lunr js but would no longer be a purely static output.

budziq avatar budziq commented on August 15, 2024

Unfortunately, introducing a dynamic backend as a requirement would be a no-go for me in terms of rust-cookbook maintenance 😞.

I would also go with Lunr.js. It seems really snappy on the large gitbooks (on par with or larger than rust-book) that I've tested so far (both when searching the index and with full-text search). I think that we can think about optimization once we encounter any significant performance problems.

cetra3 avatar cetra3 commented on August 15, 2024

Yep, I figured that a dynamic backend wouldn't be appropriate, but put it out there as a possible option. I still like the WASM idea though! Don't know if that is feasible at this stage, but would be awesome.

Is NPM/Node already a requirement for building using mdbook? I couldn't see a package.json anywhere. If so, you can use lunr.js to prebuild an index which might speed things up a little bit.

azerupi avatar azerupi commented on August 15, 2024

Is NPM/Node already a requirement for building using mdbook?

It is not required for building mdBook, but it is required as a dev dependency to compile the stylus files.

you can use lunr.js to prebuild an index which might speed things up a little bit.

I would definitely build the index at build time. Is Lunr required for this or is it simple enough to do it ourselves with Rust?

cetra3 avatar cetra3 commented on August 15, 2024

One option is to supply a list of documents in JSON format and have lunr build the index in the browser. This is simple because the JSON is basically just field and value, but it does mean it is a bit slower.
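
To make the "just field and value" shape concrete, here is a minimal Rust sketch that writes such a document list by hand using only the standard library. The `Doc` struct and the field names `id`, `title` and `body` are assumptions for illustration; a real implementation would more likely use a JSON library than hand-rolled escaping.

```rust
/// One searchable document: a reference plus the fields to index (hypothetical shape).
struct Doc<'a> {
    id: &'a str,
    title: &'a str,
    body: &'a str,
}

/// Escapes a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be written as \u00XX.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Serializes the documents as a JSON array of flat field/value objects.
fn docs_to_json(docs: &[Doc]) -> String {
    let items: Vec<String> = docs
        .iter()
        .map(|d| {
            format!(
                "{{\"id\":\"{}\",\"title\":\"{}\",\"body\":\"{}\"}}",
                json_escape(d.id),
                json_escape(d.title),
                json_escape(d.body)
            )
        })
        .collect();
    format!("[{}]", items.join(","))
}
```

The browser side would then fetch this file and feed each object to the search library, which is where the extra startup cost comes from.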

Building the index ahead of time seems a bit less trivial, and while doable, I think would involve reimplementing quite a lot of lunr js stuff. There may be an existing rust library out there that would be able to do it, although nothing I know of.

azerupi avatar azerupi commented on August 15, 2024

If there is some sort of "specification" of the JSON index that Lunr uses we can decide if we construct it ourselves or not. I am under the impression that it could be relatively easy to do with Serde. We could then release it as a separate crate that anyone can use, I'm sure other people would find it useful :)

steveklabnik avatar steveklabnik commented on August 15, 2024

One thing that I'd like to consider, just generally, is a way to do this separate from mdbook directly; that is, I have dreams of rustdoc integrating mdbook someday, and ideally search would work across all docs, not just the book. If there's a way to make that possible here, that'd be extra amazing. At the minimum, a way to turn mdbook's built in search off would enable this.

It's far-future work but I figured i'd mention it.

budziq avatar budziq commented on August 15, 2024

If there is some sort of "specification" of the JSON index that Lunr uses we can decide if we construct it ourselves or not.

@azerupi I've reached out to the lunr.js guys. The required work might be a little more involved than expected, but they are eager to help with understanding the fine details.
The JSON schema is described here

azerupi avatar azerupi commented on August 15, 2024

@budziq Thanks a lot for the initiative!
I am going to take some time to figure out the JSON schema. The idea of leveraging tantivy to generate the index is interesting. I'm not sure how feasible it is.

budziq avatar budziq commented on August 15, 2024

We could also think about stealing rustdoc's search code index generator, although it's not really general purpose.

steveklabnik avatar steveklabnik commented on August 15, 2024

We are basically not considering that a viable move forward for new rustdoc, as it's home-grown and not very good.

azerupi avatar azerupi commented on August 15, 2024

@steveklabnik what solution is considered in the new rustdoc? It might be better to join forces on this.

steveklabnik avatar steveklabnik commented on August 15, 2024

We haven't gotten that far yet; joining forces would be a great idea.

budziq avatar budziq commented on August 15, 2024

@mattico Awesome!

cc @Phaiax

mattico avatar mattico commented on August 15, 2024

@olivernn no big grand reason.

I liked that elasticlunr supported boolean queries, query-time boosting, and that its indexes are smaller. It uses a standard Inverted-Index which was easier to understand, though recursive trees like that are always a bit of a pain in Rust.
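
For readers unfamiliar with the term, a minimal inverted index can be sketched in Rust as below. This is a simplification using a flat map from term to postings rather than the recursive tree mentioned above, and the names `InvertedIndex`, `add_document` and `search_all` are hypothetical, not taken from elasticlunr or its port.

```rust
use std::collections::HashMap;

/// term -> (document reference -> term frequency)
type InvertedIndex = HashMap<String, HashMap<String, u32>>;

/// Lowercases the text and splits it on anything that is not alphanumeric.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Records every term of `text` under the document reference `doc_ref`.
fn add_document(index: &mut InvertedIndex, doc_ref: &str, text: &str) {
    for term in tokenize(text) {
        *index
            .entry(term)
            .or_default()
            .entry(doc_ref.to_string())
            .or_insert(0) += 1;
    }
}

/// Returns the sorted references of documents containing every query term (an AND query).
fn search_all(index: &InvertedIndex, query: &str) -> Vec<String> {
    let mut result: Option<Vec<String>> = None;
    for term in &tokenize(query) {
        let docs: Vec<String> = index
            .get(term)
            .map(|postings| postings.keys().cloned().collect())
            .unwrap_or_default();
        // Intersect with the documents that matched the previous terms.
        result = Some(match result {
            None => docs,
            Some(prev) => prev.into_iter().filter(|d| docs.contains(d)).collect(),
        });
    }
    let mut out = result.unwrap_or_default();
    out.sort();
    out
}
```

The stored term frequencies are what a scoring step would use for ranking; this sketch only answers which documents match.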

On the other hand, its documentation is not as good, it's not as popular (I assume), and none of the above features are super important for this use-case. Having a documented JSON schema is also a point for Lunr.

If I were going to do it all again I'd probably go with Lunr. I may still fork this to create lunr-rs. Unless you beat me to it 😉 .

budziq avatar budziq commented on August 15, 2024

@olivernn @mattico @Phaiax I'm thrilled that you are all interested in this super important feature. I'm basically game with whatever you decide on using 👍

budziq avatar budziq commented on August 15, 2024

awesome!

Phaiax avatar Phaiax commented on August 15, 2024

I did a first stab at the UI, please comment. For demonstration purposes, it indexes the current visible page with elasticlunr.js and searches that.
https://phaiax.github.io/mdBook/cli/init.html

  • S is the search hotkey
  • Currently uses 'AND' for the search criteria. 'OR' would also be fine because of the scoring, but it gives way more results.
  • I chose <h1>/<h2>/<h3> as the document dividing criteria for indexing, but we should rethink that maybe. (But that's for the offline indexing anyway)
  • It finds words if they are not completely written, e.g. 'gen' finds 'generate'. I think this is better, but that behaviour can be configured.
  • Should it limit number of results?

Todo:

  • Some kind of ajax hot reload for results on some page other than the current page. (Does this maybe conflict with the header.hbs idea?)
  • Display the chapter path: 'Command Line Tool -> Init'

azerupi avatar azerupi commented on August 15, 2024

Nice! Very cool :)

Displaying the search results in the sidebar is pretty cool, but it limits the number of results we can display at any one time. I'm also not sure if it will be usable on mobile.

What about something like rustdoc or having a dropdown list with the results? This would give a little more playing room to display the results in a comprehensive way.

Should it limit number of results?

If we do something like rustdoc (or a dropdown) we can set the limit to a high value and provide a scroll. I suppose there should be a limit to avoid really irrelevant results, but depending on the design that limit can be set to 20+ results.

Some kind of ajax hot reload for results on some page other than the current page. (Does this maybe conflict with the header.hbs idea?)

I'm not sure I understand what you are proposing here. Could you elaborate? :)

Display the chapter path: 'Command Line Tool -> Init'

Yes, that could be useful. I suppose this would rely on the search engine to provide this info along with the results? Is that a feature they provide (lunr in particular, since that is what we settled on)?

Otherwise, really great job! It looks very promising 👍

mattico avatar mattico commented on August 15, 2024

@Phaiax go for it! I'm not very experienced with web stuff.

Having each section be a separate document does seem to be the way to go. With the way fields are added to documents, you can't really support subsection headings unless each section is a document.
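
A minimal Rust sketch of the "each section is a document" idea, assuming the split happens on the Markdown source at `#`, `##` and `###` headings rather than on rendered HTML. The `Section` type and `split_sections` function are hypothetical names for illustration.

```rust
/// A searchable section: its heading text and the body lines under it.
#[derive(Debug, PartialEq)]
struct Section {
    heading: String,
    body: String,
}

/// Splits Markdown text into sections at ATX headings of level 1 to 3.
/// Text before the first heading goes into a section with an empty heading.
/// Simplification: '#' lines inside fenced code blocks are not skipped.
fn split_sections(markdown: &str) -> Vec<Section> {
    let mut sections = Vec::new();
    let mut current = Section { heading: String::new(), body: String::new() };
    for line in markdown.lines() {
        let hashes = line.chars().take_while(|&c| c == '#').count();
        let is_heading = (1..=3).contains(&hashes) && line[hashes..].starts_with(' ');
        if is_heading {
            // Close the running section unless it is completely empty.
            if !current.heading.is_empty() || !current.body.trim().is_empty() {
                sections.push(current);
            }
            current = Section { heading: line[hashes..].trim().to_string(), body: String::new() };
        } else {
            current.body.push_str(line);
            current.body.push('\n');
        }
    }
    if !current.heading.is_empty() || !current.body.trim().is_empty() {
        sections.push(current);
    }
    sections
}
```

Deeper headings (level 4 and below) stay in the body of the enclosing section, so their text is still searchable but they do not become separate results.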

For results on other pages, I was thinking using history.pushState(). We could add url parameters that tell an in-page bit of JS where the search result is so it can scroll to and highlight it. Or maybe just ?search=append+search+here, and have it search the page again after navigation.

olivernn avatar olivernn commented on August 15, 2024

I suppose this would rely on the search engine to provide this info along with the results? Is that a feature they provide

Lunr will only return you whatever you defined the 'document' reference to be, though this can be used to look up a title or summary from some other data structure.

We could add url parameters that tell an in-page bit of JS where the search result is so it can scroll to and highlight it. Or maybe just ?search=append+search+here, and have it search the page again after navigation.

Sounds like a url fragment, e.g. #some-heading?

mattico avatar mattico commented on August 15, 2024

Sounds like a url fragment, e.g. #some-heading?

Sure, plus a span for highlighting: #some-heading?s=47&l=6 means "Start at the 47th character of the body text and highlight the next 6 characters".
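
The fragment format above could be decoded roughly as follows. This is a Rust sketch of the parsing logic only (in the book it would run as in-page JS), and the `HitTarget` type and `parse_fragment` function are hypothetical names.

```rust
/// Target of a search hit: the heading id plus an optional highlight span.
#[derive(Debug, PartialEq)]
struct HitTarget {
    heading: String,
    span: Option<(usize, usize)>, // (start, length) in characters of the body text
}

/// Parses a fragment such as "#some-heading?s=47&l=6".
/// The span is `None` unless both `s` and `l` are present and numeric.
fn parse_fragment(fragment: &str) -> HitTarget {
    let fragment = fragment.strip_prefix('#').unwrap_or(fragment);
    let (heading, query) = match fragment.split_once('?') {
        Some((h, q)) => (h, q),
        None => (fragment, ""),
    };
    let (mut start, mut len) = (None, None);
    for pair in query.split('&') {
        if let Some((key, value)) = pair.split_once('=') {
            match key {
                "s" => start = value.parse::<usize>().ok(),
                "l" => len = value.parse::<usize>().ok(),
                _ => {} // unknown parameters are ignored
            }
        }
    }
    HitTarget { heading: heading.to_string(), span: start.zip(len) }
}
```

With a missing or malformed span the page can still scroll to the heading and simply skip the highlight.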

lilianmoraru avatar lilianmoraru commented on August 15, 2024

@Phaiax This looks darn nice.
There is one issue though: it can search only within the same page (not the entire book).
Example: searching for "MathJax" on that first page would not find anything.

azerupi avatar azerupi commented on August 15, 2024

@Phaiax great job! That looks very slick!

@lilianmoraru yes, it is just a design proposal for now using the current page as a sample. We still need to build an index during the build and hook the two up. But this is a great step forward! :)

budziq avatar budziq commented on August 15, 2024

This looks and works awesome! 👍
The only suggestions I would have would be to:

  • limit the number of results (or make it configurable)
  • make the search highlight less permanent (currently there is no way to remove it other than page change or triggering empty search)
  • is there a way to prioritize the results matching chapter titles or headings?

Also, a slightly off-topic question: would there be a way to search by some metadata added in JS by hand (i.e. crate names in rust cookbook)?

cetra3 avatar cetra3 commented on August 15, 2024

For the snippets, have a look at how solr handles this: https://lucene.apache.org/solr/guide/6_6/highlighting.html

Phaiax avatar Phaiax commented on August 15, 2024

For the snippets, I chose my own algorithm. Understanding the complex solr highlighters would take too much time.

budziq avatar budziq commented on August 15, 2024

@Phaiax very cool!

Search highlight is now removed when Esc is pressed. Is that enough?

I'd probably want the highlight as unsticky as possible (possibly removed on any kind of interaction). But someone with actual UX knowhow should probably give us some insight. For now I'd be happy to just have a js func to clear the highlight.

We might consult other stakeholders.
cc @steveklabnik & @carols10cents do you have any thoughts on how the search in RBE / TRPL should behave UI/UX wise? Current POC by @Phaiax.

lilianmoraru avatar lilianmoraru commented on August 15, 2024

This doesn't need to be perfect from the beginning.
This already improves the experience significantly (especially for the Rust book and the reference).

budziq avatar budziq commented on August 15, 2024

This doesn't need to be perfect from the beginning.

Totally agree

cetra3 avatar cetra3 commented on August 15, 2024

What's the status of this? Ready for a PR or is there some more work to be done?

Phaiax avatar Phaiax commented on August 15, 2024

I am just waiting for #457, but that would not really be a blocker.
I also have to do some additional work on the index generation; maybe I will do that today.
@cetra3 if you want, you can review #472, which is the PR for this issue.

@azerupi Where are youuuuuuu? 😃

Edit: Sorry for the mails, it did not show the comments after clicking on 'comment'

cetra3 avatar cetra3 commented on August 15, 2024

Looks good! I have noticed a couple of things, but nothing major, and they are related to some other issues (#463 and mattico/elasticlunr-rs#2).

budziq avatar budziq commented on August 15, 2024

@Phaiax & @cetra3 The repo is currently looking for a new owner/maintainer. We (the collaborators) will try to help some more in the meantime. I'm especially interested in this PR and hope to start nudging it forward in the upcoming days.

Thanks!
