paulbricman / dual-obsidian-client Goto Github PK

View Code? Open in Web Editor NEW

240.0 13.0 7.0 28.03 MB

A skilled virtual assistant for Obsidian.

Home Page: https://paulbricman.com/thoughtware/dual

License: Mozilla Public License 2.0

JavaScript 10.20% TypeScript 43.51% CSS 46.29%

tools-for-thought chatbot zettelkasten second-brain non-linear-note-taking

dual-obsidian-client's People

Contributors

Stargazers

Watchers

Forkers

bensleveritt onlurking yangwao wahidmounir psionica leiyanhua akosbalasko

dual-obsidian-client's Issues

Add keyboard shortcut to focus on chat input box

Extend open dialogue regex surface with common question words

How/What/Why/When/Where should also trigger open dialogue, even if there's no question mark involved.

Make messages selectable

It would be nice to be able to copy paste from the chat.

Implement generation in new backend

Just like #48 and #49, a barebones endpoint which receives a prompt and returns text. The query itself is not interpreted on the backend side, as per #45. Should use a pretrained GPT-Neo 350M, with customization possibilities in the future.

Implement embedding cache.

In order for subsequent operations to be tractable on light hardware, a caching strategy should be used. The goal is to maintain a dictionary of precomputed embeddings for each file. This way, subsequent operations can simply load the embeddings without computing them again.

This cache manager should:

remove the embeddings of the files which have been deleted in the meantime
add new dictionary entries for files which have been recently created
update dictionary entries for files which have been recently updated

Embeddings should be based on the sentence-bert module, like in MemNav. The dictionary can be stored in a pickle in a hidden local folder.

Front matter should be ignored. And perhaps also headings. Some Markdown-specific module might already be able to sort this out.

Implement proper LaTeX parsing when copying snapshot and loading entry contents

It would be nice to turn LaTeX strings into more readable plain text, or remove it entirely. Or, perhaps, rendering it inside the responses!

Expand removeMd to cover more of Obsidian markup

I found some issues with .md removal. Here's what I used for testing:
Obsidian Cleanup testcases.md

Implement rule-based query parsing

A function should wrap around Persona's functionality and deliver results based on a parsed query. Should be similar to the lists of commands used by virtual assistants. Probably using simple regex rules.

Create instructions for running backend from source

Instructions for setting up the Python-based server should be added to the README. This is primarily aimed at development, because in production the server should be a binary being released.

Implement question answering

On top of files selected through a fluid search (#2) based on the query, a straight-forward question-answering pipeline is used here to answer user questions.

Make "Dual" name customizable

Some might want to name their Dual some other way.

Implement recipe matching in new frontend

Based on a call to the backend (#48) containing bundled example commands from all recipes, the first task of the new recipe engine is to determine which recipe has to be followed given a query. This can be a user query in the chat, or a query made by another recipe when composed. This behavior should be contained in a function which simply returns the recipe filename based on a given query.

issue with executing server

python 3.9
MacOS Catalina 10.15.7

not sure if this is related, but I couldn't find an essence.zip folder but I looked in the sample vault on the repo and made a copy of the config.json to the same directory on my other machine

Error deriveing the essence

Google keeps giving me this error:

ValueError                                Traceback (most recent call last)
<ipython-input-16-3fe4c666eb07> in <module>()
----> 1 output = trainer.train()

3 frames
/usr/local/lib/python3.7/dist-packages/torch/utils/data/sampler.py in __init__(self, data_source, replacement, num_samples, generator)
   102         if not isinstance(self.num_samples, int) or self.num_samples <= 0:
   103             raise ValueError("num_samples should be a positive integer "
--> 104                              "value, but got num_samples={}".format(self.num_samples))
   105 
   106     @property

ValueError: num_samples should be a positive integer value, but got num_samples=0

Implement fluid search in new backend

The new backend component (roughly housing the original skeleton and essence) will simply be a web server exposing a few generic endpoints for the NLP tasks involved, as described here. The most basic one of them is fluid search. Based on a query and a collection of documents, retrieve the closest ones in terms of meaning. No regex matching involved, something related will be taken up the frontend as described in #45. Caching would simply become additive, and only for the documents, not for the query.

Implement question generation

Question generation can enable the user's persona to quiz them on a subject. Mixed with knowledge probes, it can support reflection. Core module should simply return a batch of questions, leaving the dialogue up to the future wrapper interface.

Proposal for functionality changes and recipe framework design

Progress towards solving existing issues and setting up a proper roadmap had been slowed in the past days by the fear of prematurely settling on an architecture and API design given that this space of conversational interfaces over personal knowledge bases is quite unexplored.

The following describes a suggestion for heavily restructuring the functionality and the codebase, a tentative something in between a spec and a user story.

Architecture

Dual is based on two components: the backend and the frontend. The backend is a server which exposes two main endpoints:

/extract, which returns entries from one's knowledge base based on a natural language description, with some options
/generate, which generates text given a prompt, with some options

However, the user doesn't usually interact with the endpoints directly. Rather, they use recipes. Recipes tell Dual how to answer certain commands. They can be predefined, user defined, or contributed by some other user. Recipes are simple Markdown files with the following structure:

---
tags: "#dualrecipe"
pattern: "What is the answer to the ultimate question of life, the universe, and everything?"
---

42, naturally.

If the user has this recipe in their vault as a note, then whenever they ask their Dual that question, they'll get the contents of the note as an answer.

The pattern field of a recipe is a regex pattern. It can also house groups, which can then be referenced in the content.

---
tags: "#dualrecipe"
pattern: "My name is (.*)"
---

Hi there, \1!

With this recipe, if the user tells their Dual My name is John, it'll reply with Hi there, John!.

All this is cute, but not all that useful or interesting. Among the recipes there's also this predefined recipe:

---
tags: "#dualrecipe"
pattern: "Find a note which (.*)"
---

'''dual
GET "/extract/This text \1"
'''

Now, this is good old descriptive search, expressed as a recipe which makes use of the /extract endpoint. When asking Find a note which describes a metaphor between machine learning and sociology, it'll answer with a list of results based on that GET HTTP call made behind the scene to the endpoint.

But if you wanted to customize the command triggers even for this predefined command, you could just wrap a new recipe around it, or change the original one. Here's a wrapper recipe:

---
tags: "#dualrecipe"
pattern: "Yo show me a thing which (.*)"
---

Here ya go:

'''dual
ASK "Find a note which \1"
'''

Cool, you just made your Dual a bit edgier.

So this is how you can express good old descriptive search and fluid search as recipes. What about good old open dialogue?

---
tags: "#dualrecipe"
pattern: "^(([Ww]hy|[Ww]hat|[Ww]hen|[Ww]here|[Ww]ho|[Hh]ow).*)"
---

'''dual
GET "/extract/This text is about \1"
'''

Q: \1
A:

'''dual
GET "/generate/"
'''

Now, when you ask it a question with that structure, Dual assembles the relevant notes in there, composes the prompt further with your query, and then generates the response. Good old open dialogue, but expressed as a recipe. Every command becomes a customizable recipe.

Now you want to teach your Dual to come up with writing prompts, you create this recipe:

---
tags: "#dualrecipe"
pattern: "^[Cc]ome up with a writing prompt\.?"
---

prompt: A sentient being has landed on your planet and your civilization's military has confronted it at the landing site of its ship. You are sent closer as a mediator and encounter a mass of energy that has no form but communicates with you in your language.

prompt: Your spaceship has landed on an unknown planet and there is data showing lifeforms who have created artistic structures. There is an artist in your group who wants to make first contact with the beings through art.

prompt: We discover that beneath its seemingly uninhabitable appearance, Mars has an entire race of subterranean alien lifeforms living on it. You are part of the team sent to explore this civilization.

prompt: 

'''dual
GET "/generate/"
'''

You ask it Come up with a writing prompt and you get some in return.

Sure, there are technicalities. The note contents until the generate call should be piped into it as the prompt. The endpoints are shorthand for localhost:5000/..., but you could perhaps change them to refer to a hosted instance at some point in the future. You could make calls to other people's instances through recipes. You could tap into any API through a recipe, turning Dual in a sort of conversational hub. Regex groups have to be entered when making calls. URL's have to be encoded properly because they contain text. Extract calls should know if to supply filenames or contents, through parameters probably. What should a recipe return, the entire contents or the result of the last call? Perhaps a metadata setting. A bunch of things still to settle on.

Remove code snippets from export

Pretty self-explanatory. A regex like the one for front-matter should help. Together with bullet points, subheadings. Are those tackled by beautifulsoup?

Add Dual command to start server inside Obsidian

Add a Dual: Start Server command palette to run python server.py --path [current vault folder] under the hood.

Bundle skeleton in a self-contained binary

Not sure what's the best way to go.

Docker + PyInstaller + Wine spitting out clean binaries for Linux/Windows sounds somewhat doable.
Or somehow turning Docker containers into binaries themselves? Those would be huge.
Several users contributing their binaries using PyInstaller on their own OS?

Implement cache sync as a first step for all commands

It's likely that a user might edit notes while Persona is still running, and interacting with it in the meanwhile. Each one of the four actions must start with calling a cache sync.

Fix regex conflict between fluid and descriptive search

There seems to be some conflict between regex strings for fluid and descriptive search. Some descriptive search queries are interpreted as fluid search. They should be ironed out.

Properly parse wikilinks from note contents

There's a pretty high chance wikilinks aren't considered in the md_to_text function in util.py. Apparently python-markdown has a solution.

Add option in alignment utility to use smaller model

Older machines have a hard time accommodating the GPT-2 medium model, the small or distilled one might be good alternatives.

Specify target file structure for setup

The essence.zip location is confusing for many.

Implement topic search

Fluid search would perform a semantic search as in MemNav using the precomputed embeddings in the cache. Some parameters should define how many items are selected in each of the two passes. Filenames should be returned.

torch.embedding IndexError: index out of range in self

File "...\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\torch\nn\functional.py", line 1916, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

This happens sometimes on Windows in open dialogue. It only pops up with specific questions. My working hypothesis is that nasty characters in retrieved notes hurdle the generation process.

Implement related search

Very similar functionality to fluid search (#2). However, instead of having a search query, this would take in a filename, strip away front matter and perhaps headings, compute its embedding, and treat that as the search query embedding. Should return filenames, just like fluid search.

Implement early stopping in alignment utility

1000 steps might not be optimal for all vault sizes.

Reimplement question answering and question generation through text generation

Text generation is specifically trained for causally generating natural-sounding text. Therefore, it's a much better choice for question answering and generation, which can be obtained through some prompt engineering.

Switch models to GPT-Neo versions

Similar models but with higher performance as they've been trained on more data. Hopefully they're still fine-tunable in a Colab notebook, at least the medium one.

Implement code block detection in new frontend

In order to actually follow recipes, the engine needs to be able to pick up on:

```dual
```

and

```js
```

code blocks and interpret them accordingly, as described in #45. Regex patterns should do, and a function would return an array of the beginning and end character numbers of all such blocks for later interpretation.

Add option to persist convo

Reloading Obsidian currently clears the conversation, an option could persist it. That would require an additional local store somehow... Perhaps in a text file which would also be human-readable.

Document capabilities via command cheat sheet

A list of features, coupled with sample queries and dummy responses should be documented to guide the #7 and eventually #8.

Change "Persona" to "Dual"

In terms of:

UI
repository settings
documentation
code

python3 -m pip install -r requirements.txt returns error

It seems to get stuck at installing regex

Here's the screenshot of the error.

Essence download file location is not open by default

When opening the file structure, users see the whole Linux filesystem.

FIle path slashes should adapt to host OS

Some paths must be hardcoded and it doesn't work well apparently:

Improve preprocessing for snapshot

Image metadata and code blocks clutter the results.

Add sample commands to documentation

Just like for a virtual assistant.

Implement sentence-bert NLI

The sentence-bert NLI implementation enables easy access to individual NLI logits. This can be used for taking into consideration the neutral logit, in contrast with the HuggingFace pipeline implementation. Additionally, some prompt engineering is required.

Implement argument parsing in new frontend

Based on arguments detected in #52, such as *person* or *topic*, the values have to be extracted from the user query using text generation as describer here. Argument names and the query should go into a function, and a dictionary with the proper value attributions should come out.

Implement local server exposing API

A Flask-based API should expose an endpoint for receiving text-based commands and delivering the results via the response. Makes use of the rule-based query parser (#7).

Skeleton should take into account notes placed in subfolders

As of now, it only considers notes placed in the vault root.

Prepare backend for being deployed as binary

The backend component should eventually be a self-contained binary with the following behavior:

It should first expose a limited API which can be used to get a snapshot.
After managing that, it should download and load the two auxiliary models used, the ones for fluid and descriptive search, respectively (#2, #5). The point is that while working behind the scenes to download those, the user can start using the alignment notebook.
The aligned model can be loaded if present, otherwise not. But this should change the behavior of the open dialogue function. It should essentially return more instructions, although those would have been presented initially, too.

Turn filenames into clickable links to open notes

Right now they aren't even selectable.

Online notebook crashes with huge snapshots

User reports 70K lines as an upper limit. Conversely, small snapshots should be met with a warning.

Implement argument detection in new frontend

As described here, recipes contain fields such as *topic* or *person*. Those should simply be extracted into an array, for later processing by the recipe engine, following #51. The contents of a recipe should go in, and out come a list of such argument names.

Implement descriptive search in new backend

Similar to the refactoring for #48, descriptive search would need a barebones endpoint which receives a query and a collection of documents. No regex involved, as that is taken up by the frontend in #45. Highest scores of document-query entailment are returned.

Add raw whitelist/blacklist mechanism for notes being considered

Both for fine-tuning and live retrieval.