Hi, I'm Piggy, a student software engineer from the south west of Australia.
piggypiglet / docdex
JSON API & Discord Bot for Javadocs
Home Page: https://docdex.helpch.at
License: MIT License
Currently, the bot simply posts a link to the web version of the javadoc if the requested description etc. is too long. In my opinion, some sort of preview that is truncated to fit the Discord embed limits would be a good addition. The text could simply be cut off and ... or something could be appended. This could result in ugly styling issues (e.g. if this happens in the middle of a code block), so a more sophisticated strategy might be worth discussing.
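One possible shape for such a preview, as a minimal sketch rather than the bot's actual code: cut the text to a caller-supplied limit, append an ellipsis, and close a code fence if the cut lands inside one. The class and method names are hypothetical, and the limit is a parameter because the relevant Discord cap depends on which part of the embed is being filled (4096 characters for an embed description, to my knowledge).

```java
// Hypothetical helper; not part of docdex.
final class EmbedTruncator {
    // Built with repeat() so the marker itself never appears literally in this snippet.
    private static final String FENCE = "`".repeat(3);
    private static final String ELLIPSIS = "...";

    /** Returns text unchanged if it fits, otherwise a preview of at most limit characters. */
    static String truncate(String text, int limit) {
        if (text.length() <= limit) {
            return text;
        }
        // Reserve room for the ellipsis plus, in the worst case, a newline and a closing fence.
        int budget = limit - ELLIPSIS.length() - FENCE.length() - 1;
        if (budget < 1) {
            throw new IllegalArgumentException("limit too small: " + limit);
        }
        int end = budget;
        // If the cut would split a run of backticks, drop the whole run instead of leaving half a marker.
        if (text.charAt(end) == '`') {
            while (end > 0 && text.charAt(end - 1) == '`') {
                end--;
            }
        }
        String cut = text.substring(0, end);
        // An odd number of fence markers means the cut ended inside a code block.
        boolean insideFence = countFences(cut) % 2 == 1;
        return insideFence ? cut + "\n" + FENCE + ELLIPSIS : cut + ELLIPSIS;
    }

    private static int countFences(String s) {
        int count = 0;
        int index = s.indexOf(FENCE);
        while (index >= 0) {
            count++;
            index = s.indexOf(FENCE, index + FENCE.length());
        }
        return count;
    }
}
```

This only balances fenced code blocks. Inline code, bold markers, and links cut in half would still need the more sophisticated strategy mentioned above.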
Are slash commands planned / would a slash commands PR be merged?
If you use the d;help command, it gives you the ability to flick between pages using reactions, which is great. A suggestion would be to have a way to know which page you are currently viewing. This could be done by only removing the second reaction when a new page is selected, or by showing the page number in the message itself.
Javadocs hosted by the official docdex instance, https://docdex.helpch.at & https://piggypiglet.me/docdex. Please request new javadocs here, by leaving the name, and a link to the javadoc jar (preferably from a maven repo). I need the jar as these docs are hosted locally. I'll react to your message with a thumbs up when I've added your javadoc.
I'd raise a PR myself, but you would need to set up the additions on your server anyway; considering you host everything yourself, a PR would be a bit useless.
Currently when searching, a similarity/distance metric has to be generated against the query and each individual element stored in memory for that javadoc & type (unless there's a direct match). The number of searches increases drastically if it's a method, as it has to scan for methods AND parameters. If parameters are provided and they aren't "full" (i.e. name + type), it has to sort 5 lists just to get a result. This can result in the route taking seconds to return a result, which is far too long. I'm looking into an alternative, specifically the symmetric delete spelling correction algorithm (SymSpell). The problem I have with this is that it uses edit distances (however, virtually all of these data structures do, e.g. BK-trees), instead of finding the best match even if that match is far away. Perhaps I'll need to add a "No results were found" error if no results were returned within a reasonable (configurable) edit distance.
Additionally, the SymSpell Java implementations don't use Jaro-Winkler, which is my preferred algorithm for searching right now. I'll need to maintain my own implementation, I suppose.
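To illustrate the "no results within a configurable edit distance" idea, here is a minimal sketch. It is a plain linear scan using Levenshtein distance, so it is neither SymSpell nor Jaro-Winkler and does not address the performance problem; it only shows the threshold behaviour. All names are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; not part of docdex.
final class BoundedSearch {
    /** Classic two-row dynamic-programming Levenshtein distance. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the closest candidate, or empty if nothing lies within maxDistance edits. */
    static Optional<String> closest(String query, List<String> candidates, int maxDistance) {
        String normalized = query.toLowerCase(Locale.ROOT);
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : candidates) {
            int distance = levenshtein(normalized, candidate.toLowerCase(Locale.ROOT));
            if (distance < bestDistance) {
                bestDistance = distance;
                best = candidate;
            }
        }
        return bestDistance <= maxDistance ? Optional.of(best) : Optional.empty();
    }
}
```

An empty result is what would map to the "No results were found" error, with maxDistance read from configuration.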
Ability to select the default providing docs within a discord server
not sure why
Title explains it
Documentation commands are breaking with the error message:
Something went very wrong, me.piggypiglet.docdex.documentation.IndexURLBuilder@<hashcode>
Some other commands are failing with no error message, with the command message just being deleted, not sure if that's related, though.
Tested in the Java Community server.
An auto updater is essential as the javadoc list continues to grow. It's unreasonable to expect me to manually update these javadocs at every new release, and it's also unreasonable to expect javadoc owners/requesters to have to message me every time they update. Therefore, a program to automatically update the javadocs is necessary as this project starts gaining traction. Currently the plan is to have a relatively simple system, with different strategies for fetching the latest javadocs from different javadoc download locations. The initial implementation will be a "maven latest" strategy, which will download the javadoc based on the info in the maven-metadata.xml of a repository. Once it has the jar, it'll place it in the documentation webroot directory and unpack it, making sure to also assign correct permissions. Additionally, it'll seize control of the docdex config, making sure it stays up to date, i.e. this updater will be configured manually, not docdex. It'll also link into Pterodactyl to restart docdex, as that's where it's currently hosted. Obviously this updater is extremely coupled to my particular setup for the public docdex instance, so it's not really of any use to the public.
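A rough sketch of what the "maven latest" lookup could involve, assuming the standard maven-metadata.xml layout (a versioning element containing release and/or latest) and the conventional artifactId-version-javadoc.jar file name. The class and method names are hypothetical, and preferring release over latest is my assumption, not something stated above. Snapshot versions, which use timestamped file names, are not handled.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper; not the actual updater.
final class MavenLatest {
    /** Extracts the newest version from the contents of a maven-metadata.xml file. */
    static String latestVersion(String metadataXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Metadata comes from remote servers, so refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(metadataXml.getBytes(StandardCharsets.UTF_8)));
            // Assumption: prefer <release>, fall back to <latest>.
            for (String tag : new String[] {"release", "latest"}) {
                NodeList nodes = doc.getElementsByTagName(tag);
                if (nodes.getLength() > 0) {
                    String version = nodes.item(0).getTextContent().trim();
                    if (!version.isEmpty()) {
                        return version;
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not parse maven-metadata.xml", e);
        }
        throw new IllegalStateException("no <release> or <latest> element found");
    }

    /** Builds the conventional javadoc jar URL for a released (non-snapshot) version. */
    static String javadocJarUrl(String repoUrl, String groupId, String artifactId, String version) {
        String base = repoUrl.endsWith("/") ? repoUrl : repoUrl + "/";
        return base + groupId.replace('.', '/') + "/" + artifactId + "/" + version
                + "/" + artifactId + "-" + version + "-javadoc.jar";
    }
}
```

Downloading the jar, unpacking it into the webroot, fixing permissions, and restarting the server are deliberately left out, since those parts are specific to the hosting setup described above.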
There's not actually anything wrong with the performance of sorting at the moment, but science isn't about why, it's about WHY NOT, and for science I say we research the fuck out of this and find the best way to sort.
Here are the ideas that came up in the discord convo today:
see jdabuilder#jdabuilder for an example.
I tried to make a real ORM to use, but couldn't quite get it done due to its complexity. I want to revisit this issue in the future, but I'm worried that if I spend too much time on it right now with the current rate of progress, I'll lose motivation for the project. Therefore, I'm leaving it behind in favour of the more manual (but still effective) approach I'm implementing now.
Last commit with it: 6cbe6cd
I'll probably come back to it for RPF, which would benefit greatly from an orm like the one I was attempting to implement. Sadly, such a thing is simply not needed for docdex.
The bot currently extracts description and other textual data from Javadoc as plain text. It would be nice if it converted the HTML used in Javadoc to Markdown, the markup format used by Discord.
This would include emphasis, strong emphasis, inline code, code blocks and hyperlinks, which can all be displayed nicely in Discord's flavour of Markdown. This would also fix current ugly properties such as spaces being the only whitespace or lists being stripped away entirely (even though Discord doesn't render Markdown lists, it would still look better).
If you're looking for an HTML to MD converter, I can recommend flexmark, specifically the flexmark-html2md-converter module.
Note: This would also resolve #17 .
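To make the request concrete, here is a deliberately naive, dependency-free sketch of the kind of mapping meant here. It is a stand-in for illustration only and not how flexmark works: it uses regular expressions, ignores nesting and attributes other than href, and the class name is hypothetical. A real converter such as the flexmark module mentioned above would be the sensible choice in practice.

```java
// Hypothetical helper; a naive illustration, not a replacement for a real HTML-to-Markdown converter.
final class JavadocMarkdown {
    // Built with repeat() so the marker itself never appears literally in this snippet.
    private static final String FENCE = "`".repeat(3);

    static String toMarkdown(String html) {
        String md = html;
        // <pre> (optionally wrapping <code>) becomes a fenced code block.
        md = md.replaceAll("(?is)<pre>\\s*(?:<code>)?(.*?)(?:</code>)?\\s*</pre>",
                "\n" + FENCE + "\n$1\n" + FENCE + "\n");
        // Inline code, strong emphasis, emphasis.
        md = md.replaceAll("(?is)<code>(.*?)</code>", "`$1`");
        md = md.replaceAll("(?is)<(?:b|strong)>(.*?)</(?:b|strong)>", "**$1**");
        md = md.replaceAll("(?is)<(?:i|em)>(.*?)</(?:i|em)>", "*$1*");
        // Hyperlinks: keep the link text and the href.
        md = md.replaceAll("(?is)<a\\s+[^>]*href=\"([^\"]*)\"[^>]*>(.*?)</a>", "[$2]($1)");
        // List items become dash-prefixed lines so lists are not lost entirely.
        md = md.replaceAll("(?i)<li>\\s*", "\n- ");
        // Drop any remaining tags, then unescape the most common entities (&amp; last).
        md = md.replaceAll("<[^>]+>", "");
        return md.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
    }
}
```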
When using something like d;spigot NamespacedKey#minecraft it turns into an invalid query because on HelpChat, #minecraft is a channel and DocDex takes the message literally. If possible, ignore such formatting (doesn't JDA have a method to get the raw message, with #minecraft instead of <#382856648064237568>?).
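As far as I know, JDA's Message#getContentDisplay() is the method that resolves mentions to the names shown in the client, while getContentRaw() keeps the <#id> form. The sketch below shows the same idea without depending on JDA: it rewrites channel mentions using a lookup map that stands in for the guild's channel list. The class name and the map are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the map stands in for a real channel lookup.
final class ChannelMentions {
    private static final Pattern CHANNEL_MENTION = Pattern.compile("<#(\\d+)>");

    /** Replaces <#id> mentions with #name where the id is known, leaving unknown ids untouched. */
    static String resolve(String raw, Map<String, String> channelNamesById) {
        Matcher matcher = CHANNEL_MENTION.matcher(raw);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = channelNamesById.get(matcher.group(1));
            String replacement = name == null ? matcher.group() : "#" + name;
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

With the HelpChat channel from the report, resolve("d;spigot NamespacedKey<#382856648064237568>", Map.of("382856648064237568", "minecraft")) gives back the query as the user typed it.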
Adding this will also fix #7
Was doing some testing, discovered that the limit of responses can affect the nature of the response.
For example, https://docdex.helpch.at/index?javadoc=jdk&query=map~getordefault(key,%20defaultvalue)&limit=10 will return one result, an exact match - as it should.
However, if you take the 0 off the end, https://docdex.helpch.at/index?javadoc=jdk&query=map~getordefault(key,%20defaultvalue)&limit=1 returns one match, but it is not an exact match, nor the same result as the query with a limit of 10.
d;snakeyaml Yaml#loadAs
The command above fails to send the doc, while its counterpart
d;snakeyaml Yaml#load
works fine. This is likely due to the sneaky /./ in this:
https://docdex.helpch.at/index?javadoc=snakeyaml&query=yaml~loadas
Don't keep spamming error messages; just cancel the task after a single error. For example, when a non-matching result is returned and the message is paginated, the bot proceeds to send 6 errors because it can't add the 6 reactions.
Add a command like d;constructors which will display all constructors of a class. An alternative to this would be to make d;<javadoc> ClassName#ClassName return all constructors and not just one if the class has an empty constructor.
ItemStack has an empty constructor, so the rest are ignored.
To see all constructors, you have to use this:
It could be helpful to categorize the list of Javadocs when using d;docs
My suggestions:
Java for JDK docs
Spigot for the Spigot docs
PaperMC for the PaperMC docs
Plugins for the various plugin docs
Other (or maybe Misc) for things like the commons docs
These javadocs aren't getting indexed for some reason:
e.g. the example in plotsquared-bukkit's ChunkCoordinator
not sure why I made it like that
Currently docdex resolves all documentation to the parent objects. This leads to the explicit implementation JavaDoc not being accessible.
For example:
d; jdk Set#add returns the JavaDoc for Collection#add.
When querying for, for example, sub-interfaces of a class (something straight up unpossible), nothing is returned to discord.
Would make a nicer issue but uuuh ehem lazy
d;super_interfaces paper player
not working
pretty self explanatory
Had this conversation with the DocDex bot recently:
It would be ideal if, given a limit of n, the bot built an embed around an exact match when one is found, instead of returning the same query in a list of suggestions. I had thought that giving the bot a limit would cause it to create several paginated embeds with the first result first. Even if this is not the case, when an exact match is found but a limit other than 'first' is given, the error should probably not say that no match was found.
Some (not all) mismatched method queries suggest certain methods multiple times when there is only one of them
Spigot's ServicesManager only contains one load method yet it shows multiple:
The JDK's Map class contains in reality a single get method:
Make the pagination buttons accessible only by the message author (maybe add an option to enable / disable this?)