ukitinu / markov-words
Word generation using Markov chains and customisable data
License: The Unlicense
Logs should go to file only; I think this is the only sensible option for a command-line tool.
Every command, with its options, and every exception should be logged. I can't think of anything else of note at the moment.
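A minimal sketch of file-only logging, assuming java.util.logging (the project may well use a different logging library). The class name, logger name and helper methods below are hypothetical, made up for illustration only.

```java
import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical helper, not part of the project.
final class FileOnlyLogSketch {

    /** Builds a logger that writes to logFile and never to the console. */
    static Logger create(String logFile) throws IOException {
        Logger log = Logger.getLogger("markov-words-sketch"); // assumed name
        log.setUseParentHandlers(false); // do not forward records to the root console handler
        FileHandler handler = new FileHandler(logFile, true); // true = append to an existing file
        handler.setFormatter(new SimpleFormatter());
        log.addHandler(handler);
        return log;
    }

    /** Logs the command exactly as typed, options included. */
    static void logCommand(Logger log, String[] args) {
        log.info("command: " + String.join(" ", args));
    }

    /** Logs an exception together with its stack trace. */
    static void logFailure(Logger log, Throwable t) {
        log.log(Level.SEVERE, "unhandled exception", t);
    }
}
```

Calling logCommand once at the top of the entry point and logFailure in a single top-level catch block would cover both requirements without scattering log calls around.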
Things to validate:
... anything else?
The syntax I'd like to have is markov-words command options... (using picocli subcommands).
A few ideas on how to improve word generation.
At the moment, a word ends as soon as a word end char is written, regardless of its length. While this is a consequence of the Markov property, it may also lead to the generation of silly words and is the reason for the write.max_length property. It could be a good idea to store data on word length (still not sure how). Its usage should be optional.
Allow the user to chain multiple words, that is, don't stop word generation at the first word end char created, but at the n-th, with n user-defined. This should be almost free, as n-gram generation was already coded to support it.
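The chaining idea can be sketched as follows. This is only an illustration under simplifying assumptions: a first-order chain held in memory instead of the project's n-gram files, '_' as the word end marker, and a length cap standing in for write.max_length. ChainedWordSketch and generate are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch, not the project's generator.
final class ChainedWordSketch {
    static final char WORD_END = '_'; // assumed marker

    /**
     * Walks the chain until `words` word end chars have been produced
     * or the output reaches `maxLength` characters.
     * `chain` maps a char to its possible successors; repeated entries act as weights.
     */
    static String generate(Map<Character, List<Character>> chain,
                           int words, int maxLength, Random rnd) {
        StringBuilder out = new StringBuilder();
        char current = WORD_END; // the word end marker doubles as the start state
        int ended = 0;
        while (ended < words && out.length() < maxLength) {
            List<Character> next = chain.get(current);
            if (next == null || next.isEmpty()) {
                break; // dead end: no known successor
            }
            current = next.get(rnd.nextInt(next.size()));
            if (current == WORD_END) {
                ended++;
                if (ended < words) {
                    out.append(' '); // separate chained words
                }
            } else {
                out.append(current);
            }
        }
        return out.toString().trim();
    }
}
```

With the deterministic chain _ → a → b → _, asking for two words yields "ab ab", and one word yields "ab". Since the only change from single-word generation is the counter on word end chars, this supports the guess that the feature is nearly free.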
A filename cannot contain the unit separator char (U+001F), so another solution has to be found.
Maybe I could go back to underscore as WORD_END and disallow most punctuation from alphabets (or at least that symbol)?
Implement the last piece of the CLI, in the Main class.
In every dictionary I created, _.dat always has a _1; in its list of grams, which may cause the generation of empty words.
Although improbable, it may happen, especially in small dictionaries, so it should be fixed.
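One possible fix, sketched under assumptions: when choosing the first gram of a word, drop the word end entry before the weighted pick. The map below stands in for the parsed content of _.dat (gram to count); the names FirstGramSketch and pickFirst, and the '_' marker, are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch, not the project's code.
final class FirstGramSketch {
    static final String WORD_END = "_"; // assumed marker

    /** Weighted pick among the grams that may start a word, ignoring WORD_END. */
    static String pickFirst(Map<String, Integer> startGrams, Random rnd) {
        Map<String, Integer> usable = new LinkedHashMap<>(startGrams);
        usable.remove(WORD_END); // a word must not end before it starts
        int total = usable.values().stream().mapToInt(Integer::intValue).sum();
        if (total <= 0) {
            throw new IllegalStateException("no usable start gram");
        }
        int roll = rnd.nextInt(total); // 0 <= roll < total
        for (Map.Entry<String, Integer> e : usable.entrySet()) {
            roll -= e.getValue();
            if (roll < 0) {
                return e.getKey();
            }
        }
        throw new AssertionError("unreachable: weights sum to total");
    }
}
```

An alternative would be to avoid writing the entry when the dictionary is created, which keeps generation untouched but requires rebuilding existing dictionaries.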
I should check how to automate the creation of artefacts to download.
The artefacts are:
java -jar...) for ease of use;
The config file should allow setting the following values:
... anything else?
Add an optional 'description' field to Dict.
At the moment it's not showing, but with some test changes, PMD gave the following warning on FileRepo: Possible God Class (WMC=47, ATFD=17, TCC=22.794%).
It may be worth looking into reordering/splitting it.
Throwing out a few ideas of sample dictionaries to provide.
... more?
For macOS, the problem is that the --static option of native-image is not supported: it stops the workflow (I don't know why, as it should be ignored), and I have no environment to test whether the lack of static linking invalidates the executable.
For Windows, there is also a problem with the workflow. The step Get release name needs to be fixed, and I don't know (or care) enough about batch to fix it.
Add a readme to src/test/resources/ to describe the purpose of each file there, before I forget.