aichaos / alice-benchmarks
Testing how well A.L.I.C.E. runs on RiveScript.
I noticed that your begin.rive file is missing from this repo: "include a document named begin.rive that contains some configuration settings for your bot's brain. The most useful settings that would be set here include substitutions, which are able to make changes to the user's message before a reply is looked for." Would you be kind enough to point me to the right file with all the bot settings? And fantastic work so far, by the way. Any plans to try benchmarking Rosie?
Converted from: https://www.kirsle.net/wiki/Optimize-RiveScript
None of the RiveScript modules can effectively handle a brain the size of Alice's. The Golang version is able to load Alice the fastest (< 1 second) whereas the others take closer to 20+ seconds. However, when actually fetching a reply they all take about 15 seconds.
The root problem is probably in the sorted reply structure, which looks generally like this:
sorted = {
  "random": [ // topic name
    ["how are you", pointer], // triggers ordered by priority
    ["hello bot", pointer],
    ["*", pointer]
  ]
};
Under a topic, all triggers are kept in their optimal sort order, which is generally: atomic triggers with the most words first, less specific triggers later, and the least specific last. Triggers with custom priorities ({weight} tags, or from a topic that inherits other topics, etc.) always come before lower priority sets of triggers.
In the Alice reply set this means there are about 68,000 triggers in one giant array under the "random" topic, so the code may have to scan through tens of thousands of triggers before it finds a match.
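To make the cost concrete, here is a minimal Python sketch of the flat layout described above. It is not the code of any RiveScript module: the sort key is a simplified assumption (higher weight first, atomic before wildcard, more words first), the pointers are placeholder strings, and only the `*` wildcard is handled. The point is that a message with no specific trigger is compared against every entry before it reaches the catch-all.

```python
import re

# Triggers as (text, weight, pointer); the pointer values are placeholders.
TRIGGERS = [
    ("*", 0, "catch-all reply"),
    ("hello bot", 0, "hello reply"),
    ("how are you", 0, "how reply"),
]

def sort_key(trigger):
    """Simplified ordering: weight, then atomic before wildcard, then word count."""
    text, weight, _pointer = trigger
    return (-weight, "*" in text, -len(text.split()))

def build_sorted(triggers):
    """Return the flat [(trigger, pointer), ...] list for one topic."""
    ordered = sorted(triggers, key=sort_key)
    return [(text, pointer) for text, _weight, pointer in ordered]

def find_reply(sorted_triggers, message):
    """Linear scan; also report how many triggers were tested."""
    checked = 0
    for text, pointer in sorted_triggers:
        checked += 1
        pattern = text.replace("*", "(.+?)")  # only "*" is supported here
        if re.fullmatch(pattern, message):
            return pointer, checked
    return None, checked

flat = build_sorted(TRIGGERS)
print(find_reply(flat, "tell me a joke"))  # ('catch-all reply', 3)
```

With three triggers the scan is trivial, but the number of patterns tested grows linearly with the size of the topic.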
Alicebot Program V is an AIML bot and it stores patterns in a more efficient way: it separates the first word of the pattern away from the rest. When looking up a response for the user, it can then use the first word as a dictionary key (there's a relatively small set of distinct first words), and then have a much simpler array of triggers to look at. Example:
# The following patterns are represented here:
# ITS *
# ITS BORING
# ITS FUN
# ITS GOOD *
$data = {
  aiml => {
    matches => {
      'ITS' => [
        '* <that> * <topic> * <pos> 17818',
        'BORING <that> * <topic> * <pos> 17819',
        'FUN <that> * <topic> * <pos> 17820',
        'GOOD * <that> * <topic> * <pos> 17821',
      ],
    },
  },
};
My blog entry has more details. The <pos> refers to an array index where the pattern's details are; in the more recent RiveScript implementations (CoffeeScript and Go) we keep pointers with the triggers in the sorted structure so we don't need to worry about that.
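The first-word lookup can be sketched in Python as follows. This is an illustration of the idea only, not Program V's actual code: the `<that>` and `<topic>` parts are left out, and `pos` here is simply the pattern's index in the example list rather than the values shown above.

```python
from collections import defaultdict

PATTERNS = ["ITS *", "ITS BORING", "ITS FUN", "ITS GOOD *"]

def build_index(patterns):
    """Group patterns by first word; keep the remainder and a position."""
    index = defaultdict(list)
    for pos, pattern in enumerate(patterns):
        first, _, rest = pattern.partition(" ")
        index[first].append((rest, pos))
    return index

def candidates(index, message):
    """Return only the patterns that share the message's first word."""
    first = message.upper().split()[0]
    return index.get(first, [])

index = build_index(PATTERNS)
print(candidates(index, "its fun"))
# [('*', 0), ('BORING', 1), ('FUN', 2), ('GOOD *', 3)]
```

A message starting with any other word gets an empty candidate list without looking at a single ITS pattern, which is where the savings come from.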
At first glance a Program V style way of sorting triggers looks good, but in RiveScript triggers are much more complicated and "regexp-y", for example:
(what is|what was) your name
These things would still need to be taken into account, as would the relative priority of each trigger via {weight} tags and topic inheritance.
Change the sort structure to look more like this:
sorted = {
  "random": [ // topic name
    [ // these arrays are for priority level, higher on top
      [
        "hello", // first word
        [ // list of triggers under that word
          ["hello bot", pointer]
        ]
      ],
      ["how", [ ["how are you", pointer] ] ],
      ["*", [ ["*", pointer] ] ]
    ]
  ]
}
So the logic for matching a trigger would be along these lines:
user_first_word = re.split(r'\s+', message)[0]
for priority in self._sorted.topics[topic]:
    for first_word in priority:
        # this next line would actually be a regexp for * triggers, etc.
        if user_first_word == first_word[0]:
            # Their first word matches! Look through all the triggers for this word.
            for trigger in first_word[1]:
                # Again this would be a regexp in reality
                if message == trigger[0]:
                    # Have a match!
                    matched = trigger[1]
                    # now `matched` points to the trigger's details for the
                    # replies, conditions, etc.
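A self-contained version of this idea might look like the sketch below. Everything in it is an assumption for illustration: `get_first_words`, `build_sorted` and `find_match` are hypothetical names, pointers are placeholder strings, weights stand in for priority levels, and only `*` wildcards and leading `(a|b)` alternations are handled (optionals in `[ ]` are not).

```python
import re
from collections import defaultdict

def get_first_words(trigger):
    """Hypothetical helper: the first word(s) a matching message can start with."""
    if trigger.startswith("("):
        # Leading alternation: take the first word of each alternative.
        group = trigger[1:trigger.index(")")]
        words = [alt.split()[0] for alt in group.split("|")]
        return list(dict.fromkeys(words))  # drop duplicates, keep order
    return [trigger.split()[0]]

def to_regex(trigger):
    """Simplified conversion: '*' becomes a capture group."""
    return re.compile(trigger.replace("*", "(.+?)"))

def build_sorted(triggers):
    """triggers: (text, weight, pointer). Returns priority levels of buckets."""
    levels = defaultdict(lambda: defaultdict(list))
    # Longer triggers first, so each bucket keeps "most words first" order.
    for text, weight, pointer in sorted(triggers, key=lambda t: -len(t[0].split())):
        for word in get_first_words(text):
            levels[weight][word].append((to_regex(text), pointer))
    result = []
    for weight in sorted(levels, reverse=True):  # higher priority on top
        buckets = levels[weight]
        order = sorted(buckets, key=lambda w: w == "*")  # wildcard bucket last
        result.append([(w, buckets[w]) for w in order])
    return result

def find_match(sorted_levels, message):
    """Test only buckets whose first word fits the message (or is '*')."""
    user_first_word = re.split(r"\s+", message.strip())[0]
    for level in sorted_levels:
        for word, bucket in level:
            if word != "*" and word != user_first_word:
                continue
            for pattern, pointer in bucket:
                if pattern.fullmatch(message):
                    return pointer
    return None

brain = build_sorted([
    ("hello bot", 0, "hello reply"),
    ("how are you", 0, "how reply"),
    ("(what time|when) is it", 0, "time reply"),
    ("*", 0, "catch-all reply"),
])
print(find_match(brain, "when is it"))   # time reply
print(find_match(brain, "hello there"))  # catch-all reply
```

Note that a trigger with a leading alternation is filed under every first word it can start with, so the same pointer may appear in more than one bucket.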
For finding the first words, a function like getFirstWords(trigger) could be added that returns one or multiple first words.
If the trigger starts with [ or (, return the first words of all the regexp-y parts: (what time|when) is it would return ["what", "when"]
Otherwise return the single first word: how are you would return ["how"]
Triggers that start with a wildcard such as * go at the bottom.
Hi Kirsle,
I tried finding your email and contacting you, but the mailer daemon fails in Gmail. I have been working on this brain set for some days now and I am building a Java rive manager of sorts. It takes all the brain files, crunches them into one massive file, and sorts them alphabetically from top to bottom. As of now it can find all the triggers that depend on a particular trigger, as I am about to implement cascade deletes.
The master file has roughly 284915 lines of RiveScript and my Java program can easily manage them. Here is a screenshot of this in action.
As per the above screenshot, I am trying to find all the rules that are related to the trigger "+ what languages do you speak".
Outgoing links are the redirects a trigger has to other triggers, followed recursively, while incoming links are the other triggers that refer to yours. That way, when you delete a trigger, you don't have to worry about missing redirects; hence this program. I noticed that the Alice brain files have tons of triggers with a % previous in them, but I cannot find any bot response for most of the % lines that I encountered. Would it be possible for you to let me know if that is really the case? I will be more than willing to give this program to you after I add a few more features to it. Currently I am doing what I feel is necessary to delete unwanted triggers quickly. The idea is to remove facts from rules.
Facts like who was the first president of xyz can be fetched from a factual DB, whereas keeping only pure rive rules in the master script will make it load much faster and make the master file far better.