kanishkamisra / footrulr
Compare sentences using Machine Translation and Text Summarization evaluation metrics
License: Other
Adding data for the source, target, and generated candidate sets, as well as human-annotated reference texts, would be wonderful for producing vignettes and demonstrating how these metrics can be computed over entire corpora.
Please suggest corpora that could be included, in this issue :)
TODO:
?pkgdown::deploy_site_github
)... you will need to set up your deployment keys. The easiest way is to call travis::use_travis_deploy(). This will generate and push the necessary keys to your GitHub and Travis accounts. See the travis package website for more details.
Example:
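A minimal sketch of the one-time setup described above (assuming the travis and pkgdown packages; run interactively from the package directory, not in CI):

```r
# One-time setup (sketch): generates an SSH deploy key and stores it
# in the linked GitHub and Travis accounts.
travis::use_travis_deploy()

# Then have Travis rebuild and push the site on each successful build,
# e.g. in .travis.yml:
#   after_success: Rscript -e 'pkgdown::deploy_site_github()'
```

Both calls touch external services (GitHub, Travis), so they only work in an authenticated interactive session.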
Some badges to consider:
use_cran_badge()
use_lifecycle_badge("experimental")
These functions also add badges:
use_coverage()
use_travis()
TODO:
usethis::use_coverage()
(Adapted from forestgeo/learn#182)
Diagnosis from running devtools::release()
update.packages()
spell_check_package()
-- R CMD check results -------------------------------- footrulr 0.0.0.9000 ----
Duration: 24.5s
> checking examples ... ERROR
Running examples in 'footrulr-Ex.R' failed
The error most likely occurred in:
> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: bleu
> ### Title: BLEU (Bilingual Evaluation Understudy)
> ### Aliases: bleu
>
> ### ** Examples
>
> # Our candidate and reference data
> sample_data <- list(list(candidate = "The cat the cat on the mat", references = c("The cat is on the mat", "There is a cat on the mat")),
+ list(candidate = "The cat the cat on the map", references = c("The cat is on the ccccccat", "There is a cat on the mat")))
>
>
>
> # Run function
> bleu(sample_data)
Error: `map_df()` requires dplyr
Execution halted
> checking tests ...
See below...
> checking dependencies in R code ... NOTE
Unexported object imported by a ':::' call: 'purrr:::probe'
See the note in ?`:::` about the use of this operator.
> checking R code for possible problems ... NOTE
footrulr: no visible global function definition for 'map2'
Undefined global functions or variables:
map2
> checking Rd line widths ... NOTE
Rd file 'bleu.Rd':
\examples lines wider than 100 characters:
sample_data <- list(list(candidate = "The cat the cat on the mat", references = c("The cat is on the mat", "There is a cat on the mat") ... [TRUNCATED]
list(candidate = "The cat the cat on the map", references = c("The cat is on the ccccccat", "There is a cat on the mat")))
These lines will be truncated in the PDF manual.
-- Test failures ------------------------------------------------- testthat ----
> library(testthat)
> library(footrulr)
>
> test_check("footrulr")
-- 1. Error: bleu works (@test-bleu.R#9) --------
`map_df()` requires dplyr
1: bleu(sample_data, 1) at testthat/test-bleu.R:9
2: map_df(.data, function(item) {
cand <- item$candidate
ref <- item$references
scores <- map_through_ngram(item)
tibble(candidate = cand, references = list(ref),
scores)
})
3: abort("`map_df()` requires dplyr")
== testthat results =============================
OK: 2 SKIPPED: 0 FAILED: 1
1. Error: bleu works (@test-bleu.R#9)
Error: testthat unit tests failed
Execution halted
2 errors x | 0 warnings v | 3 notes x
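The dplyr error and the map2 NOTE above point to missing declarations in DESCRIPTION and NAMESPACE. A hedged sketch of the likely fixes, using usethis:

```r
# Declare dplyr as a dependency so purrr::map_df() can build data frames
# at run time (fixes the "`map_df()` requires dplyr" error).
usethis::use_package("dplyr")

# Import map2 from purrr so it is visible inside the package
# (fixes the "no visible global function definition for 'map2'" NOTE).
usethis::use_import_from("purrr", "map2")
```

The remaining NOTE about `purrr:::probe` means an unexported purrr object is being used; the usual remedies are copying the helper into the package or rewriting the code to avoid it.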
devtools::check()
Remember to remove dev version.
WARNING: version (0.0.0.9000) should have exactly three components
After fixing local R CMD check, remember to run rhub::check_for_cran()
check_win_devel()
After first release remember to use_news_md()
Remember to update DESCRIPTION. In particular, ensure there are no typos in the Title and Description fields, and check the author details.
If submitting to CRAN, remember to use_cran_comments()
Currently I have a naive implementation of BLEU that does not penalize short translations. This can be addressed with a brevity penalty.
Shamelessly stealing an example from Rachel Tatman's blog post:
Consider the sentence "J’ai mangé trois filberts", with reference translations:
And some of the candidate translations:
Both of these get a BLEU-2 score of 1, since all bigrams in "I ate" appear in reference translation #1.
A brevity penalty penalizes such short candidates; without it, a mediocre translation system that produces very short outputs can still receive high BLEU scores and misguide the analysis.
A reference for the computation can be found here: https://github.com/vikasnar/Bleu/blob/master/calculatebleu.py
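For illustration, a minimal sketch of the standard brevity penalty, BP = 1 if c > r, else exp(1 - r/c), where c is the candidate length and r the reference length closest to it. `brevity_penalty` here is a hypothetical helper, not footrulr's API:

```r
# Hypothetical helper (not footrulr's API): standard BLEU brevity penalty.
brevity_penalty <- function(candidate, references) {
  # Token counts via simple whitespace splitting
  c_len <- length(strsplit(candidate, " ")[[1]])
  ref_lens <- vapply(strsplit(references, " "), length, integer(1))
  # Effective reference length: the one closest to the candidate's length
  r_len <- ref_lens[which.min(abs(ref_lens - c_len))]
  if (c_len > r_len) 1 else exp(1 - r_len / c_len)
}

brevity_penalty("I ate", c("I ate three filberts"))  # exp(-1), about 0.37
```

Multiplying the n-gram precision score by this factor drives the BLEU score of very short candidates toward zero while leaving full-length candidates untouched.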
Once we have sample texts as well as more metric implementations, it would be nice to have vignettes showing how the metrics work and how this package helps compute them.