censobr's People

Contributors

diraol, nealrichardson, rafapereirabr

censobr's Issues

Add microdata documentation

  • 1970 - Available in the dev version. Planned for v0.2.0
  • 1980 - Available in the dev version. Planned for v0.2.0
  • 1991 - Available in the dev version. Planned for v0.2.0
  • 2000 - Available in the dev version. Planned for v0.2.0
  • 2010 - Available in the dev version. Planned for v0.2.0

Persistent error in Github Actions macOS-latest (oldrel)

The package currently passes every check when tested locally. It also passes the tests in GitHub Actions on every OS except macOS-latest (oldrel). Here's the GHA output, which is rather difficult to interpret, tbh.

The error occurs when building the vignette. I've tried removing the vignette entirely, and all checks passed. See this.

Run options(crayon.enabled = TRUE)
── R CMD build ─────────────────────────────────────────────────────────────────
pdflatex not found! Not building PDF manual.
* checking for file ‘.../DESCRIPTION’ ... OK
* preparing ‘censobr’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ...sh: line 1:  5061 Illegal instruction: 4  '/Library/Frameworks/R.framework/Resources/bin/Rscript' --vanilla --default-packages= -e "tools::buildVignettes(dir = '.', tangle = TRUE)" > '/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T//RtmpnQbntJ/xshell1395ed07180' 2>&1
 ERROR
--- re-building ‘censobr.Rmd’ using rmarkdown
 *** caught illegal operation ***
address 0x1120d8a63, cause 'illegal opcode'
Traceback:
 1: Table__from_ExecPlanReader(self)
 2: x$read_table()
 3: as_arrow_table.RecordBatchReader(reader)
 4: as_arrow_table(reader)
 5: as_arrow_table.arrow_dplyr_query(x)
 6: as_arrow_table(x)
 7: doTryCatch(return(expr), name, parentenv, handler)
 8: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 9: tryCatchList(expr, classes, parentenv, handlers)
10: tryCatch(as_arrow_table(x), error = function(e, call = caller_env(n = 4)) {    augment_io_error_msg(e, call, schema = schema())})
11: compute.arrow_dplyr_query(x)
12: collect.arrow_dplyr_query(filter(pop, abbrev_state == "RJ"))
13: collect(filter(pop, abbrev_state == "RJ"))
14: group_by(collect(filter(pop, abbrev_state == "RJ")), V0606)
15: summarize(group_by(collect(filter(pop, abbrev_state == "RJ")),     V0606), higher_edu = sum(V0010[which(V6400 == 4)])/sum(V0010),     pop = sum(V0010))
16: collect(summarize(group_by(collect(filter(pop, abbrev_state ==     "RJ")), V0606), higher_edu = sum(V0010[which(V6400 == 4)])/sum(V0010),     pop = sum(V0010)))
17: eval(expr, envir, enclos)
18: eval(expr, envir, enclos)
19: eval_with_user_handlers(expr, envir, enclos, user_handlers)
20: withVisible(eval_with_user_handlers(expr, envir, enclos, user_handlers))
21: withCallingHandlers(withVisible(eval_with_user_handlers(expr,     envir, enclos, user_handlers)), warning = wHandler, error = eHandler,     message = mHandler)
22: handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr,     envir, enclos, user_handlers)), warning = wHandler, error = eHandler,     message = mHandler))
23: timing_fn(handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr,     envir, enclos, user_handlers)), warning = wHandler, error = eHandler,     message = mHandler)))
24: evaluate_call(expr, parsed$src[[i]], envir = envir, enclos = enclos,     debug = debug, last = i == length(out), use_try = stop_on_error !=         2L, keep_warning = keep_warning, keep_message = keep_message,     log_echo = log_echo, log_warning = log_warning, output_handler = output_handler,     include_timing = include_timing)
25: evaluate::evaluate(...)
26: evaluate(code, envir = env, new_device = FALSE, keep_warning = if (is.numeric(options$warning)) TRUE else options$warning,     keep_message = if (is.numeric(options$message)) TRUE else options$message,     stop_on_error = if (is.numeric(options$error)) options$error else {        if (options$error && options$include)             0L        else 2L    }, output_handler = knit_handlers(options$render, options))
27: in_dir(input_dir(), expr)
28: in_input_dir(evaluate(code, envir = env, new_device = FALSE,     keep_warning = if (is.numeric(options$warning)) TRUE else options$warning,     keep_message = if (is.numeric(options$message)) TRUE else options$message,     stop_on_error = if (is.numeric(options$error)) options$error else {        if (options$error && options$include)             0L        else 2L    }, output_handler = knit_handlers(options$render, options)))
29: eng_r(options)
30: block_exec(params)
31: call_block(x)
32: process_group.block(group)
33: process_group(group)
34: withCallingHandlers(if (tangle) process_tangle(group) else process_group(group),     error = function(e) if (xfun::pkg_available("rlang", "1.0.0") &&         !xfun::check_old_package("learnr", "0.11.3")) rlang::entrace(e))
35: withCallingHandlers(withCallingHandlers(if (tangle) process_tangle(group) else process_group(group),     error = function(e) if (xfun::pkg_available("rlang", "1.0.0") &&         !xfun::check_old_package("learnr", "0.11.3")) rlang::entrace(e)),     error = function(e) {        setwd(wd)        write_utf8(res, output %n% stdout())        message("\nQuitting from lines ", paste(current_lines(i),             collapse = "-"), if (labels[i] != "")             sprintf(" [%s]", labels[i]), sprintf(" (%s)", knit_concord$get("infile")))    })
36: process_file(text, output)
37: knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet)
38: rmarkdown::render(file, encoding = encoding, quiet = quiet, envir = globalenv(),     output_dir = getwd(), ...)
39: vweave_rmarkdown(...)
40: engine$weave(file, quiet = quiet, encoding = enc)
41: doTryCatch(return(expr), name, parentenv, handler)
42: tryCatchOne(expr, names, parentenv, handlers[[1L]])
43: tryCatchList(expr, classes, parentenv, handlers)
44: tryCatch({    engine$weave(file, quiet = quiet, encoding = enc)    setwd(startdir)    output <- find_vignette_product(name, by = "weave", engine = engine)    if (!have.makefile && vignette_is_tex(output)) {        texi2pdf(file = output, clean = FALSE, quiet = quiet)        output <- find_vignette_product(name, by = "texi2pdf",             engine = engine)    }    outputs <- c(outputs, output)}, error = function(e) {    thisOK <<- FALSE    fails <<- c(fails, file)    message(gettextf("Error: processing vignette '%s' failed with diagnostics:\n%s",         file, conditionMessage(e)))})
45: tools::buildVignettes(dir = ".", tangle = TRUE)
An irrecoverable exception occurred. R is aborting now ...
Error: Error in proc$get_built_file() : Build process failed
Calls: <Anonymous> ... build_package -> with_envvar -> force -> <Anonymous>
Execution halted
Error: Process completed with exit code 1.

Add Interview manual

Running the function interview_manual() will open the PDF of an interview manual in the web browser.

  • 1970 - Available in the dev version. Planned for v0.2.0
  • 1980 - Available in the dev version. Planned for v0.2.0
  • 1991 - Available in the dev version. Planned for v0.2.0
  • 2000 - Available in the dev version. Planned for v0.2.0
  • 2010 - Available in the dev version. Planned for v0.2.0
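
A rough sketch of how a function like this could work, assuming the manuals are PDFs reachable by URL. The base URL, the file-name pattern, and the helper names manual_url() and open_manual() are hypothetical and not taken from the package; only utils::browseURL() is a real base R call.

```r
# Census years listed in this issue as having an interview manual.
manual_years <- c(1970, 1980, 1991, 2000, 2010)

# Build the URL of the PDF for a given year.
# NOTE: base_url and the file-name pattern are hypothetical placeholders.
manual_url <- function(year, base_url = "https://example.org/manuals") {
  if (length(year) != 1 || !(year %in% manual_years)) {
    stop("year must be one of: ", paste(manual_years, collapse = ", "))
  }
  sprintf("%s/interview_manual_%d.pdf", base_url, as.integer(year))
}

# Open the manual in the default browser and return the URL invisibly.
# Opening is skipped in non-interactive sessions unless open = TRUE.
open_manual <- function(year, open = interactive()) {
  url <- manual_url(year)
  if (open) utils::browseURL(url)
  invisible(url)
}

open_manual(1991, open = FALSE)
```

Keeping the URL construction separate from the browser call would make the year validation testable without opening a browser.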

Add tests to the interview_manual() function, and create vignette

  • tests
  • vignette

Add 2022 Census data

  • microdata - population
  • microdata - households
  • census-tract level data
  • data dictionary
  • questionnaire
  • interview_manual

Add merge_households parameter

Add a logical merge_households parameter to indicate whether the function should merge household variables into the output data.

  • 1970 population
  • 1980 population
  • 1991 population
  • 2000 population
  • 2000 families
  • 2010 population
  • 2010 emigration
  • 2010 mortality
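
The behaviour described above could look roughly like this. It is a minimal sketch on toy data frames, not the package's implementation: the function name read_population_sketch() and the column names id_household, age and n_rooms are hypothetical, and base R merge() stands in for whatever join the package would use on Arrow tables.

```r
# Toy stand-ins for the population and household tables.
# The column names (id_household, age, n_rooms) are hypothetical.
population <- data.frame(id_household = c(1, 1, 2), age = c(34, 6, 71))
households <- data.frame(id_household = c(1, 2), n_rooms = c(5, 3))

read_population_sketch <- function(pop, hh, merge_households = FALSE) {
  # The parameter must be a single TRUE or FALSE.
  stopifnot(is.logical(merge_households),
            length(merge_households) == 1,
            !is.na(merge_households))
  if (!merge_households) {
    return(pop)
  }
  # Left join: keep every person and attach the variables of their household.
  merge(pop, hh, by = "id_household", all.x = TRUE)
}

merged <- read_population_sketch(population, households, merge_households = TRUE)
```

With merge_households = FALSE the population table is returned unchanged, so existing calls would keep their current output.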

Add 2000 and 2010 microdata

  • 2000
    • households (ready for v0.2.0)
    • population (ready for v0.2.0)
    • families (ready for v0.2.0)
  • 2010
    • households (ready for v0.2.0)
    • population (ready for v0.2.0)
    • deaths (ready for v0.2.0)
    • emigration (ready for v0.2.0)

Add 1960 Census data

  • microdata - population
  • microdata - households
  • census-tract level data
  • data dictionary
  • questionnaire
  • interview_manual

Improve code coverage

as of 06/Sept/2023

censobr Coverage: 28.31%
R/add_labels_emigration.R: 4.91%
R/add_labels_households.R: 14.89%
R/add_labels_population.R: 23.85%
R/add_labels_families.R: 38.24%
R/add_labels_mortality.R: 44.74%
R/read_families.R: 90.48%
R/read_households.R: 90.48%
R/read_population.R: 90.48%
R/censobr_cache.R: 94.44%
R/read_emigration.R: 95.24%
R/read_mortality.R: 95.24%

New function to add labels to variables

Here's the initial idea: one function per data set.

  • add_labels_households(arrw, lang = c('PT', 'EN'))
  • add_labels_population(arrw, lang = c('PT', 'EN'))
  • add_labels_mortality(arrw, lang = c('PT', 'EN'))

Depending on how it goes, it might be better to have a single function that applies to different data sets. E.g.

  • add_labels(arrw, dataset = c('households', 'population', 'mortality'), lang = c('PT', 'EN'))

The downside of the first approach is having too many functions, with code repeated for the variables that the datasets have in common. The downside of the second approach is that the single function would be too big and harder to manage.
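
One possible middle ground, sketched under assumptions: keep the per-dataset functions and add a thin add_labels() wrapper that only validates its arguments and dispatches. The function signatures follow the ones proposed above; the helper tag_labels() and the function bodies are hypothetical placeholders, and plain data frames stand in for Arrow tables.

```r
# Hypothetical helper shared by the per-dataset functions. It only tags the
# table so the dispatch can be observed; real code would recode variables.
tag_labels <- function(arrw, dataset, lang) {
  attr(arrw, "labels") <- paste(dataset, lang)
  arrw
}

# Placeholder per-dataset functions (bodies are illustrative only).
add_labels_households <- function(arrw, lang = "PT") tag_labels(arrw, "households", lang)
add_labels_population <- function(arrw, lang = "PT") tag_labels(arrw, "population", lang)
add_labels_mortality  <- function(arrw, lang = "PT") tag_labels(arrw, "mortality", lang)

# Thin wrapper: validates the arguments, then delegates to one function.
add_labels <- function(arrw,
                       dataset = c("households", "population", "mortality"),
                       lang = c("PT", "EN")) {
  dataset <- match.arg(dataset)
  lang <- match.arg(lang)
  labeller <- switch(dataset,
    households = add_labels_households,
    population = add_labels_population,
    mortality  = add_labels_mortality
  )
  labeller(arrw, lang = lang)
}

labelled <- add_labels(data.frame(x = 1), dataset = "population", lang = "EN")
```

The wrapper stays small because each dataset's logic lives in its own function, and labels for shared variables could move into a common helper to limit repetition.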

2022 Census

Is there any estimate of when the 2022 census will be included in the package?

Missing municipality id in 1991 population table

Hi @rafapereirabr !

I have used the population and household tables from 1980 to 2010. In the 1991 table I found a problem in the column holding the municipality code of each observation. I don't know if I'm doing something wrong, but of the 17,045,653 observations, 8,575,800 have a missing value in this column. I was surprised when I couldn't filter the data for the municipality of Rio de Janeiro. Here's the code I used to test it:

> census1991_2 <- read_population(year = 1991, cache = T)
Reading data cached locally.
> census1991_2 |> 
+   select(code_muni) |> 
+   mutate(test = is.na(code_muni)) |> 
+   count(test) |>
+   collect()
# A tibble: 2 × 2
  test        n
  <lgl>   <int>
1 FALSE 8469853
2 TRUE  8575800

Add questionnaires

Running the function questionnaire() will open the PDF of a questionnaire in the web browser.

  • 1970 - Available in the dev version. Planned for v0.2.0
  • 1980 - Available in the dev version. Planned for v0.2.0
  • 1991 - Available in the dev version. Planned for v0.2.0
  • 2000 - Available in the dev version. Planned for v0.2.0
  • 2010 - Available in the dev version. Planned for v0.2.0

Add tests to the questionnaire() function, and create vignette

  • tests
  • vignette
