If I remember correctly, the original describe was developed by @briatte.
from questionr.
I like that idea too, but I cannot remember much of the original describe
function I wrote, which was basically a port from Stata.
If the idea is to make the output useful for data exploration, I would suggest adding "obs." next to the vector length and showing the percentage of missing values. I would also suggest, like memisc, "translating" factor into "Nominal", ord. factor into "Ordinal", and numeric/integer into "Numeric".
> describe(d$factor_var)
[2000 obs.] Categorie socio-professionnelle
Nominal: "Employe" NA "Technicien" "Technicien" "Employe" ...
7 levels: Ouvrier specialise | Ouvrier qualifie | Technicien | Profession intermediaire
| Cadre | Employe | Autre
NAs: 347 (17.4%)
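To make the proposal concrete, here is a minimal base R sketch of how such a summary could be assembled. The names type_label, na_summary and describe_sketch are hypothetical and this is not the questionr implementation; it only assumes that the NA percentage is the NA count divided by the vector length, multiplied by 100.

```r
# Hypothetical helpers for illustration only (not questionr code).

# Map R classes to the "Nominal" / "Ordinal" / "Numeric" wording.
type_label <- function(x) {
  if (is.ordered(x)) {
    "Ordinal"
  } else if (is.factor(x)) {
    "Nominal"
  } else if (is.numeric(x)) {
    "Numeric"
  } else {
    class(x)[1]
  }
}

# Count of missing values with its share of the vector length, in percent.
na_summary <- function(x) {
  n_na <- sum(is.na(x))
  pct <- round(100 * n_na / length(x), 1)
  sprintf("NAs: %d (%s%%)", n_na, format(pct))
}

# Build the lines of the summary as a character vector.
describe_sketch <- function(x) {
  shown <- utils::head(x, 5)
  values <- if (is.factor(x)) {
    # Quote factor values, but print missing values as a bare NA.
    ifelse(is.na(shown), "NA", paste0('"', as.character(shown), '"'))
  } else {
    as.character(shown)
  }
  lines <- c(
    sprintf("[%d obs.]", length(x)),
    sprintf("%s: %s ...", type_label(x), paste(values, collapse = " "))
  )
  if (is.factor(x)) {
    lines <- c(lines, sprintf("%d levels: %s", nlevels(x),
                              paste(levels(x), collapse = " | ")))
  }
  c(lines, na_summary(x))
}

x <- factor(c("Employe", NA, "Technicien", "Technicien", "Employe", "Cadre"))
cat(describe_sketch(x), sep = "\n")
```

Returning a character vector rather than printing directly keeps the formatting easy to test and leaves the printing to cat().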
I would even go as far as to suggest, through an argument like help = TRUE:
- the most appropriate method to view stats, e.g. freq(x)
- the most appropriate method to plot, e.g. plot(table(x))
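A rough sketch of what the help = TRUE idea could look like. The function suggest_methods and the mapping for non-factor vectors are assumptions made up for illustration; only freq(x) and plot(table(x)) come from the suggestion above.

```r
# Hypothetical sketch: suggest a stats call and a plot call for a vector.
# The numeric and fallback branches are assumptions, not questionr behaviour.
suggest_methods <- function(x, name = "x") {
  if (is.factor(x)) {
    stats_call <- sprintf("freq(%s)", name)          # questionr::freq
    plot_call  <- sprintf("plot(table(%s))", name)
  } else if (is.numeric(x)) {
    stats_call <- sprintf("summary(%s)", name)
    plot_call  <- sprintf("hist(%s)", name)
  } else {
    stats_call <- sprintf("table(%s)", name)
    plot_call  <- sprintf("plot(table(%s))", name)
  }
  c(stats = stats_call, plot = plot_call)
}

hints <- suggest_methods(factor(c("a", "b")), "d$factor_var")
cat(sprintf("View stats: %s\nPlot: %s\n", hints[["stats"]], hints[["plot"]]))
```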
You can test this commit larmarange@b681ccf
If it's OK, I will add it to a pull request.
I kept the original possibility of providing a list of variables in case of data.frame. It's working with data.frame, data_frame and data.table.
Some examples:
> describe(hdv2003$age)
[2000 obs.]
integer: 28 23 59 34 71 ...
min: 18 - max: 97 - NAs: 0 (0%) - 78 unique values
> describe(hdv2003)
[2000 obs. x 20 variables] data.frame
$id:
integer: 1 2 3 4 5 ...
min: 1 - max: 2000 - NAs: 0 (0%) - 2000 unique values
$age:
integer: 28 23 59 34 71 ...
min: 18 - max: 97 - NAs: 0 (0%) - 78 unique values
$sexe:
nominal factor: "Femme" "Femme" "Homme" "Homme" "Femme" ...
2 levels: Homme | Femme
NAs: 0 (0%)
$nivetud:
nominal factor: "Enseignement superieur y compris technique superieur" NA "Derniere annee d'etudes primaires" "Enseignement superieur y compris technique superieur" "Derniere annee d'etudes primaires" ...
8 levels: N'a jamais fait d'etudes | A arrete ses etudes, avant la derniere annee d'etudes primaires | Derniere annee d'etudes primaires | 1er cycle | 2eme cycle | Enseignement technique ou professionnel court | Enseignement technique ou professionnel long | Enseignement superieur y compris technique superieur
NAs: 112 (0.1%)
$poids:
numeric: 2634.3982157 9738.3957759 3994.1024587 5731.6615081 4329.0940022 ...
min: 78.0783403 - max: 31092.14132 - NAs: 0 (0%) - 1877 unique values
$occup:
nominal factor: "Exerce une profession" "Etudiant, eleve" "Exerce une profession" "Exerce une profession" "Retraite" ...
7 levels: Exerce une profession | Chomeur | Etudiant, eleve | Retraite | Retire des affaires | Au foyer | Autre inactif
NAs: 0 (0%)
$qualif:
nominal factor: "Employe" NA "Technicien" "Technicien" "Employe" ...
7 levels: Ouvrier specialise | Ouvrier qualifie | Technicien | Profession intermediaire | Cadre | Employe | Autre
NAs: 347 (0.2%)
$freres.soeurs:
integer: 8 2 2 1 0 ...
min: 0 - max: 22 - NAs: 0 (0%) - 19 unique values
$clso:
nominal factor: "Oui" "Oui" "Non" "Non" "Oui" ...
3 levels: Oui | Non | Ne sait pas
NAs: 0 (0%)
$relig:
nominal factor: "Ni croyance ni appartenance" "Ni croyance ni appartenance" "Ni croyance ni appartenance" "Appartenance sans pratique" "Pratiquant regulier" ...
6 levels: Pratiquant regulier | Pratiquant occasionnel | Appartenance sans pratique | Ni croyance ni appartenance | Rejet | NSP ou NVPR
NAs: 0 (0%)
$trav.imp:
nominal factor: "Peu important" NA "Aussi important que le reste" "Moins important que le reste" NA ...
4 levels: Le plus important | Aussi important que le reste | Moins important que le reste | Peu important
NAs: 952 (0.5%)
$trav.satisf:
nominal factor: "Insatisfaction" NA "Equilibre" "Satisfaction" NA ...
3 levels: Satisfaction | Insatisfaction | Equilibre
NAs: 952 (0.5%)
$hard.rock:
nominal factor: "Non" "Non" "Non" "Non" "Non" ...
2 levels: Non | Oui
NAs: 0 (0%)
$lecture.bd:
nominal factor: "Non" "Non" "Non" "Non" "Non" ...
2 levels: Non | Oui
NAs: 0 (0%)
$peche.chasse:
nominal factor: "Non" "Non" "Non" "Non" "Non" ...
2 levels: Non | Oui
NAs: 0 (0%)
$cuisine:
nominal factor: "Oui" "Non" "Non" "Oui" "Non" ...
2 levels: Non | Oui
NAs: 0 (0%)
$bricol:
nominal factor: "Non" "Non" "Non" "Oui" "Non" ...
2 levels: Non | Oui
NAs: 0 (0%)
$cinema:
nominal factor: "Non" "Oui" "Non" "Oui" "Non" ...
2 levels: Non | Oui
NAs: 0 (0%)
$sport:
nominal factor: "Non" "Oui" "Oui" "Oui" "Non" ...
2 levels: Non | Oui
NAs: 0 (0%)
$heures.tv:
numeric: 0 1 0 2 3 ...
min: 0 - max: 12 - NAs: 5 (0%) - 30 unique values
> describe(hdv2003, "cuisine", "heures.tv")
[2000 obs. x 2 variables] data.frame
$cuisine:
nominal factor: "Oui" "Non" "Non" "Oui" "Non" ...
2 levels: Non | Oui
NAs: 0 (0%)
$heures.tv:
numeric: 0 1 0 2 3 ...
min: 0 - max: 12 - NAs: 5 (0%) - 30 unique values
> describe(hdv2003, "trav*")
[2000 obs. x 2 variables] data.frame
$trav.imp:
nominal factor: "Peu important" NA "Aussi important que le reste" "Moins important que le reste" NA ...
4 levels: Le plus important | Aussi important que le reste | Moins important que le reste | Peu important
NAs: 952 (0.5%)
$trav.satisf:
nominal factor: "Insatisfaction" NA "Equilibre" "Satisfaction" NA ...
3 levels: Satisfaction | Insatisfaction | Equilibre
NAs: 952 (0.5%)
> describe(hdv2003, "trav|lecture")
[2000 obs. x 3 variables] data.frame
$trav.imp:
nominal factor: "Peu important" NA "Aussi important que le reste" "Moins important que le reste" NA ...
4 levels: Le plus important | Aussi important que le reste | Moins important que le reste | Peu important
NAs: 952 (0.5%)
$trav.satisf:
nominal factor: "Insatisfaction" NA "Equilibre" "Satisfaction" NA ...
3 levels: Satisfaction | Insatisfaction | Equilibre
NAs: 952 (0.5%)
$lecture.bd:
nominal factor: "Non" "Non" "Non" "Non" "Non" ...
2 levels: Non | Oui
NAs: 0 (0%)
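For readers wondering how the variable selection in the last three calls might work, here is one possible sketch. It assumes each extra argument is treated as a regular expression matched against the column names; select_vars is a hypothetical helper and the actual rule used in the commit may differ.

```r
# Hypothetical helper: keep the columns whose name matches at least one of
# the supplied patterns (interpreted as regular expressions).
select_vars <- function(data, ...) {
  patterns <- c(...)
  if (length(patterns) == 0) {
    return(names(data))  # no pattern given: keep every variable
  }
  # One logical vector per pattern, combined with OR.
  keep <- Reduce(`|`, lapply(patterns, grepl, x = names(data)))
  names(data)[keep]
}

d <- data.frame(trav.imp = 1, trav.satisf = 2, lecture.bd = 3,
                cuisine = 4, heures.tv = 5)

select_vars(d, "cuisine", "heures.tv")
select_vars(d, "trav*")
select_vars(d, "trav|lecture")
```

Note that under this assumption "trav*" is read as a regular expression ("tra" followed by any number of "v"), not as a shell-style wildcard, which happens to select the same two columns here.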
> describe(femmes)
[2000 obs. x 17 variables] tbl_df tbl data.frame
$id_femme: Identifiant de l'enquêtée
integer: 391 1643 85 881 1981 ...
min: 1 - max: 2000 - NAs: 0 (0%) - 2000 unique values
$id_menage: Identifiant du ménage
integer: 381 1515 85 844 1797 ...
min: 1 - max: 1814 - NAs: 0 (0%) - 1814 unique values
$poids: Poids statistique
numeric: 1.80315 1.80315 1.80315 1.80315 1.80315 ...
min: 0.044629 - max: 4.396831 - NAs: 0 (0%) - 351 unique values
$date_entretien: Date de passation du questionnaire
Date: 2012-05-05 2012-01-23 2012-01-21 2012-01-06 2012-05-11 ...
min: 2011-12-01 - max: 2012-05-31 - NAs: 0 (0%) - 165 unique values
$date_naissance: Date de naissance
Date: 1997-03-07 1982-01-06 1979-01-01 1968-03-29 1986-05-25 ...
min: 1962-02-07 - max: 1997-03-13 - NAs: 0 (0%) - 1740 unique values
$age: Âge révolu (en années) à la date de passation du questionnaire
numeric: 15 30 33 43 25 ...
min: 14 - max: 49 - NAs: 0 (0%) - 36 unique values
$milieu: Milieu de résidence
labelled numeric: 2 2 2 2 2 ...
2 labels: [1] urbain [2] rural
min: 1 - max: 2 - NAs: 0 (0%) - 2 unique values
$region: Région de résidence
labelled numeric: 4 4 4 4 4 ...
4 labels: [1] Nord [2] Est [3] Sud [4] Ouest
min: 1 - max: 4 - NAs: 0 (0%) - 4 unique values
$educ: Niveau d'éducation
labelled numeric: 0 0 0 0 1 ...
4 labels: [0] aucun [1] primaire [2] secondaire [3] supérieur
min: 0 - max: 3 - NAs: 0 (0%) - 4 unique values
$travail: A un emploi ?
labelled numeric: 1 1 0 1 1 ...
2 labels: [0] non [1] oui
min: 0 - max: 9 - NAs: 0 (0%) - 3 unique values
$matri: Statut matrimonial
labelled numeric: 0 2 2 2 1 ...
6 labels: [0] célibataire [1] mariée [2] en concubinage [3] veuve [4] divorcée [5] séparée
min: 0 - max: 5 - NAs: 0 (0%) - 6 unique values
$religion: Religion
labelled numeric: 1 3 2 3 2 ...
5 labels: [1] musulmane [2] chrétienne [3] protestante [4] sans religion [5] autre
min: 1 - max: 5 - NAs: 4 (0%) - 6 unique values
$journal: Lit la presse ?
labelled numeric: 0 0 0 0 0 ...
2 labels: [0] non [1] oui
min: 0 - max: 1 - NAs: 0 (0%) - 2 unique values
$radio: Ecoute la radio ?
labelled numeric: 0 1 1 0 0 ...
2 labels: [0] non [1] oui
min: 0 - max: 1 - NAs: 0 (0%) - 2 unique values
$tv: Regarde la télévision ?
labelled numeric: 0 0 0 0 0 ...
2 labels: [0] non [1] oui
min: 0 - max: 1 - NAs: 0 (0%) - 2 unique values
$nb_enf_ideal: Nombre idéal d'enfants
labelled numeric: 4 4 4 4 4 ...
1 labels: [96] Ne sait pas
min: 0 - max: 99 - NAs: 0 (0%) - 18 unique values
$test: A déjà fait un test de dépistage du VIH ?
labelled numeric: 0 9 0 0 1 ...
2 labels: [0] non [1] oui
min: 0 - max: 9 - NAs: 0 (0%) - 3 unique values
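The "labels:" lines for labelled vectors can be approximated in base R as follows. This is only a sketch: it assumes value labels are stored as a named vector in the "labels" attribute (the convention used by labelled vectors), and format_labels and unlabelled_values are hypothetical names. The second helper shows why a variable such as travail can report 3 unique values with only 2 labels.

```r
# Assumption: value labels live in the "labels" attribute as a named vector,
# e.g. c(non = 0, oui = 1). Hypothetical helpers, not questionr code.
format_labels <- function(x) {
  labs <- attr(x, "labels", exact = TRUE)
  if (is.null(labs)) {
    return(character(0))
  }
  sprintf("%d labels: %s", length(labs),
          paste0("[", labs, "] ", names(labs), collapse = " "))
}

# Observed values that have no label attached (e.g. a code such as 9).
unlabelled_values <- function(x) {
  labs <- attr(x, "labels", exact = TRUE)
  sort(setdiff(unique(x[!is.na(x)]), labs))
}

milieu  <- structure(c(2, 2, 1, 2), labels = c(urbain = 1, rural = 2))
travail <- structure(c(1, 1, 0, 9), labels = c(non = 0, oui = 1))

format_labels(milieu)
unlabelled_values(travail)
```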
Looks pretty good to me. The output is very helpful: it immediately shows things that would take two or three function calls to get in base R.
I have prepared a new pull request with labelled functions, freq, lookfor, describe and ltabs.
cf. Pull Request #57
cf. #72