# fauxnaif

faux-naïf (/ˌfoʊ.naɪˈif/): a person who pretends to be simple or innocent

fauxnaif: an R package for simplifying data by pretending values are `NA`

## Overview

fauxnaif provides an extension to `dplyr::na_if()`. Unlike dplyr’s `na_if()`, `na_if_in()` lets you specify multiple values to replace with `NA` in a single call. fauxnaif also includes a complementary function, `na_if_not()`, which takes the values to keep and replaces everything else with `NA`.
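As a rough illustration of the difference, the sketch below replaces two unwanted strings, first with nested `dplyr::na_if()` calls and then with one `na_if_in()` call. The vector `x` is a made-up example, not taken from the package.

```r
library(dplyr)
library(fauxnaif)

# Hypothetical example vector, invented for this sketch
x <- c("a", "", "NA", "b")

# With dplyr::na_if(), replacing two different values takes two nested calls
chained <- na_if(na_if(x, ""), "NA")
chained
#> [1] "a" NA  NA  "b"

# With na_if_in(), both values go into one call
single <- na_if_in(x, "", "NA")
```

Both approaches should produce the same vector; the single call simply scales better as the list of unwanted values grows.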

## Installation

You can install `fauxnaif` from CRAN:

```r
install.packages("fauxnaif")
```

Or the development version from GitHub:

```r
# install.packages("remotes")
remotes::install_github("rossellhayes/fauxnaif")
```

## Usage

```r
library(dplyr)
library(fauxnaif)
```

### The basics

Let’s say we want to remove an unwanted negative value from a vector of numbers:

```r
-1:10
#>  [1] -1  0  1  2  3  4  5  6  7  8  9 10
```

We can replace -1…

… explicitly:

```r
na_if_in(-1:10, -1)
#>  [1] NA  0  1  2  3  4  5  6  7  8  9 10
```

… by specifying values to keep:

```r
na_if_not(-1:10, 0:10)
#>  [1] NA  0  1  2  3  4  5  6  7  8  9 10
```

… using a formula:

```r
na_if_in(-1:10, ~ . < 0)
#>  [1] NA  0  1  2  3  4  5  6  7  8  9 10
```
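To make the three styles above concrete, here is a base-R sketch of what each one does conceptually. It is an illustration only and makes no claim about how fauxnaif is implemented internally; the variable names are invented for the example.

```r
x <- -1:10

# Explicit value: blank out elements equal to a listed value
explicit <- replace(x, x %in% c(-1), NA)

# Keep-list: blank out elements that are NOT among the values to keep
kept <- replace(x, !(x %in% 0:10), NA)

# Predicate: blank out elements for which a condition is TRUE
predicate <- replace(x, x < 0, NA)

explicit
#>  [1] NA  0  1  2  3  4  5  6  7  8  9 10
```

All three base-R expressions give the same result here, which mirrors the three `na_if_in()` and `na_if_not()` calls above.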

### A little more complex

```r
messy_string <- c("abc", "", "def", "NA", "ghi", 42, "jkl", "NULL", "mno")
```

We can replace unwanted values…

… one at a time:

```r
na_if_in(messy_string, "")
#> [1] "abc"  NA     "def"  "NA"   "ghi"  "42"   "jkl"  "NULL" "mno"
```

… or all at once:

```r
na_if_in(messy_string, "", "NA", "NULL", 1:100)
#> [1] "abc" NA    "def" NA    "ghi" NA    "jkl" NA    "mno"
na_if_in(messy_string, c("", "NA", "NULL", 1:100))
#> [1] "abc" NA    "def" NA    "ghi" NA    "jkl" NA    "mno"
na_if_in(messy_string, list("", "NA", "NULL", 1:100))
#> [1] "abc" NA    "def" NA    "ghi" NA    "jkl" NA    "mno"
```

… or using a clever formula:

```r
grepl("[a-z]{3,}", messy_string)
#> [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE
na_if_not(messy_string, ~ grepl("[a-z]{3,}", .))
#> [1] "abc" NA    "def" NA    "ghi" NA    "jkl" NA    "mno"
```
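It may not be obvious why the numeric range `1:100` removes an element of a character vector in the "all at once" calls above. The base-R sketch below shows the underlying coercion: `c()` stores `42` as the string `"42"`, and a membership test against mixed values compares everything as strings. This only illustrates R's coercion rules, on the assumption that `na_if_in()` matches values in a similar way; it is not the package's own code, and `unwanted` is a name invented for the example.

```r
messy_string <- c("abc", "", "def", "NA", "ghi", 42, "jkl", "NULL", "mno")

# c() coerces mixed inputs to a common type, so 42 becomes "42"
typeof(messy_string)
#> [1] "character"

# Combining strings with 1:100 likewise yields a character vector
unwanted <- c("", "NA", "NULL", 1:100)

# Membership is then tested string against string
is_unwanted <- messy_string %in% unwanted
is_unwanted
#> [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
```

Note that the string `"NA"` in `messy_string` is ordinary text, not a missing value, which is why it has to be listed explicitly.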

### With data frames

```r
faux_census
#> # A tibble: 5 × 4
#>   state    age  income gender
#>   <chr>  <dbl>   <dbl> <chr>
#> 1 TX        57 9999999 Gender is a social construct
#> 2 Canada    49  149000 Male
#> 3 NY       557   90750 f
#> 4 LA         2   61000 Male
#> 5 TN        64 9999999 M
```

`na_if_in()` is particularly useful inside `dplyr::mutate()`:

```r
faux_census %>%
  mutate(
    income = na_if_in(income, 9999999),
    age    = na_if_in(age, ~ . < 18, ~ . > 120),
    state  = na_if_not(state, ~ grepl("^[A-Z]{2,}$", .)),
    gender = na_if_in(gender, ~ nchar(.) > 20)
  )
#> # A tibble: 5 × 4
#>   state   age income gender
#>   <chr> <dbl>  <dbl> <chr>
#> 1 TX       57     NA <NA>
#> 2 <NA>     49 149000 Male
#> 3 NY       NA  90750 f
#> 4 LA       NA  61000 Male
#> 5 TN       64     NA M
```

Or you can use `dplyr::across()` on data frames:

```r
faux_census %>%
  mutate(
    across(age, na_if_in, ~ . < 18, ~ . > 120),
    across(state, na_if_not, ~ grepl("^[A-Z]{2,}$", .)),
    across(where(is.character), na_if_in, ~ nchar(.) > 20),
    across(everything(), na_if_in, 9999999)
  )
#> # A tibble: 5 × 4
#>   state   age income gender
#>   <chr> <dbl>  <dbl> <chr>
#> 1 TX       57     NA <NA>
#> 2 <NA>     49 149000 Male
#> 3 NY       NA  90750 f
#> 4 LA       NA  61000 Male
#> 5 TN       64     NA M
```
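Since dplyr 1.1.0, passing extra arguments through the `...` of `across()` is deprecated, so newer dplyr versions may warn on calls like the ones above. A minimal sketch of the same idea with explicit anonymous functions follows. The `census` tibble is a hand-built stand-in for two columns of the printed `faux_census` table, and `cleaned` is a name invented for the example.

```r
library(dplyr)
library(fauxnaif)

# Hand-built stand-in for two columns of the faux_census table shown above
census <- tibble(
  age    = c(57, 49, 557, 2, 64),
  income = c(9999999, 149000, 90750, 61000, 9999999)
)

cleaned <- census %>%
  mutate(
    # Wrap na_if_in() in a function so its arguments are passed explicitly
    across(age, function(x) na_if_in(x, ~ . < 18, ~ . > 120)),
    across(income, function(x) na_if_in(x, 9999999))
  )
```

Wrapping the call in `function(x)` keeps each rule next to the columns it applies to and avoids relying on argument forwarding.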

Hex sticker fonts are Bodoni* by indestructible type* and Source Code Pro by Adobe.

Image adapted from icon made by Freepik from flaticon.com.

Please note that fauxnaif is released with a Contributor Code of Conduct.