package | use |
---|---|
dplyr | data manipulation operations tool |
ggplot | powerful graphics language for creating elegant and complex plots |
forcats | reordering factor levels |
haven | Import and Export ‘SPSS’, ‘Stata’ and ‘SAS’ Files to R |
purrr | enhances R’s functional programming toolkit |
readr | read data (like ‘csv’, ‘tsv’, and ‘fwf’) |
readxl | get data out of Excel and into R |
stringr | make working with strings as easy as possible (regular expressions) |
tidyr | The goal of tidyr is to help you create tidy data. Tidy data is data where: |
library(dplyr)
address <- url("http://www.trutschnig.net/RTR2015.RData")
load(address)
head(RTR2015)
## id mtime
## 329731 Oadff95f3-880a-4cb3-9534-b85e932fb137 2014-01-01 00:10:00
## 368684 Oc2852599-37bd-4c3c-8fe1-15c4884f5926 2014-01-01 00:16:00
## 137787 O48a47c74-c7a6-4ac6-9c24-fc442d0674dd 2014-01-01 00:21:00
## 200828 O69f25c23-5089-45d9-8407-3159f35f4a7e 2014-01-01 00:21:00
## 450318 Oede0db57-1d56-432c-bf1d-73327add2591 2014-01-01 00:27:00
## 106797 O383e9306-9612-4c39-af8f-1b018f6db871 2014-01-01 00:41:00
## mymd mym op_name nw_cat device
## 329731 2014-01-01 2014-01 Pear 3G Galaxy Note 2
## 368684 2014-01-01 2014-01 Pear 3G Galaxy Note 2
## 137787 2014-01-01 2014-01 Kiwi 3G Galaxy Note 10.1 LTE
## 200828 2014-01-01 2014-01 Kiwi 3G Galaxy S3
## 450318 2014-01-01 2014-01 Kiwi 3G Galaxy Note 10.1 LTE
## 106797 2014-01-01 2014-01 Apple 3G iPhone 5s
## device_platform device_has_lte longitude latitude iso_adm2
## 329731 Android FALSE 13.20764 47.34299 at0504
## 368684 Android FALSE 13.22863 47.34588 at0504
## 137787 Android TRUE 15.75156 48.20457 at0319
## 200828 Android FALSE 16.39714 48.23074 at0900
## 450318 Android TRUE 15.63205 48.20800 at0302
## 106797 iOS TRUE 15.51962 47.12490 at0606
## rtr_speed_dl rtr_speed_ul rtr_ping
## 329731 3226 419 56.6295
## 368684 2068 572 41.7948
## 137787 11604 2881 63.5496
## 200828 1948 1039 45.0728
## 450318 6277 2484 59.1642
## 106797 3577 2433 47.9287
RTR=tbl_df(RTR2015)
RTR
## # A tibble: 239,060 x 15
## id mtime mymd mym op_name nw_cat device
## * <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
glimpse(RTR)
## Observations: 239,060
## Variables: 15
## $ id <chr> "Oadff95f3-880a-4cb3-9534-b85e932fb137", "Oc28...
## $ mtime <dttm> 2014-01-01 00:10:00, 2014-01-01 00:16:00, 201...
## $ mymd <date> 2014-01-01, 2014-01-01, 2014-01-01, 2014-01-0...
## $ mym <chr> "2014-01", "2014-01", "2014-01", "2014-01", "2...
## $ op_name <chr> "Pear", "Pear", "Kiwi", "Kiwi", "Kiwi", "Apple...
## $ nw_cat <fct> 3G, 3G, 3G, 3G, 3G, 3G, 3G, 3G, 3G, 3G, 2G, 3G...
## $ device <fct> Galaxy Note 2, Galaxy Note 2, Galaxy Note 10.1...
## $ device_platform <fct> Android, Android, Android, Android, Android, i...
## $ device_has_lte <lgl> FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, F...
## $ longitude <dbl> 13.20764, 13.22863, 15.75156, 16.39714, 15.632...
## $ latitude <dbl> 47.34299, 47.34588, 48.20457, 48.23074, 48.208...
## $ iso_adm2 <fct> at0504, at0504, at0319, at0900, at0302, at0606...
## $ rtr_speed_dl <dbl> 3226, 2068, 11604, 1948, 6277, 3577, 6526, 315...
## $ rtr_speed_ul <dbl> 419, 572, 2881, 1039, 2484, 2433, 1999, 1921, ...
## $ rtr_ping <dbl> 56.6295, 41.7948, 63.5496, 45.0728, 59.1642, 4...
Verb | Semantic |
---|---|
select() |
Select variables based on their name |
filter() |
Select cases based on their values |
mutate() /transmute() |
Add new variables that are functions of existing variables |
arrange() |
Reorder the cases |
summarise() |
Condense multiple values to a single value |
select()
select(RTR, device, device_platform, op_name)
## # A tibble: 239,060 x 3
## device device_platform op_name
## * <fct> <fct> <chr>
## 1 Galaxy Note 2 Android Pear
## 2 Galaxy Note 2 Android Pear
## 3 Galaxy Note 10.1 LTE Android Kiwi
## 4 Galaxy S3 Android Kiwi
## 5 Galaxy Note 10.1 LTE Android Kiwi
## 6 iPhone 5s iOS Apple
## 7 iPhone 5s iOS Apple
## 8 iPhone 4s iOS Pear
## 9 Galaxy S4 LTE Android Pear
## 10 iPhone 5s iOS Kiwi
## # ... with 239,050 more rows
Function | Semantic |
---|---|
starts_with() |
Each variable starting with a defined string |
ends_with() |
Each variable ending with a defined string |
contains() |
Each variable containing a defined string |
matches() |
Each variable matching a defined string |
num_range() |
Each variable matching a defined string + number |
one_of() |
Each variable whose names are in a group of names |
select()
select(RTR, rtr_speed_dl:rtr_ping)
## # A tibble: 239,060 x 3
## rtr_speed_dl rtr_speed_ul rtr_ping
## * <dbl> <dbl> <dbl>
## 1 3226. 419. 56.6
## 2 2068. 572. 41.8
## 3 11604. 2881. 63.5
## 4 1948. 1039. 45.1
## 5 6277. 2484. 59.2
## 6 3577. 2433. 47.9
## 7 6526. 1999. 47.5
## 8 3155. 1921. 39.9
## 9 11354. 837. 47.8
## 10 13600. 3048. 59.0
## # ... with 239,050 more rows
select(RTR, -c(id, iso_adm2))
## # A tibble: 239,060 x 13
## mtime mymd mym op_name nw_cat device
## * <dttm> <date> <chr> <chr> <fct> <fct>
## 1 2014-01-01 00:10:00 2014-01-01 2014-01 Pear 3G Galaxy Note 2
## 2 2014-01-01 00:16:00 2014-01-01 2014-01 Pear 3G Galaxy Note 2
## 3 2014-01-01 00:21:00 2014-01-01 2014-01 Kiwi 3G Galaxy Note 10.1…
## 4 2014-01-01 00:21:00 2014-01-01 2014-01 Kiwi 3G Galaxy S3
## 5 2014-01-01 00:27:00 2014-01-01 2014-01 Kiwi 3G Galaxy Note 10.1…
## 6 2014-01-01 00:41:00 2014-01-01 2014-01 Apple 3G iPhone 5s
## 7 2014-01-01 00:42:00 2014-01-01 2014-01 Apple 3G iPhone 5s
## 8 2014-01-01 00:48:00 2014-01-01 2014-01 Pear 3G iPhone 4s
## 9 2014-01-01 01:05:00 2014-01-01 2014-01 Pear 3G Galaxy S4 LTE
## 10 2014-01-01 01:07:00 2014-01-01 2014-01 Kiwi 3G iPhone 5s
## # ... with 239,050 more rows, and 7 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
device <- 5
select(RTR, device)
## # A tibble: 239,060 x 1
## device
## * <fct>
## 1 Galaxy Note 2
## 2 Galaxy Note 2
## 3 Galaxy Note 10.1 LTE
## 4 Galaxy S3
## 5 Galaxy Note 10.1 LTE
## 6 iPhone 5s
## 7 iPhone 5s
## 8 iPhone 4s
## 9 Galaxy S4 LTE
## 10 iPhone 5s
## # ... with 239,050 more rows
select(RTR, identity(device))
## # A tibble: 239,060 x 1
## op_name
## * <chr>
## 1 Pear
## 2 Pear
## 3 Kiwi
## 4 Kiwi
## 5 Kiwi
## 6 Apple
## 7 Apple
## 8 Pear
## 9 Pear
## 10 Kiwi
## # ... with 239,050 more rows
select(RTR, !! device)
## # A tibble: 239,060 x 1
## op_name
## * <chr>
## 1 Pear
## 2 Pear
## 3 Kiwi
## 4 Kiwi
## 5 Kiwi
## 6 Apple
## 7 Apple
## 8 Pear
## 9 Pear
## 10 Kiwi
## # ... with 239,050 more rows
select(RTR, contains("ng"))
## # A tibble: 239,060 x 2
## longitude rtr_ping
## * <dbl> <dbl>
## 1 13.2 56.6
## 2 13.2 41.8
## 3 15.8 63.5
## 4 16.4 45.1
## 5 15.6 59.2
## 6 15.5 47.9
## 7 15.5 47.5
## 8 13.2 39.9
## 9 13.8 47.8
## 10 15.2 59.0
## # ... with 239,050 more rows
one_of()
)!select(RTR, one_of(c("device", "op_name", "rtr_ping")))
## # A tibble: 239,060 x 3
## device op_name rtr_ping
## * <fct> <chr> <dbl>
## 1 Galaxy Note 2 Pear 56.6
## 2 Galaxy Note 2 Pear 41.8
## 3 Galaxy Note 10.1 LTE Kiwi 63.5
## 4 Galaxy S3 Kiwi 45.1
## 5 Galaxy Note 10.1 LTE Kiwi 59.2
## 6 iPhone 5s Apple 47.9
## 7 iPhone 5s Apple 47.5
## 8 iPhone 4s Pear 39.9
## 9 Galaxy S4 LTE Pear 47.8
## 10 iPhone 5s Kiwi 59.0
## # ... with 239,050 more rows
RTR_devices <- select(RTR, starts_with("dev"))
select(RTR_devices, ends_with("e"))
## # A tibble: 239,060 x 2
## device device_has_lte
## * <fct> <lgl>
## 1 Galaxy Note 2 FALSE
## 2 Galaxy Note 2 FALSE
## 3 Galaxy Note 10.1 LTE TRUE
## 4 Galaxy S3 FALSE
## 5 Galaxy Note 10.1 LTE TRUE
## 6 iPhone 5s TRUE
## 7 iPhone 5s TRUE
## 8 iPhone 4s FALSE
## 9 Galaxy S4 LTE TRUE
## 10 iPhone 5s TRUE
## # ... with 239,050 more rows
mutate()
transform()
transmute()
mutate(RTR, diff_speed = rtr_speed_ul - rtr_speed_dl)
## # A tibble: 239,060 x 16
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 9 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>,
## # diff_speed <dbl>
mutate(RTR,
abs_diff_speed = abs(rtr_speed_ul - rtr_speed_dl),
perc_diff_speed = abs_diff_speed / rtr_speed_dl * 100)
## # A tibble: 239,060 x 17
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 10 more variables:
## # device_platform <fct>, device_has_lte <lgl>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>, rtr_speed_dl <dbl>,
## # rtr_speed_ul <dbl>, rtr_ping <dbl>, abs_diff_speed <dbl>,
## # perc_diff_speed <dbl>
transmute(RTR,
diff_speed = rtr_speed_ul - rtr_speed_dl,
abs_diff_speed = abs(rtr_speed_ul - rtr_speed_dl))
## # A tibble: 239,060 x 2
## diff_speed abs_diff_speed
## <dbl> <dbl>
## 1 -2807. 2807.
## 2 -1496. 1496.
## 3 -8723. 8723.
## 4 -909. 909.
## 5 -3793. 3793.
## 6 -1144. 1144.
## 7 -4527. 4527.
## 8 -1234. 1234.
## 9 -10517. 10517.
## 10 -10552. 10552.
## # ... with 239,050 more rows
var <- "Hello World"
mutate(RTR, new = var)
## # A tibble: 239,060 x 16
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 9 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>, new <chr>
var <- seq(1, nrow(RTR))
mutate(RTR, new = var)
## # A tibble: 239,060 x 16
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 9 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>, new <int>
mutate()
mutate(RTR, sum_long_lat = longitude + latitude)
## # A tibble: 239,060 x 16
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 9 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>,
## # sum_long_lat <dbl>
mutate(RTR, ping_seconds = rtr_ping / 1000)
## # A tibble: 239,060 x 16
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 9 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>,
## # ping_seconds <dbl>
transmute(RTR, quot_lat_long = latitude / longitude)
## # A tibble: 239,060 x 1
## quot_lat_long
## <dbl>
## 1 3.58
## 2 3.58
## 3 3.06
## 4 2.94
## 5 3.08
## 6 3.04
## 7 3.04
## 8 3.59
## 9 3.40
## 10 3.21
## # ... with 239,050 more rows
filter()
|
as logical “or” operator between variablesis.na
, and %in%
filter(RTR, op_name == "Pear")
## # A tibble: 100,965 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3-… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy…
## 2 Oc2852599-… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy…
## 3 Oef71e021-… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone…
## 4 Oef957387-… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy…
## 5 Oe0bab19d-… 2014-01-01 01:25:00 2014-01-01 2014… Pear 3G Nexus 4
## 6 O25373236-… 2014-01-01 02:10:00 2014-01-01 2014… Pear 3G Galaxy…
## 7 O8e2eaee3-… 2014-01-01 02:29:00 2014-01-01 2014… Pear 3G Galaxy…
## 8 O6b01181c-… 2014-01-01 06:57:00 2014-01-01 2014… Pear 2G SM-T211
## 9 O69618fa8-… 2014-01-01 07:14:00 2014-01-01 2014… Pear 4G Galaxy…
## 10 Od48e6a8c-… 2014-01-01 07:15:00 2014-01-01 2014… Pear 3G Galaxy…
## # ... with 100,955 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
filter(RTR, op_name == "Pear", nw_cat == "3G")
## # A tibble: 53,091 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3-… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy…
## 2 Oc2852599-… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy…
## 3 Oef71e021-… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone…
## 4 Oef957387-… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy…
## 5 Oe0bab19d-… 2014-01-01 01:25:00 2014-01-01 2014… Pear 3G Nexus 4
## 6 O25373236-… 2014-01-01 02:10:00 2014-01-01 2014… Pear 3G Galaxy…
## 7 O8e2eaee3-… 2014-01-01 02:29:00 2014-01-01 2014… Pear 3G Galaxy…
## 8 Od48e6a8c-… 2014-01-01 07:15:00 2014-01-01 2014… Pear 3G Galaxy…
## 9 O89a7c80e-… 2014-01-01 07:17:00 2014-01-01 2014… Pear 3G LG G2
## 10 Of8ff4a2d-… 2014-01-01 07:35:00 2014-01-01 2014… Pear 3G Nexus 5
## # ... with 53,081 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
%in%
filter(RTR, device %in% c("Galaxy Note 2", "Galaxy Note 10.1"))
## # A tibble: 2,733 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3-… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy…
## 2 Oc2852599-… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy…
## 3 O2dfb19a1-… 2014-01-01 10:34:00 2014-01-01 2014… Kiwi 3G Galaxy…
## 4 Oc88d177a-… 2014-01-01 11:59:00 2014-01-01 2014… Apple 3G Galaxy…
## 5 O879652cc-… 2014-01-01 13:49:00 2014-01-01 2014… Pear 2G Galaxy…
## 6 O5ac02695-… 2014-01-01 15:13:00 2014-01-01 2014… Pear 3G Galaxy…
## 7 O9baf2d9f-… 2014-01-02 07:11:00 2014-01-02 2014… Pear 3G Galaxy…
## 8 O7f3e3424-… 2014-01-02 07:20:00 2014-01-02 2014… Pear 3G Galaxy…
## 9 O1939c394-… 2014-01-02 09:32:00 2014-01-02 2014… Apple 3G Galaxy…
## 10 Ob369df63-… 2014-01-02 09:49:00 2014-01-02 2014… Apple 3G Galaxy…
## # ... with 2,723 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
|
filter(RTR, rtr_speed_dl < 3000 | rtr_speed_ul < 500)
## # A tibble: 55,830 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 Oe2f2adbe… 2014-01-01 01:14:00 2014-01-01 2014… Kiwi 2G HTC Wil…
## 5 Oe0bab19d… 2014-01-01 01:25:00 2014-01-01 2014… Pear 3G Nexus 4
## 6 O7cb19690… 2014-01-01 01:26:00 2014-01-01 2014… Kiwi 2G Galaxy …
## 7 Oe1239547… 2014-01-01 02:14:00 2014-01-01 2014… Apple 2G Galaxy …
## 8 Od5a015ed… 2014-01-01 02:31:00 2014-01-01 2014… Apple 3G Galaxy …
## 9 O8e8bb20f… 2014-01-01 06:53:00 2014-01-01 2014… Apple 3G Galaxy …
## 10 O6b01181c… 2014-01-01 06:57:00 2014-01-01 2014… Pear 2G SM-T211
## # ... with 55,820 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
filter()
filter(RTR, device == "Galaxy Note 2")
## # A tibble: 2,082 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3-… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy…
## 2 Oc2852599-… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy…
## 3 O879652cc-… 2014-01-01 13:49:00 2014-01-01 2014… Pear 2G Galaxy…
## 4 O5ac02695-… 2014-01-01 15:13:00 2014-01-01 2014… Pear 3G Galaxy…
## 5 O9baf2d9f-… 2014-01-02 07:11:00 2014-01-02 2014… Pear 3G Galaxy…
## 6 O7f3e3424-… 2014-01-02 07:20:00 2014-01-02 2014… Pear 3G Galaxy…
## 7 Ob369df63-… 2014-01-02 09:49:00 2014-01-02 2014… Apple 3G Galaxy…
## 8 Obec412c9-… 2014-01-02 11:38:00 2014-01-02 2014… Pear 3G Galaxy…
## 9 O71038e64-… 2014-01-02 11:45:00 2014-01-02 2014… Pear 3G Galaxy…
## 10 Oc7129058-… 2014-01-02 12:16:00 2014-01-02 2014… Apple 3G Galaxy…
## # ... with 2,072 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
filter(RTR, device_platform == "iOS", device_has_lte == TRUE)
## # A tibble: 75,025 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 2 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 3 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## 4 O56352054… 2014-01-01 02:23:00 2014-01-01 2014… Kiwi 3G iPhone …
## 5 O170f6f0d… 2014-01-01 08:41:00 2014-01-01 2014… Apple 3G iPhone 5
## 6 O1f55288c… 2014-01-01 09:08:00 2014-01-01 2014… Pear 3G iPhone 5
## 7 O497ba5c7… 2014-01-01 09:23:00 2014-01-01 2014… Pear 3G iPhone 5
## 8 O3a0cc9be… 2014-01-01 10:22:00 2014-01-01 2014… Apple 3G iPad mi…
## 9 O8afb0b63… 2014-01-01 10:29:00 2014-01-01 2014… Apple 3G iPhone 5
## 10 O71c8d6ea… 2014-01-01 10:30:00 2014-01-01 2014… Pear 3G iPhone 5
## # ... with 75,015 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
filter(RTR, longitude > 13, longitude < 14)
## # A tibble: 23,772 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3-… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy…
## 2 Oc2852599-… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy…
## 3 Oef71e021-… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone…
## 4 Oef957387-… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy…
## 5 Od5a015ed-… 2014-01-01 02:31:00 2014-01-01 2014… Apple 3G Galaxy…
## 6 Of6074f8d-… 2014-01-01 02:40:00 2014-01-01 2014… Apple 3G HTC On…
## 7 Of4f037aa-… 2014-01-01 08:06:00 2014-01-01 2014… Kiwi 3G Galaxy…
## 8 O2e2223e0-… 2014-01-01 08:45:00 2014-01-01 2014… Pear 3G iPhone…
## 9 Obe795f00-… 2014-01-01 10:36:00 2014-01-01 2014… Pear 3G Galaxy…
## 10 Oaabff0d5-… 2014-01-01 10:41:00 2014-01-01 2014… Apple 3G Galaxy…
## # ... with 23,762 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
filter(RTR, longitude %in% c(13:16))
## # A tibble: 37 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 O3b76250e-0… 2014-10-11 07:39:44 2014-10-11 2014… Apple 3G iPhon…
## 2 Od45f5834-0… 2014-10-16 14:37:03 2014-10-16 2014… Kiwi 3G iPhon…
## 3 O19cd7844-b… 2014-10-20 08:16:10 2014-10-20 2014… Kiwi 3G iPhon…
## 4 O39f99095-0… 2014-10-26 04:43:33 2014-10-26 2014… Pear 3G iPhon…
## 5 O24bd0f28-6… 2014-12-29 20:39:35 2014-12-29 2014… Pear 4G iPhon…
## 6 Odf21e5e7-9… 2015-01-25 19:00:03 2015-01-25 2015… Apple 4G iPhon…
## 7 O9636cf3d-a… 2015-02-18 13:17:55 2015-02-18 2015… Pear 3G iPhon…
## 8 O59aa5e04-b… 2015-03-07 21:27:03 2015-03-07 2015… Apple 4G iPhon…
## 9 O3b6389d8-0… 2015-04-15 22:46:46 2015-04-15 2015… Apple 4G iPhon…
## 10 O4cbaa4aa-6… 2015-04-15 22:47:12 2015-04-15 2015… Apple 4G iPhon…
## # ... with 27 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
filter(RTR, !is.na(device_platform))
## # A tibble: 239,060 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
filter(RTR, rtr_speed_dl > mean(rtr_speed_dl))
## # A tibble: 76,742 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 O418baa97-… 2014-01-01 07:13:00 2014-01-01 2014… Apple 3G ME302KL
## 2 O69618fa8-… 2014-01-01 07:14:00 2014-01-01 2014… Pear 4G Galaxy…
## 3 Oefdc9660-… 2014-01-01 09:22:00 2014-01-01 2014… Pear 3G Galaxy…
## 4 O9a22bd24-… 2014-01-01 11:59:00 2014-01-01 2014… Pear 3G Galaxy…
## 5 Oef360337-… 2014-01-01 15:49:00 2014-01-01 2014… Pear 4G Galaxy…
## 6 Ob815d039-… 2014-01-01 15:50:00 2014-01-01 2014… Pear 4G Galaxy…
## 7 O9ccd7011-… 2014-01-02 00:49:00 2014-01-02 2014… Kiwi 4G iPhone…
## 8 Ob9e210c4-… 2014-01-02 05:27:00 2014-01-02 2014… Apple 4G Nexus 7
## 9 O8c771612-… 2014-01-02 06:28:00 2014-01-02 2014… Pear 3G Galaxy…
## 10 Oc1e6bea7-… 2014-01-02 06:48:00 2014-01-02 2014… Apple 4G Nexus 7
## # ... with 76,732 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
filter(RTR, (rtr_speed_ul / rtr_speed_dl) < 0.5)
## # A tibble: 168,036 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 6 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 7 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## 8 Oe2f2adbe… 2014-01-01 01:14:00 2014-01-01 2014… Kiwi 2G HTC Wil…
## 9 Oe0bab19d… 2014-01-01 01:25:00 2014-01-01 2014… Pear 3G Nexus 4
## 10 O7cb19690… 2014-01-01 01:26:00 2014-01-01 2014… Kiwi 2G Galaxy …
## # ... with 168,026 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
filter(RTR, rtr_speed_dl > 10000| device_has_lte == TRUE)
## # A tibble: 187,675 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 2 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 3 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 4 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 5 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 6 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## 7 Oe0bab19d… 2014-01-01 01:25:00 2014-01-01 2014… Pear 3G Nexus 4
## 8 O25373236… 2014-01-01 02:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 9 Oe1239547… 2014-01-01 02:14:00 2014-01-01 2014… Apple 2G Galaxy …
## 10 O56352054… 2014-01-01 02:23:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 187,665 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
arrange()
arrange
do?arrange(.data,...)
.data
= df/tbl…
= Comma separated list of unquoted variable namesarrange(x, ...)
arrange(x, desc(...))
arrange(RTR,mtime) #lowest on top
## # A tibble: 239,060 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
arrange(RTR,desc(mtime)) #highest on top
## # A tibble: 239,060 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Of7c626b… 2015-11-24 23:54:27 2015-11-24 2015… Kiwi 3G Galaxy S…
## 2 O8a0b863… 2015-11-24 23:48:12 2015-11-24 2015… Pear 4G iPhone 6…
## 3 O50932ec… 2015-11-24 23:44:50 2015-11-24 2015… Apple 4G iPhone 6
## 4 O3d996a9… 2015-11-24 23:44:46 2015-11-24 2015… Apple 3G Samsung …
## 5 O9bfca0d… 2015-11-24 23:43:20 2015-11-24 2015… Apple 3G Samsung …
## 6 O5119e98… 2015-11-24 23:41:34 2015-11-24 2015… Pear 4G Samsung …
## 7 O24c51b7… 2015-11-24 23:23:56 2015-11-24 2015… Pear 4G Samsung …
## 8 Oafedb30… 2015-11-24 23:18:35 2015-11-24 2015… Pear 4G Galaxy S…
## 9 O6036555… 2015-11-24 23:10:42 2015-11-24 2015… Pear 4G Galaxy S…
## 10 Oe45067c… 2015-11-24 23:08:54 2015-11-24 2015… Pear 3G Galaxy S…
## # ... with 239,050 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
arrange(A, weekday)
## # A tibble: 1,096 x 5
## ymd weekday nr_weekday sum_out holiday
## <fct> <fct> <int> <int> <dbl>
## 1 2007-01-05 Fri 5 25650 0.500
## 2 2007-01-12 Fri 5 13890 0.
## 3 2007-01-19 Fri 5 13770 0.
## 4 2007-01-26 Fri 5 14660 0.
## 5 2007-02-02 Fri 5 17010 0.
## 6 2007-02-09 Fri 5 20940 0.
## 7 2007-02-16 Fri 5 15900 0.
## 8 2007-02-23 Fri 5 13940 0.
## 9 2007-03-02 Fri 5 20190 0.
## 10 2007-03-09 Fri 5 18920 0.
## # ... with 1,086 more rows
arrange()
can arrange by multiple columns
arrange(x,col_1,…,col_i)
RTR_sel <- select(RTR, id, rtr_speed_ul,rtr_speed_dl,rtr_ping)
arrange(RTR_sel, rtr_speed_ul)
## # A tibble: 239,060 x 4
## id rtr_speed_ul rtr_speed_dl rtr_ping
## <chr> <dbl> <dbl> <dbl>
## 1 O432bcf6b-b63f-4d43-af52-87fb5adec5… 1. 42. 79.7
## 2 O9f5662ea-7408-401c-83cb-87911f3442… 1. 19. 220.
## 3 Oe6465982-64f0-49d5-843e-45474ab0ca… 1. 418. 83.9
## 4 O3866a77e-67ec-4611-a7a4-dc5d33efbd… 1. 126. 320.
## 5 O104a7cfe-8c62-4a8a-9afc-37c189965d… 1. 2250. 55.3
## 6 O546f6037-0dab-49d7-9b96-21c5e9cc9e… 1. 57. 279.
## 7 O74d7ac03-ff7f-4b3c-8869-7398eb6cd2… 1. 20. 139.
## 8 Oe9113807-1dbe-481f-aae9-b9e686c81f… 1. 105. 158.
## 9 O1dab072d-bc1a-411a-a8b5-ecb590b054… 1. 92. 325.
## 10 O88da59f3-ec80-4f48-a9a2-1b2eedeec7… 1. 23. 360.
## # ... with 239,050 more rows
arrange(RTR_sel, rtr_speed_ul,rtr_speed_dl,rtr_ping)
## # A tibble: 239,060 x 4
## id rtr_speed_ul rtr_speed_dl rtr_ping
## <chr> <dbl> <dbl> <dbl>
## 1 O9f5662ea-7408-401c-83cb-87911f3442… 1. 19. 220.
## 2 O74d7ac03-ff7f-4b3c-8869-7398eb6cd2… 1. 20. 139.
## 3 O88da59f3-ec80-4f48-a9a2-1b2eedeec7… 1. 23. 360.
## 4 O432bcf6b-b63f-4d43-af52-87fb5adec5… 1. 42. 79.7
## 5 O94402959-1e7d-4ed5-95d3-33bb4e99ac… 1. 42. 338.
## 6 O2cbca85a-8bad-47bd-a977-0094b72787… 1. 52. 501.
## 7 O546f6037-0dab-49d7-9b96-21c5e9cc9e… 1. 57. 279.
## 8 O1dab072d-bc1a-411a-a8b5-ecb590b054… 1. 92. 325.
## 9 Oe9113807-1dbe-481f-aae9-b9e686c81f… 1. 105. 158.
## 10 O3866a77e-67ec-4611-a7a4-dc5d33efbd… 1. 126. 320.
## # ... with 239,050 more rows
rank = rank(x)
with mutate, over a grouped datasetarrange(rank)
### fastest device, within group
RTR %>%
filter(!is.na(rtr_speed_ul) & !is.na(rtr_speed_dl))%>%
group_by(device) %>%
mutate(total_speed = rtr_speed_dl + rtr_speed_ul, rank_per_device = rank(desc(total_speed))) %>%
arrange(device, rank_per_device) %>%
select(device, total_speed, rank_per_device)
## # A tibble: 239,060 x 3
## # Groups: device [980]
## device total_speed rank_per_device
## <fct> <dbl> <dbl>
## 1 02_jbla668 13209. 1.
## 2 2206_jbla768_f 8828. 1.
## 3 5042D 121736. 1.
## 4 5042D 95326. 2.
## 5 5042D 82375. 3.
## 6 5042D 76154. 4.
## 7 5042D 75454. 5.
## 8 5042D 72084. 6.
## 9 5042D 71631. 7.
## 10 5042D 70441. 8.
## # ... with 239,050 more rows
###fastest device, between group
RTR %>%
filter(!is.na(rtr_speed_ul) & !is.na(rtr_speed_dl))%>%
group_by(device) %>%
summarise(mean_total_speed = mean(rtr_speed_dl + rtr_speed_ul)) %>%
mutate(rank_best_device = rank(desc(mean_total_speed)))%>%
arrange(rank_best_device) %>%
select(device, mean_total_speed, rank_best_device)
## # A tibble: 980 x 3
## device mean_total_speed rank_best_device
## <fct> <dbl> <dbl>
## 1 STV100-4 150651. 1.
## 2 PLK-UL00 132022. 2.
## 3 E560 124192. 3.
## 4 X12 116530. 4.
## 5 Samsung Note 4 (SM-N910F) 115078. 5.
## 6 ZTE Blade A452 114368. 6.
## 7 SM-N920F 114082. 7.
## 8 SM-A510F 112789. 8.
## 9 PLK-AL10 112204. 9.
## 10 SM-A310F 109553. 10.
## # ... with 970 more rows
### find the device with the smallest range in total speed
RTR %>%
filter(!is.na(rtr_speed_ul) & !is.na(rtr_speed_dl))%>%
group_by(device) %>%
summarise(range_tot_speed = diff(range(rtr_speed_dl + rtr_speed_ul)), n_devices = n()) %>%
filter(n_devices > 5) %>%
mutate(rank_smallest_range = rank(range_tot_speed)) %>%
arrange(rank_smallest_range)
## # A tibble: 601 x 4
## device range_tot_speed n_devices rank_smallest_range
## <fct> <dbl> <int> <dbl>
## 1 ZP300 158. 10 1.
## 2 RC0709B 1223. 8 2.
## 3 RunboX5-King 1579. 9 3.
## 4 LA-Q1 2196. 7 4.
## 5 Galaxy Mini (GT-S5570) 2269. 7 5.
## 6 C2005 2299. 10 6.
## 7 MEDION LIFE P4310 2512. 32 7.
## 8 ZTE MEDION LIFE P4310 2790. 7 8.
## 9 QUANTUM 4 2946. 7 9.
## 10 Sony Xperia tipo Dual 2990. 6 10.
## # ... with 591 more rows
arrange()
arrange(RTR, rtr_speed_ul)
## # A tibble: 239,060 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 O432bcf6b-b… 2014-01-02 14:53:00 2014-01-02 2014… Kiwi 3G iPhon…
## 2 O9f5662ea-7… 2014-01-07 12:24:00 2014-01-07 2014… Kiwi 2G iPhon…
## 3 Oe6465982-6… 2014-01-11 10:18:00 2014-01-11 2014… Apple 3G iPhon…
## 4 O3866a77e-6… 2014-01-19 07:19:00 2014-01-19 2014… Pear 2G iPhon…
## 5 O104a7cfe-8… 2014-01-26 19:45:00 2014-01-26 2014… Apple 3G iPhon…
## 6 O546f6037-0… 2014-03-29 18:53:00 2014-03-29 2014… Pear 2G iPhon…
## 7 O74d7ac03-f… 2014-04-26 15:21:00 2014-04-26 2014… Apple 3G iPhon…
## 8 Oe9113807-1… 2014-05-05 10:17:00 2014-05-05 2014… Kiwi 2G iPhon…
## 9 O1dab072d-b… 2014-05-24 15:57:00 2014-05-24 2014… Pear 2G iPhon…
## 10 O88da59f3-e… 2014-08-09 14:24:00 2014-08-09 2014… Pear 2G iPhon…
## # ... with 239,050 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
desc()
- arrange by upload-speed, highest on top and in second order by lowest pingarrange(RTR, desc(rtr_speed_ul), rtr_ping)
## # A tibble: 239,060 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 O87536f15-… 2014-08-08 18:45:00 2014-08-08 2014… Pear 4G Galaxy…
## 2 O430537d0-… 2015-11-18 21:57:42 2015-11-18 2015… Pear 4G SM-G92…
## 3 Ofbc28c59-… 2014-09-08 16:58:00 2014-09-08 2014… Pear 4G Galaxy…
## 4 O976b7a52-… 2014-11-07 12:14:40 2014-11-07 2014… Pear 4G Galaxy…
## 5 Ocfddd626-… 2014-11-06 07:27:00 2014-11-06 2014… Pear 4G Galaxy…
## 6 O008e8fca-… 2014-12-11 05:11:22 2014-12-11 2014… Pear 4G A0001
## 7 Od84cb3d1-… 2014-08-28 17:28:00 2014-08-28 2014… Pear 4G Galaxy…
## 8 Of214831b-… 2014-09-04 06:21:00 2014-09-04 2014… Pear 4G Galaxy…
## 9 Ofd40caec-… 2014-08-29 14:34:00 2014-08-29 2014… Pear 4G Galaxy…
## 10 O4d895570-… 2014-10-19 09:04:39 2014-10-19 2014… Pear 4G Sony X…
## # ... with 239,050 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
arrange(RTR,rtr_speed_ul,rtr_speed_dl,desc(rtr_ping))
## # A tibble: 239,060 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 O9f5662ea-7… 2014-01-07 12:24:00 2014-01-07 2014… Kiwi 2G iPhon…
## 2 O74d7ac03-f… 2014-04-26 15:21:00 2014-04-26 2014… Apple 3G iPhon…
## 3 O88da59f3-e… 2014-08-09 14:24:00 2014-08-09 2014… Pear 2G iPhon…
## 4 O94402959-1… 2014-08-24 10:08:00 2014-08-24 2014… Pear 2G iPhon…
## 5 O432bcf6b-b… 2014-01-02 14:53:00 2014-01-02 2014… Kiwi 3G iPhon…
## 6 O2cbca85a-8… 2015-07-06 08:42:42 2015-07-06 2015… Pear 2G iPhon…
## 7 O546f6037-0… 2014-03-29 18:53:00 2014-03-29 2014… Pear 2G iPhon…
## 8 O1dab072d-b… 2014-05-24 15:57:00 2014-05-24 2014… Pear 2G iPhon…
## 9 Oe9113807-1… 2014-05-05 10:17:00 2014-05-05 2014… Kiwi 2G iPhon…
## 10 O3866a77e-6… 2014-01-19 07:19:00 2014-01-19 2014… Pear 2G iPhon…
## # ... with 239,050 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
temp1 <- arrange(RTR, full_speed = rtr_speed_dl + rtr_speed_ul, desc(rtr_ping))
select(temp1, device, nw_cat)
## # A tibble: 239,060 x 2
## device nw_cat
## <fct> <fct>
## 1 iPhone 5s 2G
## 2 iPhone 5s 2G
## 3 iPhone 5s 2G
## 4 Galaxy S3 2G
## 5 iPhone 5 2G
## 6 iPhone 5s 2G
## 7 iPad 2 GSM 3G
## 8 Sony Xperia SP 2G
## 9 HTC One 3G
## 10 Galaxy S2 2G
## # ... with 239,050 more rows
summarise()
summarise()
do?summarise(.data,…)
…
are name-value pairs of comma separated summary functions
name
= name of the variable in the resultsvalue
= an expression that returns a single value (e.g. min(x)) summarise(RTR, max_download_speed = max(rtr_speed_dl))
## # A tibble: 1 x 1
## max_download_speed
## <dbl>
## 1 149989.
(a few) Functions | Semantic |
---|---|
min(x) | smallest value of x |
max(x) | gives back the biggest value of x |
n() | number of observations |
n_distinct(x) | number of unique observations of x |
mean(x) | mean of x |
median(x) | median of x |
RTR_fil <- filter(RTR, !is.na(rtr_speed_dl))
summarise(RTR_fil, n_devices = n(), n_unique_devices = n_distinct(device), mean_dl_speed = mean(rtr_speed_dl))
## # A tibble: 1 x 3
## n_devices n_unique_devices mean_dl_speed
## <int> <int> <dbl>
## 1 239060 980 21904.
summarise()
?
summarise()
summarise(RTR, n_observed = n(), unique_devices = n_distinct(device))
## # A tibble: 1 x 2
## n_observed unique_devices
## <int> <int>
## 1 239060 980
summarise(RTR, max(rtr_speed_dl), min(rtr_speed_ul), median(mymd), mean(rtr_ping))
## # A tibble: 1 x 4
## `max(rtr_speed_dl)` `min(rtr_speed_ul)` `median(mymd)` `mean(rtr_ping)`
## <dbl> <dbl> <date> <dbl>
## 1 149989. 1. 2015-01-20 54.1
temp1 <- filter(RTR, !is.na(rtr_speed_dl) & device_platform == "Android")
summarise(temp1, n_devices = n(), n_unique_devices = n_distinct(device), mean_dl_speed = mean(rtr_speed_dl))
## # A tibble: 1 x 3
## n_devices n_unique_devices mean_dl_speed
## <int> <int> <dbl>
## 1 154519 951 20261.
.data %>% function(x)
equivalent to function(.data,x)
.data
= df/tblx
= any parameters summarise(RTR,min_longitude=min(longitude),max_longitude=max(longitude))
RTR %>%
summarise(min_longitude=min(longitude),max_longitude=max(longitude))
## # A tibble: 1 x 2
## min_longitude max_longitude
## <dbl> <dbl>
## 1 9.54 17.1
summarise(RTR,min_latitude=min(latitude),max_latitude=max(latitude))
RTR %>%
summarise(min_latitude=min(latitude),max_latitude=max(latitude))
## # A tibble: 1 x 2
## min_latitude max_latitude
## <dbl> <dbl>
## 1 46.4 49.0
### Piping
RTR %>%
summarise(mean_rtr_speed_dl = mean(rtr_speed_dl, na.rm = TRUE), median_rtr_speed_ul=median(rtr_speed_ul,na.rm = TRUE))
## # A tibble: 1 x 2
## mean_rtr_speed_dl median_rtr_speed_ul
## <dbl> <dbl>
## 1 21904. 2656.
RTR %>%
filter(device_has_lte==TRUE) %>%
summarise(mean_rtr_speed_dl=mean(rtr_speed_dl))
## # A tibble: 1 x 1
## mean_rtr_speed_dl
## <dbl>
## 1 27156.
RTR %>%
filter(device_has_lte==TRUE) %>%
summarise(median_rtr_speed_dl=median(rtr_speed_dl))
## # A tibble: 1 x 1
## median_rtr_speed_dl
## <dbl>
## 1 16838.
RTR %>%
filter(device_has_lte==FALSE) %>%
summarise(median_rtr_speed_dl=median(rtr_speed_dl), max_rtr_speed_dl=max(rtr_speed_dl))
## # A tibble: 1 x 2
## median_rtr_speed_dl max_rtr_speed_dl
## <dbl> <dbl>
## 1 3261. 127231.
RTR %>%
filter(op_name=='Pear') %>%
select(contains('rtr')) %>%
summarise_all(funs(mean))
## # A tibble: 1 x 3
## rtr_speed_dl rtr_speed_ul rtr_ping
## <dbl> <dbl> <dbl>
## 1 25213. 9063. 51.4
RTR %>%
filter(device=='Galaxy Note 2', op_name=='Kiwi',rtr_ping > 70, rtr_speed_ul> 2000) %>%
summarise(mean_rtr_speed_dl = mean(rtr_speed_dl))
## # A tibble: 1 x 1
## mean_rtr_speed_dl
## <dbl>
## 1 5658.
e_median_dl=RTR %>%
filter(longitude>13.033333) %>%
summarise(east_median_dl= median(rtr_speed_dl))
w_median_dl= RTR %>%
filter(longitude<13.033333) %>%
summarise(west_median_dl= median(rtr_speed_dl))
e_median_dl=e_median_dl$east_median_dl
w_median_dl=w_median_dl$west_median_dl
cat(paste('The median download rate is ',e_median_dl,' in the east of Salzburg and ',w_median_dl,' in the west.',sep=''))
## The median download rate is 11034 in the east of Salzburg and 9478 in the west.
lat=RTR %>%
summarise(latitude_median= median(latitude))
lat_median=lat$latitude_median[1]
north_median=RTR %>%
filter(latitude>lat_median) %>%
summarise(north_median_ul= median(rtr_speed_ul))
south_median=RTR %>%
filter(latitude<lat_median) %>%
summarise(south_median_ul= median(rtr_speed_ul))
north_median=north_median$north_median_ul
south_median=south_median$south_median_ul
cat(paste("The latitude median is ",lat_median,'. The obtained median upload rate north this line is ',north_median,' and south \n this line ',south_median, '.',sep=''))
## The latitude median is 48.08865835. The obtained median upload rate north this line is 2807 and south
## this line 2506.
group_by()
group_by()
affect the basic verbs?Verb | Effect when grouped |
---|---|
select() |
Same effect as ungrouped. Grouping variables are always retained |
arrange() |
Same effect as ungrouped. When .by_group = TRUE it first orders by grouping variables |
filter() |
Same effect as ungrouped. Useful functions like rank() or min(x) == x |
mutate() |
Same effect as ungrouped. Useful functions like rank() or min(x) == x |
summarise() |
Computes the summary for each group |
group_by()
is most powerful when combined with summarise()
!group_by(RTR, device)
## # A tibble: 239,060 x 15
## # Groups: device [980]
## id mtime mymd mym op_name nw_cat device
## * <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Oadff95f3… 2014-01-01 00:10:00 2014-01-01 2014… Pear 3G Galaxy …
## 2 Oc2852599… 2014-01-01 00:16:00 2014-01-01 2014… Pear 3G Galaxy …
## 3 O48a47c74… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 4 O69f25c23… 2014-01-01 00:21:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 5 Oede0db57… 2014-01-01 00:27:00 2014-01-01 2014… Kiwi 3G Galaxy …
## 6 O383e9306… 2014-01-01 00:41:00 2014-01-01 2014… Apple 3G iPhone …
## 7 O4b28bf72… 2014-01-01 00:42:00 2014-01-01 2014… Apple 3G iPhone …
## 8 Oef71e021… 2014-01-01 00:48:00 2014-01-01 2014… Pear 3G iPhone …
## 9 Oef957387… 2014-01-01 01:05:00 2014-01-01 2014… Pear 3G Galaxy …
## 10 O6c813ae1… 2014-01-01 01:07:00 2014-01-01 2014… Kiwi 3G iPhone …
## # ... with 239,050 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
summarise()
peels off one level of the groupingdev <- group_by(RTR, device_platform, device)
(per_dev <- summarise(dev, count = n()))
## # A tibble: 980 x 3
## # Groups: device_platform [?]
## device_platform device count
## <fct> <fct> <int>
## 1 Android 02_jbla668 1
## 2 Android 2206_jbla768_f 1
## 3 Android 5042D 43
## 4 Android 6039Y 10
## 5 Android 6045K 6
## 6 Android 6045Y 7
## 7 Android 7043K 1
## 8 Android 709v82_jbla118 14
## 9 Android 8030Y 2
## 10 Android 96WJ99PJ76LT39B1 6
## # ... with 970 more rows
(per_plat <- summarise(per_dev, count = sum(count)))
## # A tibble: 2 x 2
## device_platform count
## <fct> <int>
## 1 Android 154519
## 2 iOS 84541
dl <- RTR %>%
group_by(device) %>%
summarise(count = n(), dl = mean(rtr_speed_dl, na.rm = TRUE), ping = mean(rtr_ping, na.rm = TRUE)) %>%
filter(dl < 60000, count > 100)
ggplot(dl, aes(dl, ping)) +
geom_point(aes(size = count), alpha = 1/2) +
geom_smooth() +
scale_size_area()
group_by()
per_device <- RTR %>%
group_by(nw_cat, device) %>%
summarise(count_dev = n(), ul = mean(rtr_speed_ul, na.rm = TRUE))
(per_nw_cat <- per_device %>%
summarise(count_nw = n(), ul = weighted.mean(ul, count_dev, na.rm = TRUE)) %>%
mutate(rank = rank(desc(ul))) %>%
arrange(rank))
## # A tibble: 3 x 4
## nw_cat count_nw ul rank
## <fct> <int> <dbl> <dbl>
## 1 4G 302 19625. 1.
## 2 3G 913 1966. 2.
## 3 2G 462 125. 3.
RTR %>%
group_by(device) %>%
summarise(freq_of_use = n()) %>%
mutate(rank = rank(desc(freq_of_use))) %>%
arrange(rank)
## # A tibble: 980 x 3
## device freq_of_use rank
## <fct> <int> <dbl>
## 1 iPhone 6 21743 1.
## 2 iPhone 5s 18917 2.
## 3 iPhone 5 14031 3.
## 4 Galaxy S4 LTE 10634 4.
## 5 Galaxy S5 10366 5.
## 6 Galaxy S3 9551 6.
## 7 iPhone 6 Plus 6734 7.
## 8 LGE Nexus 5 5702 8.
## 9 iPad mini 2G Cellular 5638 9.
## 10 Galaxy S4 LTE-A 4852 10.
## # ... with 970 more rows
RTR %>%
select(rtr_speed_dl, rtr_speed_ul) %>%
filter(rtr_speed_dl > mean(rtr_speed_dl) & rtr_speed_ul > mean(rtr_speed_ul)) %>%
mutate(sum_speed = (rtr_speed_dl + rtr_speed_ul), rank_speed = rank(desc(sum_speed))) %>%
arrange(rank_speed) %>%
select(sum_speed, rank_speed)
## # A tibble: 55,288 x 2
## sum_speed rank_speed
## <dbl> <dbl>
## 1 197710. 1.
## 2 196830. 2.
## 3 196075. 3.
## 4 195933. 4.
## 5 195860. 5.
## 6 195675. 6.
## 7 195147. 7.
## 8 194772. 8.
## 9 194707. 9.
## 10 194615. 10.
## # ... with 55,278 more rows
Tip: your group-by call needs 3 columns and afterwards you need 2 summarise calls.
Tip 2: use device_platform, nw_cat, device in this order; summarise with number_devices = n() to get number of devices per model and further summarise the n_distinct of number_devices to see how many unique models run on their respective nw-category split into Android and iOS-phones
RTR %>%
group_by(device_platform, nw_cat,device) %>%
summarise(n_per_device = n()) %>%
summarise(unique_devices = n_distinct(n_per_device))
## # A tibble: 6 x 3
## # Groups: device_platform [?]
## device_platform nw_cat unique_devices
## <fct> <fct> <int>
## 1 Android 2G 71
## 2 Android 3G 179
## 3 Android 4G 117
## 4 iOS 2G 18
## 5 iOS 3G 28
## 6 iOS 4G 20
join
do?inner_join(x,y,by=NULL,copy = FALSE,suffix = c(".x",".y"),...)
x
,y
) with a shared character vector of variables (by =
“key-col”) together
join
and not merge
(temp1 <- left_join(file1, file2, by = "id")) #explicitly stating keycol
## # A tibble: 20 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offff0… Galax… Android TRUE 2486. 58.
## 8 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 9 Offfe8… Galax… Android TRUE 25868. 3362.
## 10 Offfd5… Galax… Android TRUE 63335. 22201.
## 11 Offfeb… SM-G9… Android TRUE 45008. 19824.
## 12 Offfee… HTC D… Android TRUE 39025. 18378.
## 13 Offfe4… iPhon… iOS TRUE 40061. 987.
## 14 Offfec… iPhon… iOS TRUE 3810. 1343.
## 15 Offfb6… HTC D… Android FALSE 655. 872.
## 16 Offffa… iPhon… iOS TRUE 59876. 13818.
## 17 Offfe4… Galax… Android TRUE 140156. 48355.
## 18 Offfff… iPhon… iOS TRUE 105078. 25598.
## 19 Offff8… iPhon… iOS FALSE 5608. 1984.
## 20 Offfe5… iPad … iOS TRUE 51751. 22994.
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
(temp2 <- merge(file1, file2, by = "id", all.x = T))
## id device
## 1 Offfb0e8a-8a4e-46d1-ae82-3ccd58e48964 iPhone 5s
## 2 Offfb42a3-58d7-469e-bbfd-53031a70836c Galaxy S4 Mini
## 3 Offfb681e-19c5-4285-b29e-81984e6bd36e HTC Desire S
## 4 Offfbe951-bf98-4653-97e0-b4da1f6683f8 Galaxy S4 Active
## 5 Offfcf842-b732-4b28-b083-ea2878afbcd4 Galaxy S4 Active
## 6 Offfd173b-89ee-440c-bfb2-56a6efac165a Galaxy S4 LTE-A
## 7 Offfd5337-2602-4a82-8f6d-37acfaaf754c Galaxy S3 LTE
## 8 Offfdc5a7-7490-463d-abfe-e92b7e083e21 HTC One S
## 9 Offfe3f7c-51f6-4b98-a833-a7b1e43eef99 Galaxy S4 LTE-A
## 10 Offfe464e-3c77-4bd6-b6f2-18a5646f8481 iPhone 6
## 11 Offfe4ba7-55fe-4537-bf2b-1f9402709510 Galaxy S4 LTE-A
## 12 Offfe5a19-84ae-4909-9cae-b3443d1204da iPad Air 2 (Wi-Fi/Cellular)
## 13 Offfe8df3-5180-411e-965a-b8ebc824676d Galaxy S4 LTE
## 14 Offfeb095-0ff8-4cb9-b276-1c9e97e637c5 SM-G925F
## 15 Offfecdb4-76d0-4697-be0e-11b2911963ac iPhone 5
## 16 Offfeec5c-4b40-46a4-ae97-c121f326e1e2 HTC Desire 510
## 17 Offff0ee1-1cb0-4b23-ac4c-aa8b704b62f9 Galaxy S5
## 18 Offff830e-3122-4992-ab3e-a19fbef43747 iPhone 4s
## 19 Offffa9ea-865d-4f7e-8d14-e5be6006bd1a iPhone 6s Plus
## 20 Offfffc9d-75a5-4665-9aa1-c7378f0811c2 iPhone 6s
## device_platform device_has_lte rtr_speed_dl rtr_speed_ul rtr_ping
## 1 iOS TRUE 3835 2137 37.2755
## 2 Android TRUE 15259 3160 38.0284
## 3 Android FALSE 655 872 50.3343
## 4 Android TRUE 3520 1877 47.8821
## 5 Android TRUE 19065 1750 39.5813
## 6 Android TRUE 13085 827 31.5617
## 7 Android TRUE 63335 22201 25.0297
## 8 Android FALSE 3962 3012 70.8923
## 9 Android TRUE 3340 65 37.9499
## 10 iOS TRUE 40061 987 39.9457
## 11 Android TRUE 140156 48355 19.7262
## 12 iOS TRUE 51751 22994 21.2528
## 13 Android TRUE 25868 3362 40.4511
## 14 Android TRUE 45008 19824 42.7394
## 15 iOS TRUE 3810 1343 39.1077
## 16 Android TRUE 39025 18378 18.7053
## 17 Android TRUE 2486 58 140.3440
## 18 iOS FALSE 5608 1984 60.9711
## 19 iOS TRUE 59876 13818 39.8866
## 20 iOS TRUE 105078 25598 25.3622
## mtime mymd mym op_name nw_cat longitude
## 1 2014-12-22 18:29:19 2014-12-22 2014-12 Kiwi 3G 16.34548
## 2 2014-08-28 09:28:00 2014-08-28 2014-08 Apple 3G 16.57574
## 3 2015-09-10 18:03:04 2015-09-10 2015-09 Pear 3G 16.32590
## 4 2014-05-22 09:57:00 2014-05-22 2014-05 Apple 3G 14.33279
## 5 2014-07-21 20:10:00 2014-07-21 2014-07 Pear 3G 12.56355
## 6 2014-05-19 13:51:00 2014-05-19 2014-05 Pear 3G 15.99446
## 7 2015-03-27 08:46:18 2015-03-27 2015-03 Kiwi 4G 16.76051
## 8 2014-06-24 19:46:00 2014-06-24 2014-06 Apple 3G 13.63181
## 9 2014-02-22 08:34:00 2014-02-22 2014-02 Pear 3G 15.89740
## 10 2015-08-19 09:47:50 2015-08-19 2015-08 Apple 4G 14.31039
## 11 <NA> <NA> <NA> <NA> <NA> NA
## 12 <NA> <NA> <NA> <NA> <NA> NA
## 13 <NA> <NA> <NA> <NA> <NA> NA
## 14 <NA> <NA> <NA> <NA> <NA> NA
## 15 <NA> <NA> <NA> <NA> <NA> NA
## 16 <NA> <NA> <NA> <NA> <NA> NA
## 17 <NA> <NA> <NA> <NA> <NA> NA
## 18 <NA> <NA> <NA> <NA> <NA> NA
## 19 <NA> <NA> <NA> <NA> <NA> NA
## 20 <NA> <NA> <NA> <NA> <NA> NA
## latitude iso_adm2
## 1 48.15482 at0900
## 2 48.30935 at0308
## 3 48.24507 at0900
## 4 46.61040 at0201
## 5 47.58698 at0704
## 6 46.68344 at0623
## 7 47.94608 at0107
## 8 48.00336 at0417
## 9 47.24231 at0622
## 10 46.62826 at0201
## 11 NA <NA>
## 12 NA <NA>
## 13 NA <NA>
## 14 NA <NA>
## 15 NA <NA>
## 16 NA <NA>
## 17 NA <NA>
## 18 NA <NA>
## 19 NA <NA>
## 20 NA <NA>
setequal(temp1, temp2)
## TRUE
identical(temp1, temp2)
## [1] FALSE
left_join(file1, file2) #natural join - see message
## Joining, by = "id"
## # A tibble: 20 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offff0… Galax… Android TRUE 2486. 58.
## 8 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 9 Offfe8… Galax… Android TRUE 25868. 3362.
## 10 Offfd5… Galax… Android TRUE 63335. 22201.
## 11 Offfeb… SM-G9… Android TRUE 45008. 19824.
## 12 Offfee… HTC D… Android TRUE 39025. 18378.
## 13 Offfe4… iPhon… iOS TRUE 40061. 987.
## 14 Offfec… iPhon… iOS TRUE 3810. 1343.
## 15 Offfb6… HTC D… Android FALSE 655. 872.
## 16 Offffa… iPhon… iOS TRUE 59876. 13818.
## 17 Offfe4… Galax… Android TRUE 140156. 48355.
## 18 Offfff… iPhon… iOS TRUE 105078. 25598.
## 19 Offff8… iPhon… iOS FALSE 5608. 1984.
## 20 Offfe5… iPad … iOS TRUE 51751. 22994.
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
left_join
: matching rows of y get joined into xright_join
: matching rows of x get joined into yinner_join
: “exclusive join” only rows with matches in the other set get joinedfull_join
: every row from y gets merged into xleft_join(file1, file2, by = "id")
## # A tibble: 20 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offff0… Galax… Android TRUE 2486. 58.
## 8 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 9 Offfe8… Galax… Android TRUE 25868. 3362.
## 10 Offfd5… Galax… Android TRUE 63335. 22201.
## 11 Offfeb… SM-G9… Android TRUE 45008. 19824.
## 12 Offfee… HTC D… Android TRUE 39025. 18378.
## 13 Offfe4… iPhon… iOS TRUE 40061. 987.
## 14 Offfec… iPhon… iOS TRUE 3810. 1343.
## 15 Offfb6… HTC D… Android FALSE 655. 872.
## 16 Offffa… iPhon… iOS TRUE 59876. 13818.
## 17 Offfe4… Galax… Android TRUE 140156. 48355.
## 18 Offfff… iPhon… iOS TRUE 105078. 25598.
## 19 Offff8… iPhon… iOS FALSE 5608. 1984.
## 20 Offfe5… iPad … iOS TRUE 51751. 22994.
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
right_join(file1, file2, by = "id")
## # A tibble: 20 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe4… iPhon… iOS TRUE 40061. 987.
## 2 Offfe3… Galax… Android TRUE 3340. 65.
## 3 Offfdc… HTC O… Android FALSE 3962. 3012.
## 4 Offfd5… Galax… Android TRUE 63335. 22201.
## 5 Offfd1… Galax… Android TRUE 13085. 827.
## 6 Offfcf… Galax… Android TRUE 19065. 1750.
## 7 Offfbe… Galax… Android TRUE 3520. 1877.
## 8 Offfb6… HTC D… Android FALSE 655. 872.
## 9 Offfb4… Galax… Android TRUE 15259. 3160.
## 10 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 11 Offfaf… <NA> <NA> NA NA NA
## 12 Offfaf… <NA> <NA> NA NA NA
## 13 Offfa1… <NA> <NA> NA NA NA
## 14 Offf97… <NA> <NA> NA NA NA
## 15 Offf97… <NA> <NA> NA NA NA
## 16 Offf93… <NA> <NA> NA NA NA
## 17 Offf92… <NA> <NA> NA NA NA
## 18 Offf91… <NA> <NA> NA NA NA
## 19 Offf90… <NA> <NA> NA NA NA
## 20 Offf8c… <NA> <NA> NA NA NA
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
inner_join(file1, file2, by = "id")
## # A tibble: 10 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 8 Offfd5… Galax… Android TRUE 63335. 22201.
## 9 Offfe4… iPhon… iOS TRUE 40061. 987.
## 10 Offfb6… HTC D… Android FALSE 655. 872.
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
full_join(file1, file2, by = "id")
## # A tibble: 30 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offff0… Galax… Android TRUE 2486. 58.
## 8 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 9 Offfe8… Galax… Android TRUE 25868. 3362.
## 10 Offfd5… Galax… Android TRUE 63335. 22201.
## # ... with 20 more rows, and 9 more variables: rtr_ping <dbl>,
## # mtime <dttm>, mymd <date>, mym <chr>, op_name <chr>, nw_cat <fct>,
## # longitude <dbl>, latitude <dbl>, iso_adm2 <fct>
file1 <- rename(file1, id_number = id)
left_join(file1, file2) #with error
## Error: `by` required, because the data sources have no common variables
left_join(file1, file2, by = c("id_number" = "id")) #solution
## # A tibble: 20 x 15
## id_number device device_platform device_has_lte rtr_speed_dl
## <chr> <fct> <fct> <lgl> <dbl>
## 1 Offfe3f7c-51f6-… Galaxy S4… Android TRUE 3340.
## 2 Offfd173b-89ee-… Galaxy S4… Android TRUE 13085.
## 3 Offfbe951-bf98-… Galaxy S4… Android TRUE 3520.
## 4 Offfdc5a7-7490-… HTC One S Android FALSE 3962.
## 5 Offfcf842-b732-… Galaxy S4… Android TRUE 19065.
## 6 Offfb42a3-58d7-… Galaxy S4… Android TRUE 15259.
## 7 Offff0ee1-1cb0-… Galaxy S5 Android TRUE 2486.
## 8 Offfb0e8a-8a4e-… iPhone 5s iOS TRUE 3835.
## 9 Offfe8df3-5180-… Galaxy S4… Android TRUE 25868.
## 10 Offfd5337-2602-… Galaxy S3… Android TRUE 63335.
## 11 Offfeb095-0ff8-… SM-G925F Android TRUE 45008.
## 12 Offfeec5c-4b40-… HTC Desir… Android TRUE 39025.
## 13 Offfe464e-3c77-… iPhone 6 iOS TRUE 40061.
## 14 Offfecdb4-76d0-… iPhone 5 iOS TRUE 3810.
## 15 Offfb681e-19c5-… HTC Desir… Android FALSE 655.
## 16 Offffa9ea-865d-… iPhone 6s… iOS TRUE 59876.
## 17 Offfe4ba7-55fe-… Galaxy S4… Android TRUE 140156.
## 18 Offfffc9d-75a5-… iPhone 6s iOS TRUE 105078.
## 19 Offff830e-3122-… iPhone 4s iOS FALSE 5608.
## 20 Offfe5a19-84ae-… iPad Air … iOS TRUE 51751.
## # ... with 10 more variables: rtr_speed_ul <dbl>, rtr_ping <dbl>,
## # mtime <dttm>, mymd <date>, mym <chr>, op_name <chr>, nw_cat <fct>,
## # longitude <dbl>, latitude <dbl>, iso_adm2 <fct>
semi_join(x,y,by = NULL,copy = FALSE,...)
semi_join
: returns all x with matching rows in y (without adding rows from y)
anti_join
: returns all x without matching rows in y
semi_join(file1, file2)
## Joining, by = "id"
## # A tibble: 10 x 7
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 8 Offfd5… Galax… Android TRUE 63335. 22201.
## 9 Offfe4… iPhon… iOS TRUE 40061. 987.
## 10 Offfb6… HTC D… Android FALSE 655. 872.
## # ... with 1 more variable: rtr_ping <dbl>
anti_join(file1, file2)
## Joining, by = "id"
## # A tibble: 10 x 7
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offff0… Galax… Android TRUE 2486. 58.
## 2 Offfe8… Galax… Android TRUE 25868. 3362.
## 3 Offfeb… SM-G9… Android TRUE 45008. 19824.
## 4 Offfee… HTC D… Android TRUE 39025. 18378.
## 5 Offfec… iPhon… iOS TRUE 3810. 1343.
## 6 Offffa… iPhon… iOS TRUE 59876. 13818.
## 7 Offfe4… Galax… Android TRUE 140156. 48355.
## 8 Offfff… iPhon… iOS TRUE 105078. 25598.
## 9 Offff8… iPhon… iOS FALSE 5608. 1984.
## 10 Offfe5… iPad … iOS TRUE 51751. 22994.
## # ... with 1 more variable: rtr_ping <dbl>
left_join()
)?## left_join
# method 1 - natural call
left_join(file1, file2)
# method 2 - explicitly define the id
left_join(file1, file2, by = "id")
# method 3 - explicit call with pipe-operator
file1 %>%
left_join(file2, by = "id")
# method 4 - natural call with pipe operator
file1 %>%
left_join(file2)
## # A tibble: 20 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offff0… Galax… Android TRUE 2486. 58.
## 8 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 9 Offfe8… Galax… Android TRUE 25868. 3362.
## 10 Offfd5… Galax… Android TRUE 63335. 22201.
## 11 Offfeb… SM-G9… Android TRUE 45008. 19824.
## 12 Offfee… HTC D… Android TRUE 39025. 18378.
## 13 Offfe4… iPhon… iOS TRUE 40061. 987.
## 14 Offfec… iPhon… iOS TRUE 3810. 1343.
## 15 Offfb6… HTC D… Android FALSE 655. 872.
## 16 Offffa… iPhon… iOS TRUE 59876. 13818.
## 17 Offfe4… Galax… Android TRUE 140156. 48355.
## 18 Offfff… iPhon… iOS TRUE 105078. 25598.
## 19 Offff8… iPhon… iOS FALSE 5608. 1984.
## 20 Offfe5… iPad … iOS TRUE 51751. 22994.
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
## # A tibble: 20 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offff0… Galax… Android TRUE 2486. 58.
## 8 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 9 Offfe8… Galax… Android TRUE 25868. 3362.
## 10 Offfd5… Galax… Android TRUE 63335. 22201.
## 11 Offfeb… SM-G9… Android TRUE 45008. 19824.
## 12 Offfee… HTC D… Android TRUE 39025. 18378.
## 13 Offfe4… iPhon… iOS TRUE 40061. 987.
## 14 Offfec… iPhon… iOS TRUE 3810. 1343.
## 15 Offfb6… HTC D… Android FALSE 655. 872.
## 16 Offffa… iPhon… iOS TRUE 59876. 13818.
## 17 Offfe4… Galax… Android TRUE 140156. 48355.
## 18 Offfff… iPhon… iOS TRUE 105078. 25598.
## 19 Offff8… iPhon… iOS FALSE 5608. 1984.
## 20 Offfe5… iPad … iOS TRUE 51751. 22994.
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
## # A tibble: 20 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offff0… Galax… Android TRUE 2486. 58.
## 8 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 9 Offfe8… Galax… Android TRUE 25868. 3362.
## 10 Offfd5… Galax… Android TRUE 63335. 22201.
## 11 Offfeb… SM-G9… Android TRUE 45008. 19824.
## 12 Offfee… HTC D… Android TRUE 39025. 18378.
## 13 Offfe4… iPhon… iOS TRUE 40061. 987.
## 14 Offfec… iPhon… iOS TRUE 3810. 1343.
## 15 Offfb6… HTC D… Android FALSE 655. 872.
## 16 Offffa… iPhon… iOS TRUE 59876. 13818.
## 17 Offfe4… Galax… Android TRUE 140156. 48355.
## 18 Offfff… iPhon… iOS TRUE 105078. 25598.
## 19 Offff8… iPhon… iOS FALSE 5608. 1984.
## 20 Offfe5… iPad … iOS TRUE 51751. 22994.
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
## # A tibble: 20 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offff0… Galax… Android TRUE 2486. 58.
## 8 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 9 Offfe8… Galax… Android TRUE 25868. 3362.
## 10 Offfd5… Galax… Android TRUE 63335. 22201.
## 11 Offfeb… SM-G9… Android TRUE 45008. 19824.
## 12 Offfee… HTC D… Android TRUE 39025. 18378.
## 13 Offfe4… iPhon… iOS TRUE 40061. 987.
## 14 Offfec… iPhon… iOS TRUE 3810. 1343.
## 15 Offfb6… HTC D… Android FALSE 655. 872.
## 16 Offffa… iPhon… iOS TRUE 59876. 13818.
## 17 Offfe4… Galax… Android TRUE 140156. 48355.
## 18 Offfff… iPhon… iOS TRUE 105078. 25598.
## 19 Offff8… iPhon… iOS FALSE 5608. 1984.
## 20 Offfe5… iPad … iOS TRUE 51751. 22994.
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
right_join(file1, file2)
## # A tibble: 20 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe4… iPhon… iOS TRUE 40061. 987.
## 2 Offfe3… Galax… Android TRUE 3340. 65.
## 3 Offfdc… HTC O… Android FALSE 3962. 3012.
## 4 Offfd5… Galax… Android TRUE 63335. 22201.
## 5 Offfd1… Galax… Android TRUE 13085. 827.
## 6 Offfcf… Galax… Android TRUE 19065. 1750.
## 7 Offfbe… Galax… Android TRUE 3520. 1877.
## 8 Offfb6… HTC D… Android FALSE 655. 872.
## 9 Offfb4… Galax… Android TRUE 15259. 3160.
## 10 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 11 Offfaf… <NA> <NA> NA NA NA
## 12 Offfaf… <NA> <NA> NA NA NA
## 13 Offfa1… <NA> <NA> NA NA NA
## 14 Offf97… <NA> <NA> NA NA NA
## 15 Offf97… <NA> <NA> NA NA NA
## 16 Offf93… <NA> <NA> NA NA NA
## 17 Offf92… <NA> <NA> NA NA NA
## 18 Offf91… <NA> <NA> NA NA NA
## 19 Offf90… <NA> <NA> NA NA NA
## 20 Offf8c… <NA> <NA> NA NA NA
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
inner_join(file1, file2)
## # A tibble: 10 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 8 Offfd5… Galax… Android TRUE 63335. 22201.
## 9 Offfe4… iPhon… iOS TRUE 40061. 987.
## 10 Offfb6… HTC D… Android FALSE 655. 872.
## # ... with 9 more variables: rtr_ping <dbl>, mtime <dttm>, mymd <date>,
## # mym <chr>, op_name <chr>, nw_cat <fct>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>
full_join(file1, file2)
## # A tibble: 30 x 15
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offff0… Galax… Android TRUE 2486. 58.
## 8 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 9 Offfe8… Galax… Android TRUE 25868. 3362.
## 10 Offfd5… Galax… Android TRUE 63335. 22201.
## # ... with 20 more rows, and 9 more variables: rtr_ping <dbl>,
## # mtime <dttm>, mymd <date>, mym <chr>, op_name <chr>, nw_cat <fct>,
## # longitude <dbl>, latitude <dbl>, iso_adm2 <fct>
(file1 <- rename(file1, id_number = id))
## # A tibble: 20 x 7
## id_number device device_platform device_has_lte rtr_speed_dl
## <chr> <fct> <fct> <lgl> <dbl>
## 1 Offfe3f7c-51f6-… Galaxy S4… Android TRUE 3340.
## 2 Offfd173b-89ee-… Galaxy S4… Android TRUE 13085.
## 3 Offfbe951-bf98-… Galaxy S4… Android TRUE 3520.
## 4 Offfdc5a7-7490-… HTC One S Android FALSE 3962.
## 5 Offfcf842-b732-… Galaxy S4… Android TRUE 19065.
## 6 Offfb42a3-58d7-… Galaxy S4… Android TRUE 15259.
## 7 Offff0ee1-1cb0-… Galaxy S5 Android TRUE 2486.
## 8 Offfb0e8a-8a4e-… iPhone 5s iOS TRUE 3835.
## 9 Offfe8df3-5180-… Galaxy S4… Android TRUE 25868.
## 10 Offfd5337-2602-… Galaxy S3… Android TRUE 63335.
## 11 Offfeb095-0ff8-… SM-G925F Android TRUE 45008.
## 12 Offfeec5c-4b40-… HTC Desir… Android TRUE 39025.
## 13 Offfe464e-3c77-… iPhone 6 iOS TRUE 40061.
## 14 Offfecdb4-76d0-… iPhone 5 iOS TRUE 3810.
## 15 Offfb681e-19c5-… HTC Desir… Android FALSE 655.
## 16 Offffa9ea-865d-… iPhone 6s… iOS TRUE 59876.
## 17 Offfe4ba7-55fe-… Galaxy S4… Android TRUE 140156.
## 18 Offfffc9d-75a5-… iPhone 6s iOS TRUE 105078.
## 19 Offff830e-3122-… iPhone 4s iOS FALSE 5608.
## 20 Offfe5a19-84ae-… iPad Air … iOS TRUE 51751.
## # ... with 2 more variables: rtr_speed_ul <dbl>, rtr_ping <dbl>
left_join(file1, file2)
## Error: `by` required, because the data sources have no common variables
left_join(file1, file2, by = c("id_number" = "id"))
## # A tibble: 20 x 15
## id_number device device_platform device_has_lte rtr_speed_dl
## <chr> <fct> <fct> <lgl> <dbl>
## 1 Offfe3f7c-51f6-… Galaxy S4… Android TRUE 3340.
## 2 Offfd173b-89ee-… Galaxy S4… Android TRUE 13085.
## 3 Offfbe951-bf98-… Galaxy S4… Android TRUE 3520.
## 4 Offfdc5a7-7490-… HTC One S Android FALSE 3962.
## 5 Offfcf842-b732-… Galaxy S4… Android TRUE 19065.
## 6 Offfb42a3-58d7-… Galaxy S4… Android TRUE 15259.
## 7 Offff0ee1-1cb0-… Galaxy S5 Android TRUE 2486.
## 8 Offfb0e8a-8a4e-… iPhone 5s iOS TRUE 3835.
## 9 Offfe8df3-5180-… Galaxy S4… Android TRUE 25868.
## 10 Offfd5337-2602-… Galaxy S3… Android TRUE 63335.
## 11 Offfeb095-0ff8-… SM-G925F Android TRUE 45008.
## 12 Offfeec5c-4b40-… HTC Desir… Android TRUE 39025.
## 13 Offfe464e-3c77-… iPhone 6 iOS TRUE 40061.
## 14 Offfecdb4-76d0-… iPhone 5 iOS TRUE 3810.
## 15 Offfb681e-19c5-… HTC Desir… Android FALSE 655.
## 16 Offffa9ea-865d-… iPhone 6s… iOS TRUE 59876.
## 17 Offfe4ba7-55fe-… Galaxy S4… Android TRUE 140156.
## 18 Offfffc9d-75a5-… iPhone 6s iOS TRUE 105078.
## 19 Offff830e-3122-… iPhone 4s iOS FALSE 5608.
## 20 Offfe5a19-84ae-… iPad Air … iOS TRUE 51751.
## # ... with 10 more variables: rtr_speed_ul <dbl>, rtr_ping <dbl>,
## # mtime <dttm>, mymd <date>, mym <chr>, op_name <chr>, nw_cat <fct>,
## # longitude <dbl>, latitude <dbl>, iso_adm2 <fct>
file1 <- rename(file1, id = id_number)
semi_join(file1, file2)
## # A tibble: 10 x 7
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offfe3… Galax… Android TRUE 3340. 65.
## 2 Offfd1… Galax… Android TRUE 13085. 827.
## 3 Offfbe… Galax… Android TRUE 3520. 1877.
## 4 Offfdc… HTC O… Android FALSE 3962. 3012.
## 5 Offfcf… Galax… Android TRUE 19065. 1750.
## 6 Offfb4… Galax… Android TRUE 15259. 3160.
## 7 Offfb0… iPhon… iOS TRUE 3835. 2137.
## 8 Offfd5… Galax… Android TRUE 63335. 22201.
## 9 Offfe4… iPhon… iOS TRUE 40061. 987.
## 10 Offfb6… HTC D… Android FALSE 655. 872.
## # ... with 1 more variable: rtr_ping <dbl>
anti_join(file1,file2)
## # A tibble: 10 x 7
## id device device_platform device_has_lte rtr_speed_dl rtr_speed_ul
## <chr> <fct> <fct> <lgl> <dbl> <dbl>
## 1 Offff0… Galax… Android TRUE 2486. 58.
## 2 Offfe8… Galax… Android TRUE 25868. 3362.
## 3 Offfeb… SM-G9… Android TRUE 45008. 19824.
## 4 Offfee… HTC D… Android TRUE 39025. 18378.
## 5 Offfec… iPhon… iOS TRUE 3810. 1343.
## 6 Offffa… iPhon… iOS TRUE 59876. 13818.
## 7 Offfe4… Galax… Android TRUE 140156. 48355.
## 8 Offfff… iPhon… iOS TRUE 105078. 25598.
## 9 Offff8… iPhon… iOS FALSE 5608. 1984.
## 10 Offfe5… iPad … iOS TRUE 51751. 22994.
## # ... with 1 more variable: rtr_ping <dbl>
union
: takes x and y as inputs and returns all rows that are in at least one datasetintersect
: takes x and y as inputs and returns all rows that the datasets sharesetdiff
: returns all rows that are in x with no matching values in ysetequal
: check if both sets have identical rows, in any orderunion(file3, file4) #all rows, no duplicates
## # A tibble: 60 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offf37a3a-… 2015-09-23 22:25:08 2015-09-23 2015… Kiwi 3G iPhone…
## 2 Offf2add1-… 2015-09-12 18:28:33 2015-09-12 2015… Pear 4G Samsun…
## 3 Offf15c81-… 2015-08-30 19:14:38 2015-08-30 2015… Apple 3G HIGHWA…
## 4 Offf43eef-… 2015-07-19 19:28:25 2015-07-19 2015… Pear 3G iPhone…
## 5 Offf4f1dc-… 2015-07-17 10:58:20 2015-07-17 2015… Apple 4G iPhone…
## 6 Offf226bf-… 2015-06-26 17:13:08 2015-06-26 2015… Pear 2G iPhone…
## 7 Offf1b772-… 2015-04-15 14:52:44 2015-04-15 2015… Pear 3G Galaxy…
## 8 Offf26cd8-… 2014-12-29 12:42:12 2014-12-29 2014… Apple 4G iPhone…
## 9 Offf319fe-… 2014-09-29 14:04:00 2014-09-29 2014… Kiwi 3G Galaxy…
## 10 Offf47c9a-… 2014-09-19 19:13:00 2014-09-19 2014… Kiwi 3G iPhone…
## # ... with 50 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
intersect(file3, file4) #matching keys, no duplicates
## # A tibble: 20 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offf64ac2… 2014-01-24 12:04:00 2014-01-24 2014… Pear 3G Galaxy …
## 2 Offf92296… 2014-01-26 21:44:00 2014-01-26 2014… Pear 3G Nexus 5
## 3 Offf6c55b… 2014-01-31 19:40:00 2014-01-31 2014… Pear 3G Galaxy …
## 4 Offf58d24… 2014-02-04 00:11:00 2014-02-04 2014… Pear 3G Sony Xp…
## 5 Offf7072a… 2014-02-21 12:06:00 2014-02-21 2014… Pear 3G Galaxy …
## 6 Offf6b1f8… 2014-02-21 15:44:00 2014-02-21 2014… Pear 2G iPad 4 …
## 7 Offf7c734… 2014-06-05 16:04:00 2014-06-05 2014… Kiwi 3G Sony Xp…
## 8 Offf97613… 2014-10-31 06:14:50 2014-10-31 2014… Kiwi 2G iPhone 6
## 9 Offf73f44… 2014-11-29 23:48:43 2014-11-29 2014… Pear 2G Galaxy …
## 10 Offf902d7… 2014-12-07 21:26:04 2014-12-07 2014… Kiwi 3G iPad 4 …
## 11 Offf64782… 2015-01-10 15:20:08 2015-01-10 2015… Pear 3G Galaxy …
## 12 Offf8c086… 2015-01-11 15:53:41 2015-01-11 2015… Apple 4G LGE Nex…
## 13 Offfa188c… 2015-02-17 16:47:14 2015-02-17 2015… Pear 4G LGE LG …
## 14 Offf939b9… 2015-03-01 09:17:46 2015-03-01 2015… Kiwi 3G Z500
## 15 Offfafb38… 2015-05-02 18:55:11 2015-05-02 2015… Apple 3G LG Opti…
## 16 Offf6e668… 2015-05-06 11:09:09 2015-05-06 2015… Pear 3G iPhone …
## 17 Offf91485… 2015-08-03 10:25:33 2015-08-03 2015… Kiwi 3G Galaxy …
## 18 Offf72340… 2015-10-07 18:53:34 2015-10-07 2015… Pear 4G Galaxy …
## 19 Offfaf977… 2015-11-06 22:48:16 2015-11-06 2015… Pear 4G iPad mi…
## 20 Offf978ff… 2015-11-13 15:41:58 2015-11-13 2015… Apple 3G iPhone 6
## # ... with 8 more variables: device_platform <fct>, device_has_lte <lgl>,
## # longitude <dbl>, latitude <dbl>, iso_adm2 <fct>, rtr_speed_dl <dbl>,
## # rtr_speed_ul <dbl>, rtr_ping <dbl>
setdiff(file3,file4) #all rows of x with no matching key values in y
## # A tibble: 20 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offfe3f7c… 2014-02-22 08:34:00 2014-02-22 2014… Pear 3G Galaxy …
## 2 Offfd173b… 2014-05-19 13:51:00 2014-05-19 2014… Pear 3G Galaxy …
## 3 Offfbe951… 2014-05-22 09:57:00 2014-05-22 2014… Apple 3G Galaxy …
## 4 Offfdc5a7… 2014-06-24 19:46:00 2014-06-24 2014… Apple 3G HTC One…
## 5 Offfcf842… 2014-07-21 20:10:00 2014-07-21 2014… Pear 3G Galaxy …
## 6 Offfb42a3… 2014-08-28 09:28:00 2014-08-28 2014… Apple 3G Galaxy …
## 7 Offff0ee1… 2014-09-14 15:32:00 2014-09-14 2014… Apple 3G Galaxy …
## 8 Offfb0e8a… 2014-12-22 18:29:19 2014-12-22 2014… Kiwi 3G iPhone …
## 9 Offfe8df3… 2015-02-03 03:23:05 2015-02-03 2015… Apple 3G Galaxy …
## 10 Offfd5337… 2015-03-27 08:46:18 2015-03-27 2015… Kiwi 4G Galaxy …
## 11 Offfeb095… 2015-08-15 18:53:54 2015-08-15 2015… Apple 4G SM-G925F
## 12 Offfeec5c… 2015-08-19 05:44:17 2015-08-19 2015… Pear 4G HTC Des…
## 13 Offfe464e… 2015-08-19 09:47:50 2015-08-19 2015… Apple 4G iPhone 6
## 14 Offfecdb4… 2015-08-28 11:29:46 2015-08-28 2015… Kiwi 3G iPhone 5
## 15 Offfb681e… 2015-09-10 18:03:04 2015-09-10 2015… Pear 3G HTC Des…
## 16 Offffa9ea… 2015-10-16 11:35:58 2015-10-16 2015… Apple 4G iPhone …
## 17 Offfe4ba7… 2015-11-06 20:29:45 2015-11-06 2015… Pear 4G Galaxy …
## 18 Offfffc9d… 2015-11-17 13:47:03 2015-11-17 2015… Pear 4G iPhone …
## 19 Offff830e… 2015-11-17 18:01:22 2015-11-17 2015… Kiwi 3G iPhone …
## 20 Offfe5a19… 2015-11-18 21:09:23 2015-11-18 2015… Pear 4G iPad Ai…
## # ... with 8 more variables: device_platform <fct>, device_has_lte <lgl>,
## # longitude <dbl>, latitude <dbl>, iso_adm2 <fct>, rtr_speed_dl <dbl>,
## # rtr_speed_ul <dbl>, rtr_ping <dbl>
## combination:
temp1 <- union(file3,file4)
temp2 <- intersect(file3, file4)
setdiff(temp1, temp2)
## # A tibble: 40 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offf37a3a-… 2015-09-23 22:25:08 2015-09-23 2015… Kiwi 3G iPhone…
## 2 Offf2add1-… 2015-09-12 18:28:33 2015-09-12 2015… Pear 4G Samsun…
## 3 Offf15c81-… 2015-08-30 19:14:38 2015-08-30 2015… Apple 3G HIGHWA…
## 4 Offf43eef-… 2015-07-19 19:28:25 2015-07-19 2015… Pear 3G iPhone…
## 5 Offf4f1dc-… 2015-07-17 10:58:20 2015-07-17 2015… Apple 4G iPhone…
## 6 Offf226bf-… 2015-06-26 17:13:08 2015-06-26 2015… Pear 2G iPhone…
## 7 Offf1b772-… 2015-04-15 14:52:44 2015-04-15 2015… Pear 3G Galaxy…
## 8 Offf26cd8-… 2014-12-29 12:42:12 2014-12-29 2014… Apple 4G iPhone…
## 9 Offf319fe-… 2014-09-29 14:04:00 2014-09-29 2014… Kiwi 3G Galaxy…
## 10 Offf47c9a-… 2014-09-19 19:13:00 2014-09-19 2014… Kiwi 3G iPhone…
## # ... with 30 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
union(file3, file4)
## # A tibble: 60 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offf37a3a-… 2015-09-23 22:25:08 2015-09-23 2015… Kiwi 3G iPhone…
## 2 Offf2add1-… 2015-09-12 18:28:33 2015-09-12 2015… Pear 4G Samsun…
## 3 Offf15c81-… 2015-08-30 19:14:38 2015-08-30 2015… Apple 3G HIGHWA…
## 4 Offf43eef-… 2015-07-19 19:28:25 2015-07-19 2015… Pear 3G iPhone…
## 5 Offf4f1dc-… 2015-07-17 10:58:20 2015-07-17 2015… Apple 4G iPhone…
## 6 Offf226bf-… 2015-06-26 17:13:08 2015-06-26 2015… Pear 2G iPhone…
## 7 Offf1b772-… 2015-04-15 14:52:44 2015-04-15 2015… Pear 3G Galaxy…
## 8 Offf26cd8-… 2014-12-29 12:42:12 2014-12-29 2014… Apple 4G iPhone…
## 9 Offf319fe-… 2014-09-29 14:04:00 2014-09-29 2014… Kiwi 3G Galaxy…
## 10 Offf47c9a-… 2014-09-19 19:13:00 2014-09-19 2014… Kiwi 3G iPhone…
## # ... with 50 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
intersect(file3, file4)
## # A tibble: 20 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offf64ac2… 2014-01-24 12:04:00 2014-01-24 2014… Pear 3G Galaxy …
## 2 Offf92296… 2014-01-26 21:44:00 2014-01-26 2014… Pear 3G Nexus 5
## 3 Offf6c55b… 2014-01-31 19:40:00 2014-01-31 2014… Pear 3G Galaxy …
## 4 Offf58d24… 2014-02-04 00:11:00 2014-02-04 2014… Pear 3G Sony Xp…
## 5 Offf7072a… 2014-02-21 12:06:00 2014-02-21 2014… Pear 3G Galaxy …
## 6 Offf6b1f8… 2014-02-21 15:44:00 2014-02-21 2014… Pear 2G iPad 4 …
## 7 Offf7c734… 2014-06-05 16:04:00 2014-06-05 2014… Kiwi 3G Sony Xp…
## 8 Offf97613… 2014-10-31 06:14:50 2014-10-31 2014… Kiwi 2G iPhone 6
## 9 Offf73f44… 2014-11-29 23:48:43 2014-11-29 2014… Pear 2G Galaxy …
## 10 Offf902d7… 2014-12-07 21:26:04 2014-12-07 2014… Kiwi 3G iPad 4 …
## 11 Offf64782… 2015-01-10 15:20:08 2015-01-10 2015… Pear 3G Galaxy …
## 12 Offf8c086… 2015-01-11 15:53:41 2015-01-11 2015… Apple 4G LGE Nex…
## 13 Offfa188c… 2015-02-17 16:47:14 2015-02-17 2015… Pear 4G LGE LG …
## 14 Offf939b9… 2015-03-01 09:17:46 2015-03-01 2015… Kiwi 3G Z500
## 15 Offfafb38… 2015-05-02 18:55:11 2015-05-02 2015… Apple 3G LG Opti…
## 16 Offf6e668… 2015-05-06 11:09:09 2015-05-06 2015… Pear 3G iPhone …
## 17 Offf91485… 2015-08-03 10:25:33 2015-08-03 2015… Kiwi 3G Galaxy …
## 18 Offf72340… 2015-10-07 18:53:34 2015-10-07 2015… Pear 4G Galaxy …
## 19 Offfaf977… 2015-11-06 22:48:16 2015-11-06 2015… Pear 4G iPad mi…
## 20 Offf978ff… 2015-11-13 15:41:58 2015-11-13 2015… Apple 3G iPhone 6
## # ... with 8 more variables: device_platform <fct>, device_has_lte <lgl>,
## # longitude <dbl>, latitude <dbl>, iso_adm2 <fct>, rtr_speed_dl <dbl>,
## # rtr_speed_ul <dbl>, rtr_ping <dbl>
setdiff(file3,file4)
## # A tibble: 20 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offfe3f7c… 2014-02-22 08:34:00 2014-02-22 2014… Pear 3G Galaxy …
## 2 Offfd173b… 2014-05-19 13:51:00 2014-05-19 2014… Pear 3G Galaxy …
## 3 Offfbe951… 2014-05-22 09:57:00 2014-05-22 2014… Apple 3G Galaxy …
## 4 Offfdc5a7… 2014-06-24 19:46:00 2014-06-24 2014… Apple 3G HTC One…
## 5 Offfcf842… 2014-07-21 20:10:00 2014-07-21 2014… Pear 3G Galaxy …
## 6 Offfb42a3… 2014-08-28 09:28:00 2014-08-28 2014… Apple 3G Galaxy …
## 7 Offff0ee1… 2014-09-14 15:32:00 2014-09-14 2014… Apple 3G Galaxy …
## 8 Offfb0e8a… 2014-12-22 18:29:19 2014-12-22 2014… Kiwi 3G iPhone …
## 9 Offfe8df3… 2015-02-03 03:23:05 2015-02-03 2015… Apple 3G Galaxy …
## 10 Offfd5337… 2015-03-27 08:46:18 2015-03-27 2015… Kiwi 4G Galaxy …
## 11 Offfeb095… 2015-08-15 18:53:54 2015-08-15 2015… Apple 4G SM-G925F
## 12 Offfeec5c… 2015-08-19 05:44:17 2015-08-19 2015… Pear 4G HTC Des…
## 13 Offfe464e… 2015-08-19 09:47:50 2015-08-19 2015… Apple 4G iPhone 6
## 14 Offfecdb4… 2015-08-28 11:29:46 2015-08-28 2015… Kiwi 3G iPhone 5
## 15 Offfb681e… 2015-09-10 18:03:04 2015-09-10 2015… Pear 3G HTC Des…
## 16 Offffa9ea… 2015-10-16 11:35:58 2015-10-16 2015… Apple 4G iPhone …
## 17 Offfe4ba7… 2015-11-06 20:29:45 2015-11-06 2015… Pear 4G Galaxy …
## 18 Offfffc9d… 2015-11-17 13:47:03 2015-11-17 2015… Pear 4G iPhone …
## 19 Offff830e… 2015-11-17 18:01:22 2015-11-17 2015… Kiwi 3G iPhone …
## 20 Offfe5a19… 2015-11-18 21:09:23 2015-11-18 2015… Pear 4G iPad Ai…
## # ... with 8 more variables: device_platform <fct>, device_has_lte <lgl>,
## # longitude <dbl>, latitude <dbl>, iso_adm2 <fct>, rtr_speed_dl <dbl>,
## # rtr_speed_ul <dbl>, rtr_ping <dbl>
temp1 <- union(file3,file4)
temp2 <- intersect(file3, file4)
(temp3 <- setdiff(temp1, temp2))
## # A tibble: 40 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offf37a3a-… 2015-09-23 22:25:08 2015-09-23 2015… Kiwi 3G iPhone…
## 2 Offf2add1-… 2015-09-12 18:28:33 2015-09-12 2015… Pear 4G Samsun…
## 3 Offf15c81-… 2015-08-30 19:14:38 2015-08-30 2015… Apple 3G HIGHWA…
## 4 Offf43eef-… 2015-07-19 19:28:25 2015-07-19 2015… Pear 3G iPhone…
## 5 Offf4f1dc-… 2015-07-17 10:58:20 2015-07-17 2015… Apple 4G iPhone…
## 6 Offf226bf-… 2015-06-26 17:13:08 2015-06-26 2015… Pear 2G iPhone…
## 7 Offf1b772-… 2015-04-15 14:52:44 2015-04-15 2015… Pear 3G Galaxy…
## 8 Offf26cd8-… 2014-12-29 12:42:12 2014-12-29 2014… Apple 4G iPhone…
## 9 Offf319fe-… 2014-09-29 14:04:00 2014-09-29 2014… Kiwi 3G Galaxy…
## 10 Offf47c9a-… 2014-09-19 19:13:00 2014-09-19 2014… Kiwi 3G iPhone…
## # ... with 30 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
temp4 <- bind_rows(setdiff(file3,file4),setdiff(file4,file3)) #temp4
setequal(temp3, temp4)
## TRUE
bind_rows()
bind_rows()
: bind_rows(…,.id = NULL)
…
= data frames to combine, either a
.id
bind_rows(setdiff(file3,file4),union(file3,file4)) #not useful but helps understanding
## # A tibble: 80 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offfe3f7c-… 2014-02-22 08:34:00 2014-02-22 2014… Pear 3G Galaxy…
## 2 Offfd173b-… 2014-05-19 13:51:00 2014-05-19 2014… Pear 3G Galaxy…
## 3 Offfbe951-… 2014-05-22 09:57:00 2014-05-22 2014… Apple 3G Galaxy…
## 4 Offfdc5a7-… 2014-06-24 19:46:00 2014-06-24 2014… Apple 3G HTC On…
## 5 Offfcf842-… 2014-07-21 20:10:00 2014-07-21 2014… Pear 3G Galaxy…
## 6 Offfb42a3-… 2014-08-28 09:28:00 2014-08-28 2014… Apple 3G Galaxy…
## 7 Offff0ee1-… 2014-09-14 15:32:00 2014-09-14 2014… Apple 3G Galaxy…
## 8 Offfb0e8a-… 2014-12-22 18:29:19 2014-12-22 2014… Kiwi 3G iPhone…
## 9 Offfe8df3-… 2015-02-03 03:23:05 2015-02-03 2015… Apple 3G Galaxy…
## 10 Offfd5337-… 2015-03-27 08:46:18 2015-03-27 2015… Kiwi 4G Galaxy…
## # ... with 70 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
bind_rows(setdiff = setdiff(file3,file4), union = union(file3,file4), .id = "set_operator") #this is why bind_rows is cool
## # A tibble: 80 x 16
## set_operator id mtime mymd mym op_name nw_cat
## <chr> <chr> <dttm> <date> <chr> <chr> <fct>
## 1 setdiff Offfe… 2014-02-22 08:34:00 2014-02-22 2014… Pear 3G
## 2 setdiff Offfd… 2014-05-19 13:51:00 2014-05-19 2014… Pear 3G
## 3 setdiff Offfb… 2014-05-22 09:57:00 2014-05-22 2014… Apple 3G
## 4 setdiff Offfd… 2014-06-24 19:46:00 2014-06-24 2014… Apple 3G
## 5 setdiff Offfc… 2014-07-21 20:10:00 2014-07-21 2014… Pear 3G
## 6 setdiff Offfb… 2014-08-28 09:28:00 2014-08-28 2014… Apple 3G
## 7 setdiff Offff… 2014-09-14 15:32:00 2014-09-14 2014… Apple 3G
## 8 setdiff Offfb… 2014-12-22 18:29:19 2014-12-22 2014… Kiwi 3G
## 9 setdiff Offfe… 2015-02-03 03:23:05 2015-02-03 2015… Apple 3G
## 10 setdiff Offfd… 2015-03-27 08:46:18 2015-03-27 2015… Kiwi 4G
## # ... with 70 more rows, and 9 more variables: device <fct>,
## # device_platform <fct>, device_has_lte <lgl>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>, rtr_speed_dl <dbl>,
## # rtr_speed_ul <dbl>, rtr_ping <dbl>
bind_cols()
bind_cols()
: bind_cols(…)
…
= (again) dataframes to combine(temp1 <- bind_cols(file5,file6)) #binds it -> combined right!
## # A tibble: 20 x 16
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offfe3f7c… 2014-02-22 08:34:00 2014-02-22 2014… Pear 3G Galaxy …
## 2 Offfd173b… 2014-05-19 13:51:00 2014-05-19 2014… Pear 3G Galaxy …
## 3 Offfbe951… 2014-05-22 09:57:00 2014-05-22 2014… Apple 3G Galaxy …
## 4 Offfdc5a7… 2014-06-24 19:46:00 2014-06-24 2014… Apple 3G HTC One…
## 5 Offfcf842… 2014-07-21 20:10:00 2014-07-21 2014… Pear 3G Galaxy …
## 6 Offfb42a3… 2014-08-28 09:28:00 2014-08-28 2014… Apple 3G Galaxy …
## 7 Offff0ee1… 2014-09-14 15:32:00 2014-09-14 2014… Apple 3G Galaxy …
## 8 Offfb0e8a… 2014-12-22 18:29:19 2014-12-22 2014… Kiwi 3G iPhone …
## 9 Offfe8df3… 2015-02-03 03:23:05 2015-02-03 2015… Apple 3G Galaxy …
## 10 Offfd5337… 2015-03-27 08:46:18 2015-03-27 2015… Kiwi 4G Galaxy …
## 11 Offfeb095… 2015-08-15 18:53:54 2015-08-15 2015… Apple 4G SM-G925F
## 12 Offfeec5c… 2015-08-19 05:44:17 2015-08-19 2015… Pear 4G HTC Des…
## 13 Offfe464e… 2015-08-19 09:47:50 2015-08-19 2015… Apple 4G iPhone 6
## 14 Offfecdb4… 2015-08-28 11:29:46 2015-08-28 2015… Kiwi 3G iPhone 5
## 15 Offfb681e… 2015-09-10 18:03:04 2015-09-10 2015… Pear 3G HTC Des…
## 16 Offffa9ea… 2015-10-16 11:35:58 2015-10-16 2015… Apple 4G iPhone …
## 17 Offfe4ba7… 2015-11-06 20:29:45 2015-11-06 2015… Pear 4G Galaxy …
## 18 Offfffc9d… 2015-11-17 13:47:03 2015-11-17 2015… Pear 4G iPhone …
## 19 Offff830e… 2015-11-17 18:01:22 2015-11-17 2015… Kiwi 3G iPhone …
## 20 Offfe5a19… 2015-11-18 21:09:23 2015-11-18 2015… Pear 4G iPad Ai…
## # ... with 9 more variables: device_platform <fct>, id1 <chr>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
(temp2 <- bind_cols(file5,file7)) #same as temp1?
## # A tibble: 20 x 16
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offfe3f7c… 2014-02-22 08:34:00 2014-02-22 2014… Pear 3G Galaxy …
## 2 Offfd173b… 2014-05-19 13:51:00 2014-05-19 2014… Pear 3G Galaxy …
## 3 Offfbe951… 2014-05-22 09:57:00 2014-05-22 2014… Apple 3G Galaxy …
## 4 Offfdc5a7… 2014-06-24 19:46:00 2014-06-24 2014… Apple 3G HTC One…
## 5 Offfcf842… 2014-07-21 20:10:00 2014-07-21 2014… Pear 3G Galaxy …
## 6 Offfb42a3… 2014-08-28 09:28:00 2014-08-28 2014… Apple 3G Galaxy …
## 7 Offff0ee1… 2014-09-14 15:32:00 2014-09-14 2014… Apple 3G Galaxy …
## 8 Offfb0e8a… 2014-12-22 18:29:19 2014-12-22 2014… Kiwi 3G iPhone …
## 9 Offfe8df3… 2015-02-03 03:23:05 2015-02-03 2015… Apple 3G Galaxy …
## 10 Offfd5337… 2015-03-27 08:46:18 2015-03-27 2015… Kiwi 4G Galaxy …
## 11 Offfeb095… 2015-08-15 18:53:54 2015-08-15 2015… Apple 4G SM-G925F
## 12 Offfeec5c… 2015-08-19 05:44:17 2015-08-19 2015… Pear 4G HTC Des…
## 13 Offfe464e… 2015-08-19 09:47:50 2015-08-19 2015… Apple 4G iPhone 6
## 14 Offfecdb4… 2015-08-28 11:29:46 2015-08-28 2015… Kiwi 3G iPhone 5
## 15 Offfb681e… 2015-09-10 18:03:04 2015-09-10 2015… Pear 3G HTC Des…
## 16 Offffa9ea… 2015-10-16 11:35:58 2015-10-16 2015… Apple 4G iPhone …
## 17 Offfe4ba7… 2015-11-06 20:29:45 2015-11-06 2015… Pear 4G Galaxy …
## 18 Offfffc9d… 2015-11-17 13:47:03 2015-11-17 2015… Pear 4G iPhone …
## 19 Offff830e… 2015-11-17 18:01:22 2015-11-17 2015… Kiwi 3G iPhone …
## 20 Offfe5a19… 2015-11-18 21:09:23 2015-11-18 2015… Pear 4G iPad Ai…
## # ... with 9 more variables: device_platform <fct>, id1 <chr>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
setequal(temp1, temp2) #compare if rows are same (any order)
## FALSE: Rows in x but not y: 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10[...]. Rows in y but not x: 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10[...].
anti_join(file5,file7, by = "id") #do we have keyvalues in x without matching in y?
## # A tibble: 0 x 8
## # ... with 8 variables: id <chr>, mtime <dttm>, mymd <date>, mym <chr>,
## # op_name <chr>, nw_cat <fct>, device <fct>, device_platform <fct>
#no -> arrange both files the same; if yes: semi_join and save -> arrange both sets -> bind_cols -> bind old rows again
file5 <- arrange(file5, id)
file7 <- arrange(file7, id)
temp3 <- bind_cols(file5, file7) #bind ordered
setequal(temp1, temp3) #check if its now right
## TRUE
bind_cols()
or bind_rows()
, when you can use baseR rbind()
and cbind()
?
bind_rows(setdiff(file3,file4),union(file3,file4)) #this is not useful but helps you understand how bind_rows work
## # A tibble: 80 x 15
## id mtime mymd mym op_name nw_cat device
## <chr> <dttm> <date> <chr> <chr> <fct> <fct>
## 1 Offfe3f7c-… 2014-02-22 08:34:00 2014-02-22 2014… Pear 3G Galaxy…
## 2 Offfd173b-… 2014-05-19 13:51:00 2014-05-19 2014… Pear 3G Galaxy…
## 3 Offfbe951-… 2014-05-22 09:57:00 2014-05-22 2014… Apple 3G Galaxy…
## 4 Offfdc5a7-… 2014-06-24 19:46:00 2014-06-24 2014… Apple 3G HTC On…
## 5 Offfcf842-… 2014-07-21 20:10:00 2014-07-21 2014… Pear 3G Galaxy…
## 6 Offfb42a3-… 2014-08-28 09:28:00 2014-08-28 2014… Apple 3G Galaxy…
## 7 Offff0ee1-… 2014-09-14 15:32:00 2014-09-14 2014… Apple 3G Galaxy…
## 8 Offfb0e8a-… 2014-12-22 18:29:19 2014-12-22 2014… Kiwi 3G iPhone…
## 9 Offfe8df3-… 2015-02-03 03:23:05 2015-02-03 2015… Apple 3G Galaxy…
## 10 Offfd5337-… 2015-03-27 08:46:18 2015-03-27 2015… Kiwi 4G Galaxy…
## # ... with 70 more rows, and 8 more variables: device_platform <fct>,
## # device_has_lte <lgl>, longitude <dbl>, latitude <dbl>, iso_adm2 <fct>,
## # rtr_speed_dl <dbl>, rtr_speed_ul <dbl>, rtr_ping <dbl>
bind_rows(setdiff = setdiff(file3,file4), union = union(file3,file4), .id = "set_operator")
## # A tibble: 80 x 16
## set_operator id mtime mymd mym op_name nw_cat
## <chr> <chr> <dttm> <date> <chr> <chr> <fct>
## 1 setdiff Offfe… 2014-02-22 08:34:00 2014-02-22 2014… Pear 3G
## 2 setdiff Offfd… 2014-05-19 13:51:00 2014-05-19 2014… Pear 3G
## 3 setdiff Offfb… 2014-05-22 09:57:00 2014-05-22 2014… Apple 3G
## 4 setdiff Offfd… 2014-06-24 19:46:00 2014-06-24 2014… Apple 3G
## 5 setdiff Offfc… 2014-07-21 20:10:00 2014-07-21 2014… Pear 3G
## 6 setdiff Offfb… 2014-08-28 09:28:00 2014-08-28 2014… Apple 3G
## 7 setdiff Offff… 2014-09-14 15:32:00 2014-09-14 2014… Apple 3G
## 8 setdiff Offfb… 2014-12-22 18:29:19 2014-12-22 2014… Kiwi 3G
## 9 setdiff Offfe… 2015-02-03 03:23:05 2015-02-03 2015… Apple 3G
## 10 setdiff Offfd… 2015-03-27 08:46:18 2015-03-27 2015… Kiwi 4G
## # ... with 70 more rows, and 9 more variables: device <fct>,
## # device_platform <fct>, device_has_lte <lgl>, longitude <dbl>,
## # latitude <dbl>, iso_adm2 <fct>, rtr_speed_dl <dbl>,
## # rtr_speed_ul <dbl>, rtr_ping <dbl>
# as you should see now, the output has a new column named set_operator that shows you from where your data origins - this can be used as groups in group_by and summarise
##### bind_cols: ##### Adds the cols of y to the cols of x - expects the rows to be in the matching order and to have the same number of rows
file 5 holds 20 rows and the first 8 columns of RTR file 6 the id-column for identification purposes and the cols 9 to 15 from RTR - ideal to bind_cols file 7 is file 6 arranged by id
temp1 <- bind_cols(file5,file6)
temp2 <- bind_cols(file5,file7)
setequal()
!setequal(temp1, temp2)
## FALSE: Rows in x but not y: 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10[...]. Rows in y but not x: 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10[...].
# doesn't look to bad, only that its completely false and you didn't even get an error!
setequal()
tells you that what you did was nonsense! - so before joining make sure that what you do is right!anti_join(file5,file7, by = "id")
## # A tibble: 0 x 8
## # ... with 8 variables: id <chr>, mtime <dttm>, mymd <date>, mym <chr>,
## # op_name <chr>, nw_cat <fct>, device <fct>, device_platform <fct>
file5 <- arrange(file5, id)
file7 <- arrange(file7, id)
temp3 <- bind_cols(file5, file7)
setequal(temp1, temp3)
## TRUE
# Congrats, you passed all of our exercises! We hope you had fun and learned a lot!