Create expanded grid of valid task ID and output type value combinations

Usage

expand_model_out_val_grid(
  config_tasks,
  round_id,
  required_vals_only = FALSE,
  all_character = FALSE,
  as_arrow_table = FALSE,
  bind_model_tasks = TRUE
)

Arguments

config_tasks: a list version of the contents of a hub's tasks.json config file, accessed through the "config_tasks" attribute of a <hub_connection> object or the function read_config().
round_id: Character string. Round identifier. If the round is set to round_id_from_variable: true, IDs are values of the task ID defined in the round's round_id property of config_tasks. Otherwise, it should match the round's round_id value in the config. Ignored if the hub contains only a single round.
required_vals_only: Logical. Whether to return only combinations of required task ID values and related required output type ID values.
all_character: Logical. Whether to return all columns as character.
as_arrow_table: Logical. Whether to return an arrow table. Defaults to FALSE.
bind_model_tasks: Logical. Whether to bind expanded grids of values from multiple modeling tasks into a single tibble/arrow table or return a list.

Value

If bind_model_tasks = TRUE (the default), a tibble or arrow table containing all possible task ID and related output type ID value combinations. If bind_model_tasks = FALSE, a list containing one tibble or arrow table for each modeling task in the round.

Columns are coerced to data types according to the hub schema, unless all_character = TRUE. If all_character = TRUE, all columns are returned as character, which can be faster when large expanded grids are expected. If required_vals_only = TRUE, the output is limited to combinations of required values only.
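
The difference between the two return shapes can be illustrated with a minimal base R sketch. It does not use the package's implementation: the two per-task grids and the names `task_grids` and `bound` are hypothetical, and `do.call(rbind, ...)` merely stands in for whatever binding the function performs internally.

```r
# Hypothetical grids for two modeling tasks in one round (assumed values,
# not read from a real tasks.json).
task_grids <- list(
  expand.grid(horizon = 1:2, output_type = "mean",
              stringsAsFactors = FALSE),
  expand.grid(horizon = 1:2, output_type = "quantile",
              stringsAsFactors = FALSE)
)

# Analogous to bind_model_tasks = FALSE: keep one grid per modeling task.
length(task_grids)

# Analogous to bind_model_tasks = TRUE: stack the grids into a single table.
bound <- do.call(rbind, task_grids)
nrow(bound)
```

Keeping the list form can be convenient when each modeling task needs to be inspected or processed separately.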

Details

When a round is set to round_id_from_variable: true, the value of the task ID from which round IDs are derived (i.e. the task ID specified in round_id property of config_tasks) is set to the value of the round_id argument in the returned output.
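
As a rough illustration of this behaviour only, the base R sketch below pins the round ID variable to the requested round before expanding the grid. It is a sketch under stated assumptions, not the package's code: `task_id_values`, `round_id_var` and the values they hold are hypothetical.

```r
# Hypothetical task ID values, loosely modelled on a hub config
# (assumed for illustration, not read from a real tasks.json).
task_id_values <- list(
  origin_date = c("2022-10-01", "2022-10-08", "2022-10-29"),
  horizon = 1:2,
  location = c("US", "01")
)

# Assumed: the task ID from which round IDs are derived.
round_id_var <- "origin_date"
round_id <- "2022-10-29"

# Replace the round ID variable's values with the single requested round,
# then expand all remaining combinations.
task_id_values[[round_id_var]] <- round_id
grid <- expand.grid(task_id_values, stringsAsFactors = FALSE)

# Every row now carries the requested round ID.
unique(grid$origin_date)
nrow(grid)
```

Without this step the grid would also contain rows for the other `origin_date` values, which belong to different rounds.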

Examples

hub_con <- connect_hub(
  system.file("testhubs/flusight", package = "hubUtils")
)
config_tasks <- attr(hub_con, "config_tasks")
expand_model_out_val_grid(config_tasks, round_id = "2023-01-02")
#> # A tibble: 3,132 × 6
#>    forecast_date target              horizon location output_type output_type_id
#>    <date>        <chr>                 <int> <chr>    <chr>       <chr>         
#>  1 2023-01-02    wk flu hosp rate c…       2 US       pmf         large_decrease
#>  2 2023-01-02    wk flu hosp rate c…       1 US       pmf         large_decrease
#>  3 2023-01-02    wk flu hosp rate c…       2 01       pmf         large_decrease
#>  4 2023-01-02    wk flu hosp rate c…       1 01       pmf         large_decrease
#>  5 2023-01-02    wk flu hosp rate c…       2 02       pmf         large_decrease
#>  6 2023-01-02    wk flu hosp rate c…       1 02       pmf         large_decrease
#>  7 2023-01-02    wk flu hosp rate c…       2 04       pmf         large_decrease
#>  8 2023-01-02    wk flu hosp rate c…       1 04       pmf         large_decrease
#>  9 2023-01-02    wk flu hosp rate c…       2 05       pmf         large_decrease
#> 10 2023-01-02    wk flu hosp rate c…       1 05       pmf         large_decrease
#> # ℹ 3,122 more rows
expand_model_out_val_grid(
  config_tasks,
  round_id = "2023-01-02",
  required_vals_only = TRUE
)
#> # A tibble: 28 × 5
#>    forecast_date horizon location output_type output_type_id
#>    <date>          <int> <chr>    <chr>       <chr>         
#>  1 2023-01-02          2 US       pmf         large_decrease
#>  2 2023-01-02          2 US       pmf         decrease      
#>  3 2023-01-02          2 US       pmf         stable        
#>  4 2023-01-02          2 US       pmf         increase      
#>  5 2023-01-02          2 US       pmf         large_increase
#>  6 2023-01-02          2 US       quantile    0.01          
#>  7 2023-01-02          2 US       quantile    0.025         
#>  8 2023-01-02          2 US       quantile    0.05          
#>  9 2023-01-02          2 US       quantile    0.1           
#> 10 2023-01-02          2 US       quantile    0.15          
#> # ℹ 18 more rows
# Specifying a round in a hub with multiple round configurations.
hub_con <- connect_hub(
  system.file("testhubs/simple", package = "hubUtils")
)
config_tasks <- attr(hub_con, "config_tasks")
expand_model_out_val_grid(config_tasks, round_id = "2022-10-01")
#> # A tibble: 5,184 × 6
#>    origin_date target          horizon location output_type output_type_id
#>    <date>      <chr>             <int> <chr>    <chr>                <dbl>
#>  1 2022-10-01  wk inc flu hosp       1 US       mean                    NA
#>  2 2022-10-01  wk inc flu hosp       2 US       mean                    NA
#>  3 2022-10-01  wk inc flu hosp       3 US       mean                    NA
#>  4 2022-10-01  wk inc flu hosp       4 US       mean                    NA
#>  5 2022-10-01  wk inc flu hosp       1 01       mean                    NA
#>  6 2022-10-01  wk inc flu hosp       2 01       mean                    NA
#>  7 2022-10-01  wk inc flu hosp       3 01       mean                    NA
#>  8 2022-10-01  wk inc flu hosp       4 01       mean                    NA
#>  9 2022-10-01  wk inc flu hosp       1 02       mean                    NA
#> 10 2022-10-01  wk inc flu hosp       2 02       mean                    NA
#> # ℹ 5,174 more rows
# Later round_id maps to round config that includes additional task ID 'age_group'.
expand_model_out_val_grid(config_tasks, round_id = "2022-10-29")
#> # A tibble: 25,920 × 7
#>    origin_date target      horizon location age_group output_type output_type_id
#>    <date>      <chr>         <int> <chr>    <chr>     <chr>                <dbl>
#>  1 2022-10-29  wk inc flu…       1 US       65+       mean                    NA
#>  2 2022-10-29  wk inc flu…       2 US       65+       mean                    NA
#>  3 2022-10-29  wk inc flu…       3 US       65+       mean                    NA
#>  4 2022-10-29  wk inc flu…       4 US       65+       mean                    NA
#>  5 2022-10-29  wk inc flu…       1 01       65+       mean                    NA
#>  6 2022-10-29  wk inc flu…       2 01       65+       mean                    NA
#>  7 2022-10-29  wk inc flu…       3 01       65+       mean                    NA
#>  8 2022-10-29  wk inc flu…       4 01       65+       mean                    NA
#>  9 2022-10-29  wk inc flu…       1 02       65+       mean                    NA
#> 10 2022-10-29  wk inc flu…       2 02       65+       mean                    NA
#> # ℹ 25,910 more rows
# Coerce all columns to character
expand_model_out_val_grid(config_tasks,
  round_id = "2022-10-29",
  all_character = TRUE
)
#> # A tibble: 25,920 × 7
#>    origin_date target      horizon location age_group output_type output_type_id
#>    <chr>       <chr>       <chr>   <chr>    <chr>     <chr>       <chr>         
#>  1 2022-10-29  wk inc flu… 1       US       65+       mean        NA            
#>  2 2022-10-29  wk inc flu… 2       US       65+       mean        NA            
#>  3 2022-10-29  wk inc flu… 3       US       65+       mean        NA            
#>  4 2022-10-29  wk inc flu… 4       US       65+       mean        NA            
#>  5 2022-10-29  wk inc flu… 1       01       65+       mean        NA            
#>  6 2022-10-29  wk inc flu… 2       01       65+       mean        NA            
#>  7 2022-10-29  wk inc flu… 3       01       65+       mean        NA            
#>  8 2022-10-29  wk inc flu… 4       01       65+       mean        NA            
#>  9 2022-10-29  wk inc flu… 1       02       65+       mean        NA            
#> 10 2022-10-29  wk inc flu… 2       02       65+       mean        NA            
#> # ℹ 25,910 more rows
# Return arrow table
expand_model_out_val_grid(config_tasks,
  round_id = "2022-10-29",
  all_character = TRUE,
  as_arrow_table = TRUE
)
#> Table
#> 25920 rows x 7 columns
#> $origin_date <string>
#> $target <string>
#> $horizon <string>
#> $location <string>
#> $age_group <string>
#> $output_type <string>
#> $output_type_id <string>