
Create a model output submission file template

Usage

create_model_out_submit_tmpl(
  hub_con,
  config_tasks,
  round_id,
  required_vals_only = FALSE,
  complete_cases_only = TRUE
)

Arguments

hub_con

A <hub_connection> class object.

config_tasks

A list version of the contents of a hub's tasks.json config file, accessed through the "config_tasks" attribute of a <hub_connection> object or through the function read_config().

round_id

Character string. Round identifier. If the round is set to round_id_from_variable: true, valid IDs are the values of the task ID named in the round's round_id property in config_tasks. Otherwise, it should match the round's round_id value in the config. Ignored if the hub contains only a single round.

required_vals_only

Logical. Whether to return only combinations of required task ID values and their related required output type ID values.

complete_cases_only

Logical. If TRUE (default) and required_vals_only = TRUE, only rows with complete cases of combinations of required values are returned. If FALSE, rows with incomplete cases of combinations of required values are included in the output.

Value

A tibble template containing an expanded grid of valid task ID and output type ID value combinations for a given submission round and output type. If required_vals_only = TRUE, values are limited to combinations of required values only.

Details

For task IDs or output_type_ids where all values are optional, columns are by default included as columns of NAs when required_vals_only = TRUE. When such columns exist, the function returns a tibble with zero rows, as no complete cases of required value combinations exist. (Note that the determination of complete cases excludes valid NA output_type_id values in "mean" and "median" output types.) To return a template of incomplete required cases, which includes NA columns, use complete_cases_only = FALSE.
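The zero-row result can be illustrated with a minimal base R sketch. This is not the function's actual implementation: the column names and values below are hypothetical, and `expand.grid()` with `complete.cases()` merely stands in for whatever the function does internally.

```r
# Hypothetical required values for three columns. `target` has no required
# values, so it is represented by a single NA.
required <- list(
  horizon = 1:2,
  location = "US",
  target = NA_character_
)

# Expand to every combination of the required values.
grid <- expand.grid(required, stringsAsFactors = FALSE)

# Keep only rows with no NA in any column (the complete cases).
complete <- grid[complete.cases(grid), , drop = FALSE]

nrow(grid)      # 2 rows, all with target = NA
nrow(complete)  # 0 rows, since every row is incomplete
```

In this sketch, `grid` plays the role of the output with complete_cases_only = FALSE, and `complete` the role of the default output.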

When a round is set to round_id_from_variable: true, the value of the task ID from which round IDs are derived (i.e. the task ID specified in round_id property of config_tasks) is set to the value of the round_id argument in the returned output.
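As a rough illustration of this behaviour, the base R sketch below pins a round-defining task ID to the requested round before expanding the grid. The task ID names and values are hypothetical, and the approach is an assumption made for illustration rather than the function's own code.

```r
# Hypothetical task ID values; `origin_date` plays the role of the task ID
# named in the round's round_id property.
task_ids <- list(
  origin_date = c("2022-10-01", "2022-10-08"),
  horizon = 1:2
)
round_id <- "2022-10-08"

# Pin the round-defining task ID to the requested round, then expand.
task_ids$origin_date <- round_id
tmpl <- expand.grid(task_ids, stringsAsFactors = FALSE)

# Every row now carries the requested round ID.
unique(tmpl$origin_date)
```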

Examples

hub_con <- connect_hub(
  system.file("testhubs/flusight", package = "hubUtils")
)
create_model_out_submit_tmpl(hub_con, round_id = "2023-01-02")
#> # A tibble: 3,132 × 7
#>    forecast_date target        horizon location output_type output_type_id value
#>    <date>        <chr>           <int> <chr>    <chr>       <chr>          <dbl>
#>  1 2023-01-02    wk flu hosp …       2 US       pmf         large_decrease    NA
#>  2 2023-01-02    wk flu hosp …       1 US       pmf         large_decrease    NA
#>  3 2023-01-02    wk flu hosp …       2 01       pmf         large_decrease    NA
#>  4 2023-01-02    wk flu hosp …       1 01       pmf         large_decrease    NA
#>  5 2023-01-02    wk flu hosp …       2 02       pmf         large_decrease    NA
#>  6 2023-01-02    wk flu hosp …       1 02       pmf         large_decrease    NA
#>  7 2023-01-02    wk flu hosp …       2 04       pmf         large_decrease    NA
#>  8 2023-01-02    wk flu hosp …       1 04       pmf         large_decrease    NA
#>  9 2023-01-02    wk flu hosp …       2 05       pmf         large_decrease    NA
#> 10 2023-01-02    wk flu hosp …       1 05       pmf         large_decrease    NA
#> # ℹ 3,122 more rows
create_model_out_submit_tmpl(
  hub_con,
  round_id = "2023-01-02",
  required_vals_only = TRUE
)
#> # A tibble: 0 × 7
#> # ℹ 7 variables: forecast_date <date>, target <chr>, horizon <int>,
#> #   location <chr>, output_type <chr>, output_type_id <chr>, value <dbl>
create_model_out_submit_tmpl(
  hub_con,
  round_id = "2023-01-02",
  required_vals_only = TRUE,
  complete_cases_only = FALSE
)
#> ! Column "target" whose values are all optional included as all `NA` column.
#> ! Round contains more than one modeling task (2)
#>  See Hub's tasks.json file or <hub_connection> attribute "config_tasks" for
#>   details of optional task ID/output_type/output_type ID value combinations.
#> # A tibble: 28 × 7
#>    forecast_date target horizon location output_type output_type_id value
#>    <date>        <chr>    <int> <chr>    <chr>       <chr>          <dbl>
#>  1 2023-01-02    NA           2 US       pmf         large_decrease    NA
#>  2 2023-01-02    NA           2 US       pmf         decrease          NA
#>  3 2023-01-02    NA           2 US       pmf         stable            NA
#>  4 2023-01-02    NA           2 US       pmf         increase          NA
#>  5 2023-01-02    NA           2 US       pmf         large_increase    NA
#>  6 2023-01-02    NA           2 US       quantile    0.01              NA
#>  7 2023-01-02    NA           2 US       quantile    0.025             NA
#>  8 2023-01-02    NA           2 US       quantile    0.05              NA
#>  9 2023-01-02    NA           2 US       quantile    0.1               NA
#> 10 2023-01-02    NA           2 US       quantile    0.15              NA
#> # ℹ 18 more rows
# Specifying a round in a hub with multiple rounds
hub_con <- connect_hub(
  system.file("testhubs/simple", package = "hubUtils")
)
create_model_out_submit_tmpl(hub_con, round_id = "2022-10-01")
#> # A tibble: 5,184 × 7
#>    origin_date target          horizon location output_type output_type_id value
#>    <date>      <chr>             <int> <chr>    <chr>                <dbl> <int>
#>  1 2022-10-01  wk inc flu hosp       1 US       mean                    NA    NA
#>  2 2022-10-01  wk inc flu hosp       2 US       mean                    NA    NA
#>  3 2022-10-01  wk inc flu hosp       3 US       mean                    NA    NA
#>  4 2022-10-01  wk inc flu hosp       4 US       mean                    NA    NA
#>  5 2022-10-01  wk inc flu hosp       1 01       mean                    NA    NA
#>  6 2022-10-01  wk inc flu hosp       2 01       mean                    NA    NA
#>  7 2022-10-01  wk inc flu hosp       3 01       mean                    NA    NA
#>  8 2022-10-01  wk inc flu hosp       4 01       mean                    NA    NA
#>  9 2022-10-01  wk inc flu hosp       1 02       mean                    NA    NA
#> 10 2022-10-01  wk inc flu hosp       2 02       mean                    NA    NA
#> # ℹ 5,174 more rows
create_model_out_submit_tmpl(hub_con, round_id = "2022-10-29")
#> # A tibble: 25,920 × 8
#>    origin_date target      horizon location age_group output_type output_type_id
#>    <date>      <chr>         <int> <chr>    <chr>     <chr>                <dbl>
#>  1 2022-10-29  wk inc flu…       1 US       65+       mean                    NA
#>  2 2022-10-29  wk inc flu…       2 US       65+       mean                    NA
#>  3 2022-10-29  wk inc flu…       3 US       65+       mean                    NA
#>  4 2022-10-29  wk inc flu…       4 US       65+       mean                    NA
#>  5 2022-10-29  wk inc flu…       1 01       65+       mean                    NA
#>  6 2022-10-29  wk inc flu…       2 01       65+       mean                    NA
#>  7 2022-10-29  wk inc flu…       3 01       65+       mean                    NA
#>  8 2022-10-29  wk inc flu…       4 01       65+       mean                    NA
#>  9 2022-10-29  wk inc flu…       1 02       65+       mean                    NA
#> 10 2022-10-29  wk inc flu…       2 02       65+       mean                    NA
#> # ℹ 25,910 more rows
#> # ℹ 1 more variable: value <int>
create_model_out_submit_tmpl(hub_con,
  round_id = "2022-10-29",
  required_vals_only = TRUE
)
#> # A tibble: 0 × 8
#> # ℹ 8 variables: origin_date <date>, target <chr>, horizon <int>,
#> #   location <chr>, age_group <chr>, output_type <chr>, output_type_id <dbl>,
#> #   value <int>
create_model_out_submit_tmpl(hub_con,
  round_id = "2022-10-29",
  required_vals_only = TRUE,
  complete_cases_only = FALSE
)
#> ! Column "location" whose values are all optional included as all `NA` column.
#>  See Hub's tasks.json file or <hub_connection> attribute "config_tasks" for
#>   details of optional task ID/output_type/output_type ID value combinations.
#> # A tibble: 23 × 8
#>    origin_date target      horizon location age_group output_type output_type_id
#>    <date>      <chr>         <int> <chr>    <chr>     <chr>                <dbl>
#>  1 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.01 
#>  2 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.025
#>  3 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.05 
#>  4 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.1  
#>  5 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.15 
#>  6 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.2  
#>  7 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.25 
#>  8 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.3  
#>  9 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.35 
#> 10 2022-10-29  wk inc flu…       1 NA       65+       quantile             0.4  
#> # ℹ 13 more rows
#> # ℹ 1 more variable: value <int>