
Create a model output submission file template
Source: R/create_model_out_submit_tmpl.R
Usage
create_model_out_submit_tmpl(
  hub_con,
  config_tasks,
  round_id,
  required_vals_only = FALSE,
  complete_cases_only = TRUE
)
Arguments
- hub_con
A <hub_connection> class object.
- config_tasks
A list version of the contents of a hub's tasks.json config file, accessed through the "config_tasks" attribute of a <hub_connection> object or the function read_config().
- round_id
Character string. Round identifier. If the round is set to round_id_from_variable: true, IDs are values of the task ID defined in the round's round_id property of config_tasks. Otherwise, it should match the round's round_id value in the config. Ignored if the hub contains only a single round.
- required_vals_only
Logical. Whether to return only combinations of Task ID and related output type ID required values.
- complete_cases_only
Logical. If TRUE (default) and required_vals_only = TRUE, only rows with complete cases of combinations of required values are returned. If FALSE, rows with incomplete cases of combinations of required values are included in the output.
Value
A tibble template containing an expanded grid of valid task ID and output type ID value combinations for a given submission round and output type. If required_vals_only = TRUE, values are limited to combinations of required values only.
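The idea of an "expanded grid" can be illustrated with a minimal base R sketch. This is not the package's implementation; the task ID names and values below are hypothetical and do not come from any real hub config.

```r
# Hypothetical task ID and output type ID values (illustration only).
task_values <- list(
  origin_date = as.Date("2022-10-01"),
  horizon = 1:2,
  location = c("US", "01"),
  output_type = "quantile",
  output_type_id = c(0.25, 0.5, 0.75)
)

# expand.grid() returns one row per combination of the supplied values.
tmpl <- expand.grid(task_values, stringsAsFactors = FALSE)

# The template leaves `value` empty for the submitting team to fill in.
tmpl$value <- NA_real_

# 1 date x 2 horizons x 2 locations x 1 output type x 3 quantile levels
nrow(tmpl)
```

Each row is one combination that a submission could contain; the real function derives the valid values from the hub's tasks.json rather than from a hand-written list.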
Details
For task IDs or output_type_ids where all values are optional, by default, columns are included as columns of NAs when required_vals_only = TRUE.
When such columns exist, the function returns a tibble with zero rows, as no complete cases of required value combinations exist.
(Note that determination of complete cases excludes valid NA output_type_id values in "mean" and "median" output types.)
To return a template of incomplete required cases, which includes NA columns, use complete_cases_only = FALSE.
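The complete-cases rule described above can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the helper is_complete() and the example data are hypothetical.

```r
# Hypothetical required-value combinations. `location` has no required
# values, so it is represented as an all-NA column.
required <- data.frame(
  horizon = c(1L, 1L, 1L),
  location = NA_character_,
  output_type = c("mean", "quantile", "quantile"),
  output_type_id = c(NA, 0.25, 0.75),
  stringsAsFactors = FALSE
)

# Hypothetical helper: a row is complete when no column is NA, except that
# an NA output_type_id is accepted for "mean" and "median" output types.
is_complete <- function(df) {
  check_cols <- setdiff(names(df), "output_type_id")
  cols_ok <- complete.cases(df[, check_cols, drop = FALSE])
  id_ok <- !is.na(df$output_type_id) | df$output_type %in% c("mean", "median")
  cols_ok & id_ok
}

# Every row has an NA location, so no complete cases remain.
complete_only <- required[is_complete(required), ]
nrow(complete_only)

# With a location supplied, all rows (including the "mean" row) are complete.
filled <- required
filled$location <- "US"
nrow(filled[is_complete(filled), ])
```

The first result mirrors the zero-row tibble returned when complete_cases_only = TRUE; keeping the incomplete rows corresponds to complete_cases_only = FALSE.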
When a round is set to round_id_from_variable: true, the value of the task ID from which round IDs are derived (i.e. the task ID specified in the round_id property of config_tasks) is set to the value of the round_id argument in the returned output.
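A minimal base R sketch of this behaviour follows, assuming a hypothetical round whose round IDs are derived from an origin_date task ID. The variable names and values are illustrative and are not read from a real config.

```r
# Hypothetical inputs: the requested round and the task ID it is derived from.
round_id <- "2022-10-29"
round_id_var <- "origin_date"

# Hypothetical task ID values as they might appear in a config.
config_values <- list(
  origin_date = c("2022-10-01", "2022-10-29"),
  horizon = 1:2
)

# Pin the round-defining task ID to the requested round before expanding,
# so every row of the template carries that round's value.
config_values[[round_id_var]] <- round_id
tmpl <- expand.grid(config_values, stringsAsFactors = FALSE)

unique(tmpl$origin_date)
```

Only the requested round's value appears in the origin_date column, whatever other values the config lists for that task ID.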
Examples
hub_con <- connect_hub(
  system.file("testhubs/flusight", package = "hubUtils")
)
create_model_out_submit_tmpl(hub_con, round_id = "2023-01-02")
#> # A tibble: 3,132 × 7
#> forecast_date target horizon location output_type output_type_id value
#> <date> <chr> <int> <chr> <chr> <chr> <dbl>
#> 1 2023-01-02 wk flu hosp … 2 US pmf large_decrease NA
#> 2 2023-01-02 wk flu hosp … 1 US pmf large_decrease NA
#> 3 2023-01-02 wk flu hosp … 2 01 pmf large_decrease NA
#> 4 2023-01-02 wk flu hosp … 1 01 pmf large_decrease NA
#> 5 2023-01-02 wk flu hosp … 2 02 pmf large_decrease NA
#> 6 2023-01-02 wk flu hosp … 1 02 pmf large_decrease NA
#> 7 2023-01-02 wk flu hosp … 2 04 pmf large_decrease NA
#> 8 2023-01-02 wk flu hosp … 1 04 pmf large_decrease NA
#> 9 2023-01-02 wk flu hosp … 2 05 pmf large_decrease NA
#> 10 2023-01-02 wk flu hosp … 1 05 pmf large_decrease NA
#> # ℹ 3,122 more rows
create_model_out_submit_tmpl(
  hub_con,
  round_id = "2023-01-02",
  required_vals_only = TRUE
)
#> # A tibble: 0 × 7
#> # ℹ 7 variables: forecast_date <date>, target <chr>, horizon <int>,
#> # location <chr>, output_type <chr>, output_type_id <chr>, value <dbl>
create_model_out_submit_tmpl(
  hub_con,
  round_id = "2023-01-02",
  required_vals_only = TRUE,
  complete_cases_only = FALSE
)
#> ! Column "target" whose values are all optional included as all `NA` column.
#> ! Round contains more than one modeling task (2)
#> ℹ See Hub's tasks.json file or <hub_connection> attribute "config_tasks" for
#> details of optional task ID/output_type/output_type ID value combinations.
#> # A tibble: 28 × 7
#> forecast_date target horizon location output_type output_type_id value
#> <date> <chr> <int> <chr> <chr> <chr> <dbl>
#> 1 2023-01-02 NA 2 US pmf large_decrease NA
#> 2 2023-01-02 NA 2 US pmf decrease NA
#> 3 2023-01-02 NA 2 US pmf stable NA
#> 4 2023-01-02 NA 2 US pmf increase NA
#> 5 2023-01-02 NA 2 US pmf large_increase NA
#> 6 2023-01-02 NA 2 US quantile 0.01 NA
#> 7 2023-01-02 NA 2 US quantile 0.025 NA
#> 8 2023-01-02 NA 2 US quantile 0.05 NA
#> 9 2023-01-02 NA 2 US quantile 0.1 NA
#> 10 2023-01-02 NA 2 US quantile 0.15 NA
#> # ℹ 18 more rows
# Specifying a round in a hub with multiple rounds
hub_con <- connect_hub(
  system.file("testhubs/simple", package = "hubUtils")
)
create_model_out_submit_tmpl(hub_con, round_id = "2022-10-01")
#> # A tibble: 5,184 × 7
#> origin_date target horizon location output_type output_type_id value
#> <date> <chr> <int> <chr> <chr> <dbl> <int>
#> 1 2022-10-01 wk inc flu hosp 1 US mean NA NA
#> 2 2022-10-01 wk inc flu hosp 2 US mean NA NA
#> 3 2022-10-01 wk inc flu hosp 3 US mean NA NA
#> 4 2022-10-01 wk inc flu hosp 4 US mean NA NA
#> 5 2022-10-01 wk inc flu hosp 1 01 mean NA NA
#> 6 2022-10-01 wk inc flu hosp 2 01 mean NA NA
#> 7 2022-10-01 wk inc flu hosp 3 01 mean NA NA
#> 8 2022-10-01 wk inc flu hosp 4 01 mean NA NA
#> 9 2022-10-01 wk inc flu hosp 1 02 mean NA NA
#> 10 2022-10-01 wk inc flu hosp 2 02 mean NA NA
#> # ℹ 5,174 more rows
create_model_out_submit_tmpl(hub_con, round_id = "2022-10-29")
#> # A tibble: 25,920 × 8
#> origin_date target horizon location age_group output_type output_type_id
#> <date> <chr> <int> <chr> <chr> <chr> <dbl>
#> 1 2022-10-29 wk inc flu… 1 US 65+ mean NA
#> 2 2022-10-29 wk inc flu… 2 US 65+ mean NA
#> 3 2022-10-29 wk inc flu… 3 US 65+ mean NA
#> 4 2022-10-29 wk inc flu… 4 US 65+ mean NA
#> 5 2022-10-29 wk inc flu… 1 01 65+ mean NA
#> 6 2022-10-29 wk inc flu… 2 01 65+ mean NA
#> 7 2022-10-29 wk inc flu… 3 01 65+ mean NA
#> 8 2022-10-29 wk inc flu… 4 01 65+ mean NA
#> 9 2022-10-29 wk inc flu… 1 02 65+ mean NA
#> 10 2022-10-29 wk inc flu… 2 02 65+ mean NA
#> # ℹ 25,910 more rows
#> # ℹ 1 more variable: value <int>
create_model_out_submit_tmpl(
  hub_con,
  round_id = "2022-10-29",
  required_vals_only = TRUE
)
#> # A tibble: 0 × 8
#> # ℹ 8 variables: origin_date <date>, target <chr>, horizon <int>,
#> # location <chr>, age_group <chr>, output_type <chr>, output_type_id <dbl>,
#> # value <int>
create_model_out_submit_tmpl(
  hub_con,
  round_id = "2022-10-29",
  required_vals_only = TRUE,
  complete_cases_only = FALSE
)
#> ! Column "location" whose values are all optional included as all `NA` column.
#> ℹ See Hub's tasks.json file or <hub_connection> attribute "config_tasks" for
#> details of optional task ID/output_type/output_type ID value combinations.
#> # A tibble: 23 × 8
#> origin_date target horizon location age_group output_type output_type_id
#> <date> <chr> <int> <chr> <chr> <chr> <dbl>
#> 1 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.01
#> 2 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.025
#> 3 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.05
#> 4 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.1
#> 5 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.15
#> 6 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.2
#> 7 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.25
#> 8 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.3
#> 9 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.35
#> 10 2022-10-29 wk inc flu… 1 NA 65+ quantile 0.4
#> # ℹ 13 more rows
#> # ℹ 1 more variable: value <int>