Create an arrow schema from a tasks.json config file, for use when opening an arrow dataset.
Arguments
- config_tasks
a list version of the contents of a hub's tasks.json config file, created using function read_config().
- partitions
a named list specifying the arrow data types of any partitioning column.
- output_type_id_datatype
character string. One of "auto", "character", "double", "integer", "logical", "Date". Defaults to "auto", indicating that output_type_id will be determined automatically from the tasks.json config file. Other data type values can be used to override automatic determination. Note that attempting to coerce output_type_id to a data type that is not possible (e.g. trying to coerce to "double" when the data contains "character" values) will likely result in an error or potentially unexpected behaviour, so use with care.
- r_schema
Logical. If FALSE (default), return an arrow::schema() object. If TRUE, return a character vector of R data types.
Value
An arrow schema object that can be used to define column data types when opening model output data. If r_schema = TRUE, a character vector of R data types.
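As a rough illustration of how a schema defines column data types when a dataset is opened, the sketch below uses only the arrow package. The directory, file name, column names and the hand-written schema are all hypothetical stand-ins; in practice the schema would come from create_hub_schema() and the data from a hub's model output files.

```r
library(arrow)

# Hypothetical stand-in for a model output directory
data_dir <- tempfile("model-output")
dir.create(data_dir)

# Write a small illustrative file (column names are assumptions)
write_parquet(
  data.frame(
    horizon = 1:2,
    output_type_id = c("0.25", "0.75"),
    value = c(10.5, 12),
    stringsAsFactors = FALSE
  ),
  file.path(data_dir, "forecast.parquet")
)

# Hand-written schema standing in for the output of create_hub_schema()
illustrative_schema <- schema(
  horizon = int32(),
  output_type_id = utf8(),
  value = float64()
)

# The schema fixes the column data types used when the dataset is opened
ds <- open_dataset(data_dir, schema = illustrative_schema)
ds$schema
```

Supplying the schema explicitly means column types do not depend on whatever arrow would infer from the first file it reads.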
Examples
hub_path <- system.file("testhubs/simple", package = "hubUtils")
config_tasks <- read_config(hub_path, "tasks")
schema <- create_hub_schema(config_tasks)
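The remaining arguments can be sketched in the same way. This assumes that create_hub_schema() and read_config() are both exported by the attached package (hubUtils here, to match the example above) and that the arrow package is installed. The partition column name "model_id" is an assumption made for illustration.

```r
# Assumption: create_hub_schema() and read_config() come from hubUtils,
# matching the example above; adjust if your version exports them elsewhere.
library(hubUtils)

hub_path <- system.file("testhubs/simple", package = "hubUtils")
config_tasks <- read_config(hub_path, "tasks")

# Return R data types as a character vector instead of an arrow schema
r_types <- create_hub_schema(config_tasks, r_schema = TRUE)

# Override automatic determination and treat output_type_id as character
schema_chr <- create_hub_schema(
  config_tasks,
  output_type_id_datatype = "character"
)

# Declare the arrow data type of a partitioning column;
# "model_id" is an assumed column name used for illustration
schema_part <- create_hub_schema(
  config_tasks,
  partitions = list(model_id = arrow::utf8())
)
```

Overriding with "character" is the most permissive choice, since the other types can fail to coerce when output_type_id values are mixed.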