Skip to content

Transformations

Transformations allow you to transform data as it is passed between tasks.

You can do this inline using !Py, in multi-line snippets using !Multistep or encapsulated in a normal python function. The difference between transformations and functions is that transformations are applied before and can be to a specific argument.

In-line and Multi-line are described in depth in !Py and !Multistep.

Similarly to functions, transformations can be included in the project or referenced from the contributed / built-in repos.

  • typhoon.transformation_file.transformation_name for built-in / contrib transformations.
  • transformations.my_transformation_file.transformation_name for accessing your project transformations.

Transformation example

Applying a regex transformation to the data (element 1 which is the path) before its passed to the function.

  write_df_to_sql:
    input: read_csv_to_df
    function: functions.pandas_data_helpers.df_write
    args:
      hook: !Py $HOOK.transaction_db
      df: !Py $BATCH[0]
      table_name: !Py transformations.path_mapping.to_regex_name($BATCH[1])

Mixing !Multistep inline transformations and a flatten transformation. The dictionary will be flattened and then passed into csv in two lines before being written by the function. Note you can apply transformations independently to different arguments. In this case data path has a separate in-line transformation using !Py.

This can make very powerful tasks that are also concise and readable.

  write_csv:
    input: exchange_rate
    function: typhoon.filesystem.write_data
    args:
      hook: !Hook echo
      data: !MultiStep
        - !Py transformations.xr.flatten_response($BATCH)
        - !Py typhoon.data.dicts_to_csv($1, delimiter='|')
      path: !Py f'{$DAG_CONTEXT.interval_start.isoformat()}_xr_data.csv'