airflow.operators.hive_to_mysql

Module Contents

class airflow.operators.hive_to_mysql.HiveToMySqlTransfer(sql, mysql_table, hiveserver2_conn_id='hiveserver2_default', mysql_conn_id='mysql_default', mysql_preoperator=None, mysql_postoperator=None, bulk_load=False, *args, **kwargs)[source]

Bases: airflow.models.BaseOperator

Moves data from Hive to MySQL. For now, the data is loaded into memory before being pushed to MySQL, so this operator should only be used for small amounts of data.

Parameters
  • sql (str) – SQL query to execute against Hive server. (templated)

  • mysql_table (str) – target MySQL table; use dot notation to target a specific database. (templated)

  • mysql_conn_id (str) – destination MySQL connection

  • hiveserver2_conn_id (str) – source Hive connection

  • mysql_preoperator (str) – SQL statement to run against MySQL prior to the import, typically used to truncate or delete the rows that the incoming data replaces, allowing the task to be idempotent (running the task twice won't double-load data). (templated)

  • mysql_postoperator (str) – SQL statement to run against MySQL after the import, typically used to move data from staging to production and issue cleanup commands. (templated)

  • bulk_load (bool) – whether to use the bulk-load option, which loads MySQL directly from a tab-delimited text file using the LOAD DATA LOCAL INFILE command. This option requires an extra connection parameter on the destination MySQL connection: {"local_infile": true}.
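
The parameters above might be combined as in the following minimal sketch. Only the parameter names and the module path come from this reference; the task id, the Hive query, the table names, and the helper build_transfer are hypothetical and chosen for illustration. The operator is imported inside the helper, so the snippet can be read and run without an Airflow installation, and the helper itself assumes an Airflow version that still ships this module.

```python
import json

# Keyword arguments for the operator. The query and table names are
# hypothetical; the keys match the documented parameters.
transfer_kwargs = dict(
    task_id="hive_to_mysql_daily_counts",  # hypothetical task id
    sql=(
        "SELECT ds, country, COUNT(*) AS n "
        "FROM events WHERE ds = '{{ ds }}' GROUP BY ds, country"
    ),
    # Dot notation targets a specific database (here: reporting).
    mysql_table="reporting.daily_counts",
    hiveserver2_conn_id="hiveserver2_default",  # source Hive connection
    mysql_conn_id="mysql_default",              # destination MySQL connection
    # Delete the partition being reloaded so a rerun does not double-load.
    mysql_preoperator="DELETE FROM reporting.daily_counts WHERE ds = '{{ ds }}'",
    mysql_postoperator=None,
    bulk_load=False,
)

# If bulk_load=True is used, the destination MySQL connection needs this
# value in its "extra" field (a JSON document).
bulk_load_extra = json.dumps({"local_infile": True})


def build_transfer(dag):
    """Attach the transfer task to an existing DAG (hypothetical helper).

    Requires an Airflow installation that provides this module.
    """
    from airflow.operators.hive_to_mysql import HiveToMySqlTransfer

    return HiveToMySqlTransfer(dag=dag, **transfer_kwargs)
```

Because sql and mysql_preoperator are templated fields, the {{ ds }} placeholders are rendered per run, so the delete statement and the Hive query address the same date partition.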

template_fields = ['sql', 'mysql_table', 'mysql_preoperator', 'mysql_postoperator'][source]
template_ext = ['.sql'][source]
ui_color = #a0e08c[source]
execute(self, context)[source]
