airflow.providers.apache.sqoop.operators.sqoop¶
This module contains a Sqoop 1 operator.
Module Contents¶
class airflow.providers.apache.sqoop.operators.sqoop.SqoopOperator(*, conn_id: str = 'sqoop_default', cmd_type: str = 'import', table: Optional[str] = None, query: Optional[str] = None, target_dir: Optional[str] = None, append: bool = False, file_type: str = 'text', columns: Optional[str] = None, num_mappers: Optional[int] = None, split_by: Optional[str] = None, where: Optional[str] = None, export_dir: Optional[str] = None, input_null_string: Optional[str] = None, input_null_non_string: Optional[str] = None, staging_table: Optional[str] = None, clear_staging_table: bool = False, enclosed_by: Optional[str] = None, escaped_by: Optional[str] = None, input_fields_terminated_by: Optional[str] = None, input_lines_terminated_by: Optional[str] = None, input_optionally_enclosed_by: Optional[str] = None, batch: bool = False, direct: bool = False, driver: Optional[Any] = None, verbose: bool = False, relaxed_isolation: bool = False, properties: Optional[Dict[str, Any]] = None, hcatalog_database: Optional[str] = None, hcatalog_table: Optional[str] = None, create_hcatalog_table: bool = False, extra_import_options: Optional[Dict[str, Any]] = None, extra_export_options: Optional[Dict[str, Any]] = None, schema: Optional[str] = None, **kwargs)[source]¶

Bases: airflow.models.BaseOperator

Execute a Sqoop job. Documentation for Apache Sqoop can be found here: https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html
Parameters
conn_id -- The Airflow connection id to use
cmd_type -- The Sqoop command to execute: "export" or "import"
schema -- Schema name
table -- Table to read
query -- Import result of arbitrary SQL query. Instead of using the table, columns and where arguments, you can specify a SQL statement with the query argument. Must also specify a destination directory with target_dir.
target_dir -- HDFS destination directory where the data from the rdbms will be written
append -- Append data to an existing dataset in HDFS
file_type -- "avro", "sequence", "text" Imports data into the specified format. Defaults to text.
columns -- <col,col,col> Columns to import from table
num_mappers -- Use n mapper tasks to import/export in parallel
split_by -- Column of the table used to split work units
where -- WHERE clause to use during import
export_dir -- HDFS Hive database directory to export to the rdbms
input_null_string -- The string to be interpreted as null for string columns
input_null_non_string -- The string to be interpreted as null for non-string columns
staging_table -- The table in which data will be staged before being inserted into the destination table
clear_staging_table -- Indicate that any data present in the staging table can be deleted
enclosed_by -- Sets a required field enclosing character
escaped_by -- Sets the escape character
input_fields_terminated_by -- Sets the input field separator
input_lines_terminated_by -- Sets the input end-of-line character
input_optionally_enclosed_by -- Sets a field enclosing character
batch -- Use batch mode for underlying statement execution
direct -- Use direct export fast path
driver -- Manually specify JDBC driver class to use
verbose -- Switch to more verbose logging for debug purposes
relaxed_isolation -- Use read uncommitted isolation level
hcatalog_database -- Specifies the database name for the HCatalog table
hcatalog_table -- Specifies the HCatalog table name
create_hcatalog_table -- Whether to have Sqoop create the specified HCatalog table
properties -- Additional JVM properties passed to sqoop
extra_import_options -- Extra import options to pass as a dict. If a key doesn't have a value, just pass an empty string for it. Don't include the -- prefix for sqoop options.
extra_export_options -- Extra export options to pass as a dict. If a key doesn't have a value, just pass an empty string for it. Don't include the -- prefix for sqoop options.
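The convention described above for extra_import_options / extra_export_options (keys without the -- prefix, empty string for bare flags) can be sketched as follows. This is a minimal illustration of the documented convention, not the hook's actual implementation; render_extra_options is a hypothetical helper:

```python
def render_extra_options(options):
    """Turn {'option': 'value'} pairs into Sqoop CLI arguments.

    A key with an empty-string value yields a bare flag, per the
    convention above; the -- prefix is added here, not in the dict.
    """
    args = []
    for key, value in options.items():
        args.append("--" + key)
        if value:
            args.append(str(value))
    return args

# e.g. extra_import_options for a compressed, incremental import
opts = {"compress": "", "check-column": "id", "last-value": "1000"}
print(render_extra_options(opts))
# → ['--compress', '--check-column', 'id', '--last-value', '1000']
```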
template_fields = ['conn_id', 'cmd_type', 'table', 'query', 'target_dir', 'file_type', 'columns', 'split_by', 'where', 'export_dir', 'input_null_string', 'input_null_non_string', 'staging_table', 'enclosed_by', 'escaped_by', 'input_fields_terminated_by', 'input_lines_terminated_by', 'input_optionally_enclosed_by', 'properties', 'extra_import_options', 'driver', 'extra_export_options', 'hcatalog_database', 'hcatalog_table', 'schema'][source]¶
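Because fields such as where and target_dir appear in template_fields, Airflow renders them with Jinja against the run's context before the operator executes. A minimal stand-in illustration of that rendering step (plain string substitution, not Airflow's real Jinja engine; the ds value is a hypothetical logical date):

```python
# Fields listed in template_fields are rendered per run; here we mimic
# rendering of the built-in {{ ds }} macro with simple substitution.
context = {"ds": "2024-01-01"}  # hypothetical logical date of the run

where_template = "updated_at >= '{{ ds }}'"
target_dir_template = "/data/orders/{{ ds }}"

rendered_where = where_template.replace("{{ ds }}", context["ds"])
rendered_target_dir = target_dir_template.replace("{{ ds }}", context["ds"])

print(rendered_where)       # → updated_at >= '2024-01-01'
print(rendered_target_dir)  # → /data/orders/2024-01-01
```

In a real DAG you would pass the templated strings directly to the SqoopOperator keyword arguments; Airflow performs the rendering automatically at task run time.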