DatabricksCopyIntoOperator¶
Use the DatabricksCopyIntoOperator to import data into a Databricks table using the COPY INTO command.
Using the Operator¶
The operator loads data from a specified location into a table using a configured endpoint. The only required parameters are:
- table_name - string with the table name
- file_location - string with the URI of data to load
- file_format - string specifying the file format of data to load. Supported formats are CSV, JSON, AVRO, ORC, PARQUET, TEXT, BINARYFILE.
- One of sql_endpoint_name (name of the Databricks SQL endpoint to use) or http_path (HTTP path for a Databricks SQL endpoint or Databricks cluster).
Other parameters are optional and can be found in the class documentation.
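For instance, a minimal sketch using only the required parameters, with http_path instead of sql_endpoint_name, might look like the following (the connection ID, HTTP path, table name, and file location are placeholder values, not taken from this page):

    from airflow.providers.databricks.operators.databricks_sql import DatabricksCopyIntoOperator

    # Minimal sketch with only the required parameters set; the HTTP path
    # below is a placeholder pointing at a Databricks SQL endpoint.
    import_json = DatabricksCopyIntoOperator(
        task_id="import_json",
        databricks_conn_id="databricks_default",
        http_path="/sql/1.0/endpoints/1234567890abcdef",
        table_name="my_table",
        file_format="JSON",
        file_location="s3://my-bucket/json-data/",
    )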
Examples¶
Importing CSV data¶
An example usage of the DatabricksCopyIntoOperator to import CSV data into a table is as follows:
    from airflow.providers.databricks.operators.databricks_sql import DatabricksCopyIntoOperator

    # Example of importing data using the COPY INTO SQL command;
    # connection_id and sql_endpoint_name are defined elsewhere in the DAG file.
    import_csv = DatabricksCopyIntoOperator(
        task_id="import_csv",
        databricks_conn_id=connection_id,
        sql_endpoint_name=sql_endpoint_name,
        table_name="my_table",
        file_format="CSV",
        file_location="abfss://container@account.dfs.core.windows.net/my-data/csv",
        # treat the first line of each CSV file as a header row
        format_options={"header": "true"},
        # re-load files even if they have already been loaded before
        force_copy=True,
    )
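The optional parameters mirror the clauses of the COPY INTO command. As an illustrative sketch only (the pattern and location values are placeholders; check the class documentation for the full parameter list), restricting a load to files matching a glob pattern might look like:

    # Sketch of a filtered import: pattern restricts which files under
    # file_location are loaded (values here are illustrative only).
    import_filtered = DatabricksCopyIntoOperator(
        task_id="import_filtered",
        databricks_conn_id=connection_id,
        sql_endpoint_name=sql_endpoint_name,
        table_name="my_table",
        file_format="CSV",
        file_location="abfss://container@account.dfs.core.windows.net/my-data/csv",
        pattern="*.csv",
        format_options={"header": "true"},
    )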