Airflow: Lesser-Known Tips, Tricks, and Best Practices

Because an Airflow variable can contain a JSON value, you can store all of your DAG configuration inside a single variable. You can either store values in separate Airflow variables or under a single Airflow variable as JSON fields. You can then access them from that single variable, which is the recommended way.

(7) The “context” dictionary

Users often forget the contents of the context dictionary when using PythonOperator with a callable function.

The context contains references to objects related to the task instance. It is documented under the macros section of the API, because the same values are also available to templated fields.

{
    'dag': task.dag,
    'ds': ds,
    'next_ds': next_ds,
    'next_ds_nodash': next_ds_nodash,
    'prev_ds': prev_ds,
    'prev_ds_nodash': prev_ds_nodash,
    'ds_nodash': ds_nodash,
    'ts': ts,
    'ts_nodash': ts_nodash,
    'ts_nodash_with_tz': ts_nodash_with_tz,
    'yesterday_ds': yesterday_ds,
    'yesterday_ds_nodash': yesterday_ds_nodash,
    'tomorrow_ds': tomorrow_ds,
    'tomorrow_ds_nodash': tomorrow_ds_nodash,
    'END_DATE': ds,
    'end_date': ds,
    'dag_run': dag_run,
    'run_id': run_id,
    'execution_date': self.execution_date,
    'prev_execution_date': prev_execution_date,
    'next_execution_date': next_execution_date,
    'latest_date': ds,
    'macros': macros,
    'params': params,
    'tables': tables,
    'task': task,
    'task_instance': self,
    'ti': self,
    'task_instance_key_str': ti_key_str,
    'conf': configuration,
    'test_mode': self.test_mode,
    'var': {
        'value': VariableAccessor(),
        'json': VariableJsonAccessor()
    },
    'inlets': task.inlets,
    'outlets': task.outlets,
}

(8) Generating Dynamic Airflow Tasks

I have been answering many questions on StackOverflow about how to create dynamic tasks. The answer is simple: you just need to generate a unique task_id for each of your tasks.

(9) Run “airflow upgradedb” instead of “airflow initdb”

Thanks to Ash Berlin for this tip from his talk at the first Apache Airflow London Meetup.

airflow initdb will create all default connections, charts, etc. that we might not use and don’t want in our production database. airflow upgradedb will instead just apply any missing migrations to the database tables (including creating missing tables, etc.). It is also safe to run every time, because it tracks which migrations have already been applied (using the Alembic module).

Let me know in the comments section below if you know something that would be worth adding to this blog post. Happy Airflow’ing :-)
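To make the single JSON variable tip concrete, here is a minimal sketch. The variable name dag1_config and its fields (source_bucket, retries, email) are made up for illustration, and the JSON text is hard-coded so the snippet runs without an Airflow installation. In a real DAG file the dictionary would usually come from one Variable.get call, as noted in the closing comment.

```python
import json

# Hypothetical raw value of one Airflow variable named "dag1_config".
# The field names are illustrative assumptions, not taken from the post.
RAW_DAG1_CONFIG = (
    '{"source_bucket": "my-bucket", "retries": 3, "email": "team@example.com"}'
)


def load_dag_config(raw_value):
    """Parse the JSON text of a single variable into a dict of settings."""
    return json.loads(raw_value)


# One parse gives access to every setting the DAG needs.
dag_config = load_dag_config(RAW_DAG1_CONFIG)
source_bucket = dag_config["source_bucket"]
retries = dag_config["retries"]

# Inside a DAG file the same dict would typically come from a single lookup:
#   from airflow.models import Variable
#   dag_config = Variable.get("dag1_config", deserialize_json=True)
```

Reading one JSON variable means one metadata database lookup per DAG file parse, instead of one lookup per setting.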
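For the context dictionary in tip (7), the sketch below shows how a callable might read a couple of its keys. The function name describe_run and the values in fake_context are invented for illustration; only the keys ds and run_id come from the dictionary listed above. The context is simulated with a plain dict so the snippet runs on its own.

```python
def describe_run(**context):
    """Callable suitable for a PythonOperator.

    Airflow passes the context entries as keyword arguments, so they
    arrive here as a regular dict.
    """
    ds = context["ds"]
    run_id = context["run_id"]
    return "run {} for date {}".format(run_id, ds)


# Stand-in for the dictionary Airflow would build; the values are made up.
fake_context = {
    "ds": "2019-01-01",
    "ds_nodash": "20190101",
    "run_id": "manual__2019-01-01",
}
message = describe_run(**fake_context)

# In Airflow 1.x the operator needs provide_context=True for this to work:
#   PythonOperator(task_id="describe_run", python_callable=describe_run,
#                  provide_context=True, dag=dag)
# In Airflow 2.x the context is passed without that flag.
```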
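For the dynamic tasks in tip (8), the core of the idea is deriving a unique task_id per item. The helper build_task_ids, the prefix load, and the table names below are hypothetical. The snippet only builds the ids, so it runs without Airflow; the closing comment sketches how they might be handed to operators.

```python
def build_task_ids(prefix, names):
    """Return one task_id per name and fail early if any two collide."""
    task_ids = ["{}_{}".format(prefix, name) for name in names]
    if len(set(task_ids)) != len(task_ids):
        raise ValueError("task_id values must be unique within a DAG")
    return task_ids


# Hypothetical list of items to create one task for each.
tables = ["orders", "customers", "payments"]
task_ids = build_task_ids("load", tables)

# In a DAG file each id would be given to an operator in the same loop,
# for example (operator choice is an assumption):
#   for task_id in task_ids:
#       DummyOperator(task_id=task_id, dag=dag)
```

Checking for duplicates up front keeps the failure next to the list that caused it, rather than surfacing later when the DAG is parsed.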
