Customising Airflow: Beyond Boilerplate Settings

$ screen
$ airflow webserver -p 8080

The following will launch Airflow's scheduler.

$ screen
$ airflow scheduler

If you open Airflow's Web UI you can "unpause" the "example_bash_operator" job and manually trigger it by clicking the play button in the controls section on the right. Log files read via the Web UI should state they're being read off of S3. If you don't see this message, it could be that the logs haven't yet finished uploading.

*** Reading remote log from s3://<your s3 bucket>/airflow-logs/example_bash_operator/run_this_last/2018-06-21T14:52:34.689800/1.log.

Airflow Maintenance DAGs

Robert Sanders of Clairvoyant has a repository containing three Airflow jobs to help keep Airflow operating smoothly. The db-cleanup job will clear out old entries in six of Airflow's database tables. The log-cleanup job will remove log files stored in ~/airflow/logs that are older than 30 days (note this will not affect logs stored on S3). Finally, kill-halted-tasks kills lingering processes left running in the background after you've killed off a running job in Airflow's Web UI.

Below I'll create a folder for Airflow's jobs and clone the repository.

$ mkdir -p ~/airflow/dags
$ git clone https://github.com/teamclairvoyant/airflow-maintenance-dags.git ~/airflow/dags/maintenance

If Airflow's scheduler is running, you'll need to restart it in order to pick up these new jobs. Below I'll "unpause" the three jobs so they'll start running on a scheduled basis.

$ for DAG in airflow-db-cleanup airflow-kill-halted-tasks airflow-log-cleanup; do
      airflow unpause $DAG
  done

Create Your Own Airflow Pages

Airflow supports plugins in the form of new operators, hooks, executors, macros, Web UI pages (called views) and menu links.
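At their core, the maintenance jobs above are age-based deletes against Airflow's metadata database and log directory. The following is a minimal sketch of that pattern using Python's built-in sqlite3 module as a stand-in for the metadata database; the table and column names here are illustrative, not the ones the db-cleanup DAG actually targets.

```python
import sqlite3
from datetime import datetime, timedelta

def cleanup_old_rows(conn, table, ts_column, max_age_days=30):
    """Delete rows whose timestamp column is older than max_age_days."""
    cutoff = datetime.utcnow() - timedelta(days=max_age_days)
    cur = conn.execute(
        # Table/column names can't be bound as SQL parameters, so they're
        # interpolated here; they should come from a trusted, hard-coded list.
        "DELETE FROM %s WHERE %s < ?" % (table, ts_column),
        (cutoff.isoformat(sep=' '),)
    )
    conn.commit()
    return cur.rowcount  # number of rows removed

# Demo against an in-memory database with one old and one recent entry.
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE task_log (message TEXT, created_at TEXT)")
old_ts = (datetime.utcnow() - timedelta(days=60)).isoformat(sep=' ')
new_ts = datetime.utcnow().isoformat(sep=' ')
conn.execute("INSERT INTO task_log VALUES ('old run', ?)", (old_ts,))
conn.execute("INSERT INTO task_log VALUES ('recent run', ?)", (new_ts,))

removed = cleanup_old_rows(conn, 'task_log', 'created_at', max_age_days=30)
print(removed)  # the single 60-day-old row is deleted
```

The string comparison works because ISO 8601 timestamps sort lexicographically; the real db-cleanup DAG runs equivalent deletes via Airflow's own SQLAlchemy models on a schedule.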
