Apache Airflow is currently one of the most popular task orchestration tools available.

Some of the environment variables we define exist specifically for the Kubernetes integration on Airflow:

- `AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY`: as the name suggests, this env var is used to specify the Docker image to be used for workers. In the context of Kubernetes, workers will be run on a Pod.
- `AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG`: this env var is used to specify the Docker image tag.
- `AIRFLOW__KUBERNETES__DAGS_VOLUME_HOST`: this env var specifies the folder on the host machine (i.e. cluster node) where DAG files are stored. We'll see this in more detail later.
- `AIRFLOW__KUBERNETES__LOGS_VOLUME_CLAIM`: this env var specifies the Kubernetes volume claim to use to persist logs. We'll talk about this in more detail later.
- `AIRFLOW__KUBERNETES__ENV_FROM_CONFIGMAP_REF`: this specifies the name of the ConfigMap that stores the env vars (i.e. `airflow-envvars-configmap`). This will allow workers to load env vars from this ConfigMap when running.

For each Airflow component (i.e. Kubernetes pod) we're going to set up three volumes for different purposes using multiple Kubernetes tools: one for the requirements file (a ConfigMap), one for the DAGs (a hostPath) and one for the logs (a PersistentVolumeClaim).

There are multiple alternatives to save Airflow's logs on a Kubernetes deployment. In this guide, we'll define a Volume that will allow us to persist logs from all of Airflow's components; if we decided not to set up a volume, the Airflow workers' logs would be lost after they finish. To achieve this, we need to create a PersistentVolumeClaim object (sketched at the end of this section).

For the DAGs, we'll mount a folder from our host machine into the cluster node (see the mount sketch at the end of this section). NOTE: this mount process must stay alive for the mount to be accessible. Finally, having the mount running, pods will be able to mount the cluster node's folder, where they'll be able to read and write files that end up written on our host machine.

In this and the following sections, we'll define the necessary Kubernetes objects to run the different Airflow components. If you're not familiar with these Kubernetes concepts, I recommend having a quick read of the official Kubernetes documentation. First, we'll define a Deployment and a Service to run a PostgreSQL instance that Airflow will use as its metadata database (also sketched at the end of this section).

For the Airflow webserver, we could also define a Pod object directly, but in this case the pods will be automatically created by the Deployment:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: airflow-webserver
  labels:
    app: airflow-k8s
spec:
  selector:
    matchLabels:
      app: airflow-webserver
  replicas: 1
  template:
    metadata:
      labels:
        app: airflow-webserver
    spec:
      containers:
        - name: airflow-webserver
          image: puckel/docker-airflow:1.10.9
          envFrom:
            - configMapRef:
                name: airflow-envvars-configmap
          resources:
            limits:
              memory: "2Gi"
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: requirements-configmap
              subPath: "requirements.txt"
              mountPath: "/requirements.txt"
            - name: dags-host-volume
              mountPath: /usr/local/airflow/dags
            - name: logs-persistent-storage
              mountPath: /usr/local/airflow/logs
      volumes:
        - name: requirements-configmap
          configMap:
            name: requirements-configmap
        - name: dags-host-volume
          hostPath:
            path: /mnt/airflow/dags
            type: Directory
        - name: logs-persistent-storage
          persistentVolumeClaim:
            claimName: airflow-logs-pvc
```

- `volumes` section: as mentioned in the Volumes section of this guide, we need to define the volumes to be used by the pods. So, in this case, we define one volume for the requirements file, one for the DAGs and one for the logs persistent volume claim.
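The `airflow-envvars-configmap` referenced above is not included in this excerpt. Here is a minimal sketch of what it could look like, assuming Airflow's standard `AIRFLOW__<SECTION>__<KEY>` env var convention; the image, tag, DAGs path and logs claim are taken from the webserver Deployment, while the executor setting is an assumption:

```yaml
# Sketch of the env vars ConfigMap referenced by the Deployment above.
# Image, DAGs path and logs claim come from this guide's manifests;
# the executor line is an assumption about the guide's setup.
apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-envvars-configmap
data:
  AIRFLOW__CORE__EXECUTOR: KubernetesExecutor  # assumption: workers run as pods
  AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: puckel/docker-airflow
  AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: "1.10.9"
  AIRFLOW__KUBERNETES__DAGS_VOLUME_HOST: /mnt/airflow/dags
  AIRFLOW__KUBERNETES__LOGS_VOLUME_CLAIM: airflow-logs-pvc
  AIRFLOW__KUBERNETES__ENV_FROM_CONFIGMAP_REF: airflow-envvars-configmap
```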
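The PersistentVolumeClaim for the logs can be sketched as follows; only the claim name comes from the Deployment above, while the access mode and size are assumptions to adjust for your cluster:

```yaml
# Sketch of the logs PVC; the name matches claimName in the Deployment.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: airflow-logs-pvc
spec:
  accessModes:
    - ReadWriteMany  # assumption: several Airflow pods write logs concurrently
  resources:
    requests:
      storage: 1Gi   # assumption: size to your retention needs
```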
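For the DAGs mount, assuming a local minikube cluster (an assumption; other local clusters have equivalent mechanisms), exposing a local `dags` folder at the hostPath the Deployment expects could look like this:

```bash
# Assumption: minikube. Mount the local ./dags folder onto the cluster
# node at /mnt/airflow/dags (the hostPath used by the Deployment above).
# This command blocks: it must stay alive for the mount to be accessible.
minikube mount ./dags:/mnt/airflow/dags
```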
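The PostgreSQL Deployment and Service are also not part of this excerpt. A minimal sketch, where the image version, credentials and database name are illustrative assumptions (credentials belong in a Secret in any real setup):

```yaml
# Hypothetical PostgreSQL instance for Airflow's metadata database.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  labels:
    app: airflow-k8s
spec:
  selector:
    matchLabels:
      app: postgres
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:11        # assumed version
          env:
            - name: POSTGRES_USER
              value: airflow        # placeholder; use a Secret in practice
            - name: POSTGRES_PASSWORD
              value: airflow        # placeholder; use a Secret in practice
            - name: POSTGRES_DB
              value: airflow
          ports:
            - containerPort: 5432
---
# Service so Airflow pods can reach the database by a stable name.
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  selector:
    app: postgres
  ports:
    - port: 5432
```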
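Finally, the `requirements-configmap` mounted at `/requirements.txt` can be created directly from a local file (the puckel/docker-airflow entrypoint installs packages from that file if it exists). The manifest folder name below is an assumption:

```bash
# Create the ConfigMap the Deployment mounts at /requirements.txt;
# the key defaults to the file name, matching the subPath above.
kubectl create configmap requirements-configmap --from-file=requirements.txt

# Apply the manifests (assuming they are saved under ./kube).
kubectl apply -f kube/
```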