LetsDevOps: CI/CD for Confluent Kafka Topics Using Python, GitHub Actions, and Confluent Cloud

Introduction

In this article we learn how to implement CI/CD for Confluent Kafka topics.


Confluent Cloud is a streaming service based on Apache Kafka, offered as a fully managed service.


DevOps Architecture


Use Case

As part of the Confluent CI/CD, we sync topics created in Dev to Pre-Prod, and from Pre-Prod to Prod. During the sync you have the option to customize configuration such as the partition count, retention policy, etc.


Technologies Used

  • GitHub Actions (YAML)

  • Python Script

  • REST API
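
Under the hood, the sync talks to the Confluent Cloud Kafka REST API (v3). As a rough sketch of what that looks like (the endpoint path follows the public API; the server and cluster values are placeholders), listing a cluster's topics could be:

```python
def topics_url(server: str, cluster_id: str) -> str:
    """Kafka REST API v3 endpoint for listing or creating topics on a cluster."""
    return f"https://{server}/kafka/v3/clusters/{cluster_id}/topics"

# A real call would look like this (needs the requests package and a valid API key/secret):
# resp = requests.get(topics_url(server, cluster_id), auth=(api_key, api_secret))
# topic_names = [t["topic_name"] for t in resp.json()["data"]]
```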

Setup CI/CD- Sync


Configure Repo

Since we are using a Python script and a GitHub Actions workflow to achieve the CI/CD, you can fork the GitHub repo below and import it into your own repo.


All of the solution files are publicly accessible in the repo linked below.

Let's discuss each of them one by one.


Workflow File

The workflow file is available here:


https://github.com/sumitraj0103/Letsdevops/blob/main/.github/workflows/Confluent-Kafka-Topics-Sync.yml




Deployment Script/requirements.txt Files

The deployment script and its dependency file:


https://github.com/sumitraj0103/Letsdevops/tree/main/Deployment-Scripts/Confluent-Sync


Configure Workflow File

Once the workflow file is imported into your GitHub repo, it's time to make the changes below for your environment.


https://github.com/sumitraj0103/Letsdevops/blob/main/.github/workflows/Confluent-Kafka-Topics-Sync.yml


Since we are syncing topic details from one environment to another, we have created two options:


npr-prp --> Sync from the Non-Prod to the Pre-Prod environment


prp-prd --> Sync from the Pre-Prod to the Prod environment


Parameters Details

  • Environment --> Environment pair to sync from and to

  • topic_list --> Comma-separated list of topics to be created

  • partitions_count --> Can be set to default or a custom value

  • replication_factor --> Can be set to default or a custom value

  • rentention_days --> Retention policy in days

  • RemoveList --> Comma-separated list of topics to be removed



on:
  workflow_dispatch:
    inputs:
      # Environment pair to sync (source to target)
      Environment:
        description: 'Environment --> npr-prp,prp-prd'
        required: true
        default: 'npr-prp'
      topic_list:
        description: 'Topics to Deploy--> , separated or set full for all topic to Deploy'
        required: true
        default: 'srs-test1,srstest2'
      partitions_count:
        description: 'Custom Partition Count else keep --> default'
        required: true
        default: 'default'
      replication_factor:
        description: 'replication_factor else keep --> default'
        required: false
        default: 'default'
      rentention_days:
        description: 'Retention days for logs, e.g. 7, 10. For infinite: -1'
        required: false
        default: 'default'
      RemoveList:
        description: 'Topics to Delete--> , separated for Multiple Value'
        required: false
        default: ''
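
The deployment script receives these inputs as environment variables (see the env: mapping later in the workflow). A minimal sketch of how they might be parsed — parse_topic_list and resolve_count are hypothetical helper names for illustration, not the repo's actual functions:

```python
from typing import List, Optional

def parse_topic_list(raw: str) -> Optional[List[str]]:
    """Split the comma-separated topic_list input; 'full' means sync every topic."""
    raw = raw.strip()
    if raw == "full":
        return None  # caller then syncs all topics from the source cluster
    return [t.strip() for t in raw.split(",") if t.strip()]

def resolve_count(raw: str, source_value: int) -> int:
    """'default' keeps the source cluster's value; anything else is a custom count."""
    return source_value if raw == "default" else int(raw)

# Example: topics = parse_topic_list(os.environ["topic_list"])
```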


Update Config for Non Prod-Pre Prod Sync

Under the env settings, update the values below.

  • Confluent_Source_Server --> Non Prod Confluent Server

  • Confluent_Target_Server --> Pre Prod Confluent Server

  • Confluent_Source_Cluster_Id --> Non Prod Cluster ID

  • Confluent_Target_Cluster_Id --> Pre Prod Cluster ID



  • keyVaultName-NPR --> Key Vault of Non Prod

  • keyVaultName-PRP --> Key Vault of Pre Prod



Each Key Vault holds the following two secrets.

  • confluentcluster-kafka-npr-key

  • confluentcluster-kafka-npr-secretkey

This is the cluster access key, which can be created or retrieved from the Confluent cluster. It is used by the Python script to authenticate its REST API calls. Follow the steps below to get the key and secret for the cluster.


For each cluster you need to create an API key, which consists of a key and a secret.
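
The key and secret are used for HTTP Basic authentication against the cluster's REST endpoint. A minimal sketch (standard Basic auth encoding; the header-builder function is illustrative, not the repo's code):

```python
import base64

def basic_auth_header(api_key: str, api_secret: str) -> dict:
    """Confluent's REST API accepts HTTP Basic auth: base64('<key>:<secret>')."""
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    return {"Authorization": f"Basic {token}"}
```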


GitHub Secret for the Authentication to Azure

Create the secret below, which has access to the Key Vault. As part of this step we connect to Azure to fetch the Key Vault secrets.




https://github.com/sumitraj0103/Letsdevops/blob/main/.github/workflows/Confluent-Kafka-Topics-Sync.yml


#############################################################
# Sync from NPR to PRP
#############################################################
  NPR-PRP:
    if: ${{ github.event.inputs.Environment == 'npr-prp' }}
    needs: Build
    runs-on: ubuntu-latest
    env:
       Confluent_Source_Server: 'pkc-epwny.eastus.azure.confluent.cloud'
       Confluent_Target_Server: 'pkc-epwny.eastus.azure.confluent.cloud'
       Confluent_Source_Cluster_Id: 'lkc-380qx0'
       Confluent_Target_Cluster_Id: 'lkc-do8dn1'
       keyVaultName-NPR: 'AKV-NPR'
       keyVaultName-PRP: 'AKV-PRP'
        
    steps:
    
    # Login to Azure
    - name: Login via Az module
      uses: Azure/login@v1.4.5
      with:
        creds: | 
          ${{ secrets.SP_NPR }}
          
    # Download secret from KeyVault Secrets
    - name: Download Confluent secrets from Key Vault
      uses: Azure/get-keyvault-secrets@v1
      with:
        keyvault: ${{ env.keyVaultName-PRP }}
        secrets: 'confluentcluster-kafka-prp-key,confluentcluster-kafka-prp-secretkey'
      id: ConfluentSecretActionPRP
  
    # Login to Azure
    - name: Login via Az module
      uses: Azure/login@v1.4.5
      with:
        creds: | 
          ${{ secrets.SP_NPR }}
          
    # Download secret from KeyVault Secrets
    - name: Download Confluent secrets from Key Vault
      uses: Azure/get-keyvault-secrets@v1
      with:
        keyvault: ${{ env.keyVaultName-NPR }}
        secrets: 'confluentcluster-kafka-npr-key,confluentcluster-kafka-npr-secretkey'
      id: ConfluentSecretActionNPR
      
    # Download Artifact: deployment-scripts
    - name: 'Download Artifact: Deployment Script' 
      uses: actions/download-artifact@v2
      with:
        name: 'deployment-scripts'
        path: ${{ github.workspace }}/Deployment-Scripts/Confluent-Sync/
        
    - name: Install dependencies
      run: |
        cd "$GITHUB_WORKSPACE/Deployment-Scripts/Confluent-Sync/"
        pip install -r requirements.txt
        
    - name: Sync Confluent Topics
      run: |
       chmod +x $GITHUB_WORKSPACE/Deployment-Scripts/Confluent-Sync/confluent_topics_deploy.py
       python3 $GITHUB_WORKSPACE/Deployment-Scripts/Confluent-Sync/confluent_topics_deploy.py
      env:
        source_api_key: ${{ steps.ConfluentSecretActionNPR.outputs.confluentcluster-kafka-npr-key }}
        source_api_secret: ${{ steps.ConfluentSecretActionNPR.outputs.confluentcluster-kafka-npr-secretkey }}
        target_api_key: ${{ steps.ConfluentSecretActionPRP.outputs.confluentcluster-kafka-prp-key }}
        target_api_secret: ${{ steps.ConfluentSecretActionPRP.outputs.confluentcluster-kafka-prp-secretkey }}
        source_server: ${{ env.Confluent_Source_Server }}
        target_server: ${{ env.Confluent_Target_Server }}
        source_cluster_id: ${{ env.Confluent_Source_Cluster_Id }}
        target_cluster_id: ${{ env.Confluent_Target_Cluster_Id }}
        topic_list: ${{ github.event.inputs.topic_list }}
        topic_deletelist: ${{ github.event.inputs.RemoveList }}
        partitions_count: ${{ github.event.inputs.partitions_count }}
        replication_factor: ${{ github.event.inputs.replication_factor }}
        rentention_days: ${{ github.event.inputs.rentention_days }}
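
Inside confluent_topics_deploy.py, creating a topic on the target cluster amounts to a POST against the same topics endpoint. A sketch of how the request body could be assembled (the field names follow the public Kafka REST API v3; the helper itself is illustrative, not the repo's actual code):

```python
def create_topic_payload(name: str, partitions: str, replication: str,
                         retention_days: str) -> dict:
    """JSON body for POST /kafka/v3/clusters/{cluster_id}/topics.
    Inputs set to 'default' are omitted so the cluster defaults apply."""
    payload = {"topic_name": name}
    if partitions != "default":
        payload["partitions_count"] = int(partitions)
    if replication != "default":
        payload["replication_factor"] = int(replication)
    if retention_days != "default":
        days = int(retention_days)
        # retention.ms is the topic config; -1 means infinite retention
        ms = -1 if days == -1 else days * 24 * 60 * 60 * 1000
        payload["configs"] = [{"name": "retention.ms", "value": str(ms)}]
    return payload
```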


Run Workflow

Once all the configuration is complete, we are ready to test the workflow. For the demo, I have set it up for a Non-Prod to Pre-Prod sync.




Workflow Output




Demo