Kafka topics dump and import

From time to time you may need to export all the topic definitions from a Kafka server so you can recreate them on a new one. The Kafka command line tools are handy for this, but I could not find a ready-made script that could both dump topic definitions to a CSV file and read that file back to create the topics, so I wrote two scripts of my own.

They are basic and rough, and I have only tested them in standard usage, but they work. If you find a bug or have a suggestion for improvement, please feel free to leave a comment.

The first script dumps all the topic definitions to a CSV file. Each line contains the topic name, the number of partitions, the replication factor and any additional configuration parameters set on the topic. Every value is intentionally preceded by a keyword so that the import script can do some basic validation. Here’s the topicsList.sh script:

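#!/bin/sh

# Keep only the per-topic summary lines (skip the per-partition details).
# The parsing assumes "Key:value" fields with no space after the colon.
./kafka-topics.sh --describe --zookeeper localhost:2181 | grep -v "Partition: " | while read -r Topic Partitions Replica Configs; do
	# Keep everything after the first colon, so values containing ':' survive
	topicName=$(echo "$Topic" | cut -d ':' -f 2-)
	partitionsCount=$(echo "$Partitions" | cut -d ':' -f 2-)
	replicaCount=$(echo "$Replica" | cut -d ':' -f 2-)
	configDetails=$(echo "$Configs" | cut -d ':' -f 2-)
	echo "Topic,$topicName,Partitions,$partitionsCount,Replica,$replicaCount,Configs,$configDetails"
done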
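If you want to check the field extraction without a running Kafka server, a minimal sketch along these lines may help. The function name `describe_line_to_csv` is hypothetical (it is not part of topicsList.sh), and it assumes a summary line made of tab-separated `Key:value` fields, which is the layout the script above relies on.

```shell
#!/bin/sh
# Hypothetical helper: turn one topic summary line into the CSV layout.
# Assumes whitespace-separated "Key:value" fields, as topicsList.sh does.
describe_line_to_csv() {
	printf '%s\n' "$1" | while read -r Topic Partitions Replica Configs; do
		# ${var#*:} drops everything up to and including the first ':'
		printf '%s\n' "Topic,${Topic#*:},Partitions,${Partitions#*:},Replica,${Replica#*:},Configs,${Configs#*:}"
	done
}

# Sample summary line built by hand (not real server output)
describe_line_to_csv "$(printf 'Topic:bss.order.event\tPartitionCount:12\tReplicationFactor:2\tConfigs:retention.ms=345600000')"
```

Feeding it a hand-written line is a quick way to see whether the output of your Kafka version still matches what the parsing expects.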

Here’s a sample CSV produced by topicsList.sh:

Topic,bss.order.event,Partitions,12,Replica,2,Configs,retention.ms=345600000
Topic,bss.order.event-dlt,Partitions,3,Replica,3,Configs,
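Because every value is preceded by a keyword, the file can be checked before anything is created on the target cluster. The sketch below is one possible pre-flight check, not part of the original scripts: `check_topics_csv` is a hypothetical name, and the rule that partitions and replicas must be plain digits is my own assumption about what a valid line looks like.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a dumped CSV file.
# Reports the first malformed line and returns 1; returns 0 if all lines pass.
check_topics_csv() {
	lineNo=0
	while IFS="," read -r k1 topic k2 partitions k3 replica k4 configs; do
		lineNo=$((lineNo + 1))
		# The four keywords must sit in the expected columns
		if [ "$k1" != "Topic" ] || [ "$k2" != "Partitions" ] || [ "$k3" != "Replica" ] || [ "$k4" != "Configs" ]; then
			echo "line $lineNo: unexpected keywords"
			return 1
		fi
		# Assumption: partition and replica counts are non-empty, digits only
		case "$partitions" in
			''|*[!0-9]*) echo "line $lineNo: bad partition count"; return 1 ;;
		esac
		case "$replica" in
			''|*[!0-9]*) echo "line $lineNo: bad replica count"; return 1 ;;
		esac
	done < "$1"
	return 0
}
```

It could be called as `check_topics_csv topics.csv` before running the import, where `topics.csv` stands for whatever file name you used for the dump.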

Finally, here’s the script that imports the CSV list of topics into a new Kafka server. Invoke it with the CSV file name as its only argument. The first lines contain two settings: one caps the replication factor (e.g. when you are recreating the topics on a smaller cluster) and one defines default configurations applied to every topic you create. Here’s topicsCreate.sh:

#!/bin/sh

## Maximum replication factor to apply (higher values in the CSV are capped)
MAXREPLICA=1

## Allows you to specify the Default Configurations to apply
DEFAULTCONFIGS="segment.jitter.ms=5000,segment.ms=28800000,retention.bytes=524288000,segment.bytes=524288000"

# "==" is not portable under /bin/sh, so test for an empty argument with -z
if [ -z "$1" ] || [ ! -f "$1" ]; then
	echo "Please specify kafka topic file to import."
	echo
	echo "Format:"
	echo " - one topic per line"
	echo " - line format: Topic,<topic>,Partitions,<partitions>,Replica,<replica>,Configs,<configs>"
	exit 1
fi

while IFS="," read -r TopicKey TopicValue PartitionKey PartitionValue ReplicaKey ReplicaValue ConfigsKey ConfigsValues; do

	# Validate the keywords before touching the cluster
	if [ "$TopicKey" != "Topic" ] || [ "$PartitionKey" != "Partitions" ] || [ "$ReplicaKey" != "Replica" ] || [ "$ConfigsKey" != "Configs" ]; then
		echo ""
		echo "Corrupt CSV file!"
		echo
		exit 1
	fi

	# Cap the replication factor at MAXREPLICA
	if [ "$ReplicaValue" -gt "$MAXREPLICA" ]; then
		ReplicaValue=$MAXREPLICA
	fi

	echo ""
	echo "Creating topic $TopicValue with $PartitionValue partitions and $ReplicaValue replicas:"

	./kafka-topics.sh --create --bootstrap-server localhost:9092 --topic="$TopicValue" --partitions="$PartitionValue" --replication-factor="$ReplicaValue"

	echo "==> Applying default configurations for topic $TopicValue:"
	./kafka-configs.sh --alter --bootstrap-server localhost:9092 --topic="$TopicValue" --add-config="$DEFAULTCONFIGS"

	if [ -n "$ConfigsValues" ]; then
		echo "==> Applying custom configurations $ConfigsValues for topic $TopicValue"
		./kafka-configs.sh --alter --bootstrap-server localhost:9092 --topic="$TopicValue" --add-config="$ConfigsValues"
	fi

# Read the file directly so that "exit 1" above stops the whole script
done < "$1"
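Before running the import against a real cluster, it can be useful to see which create commands it would issue once the replica cap is applied. The following is only a dry-run sketch: `print_create_commands` is a hypothetical function, it takes the cap as a second argument instead of reading MAXREPLICA, and it prints the commands rather than executing them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the create command for each CSV line.
# $1 = CSV file, $2 = maximum replication factor to apply.
print_create_commands() {
	maxReplica=$2
	while IFS="," read -r k1 topic k2 partitions k3 replica k4 configs; do
		# Cap the replication factor the same way topicsCreate.sh does
		if [ "$replica" -gt "$maxReplica" ]; then
			replica=$maxReplica
		fi
		echo "./kafka-topics.sh --create --bootstrap-server localhost:9092 --topic=$topic --partitions=$partitions --replication-factor=$replica"
	done < "$1"
}
```

For example, `print_create_commands topics.csv 1` would list one command per topic with the replication factor limited to 1, assuming `topics.csv` is the file produced by the dump script.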

Have fun!
