Installing Apache Kafka on Cloudera’s Quickstart VM

Robert Sanders, May 3

Cloudera, one of the leading distributions of Hadoop, provides an easy-to-install Virtual Machine for the purposes of getting started quickly on their platform.

With this, someone can easily get a single node CDH cluster running within a Virtual Environment.

Users could use this VM for their own personal learning, rapidly building applications on a dedicated cluster, or for many other purposes.

Apache Kafka is an open-source stream-processing software platform developed by LinkedIn and donated to the Apache Software Foundation, written in Scala and Java.

The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.

Its storage layer is essentially a “massively scalable pub/sub message queue designed as a distributed transaction log,” making it highly valuable for enterprise infrastructures to process streaming data.

The Cloudera Quickstart VM doesn’t come with Apache Kafka out of the box, but Kafka can be installed fairly easily.

Installation Steps

1. Download and Install the VM

a. Navigate to https://www.cloudera.com/downloads/quickstart_vms.html
b. Select the platform you’d like the VM to run on and download
c. Load the VM into your desired platform

2. Configure the VM

Before starting the VM, set the following configurations:
- Set at least 8GB of RAM
- Set at least 2 CPUs

3. Start up the VM

4. Start up Cloudera Manager (CM)

Once the VM starts up, navigate to the Desktop and execute the “Launch Cloudera Express” script.

Note: This may take a while to run.

Once complete, you should be able to view Cloudera Manager by opening your web browser (within the VM) and navigating to:
http://quickstart.cloudera:7180

From your local machine, you can navigate to:
http://localhost:7180

Default Credentials: cloudera/cloudera
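If you would rather wait from a terminal than keep refreshing the browser, a small polling loop is one option. This is only a sketch under a few assumptions: curl is installed, port 7180 is reachable at the URL you use (from the host machine that usually means the VM's port 7180 is forwarded to localhost), and the names CM_URL, SLEEP_SECS and wait_for_cm are made up for this example rather than part of any Cloudera tooling.

```shell
# Sketch only: CM_URL, SLEEP_SECS and wait_for_cm are illustrative names.
CM_URL="${CM_URL:-http://localhost:7180}"

# Poll until the Cloudera Manager web UI answers, or give up after N tries.
wait_for_cm() {
  tries="${1:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # curl exits 0 as soon as the server returns any HTTP response.
    if curl -s -o /dev/null --max-time 5 "$CM_URL"; then
      echo "Cloudera Manager is answering at $CM_URL"
      return 0
    fi
    i=$((i + 1))
    sleep "${SLEEP_SECS:-10}"
  done
  echo "No response from $CM_URL after $tries attempts" >&2
  return 1
}
```

For example, `wait_for_cm 30` would try for roughly five minutes with the default 10 second pause. Inside the VM you could set CM_URL=http://quickstart.cloudera:7180 before loading the function.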

5. Configure CM to use Parcels

Navigate to the Desktop and execute the “Migrate to Parcels” script.

Note: This may take a while to run.

You can validate that CM is now using parcels by logging in to the Cloudera Manager Web UI. Right next to the cluster name, it should say: (CDH x.x.x, Parcels)

Note: All the services will be shut down by the migration, so you will need to restart all the services on the cluster afterwards:

i. Restart the Cluster Services
- Select Clusters > Cloudera QuickStart
- Select Actions > Restart

6. Select the Version of Kafka you want to Install

Navigate here to get a full list of the Kafka versions that are available:
CDK Powered By Apache Kafka® Version and Packaging Information | 4.0.x | Cloudera Documentation (www.cloudera.com)

Select the Parcel URL: copy the Parcel URL next to the version of Kafka that you want (referred to as PARCEL_URL in later sections).
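Before pasting the PARCEL_URL into Cloudera Manager, it can be worth checking that the VM can actually reach it. The sketch below assumes curl is available and that the repository serves a manifest.json at its root, which is how Cloudera Manager parcel repositories are normally laid out. The function name check_parcel_repo is invented for this example.

```shell
# Sketch only: check_parcel_repo is an illustrative name, not a Cloudera tool.
# Pass the PARCEL_URL you copied from the documentation page.
check_parcel_repo() {
  if [ -z "$1" ]; then
    echo "usage: check_parcel_repo PARCEL_URL" >&2
    return 2
  fi
  # Drop one trailing slash so the joined URL has a single separator.
  repo="${1%/}"
  # -f makes curl fail on HTTP errors such as 404; -L follows redirects.
  if curl -sfL -o /dev/null "$repo/manifest.json"; then
    echo "Found manifest.json under $repo"
  else
    echo "Could not fetch $repo/manifest.json" >&2
    return 1
  fi
}
```

Run it as `check_parcel_repo "$PARCEL_URL"` from a terminal inside the VM. A failure here usually points at a typo in the URL or at the VM having no outbound network access.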

7. Install Kafka Parcel

Complete documentation on how to manage Parcels:
Parcels | 5.16.x | Cloudera Documentation (www.cloudera.com)

a. Log in to the Cloudera Manager Web UI
b. Navigate to Hosts -> Parcels
c. Click Configuration
d. Add the PARCEL_URL you found in the previous step to the list under Remote Parcel Repository URLs
e. Save Changes
f. You will be taken back to the Parcels page. Wait a few seconds and the version of Kafka that you entered should be added to the list.
g. Locate the Kafka parcel in the list
h. Under Actions, click Download and wait for it to download
i. Under Actions, click Distribute and wait for it to be distributed
j. Under Actions, click Activate and wait for it to be activated

8. Install Kafka Service

a. Log in to the Cloudera Manager Web UI
b. Click on the button next to the Cluster Name and select “Add Service”
c. Select “Kafka” and click “Continue”
d. Select whichever set of dependencies you would like and click “Continue”
e. Select the one instance available as the Kafka Broker and Gateway and click “Continue”
f. Keep the default configurations and click “Continue”
g. The service will now be added and you will be taken back to the CM home

9. Configure Kafka Service

Note: You will see that the Broker goes down at first. This is due to some incorrect default configurations that cannot be set until after the Kafka Service has been added.

a. Log in to the Cloudera Manager Web UI
b. Click on Kafka -> Configuration
c. Set Configurations:
   Java Heap Size of Broker (broker_max_heap_size) = “256”
   Advertised Host (advertised.host.name) = “quickstart.cloudera”
   Inter Broker Protocol = “PLAINTEXT”
d. Click Save Changes
e. At the top of the page, click on the yellow Restart button

Testing

Smoke Test

    kafka-topics --zookeeper quickstart.cloudera:2181 --create --topic test --partitions 1 --replication-factor 1
    kafka-topics --zookeeper quickstart.cloudera:2181 --list

    # Run the consumer and producer in separate windows.
    # Type in text to the producer and watch it appear in the consumer.
    # ^C to quit.
    kafka-console-consumer --zookeeper quickstart.cloudera:2181 --topic test
    kafka-console-producer --broker-list quickstart.cloudera:9092 --topic test
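The smoke test above is interactive. If you want something you can rerun without two terminal windows, the same commands can be wrapped into a one-message round trip. Treat this as a sketch rather than a finished script: ZK, BROKER, TOPIC and the function names are made up for illustration, and it assumes TOPIC is a fresh topic, since --from-beginning with --max-messages 1 reads back the first message ever written to it. The --zookeeper option on the consumer follows the commands in this post; newer Kafka releases expect --bootstrap-server with the broker address instead.

```shell
# Sketch only: ZK, BROKER, TOPIC and the function names are illustrative.
ZK="${ZK:-quickstart.cloudera:2181}"
BROKER="${BROKER:-quickstart.cloudera:9092}"
TOPIC="${TOPIC:-test-roundtrip}"

# Create the single-partition topic used for the round trip.
create_topic() {
  kafka-topics --zookeeper "$ZK" --create --topic "$TOPIC" \
    --partitions 1 --replication-factor 1
}

# Send one message to the topic through the console producer.
produce_one() {
  printf '%s\n' "$1" |
    kafka-console-producer --broker-list "$BROKER" --topic "$TOPIC"
}

# Read the first message on the topic, then exit instead of waiting for ^C.
consume_one() {
  kafka-console-consumer --zookeeper "$ZK" --topic "$TOPIC" \
    --from-beginning --max-messages 1
}

# Succeed only if the message read back matches the one that was sent.
round_trip() {
  produce_one "$1" >/dev/null || return 1
  consume_one 2>/dev/null | grep -qxF -- "$1"
}
```

A typical run would be `create_topic && round_trip "hello kafka" && echo "smoke test passed"`. Using a topic separate from the manual test keeps earlier typed messages from being read back first.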
