Hadoop Configuration using Ansible
We already know that to create a Hadoop cluster we need the JDK (Hadoop's dependency) and the Hadoop package. Here I have the JDK RPM and the Hadoop RPM, and I store their locations in variables (this helps while installing them).
vars:
  jdkrpm: /jdk-8u171-linux-x64.rpm
  hadooprpm: /hadoop-1.2.1-1.x86_64.rpm
tasks:
  - name: Copying jdk file
    copy:
      src: /home/kanav/jdk-8u171-linux-x64.rpm
      dest: /
  - name: Copying Hadoop file
    copy:
      src: /home/kanav/hadoop-1.2.1-1.x86_64.rpm
      dest: /
Now I use the shell module to install the packages whose locations I stored in the variables above.
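In the full playbook below, both rpm commands sit under a single task with two shell keys, which YAML does not allow (only the last key would take effect); a minimal corrected sketch splits them into two tasks, using the same variables and flags as in this post:

- name: Installing jdk using shell module
  shell: "rpm -i {{ jdkrpm }} --force"
- name: Installing Hadoop using shell module
  shell: "rpm -hiv {{ hadooprpm }} --force"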

I am using ignore_errors: yes because if a cluster was formed before, it is always recommended to clear the previous directory (in my case the /nn folder) and, after deleting it, create a fresh folder for the new cluster.
- name: Deleting previous caches (if any)
  command: "rm -r /nn"
  ignore_errors: yes
- name: creating a new folder !
  file:
    path: /nn
    state: directory
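As a side note, the same cleanup could be done idempotently with Ansible's file module (state: absent) instead of rm -r plus ignore_errors; a sketch for the same /nn path:

- name: Deleting previous caches (if any)
  file:
    path: /nn
    state: absent
- name: creating a new folder !
  file:
    path: /nn
    state: directory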
Here I am updating my core-site.xml and hdfs-site.xml files:
- name: updating Core-site
  copy:
    src: "/home/kanav/core-site.xml"
    dest: "/etc/hadoop/core-site.xml"
- name: updating hdfs-site
  copy:
    src: "/home/kanav/hdfs-site.xml"
    dest: "/etc/hadoop/hdfs-site.xml"
Next, I format the namenode before starting the cluster, and then use the command module to start the namenode service. Since the format step pipes echo Y into hadoop namenode -format, it needs the shell module rather than command.
- name: Formatting namenode
  shell: "echo Y | hadoop namenode -format"
- name: Starting Namenode service
  command: "hadoop-daemon.sh start namenode"
- debug:
    msg: Namenode started successfully !
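To confirm that the namenode actually came up, you could add a quick check task; jps (shipped with the JDK) should list a NameNode process on the target machine:

- name: Verifying namenode process
  command: "jps"
  register: jps_out
- debug:
    msg: "{{ jps_out.stdout }}"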
The entire playbook for the master (namenode):
- hosts: 172.20.10.2
  vars:
    jdkrpm: /jdk-8u171-linux-x64.rpm
    hadooprpm: /hadoop-1.2.1-1.x86_64.rpm
  tasks:
    - name: Copying jdk file
      copy:
        src: /home/kanav/jdk-8u171-linux-x64.rpm
        dest: /
    - name: Copying Hadoop file
      copy:
        src: /home/kanav/hadoop-1.2.1-1.x86_64.rpm
        dest: /
    - name: Installing jdk using shell module
      shell: "rpm -i {{ jdkrpm }} --force"
    - name: Installing Hadoop using shell module
      shell: "rpm -hiv {{ hadooprpm }} --force"
    - name: Deleting previous caches (if any)
      command: "rm -r /nn"
      ignore_errors: yes
    - name: creating a new folder !
      file:
        path: /nn
        state: directory
    - name: updating Core-site
      copy:
        src: "/home/kanav/core-site.xml"
        dest: "/etc/hadoop/core-site.xml"
    - name: updating hdfs-site
      copy:
        src: "/home/kanav/hdfs-site.xml"
        dest: "/etc/hadoop/hdfs-site.xml"
    - name: Formatting namenode
      shell: "echo Y | hadoop namenode -format"
    - name: Starting Namenode service
      command: "hadoop-daemon.sh start namenode"
    - debug:
        msg: Namenode started successfully !
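To run this play, save it to a file (hadoop-namenode.yml here is just an example name), make sure the target IP is listed in your inventory, and then run:

ansible-playbook hadoop-namenode.yml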
In the same way, you can configure the slave node (datanode) too:
- hosts: 172.20.10.2
  vars:
    jdkrpm: /jdk-8u171-linux-x64.rpm
    hadooprpm: /hadoop-1.2.1-1.x86_64.rpm
  tasks:
    - name: Copying jdk file
      copy:
        src: /home/kanav/jdk-8u171-linux-x64.rpm
        dest: /
    - name: Copying Hadoop file
      copy:
        src: /home/kanav/hadoop-1.2.1-1.x86_64.rpm
        dest: /
    - name: Installing jdk using shell module
      shell: "rpm -i {{ jdkrpm }} --force"
    - name: Installing Hadoop using shell module
      shell: "rpm -hiv {{ hadooprpm }} --force"
    - name: Deleting previous caches (if any)
      command: "rm -r /dn"
      ignore_errors: yes
    - name: creating a new folder !
      file:
        path: /dn
        state: directory
    - name: updating Core-site
      copy:
        src: "/home/kanav/core-site.xml"
        dest: "/etc/hadoop/core-site.xml"
    - name: updating hdfs-site
      copy:
        src: "/home/kanav/hdfs-site.xml"
        dest: "/etc/hadoop/hdfs-site.xml"
    - name: Starting Datanode service
      command: "hadoop-daemon.sh start datanode"
    - debug:
        msg: Datanode started successfully !
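After the datanode play finishes, you can verify from the namenode that the slave has joined the cluster; a sketch of such a check using the Hadoop 1.x report command:

- name: Checking cluster report
  command: "hadoop dfsadmin -report"
  register: report
- debug:
    msg: "{{ report.stdout }}"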