Hadoop HDFS NameNode Metadata Service Port Should Not Be Open
More Info:
Determines if TCP port 8020 for HDFS NameNode metadata service is open to the public.
Risk Level
Medium
Address
Security
Compliance Standards
CBP
Triage and Remediation
Remediation
To remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” for GCP using GCP console, please follow the below steps:
-
Open the GCP console and navigate to the Compute Engine section.
-
Click on the VM instances tab and select the instance where the Hadoop HDFS NameNode Metadata Service Port is open.
-
Click on the Edit button to edit the instance settings.
-
Scroll down to the “Firewall” section and click on the “Network tags” drop-down menu.
-
Add a new network tag and give it a name, for example, “no-namenode-port”.
-
Click on the “Save” button to save the changes.
-
Navigate to the “VPC network” section and click on the “Firewall rules” tab.
-
Click on the “Create Firewall Rule” button to create a new firewall rule.
-
Give the firewall rule a name, for example, “no-namenode-port”.
-
In the “Targets” section, select “Specified target tags” and enter the tag name “no-namenode-port”.
-
In the “Source filter” section, select “IP ranges” and enter the IP address range of the network that should not have access to the Hadoop HDFS NameNode Metadata Service Port.
-
In the “Protocols and ports” section, select “Specified protocols and ports” and enter the protocol and port number of the Hadoop HDFS NameNode Metadata Service Port (default is TCP port 8020).
-
Click on the “Create” button to create the firewall rule.
-
Verify that the firewall rule is applied to the instance by checking the “Firewall rules” section on the instance details page.
-
Test the configuration by attempting to access the Hadoop HDFS NameNode Metadata Service Port from a network that is not allowed. The connection should be refused.
By following these steps, you will remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” for GCP using GCP console.
To remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” for GCP using GCP CLI, please follow the below steps:
-
Open the Cloud Shell from the GCP console.
-
Run the following command to get the list of all the Compute Engine instances in your project:
gcloud compute instances list
-
Identify the instance where the Hadoop HDFS NameNode Metadata Service Port is open.
-
SSH into the instance using the following command:
gcloud compute ssh [INSTANCE_NAME]
- Edit the Hadoop configuration file
hdfs-site.xml
using the following command:
sudo nano /etc/hadoop/conf/hdfs-site.xml
- Add the following property to the file:
<property>
<name>dfs.namenode.rpc-bind-host</name>
<value>127.0.0.1</value>
</property>
This will bind the Hadoop HDFS NameNode to the loopback IP address and prevent it from being accessible from the network.
-
Save and exit the file.
-
Restart the Hadoop HDFS NameNode service using the following command:
sudo systemctl restart hadoop-hdfs-namenode.service
- Verify that the Hadoop HDFS NameNode Metadata Service Port is no longer open by running the following command:
sudo lsof -i :8020
This should not return any output.
- Exit the SSH session using the following command:
exit
By following these steps, you can remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” for GCP using GCP CLI.
To remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” on GCP, you can follow the below steps using Python:
-
First, you need to authenticate with GCP using the Python SDK. You can do this by installing the
google-cloud-sdk
and running the commandgcloud auth application-default login
. This will authenticate you with your GCP account. -
Next, you need to get a list of all the instances in your GCP project. You can do this by using the
google-cloud-sdk
commandgcloud compute instances list
or by using the Python SDK. Here’s the Python code to get a list of all the instances:
from google.cloud import compute_v1
client = compute_v1.InstancesClient()
project = 'your-project-id'
zones = ['us-central1-a', 'us-central1-b', 'us-central1-c']
instances = []
for zone in zones:
result = client.list(project=project, zone=zone)
for instance in result:
instances.append(instance)
- Once you have a list of all the instances, you need to check if the Hadoop HDFS NameNode Metadata Service Port is open on any of them. You can do this by using the
socket
library in Python. Here’s the Python code to check if the port is open:
import socket
def is_port_open(ip, port):
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
s.connect((ip, port))
s.shutdown(socket.SHUT_RDWR)
return True
except:
return False
finally:
s.close()
port = 8020
for instance in instances:
ip = instance.network_interfaces[0].network_ip
if is_port_open(ip, port):
print(f"Port {port} is open on instance {instance.name}.")
- If the Hadoop HDFS NameNode Metadata Service Port is open on any instance, you need to close it. You can do this by creating a firewall rule to block traffic on that port. Here’s the Python code to create a firewall rule:
from google.cloud import compute_v1
client = compute_v1.FirewallsClient()
project = 'your-project-id'
firewall_rule_name = 'block-hadoop-port'
network_name = 'default'
direction = 'INGRESS'
priority = 1000
ip_protocol = 'tcp'
port = 8020
target_tags = ['hadoop']
firewall_rule = {
'name': firewall_rule_name,
'network': f'projects/{project}/global/networks/{network_name}',
'direction': direction,
'priority': priority,
'sourceRanges': ['0.0.0.0/0'],
'targetTags': target_tags,
'allowed': [{
'IPProtocol': ip_protocol,
'ports': [str(port)]
}]
}
operation = client.insert(project=project, firewall_resource=firewall_rule)
print(f"Created firewall rule {firewall_rule_name}.")
- Finally, you need to apply the
hadoop
tag to all the instances running Hadoop. You can do this by using the Python SDK. Here’s the Python code to apply the tag:
from google.cloud import compute_v1
client = compute_v1.InstancesClient()
project = 'your-project-id'
zones = ['us-central1-a', 'us-central1-b', 'us-central1-c']
tag = 'hadoop'
for zone in zones:
result = client.list(project=project, zone=zone)
for instance in result:
if 'hadoop' in instance.tags.items:
continue
instance.tags.items.append(tag)
operation = client.update(project=project, zone=zone, instance=instance.name, instance_resource=instance)
print(f"Applied tag {tag} to all Hadoop instances.")
By following these steps, you can remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” on GCP using Python.