More Info:

Determines whether TCP port 8020, used by the HDFS NameNode metadata service, is open to the public.

Risk Level

Medium

Address

Security

Compliance Standards

CBP

Triage and Remediation

Remediation

Using Console

To remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” for GCP using the GCP console, follow these steps:
  1. Open the GCP console and navigate to the Compute Engine section.
  2. Click on the VM instances tab and select the instance where the Hadoop HDFS NameNode Metadata Service Port is open.
  3. Click on the Edit button to edit the instance settings.
  4. Scroll down to the networking settings and locate the “Network tags” field.
  5. Add a new network tag and give it a name, for example, “no-namenode-port”.
  6. Click on the “Save” button to save the changes.
  7. Navigate to the “VPC network” section and click on the “Firewall rules” tab.
  8. Click on the “Create Firewall Rule” button to create a new firewall rule.
  9. Give the firewall rule a name, for example, “no-namenode-port”.
  10. In the “Targets” section, select “Specified target tags” and enter the tag name “no-namenode-port”.
  11. In the “Action on match” section, select “Deny” so that matching traffic is blocked.
  12. In the “Source filter” section, select “IP ranges” and enter the IP range that should not have access to the port (use 0.0.0.0/0 to block all public access).
  13. In the “Protocols and ports” section, select “Specified protocols and ports” and enter the protocol and port of the Hadoop HDFS NameNode Metadata Service Port (TCP port 8020 by default).
  14. Click on the “Create” button to create the firewall rule.
  15. Verify that the firewall rule is applied to the instance by checking the “Firewall rules” section on the instance details page.
  16. Test the configuration by attempting to connect to port 8020 from a blocked network; the connection should time out or be refused.
By following these steps, you will remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” for GCP using the GCP console. An equivalent gcloud command for the same deny rule is sketched below.
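If you prefer the command line, the same deny rule can be created with a single gcloud command. This is a minimal sketch using the example rule name and tag from the steps above; adjust the network and priority to fit your VPC:
# Deny public ingress to TCP 8020 on instances tagged 'no-namenode-port'.
# Priority 900 is stronger than the default 1000, so this rule overrides
# broader allow rules left at the default priority.
gcloud compute firewall-rules create no-namenode-port \
    --network=default \
    --direction=INGRESS \
    --action=DENY \
    --rules=tcp:8020 \
    --source-ranges=0.0.0.0/0 \
    --target-tags=no-namenode-port \
    --priority=900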

To remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” for GCP using the GCP CLI, follow these steps:
  1. Open the Cloud Shell from the GCP console.
  2. Run the following command to get the list of all the Compute Engine instances in your project:
gcloud compute instances list
  3. Identify the instance where the Hadoop HDFS NameNode Metadata Service Port is open.
  4. SSH into the instance using the following command:
gcloud compute ssh [INSTANCE_NAME]
  5. Edit the Hadoop configuration file hdfs-site.xml using the following command:
sudo nano /etc/hadoop/conf/hdfs-site.xml
  6. Add the following property to the file:
<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>127.0.0.1</value>
</property>
This binds the Hadoop HDFS NameNode RPC endpoint to the loopback address so it cannot be reached from the network. Note that on a multi-node cluster this also prevents DataNodes and remote clients from connecting; in that case, bind to an internal-only address instead of 127.0.0.1.
  7. Save and exit the file.
  8. Restart the Hadoop HDFS NameNode service using the following command:
sudo systemctl restart hadoop-hdfs-namenode.service
  9. Verify that the Hadoop HDFS NameNode Metadata Service Port is no longer open to the network by running the following command:
sudo lsof -i :8020
This should show the service listening only on 127.0.0.1 (or return no output if the service is not running); there should be no listener on an externally reachable address.
  10. Exit the SSH session using the following command:
exit
By following these steps, you can remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” for GCP using GCP CLI.
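You can also confirm from outside the instance that the port is no longer reachable. A minimal sketch, assuming you run it from a host that should be blocked; substitute the instance’s external IP for the placeholder address:
# Probe TCP port 8020 with a 5-second timeout; a blocked or closed port
# fails the connection attempt.
nc -z -w 5 203.0.113.10 8020 && echo "port 8020 open" || echo "port 8020 closed or filtered"
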
To remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” on GCP, follow these steps using Python:
  1. First, you need to authenticate with GCP. Install the Google Cloud SDK and the google-cloud-compute Python package, then run gcloud auth application-default login to set up Application Default Credentials for the client library.
  2. Next, you need to get a list of all the instances in your GCP project. You can do this by using the google-cloud-sdk command gcloud compute instances list or by using the Python SDK. Here’s the Python code to get a list of all the instances:
from google.cloud import compute_v1

client = compute_v1.InstancesClient()
project = 'your-project-id'
# Zones to scan; extend this list to cover every zone your project uses.
zones = ['us-central1-a', 'us-central1-b', 'us-central1-c']

instances = []
for zone in zones:
    # list() returns an iterator of Instance objects in the zone.
    for instance in client.list(project=project, zone=zone):
        instances.append(instance)
  3. Once you have a list of all the instances, you need to check whether the Hadoop HDFS NameNode Metadata Service Port is open on any of them. You can do this by using the socket library in Python. Here’s the Python code to check if the port is open:
import socket

def is_port_open(ip, port):
    # Attempt a TCP connection; any failure is treated as "closed".
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(5)
    try:
        s.connect((ip, port))
        s.shutdown(socket.SHUT_RDWR)
        return True
    except OSError:
        return False
    finally:
        s.close()

port = 8020

for instance in instances:
    # network_ip is the internal address, so this check must run from
    # inside the VPC. To test public exposure, probe the instance's
    # external (NAT) IP instead, if one is attached.
    ip = instance.network_interfaces[0].network_ip
    if is_port_open(ip, port):
        print(f"Port {port} is open on instance {instance.name}.")
  4. If the Hadoop HDFS NameNode Metadata Service Port is open on any instance, you need to close it. You can do this by creating a firewall rule that denies ingress traffic on that port (the rule must use denied, not allowed). Here’s the Python code to create the deny rule:
from google.cloud import compute_v1

client = compute_v1.FirewallsClient()
project = 'your-project-id'

# Deny ingress on TCP 8020 from anywhere for instances tagged 'hadoop'.
# Priority 900 beats broader allow rules at the default priority of 1000.
firewall_rule = compute_v1.Firewall(
    name='block-hadoop-port',
    network=f'projects/{project}/global/networks/default',
    direction='INGRESS',
    priority=900,
    source_ranges=['0.0.0.0/0'],
    target_tags=['hadoop'],
    denied=[compute_v1.Denied(I_p_protocol='tcp', ports=['8020'])],
)

operation = client.insert(project=project, firewall_resource=firewall_rule)
operation.result()  # Wait for the rule to be created.

print(f"Created firewall rule {firewall_rule.name}.")
  5. Finally, you need to apply the hadoop tag to the instances running Hadoop so that the deny rule targets them. The code below tags every instance in the listed zones; in practice, restrict it to the instances where the check in step 3 found port 8020 open. Here’s the Python code to apply the tag:
from google.cloud import compute_v1

client = compute_v1.InstancesClient()
project = 'your-project-id'
zones = ['us-central1-a', 'us-central1-b', 'us-central1-c']
tag = 'hadoop'

for zone in zones:
    for instance in client.list(project=project, zone=zone):
        if tag in instance.tags.items:
            continue
        # Tags are replaced via set_tags; the fingerprint must match the
        # current value to guard against concurrent modifications.
        tags = compute_v1.Tags(
            items=list(instance.tags.items) + [tag],
            fingerprint=instance.tags.fingerprint,
        )
        operation = client.set_tags(
            project=project, zone=zone,
            instance=instance.name, tags_resource=tags,
        )
        operation.result()  # Wait for the tag update to complete.

print(f"Applied tag {tag} to the instances.")
By following these steps, you can remediate the misconfiguration “Hadoop HDFS NameNode Metadata Service Port Should Not Be Open” on GCP using Python.
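To confirm the remediation took effect, you can inspect the rule and the tags from the command line. A brief sketch, using the example names from the code above:
# Show what the deny rule blocks.
gcloud compute firewall-rules describe block-hadoop-port --format="value(denied)"
# List the instances carrying the 'hadoop' target tag.
gcloud compute instances list --filter="tags.items=hadoop"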

Additional Reading: