Examining the cluster

In this section you will look at the utilization of the instance fleets and examine the Spark executors while the Spark application is running.

EMR Cluster Status

To get started, let’s check that your EMR cluster and Spark application are running.

export EMRClusterName="emr-spot-workshop";
export EMRClusterID=$(aws emr list-clusters --active | jq -r --arg name "$EMRClusterName" '.Clusters[] | select(.Name == $name) | .Id');
aws emr describe-cluster --cluster-id $EMRClusterID | jq -r '.Cluster.Status';

If you got an error about not getting a proper value for --cluster-id, you may have picked a different name for the EMR cluster. Make sure the EMRClusterName environment variable matches your EMR cluster name and run the above commands again.
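
If you are not sure which name was used, you can list the names of all active clusters with the same tools used above (a quick check; it only reads cluster metadata):

aws emr list-clusters --active | jq -r '.Clusters[].Name';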

Wait until you get the cluster status RUNNING with the message Running step, like the following output:

{
  "State": "RUNNING",
  "StateChangeReason": {
    "Message": "Running step"
  },
  "Timeline": {
    "CreationDateTime": "2022-11-14T15:07:08.730000+00:00",
    "ReadyDateTime": "2022-11-14T15:12:18.753000+00:00"
  }
}
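
If the cluster is still starting, you can poll until it reaches that state instead of re-running the command by hand. A minimal sketch, reusing the describe-cluster call from above:

while true; do
  STATE=$(aws emr describe-cluster --cluster-id $EMRClusterID | jq -r '.Cluster.Status.State');
  echo "Cluster state: $STATE";
  [ "$STATE" = "RUNNING" ] && break;
  sleep 30;
done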

EMR On-cluster application user interfaces

To connect to the application user interfaces running on the EMR cluster, you need to use SSH tunneling. See the Amazon EMR documentation on connecting to on-cluster application user interfaces to learn more.

First, we need to grant SSH access from the Cloud9 environment to the EMR cluster master node. To do so, run the following commands:

export EMRMasterSG=$(aws emr describe-cluster --cluster-id $EMRClusterID | jq -r '.Cluster.Ec2InstanceAttributes.EmrManagedMasterSecurityGroup');
aws ec2 authorize-security-group-ingress \
    --group-id $EMRMasterSG \
    --protocol tcp \
    --port 22 \
    --cidr 172.31.0.0/16;
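
If you want to confirm the rule is in place, you can inspect the security group's inbound rules (an optional check; the jq filter simply picks out the port 22 entry):

aws ec2 describe-security-groups --group-ids $EMRMasterSG | jq '.SecurityGroups[0].IpPermissions[] | select(.FromPort == 22)';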

At this stage, you will be able to SSH into the EMR master node.
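
For example, you can open an interactive shell on the master node as the hadoop user (this uses the cluster DNS name, which is also exported in step 1 below, and the same key pair used for the tunnels later in this section):

export EMRClusterDNS=$(aws emr describe-cluster --cluster-id $EMRClusterID | jq -r '.Cluster.MasterPublicDnsName');
ssh -i ~/environment/emr-workshop-key-pair.pem hadoop@$EMRClusterDNS;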

In the following steps, you might not see full utilization of vCPUs and Memory on the EC2 instances because the wordcount demo Spark application is not very resource intensive.

Access Resource Manager web interface

  1. Get the EMR Cluster DNS endpoint by running the following command on your Cloud9 environment:

    export EMRClusterDNS=$(aws emr describe-cluster --cluster-id $EMRClusterID | jq -r '.Cluster.MasterPublicDnsName');
    
  2. Establish an SSH tunnel from local port 8080 to port 8088 on the master node, where Resource Manager is bound, by executing the command below:

    ssh -i ~/environment/emr-workshop-key-pair.pem -N -L 8080:$EMRClusterDNS:8088 hadoop@$EMRClusterDNS;
    

    You’ll get a message saying the authenticity of the host can’t be established. Type ‘yes’ and hit enter. The message will look similar to the following:

    The authenticity of host 'ec2-54-195-131-148.eu-west-1.compute.amazonaws.com (172.31.36.62)' can't be established.
    ECDSA key fingerprint is SHA256:Cmv0qkh+e4nm5qir6a9fPN5DlgTUEaCGBN42txhShoI.
    ECDSA key fingerprint is MD5:ee:63:d0:4a:a2:29:8a:c9:41:1b:a1:f0:f6:8e:68:4a.
    Are you sure you want to continue connecting (yes/no)? 
    
  3. Now, in your Cloud9 environment, click on the Preview menu at the top and then click on Preview Running Application.

  4. You’ll see a browser window open within the Cloud9 environment, showing a refused-connection error page. Click on the button next to Browser (an arrow inside a box) to open the web UI in a dedicated browser tab.

  5. On the left pane, click on Nodes (a command-line alternative is shown after this list):

    • If the Spark app is Running, the Cluster Metrics table will show 18 Containers Running, and in the Cluster Nodes Metrics table the number of Active Nodes will be 17 (1 core node with the CORE label and 16 task nodes without any node label).

    • If the Spark app is Completed, Containers Running will be 0, Active Nodes will be 1 (the core node with the CORE label), and Decommissioned Nodes will be 16 (the task nodes are decommissioned by EMR managed cluster scaling).
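
If you prefer the command line, the same node information is available from the YARN CLI on the master node (run it over the SSH session from earlier; -all includes decommissioned nodes):

yarn node -list -all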


If you can’t access the Resource Manager page, try using a different browser or check the network security policies on your computer. Your browser might also have third-party cookies disabled; to enable them, see https://docs.aws.amazon.com/cloud9/latest/user-guide/troubleshooting.html#troubleshooting-preview.

Challenge

Now that you are familiar with the EMR web interfaces, can you try to access the Ganglia and Spark History Server application user interfaces?

Go to the Application user interfaces tab to see the user interface URLs for Ganglia and the Spark History Server.
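
Hint: the tunnels follow the same pattern as the Resource Manager one. On EMR, Ganglia is served on port 80 of the master node under the /ganglia path, and the Spark History Server listens on port 18080. The local ports below (8081 and 8082) are arbitrary choices that Cloud9 can preview; run each command in its own terminal, since -N keeps the tunnel in the foreground:

ssh -i ~/environment/emr-workshop-key-pair.pem -N -L 8081:$EMRClusterDNS:80 hadoop@$EMRClusterDNS;
ssh -i ~/environment/emr-workshop-key-pair.pem -N -L 8082:$EMRClusterDNS:18080 hadoop@$EMRClusterDNS;

Once the tunnels are up, Ganglia should be reachable at http://localhost:8081/ganglia/ and the Spark History Server at http://localhost:8082/.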


Managed Scaling in Action

You enabled managed cluster scaling, and EMR scaled out to 64 Spot units in the task fleet. Since each instance is weighted at 4 Spot units per executor it can run, EMR could have launched either 16 * xlarge (running one executor per xlarge), 8 * 2xlarge (running 2 executors per 2xlarge), or 4 * 4xlarge (running 4 executors per 4xlarge); in every case the task fleet provides 16 executors / containers to the cluster. The core fleet launched one xlarge instance that runs one executor / container, so in total 17 executors / containers will be running in the cluster.

  1. In your EMR cluster page, in the AWS Management Console, go to the Steps tab.
  2. Go to the Events tab to see the scaling events.
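
You can also check which instances the task fleet actually launched from the CLI (an optional check; field names per the EMR list-instance-fleets API):

aws emr list-instance-fleets --cluster-id $EMRClusterID | jq '.InstanceFleets[] | {Name, InstanceFleetType, ProvisionedSpotCapacity, ProvisionedOnDemandCapacity}';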

EMR Managed cluster scaling constantly monitors key metrics and automatically increases or decreases the number of instances or units in your cluster based on workload.
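
These metrics are also visible in CloudWatch. For example, the following sketch retrieves the average number of allocated YARN containers over the last hour (ContainerAllocated is one of the EMR CloudWatch metrics; the time window and period below are arbitrary choices):

aws cloudwatch get-metric-statistics \
    --namespace AWS/ElasticMapReduce \
    --metric-name ContainerAllocated \
    --dimensions Name=JobFlowId,Value=$EMRClusterID \
    --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
    --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
    --period 300 \
    --statistics Average;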

Question: Did you see more than 17 containers in the CloudWatch metrics and in the YARN ResourceManager? If so, do you know why?