My thoughts and ideas
Cloud computing allows users to quickly access an arbitrary amount of compute resources remotely without the need to buy or maintain hardware themselves. There are many cloud computing services. This tutorial describes the use of the Amazon Web Services (AWS) Elastic Compute Cloud (EC2) service. However, the fundamental concepts covered here will generally apply to other cloud computing services such as Google Cloud, DigitalOcean, Microsoft Azure, etc., though with substantial differences in the jargon used by each provider.
Creation of this tutorial on Amazon AWS EC2 was generously supported by Amazon AWS Education grants.
To complete this tutorial, you will need a computer with access to the internet, a web browser, and a command line terminal application (e.g. Terminal on a Mac, PuTTY on Windows, etc.). We are going to access the Amazon EC2 console in your web browser and use it to configure and rent a remote computer from Amazon. We are then going to log into that computer from the command line using a terminal application. The computer you are working on can be almost anything and could be running Windows, Mac OS X, or Linux. The computer that we configure and rent from Amazon will be a Linux machine (though there are many other possibilities). You will use the terminal application on your computer to remotely log into this computer. The Amazon AWS computer you rent will be physically located somewhere that is likely far away from you. Depending on the Region you select in Amazon AWS it could be physically located in one of several large compute warehouses in North America, South America, Europe, or Asia.
There are two types of knowledge you will be exposed to in this tutorial. First, use of the Amazon AWS EC2 web console. AWS documentation has a lot of jargon and technical concepts. Becoming proficient in AWS EC2 usage is akin to becoming a very particular type of system administrator. Everything we will do below in a web browser using the EC2 console can also be accomplished at the command line using the Amazon EC2 API tools. That is a subject for a more advanced tutorial.
Second, since we are going to create an Amazon instance that is running a Linux operating system you will need to learn the basics of working at a Linux command line. You will also need to become familiar with the fundamentals of Linux system administration. Refer to the Resources section for a list of reference materials to help you learn the basics of Linux/Unix.
In order to use AWS for the first time, you will need to create an account. In order to create and run instances as described in this tutorial, you will need to associate a credit card with that account for billing purposes. Refer to the sections below on how billing works, how to estimate costs, and how to ensure that you have properly shut down everything that you could be billed for. To run this tutorial as it is described should cost at most a few dollars.
To log into AWS, go to aws.amazon.com and hit the Sign In to the Console button as shown below. If needed, create an account and activate it by associating a credit card. Once you are logged in, select
EC2 from the list of Amazon Web Services. This tutorial is entirely focused on
EC2 (with some mention of
S3) so the
EC2 console will be the starting point for many of the activities described below.
AWS log in:
List of AWS services (select EC2 for this tutorial):
The AWS EC2 dashboard:
A Region is a set of compute resources that Amazon maintains (each like the Data Center image shown above). Each Region corresponds to a physical warehouse of compute hardware (computers, storage, networking, etc.). At the time of writing there are 8 regions: US East (N. Virginia), US West (Oregon), US West (N. California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Sydney), and South America (Sao Paulo). When you are logged into the AWS EC2 console you are always operating in one of these 8 regions. The current region is shown in the upper right corner of the console between the
User menu and
Support menu. It is important to pay attention to what region you are using for several reasons. First, when you launch an EC2 instance, this happens in a specific region. If you switch regions later, you will not see this instance. To find info in the console you will have to switch back to the region where that instance was created. The same reasoning applies for EBS volumes, AMIs, and other resources. These are tracked within a region. Second, the cost to use many AWS resources varies by region. Third, since each region is located in a different part of the world, this may influence network performance when you are accessing the instance and especially if you need to transfer large amounts of data in or out. For example, if you are working in the US and you are going to be uploading RNA-seq data to EC2 instances, it probably does not make sense to create those instances in
Asia Pacific (Sydney). Generally you should choose a region that is close to you or your users. But cost is also a consideration. It is important to be aware of regions when it comes to billing because if you are using resources in multiple regions it is easy to lose track of what you have running and you might wind up paying for something that you forgot to shut down. We will discuss billing and cost in further detail below.
Estimating the cost to use AWS resources can get complicated. For the most part when you launch an EC2 instance or create an EBS or S3 volume, you are renting and reserving that resource. You will generally be charged for as long as you reserve that resource regardless of how you use it. For example, if you rent an 8-core machine with 1 TB of disk and 64 GB of RAM, once you start that machine you will be charged an hourly rate for as long as it is running, even if you do not use it much, and even if you do not log into it at all! You have reserved it, it is being run for you, and that resource cannot be rented to someone else, so you must pay for it. To get a sense of how much a particular resource costs, spend some time examining the AWS EC2 Pricing list. Remember that
Region can influence cost, so once you decide on the type of resources you need you should compare the cost of that resource across multiple regions. The pricing list is an extremely long page, broken down into several major categories:
Free Tier (lightweight resources you can experiment with for free),
On-Demand Instances (rent by the hour, as we do in this tutorial),
Reserved Instances (get a discount by renting longer term),
Reserved Instance Volume Discounts (get further discounts by being an enterprise scale user),
Spot Instances (bid for unused Amazon EC2 capacity in an open market),
Data transfer (moving data in and out of EC2),
EBS-Optimized Instances (for high performance file I/O),
Amazon Elastic Block Store (rent storage volumes separately from Instances), etc. Amazon provides a Monthly Calculator to help you predict what your costs might look like.
For this tutorial, we are going to use an
On-Demand Instance. Let's look more closely at that section of the pricing list by referring to the example screenshot below. Note that we have selected
US West (Oregon) as our region and we are looking at the
General Purpose section of the table and assuming that we will launch a
Linux instance. These tables enumerate the features of various computer configurations that you can rent by the hour. Consider a particular instance type in this table, for example
m3.xlarge. For this instance, we are told the number of CPUs that will be available on the machine (4), the amount of memory (15 GiB), the storage that will be pre-configured (2 x 80GB SSD drives), and the cost per hour to rent this machine ($0.140 per Hour). Note how much jargon is used in these tables. Memory is reported in GiB and storage is reported in GB (1 GiB ≈ 1.074 GB). For the number of CPUs we are told both the number of vCPUs (virtual CPUs) and ECUs (Elastic Compute Units). The vCPU count is the number of CPUs that will be available on the machine. Each of these CPUs is referred to as
virtual because we are actually creating a virtual machine when we create an EC2 instance. This virtualization allows Amazon to create a set of smaller virtual computers of varying specifications from larger physical computers that they maintain. An ECU is a unit of computing that is meant to allow more accurate comparisons between machines that might have different generations of CPUs, recognizing that not all CPUs are created equal. The storage descriptions for EC2 instances may use the terms: EBS only, SSD, and HDD. We will discuss EBS (Elastic Block Store) in more detail below, but briefly EBS allows us to define one or more storage volumes of almost any size we wish. The hardware details of that storage will be handled for us behind the scenes. EBS volumes exist independently of an EC2 machine, so a volume and its contents can be detached from one instance and attached to another over time (though only to one instance at a time).
SSD refers to a solid state drive that is associated with the machine.
HDD refers to a hard disk drive that is associated with the machine.
SSD drives tend to be higher performance than
HDD but also more expensive per unit of storage space. When an instance indicates that it has
2 x 40 SSD this means that it has two solid state drives, each 40 GB in size. The SSD or
HDD drives associated with each instance and described in the pricing list are considered
ephemeral. We will discuss this important concept in more detail below.
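Because the pricing tables mix GiB (memory) and GB (storage), it can help to convert between the two before comparing figures. The following is a minimal sketch of that conversion using awk from a shell. The function name gib_to_gb is our own, chosen for illustration, and is not part of any AWS tool.

```shell
# Convert a size in GiB (binary, 1024^3 bytes) to GB (decimal, 1000^3 bytes).
gib_to_gb() {
    LC_ALL=C awk -v gib="$1" 'BEGIN {
        printf "%.2f\n", gib * 1024 * 1024 * 1024 / (1000 * 1000 * 1000)
    }'
}

gib_to_gb 15    # memory of the m3.xlarge example: prints 16.11
gib_to_gb 1     # prints 1.07
```

So the 15 GiB of memory quoted for the example instance is a little over 16 GB when expressed in the same units as the storage column.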
An example AWS EC2 price list:
Generally you get an accounting of usage and cost on a 30 day cycle. You can get more detailed information on your account by going to the
Billing and Cost Management section of the
User menu in the EC2 console. You can also refer to the billing section of the EC2 FAQ page. Briefly, Amazon describes billing as follows:
You pay only for what you use and there is no minimum fee. Pricing is per instance-hour consumed for each instance type. Partial instance-hours consumed are billed as full hours. There is no Data Transfer charge between two Amazon Web Services within the same region (i.e. between Amazon EC2 US West and another AWS service in the US West). Data transferred between AWS services in different regions will be charged as Internet Data Transfer on both sides of the transfer. Usage for other Amazon Web Services is billed separately from Amazon EC2. Billing commences when Amazon EC2 initiates the boot sequence of an AMI instance. Billing ends when the instance terminates, which could occur through a web services command, by running “shutdown -h”, or through instance failure. Instance-hours are billed for any time your instances are in a “running” state. If you no longer wish to be charged for your instance, you must “stop” or “terminate” the instance to avoid being billed for additional instance-hours. Billing starts when an instance transitions into the running state.
This sounds simple but in practice it tends to be more complicated. EBS usage is billed separately from EC2 resources, even though the two are often intricately linked. For example, the root volume on a Linux Instance often exists as an EBS volume. You will be charged for this whether the system is running or stopped. Similarly, if you have an EC2 instance with one or more extra EBS volumes attached to it, you will be charged for this storage even when the EC2 instance is stopped. When you terminate the instance, associated EBS volumes may or may not be automatically destroyed. This behavior depends on how you configured the instance. We will discuss this configuration in further detail below. Similarly, if you create a
Snapshot of your instance, it gets saved to EBS storage and you will be charged for it as long as it exists. When you create an AMI to save the state of your instance for later, this is stored as a
Snapshot. The good news is that small EBS volumes are very cheap and by default the root volume for most instances is small (usually 8GB).
If you choose an instance type with pre-configured storage or you attach EBS volumes for storage but set them to be deleted upon termination of the instance, and you never create any
Snapshots or save the instance as an
AMI, when you terminate that instance, all costs associated with it will be gone. The cost will therefore be the hourly rate from the pricing list multiplied by the number of hours it was running, rounded up to the next whole hour.
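The arithmetic described above (hourly rate multiplied by running hours, with partial hours billed as full hours) can be sketched in a few lines of shell. This is only an illustration of the rule as stated in this tutorial, not an official AWS calculator. The function name instance_cost is hypothetical, and the $0.140 rate is the example price quoted in this tutorial, which may not match current pricing.

```shell
# Estimate an On-Demand bill: hourly rate x hours, partial hours billed as full hours.
instance_cost() {
    # $1 = hourly rate in dollars, $2 = hours in the running state (may be fractional)
    LC_ALL=C awk -v rate="$1" -v hours="$2" 'BEGIN {
        billed = int(hours)
        if (billed < hours) billed++      # round a partial hour up to a full hour
        printf "%.2f\n", rate * billed
    }'
}

instance_cost 0.140 2.5    # 3 billed hours: prints 0.42
instance_cost 0.140 720    # left running for 30 days: prints 100.80
```

The second call shows why forgotten instances matter: an instance that costs cents per hour adds up to a noticeable monthly charge if it is never stopped. EBS volumes and snapshots are billed separately and are not included in this estimate.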
In the Billing and Cost Management section of the EC2 console you can create billing alerts that will warn you of ongoing costs. If you find that you are being charged a monthly fee but you are not intentionally using any resources, you should follow these steps. Log into the AWS EC2 console. Now, for each AWS Region, determine the following: are there any Running Instances, Volumes, Elastic IPs, or Snapshots? If any of these values are greater than 0, in any one or more regions, you are likely being billed monthly for resources that Amazon is reserving for you until told otherwise. If you terminate or delete all of these items, your monthly bill should return to $0.
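If you work in several regions, one way to keep track of this check is to note the dashboard counts for each region and flag any region where a count is above zero. The sketch below assumes you have copied the counts by hand into a small table. The numbers are made up for illustration, the function name billable_regions is hypothetical, and the region codes are the short API names for US East (N. Virginia), US West (Oregon), and Asia Pacific (Sydney).

```shell
# Hypothetical counts copied by hand from the EC2 dashboard, one line per region:
# region  running_instances  volumes  elastic_ips  snapshots
counts='us-east-1 0 1 0 2
us-west-2 0 0 0 0
ap-southeast-2 1 1 1 0'

# Print every region that still holds at least one potentially billable resource.
billable_regions() {
    printf '%s\n' "$1" | awk '$2 + $3 + $4 + $5 > 0 { print $1 }'
}

billable_regions "$counts"
# prints:
# us-east-1
# ap-southeast-2
```

Any region printed here is one to revisit in the console before assuming your bill will return to $0.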
In the following sections we are going to launch an example instance, configure it in the AWS EC2 console, discuss some of the important concepts of this configuration and log into the instance once it is running. In each case, screenshots from an example instance will be shown and discussed. To get started, make sure you are logged into AWS and go to the
EC2 dashboard. To start using EC2 we will launch a virtual server running the latest stable version of the Ubuntu operating system. We will decide on the basic hardware for this server, configure storage that will be available on it, configure its security, and so on. Once it is running we will log into this server and perform some additional exercises and configuration. To get started press the blue
Launch Instance button. Remember that you are launching this instance in a particular
Region. In the following example we launched an instance in
US East (N. Virginia).
Launching an instance:
An AMI or
Amazon Machine Image is a template for launching an instance. The AMI includes a template for a pre-configured root volume that contains an operating system (e.g. Ubuntu Linux, Windows, etc.). The AMI also includes basic configuration of storage volumes that will be available within an instance.
The first step to launching an instance is to select an AMI. Refer to the screenshot below. There are four main options when selecting an AMI: Quick Start, My AMIs,
AWS Marketplace, and
Community AMIs. In the
Quick Start list we will select an Ubuntu AMI as our starting point. The
Quick Start AMIs section is a relatively short list of basic systems that have been chosen by Amazon as common starting points. These have some degree of “official” support and testing on AWS. The
My AMIs section contains AMIs that you have created yourself, perhaps using a
Quick Start or
Community AMI as a starting point. For example, you might start with an Ubuntu AMI and install all of the bioinformatics software and other infrastructure you need to run an analysis pipeline. You could then store this configured machine as an AMI to save your work. Having an AMI allows you to share a complete system configuration, and to fire up a cluster of identical instances that are ready to go. The
AWS Marketplace contains AMIs where a company (often a software company) has configured a machine for certain applications. You can browse through this section to get an idea of what kinds of applications are available. Finally, the
Community AMI section contains thousands of AMIs created by users around the world. These AMIs are specific to each
Region so if someone tells you about an AMI they want to share, be sure to search for it in the correct region. If you create your own AMI and you want to share it with others, you can ‘publish’ it to the community. It will still appear in your
My AMIs section, but it will also then appear and be searchable in the
Community AMI section.
For this tutorial we will select the following AMI from the
Quick Start list:
ami-9a562df2. This number is a unique ID for the AMI. The full length description for this AMI is
Ubuntu Server 14.04 LTS (HVM), SSD Volume Type. We are also told that the
Root device type is
EBS and the
Virtualization type is HVM.
Ubuntu Server 14.04 LTS refers to the version of the Ubuntu OS. This version is also known as Ubuntu ‘Trusty Tahr’. If you were going to install it directly onto your own hardware instead of on the AWS cloud you would download the 14.04 LTS image from the Ubuntu releases site. We are told that the
Root device type of the AMI is
EBS. We will discuss storage in more detail but this means that the AMI is configured so that the
root volume of the operating system will be installed on an EBS volume. In practical terms, this means that information stored on the root volume, including the OS itself will persist if we stop the instance (i.e. the root volume is not ephemeral). The term
HVM refers to a type of virtualization technology that will be used by the instance, the other common type being
PV. A detailed discussion of virtualization technology is outside the scope of this tutorial but you can learn more details here: Linux AMI virtualization types and Ubuntu PV vs HVM. On balance, the
HVM option is now perhaps recommended over the
PV option. Once you are ready to proceed, press the blue
Select button next to the Ubuntu Server description.
Step 1. Choose an Amazon Machine Image (AMI):
Once an AMI is selected, the next step is to choose an instance type. In simple terms, in the previous step we decided on the operating system we want to run (Ubuntu v14.04), and now we need to choose the hardware that it will run on. Refer to the following screenshot for this discussion. Note that in this example, we have selected
General purpose in the drop down filter. This leaves us 7 choices for hardware configuration. Note that the price per hour for each of these options is not listed here. To get the price, note the instance type name (e.g.
m3.large) and refer back to the EC2 pricing list. At the time of writing, an
m3.large instance in
US East (N. Virginia) rented on an
On Demand basis costs $0.140 per Hour. We discussed many of the details described in this table of instance types in the pricing discussion above. Briefly, we are given a series of options that differ in their number of CPUs, memory, pre-configured storage, network performance, etc. To view more or less details you can adjust this table using the
Show/Hide Columns option. In the example depicted below we have selected the
m3.large option with 2 CPUs, 7.5 GiB of memory, a single 32GB SSD
Instance Storage volume, and moderate network performance. Once you are ready, proceed to the next step by pressing the
Next: Configure Instance Details button.
Step 2. Choose an Instance Type:
Once an instance type is selected the next step is to configure the instance details. This step in the launch-an-instance procedure introduces many advanced concepts that will be covered only briefly here. To learn more about each of the options available in this step you can mouse over the
i symbol beside each. For the most part, leaving all of these options at their default value will be fine. Refer to the following screenshot while we discuss a few of these options briefly. Using the
Number of instances option you could launch multiple instances of the same AMI with the same hardware configurations at the same time. However, in our example, only one instance will be launched. You also have the option to attempt to negotiate a cheaper rental by using the Request Spot Instances option. The
Shutdown behavior option determines what will happen if you shut down the instance from within its operating system (e.g. by issuing a
sudo shutdown command in Ubuntu Linux). To prevent accidental termination of your instance, you may want to set this option to
Stop. You can also help to prevent accidental termination of your instance by using the
Enable termination protection option. These options can also be adjusted later for any instance in the console. Once you are ready, proceed to the next step by pressing the
Next: Add Storage button.
Step 3. Configure Instance Details:
The next step is to configure the disk/storage that will be available in the instance. The starting point of this page depends on what instance type we selected in Step 2. Remember that we selected an instance type with an EBS root volume (during AMI selection) and an additional 1 x 32 GB SSD drive. These two volumes are summarized in the
Add Storage view. The first volume is 8 GiB. This is the root volume where the operating system will exist. It is set to be deleted on termination of the instance but we could choose to keep it as well. The second row of the table shows as
instance store 0. This is the 32 GB SSD drive (though confusingly, the size is not shown here). Our two volumes are projected to be attached to the instance as /dev/sda1 and
/dev/sdb. Sometimes this does not exactly match what we see inside the instance because device mapping behavior depends on the operating system. Since the second volume is an
Instance Store device, it is akin to a drive physically attached to the computer we are renting. This should ensure high performance, but it is important to remember that such volumes are
ephemeral and the contents will not persist if the instance is stopped or destroyed. To demonstrate the difference, let's use the
Add New Volume button to add a third volume to our instance (see Step 4b screenshot below). Choose
EBS as the
Type, set the device to
/dev/sdc, give it a size of 500 GiB, and set the volume type to
General Purpose (SSD). Now when we log into the instance we will expect to find three distinct storage volumes/devices.
Step 4a. Add Storage:
Step 4b. Add additional Storage:
What is Instance Store storage? What is EBS storage? Which is the better option for
Root device type?
An EBS (Elastic Block Store) volume can be linked to a single instance at a time.
EBS volumes can be set to persist even if the instance is destroyed. They could therefore later be reassigned to a different instance. In the context of bioinformatics analysis, you might decide to write your analysis results to an EBS volume. Once analysis was complete you could then shut down the instance to save money but keep your results indefinitely on the EBS volume. The Instance Store volumes are considered ‘ephemeral’ or transient. Therefore, you must be careful when storing data to these volumes because if the machine is stopped or terminated, the data will be unrecoverable. Instance store volumes are created from disks that are physically attached to the host computer while EBS volumes are created from disk arrays in the same
Availability Zone but are not physically attached to the host computer. Instead the host computer accesses EBS volumes over a network. EBS volumes can be added at will to an existing instance. On the other hand, Instance Store volumes can only be added or configured when the instance is created.
Root device type refers to the type of volume used to store the operating system itself. This is usually a small volume (often 8 GiB) that can be either EBS or
Instance Store type. This option can be selected during the choice of AMI or when configuring storage during setup of the instance. Once you launch the instance, though, you cannot change the
Root device type. There are pros and cons to both EBS and
Instance Store for the root device type. The
Instance Store type may have a performance advantage but the
EBS type is more flexible and safer from the perspective of accidental data loss. For a beginner just starting to use AWS, we recommend
EBS. A bioinformatics analysis instance might use an
EBS volume for
Root device type, use an
Instance Store volume for
/tmp where all temporary files and staging of data will occur, and use an additional
EBS volume to store the final results. Notice that this is how we have configured the example instance in this tutorial. You can examine the types of volumes for an existing Instance in the EC2 dashboard by selecting a running Instance and examining the
Root device type value. Once you are ready, proceed to the next step by pressing the
Next: Tag Instance button.
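The persistence rules described above can be summarized as a small lookup. The sketch below only restates what this tutorial says about Instance Store and EBS volumes and does not query AWS in any way. The function name data_survives and its arguments are hypothetical. The default of deleting an EBS volume on termination is an assumption that matches the root volume in this tutorial, and other volumes may be configured differently.

```shell
# data_survives TYPE EVENT [DELETE_ON_TERMINATION]
#   TYPE:  ebs | instance-store
#   EVENT: stop | terminate
#   DELETE_ON_TERMINATION: true | false (assumed true when omitted)
data_survives() {
    case "$1:$2" in
        instance-store:stop|instance-store:terminate)
            echo no ;;       # ephemeral: contents are lost on stop or terminate
        ebs:stop)
            echo yes ;;      # EBS contents persist while the instance is stopped
        ebs:terminate)
            if [ "${3:-true}" = true ]; then echo no; else echo yes; fi ;;
        *)
            echo unknown
            return 1 ;;
    esac
}

data_survives instance-store stop      # prints no
data_survives ebs stop                 # prints yes
data_survives ebs terminate false      # prints yes
```

This is why the example layout keeps final results on a separate EBS volume while using the Instance Store volume only for temporary files.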
As you start to have a large number of instances running or saved you may want to start assigning
Tags to these instances to help track their usage and billing details. Try creating a
Tag as a simple key/value pair. In the example below we created a
name tag with a value of
AWS Tutorial. Once you are ready, proceed to the next step by pressing the
Next: Configure Security Group button.
Step 5. Tag Instance:
A Security Group controls how services and users can access your instance once it is running. When you launch a new instance you can choose to configure a new Security Group and use it or select one that you created previously. You can also select a
default security group. The purpose of Security Group settings is to determine what Inbound and Outbound network traffic will be allowed on the instance. Since Inbound traffic could be coming from anyone (including those with malicious intentions) it is highly recommended that most incoming traffic be blocked and only certain incoming services be allowed on an as needed basis. In the example below we created a Security Group called
AWS-Tutorial that only allows incoming traffic of two types:
SSH (over port 22) and
HTTP (over port 80). The first rule will allow us to log into our instance remotely using the SSH protocol. The second rule will allow us to set up a web server on the instance and access web content remotely using a web browser. Both of these rules could be made significantly more secure by limiting access to only certain IP addresses. For example, if you will access your AWS instances only from your university you could limit access to your university's IP address (or a range of addresses). You can reconfigure the
Security Group settings at any time, and changes to the rules are applied to running instances automatically, without a reboot. Create two Incoming rules that match those in the screenshot and name your new security group. Once you are ready, proceed to the next step by pressing the
Review and Launch button.
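Restricting a rule to a range of addresses is usually done with CIDR notation (for example 192.0.2.0/24). To make concrete what such a range does and does not match, here is a minimal sketch of a CIDR membership test in plain shell arithmetic. It is only an illustration: AWS evaluates the real rules, the function names ip_to_int and in_cidr are our own, and the addresses come from ranges reserved for documentation rather than any real network.

```shell
# Convert dotted-quad IPv4 text (e.g. 192.0.2.55) to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr ADDRESS NETWORK/PREFIX : succeeds when ADDRESS falls inside the block.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    prefix=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

ALLOWED="192.0.2.0/24"   # hypothetical campus range used as the SSH rule source
for src in 192.0.2.55 198.51.100.7; do
    if in_cidr "$src" "$ALLOWED"; then
        echo "$src: inside $ALLOWED"
    else
        echo "$src: outside $ALLOWED"
    fi
done
```

A source of 0.0.0.0/0, which is what an unrestricted rule uses, matches every address, and this is why the console warns about broad Incoming access.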
Step 6. Configure Security Group:
At this stage you will be presented with a final summary describing the configuration of your instance. Some warnings may appear. A conservative security warning is often presented here if you have allowed any broad Incoming access to the instance. Once you are ready, proceed to the next step by pressing the Launch button.
Step 7. Review Instance Launch:
You will now be presented with a final, but very important configuration step, assignment of a
Key Pair. A Key Pair consists of two keys, a
public key and a
private key. The
public key will be stored in AWS. The
private key will be presented to you on creation and it must be saved by you to allow you to log into your instance later. If this is your first instance you will have to
Create a new key pair. In the example below we have chosen to create a new key pair called
AWS-Tutorial. Once the name is chosen, press the
Download Key Pair button. You will download a simple text file called
AWS-Tutorial.pem. Store this file somewhere on your computer (e.g. your home directory) and remember the location. This file will contain an RSA private key that should look something like this:
-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEAhEpF18lIUouMH8qia/BSB70vrQVq/mTTkiRbsACB78rzy3XGRMfvwUseIsGY
H6SDOAFrRlmTrAArH5A0t2TZ8PKrq7b9FtEAvMCeE7rWEiqBblAWiER0k1pbnIqyKJJCo1YRSUs0
oNMdvjB4CUylYraSsSNFYJG5gRwcNhBENLDVnDS79geQcPLu/JeEiJ9V+w+CCYAG40f7li/TuULr
rSy6Oq6jgn2Gy7rrHU7XHU5hcEvxuSeoLb8h/bH1N+cN/H7x3ipEjIDdA2ScCkRXum1V6/kTFQFq
vDG0lqoTlmTNKgDGpb+rdzJgOg/3QX4RSrX/c0W6aFkV9Ib/jQxT+wIDAQABAoIBADAvWXc6wpQG
bjiaN0T3mPlmqHnuEkWs9f8yLQ9TcACmvNwr/tbIuISAVu6z8zP7WSxKIAfU0twAh7SMcxclrdh8
m5kFIvRvlkQqKKnpENY3E0PZ+gsSXB/b9qhzQGdUtt8Fl3BJ61Z07016HA7PEyJ8e7v3q+p7ycTE
N2Zd0GocRIX8zxdRo9GS8ouS0QcFgNF8KblzlJ6Vs0gI7o7mIRZIm9vWkuR9Lp9uEPD2flUIvN3z
yRmY/FE/R1yc76Uq+g8eywifRAh+GFyyO8PmFoYRni4Ki6+tEIFaq5JauT0JJF66EZeZP8ZKoWm9
1K30Ucti2D5l8t+CpbBM5JxhmjECgYEAxz1ET42F1sBGYqNn5hmfjrRp+YF3EYz2awRSibOeerpJ
Bh1QZeB7/QD3wcB00XFiMu/3haP9xs4eesjSSug+1F59nyzDplNsybz1sYpUQwP9LjX0loUCIb8r
3O2VdLJ5ZJ9dfNgpStC/wi7kkr8xjK5XiHgP6DLk6+H1Lr2d+kMCgYEAqfpUseZ/sm1vYt80LlWI
r8ozsUmzuISRspGVUppyDD47Iyj/1mkiWnsFDDl07oBcFIUFIEd1rkJNB3gXKSr76kcY0X4lav7a
0dvse2T9PC/pLSFkax9UjVnydCN8ElyNoXI2wT5HuLDjjCmHBD/4E9ZOO201JICSbRxaykl17+kC
gYEAxRiWuxwFiqwq9Okxny856LIRJAIvB+2q17Mu84n8/OvL0YCuSBoKjf6nGcSJy6eevUUmV84i
/sho3o5Lek7F2NCg9RYTdjaRKAEGDNwK/0Cy9UPq8fwiX7/+ZE+jyg3EiQYeNaKhNqHLEQ3SkFkT
a1gMv7QGCG5QiAi/w71QyoECgYARcn+VDyrWXsNLK8wIYYE5QhESRpVrADiQUr84DmBcf1rEniW8
lWgQT4ZSHeexv300If9Hs+4RZ/7OIHaIJEBdaNTUVBV1KRm+5sscU15m+if+GOpc0Id2RuBLKYVH
wTZMdxPFvCXSgF2q+mxAdGx7ZMj88pW83HGrP3jWQLoZWQKBgQCX5jxy3QXlPpwDppqwKKBQ8cGn
YDDQHCeD5LhrVCUqo5DCobswzmGKU/xEqYsqlk/Mz1Zkvg4FbJwJDgQGkSyAu071NLi0O6w27dm+
UHuvF5mCDdAHWirFUBSiebxOpEQnkZ9IPXUUCSC6IQvPFbdGN8G3WjoER6Lw121Q4rJxGA==
-----END RSA PRIVATE KEY-----
If this is not your first instance and you have already created a key pair, and you still have that key file, you can choose to use the existing key pair.
Step 8. Select an existing Key Pair or create a new Key Pair:
To prepare for logging into our instance, let's create a directory on our own
local computer (i.e. the one you are sitting at) and store the key file there. Later we will use this file to log onto the AWS instance. We are using a Mac system. To create a directory and move the key file we downloaded into that directory we can run the following in a Mac Terminal session:
mkdir ~/AWS-Tutorial
mv ~/Downloads/AWS-Tutorial.pem ~/AWS-Tutorial
cd ~/AWS-Tutorial
chmod 400 AWS-Tutorial.pem
ls
The chmod 400 AWS-Tutorial.pem command changes the permissions of your key file so that only you can read it. This is an important security setting. If you attempt to log into your instance using a key file with inappropriate permissions, the login command may fail. So you should always perform this command on any new key file (or copy of such a file) before attempting to use it to log into an instance.
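To see what the permission change does, you can compare the permission string that ls reports before and after running chmod. The sketch below works on a throwaway temporary file so that no real key is touched. The helper name key_mode is our own, chosen for illustration.

```shell
# Show the permission string of a file, e.g. -r-------- after chmod 400.
key_mode() {
    ls -l "$1" | cut -c1-10
}

# Demonstrate on a throwaway file so that no real key file is modified.
demo_key=$(mktemp)
chmod 644 "$demo_key"
key_mode "$demo_key"     # prints -rw-r--r--  (readable by everyone: too open for a key)
chmod 400 "$demo_key"
key_mode "$demo_key"     # prints -r--------  (readable only by the owner)
rm -f "$demo_key"
```

Running key_mode AWS-Tutorial.pem in the directory that holds your key should show the second form once the permissions are set correctly.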
Once you have launched your instance, you will be presented with yet another review page. When you are ready, proceed to the next step by hitting the
View Instances button.
Step 9. Review launch status:
You should now see the EC2 Console view for a new instance. This view shows a table of all Instances you have created in the current
Region. When an instance is terminated it will remain visible in this table for a brief time and then will be automatically removed. You should see a single entry that when selected looks much like the example below. After a few minutes, your instance should achieve an
Instance State of
running. Note the incredible wealth of information available both in the table and in the
Description view below. Much of the configuration we described above will be summarized here. To work effectively with your instances you will need to become familiar with certain features of this EC2 console view. For example, the
Root device type,
Root device, and
Block devices will help to remind you what kind of instance you configured. In order to remotely log into the instance, the following items in the console will be relevant:
Key Pair Name,
Security Groups, and
Public IP (or
Public DNS). Try to familiarize yourself with each of these features and how to find them in the console for each instance you have running.
To modify an instance in the EC2 console you can select that instance (or a series of instances) using the blue check boxes at the left. You can then perform various tasks using the
Actions menu. You can also right click on a single instance to obtain a similar menu. Before logging into this instance let's take a moment to examine various important sections of the EC2 console, in particular Volumes,
Security Groups, and
Key Pairs. In each of these views you should see new entities that correspond to the instance we just created.
Step 10. EC2 Console view of a new Instance:
The EC2 dashboard should now show a running Instance, Volumes, etc.:
Review new Volumes:
Review new Security Groups:
Review new Key Pairs:
We are finally ready to log into our instance. To do this, open a terminal session on your local computer (e.g. using
Mac Terminal or
Windows Putty). Change directories to the location where you stored your key file
AWS-Tutorial.pem. Now at the same time, view your instance in the EC2 console. Make sure that the
Key pair name for this instance matches the
.pem key file. Also, get the
Public IP value from the console and use it instead of the example one below. Note that you could use the
Public DNS value instead if you want. Finally log in as follows:
cd ~/AWS-Tutorial
chmod 400 AWS-Tutorial.pem
ssh -i AWS-Tutorial.pem ubuntu@188.8.131.52
In this example, we opened a terminal command line session on our local computer. We moved to the location of our
.pem key file. We then made sure the permissions of this file were set correctly using a
chmod command. You only need to do this step once but there is no harm in doing it again. Then we executed an SSH command to remotely log into our AWS instance using the
Public IP 188.8.131.52. Our SSH command included an option to use the
.pem file to identify us as the owner of the instance. We logged into the instance as a user called
ubuntu because that is a user that we know will be defined by default on all Ubuntu systems. Once logged in you can create new users if you wish. If your login is successful, you should see something like that shown in the screenshot below.
Step 11. Log into Instance:
If you tried the above and it did not work there are several possible explanations.
Check the Instance State of your instance in the EC2 console. Is it running? When you first start an instance it takes a few minutes to boot up. Similarly, if you reboot the instance for some reason, you will not be able to log into it until it comes back online.
You must use the Key Pair that was assigned to the instance, and you must have the key file that was created when that Key Pair was created. If you delete a key file and later generate a new one, it will not work with instances that used an older Key Pair, even if you give the file the same name.
Did you set the permissions of your .pem key file correctly? Do not forget to run chmod 400 *.pem on your key file.
Did you include -i key_file_name.pem in your SSH command?
Did you include ubuntu@ (or another valid user name) before the IP address? This is needed to log into an Ubuntu system by SSH.
The address after ubuntu@ must match the Public IP (or Public DNS) value that is shown in the AWS EC2 Console. Note that there are also private versions of these two values. Only the Public version will work from your local computer.
Does the Security Group used for the instance allow incoming SSH access? Make sure your Security Group has an entry for SSH on port 22, from source Anywhere. Changes to Security Group rules take effect immediately, so you do not need to reboot the instance after editing them.
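One item in the checklist above, using the public rather than the private address, is easy to test mechanically. The following is a minimal sketch under stated assumptions: the function name `is_private_ip` is hypothetical, it handles IPv4 only, and it checks only the standard private (RFC 1918) ranges.

```shell
# Return success (0) if the IPv4 address falls in a private (RFC 1918)
# range: 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16.
# (Hypothetical helper name, used only for illustration.)
is_private_ip() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: warn before attempting a login that cannot work from outside AWS.
# 203.0.113.10 is a documentation-only placeholder address.
# if is_private_ip 203.0.113.10; then
#   echo "This looks like a private address; use the Public IP instead." >&2
# fi
```

If the address and key look right and the login still fails, adding `-v` to the `ssh` command prints verbose connection diagnostics that usually show which step is being rejected.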
Now that you are logged in, you can investigate how the storage options you chose when creating the instance manifest inside an AWS Ubuntu instance. First try using the command
df -h to view existing storage devices that are mounted. If you created a system exactly as described above, you should see two devices (
/dev/xvda1 of 7.8G mounted as
/) and (
/dev/xvdb of 30G mounted as
/mnt). These are the
EBS root device volume and the ephemeral 30G
Instance Store volume that come with the
m3.large instance type we chose. Remember that we also added another
EBS volume that was 500 GiB in size. Where is that device?
It is not currently mounted. To view all devices the system knows about you can do something like this command:
ls /dev/ or
ls -1 /dev/ | grep xvd. You should now see three devices, including
xvdc. Let's format and mount the device
xvdc to a new directory
data as follows:
lsblk
cd /
sudo mkdir data
sudo mkfs -t ext4 /dev/xvdc
sudo mount /dev/xvdc /data
sudo chown -R ubuntu:ubuntu /data
df -h
lsblk
NOTE: Refer to AWS docs about using EBS Volumes for more details.
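Because `mkfs` erases whatever is on the device, it can be worth confirming that the volume is actually empty before formatting it. The sketch below interprets the output of `file -s`, which reports just `data` for a device with no recognizable filesystem. The function name `device_is_blank` is hypothetical, and the device name is simply the one used in this tutorial.

```shell
# Interpret the output of `file -s DEVICE`.
# A bare "data" description means no filesystem signature was found.
# (Hypothetical helper name, used only for illustration.)
device_is_blank() {
  description="${1#*: }"   # drop the leading "DEVICE: " prefix
  [ "$description" = "data" ]
}

# Usage on the instance (device name taken from the tutorial):
# if device_is_blank "$(sudo file -s /dev/xvdc)"; then
#   sudo mkfs -t ext4 /dev/xvdc
# else
#   echo "/dev/xvdc already has a filesystem; not formatting" >&2
# fi
```

This matters most when you reattach an EBS volume that already holds data, since in that case you would want to mount it without running `mkfs` at all.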
Now the same
df -h command we performed above should show a new volume
/dev/xvdc mounted at
/data of size 493G. Note that in order to make this new mount persist when we reboot the machine we will have to add a mount line like this to the
/etc/fstab file (e.g. by
sudo vim /etc/fstab):
/dev/xvdc /data auto defaults,nobootwait 0 2
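As an alternative to editing the file by hand, the mount line can be appended by a small function that refuses to add the same device twice. This is a sketch only: `add_fstab_entry` is a hypothetical name, and the example works on a copy of the file so that a mistake cannot leave the instance unbootable.

```shell
# Append an fstab entry for a device unless that device already has one.
# Arguments: fstab path, device, mount point, mount options.
# (Hypothetical helper name, used only for illustration.)
add_fstab_entry() {
  fstab_file="$1"
  device="$2"
  mount_point="$3"
  options="$4"
  if grep -q "^${device}[[:space:]]" "$fstab_file" 2>/dev/null; then
    echo "entry for $device already present"
    return 0
  fi
  printf '%s %s auto %s 0 2\n' "$device" "$mount_point" "$options" >> "$fstab_file"
}

# Example: edit a copy, review the change, then install it.
# cp /etc/fstab /tmp/fstab.new
# add_fstab_entry /tmp/fstab.new /dev/xvdc /data defaults,nobootwait
# diff /etc/fstab /tmp/fstab.new
# sudo cp /tmp/fstab.new /etc/fstab
# sudo mount -a   # reports an error now if the new line is malformed
```

Note that `nobootwait` is specific to older Ubuntu releases. Newer releases may expect `nofail` in its place, so check which option your Ubuntu version supports before rebooting.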
To examine other resources within the Ubuntu instance you should familiarize yourself with the command top (press 1 to show CPUs and q to exit). You can also learn about the system by examining cat /proc/cpuinfo.
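The raw contents of the `/proc` files are verbose, so a couple of small helpers can summarize them. This is a minimal sketch: the function names are hypothetical, and the use of `/proc/meminfo` for memory is an assumption, since the text above only names `/proc/cpuinfo`.

```shell
# Count logical CPUs by counting "processor" entries in a cpuinfo-style file.
# Defaults to /proc/cpuinfo when no file is given.
count_cpus() {
  grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Report total memory in GiB from a meminfo-style file.
# The MemTotal value in that file is expressed in kB.
total_mem_gib() {
  awk '/^MemTotal:/ { printf "%.1f\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Example, run on the instance:
# echo "CPUs: $(count_cpus)  Memory: $(total_mem_gib) GiB"
```

Comparing these numbers with the published specification of the instance type you selected is a quick way to confirm that you launched what you intended.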
To update your Ubuntu OS to use the latest security patches etc. you can do the following:
sudo apt-get update
sudo apt-get upgrade
sudo reboot
If you want to easily retrieve or share data created on your instance, one option is to start an Apache web service on the instance so that you can browse the contents of certain directories remotely in a web browser. Note that when launching the instance our security group was configured to allow http access via port 80 so that this would work.
sudo vim /etc/apache2/apache2.conf
<Directory /home/ubuntu/>
    Options Indexes FollowSymLinks
    AllowOverride None
    Require all granted
</Directory>
sudo vim /etc/apache2/sites-available/000-default.conf
sudo service apache2 restart
You should now be able to enter the
Public IP or
Public DNS in a web browser on your local computer and browse the contents of the
/home/ubuntu directory on your AWS instance.
From the AWS EC2 console, you can change the state of each of your instances. The
Start command will boot a system that has been powered down. The
Stop command will power down the instance and is similar to performing
sudo shutdown from within the instance (if you have configured your instance that way during creation). Do not forget that if you stop an instance with ephemeral
Instance Store volumes, the contents of these volumes will be lost. The
Reboot command will simply reboot the machine. This is equivalent to using a
sudo reboot command from within the instance. The
Terminate command will destroy the instance and any ephemeral
Instance Store volumes associated with it. If the root device is an EBS volume it may or may not be destroyed depending on how you configured the instance during creation. If there were additional EBS volumes associated with the instance and you
Terminate the instance, these may also be destroyed if you selected that option when they were being created. Before terminating an instance you should think carefully about whether there is data you want to save and if so, how the volumes will behave on termination. Similarly, if you want to destroy all components of an instance, including all associated volumes, you may need to terminate the instance and then separately destroy certain volumes.
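For readers curious how these console actions look outside the browser, each one has a counterpart in the AWS command line interface. The sketch below only prints the command that would be run, so it is safe to try. It assumes the AWS CLI is installed and configured, which this tutorial does not cover. The function name `ec2_action_command` and the instance ID in the example are placeholders.

```shell
# Map an EC2 console action to the matching AWS CLI command and print it.
# Nothing is executed; this is a dry run for illustration.
# (Hypothetical helper name.)
ec2_action_command() {
  action="$1"
  instance_id="$2"
  case "$action" in
    start)     sub="start-instances" ;;
    stop)      sub="stop-instances" ;;
    reboot)    sub="reboot-instances" ;;
    terminate) sub="terminate-instances" ;;
    *) echo "unknown action: $action" >&2; return 1 ;;
  esac
  echo aws ec2 "$sub" --instance-ids "$instance_id"
}

# Example (i-0123456789abcdef0 is a placeholder instance ID):
# ec2_action_command stop i-0123456789abcdef0
```

The same cautions apply as in the console: stopping an instance discards Instance Store contents, and terminating it may delete EBS volumes depending on how they were configured.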
If you have configured an existing AMI for your own purposes and you would like to save this work, one option is to create your own custom AMI. You can save an existing instance as a new AMI in the EC2 console by right clicking the instance and selecting
Create Image. Enter an appropriate name and description and then save. You can monitor the creation of your AMI in the
AMIs section of
IMAGES in the EC2 console. Once complete, your new AMI will appear in the
My AMIs section when you create new instances. If you would like your AMI to be listed in the
Community AMIs where it can be shared with others you can change the permissions of the AMI to
public. Remember that AMIs are region specific and you need to copy the AMI to any additional regions where you would like it to appear in Community AMI searches. Each AMI you create will be associated with a
Snapshot of each EBS volume it includes. You will be charged to store these. If the AMI was large and you make it available in multiple regions, these costs could add up.
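To get a feel for how these storage costs scale, the arithmetic is simply snapshot size times number of regions times the price per GiB per month. The sketch below uses a made-up price of 0.05 per GiB-month purely as a placeholder. It is not a real AWS rate, so substitute the current figure from the AWS pricing pages. The function name is hypothetical.

```shell
# Estimate monthly snapshot storage cost.
# Arguments: snapshot size in GiB, number of regions, price per GiB-month.
# (Hypothetical helper name; the price you pass in is your own assumption.)
snapshot_monthly_cost() {
  awk -v size="$1" -v regions="$2" -v price="$3" \
    'BEGIN { printf "%.2f\n", size * regions * price }'
}

# Example with a placeholder price of 0.05 per GiB-month:
# snapshot_monthly_cost 500 3 0.05
```

With those placeholder numbers, a 500 GiB snapshot copied to three regions works out to 75.00 per month. Treat this as an upper bound, since snapshots generally store only the blocks that have been written, so the billed size may be smaller than the volume size.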
Once you are done with this tutorial you should terminate or delete all resources that were created to ensure you are not charged. Specifically you should remove:
Snapshots. You may also decide to remove other entities that were created for demonstration purposes including:
Security Groups, and
Key Pairs. All of this can be done in the AWS EC2 console. When you are done, the
EC2 Dashboard should show
0 for all resource types except
Security Groups where a single default security configuration will remain.
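The dashboard check described above can also be approximated from the command line. The following sketch assumes the AWS CLI is installed and configured for the Region you used, which is outside the scope of this tutorial. The function name `list_billable_resources` is hypothetical.

```shell
# List resources in the current Region that may still incur charges.
# Each command prints nothing when no resources of that type remain.
# (Hypothetical helper name, used only for illustration.)
list_billable_resources() {
  aws ec2 describe-instances \
    --query 'Reservations[].Instances[].[InstanceId,State.Name]' --output text
  aws ec2 describe-volumes \
    --query 'Volumes[].[VolumeId,State,Size]' --output text
  aws ec2 describe-images --owners self \
    --query 'Images[].[ImageId,Name]' --output text
  aws ec2 describe-snapshots --owner-ids self \
    --query 'Snapshots[].[SnapshotId,VolumeSize]' --output text
}

# Example, run on a machine with the AWS CLI configured:
# list_billable_resources
```

Terminated instances can continue to appear in the first listing for a short time with state `terminated`. Remember that the check covers one Region at a time, so repeat it for every Region you used.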
This is a basic introduction to AWS cloud computing that assumes all configuration of the instance will occur within the AWS EC2 console of your web browser and all configuration of the Ubuntu Linux system will occur by the user manually executing commands and perhaps saving the outcome as a custom AMI. For large-scale computing and complex deployments of compute infrastructure on the cloud these methods will not be sustainable. Here is a list of more advanced topics for discussion on how to move beyond the console and automate configuration of your system: