Listing all of your EC2 Instances using boto

A common problem for people who are learning AWS is that they can’t figure out which AWS service they forgot to turn off. It shows up at month end on their credit card bill.

I have started a long-term project to use boto (the Python SDK for AWS) to list & take action on all of your AWS components across all regions. As the first part of the series, here is Python code that lists all EC2 instances & EBS volumes.


import argparse
import boto.ec2


def get_ec2_instances(region, access_key, secret_key):
    # Connect to the EC2 endpoint of the given region.
    ec2_conn = boto.ec2.connect_to_region(region,
                                          aws_access_key_id=access_key,
                                          aws_secret_access_key=secret_key)
    # Each reservation groups the instances started by one launch request.
    for reservation in ec2_conn.get_all_reservations():
        print('%s: %s' % (region, reservation.instances))
    # EBS volumes are listed per region as well.
    for vol in ec2_conn.get_all_volumes():
        print('%s: %s' % (region, vol.id))


def main():
    regions = ['us-east-1', 'us-west-1', 'us-west-2', 'eu-west-1', 'sa-east-1',
               'ap-southeast-1', 'ap-southeast-2', 'ap-northeast-1']
    parser = argparse.ArgumentParser()
    parser.add_argument('access_key', help='Access Key')
    parser.add_argument('secret_key', help='Secret Key')
    args = parser.parse_args()
    # Pass the keys explicitly instead of relying on module-level globals.
    for region in regions:
        get_ec2_instances(region, args.access_key, args.secret_key)


if __name__ == '__main__':
    main()

Save the code as list.py and run it as:

python list.py <aws_access_key> <aws_secret_key>
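
Passing the secret key on the command line leaves it in your shell history. Here is a minimal sketch of one way around that, assuming you are happy to fall back to the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables when the arguments are omitted. Note that resolve_credentials is a hypothetical helper written for this post, not part of boto.

```python
import argparse
import os


def resolve_credentials(argv=None, env=None):
    """Return (access_key, secret_key) from arguments, else from the environment.

    Hypothetical helper: the names and the fallback order are assumptions.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # nargs='?' makes each positional argument optional (None when omitted).
    parser.add_argument('access_key', nargs='?', help='Access Key')
    parser.add_argument('secret_key', nargs='?', help='Secret Key')
    args = parser.parse_args(argv)
    access_key = args.access_key or env.get('AWS_ACCESS_KEY_ID')
    secret_key = args.secret_key or env.get('AWS_SECRET_ACCESS_KEY')
    if not access_key or not secret_key:
        parser.error('credentials not given as arguments or environment variables')
    return access_key, secret_key
```

The returned pair can be passed to connect_to_region exactly as the keys are passed in list.py.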

Suggest what else you would like to see covered in this long-term project. Let me know if you are good at Python/boto & would like to contribute to it.

Facebook Open Sources Presto SQL Query Engine

In June 2013, at the Analytics @ WebScale conference, Facebook announced Presto, which it was using internally to process petabytes of data. It has now been made open source, as per a recent post by Facebook Engineering.

So what is Presto?

Hive, which was initially developed by Facebook, uses MapReduce chaining to transform a query into multiple MapReduce jobs. Presto is different: it does not use MapReduce & is, as per Facebook, 10 times faster than Hive for most queries. Presto allows querying data where it lives, including Hive, HBase, relational databases or even proprietary data stores. You can issue SQL-like queries on Presto that include left/right outer joins, subqueries & common aggregate functions. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.

Presto Architecture Diagram (source: Presto Website)

At Facebook, about 1000 employees use Presto internally to interactively query over a petabyte of data, running more than 30,000 queries a day. It is also being used by leading internet companies including Airbnb and Dropbox.

You can find out more about Presto here:

Presto Website
Facebook Blog about Presto
Gigaom Story


This blog is cross-posted from here

SSH to an EC2 instance in VPC private subnet

While exploring AWS VPC, have you wondered how you would SSH into your instances, given that an instance launched in a private subnet of a VPC is not directly reachable from the internet?

This tutorial explains how to connect to such instances using port forwarding.

Step 1 : Creating VPC

Create a VPC with public & private subnets using one of the templates provided in the VPC Wizard.

VPC Wizard

On the next screen, make sure to choose a valid keypair for which you have the .pem file. You’ll need this to SSH into the NAT instance. Also make a note of the default IP ranges for the private & public subnets.

VPC Confirmation Screen

At this point, the wizard will create the following resources for you:

  • VPC (with CIDR 10.0.0.0/16 in our case)

  • Public Subnet (with CIDR 10.0.0.0/24)

  • Private Subnet (with CIDR 10.0.1.0/24)

  • Two Route Tables

  • One Internet Gateway

  • One Network ACL

  • One Elastic IP

  • One Security Group

In addition to this list, you will notice that the wizard has also launched a NAT instance. This is a special type of instance that is used to route traffic for other instances. The VPC wizard has also already configured the route tables to route traffic from the public subnet to the Internet Gateway & from the private subnet to the NAT instance.

Step 2 :  SSH into NAT instance

Before you can SSH into the NAT instance, you’ll need to change the VPC’s Security Group settings to allow inbound traffic on port 22.

Now, let’s go to the EC2 service page to get the Elastic IP of the NAT instance launched by the VPC wizard & try to SSH into this instance using the keypair provided earlier.

EC2 NAT Instance

I’m using Ubuntu’s terminal to connect to my instance with the following command:

ssh -i training.pem ec2-user@54.208.114.96

You’ll need to change the IP address at the end to the Elastic IP assigned to you.

NAT Instance Terminal

Note the IP address in the last line of the screenshot above (10.0.0.236). This is the private IP that was automatically assigned to our NAT instance. Since the NAT instance was launched in the public subnet, this IP falls within the CIDR range of our public subnet (10.0.0.0/24).
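
If you want to double-check which subnet an address belongs to, here is a minimal sketch using Python 3’s standard ipaddress module. The CIDR ranges are the ones created by the wizard in this tutorial; which_subnet is a hypothetical helper name.

```python
import ipaddress

# CIDR ranges from this tutorial; adjust them if your wizard used other values.
VPC = ipaddress.ip_network('10.0.0.0/16')
SUBNETS = {
    'public': ipaddress.ip_network('10.0.0.0/24'),
    'private': ipaddress.ip_network('10.0.1.0/24'),
}


def which_subnet(ip):
    """Return 'public' or 'private' for an IP in one of the subnets, else None."""
    address = ipaddress.ip_address(ip)
    for name, network in SUBNETS.items():
        if address in network:
            return name
    return None


print(which_subnet('10.0.0.236'))  # private IP of the NAT instance
```

The same check can be applied to the private IPs of any instances you launch later in the tutorial.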

Step 3 : Launching instances in Private Subnet

We will now launch two instances in the private subnet. Later we’ll try to SSH into these instances by redirecting TCP packets through the NAT instance. While launching the instances, make sure to launch them in the private subnet of the VPC we created (subnet 10.0.1.0/24 in our case). Once the instances are launched, note down their private IP addresses. We will use them to configure iptables on our NAT instance.

EC2 Launch Confirmation

The launched instances in our case have the following private IPs:

  • 10.0.1.234

  • 10.0.1.235

Step 4 : Configuring iptables on NAT instance

We will now add port forwarding rules on our NAT instance:

sudo iptables -t nat -A PREROUTING -p tcp --dport 10234 -j DNAT --to-destination 10.0.1.234:22
sudo iptables -t nat -A PREROUTING -p tcp --dport 10235 -j DNAT --to-destination 10.0.1.235:22

Here I have updated the iptables rules of the NAT instance to route incoming traffic on port 10234 to port 22 of the first instance in our private subnet &, similarly, traffic on port 10235 to port 22 of the second instance.
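
With more than a couple of instances, typing these rules by hand gets tedious. Here is a minimal sketch that generates them, assuming the forwarding port is always 10000 plus the last octet of the private IP (which matches 10234 and 10235 above). dnat_rules is a hypothetical helper, and it only builds the command strings; you still run them on the NAT instance yourself.

```python
def dnat_rules(private_ips, base_port=10000, ssh_port=22):
    """Build one iptables DNAT command per private IP.

    Assumption: the listening port is base_port + last octet of the IP,
    so all IPs must come from the same /24 to avoid port clashes.
    """
    rules = []
    for ip in private_ips:
        last_octet = int(ip.split('.')[-1])
        listen_port = base_port + last_octet
        rules.append(
            'sudo iptables -t nat -A PREROUTING -p tcp --dport %d '
            '-j DNAT --to-destination %s:%d' % (listen_port, ip, ssh_port))
    return rules


for rule in dnat_rules(['10.0.1.234', '10.0.1.235']):
    print(rule)
```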

Step 5 : Configuring Security Group

Before we can SSH into the instances in the private subnet, we’ll need to update the security group of the NAT instance to accept incoming traffic on ports 10234 & 10235. Also, port 22 should be open for the target instances (we have already done this earlier).
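
To confirm that the security group change took effect before moving on, you can test the forwarded ports from your local machine. This is a minimal sketch using only Python 3’s standard socket module; port_is_open is a hypothetical helper, and a successful connection only shows that something accepted the TCP connection on that port.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, port_is_open('54.208.114.96', 10234) should return True once the rule is in place, with the IP replaced by your own Elastic IP.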

Step 6 : SSH into NAT on specified port number

Now we’ll SSH into our NAT instance again from our local system, as earlier, but with one difference: this time we’ll specify the port number in our SSH command.

ssh -p 10234 -i /home/himanshu/Downloads/training.pem ec2-user@54.208.114.96
ssh -p 10235 -i /home/himanshu/Downloads/training.pem ec2-user@54.208.114.96

This allows us to SSH directly into an instance in the private subnet. Check the last line of the screenshot below to see that the IP address belongs to one of the instances in the private subnet.

Terminal of Private Instance

This completes our tutorial.

Make sure you terminate your instances & delete the VPC to avoid any unnecessary charges.