From a2ec3e9877416e33abaa64efc3bb4b1547627209 Mon Sep 17 00:00:00 2001
From: kananinirav <30398499+kananinirav@users.noreply.github.com>
Date: Tue, 16 Aug 2022 10:20:01 +0900
Subject: [PATCH] [Modified] Table Of Contents added
---
README.md | 19 +-
sections/databases.md | 379 +++++++++++++------------
sections/deploying.md | 344 ++++++++++++-----------
sections/ec2_storage.md | 192 +++++++------
sections/other_compute.md | 273 +++++++++---------
sections/s3.md | 567 ++++++++++++++++++++------------------
6 files changed, 949 insertions(+), 825 deletions(-)
diff --git a/README.md b/README.md
index 5ea3c27..5e16332 100644
--- a/README.md
+++ b/README.md
@@ -4,16 +4,15 @@
### Table of contents
-- AWS Fundamentals
- - [What is Cloud Computing?](sections/cloud_computing.md)
- - [IAM: Identity Access & Management](sections/iam.md)
- - [EC2: Virtual Machines](sections/ec2.md)
- - [EC2 Instance Storage](sections/ec2_storage.md)
- - [Elastic Load Balancing & Auto Scaling Groups](sections/elb_asg.md)
- - [Amazon S3](sections/s3.md)
- - [Databases & Analytics](sections/databases.md)
- - [Other Compute Section](sections/other_compute.md)
- - [Deploying and Managing Infrastructure at Scale Section](sections/deploying.md)
+- [What is Cloud Computing?](sections/cloud_computing.md)
+- [IAM: Identity Access & Management](sections/iam.md)
+- [EC2: Virtual Machines](sections/ec2.md)
+- [EC2 Instance Storage](sections/ec2_storage.md)
+- [Elastic Load Balancing & Auto Scaling Groups](sections/elb_asg.md)
+- [Amazon S3](sections/s3.md)
+- [Databases & Analytics](sections/databases.md)
+- [Other Compute Section](sections/other_compute.md)
+- [Deploying and Managing Infrastructure at Scale Section](sections/deploying.md)
### Contributors
diff --git a/sections/databases.md b/sections/databases.md
index 8b76c83..fc14765 100644
--- a/sections/databases.md
+++ b/sections/databases.md
@@ -1,37 +1,64 @@
-# Databases
+# Databases & Analytics
+
+- [Databases & Analytics](#databases--analytics)
+ - [Databases Intro](#databases-intro)
+ - [Relational Databases](#relational-databases)
+ - [NoSQL Databases](#nosql-databases)
+ - [NoSQL data example: JSON](#nosql-data-example-json)
+ - [Databases & Shared Responsibility on AWS](#databases--shared-responsibility-on-aws)
+ - [AWS RDS Overview](#aws-rds-overview)
+ - [Advantage over using RDS versus deploying DB on EC2](#advantage-over-using-rds-versus-deploying-db-on-ec2)
+ - [RDS Deployments: Read Replicas, Multi-AZ](#rds-deployments-read-replicas-multi-az)
+ - [RDS Deployments: Multi-Region](#rds-deployments-multi-region)
+ - [Amazon Aurora](#amazon-aurora)
+ - [Amazon ElastiCache Overview](#amazon-elasticache-overview)
+ - [DynamoDB](#dynamodb)
+ - [DynamoDB Accelerator - DAX](#dynamodb-accelerator---dax)
+ - [DynamoDB - Global Tables](#dynamodb---global-tables)
+ - [Redshift Overview](#redshift-overview)
+ - [Amazon EMR](#amazon-emr)
+ - [Amazon Athena](#amazon-athena)
+ - [Amazon QuickSight](#amazon-quicksight)
+ - [DocumentDB](#documentdb)
+ - [Amazon Neptune](#amazon-neptune)
+ - [Amazon QLDB](#amazon-qldb)
+ - [Amazon Managed Blockchain](#amazon-managed-blockchain)
+ - [AWS Glue](#aws-glue)
+ - [DMS - Database Migration Service](#dms---database-migration-service)
+ - [Databases & Analytics Summary](#databases--analytics-summary)
## Databases Intro
-* Storing data on disk (EFS, EBS, EC2 Instance Store, S3) can have its limits
-* Sometimes, you want to store data in a database…
-* You can structure the data
-* You build indexes to efficiently query / search through the data
-* You define relationships between your datasets
-* Databases are optimized for a purpose and come with different features, shapes and constraint
+- Storing data on disk (EFS, EBS, EC2 Instance Store, S3) can have its limits
+- Sometimes, you want to store data in a database…
+- You can structure the data
+- You build indexes to efficiently query / search through the data
+- You define relationships between your datasets
+- Databases are optimized for a purpose and come with different features, shapes and constraints
## Relational Databases
-* Looks just like Excel spreadsheets, with links between them!
-* Can use the SQL language to perform queries / lookups
+- Look just like Excel spreadsheets, with links between them!
+- Can use the SQL language to perform queries / lookups
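The "spreadsheets with links between them" idea can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `students` and `enrollments` tables and the `courses_for` helper are hypothetical names, and SQLite stands in for any relational engine.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students(id),
        course TEXT NOT NULL
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO enrollments VALUES
        (1, 'Databases'), (1, 'Networking'), (2, 'Databases');
""")


def courses_for(name):
    """Follow the link from students to enrollments with a SQL JOIN."""
    rows = conn.execute(
        "SELECT e.course FROM enrollments e "
        "JOIN students s ON s.id = e.student_id "
        "WHERE s.name = ? ORDER BY e.course",
        (name,),
    ).fetchall()
    return [course for (course,) in rows]
```

Calling `courses_for("Ana")` walks the relationship between the two tables, which is the kind of lookup SQL is designed for.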
## NoSQL Databases
-* NoSQL = non-SQL = non relational databases
-* NoSQL databases are purpose built for specific data models and have flexible schemas for building modern applications.
-* Benefits:
- * Flexibility: easy to evolve data model
- * Scalability: designed to scale-out by using distributed clusters
- * High-performance: optimized for a specific data model
- * Highly functional: types optimized for the data model
-* Examples: Key-value, document, graph, in-memory, search databases
+- NoSQL = non-SQL = non-relational databases
+- NoSQL databases are purpose-built for specific data models and have flexible schemas for building modern applications.
+- Benefits:
+ - Flexibility: easy to evolve data model
+ - Scalability: designed to scale-out by using distributed clusters
+ - High-performance: optimized for a specific data model
+ - Highly functional: types optimized for the data model
+- Examples: Key-value, document, graph, in-memory, search databases
### NoSQL data example: JSON
-* JSON = JavaScript Object Notation
-* JSON is a common form of data that fits into a NoSQL model
-* Data can be nested
-* Fields can change over time
-* Support for new types: arrays, etc…
+- JSON = JavaScript Object Notation
+- JSON is a common form of data that fits into a NoSQL model
+- Data can be nested
+- Fields can change over time
+- Support for new types: arrays, etc…
```json
{
@@ -52,213 +79,213 @@
## Databases & Shared Responsibility on AWS
-* AWS offers use to manage different databases
-* Benefits include:
- * Quick Provisioning, High Availability, Vertical and Horizontal Scaling
- * Automated Backup & Restore, Operations, Upgrades
- * Operating System Patching is handled by AWS
- * Monitoring, alerting
-* Note: many databases technologies could be run on EC2, but you must handle yourself the resiliency, backup, patching, high availability, fault tolerance, scaling
+- AWS offers managed services for different databases
+- Benefits include:
+ - Quick Provisioning, High Availability, Vertical and Horizontal Scaling
+ - Automated Backup & Restore, Operations, Upgrades
+ - Operating System Patching is handled by AWS
+ - Monitoring, alerting
+- Note: many database technologies could be run on EC2, but you must then handle resiliency, backup, patching, high availability, fault tolerance and scaling yourself
## AWS RDS Overview
-* RDS stands for Relational Database Service
-* It’s a managed DB service for DB use SQL as a query language.
-* It allows you to create databases in the cloud that are managed by AWS
- * Postgres
- * MySQL
- * MariaDB
- * Oracle
- * Microsoft SQL Server
- * **Aurora (AWS Proprietary database)**
+- RDS stands for Relational Database Service
+- It’s a managed DB service for databases that use SQL as a query language.
+- It allows you to create databases in the cloud that are managed by AWS
+ - Postgres
+ - MySQL
+ - MariaDB
+ - Oracle
+ - Microsoft SQL Server
+ - **Aurora (AWS Proprietary database)**
### Advantage over using RDS versus deploying DB on EC2
-* RDS is a managed service:
- * Automated provisioning, OS patching
- * Continuous backups and restore to specific timestamp (Point in Time Restore)!
- * Monitoring dashboards
- * Read replicas for improved read performance
- * Multi AZ setup for DR (Disaster Recovery)
- * Maintenance windows for upgrades
- * Scaling capability (vertical and horizontal)
- * Storage backed by EBS (gp2 or io1)
-* BUT you can’t SSH into your instances
+- RDS is a managed service:
+ - Automated provisioning, OS patching
+ - Continuous backups and restore to specific timestamp (Point in Time Restore)!
+ - Monitoring dashboards
+ - Read replicas for improved read performance
+ - Multi AZ setup for DR (Disaster Recovery)
+ - Maintenance windows for upgrades
+ - Scaling capability (vertical and horizontal)
+ - Storage backed by EBS (gp2 or io1)
+- BUT you can’t SSH into your instances
-## Amazon Aurora
+### RDS Deployments: Read Replicas, Multi-AZ
-* Aurora is a proprietary technology from AWS (not open sourced)
-* PostgreSQL and MySQL are both supported as Aurora DB
-* Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
-* Aurora storage automatically grows in increments of 10GB, up to 64 TB.
-* Aurora costs more than RDS (20% more) – but is more efficient
-* Not in the free tier
-
-## RDS Deployments: Read Replicas, Multi-AZ
-
-Read Replicas | Multi-AZ
----- | ----
-Scale the read workload of your DB | Failover in case of AZ outage (high availability)
-Can create up to 5 Read Replicas | Data is only read/written to the main database
-Data is only written to the main DB | Can only have 1 other AZ as failover
+| Read Replicas | Multi-AZ |
+| ----------------------------------- | ------------------------------------------------- |
+| Scale the read workload of your DB | Failover in case of AZ outage (high availability) |
+| Can create up to 5 Read Replicas | Data is only read/written to the main database |
+| Data is only written to the main DB | Can only have 1 other AZ as failover |

-## RDS Deployments: Multi-Region
+### RDS Deployments: Multi-Region
-* Multi-Region (Read Replicas)
- * Disaster recovery in case of region issue
- * Local performance for global reads
- * Replication cost
+- Multi-Region (Read Replicas)
+ - Disaster recovery in case of region issue
+ - Local performance for global reads
+ - Replication cost

+## Amazon Aurora
+
+- Aurora is a proprietary technology from AWS (not open sourced)
+- PostgreSQL and MySQL are both supported as Aurora DB
+- Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
+- Aurora storage automatically grows in increments of 10GB, up to 64 TB.
+- Aurora costs more than RDS (20% more) – but is more efficient
+- Not in the free tier
+
## Amazon ElastiCache Overview
-* The same way RDS is to get managed Relational Databases…
-* ElastiCache is to get managed Redis or Memcached
-* Caches are in-memory databases with high performance, low latency
-* Helps reduce load off databases for read intensive workloads
-* AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring, failure recovery and backup
+- The same way RDS is to get managed Relational Databases…
+- ElastiCache is to get managed Redis or Memcached
+- Caches are in-memory databases with high performance, low latency
+- Helps reduce load off databases for read intensive workloads
+- AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring, failure recovery and backup
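How a cache reduces load on a database for read-intensive workloads can be sketched with the cache-aside pattern. This is a minimal sketch, not ElastiCache client code: a plain dictionary stands in for Redis or Memcached, and `CacheAside`, `load_from_db` and `db_reads` are hypothetical names.

```python
class CacheAside:
    """Check the cache first and fall back to the database on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # hypothetical slow database lookup
        self._cache = {}                   # stand-in for Redis or Memcached
        self.db_reads = 0                  # counts how often the database is hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]        # cache hit: no database load
        self.db_reads += 1
        value = self._load_from_db(key)    # cache miss: query the database once
        self._cache[key] = value
        return value

    def invalidate(self, key):
        # Call after a write so the next read fetches fresh data.
        self._cache.pop(key, None)


database = {"user:1": "Ana"}               # toy "database" for the example
cache = CacheAside(database.__getitem__)
first = cache.get("user:1")                # miss: reads the database
second = cache.get("user:1")               # hit: served from memory
```

Repeated reads of the same key reach the database only once. The trade-off is that writes must invalidate or update the cached entry.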
## DynamoDB
-* Fully Managed Highly available with replication across 3 AZ
-* NoSQL database - not a relational database
-* Scales to massive workloads, distributed “serverless” database
-* Millions of requests per seconds, trillions of row, 100s of TB of storage
-* Fast and consistent in performance
-* Single-digit millisecond latency – low latency retrieval
-* Integrated with IAM for security, authorization and administration
-* Low cost and auto scaling capabilities
-* Standard & Infrequent Access (IA) Table Class
+- Fully managed, highly available with replication across 3 AZs
+- NoSQL database - not a relational database
+- Scales to massive workloads, distributed “serverless” database
+- Millions of requests per second, trillions of rows, 100s of TB of storage
+- Fast and consistent in performance
+- Single-digit millisecond latency – low latency retrieval
+- Integrated with IAM for security, authorization and administration
+- Low cost and auto scaling capabilities
+- Standard & Infrequent Access (IA) Table Class
### DynamoDB Accelerator - DAX
-* Fully Managed in-memory cache for DynamoDB
-* 10x performance improvement – single- digit millisecond latency to microseconds latency – when accessing your DynamoDB tables
-* Secure, highly scalable & highly available
-* Difference with ElastiCache at the CCP level: DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases
+- Fully Managed in-memory cache for DynamoDB
+- 10x performance improvement – from single-digit millisecond latency to microsecond latency – when accessing your DynamoDB tables
+- Secure, highly scalable & highly available
+- Difference with ElastiCache at the CCP level: DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases
-### DynamoDB – Global Tables
+### DynamoDB - Global Tables
-* Make a DynamoDB table accessible with low latency in multiple-regions
-* Active-Active replication (read/write to any AWS Region)
+- Make a DynamoDB table accessible with low latency in multiple regions
+- Active-Active replication (read/write to any AWS Region)
## Redshift Overview
-* Redshift is based on PostgreSQL, but it’s not used for OLTP (Online Transactional Processing)
-* It’s OLAP – online analytical processing (analytics and data warehousing)
-* Load data once every hour, not every second
-* 10x better performance than other data warehouses, scale to PBs of data
-* Columnar storage of data (instead of row based)
-* Massively Parallel Query Execution (MPP), highly available
-* Pay as you go based on the instances provisioned
-* Has a SQL interface for performing the queries
-* BI tools such as AWS Quicksight or Tableau integrate with it
+- Redshift is based on PostgreSQL, but it’s not used for OLTP (Online Transactional Processing)
+- It’s OLAP – online analytical processing (analytics and data warehousing)
+- Load data once every hour, not every second
+- 10x better performance than other data warehouses, scale to PBs of data
+- Columnar storage of data (instead of row based)
+- Massively Parallel Query Execution (MPP), highly available
+- Pay as you go based on the instances provisioned
+- Has a SQL interface for performing the queries
+- BI tools such as Amazon QuickSight or Tableau integrate with it
## Amazon EMR
-* EMR stands for “Elastic MapReduce”
-* EMR helps creating Hadoop clusters (Big Data) to analyze and process vast amount of data
-* The clusters can be made of hundreds of EC2 instances
-* Also supports Apache Spark, HBase, Presto, Flink
-* EMR takes care of all the provisioning and configuration
-* Auto-scaling and integrated with Spot instances
-* Use cases: data processing, machine learning, web indexing, big data
+- EMR stands for “Elastic MapReduce”
+- EMR helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
+- The clusters can be made of hundreds of EC2 instances
+- Also supports Apache Spark, HBase, Presto, Flink
+- EMR takes care of all the provisioning and configuration
+- Auto-scaling and integrated with Spot instances
+- Use cases: data processing, machine learning, web indexing, big data
## Amazon Athena
-* Serverless query service to analyze data stored in Amazon S3
-* Uses standard SQL language to query the files
-* Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
-* Pricing: $5.00 per TB of data scanned
-* Use compressed or columnar data for cost-savings (less scan)
-* Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc...
-* **analyze data in S3 using serverless SQL, use Athena**
+- Serverless query service to analyze data stored in Amazon S3
+- Uses standard SQL language to query the files
+- Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
+- Pricing: $5.00 per TB of data scanned
+- Use compressed or columnar data for cost-savings (less scan)
+- Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc...
+- **analyze data in S3 using serverless SQL, use Athena**
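The per-TB pricing above makes the benefit of scanning less data easy to estimate. The sketch below uses the $5.00 per TB figure quoted in these notes and assumes 1 TB = 2^40 bytes. It ignores any per-query minimums or rounding, and the 10% columnar figure is purely hypothetical, so treat the result as a rough estimate.

```python
PRICE_PER_TB_USD = 5.00    # figure quoted in the notes; check current pricing
BYTES_PER_TB = 1024 ** 4   # assumption: 1 TB taken as 2**40 bytes


def scan_cost_usd(bytes_scanned):
    """Estimated query cost, proportional to the amount of data scanned."""
    return PRICE_PER_TB_USD * bytes_scanned / BYTES_PER_TB


# Scanning a full 2 TB dataset stored as raw CSV.
full_scan = scan_cost_usd(2 * BYTES_PER_TB)

# Hypothetical: a columnar format lets the same query read only 10% of the bytes.
columnar_scan = scan_cost_usd(2 * BYTES_PER_TB * 0.1)
```

Under these assumptions the columnar query costs about a tenth of the full scan, which is why compressed or columnar formats are recommended.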
## Amazon QuickSight
-* Serverless machine learning-powered business intelligence service to create interactive dashboards
-* Fast, automatically scalable, embeddable, with per-session pricing
-* Use cases:
- * Business analytics
- * Building visualizations
- * Perform ad-hoc analysis
- * Get business insights using data
-* Integrated with RDS, Aurora, Athena, Redshift, S3…
+- Serverless machine learning-powered business intelligence service to create interactive dashboards
+- Fast, automatically scalable, embeddable, with per-session pricing
+- Use cases:
+ - Business analytics
+ - Building visualizations
+ - Perform ad-hoc analysis
+ - Get business insights using data
+- Integrated with RDS, Aurora, Athena, Redshift, S3…
## DocumentDB
-* Aurora is an “AWS-implementation” of PostgreSQL / MySQL …
-* DocumentDB is the same for MongoDB (which is a NoSQL database)
-* MongoDB is used to store, query, and index JSON data
-* Similar “deployment concepts” as Aurora
-* Fully Managed, highly available with replication across 3 AZ
-* Aurora storage automatically grows in increments of 10GB, up to 64 TB.
-* Automatically scales to workloads with millions of requests per seconds
+- Aurora is an “AWS-implementation” of PostgreSQL / MySQL …
+- DocumentDB is the same for MongoDB (which is a NoSQL database)
+- MongoDB is used to store, query, and index JSON data
+- Similar “deployment concepts” as Aurora
+- Fully managed, highly available with replication across 3 AZs
+- DocumentDB storage automatically grows in increments of 10GB, up to 64 TB.
+- Automatically scales to workloads with millions of requests per second
## Amazon Neptune
-* Fully managed graph database
-* A popular graph dataset would be a social network
- * Users have friends
- * Posts have comments
- * Comments have likes from users
- * Users share and like posts…
-* Highly available across 3 AZ, with up to 15 read replicas
-* Build and run applications working with highly connected datasets – optimized for these complex and hard queries
-* Can store up to billions of relations and query the graph with milliseconds latency
-* Highly available with replications across multiple AZs
-* Great for knowledge graphs (Wikipedia), fraud detection, recommendation engines, social networking
+- Fully managed graph database
+- A popular graph dataset would be a social network
+ - Users have friends
+ - Posts have comments
+ - Comments have likes from users
+ - Users share and like posts…
+- Highly available across 3 AZs, with up to 15 read replicas
+- Build and run applications working with highly connected datasets – optimized for these complex and hard queries
+- Can store up to billions of relations and query the graph with millisecond latency
+- Highly available with replications across multiple AZs
+- Great for knowledge graphs (Wikipedia), fraud detection, recommendation engines, social networking
## Amazon QLDB
-* QLDB stands for ”Quantum Ledger Database”
-* A ledger is a book **recording financial transactions**
-* Fully Managed, Serverless, High available, Replication across 3 AZ
-* Used to **review history of all the changes made to your application data** over time
-* **Immutable** system: no entry can be removed or modified, cryptographically verifiable
-* 2-3x better performance than common ledger blockchain frameworks, manipulate data using SQL
-* Difference with Amazon Managed Blockchain: no decentralization component, in accordance with financial regulation rules
+- QLDB stands for “Quantum Ledger Database”
+- A ledger is a book **recording financial transactions**
+- Fully managed, serverless, highly available, with replication across 3 AZs
+- Used to **review history of all the changes made to your application data** over time
+- **Immutable** system: no entry can be removed or modified, cryptographically verifiable
+- 2-3x better performance than common ledger blockchain frameworks, manipulate data using SQL
+- Difference with Amazon Managed Blockchain: no decentralization component, in accordance with financial regulation rules
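The idea of an immutable, cryptographically verifiable journal can be sketched with a simple hash chain. This is a toy model using Python's standard `hashlib`, not a description of how QLDB is implemented. The helpers `entry_hash`, `append` and `verify` are hypothetical names.

```python
import hashlib
import json


def entry_hash(previous_hash, data):
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps({"prev": previous_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(journal, data):
    """Add an entry whose hash depends on the whole history so far."""
    previous_hash = journal[-1]["hash"] if journal else ""
    journal.append({"data": data, "hash": entry_hash(previous_hash, data)})


def verify(journal):
    """Recompute every hash; any edited or removed entry breaks the chain."""
    previous_hash = ""
    for entry in journal:
        if entry["hash"] != entry_hash(previous_hash, entry["data"]):
            return False
        previous_hash = entry["hash"]
    return True


journal = []
append(journal, {"account": "A", "amount": 100})
append(journal, {"account": "A", "amount": -30})
```

Because each hash covers the previous one, changing an old entry after the fact makes `verify` fail. That is the sense in which such a ledger is verifiable.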
## Amazon Managed Blockchain
-* Blockchain makes it possible to build applications where multiple parties can execute transactions without the need for a trusted, central authority.
-* Amazon Managed Blockchain is a managed service to:
- * Join public blockchain networks
- * Or create your own scalable private network
-* Compatible with the frameworks Hyperledger Fabric & Ethereum
+- Blockchain makes it possible to build applications where multiple parties can execute transactions without the need for a trusted, central authority.
+- Amazon Managed Blockchain is a managed service to:
+ - Join public blockchain networks
+ - Or create your own scalable private network
+- Compatible with the frameworks Hyperledger Fabric & Ethereum
## AWS Glue
-* Managed extract, transform, and load (ETL) service
-* Useful to prepare and transform data for analytics
-* Fully serverless service
-* Glue Data Catalog: catalog of datasets
- * can be used by Athena, Redshift, EMR
+- Managed extract, transform, and load (ETL) service
+- Useful to prepare and transform data for analytics
+- Fully serverless service
+- Glue Data Catalog: catalog of datasets
+ - can be used by Athena, Redshift, EMR
-## DMS – Database Migration Service
+## DMS - Database Migration Service
-* Quickly and securely migrate databases to AWS, resilient, self healing
-* The source database remains available during the migration
-* Supports:
- * Homogeneous migrations: ex Oracle to Oracle
- * Heterogeneous migrations: ex Microsoft SQL Server to Aurora
+- Quickly and securely migrate databases to AWS, resilient, self healing
+- The source database remains available during the migration
+- Supports:
+ - Homogeneous migrations: ex Oracle to Oracle
+ - Heterogeneous migrations: ex Microsoft SQL Server to Aurora
-## Databases & Analytics Summary in AWS
+## Databases & Analytics Summary
-* Relational Databases - OLTP: RDS & Aurora (SQL)
-* Differences between Multi-AZ, Read Replicas, Multi-Region
-* In-memory Database: ElastiCache
-* Key/Value Database: DynamoDB (serverless) & DAX (cache for DynamoDB)
-* Warehouse - OLAP: Redshift (SQL)
-* Hadoop Cluster: EMR
-* Athena: query data on Amazon S3 (serverless & SQL)
-* QuickSight: dashboards on your data (serverless)
-* DocumentDB: “Aurora for MongoDB” (JSON – NoSQL database)
-* Amazon QLDB: Financial Transactions Ledger (immutable journal, cryptographically verifiable)
-* Amazon Managed Blockchain: managed Hyperledger Fabric & Ethereum blockchains
-* Glue: Managed ETL (Extract Transform Load) and Data Catalog service
-* Database Migration: DMS
-* Neptune: graph database
\ No newline at end of file
+- Relational Databases - OLTP: RDS & Aurora (SQL)
+- Differences between Multi-AZ, Read Replicas, Multi-Region
+- In-memory Database: ElastiCache
+- Key/Value Database: DynamoDB (serverless) & DAX (cache for DynamoDB)
+- Warehouse - OLAP: Redshift (SQL)
+- Hadoop Cluster: EMR
+- Athena: query data on Amazon S3 (serverless & SQL)
+- QuickSight: dashboards on your data (serverless)
+- DocumentDB: “Aurora for MongoDB” (JSON – NoSQL database)
+- Amazon QLDB: Financial Transactions Ledger (immutable journal, cryptographically verifiable)
+- Amazon Managed Blockchain: managed Hyperledger Fabric & Ethereum blockchains
+- Glue: Managed ETL (Extract Transform Load) and Data Catalog service
+- Database Migration: DMS
+- Neptune: graph database
diff --git a/sections/deploying.md b/sections/deploying.md
index 47df87f..0e41e26 100644
--- a/sections/deploying.md
+++ b/sections/deploying.md
@@ -1,221 +1,243 @@
# Deploying and Managing Infrastructure at Scale
-## What is CloudFormation
+- [Deploying and Managing Infrastructure at Scale](#deploying-and-managing-infrastructure-at-scale)
+ - [What is CloudFormation?](#what-is-cloudformation)
+ - [Benefits of AWS CloudFormation](#benefits-of-aws-cloudformation)
+ - [CloudFormation Stack Designer](#cloudformation-stack-designer)
+ - [AWS Cloud Development Kit (CDK)](#aws-cloud-development-kit-cdk)
+ - [Developer problems on AWS](#developer-problems-on-aws)
+ - [AWS Elastic Beanstalk Overview](#aws-elastic-beanstalk-overview)
+ - [Elastic Beanstalk - Health Monitoring](#elastic-beanstalk---health-monitoring)
+ - [AWS CodeDeploy](#aws-codedeploy)
+ - [AWS CodeCommit](#aws-codecommit)
+ - [AWS CodeBuild](#aws-codebuild)
+ - [AWS CodePipeline](#aws-codepipeline)
+ - [AWS CodeArtifact](#aws-codeartifact)
+ - [AWS CodeStar](#aws-codestar)
+ - [AWS Cloud9](#aws-cloud9)
+ - [AWS Systems Manager (SSM)](#aws-systems-manager-ssm)
+ - [How Systems Manager works](#how-systems-manager-works)
+ - [Systems Manager - SSM Session Manager](#systems-manager---ssm-session-manager)
+ - [AWS OpsWorks](#aws-opsworks)
+ - [Deployment - Summary](#deployment---summary)
+ - [Developer Services - Summary](#developer-services---summary)
-* CloudFormation is a declarative way of outlining your AWS Infrastructure, for any resources (most of them are supported).
-* For example, within a CloudFormation template, you say:
- * I want a security group
- * I want two EC2 instances using this security group
- * I want an S3 bucket
- * I want a load balancer (ELB) in front of these machines
-* Then CloudFormation creates those for you, in the right order, with the exact configuration that you specify
+## What is CloudFormation?
+
+- CloudFormation is a declarative way of outlining your AWS Infrastructure, for any resources (most of them are supported).
+- For example, within a CloudFormation template, you say:
+ - I want a security group
+ - I want two EC2 instances using this security group
+ - I want an S3 bucket
+ - I want a load balancer (ELB) in front of these machines
+- Then CloudFormation creates those for you, in the right order, with the exact configuration that you specify
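Creating resources "in the right order" amounts to ordering them by their dependencies. The sketch below is a toy model using Python's standard `graphlib`, not CloudFormation's actual engine. The resource names and the `depends_on` mapping are hypothetical and mirror the example list above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical resources; each maps to the resources it depends on.
depends_on = {
    "SecurityGroup": set(),
    "Bucket": set(),
    "InstanceA": {"SecurityGroup"},
    "InstanceB": {"SecurityGroup"},
    "LoadBalancer": {"InstanceA", "InstanceB"},
}

# static_order() yields each resource only after everything it depends on.
creation_order = list(TopologicalSorter(depends_on).static_order())
```

The template author only declares what depends on what. The ordering falls out of the dependency graph, which is the point of declarative infrastructure.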
### Benefits of AWS CloudFormation
-* Infrastructure as code
- * No resources are manually created, which is excellent for control
- * Changes to the infrastructure are reviewed through code
-* Cost
- * Each resources within the stack is tagged with an identifier so you can easily see how much a stack costs you
- * You can estimate the costs of your resources using the CloudFormation template
- * Savings strategy: In Dev, you could automation deletion of templates at 5 PM and recreated at 8 AM, safely
-* Productivity
- * Ability to destroy and re-create an infrastructure on the cloud on the fly
- * Automated generation of Diagram for your templates!
- * Declarative programming (no need to figure out ordering and orchestration)
-* Don’t re-invent the wheel
- * Leverage existing templates on the web!
- * Leverage the documentation
-* Supports (almost) all AWS resources:
- * Everything we’ll see in this course is supported
- * You can use “custom resources” for resources that are not supported
+- Infrastructure as code
+ - No resources are manually created, which is excellent for control
+ - Changes to the infrastructure are reviewed through code
+- Cost
+ - Each resource within the stack is tagged with an identifier so you can easily see how much a stack costs you
+ - You can estimate the costs of your resources using the CloudFormation template
+ - Savings strategy: In Dev, you could automate the deletion of templates at 5 PM and recreate them at 8 AM, safely
+- Productivity
+ - Ability to destroy and re-create an infrastructure on the cloud on the fly
+ - Automated generation of diagrams for your templates!
+ - Declarative programming (no need to figure out ordering and orchestration)
+- Don’t re-invent the wheel
+ - Leverage existing templates on the web!
+ - Leverage the documentation
+- Supports (almost) all AWS resources:
+ - Everything we’ll see in this course is supported
+ - You can use “custom resources” for resources that are not supported
### CloudFormation Stack Designer
-* Example: WordPress CloudFormation Stack
-* We can see all the resources
-* We can see the relations between the components
+- Example: WordPress CloudFormation Stack
+- We can see all the resources
+- We can see the relations between the components
## AWS Cloud Development Kit (CDK)
-* Define your cloud infrastructure using a familiar language:
- * JavaScript/TypeScript, Python, Java, and .NET
-* The code is “compiled” into a CloudFormation template (JSON/YAML)
-* You can therefore deploy infrastructure and application runtime code together
- * Great for Lambda functions
- * Great for Docker containers in ECS / EKS
+- Define your cloud infrastructure using a familiar language:
+ - JavaScript/TypeScript, Python, Java, and .NET
+- The code is “compiled” into a CloudFormation template (JSON/YAML)
+- You can therefore deploy infrastructure and application runtime code together
+ - Great for Lambda functions
+ - Great for Docker containers in ECS / EKS
## Developer problems on AWS
-* Managing infrastructure
-* Deploying Code
-* Configuring all the databases, load balancers, etc
-* Scaling concerns
-* Most web apps have the same architecture (ALB + ASG)
-* All the developers want is for their code to run!
-* Possibly, consistently across different applications and environments
+- Managing infrastructure
+- Deploying Code
+- Configuring all the databases, load balancers, etc
+- Scaling concerns
+- Most web apps have the same architecture (ALB + ASG)
+- All the developers want is for their code to run!
+- Possibly, consistently across different applications and environments
## AWS Elastic Beanstalk Overview
-* Elastic Beanstalk is a developer centric view of deploying an application on AWS
-* It uses all the component’s we’ve seen before: EC2, ASG, ELB, RDS, etc…
-* But it’s all in one view that’s easy to make sense of!
-* We still have full control over the configuration
-* Beanstalk = Platform as a Service (PaaS)
-* Beanstalk is free but you pay for the underlying instances
-* Managed service
- * Instance configuration / OS is handled by Beanstalk
- * Deployment strategy is configurable but performed by Elastic Beanstalk
- * Capacity provisioning
- * Load balancing & auto-scaling
-* Application health-monitoring & responsiveness
-* Just the application code is the responsibility of the developer
-* Three architecture models:
- * Single Instance deployment: good for dev
- * LB + ASG: great for production or pre-production web applications
- * ASG only: great for non-web apps in production (workers, etc..)
+- Elastic Beanstalk is a developer-centric view of deploying an application on AWS
+- It uses all the components we’ve seen before: EC2, ASG, ELB, RDS, etc…
+- But it’s all in one view that’s easy to make sense of!
+- We still have full control over the configuration
+- Beanstalk = Platform as a Service (PaaS)
+- Beanstalk is free but you pay for the underlying instances
+- Managed service
+ - Instance configuration / OS is handled by Beanstalk
+ - Deployment strategy is configurable but performed by Elastic Beanstalk
+ - Capacity provisioning
+ - Load balancing & auto-scaling
+- Application health-monitoring & responsiveness
+- Just the application code is the responsibility of the developer
+- Three architecture models:
+ - Single Instance deployment: good for dev
+ - LB + ASG: great for production or pre-production web applications
+ - ASG only: great for non-web apps in production (workers, etc.)
-* Support for many platforms:
- * Go
- * Java SE
- * Java with Tomcat
- * .NET on Windows Server with IIS
- * Node.js
- * PHP
- * Python
- * Ruby
- * Packer Builder
- * Single Container Docker
- * Multi-Container Docker
- * Preconfigured Docker
+- Support for many platforms:
+ - Go
+ - Java SE
+ - Java with Tomcat
+ - .NET on Windows Server with IIS
+ - Node.js
+ - PHP
+ - Python
+ - Ruby
+ - Packer Builder
+ - Single Container Docker
+ - Multi-Container Docker
+ - Preconfigured Docker
-### Elastic Beanstalk – Health Monitoring
+### Elastic Beanstalk - Health Monitoring
-* Health agent pushes metrics to CloudWatch
-* Checks for app health, publishes health events
+- Health agent pushes metrics to CloudWatch
+- Checks for app health, publishes health events
## AWS CodeDeploy
-* We want to deploy our application automatically
-* Works with EC2 Instances
-* Works with On-Premises Servers
-* Hybrid service
-* Servers / Instances must be provisioned and configured ahead of time with the CodeDeploy Agent
+- We want to deploy our application automatically
+- Works with EC2 Instances
+- Works with On-Premises Servers
+- Hybrid service
+- Servers / Instances must be provisioned and configured ahead of time with the CodeDeploy Agent
## AWS CodeCommit
-* Before pushing the application code to servers, it needs to be stored somewhere
-* Developers usually store code in a repository, using the Git technology
-* A famous public offering is GitHub, AWS’ competing product is CodeCommit
-* CodeCommit:
- * Source-control service that hosts Git-based repositories
- * Makes it easy to collaborate with others on code
- * The code changes are automatically versioned
-* Benefits:
- * Fully managed
- * Scalable & highly available
- * Private, Secured, Integrated with AWS
+- Before pushing the application code to servers, it needs to be stored somewhere
+- Developers usually store code in a repository, using Git
+- A famous public offering is GitHub; AWS’ competing product is CodeCommit
+- CodeCommit:
+ - Source-control service that hosts Git-based repositories
+ - Makes it easy to collaborate with others on code
+ - The code changes are automatically versioned
+- Benefits:
+ - Fully managed
+ - Scalable & highly available
+ - Private, Secured, Integrated with AWS
## AWS CodeBuild
-* Code building service in the cloud (name is obvious)
-* Compiles source code, run tests, and produces packages that are ready to be deployed (by CodeDeploy for example)
-* Benefits:
- * Fully managed, serverless
- * Continuously scalable & highly available
- * Secure
- * Pay-as-you-go pricing – only pay for the build time
+- Code building service in the cloud (name is obvious)
+- Compiles source code, runs tests, and produces packages that are ready to be deployed (by CodeDeploy for example)
+- Benefits:
+ - Fully managed, serverless
+ - Continuously scalable & highly available
+ - Secure
+ - Pay-as-you-go pricing – only pay for the build time
## AWS CodePipeline
-* Orchestrate the different steps to have the code automatically pushed to production
-* Code => Build => Test => Provision => Deploy
-* Basis for CICD (Continuous Integration & Continuous Delivery)
-* Benefits:
- * Fully managed, compatible with CodeCommit, CodeBuild, CodeDeploy, Elastic Beanstalk, CloudFormation, GitHub, 3rd-party services (GitHub…) & custom plugins…
- * Fast delivery & rapid updates
+- Orchestrate the different steps to have the code automatically pushed to production
+- Code => Build => Test => Provision => Deploy
+- Basis for CICD (Continuous Integration & Continuous Delivery)
+- Benefits:
+ - Fully managed, compatible with CodeCommit, CodeBuild, CodeDeploy, Elastic Beanstalk, CloudFormation, 3rd-party services (GitHub…) & custom plugins…
+ - Fast delivery & rapid updates
-* CodePipeline: orchestration layer
- * CodeCommit => CodeBuild => CodeDeploy => Elastic Beanstalk
+- CodePipeline: orchestration layer
+ - CodeCommit => CodeBuild => CodeDeploy => Elastic Beanstalk
## AWS CodeArtifact
-* Software packages depend on each other to be built (also called code dependencies), and new ones are created
-* Storing and retrieving these dependencies is called artifact management
-* Traditionally you need to setup your own artifact management system
-* CodeArtifact is a secure, scalable, and cost-effective artifact management for software development
-* Works with common dependency management tools such as Maven, Gradle, npm, yarn, twine, pip, and NuGet
-* Developers and CodeBuild can then retrieve dependencies straight from CodeArtifact
+- Software packages depend on each other to be built (also called code dependencies), and new ones are created
+- Storing and retrieving these dependencies is called artifact management
+- Traditionally you need to set up your own artifact management system
+- CodeArtifact is a secure, scalable, and cost-effective artifact management service for software development
+- Works with common dependency management tools such as Maven, Gradle, npm, yarn, twine, pip, and NuGet
+- Developers and CodeBuild can then retrieve dependencies straight from CodeArtifact
## AWS CodeStar
-* Unified UI to easily manage software development activities in one place
-* “Quick way” to get started to correctly set-up CodeCommit, CodePipeline, CodeBuild, CodeDeploy, Elastic Beanstalk, EC2, etc…
-* Can edit the code ”in-the-cloud” using AWS Cloud9
+- Unified UI to easily manage software development activities in one place
+- “Quick way” to get started and correctly set up CodeCommit, CodePipeline, CodeBuild, CodeDeploy, Elastic Beanstalk, EC2, etc…
+- Can edit the code “in-the-cloud” using AWS Cloud9
## AWS Cloud9
-* AWS Cloud9 is a cloud IDE (Integrated Development Environment) for writing, running and debugging code
-* “Classic” IDE (like IntelliJ, Visual Studio Code…) are downloaded on a computer before being used
-* A cloud IDE can be used within a web browser, meaning you can work on your projects from your office, home, or anywhere with internet with no setup necessary
-* AWS Cloud9 also allows for code collaboration in real-time (pair programming)
+- AWS Cloud9 is a cloud IDE (Integrated Development Environment) for writing, running and debugging code
+- “Classic” IDEs (like IntelliJ, Visual Studio Code…) are downloaded on a computer before being used
+- A cloud IDE can be used within a web browser, meaning you can work on your projects from your office, home, or anywhere with internet with no setup necessary
+- AWS Cloud9 also allows for code collaboration in real-time (pair programming)
## AWS Systems Manager (SSM)
-* Helps you manage your EC2 and On-Premises systems at scale
-* Another Hybrid AWS service
-* Get operational insights about the state of your infrastructure
-* Suite of 10+ products
-* Most important features are:
- * Patching automation for enhanced compliance
- * Run commands across an entire fleet of servers
- * Store parameter configuration with the SSM Parameter Store
-* Works for both Windows and Linux OS
+- Helps you manage your EC2 and On-Premises systems at scale
+- Another Hybrid AWS service
+- Get operational insights about the state of your infrastructure
+- Suite of 10+ products
+- Most important features are:
+ - Patching automation for enhanced compliance
+ - Run commands across an entire fleet of servers
+ - Store parameter configuration with the SSM Parameter Store
+- Works for both Windows and Linux OS
### How Systems Manager works
-* We need to install the SSM agent onto the systems we control
-* Installed by default on Amazon Linux AMI & some Ubuntu AMI
-* If an instance can’t be controlled with SSM, it’s probably an issue with the SSM agent!
-* Thanks to the SSM agent, we can run commands, patch & configure our servers
+- We need to install the SSM agent onto the systems we control
+- Installed by default on Amazon Linux AMIs & some Ubuntu AMIs
+- If an instance can’t be controlled with SSM, it’s probably an issue with the SSM agent!
+- Thanks to the SSM agent, we can run commands, patch & configure our servers
-### Systems Manager – SSM Session Manager
+### Systems Manager - SSM Session Manager
-* Allows you to start a secure shell on your EC2 and on-premises servers
-* No SSH access, bastion hosts, or SSH keys needed
-* No port 22 needed (better security)
-* Supports Linux, macOS, and Windows
-* Send session log data to S3 or CloudWatch Logs
+- Allows you to start a secure shell on your EC2 and on-premises servers
+- No SSH access, bastion hosts, or SSH keys needed
+- No port 22 needed (better security)
+- Supports Linux, macOS, and Windows
+- Send session log data to S3 or CloudWatch Logs
## AWS OpsWorks
-* Chef & Puppet help you perform server configuration automatically, or repetitive actions
-* They work great with EC2 & On-Premises VM
-* AWS OpsWorks = Managed Chef & Puppet
-* It’s an alternative to AWS SSM
-* Only provision standard AWS resources:
- * EC2 Instances, Databases, Load Balancers, EBS volumes…
-* **Chef or Puppet needed => AWS OpsWorks**
+- Chef & Puppet help you perform server configuration automatically, or repetitive actions
+- They work great with EC2 & On-Premises VMs
+- AWS OpsWorks = Managed Chef & Puppet
+- It’s an alternative to AWS SSM
+- Only provisions standard AWS resources:
+ - EC2 Instances, Databases, Load Balancers, EBS volumes…
+- **Chef or Puppet needed => AWS OpsWorks**
## Deployment - Summary
-* CloudFormation: (AWS only)
- * Infrastructure as Code, works with almost all of AWS resources
- * Repeat across Regions & Accounts
-* Beanstalk: (AWS only)
- * Platform as a Service (PaaS), limited to certain programming languages or Docker
- * Deploy code consistently with a known architecture: ex, ALB + EC2 + RDS
-* CodeDeploy (hybrid): deploy & upgrade any application onto servers
-* Systems Manager (hybrid): patch, configure and run commands at scale
-* OpsWorks (hybrid): managed Chef and Puppet in AWS
+- CloudFormation: (AWS only)
+ - Infrastructure as Code, works with almost all of AWS resources
+ - Repeat across Regions & Accounts
+- Beanstalk: (AWS only)
+ - Platform as a Service (PaaS), limited to certain programming languages or Docker
+ - Deploy code consistently with a known architecture: ex, ALB + EC2 + RDS
+- CodeDeploy (hybrid): deploy & upgrade any application onto servers
+- Systems Manager (hybrid): patch, configure and run commands at scale
+- OpsWorks (hybrid): managed Chef and Puppet in AWS
## Developer Services - Summary
-* CodeCommit: Store code in private git repository (version controlled)
-* CodeBuild: Build & test code in AWS
-* CodeDeploy: Deploy code onto servers
-* CodePipeline: Orchestration of pipeline (from code to build to deploy)
-* CodeArtifact: Store software packages / dependencies on AWS
-* CodeStar: Unified view for allowing developers to do CICD and code
-* Cloud9: Cloud IDE (Integrated Development Environment) with collab
-* AWS CDK: Define your cloud infrastructure using a programming language
+- CodeCommit: Store code in private git repository (version controlled)
+- CodeBuild: Build & test code in AWS
+- CodeDeploy: Deploy code onto servers
+- CodePipeline: Orchestration of pipeline (from code to build to deploy)
+- CodeArtifact: Store software packages / dependencies on AWS
+- CodeStar: Unified view for allowing developers to do CICD and code
+- Cloud9: Cloud IDE (Integrated Development Environment) with collab
+- AWS CDK: Define your cloud infrastructure using a programming language
diff --git a/sections/ec2_storage.md b/sections/ec2_storage.md
index 8f5ec80..9c6dd14 100644
--- a/sections/ec2_storage.md
+++ b/sections/ec2_storage.md
@@ -1,136 +1,154 @@
# EC2 Instance Storage
-* [EBS volumes](#ebs-volume)
-* [EFS: network file system, can be attached to 100s of instances in a region](#efs-elastic-file-system)
-* [EFS-IA: cost-optimized storage class for infrequent accessed files](#efs-infrequent-access-efs-ia)
-* [FSx for Windows: Network File System for Windows servers](#amazon-fsx-for-windows-file-server)
-* [FSx for Lustre: High Performance Computing Linux file system](#amazon-fsx-for-lustre)
+- [EC2 Instance Storage](#ec2-instance-storage)
+ - [EBS Volumes](#ebs-volumes)
+ - [What’s an EBS Volume?](#whats-an-ebs-volume)
+ - [EBS Volume](#ebs-volume)
+ - [EBS – Delete on Termination attribute](#ebs--delete-on-termination-attribute)
+ - [EBS Snapshots](#ebs-snapshots)
+ - [EBS Snapshots Features](#ebs-snapshots-features)
+ - [EFS: Elastic File System](#efs-elastic-file-system)
+ - [EFS Infrequent Access (EFS-IA)](#efs-infrequent-access-efs-ia)
+ - [Amazon FSx – Overview](#amazon-fsx--overview)
+ - [Amazon FSx for Windows File Server](#amazon-fsx-for-windows-file-server)
+ - [Amazon FSx for Lustre](#amazon-fsx-for-lustre)
+ - [EC2 Instance Store](#ec2-instance-store)
+ - [Shared Responsibility Model for EC2 Storage](#shared-responsibility-model-for-ec2-storage)
+ - [AMI Overview](#ami-overview)
+ - [AMI Process (from an EC2 instance)](#ami-process-from-an-ec2-instance)
+ - [EC2 Image Builder](#ec2-image-builder)
+
+- EBS: Elastic Block Store; a volume is a network drive you can attach to your instances while they run
+- EFS: network file system, can be attached to 100s of instances in a region
+- EFS-IA: cost-optimized storage class for infrequently accessed files
+- FSx for Windows: Network File System for Windows servers
+- FSx for Lustre: High Performance Computing Linux file system
## EBS Volumes
### What’s an EBS Volume?
-* An EBS (Elastic Block Store) Volume is a network drive you can attach to your instances while they run
-* It allows your instances to persist data, even after their termination
-* They can only be mounted to one instance at a time (at the CCP level)
-* They are bound to a specific availability zone
-* Analogy: Think of them as a “network USB stick”
-* Free tier: 30 GB of free EBS storage of type General Purpose (SSD) or Magnetic per month
+- An EBS (Elastic Block Store) Volume is a network drive you can attach to your instances while they run
+- It allows your instances to persist data, even after their termination
+- They can only be mounted to one instance at a time (at the CCP level)
+- They are bound to a specific availability zone
+- Analogy: Think of them as a “network USB stick”
+- Free tier: 30 GB of free EBS storage of type General Purpose (SSD) or Magnetic per month
### EBS Volume
-* It’s a network drive (i.e. not a physical drive)
- * It uses the network to communicate the instance, which means there might be a bit of latency
- * It can be detached from an EC2 instance and attached to another one quickly
-* It’s locked to an Availability Zone (AZ)
- * An EBS Volume in us-east-1a cannot be attached to us-east-1b
- * To move a volume across, you first need to snapshot it
-* Have a provisioned capacity (size in GBs, and IOPS)
- * You get billed for all the provisioned capacity
- * You can increase the capacity of the drive over time
+- It’s a network drive (i.e. not a physical drive)
+ - It uses the network to communicate with the instance, which means there might be a bit of latency
+ - It can be detached from an EC2 instance and attached to another one quickly
+- It’s locked to an Availability Zone (AZ)
+ - An EBS Volume in us-east-1a cannot be attached to us-east-1b
+ - To move a volume across, you first need to snapshot it
+- It has a provisioned capacity (size in GBs, and IOPS)
+ - You get billed for all the provisioned capacity
+ - You can increase the capacity of the drive over time
### EBS – Delete on Termination attribute
-* Controls the EBS behaviour when an EC2 instance terminates
- * By default, the root EBS volume is deleted (attribute enabled)
- * By default, any other attached EBS volume is not deleted (attribute disabled)
-* This can be controlled by the AWS console / AWS CLI
-* Use case: preserve root volume when instance is terminated
+- Controls the EBS behaviour when an EC2 instance terminates
+ - By default, the root EBS volume is deleted (attribute enabled)
+ - By default, any other attached EBS volume is not deleted (attribute disabled)
+- This can be controlled by the AWS console / AWS CLI
+- Use case: preserve root volume when instance is terminated
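The default behaviour listed above can be modelled with a small, self-contained Python sketch. The `Volume` class and the `volumes_after_termination` helper are hypothetical names used only for illustration; they are not part of any AWS SDK, and the sketch makes no AWS calls.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    """Toy model of an EBS volume attached to one EC2 instance."""
    name: str
    is_root: bool
    # None means "use the default for this kind of volume".
    delete_on_termination: Optional[bool] = None

    def deleted_on_terminate(self) -> bool:
        if self.delete_on_termination is None:
            # Default: the root volume is deleted, other volumes are kept.
            return self.is_root
        return self.delete_on_termination


def volumes_after_termination(volumes: List[Volume]) -> List[str]:
    """Return the names of the volumes that survive instance termination."""
    return [v.name for v in volumes if not v.deleted_on_terminate()]
```

With the defaults, only the non-root volume survives; setting `delete_on_termination=False` on the root volume reproduces the "preserve root volume" use case.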
### EBS Snapshots
-* Make a backup (snapshot) of your EBS volume at a point in time
-* Not necessary to detach volume to do snapshot, but recommended
-* Can copy snapshots across AZ or Region
+- Make a backup (snapshot) of your EBS volume at a point in time
+- Not necessary to detach volume to do snapshot, but recommended
+- Can copy snapshots across AZ or Region
### EBS Snapshots Features
-* EBS Snapshot Archive
- * Move a Snapshot to an ”archive tier” that is 75% cheaper
- * Takes within 24 to 72 hours for restoring the archive
-* Recycle Bin for EBS Snapshots
- * Setup rules to retain deleted snapshots so you can recover them after an accidental deletion
- * Specify retention (from 1 day to 1 year)
+- EBS Snapshot Archive
+ - Move a Snapshot to an “archive tier” that is 75% cheaper
+ - Restoring from the archive takes 24 to 72 hours
+- Recycle Bin for EBS Snapshots
+ - Set up rules to retain deleted snapshots so you can recover them after an accidental deletion
+ - Specify retention (from 1 day to 1 year)
## EFS: Elastic File System
-* Managed NFS (network file system) that can be mounted on 100s of EC2
-* EFS works with Linux EC2 instances in multi-AZ
-* Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning
+- Managed NFS (network file system) that can be mounted on 100s of EC2
+- EFS works with Linux EC2 instances in multi-AZ
+- Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning
## EFS Infrequent Access (EFS-IA)
-* Storage class that is cost-optimized for files not accessed every day
-* Up to 92% lower cost compared to EFS Standard
-* EFS will automatically move your files to EFS-IA based on the last time they were accessed
-* Enable EFS-IA with a Lifecycle Policy
-* Example: move files that are not accessed for 60 days to EFS-IA
-* Transparent to the applications accessing EFS
+- Storage class that is cost-optimized for files not accessed every day
+- Up to 92% lower cost compared to EFS Standard
+- EFS will automatically move your files to EFS-IA based on the last time they were accessed
+- Enable EFS-IA with a Lifecycle Policy
+- Example: move files that are not accessed for 60 days to EFS-IA
+- Transparent to the applications accessing EFS
## Amazon FSx – Overview
-* Launch 3rd party high-performance file systems on AWS
-* Fully managed service
- * FSx for Lustre
- * FSx for Windows File Server
- * FSx for NetApp ONTAP
+- Launch 3rd party high-performance file systems on AWS
+- Fully managed service
+ - FSx for Lustre
+ - FSx for Windows File Server
+ - FSx for NetApp ONTAP
### Amazon FSx for Windows File Server
-* A fully managed, highly reliable, and scalable Windows native shared file system
-* Built on Windows File Server
-* Supports SMB protocol & Windows NTFS
-* Integrated with Microsoft Active Directory
-* Can be accessed from AWS or your on-premise infrastructure
+- A fully managed, highly reliable, and scalable Windows native shared file system
+- Built on Windows File Server
+- Supports SMB protocol & Windows NTFS
+- Integrated with Microsoft Active Directory
+- Can be accessed from AWS or your on-premises infrastructure
### Amazon FSx for Lustre
-* A fully managed, high-performance, scalable file storage for High Performance Computing (HPC)
-* The name Lustre is derived from “Linux” and “cluster”
-* Machine Learning, Analytics, Video Processing, Financial Modeling
-* Scales up to 100s GB/s, millions of IOPS, sub-ms latencies
+- A fully managed, high-performance, scalable file storage for High Performance Computing (HPC)
+- The name Lustre is derived from “Linux” and “cluster”
+- Machine Learning, Analytics, Video Processing, Financial Modeling
+- Scales up to 100s GB/s, millions of IOPS, sub-ms latencies
## EC2 Instance Store
-* EBS volumes are network drives with good but “limited” performance
-* If you need a high-performance hardware disk, use EC2 Instance Store
-* Better I/O performance
-* EC2 Instance Store lose their storage if they’re stopped (ephemeral)
-* Good for buffer / cache / scratch data / temporary content
-* Risk of data loss if hardware fails
-* Backups and Replication are your responsibility
+- EBS volumes are network drives with good but “limited” performance
+- If you need a high-performance hardware disk, use EC2 Instance Store
+- Better I/O performance
+- EC2 Instance Store volumes lose their data if the instance is stopped (ephemeral)
+- Good for buffer / cache / scratch data / temporary content
+- Risk of data loss if hardware fails
+- Backups and Replication are your responsibility
## Shared Responsibility Model for EC2 Storage
-AWS | USER
----- | ----
-Infrastructure | Setting up backup / snapshot procedures
-Replication for data for EBS volumes & EFS drives | Setting up data encryption
-Replacing faulty hardware | Responsibility of any data on the drives
-Ensuring their employees cannot access your data | Understanding the risk of using EC2 Instance Store
+| AWS | USER |
+| ------------------------------------------------- | -------------------------------------------------- |
+| Infrastructure | Setting up backup / snapshot procedures |
+| Replication of data for EBS volumes & EFS drives  | Setting up data encryption                         |
+| Replacing faulty hardware | Responsibility of any data on the drives |
+| Ensuring their employees cannot access your data | Understanding the risk of using EC2 Instance Store |
## AMI Overview
-* AMI = Amazon Machine Image
-* AMI are a customization of an EC2 instance
- * You add your own software, configuration, operating system, monitoring…
- * Faster boot / configuration time because all your software is pre-packaged
-* AMI are built for a specific region (and can be copied across regions)
-* You can launch EC2 instances from:
- * A Public AMI: AWS provided
- * Your own AMI: you make and maintain them yourself
- * An AWS Marketplace AMI: an AMI someone else made (and potentially sells)
+- AMI = Amazon Machine Image
+- AMIs are a customization of an EC2 instance
+ - You add your own software, configuration, operating system, monitoring…
+ - Faster boot / configuration time because all your software is pre-packaged
+- AMIs are built for a specific region (and can be copied across regions)
+- You can launch EC2 instances from:
+ - A Public AMI: AWS provided
+ - Your own AMI: you make and maintain them yourself
+ - An AWS Marketplace AMI: an AMI someone else made (and potentially sells)
### AMI Process (from an EC2 instance)
-* Start an EC2 instance and customize it
-* Stop the instance (for data integrity)
-* Build an AMI – this will also create EBS snapshots
-* Launch instances from other AMIs
+- Start an EC2 instance and customize it
+- Stop the instance (for data integrity)
+- Build an AMI – this will also create EBS snapshots
+- Launch instances from other AMIs
## EC2 Image Builder
-* Used to automate the creation of Virtual Machines or container images
-* => Automate the creation, maintain, validate and test EC2 AMIs
-* Can be run on a schedule (weekly, whenever packages are updated, etc…)
-* Free service (only pay for the underlying resources)
+- Used to automate the creation of Virtual Machines or container images
+- => Automate the creation, maintenance, validation and testing of EC2 AMIs
+- Can be run on a schedule (weekly, whenever packages are updated, etc…)
+- Free service (only pay for the underlying resources)
diff --git a/sections/other_compute.md b/sections/other_compute.md
index 29f82a0..de010b4 100644
--- a/sections/other_compute.md
+++ b/sections/other_compute.md
@@ -1,173 +1,192 @@
# Other Compute
-What is Docker?
+- [Other Compute](#other-compute)
+ - [What is Docker?](#what-is-docker)
+ - [Where Docker images are stored?](#where-docker-images-are-stored)
+ - [Docker versus Virtual Machines](#docker-versus-virtual-machines)
+ - [ECS](#ecs)
+ - [Fargate](#fargate)
+ - [ECR](#ecr)
+ - [What’s serverless?](#whats-serverless)
+ - [Why AWS Lambda ?](#why-aws-lambda-)
+ - [Benefits of AWS Lambda](#benefits-of-aws-lambda)
+ - [AWS Lambda language support](#aws-lambda-language-support)
+ - [AWS Lambda Pricing: example](#aws-lambda-pricing-example)
+ - [Amazon API Gateway](#amazon-api-gateway)
+ - [AWS Batch](#aws-batch)
+ - [Batch vs Lambda](#batch-vs-lambda)
+ - [Amazon Lightsail](#amazon-lightsail)
+ - [Lambda Summary](#lambda-summary)
+ - [Other Compute Summary](#other-compute-summary)
-* Docker is a software development platform to deploy apps
-* Apps are packaged in containers that can be run on any OS
-* Apps run the same, regardless of where they’re run
- * Any machine
- * No compatibility issues
- * Predictable behavior
- * Less work
- * Easier to maintain and deploy
- * Works with any language, any OS, any technology
-* Scale containers up and down very quickly (seconds)
+## What is Docker?
-Where Docker images are stored?
+- Docker is a software development platform to deploy apps
+- Apps are packaged in containers that can be run on any OS
+- Apps run the same, regardless of where they’re run
+ - Any machine
+ - No compatibility issues
+ - Predictable behavior
+ - Less work
+ - Easier to maintain and deploy
+ - Works with any language, any OS, any technology
+- Scale containers up and down very quickly (seconds)
-* Docker images are stored in Docker Repositories
-* Public: Docker Hub
- * Find base images for many technologies or OS:
- * Ubuntu
- * MySQL
- * NodeJS, Java…
-* Private: Amazon ECR (Elastic Container Registry)
+### Where Docker images are stored?
-## Docker versus Virtual Machines
+- Docker images are stored in Docker Repositories
+- Public: Docker Hub
+ - Find base images for many technologies or OS:
+ - Ubuntu
+ - MySQL
+ - NodeJS, Java…
+- Private: Amazon ECR (Elastic Container Registry)
-* Docker is ”sort of” a virtualization technology, but not exactly
-* Resources are shared with the host => many containers on one server
+### Docker versus Virtual Machines
+
+- Docker is “sort of” a virtualization technology, but not exactly
+- Resources are shared with the host => many containers on one server
## ECS
-* ECS = Elastic Container Service
-* Launch Docker containers on AWS
-* You must provision & maintain the infrastructure (the EC2 instances)
-* AWS takes care of starting / stopping containers
-* Has integrations with the Application Load Balancer
+- ECS = Elastic Container Service
+- Launch Docker containers on AWS
+- You must provision & maintain the infrastructure (the EC2 instances)
+- AWS takes care of starting / stopping containers
+- Has integrations with the Application Load Balancer
## Fargate
-* Launch Docker containers on AWS
-* You do not provision the infrastructure (no EC2 instances to manage) – simpler!
-* Serverless offering
-* AWS just runs containers for you based on the CPU / RAM you need
+- Launch Docker containers on AWS
+- You do not provision the infrastructure (no EC2 instances to manage) – simpler!
+- Serverless offering
+- AWS just runs containers for you based on the CPU / RAM you need
## ECR
-* Elastic Container Registry
-* Private Docker Registry on AWS
-* This is where you store your Docker images so they can be run by ECS or Fargate
+- Elastic Container Registry
+- Private Docker Registry on AWS
+- This is where you store your Docker images so they can be run by ECS or Fargate
## What’s serverless?
-* Serverless is a new paradigm in which the developers don’t have to manage servers anymore…
-* They just deploy code
-* They just deploy… functions !
-* Initially... Serverless == FaaS (Function as a Service)
-* Serverless was pioneered by AWS Lambda but now also includes anything that’s managed: “databases, messaging, storage, etc.”
-* Serverless does not mean there are no servers…
-* it means you just don’t manage / provision / see them
+- Serverless is a new paradigm in which the developers don’t have to manage servers anymore…
+- They just deploy code
+- They just deploy… functions !
+- Initially... Serverless == FaaS (Function as a Service)
+- Serverless was pioneered by AWS Lambda but now also includes anything that’s managed: “databases, messaging, storage, etc.”
+- Serverless does not mean there are no servers…
+- it means you just don’t manage / provision / see them
## Why AWS Lambda ?
-EC2 | Lambda
----- | ----
-Virtual Servers in the Cloud | Virtual functions – no servers to manage!
-Limited by RAM and CPU | Limited by time - short executions
-Continuously running | Run on-demand
-Scaling means intervention to add / remove servers | Scaling is automated!
+| EC2 | Lambda |
+| -------------------------------------------------- | ----------------------------------------- |
+| Virtual Servers in the Cloud | Virtual functions – no servers to manage! |
+| Limited by RAM and CPU | Limited by time - short executions |
+| Continuously running | Run on-demand |
+| Scaling means intervention to add / remove servers | Scaling is automated! |
-## Benefits of AWS Lambda
+### Benefits of AWS Lambda
-* Easy Pricing:
- * Pay per request and compute time
- * Free tier of 1,000,000 AWS Lambda requests and 400,000 GBs of compute time
-* Integrated with the whole AWS suite of services
-* Event-Driven: functions get invoked by AWS when needed
-* Integrated with many programming languages
-* Easy monitoring through AWS CloudWatch
-* Easy to get more resources per functions (up to 10GB of RAM!)
-* Increasing RAM will also improve CPU and network!
+- Easy Pricing:
+ - Pay per request and compute time
+ - Free tier of 1,000,000 AWS Lambda requests and 400,000 GB-seconds of compute time
+- Integrated with the whole AWS suite of services
+- Event-Driven: functions get invoked by AWS when needed
+- Integrated with many programming languages
+- Easy monitoring through AWS CloudWatch
+- Easy to get more resources per function (up to 10GB of RAM!)
+- Increasing RAM will also improve CPU and network!
-## AWS Lambda language support
+### AWS Lambda language support
-* Node.js (JavaScript)
-* Python
-* Java (Java 8 compatible)
-* C# (.NET Core)
-* Golang
-* C# / Powershell
-* Ruby
-* Custom Runtime API (community supported, example Rust)
-* Lambda Container Image
- * The container image must implement the Lambda Runtime API
- * ECS / Fargate is preferred for running arbitrary Docker images
+- Node.js (JavaScript)
+- Python
+- Java (Java 8 compatible)
+- C# (.NET Core)
+- Golang
+- C# / PowerShell
+- Ruby
+- Custom Runtime API (community supported, example Rust)
+- Lambda Container Image
+ - The container image must implement the Lambda Runtime API
+ - ECS / Fargate is preferred for running arbitrary Docker images
-## AWS Lambda Pricing: example
+### AWS Lambda Pricing: example
-* You can find overall pricing information here:
-* Pay per calls:
- * First 1,000,000 requests are free
- * $0.20 per 1 million requests thereafter ($0.0000002 per request)
-* Pay per duration: (in increment of 1 ms)
- * 400,000 GB-seconds of compute time per month for FREE
- * == 400,000 seconds if function is 1GB RAM
- * == 3,200,000 seconds if function is 128 MB RAM
- * After that $1.00 for 600,000 GB-seconds
-* It is usually **very cheap** to run AWS Lambda so it’s **very popular**
+- You can find overall pricing information here:
+- Pay per call:
+ - First 1,000,000 requests are free
+ - $0.20 per 1 million requests thereafter ($0.0000002 per request)
+- Pay per duration: (in increments of 1 ms)
+ - 400,000 GB-seconds of compute time per month for FREE
+ - == 400,000 seconds if function is 1GB RAM
+ - == 3,200,000 seconds if function is 128 MB RAM
+ - After that $1.00 for 600,000 GB-seconds
+- It is usually **very cheap** to run AWS Lambda so it’s **very popular**
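To make the figures above concrete, here is a minimal Python sketch of a monthly cost estimate. It uses only the numbers listed in this section (1,000,000 free requests, $0.20 per million requests, 400,000 free GB-seconds, $1.00 per 600,000 GB-seconds). The function names and parameters are hypothetical, and real AWS pricing varies by region and changes over time, so treat this as an illustration rather than a billing tool.

```python
# Figures taken from the bullet list above; check current AWS pricing before relying on them.
FREE_REQUESTS = 1_000_000
PRICE_PER_MILLION_REQUESTS = 0.20
FREE_GB_SECONDS = 400_000
PRICE_PER_GB_SECOND = 1.00 / 600_000


def free_tier_seconds(memory_mb: int) -> float:
    """Seconds of execution covered by the free tier at a given memory size."""
    return FREE_GB_SECONDS / (memory_mb / 1024)


def estimate_monthly_cost(requests: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Rough monthly cost in USD for one function (hypothetical helper)."""
    # Compute usage is measured in GB-seconds: duration x memory in GB.
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)

    # Only usage beyond the free tier is billed.
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)

    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = billable_gb_seconds * PRICE_PER_GB_SECOND
    return request_cost + compute_cost
```

`free_tier_seconds(128)` reproduces the 3,200,000 seconds quoted above, and `free_tier_seconds(1024)` gives 400,000. A workload that stays under both free-tier limits comes out at zero cost.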
## Amazon API Gateway
-* Example: building a serverless API
-* Fully managed service for developers to easily create, publish, maintain, monitor, and secure APIs
-* Serverless and scalable
-* Supports RESTful APIs and WebSocket APIs
-* Support for security, user authentication, API throttling, API keys, monitoring.
+- Example: building a serverless API
+- Fully managed service for developers to easily create, publish, maintain, monitor, and secure APIs
+- Serverless and scalable
+- Supports RESTful APIs and WebSocket APIs
+- Support for security, user authentication, API throttling, API keys, monitoring.
## AWS Batch
-* Fully managed batch processing at any scale
-* Efficiently run 100,000s of computing batch jobs on AWS
-* A “batch” job is a job with a start and an end (opposed to continuous)
-* Batch will dynamically launch EC2 instances or Spot Instances
-* AWS Batch provisions the right amount of compute / memory
-* You submit or schedule batch jobs and AWS Batch does the rest!
-* Batch jobs are defined as Docker images and run on ECS
-* Helpful for cost optimizations and focusing less on the infrastructure
+- Fully managed batch processing at any scale
+- Efficiently run 100,000s of computing batch jobs on AWS
+- A “batch” job is a job with a start and an end (as opposed to continuous)
+- Batch will dynamically launch EC2 instances or Spot Instances
+- AWS Batch provisions the right amount of compute / memory
+- You submit or schedule batch jobs and AWS Batch does the rest!
+- Batch jobs are defined as Docker images and run on ECS
+- Helpful for cost optimizations and focusing less on the infrastructure
## Batch vs Lambda
-Batch | Lambda
----- | ----
-No time limit | Time limit
-Any runtime as long as it’s packaged as a Docker image | Limited runtime
-Rely on EBS / instance store for disk space | Limited temporary disk space
-Relies on EC2 (can be managed by AWS) | Serverless
+| Batch | Lambda |
+| ------------------------------------------------------ | ---------------------------- |
+| No time limit | Time limit |
+| Any runtime as long as it’s packaged as a Docker image | Limited runtime |
+| Rely on EBS / instance store for disk space | Limited temporary disk space |
+| Relies on EC2 (can be managed by AWS) | Serverless |
## Amazon Lightsail
-* Virtual servers, storage, databases, and networking
-* Low & predictable pricing
-* Simpler alternative to using EC2, RDS, ELB, EBS, Route 53…
-* Great for people with little cloud experience!
-* Can setup notifications and monitoring of your Lightsail resources
-* Use cases:
- * Simple web applications (has templates for LAMP, Nginx, MEAN, Node.js…)
- * Websites (templates for WordPress, Magento, Plesk, Joomla)
- * Dev / Test environment
-* Has high availability but no auto-scaling, limited AWS integrations
+- Virtual servers, storage, databases, and networking
+- Low & predictable pricing
+- Simpler alternative to using EC2, RDS, ELB, EBS, Route 53…
+- Great for people with little cloud experience!
+- Can set up notifications and monitoring of your Lightsail resources
+- Use cases:
+ - Simple web applications (has templates for LAMP, Nginx, MEAN, Node.js…)
+ - Websites (templates for WordPress, Magento, Plesk, Joomla)
+ - Dev / Test environment
+- Has high availability but no auto-scaling, limited AWS integrations
## Lambda Summary
-* Lambda is Serverless, Function as a Service, seamless scaling, reactive
-* Lambda Billing:
- * By the time run x by the RAM provisioned
- * By the number of invocations
-* Language Support: many programming languages except (arbitrary) Docker
-* Invocation time: up to 15 minutes
-* Use cases:
- * Create Thumbnails for images uploaded onto S3
- * Run a Serverless cron job
-* API Gateway: expose Lambda functions as HTTP API
+- Lambda is Serverless, Function as a Service, seamless scaling, reactive
+- Lambda Billing:
+ - By execution time multiplied by the RAM provisioned
+ - By the number of invocations
+- Language Support: many programming languages except (arbitrary) Docker
+- Invocation time: up to 15 minutes
+- Use cases:
+ - Create Thumbnails for images uploaded onto S3
+ - Run a Serverless cron job
+- API Gateway: expose Lambda functions as HTTP API
## Other Compute Summary
-* Docker: container technology to run applications
-* ECS: run Docker containers on EC2 instances
-* Fargate:
-* Run Docker containers without provisioning the infrastructure
-* Serverless offering (no EC2 instances)
-* ECR: Private Docker Images Repository
-* Batch: run batch jobs on AWS across managed EC2 instances
-* Lightsail: predictable & low pricing for simple application & DB stacks
+- Docker: container technology to run applications
+- ECS: run Docker containers on EC2 instances
+- Fargate:
+ - Run Docker containers without provisioning the infrastructure
+ - Serverless offering (no EC2 instances)
+- ECR: Private Docker Images Repository
+- Batch: run batch jobs on AWS across managed EC2 instances
+- Lightsail: predictable & low pricing for simple application & DB stacks
diff --git a/sections/s3.md b/sections/s3.md
index 7e8d994..4eabec1 100644
--- a/sections/s3.md
+++ b/sections/s3.md
@@ -1,71 +1,109 @@
# Amazon S3
+- [Amazon S3](#amazon-s3)
+ - [S3 Use cases](#s3-use-cases)
+ - [Amazon S3 Overview - Buckets](#amazon-s3-overview---buckets)
+ - [Amazon S3 Overview - Objects](#amazon-s3-overview---objects)
+ - [S3 Security](#s3-security)
+ - [S3 Bucket Policies](#s3-bucket-policies)
+ - [Bucket settings for Block Public Access](#bucket-settings-for-block-public-access)
+ - [S3 Websites](#s3-websites)
+ - [S3 - Versioning](#s3---versioning)
+ - [S3 Access Logs](#s3-access-logs)
+ - [S3 Replication (CRR & SRR)](#s3-replication-crr--srr)
+ - [S3 Storage Classes](#s3-storage-classes)
+ - [S3 Durability and Availability](#s3-durability-and-availability)
+ - [S3 Standard General Purpose](#s3-standard-general-purpose)
+ - [S3 Storage Classes - Infrequent Access](#s3-storage-classes---infrequent-access)
+ - [S3 Standard Infrequent Access (S3 Standard-IA)](#s3-standard-infrequent-access-s3-standard-ia)
+ - [S3 One Zone Infrequent Access (S3 One Zone-IA)](#s3-one-zone-infrequent-access-s3-one-zone-ia)
+ - [Amazon S3 Glacier Storage Classes](#amazon-s3-glacier-storage-classes)
+ - [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
+ - [Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
+ - [Amazon S3 Glacier Deep Archive - for long term storage](#amazon-s3-glacier-deep-archive---for-long-term-storage)
+ - [S3 Intelligent-Tiering](#s3-intelligent-tiering)
+ - [S3 Object Lock & Glacier Vault Lock](#s3-object-lock--glacier-vault-lock)
+ - [Shared Responsibility Model for S3](#shared-responsibility-model-for-s3)
+ - [AWS Snow Family](#aws-snow-family)
+ - [Data Migrations with AWS Snow Family](#data-migrations-with-aws-snow-family)
+ - [Time to Transfer](#time-to-transfer)
+ - [Snowball Edge (for data transfers)](#snowball-edge-for-data-transfers)
+ - [AWS Snowcone](#aws-snowcone)
+ - [AWS Snowmobile](#aws-snowmobile)
+ - [Snow Family - Usage Process](#snow-family---usage-process)
+ - [What is Edge Computing?](#what-is-edge-computing)
+ - [Snow Family - Edge Computing](#snow-family---edge-computing)
+ - [AWS OpsHub](#aws-opshub)
+ - [Hybrid Cloud for Storage](#hybrid-cloud-for-storage)
+ - [AWS Storage Gateway](#aws-storage-gateway)
+ - [Amazon S3 - Summary](#amazon-s3---summary)
+
## S3 Use cases
-* Backup and storage
-* Disaster Recovery
-* Archive
-* Hybrid Cloud storage
-* Application hosting
-* Media hosting
-* Data lakes & big data analytics
-* Software delivery
-* Static website
+- Backup and storage
+- Disaster Recovery
+- Archive
+- Hybrid Cloud storage
+- Application hosting
+- Media hosting
+- Data lakes & big data analytics
+- Software delivery
+- Static website
## Amazon S3 Overview - Buckets
-* Amazon S3 allows people to store objects (files) in “buckets” (directories)
-* Buckets must have a globally unique name (across all regions all accounts)
-* Buckets are defined at the region level
-* S3 looks like a global service but buckets are created in a region
-* Naming convention
- * No uppercase
- * No underscore
- * 3-63 characters long
- * Not an IP
- * Must start with lowercase letter or number
+- Amazon S3 allows people to store objects (files) in “buckets” (directories)
+- Buckets must have a globally unique name (across all regions and all accounts)
+- Buckets are defined at the region level
+- S3 looks like a global service but buckets are created in a region
+- Naming convention
+ - No uppercase
+ - No underscore
+ - 3-63 characters long
+ - Not an IP
+ - Must start with lowercase letter or number
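As a rough sketch, the naming rules listed above can be turned into a validator. `is_valid_bucket_name` is a hypothetical helper, not part of any AWS SDK, and it checks only the rules in this list; the real S3 rules include additional constraints.

```python
import ipaddress
import string

# Characters a bucket name may start with, per the list above.
_ALLOWED_FIRST = string.ascii_lowercase + string.digits


def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the rules listed in these notes (not the full S3 rule set)."""
    if not 3 <= len(name) <= 63:
        return False
    if "_" in name or any(c.isupper() for c in name):
        return False
    if name[0] not in _ALLOWED_FIRST:
        return False
    try:
        ipaddress.IPv4Address(name)   # names formatted like an IP address are rejected
    except ValueError:
        return True
    return False


print(is_valid_bucket_name("my-bucket"))     # True
print(is_valid_bucket_name("My_Bucket"))     # False
print(is_valid_bucket_name("192.168.5.4"))   # False
```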
## Amazon S3 Overview - Objects
-* Objects (files) have a Key
-* The key is the FULL path:
- * s3://my-bucket/my_file.txt
- * s3://my-bucket/my_folder1/another_folder/my_file.txt
-* The key is composed of **prefix** + **object name**
- * s3://my-bucket/my_folder1/another_folder/my_file.txt
-* There’s no concept of “directories” within buckets (although the UI will trick you to think otherwise)
-* Just keys with very long names that contain slashes (“/”)
-* Object values are the content of the body:
- * Max Object Size is 5TB (5000GB)
- * If uploading more than 5GB, must use “multi-part upload”
-* Metadata (list of text key / value pairs – system or user metadata)
- * Tags (Unicode key / value pair – up to 10) – useful for security / lifecycle
- * Version ID (if versioning is enabled)
+- Objects (files) have a Key
+- The key is the FULL path:
+ - s3://my-bucket/my_file.txt
+ - s3://my-bucket/my_folder1/another_folder/my_file.txt
+- The key is composed of **prefix** + **object name**
+ - s3://my-bucket/my_folder1/another_folder/my_file.txt
+- There’s no concept of “directories” within buckets (although the UI will trick you into thinking otherwise)
+- Just keys with very long names that contain slashes (“/”)
+- Object values are the content of the body:
+ - Max Object Size is 5TB (5000GB)
+ - If uploading more than 5GB, must use “multi-part upload”
+- Metadata (list of text key / value pairs – system or user metadata)
+- Tags (Unicode key / value pair – up to 10) – useful for security / lifecycle
+- Version ID (if versioning is enabled)
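To illustrate the prefix + object name split, here is a minimal sketch. `split_key` is a made-up helper, and it assumes the key is the path inside the bucket, without the `s3://my-bucket/` part.

```python
def split_key(key: str) -> tuple[str, str]:
    """Split an object key into (prefix, object name) at the last slash."""
    prefix, sep, name = key.rpartition("/")
    # With no slash in the key, rpartition returns empty prefix and separator.
    return prefix + sep, name


print(split_key("my_folder1/another_folder/my_file.txt"))
# ('my_folder1/another_folder/', 'my_file.txt')
print(split_key("my_file.txt"))
# ('', 'my_file.txt')
```

The slashes are just characters in the key; nothing here corresponds to a real directory.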
## S3 Security
-* **User based**
- * IAM policies - which API calls should be allowed for a specific user from IAM console
-* **Resource Based**
- * Bucket Policies - bucket wide rules from the S3 console - allows cross account
- * Object Access Control List (ACL) – finer grain
- * Bucket Access Control List (ACL) – less common
-* **Note:** an IAM principal can access an S3 object if
- * the user IAM permissions allow it OR the resource policy ALLOWS it
- * AND there’s no explicit DENY
-* **Encryption:** encrypt objects in Amazon S3 using encryption keys
+- **User based**
+ - IAM policies - which API calls should be allowed for a specific user from IAM console
+- **Resource Based**
+ - Bucket Policies - bucket wide rules from the S3 console - allows cross account
+ - Object Access Control List (ACL) – finer grain
+ - Bucket Access Control List (ACL) – less common
+- **Note:** an IAM principal can access an S3 object if
+ - the user IAM permissions allow it OR the resource policy ALLOWS it
+ - AND there’s no explicit DENY
+- **Encryption:** encrypt objects in Amazon S3 using encryption keys
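The access rule in the note above can be written as a single boolean expression. This is a simplified sketch of that one sentence, not the full AWS policy evaluation logic; the function and parameter names are invented for illustration.

```python
def can_access(iam_allows: bool, resource_policy_allows: bool, explicit_deny: bool) -> bool:
    """(IAM permissions allow OR resource policy allows) AND no explicit deny."""
    return (iam_allows or resource_policy_allows) and not explicit_deny


print(can_access(iam_allows=True, resource_policy_allows=False, explicit_deny=False))  # True
print(can_access(iam_allows=True, resource_policy_allows=True, explicit_deny=True))    # False
print(can_access(iam_allows=False, resource_policy_allows=False, explicit_deny=False)) # False
```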
-S3 Bucket Policies
+## S3 Bucket Policies
-* JSON based policies
- * Resources: buckets and objects
- * Actions: Set of API to Allow or Deny
- * Effect: Allow / Deny
+- JSON based policies
+ - Resources: buckets and objects
+ - Actions: Set of API to Allow or Deny
+ - Effect: Allow / Deny
Principal: The account or user to apply the policy to
-* Use S3 bucket for policy to:
- * Grant public access to the bucket
- * Force objects to be encrypted at upload
- * Grant access to another account (Cross Account)
+- Use an S3 bucket policy to:
+ - Grant public access to the bucket
+ - Force objects to be encrypted at upload
+ - Grant access to another account (Cross Account)
```json
{
@@ -88,215 +126,216 @@ S3 Bucket Policies
## Bucket settings for Block Public Access
-* Block all public access: On
- * Block public access to buckets and objects granted through new access control lists (ACLS): On
- * Block public access to buckets and objects granted through any access control lists (ACLS): On
- * Block public access to buckets and objects granted through new public bucket or access point policies: On
- * Block public and cross-account access to buckets and objects through any public bucket or access point policies: On
+- Block all public access: On
+ - Block public access to buckets and objects granted through new access control lists (ACLs): On
+ - Block public access to buckets and objects granted through any access control lists (ACLs): On
+ - Block public access to buckets and objects granted through new public bucket or access point policies: On
+ - Block public and cross-account access to buckets and objects through any public bucket or access point policies: On
-* These settings were created to prevent company data leaks
-* If you know your bucket should never be public, leave these on
-* Can be set at the account level
+- These settings were created to prevent company data leaks
+- If you know your bucket should never be public, leave these on
+- Can be set at the account level
## S3 Websites
-* S3 can host static websites and have them accessible on the www
-* The website URL will be:
-* bucket-name.s3-website-AWS-region.amazonaws.com
+- S3 can host static websites and make them accessible on the web
+- The website URL will be:
+- bucket-name.s3-website-AWS-region.amazonaws.com
OR
-* bucket-name.s3-website.AWS-region.amazonaws.com
-* **If you get a 403 (Forbidden) error, make sure the bucket policy allows public reads!**
+- bucket-name.s3-website.AWS-region.amazonaws.com
+- **If you get a 403 (Forbidden) error, make sure the bucket policy allows public reads!**
-## S3 -Versioning
+## S3 - Versioning
-* You can version your files in Amazon S3
-* It is enabled at the bucket level
-* Same key overwrite will increment the “version”: 1, 2, 3….
-* It is best practice to version your buckets
- * Protect against unintended deletes (ability to restore a version)
- * Easy roll back to previous version
-* Notes:
- * Any file that is not versioned prior to enabling versioning will have version “null”
- * Suspending versioning does not delete the previous versions
+- You can version your files in Amazon S3
+- It is enabled at the bucket level
+- Same key overwrite will increment the “version”: 1, 2, 3…
+- It is best practice to version your buckets
+ - Protect against unintended deletes (ability to restore a version)
+ - Easy roll back to previous version
+- Notes:
+ - Any file that is not versioned prior to enabling versioning will have version “null”
+ - Suspending versioning does not delete the previous versions
## S3 Access Logs
-* For audit purpose, you may want to log all access to S3 buckets
-* Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket
-* That data can be analyzed using data analysis tools…
-* Very helpful to come down to the root cause of an issue, or audit usage, view suspicious patterns, etc…
+- For audit purposes, you may want to log all access to S3 buckets
+- Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket
+- That data can be analyzed using data analysis tools…
+- Very helpful for getting to the root cause of an issue, auditing usage, viewing suspicious patterns, etc.
## S3 Replication (CRR & SRR)
-* Must enable versioning in source and destination
-* Cross Region Replication (CRR)
-* Same Region Replication (SRR)
-* Buckets can be in different accounts
-* Copying is asynchronous
-* Must give proper IAM permissions to S3
-* CRR - Use cases: compliance, lower latency access, replication across accounts
-* SRR – Use cases: log aggregation, live replication between production and test accounts
+- Must enable versioning in source and destination
+- Cross Region Replication (CRR)
+- Same Region Replication (SRR)
+- Buckets can be in different accounts
+- Copying is asynchronous
+- Must give proper IAM permissions to S3
+- CRR - Use cases: compliance, lower latency access, replication across accounts
+- SRR – Use cases: log aggregation, live replication between production and test accounts
## S3 Storage Classes
-* [Amazon S3 Standard - General Purpose](#s3-standard-general-purpose)
-* [Amazon S3 Standard - Infrequent Access (IA)](#s3-standard-infrequent-access-s3-standard-ia)
-* [Amazon S3 One Zone - Infrequent Access](#s3-one-zone-infrequent-access-s3-one-zone-ia)
-* [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
-* [Amazon S3 Glacier Flexible Retrieval](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
-* [Amazon S3 Glacier Deep Archive](#amazon-s3-glacier-deep-archive-–-for-long-term-storage)
-* [Amazon S3 Intelligent Tiering](#s3-intelligent-tiering)
+- [Amazon S3 Standard - General Purpose](#s3-standard-general-purpose)
+- [Amazon S3 Standard - Infrequent Access (IA)](#s3-standard-infrequent-access-s3-standard-ia)
+- [Amazon S3 One Zone - Infrequent Access](#s3-one-zone-infrequent-access-s3-one-zone-ia)
+- [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
+- [Amazon S3 Glacier Flexible Retrieval](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
+- [Amazon S3 Glacier Deep Archive](#amazon-s3-glacier-deep-archive---for-long-term-storage)
+- [Amazon S3 Intelligent Tiering](#s3-intelligent-tiering)
-* Can move between classes manually or using S3 Lifecycle configurations
+- Can move between classes manually or using S3 Lifecycle configurations
-## S3 Durability and Availability
+### S3 Durability and Availability
-* Durability:
- * High durability (99.999999999%, 11 9’s) of objects across multiple AZ
- * If you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years
- * Same for all storage classes
-* Availability:
- * Measures how readily available a service is
- * Varies depending on storage class
- * Example: S3 standard has 99.99% availability = not available 53 minutes a year
+- Durability:
+ - High durability (99.999999999%, 11 9’s) of objects across multiple AZs
+ - If you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years
+ - Same for all storage classes
+- Availability:
+ - Measures how readily available a service is
+ - Varies depending on storage class
+ - Example: S3 standard has 99.99% availability = not available 53 minutes a year
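Both figures above follow from simple arithmetic, sketched here. The helper names are hypothetical, and the durability function assumes that "11 nines" means a loss probability of 10^-11 per object per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


def years_between_losses(object_count: int, nines: int = 11) -> float:
    """Average years between single-object losses, assuming a 10**-nines
    annual loss probability per object (an interpretation, see lead-in)."""
    return 10 ** nines / object_count


print(round(downtime_minutes_per_year(99.99), 1))  # 52.6
print(years_between_losses(10_000_000))            # 10000.0
```

The 52.6 minutes is what the notes round up to 53 minutes a year.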
-## S3 Standard General Purpose
+### S3 Standard General Purpose
-* 99.99% Availability
-* Used for frequently accessed data
-* Low latency and high throughput
-* Sustain 2 concurrent facility failures
-* Use Cases: Big Data analytics, mobile & gaming applications, content distribution…
+- 99.99% Availability
+- Used for frequently accessed data
+- Low latency and high throughput
+- Sustain 2 concurrent facility failures
+- Use Cases: Big Data analytics, mobile & gaming applications, content distribution…
-## S3 Storage Classes – Infrequent Access
+### S3 Storage Classes - Infrequent Access
-* For data that is less frequently accessed, but requires rapid access when needed
-* Lower cost than S3 Standard
+- For data that is less frequently accessed, but requires rapid access when needed
+- Lower cost than S3 Standard
-### S3 Standard Infrequent Access (S3 Standard-IA)
+#### S3 Standard Infrequent Access (S3 Standard-IA)
-* 99.9% Availability
-* Use cases: Disaster Recovery, backups
+- 99.9% Availability
+- Use cases: Disaster Recovery, backups
-### S3 One Zone Infrequent Access (S3 One Zone-IA)
+#### S3 One Zone Infrequent Access (S3 One Zone-IA)
-* High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed
-* 99.5% Availability
-* Use Cases: Storing secondary backup copies of on-premise data, or data you can recreate
+- High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed
+- 99.5% Availability
+- Use Cases: Storing secondary backup copies of on-premises data, or data you can recreate
-## Amazon S3 Glacier Storage Classes
+### Amazon S3 Glacier Storage Classes
-* Low-cost object storage meant for archiving / backup
-* Pricing: price for storage + object retrieval cost
+- Low-cost object storage meant for archiving / backup
+- Pricing: price for storage + object retrieval cost
-### Amazon S3 Glacier Instant Retrieval
+#### Amazon S3 Glacier Instant Retrieval
-* Millisecond retrieval, great for data accessed once a quarter
-* Minimum storage duration of 90 days
+- Millisecond retrieval, great for data accessed once a quarter
+- Minimum storage duration of 90 days
-### Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)
+#### Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)
-* Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours) – free
-* Minimum storage duration of 90 days
+- Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours, free)
+- Minimum storage duration of 90 days
-### Amazon S3 Glacier Deep Archive – for long term storage
+#### Amazon S3 Glacier Deep Archive - for long term storage
-* Standard (12 hours), Bulk (48 hours)
-* Minimum storage duration of 180 days
+- Standard (12 hours), Bulk (48 hours)
+- Minimum storage duration of 180 days
-## S3 Intelligent-Tiering
+### S3 Intelligent-Tiering
-* Small monthly monitoring and auto-tiering fee
-* Moves objects automatically between Access Tiers based on usage
-* There are no retrieval charges in S3 Intelligent-Tiering
-* Frequent Access tier (automatic): default tier
-* Infrequent Access tier (automatic): objects not accessed for 30 days
-* Archive Instant Access tier (automatic): objects not accessed for 90 days
-* Archive Access tier (optional): configurable from 90 days to 700+ days
-* Deep Archive Access tier (optional): config. from 180 days to 700+ days
+- Small monthly monitoring and auto-tiering fee
+- Moves objects automatically between Access Tiers based on usage
+- There are no retrieval charges in S3 Intelligent-Tiering
+- Frequent Access tier (automatic): default tier
+- Infrequent Access tier (automatic): objects not accessed for 30 days
+- Archive Instant Access tier (automatic): objects not accessed for 90 days
+- Archive Access tier (optional): configurable from 90 days to 700+ days
+- Deep Archive Access tier (optional): config. from 180 days to 700+ days
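The automatic tiers above amount to a threshold lookup on days since last access. This is a sketch of that rule only: `automatic_tier` is a hypothetical name, and the optional archive tiers are left out because their thresholds are configurable.

```python
def automatic_tier(days_since_last_access: int) -> str:
    """Return the automatic Intelligent-Tiering access tier, per the thresholds in the notes."""
    if days_since_last_access >= 90:
        return "Archive Instant Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"


print(automatic_tier(5))    # Frequent Access
print(automatic_tier(45))   # Infrequent Access
print(automatic_tier(120))  # Archive Instant Access
```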
## S3 Object Lock & Glacier Vault Lock
-* S3 Object Lock
- * Adopt a WORM (Write Once Read Many) model
- * Block an object version deletion for a specified amount of time
-* Glacier Vault Lock
- * Adopt a WORM (Write Once Read Many) model
- * Lock the policy for future edits (can no longer be changed)
- * Helpful for compliance and data retention
+- S3 Object Lock
+ - Adopt a WORM (Write Once Read Many) model
+ - Block an object version deletion for a specified amount of time
+- Glacier Vault Lock
+ - Adopt a WORM (Write Once Read Many) model
+ - Lock the policy for future edits (can no longer be changed)
+ - Helpful for compliance and data retention
## Shared Responsibility Model for S3
-AWS | YOU
----- | ----
-Infrastructure (global security, durability, availability, sustain concurrent loss of data in two facilities) | S3 Versioning, S3 Bucket Policies, S3 Replication Setup
-Configuration and vulnerability analysis | Logging and Monitoring, S3 Storage Classes
-Compliance validation | Data encryption at rest and in transit
+| AWS | YOU |
+| ------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- |
+| Infrastructure (global security, durability, availability, sustain concurrent loss of data in two facilities) | S3 Versioning, S3 Bucket Policies, S3 Replication Setup |
+| Configuration and vulnerability analysis | Logging and Monitoring, S3 Storage Classes |
+| Compliance validation | Data encryption at rest and in transit |
## AWS Snow Family
-* Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
-* Data migration:
- * Snowcone
- * Snowball Edge
- * Snowmobile
-* Edge computing:
- * Snowcone
- * Snowball Edge
+- Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
+- Data migration:
+ - Snowcone
+ - Snowball Edge
+ - Snowmobile
+- Edge computing:
+ - Snowcone
+ - Snowball Edge
-## Data Migrations with AWS Snow Family
+### Data Migrations with AWS Snow Family
-* **AWS Snow Family: offline devices to perform data migrations** If it takes more than a week to transfer over the network, use Snowball devices!
+- **AWS Snow Family: offline devices to perform data migrations.** If it takes more than a week to transfer over the network, use Snowball devices!
-* Challenges:
- * Limited connectivity
- * Limited bandwidth
- * High network cost
- * Shared bandwidth (can’t maximize the line)
- * Connection stability
+- Challenges:
+ - Limited connectivity
+ - Limited bandwidth
+ - High network cost
+ - Shared bandwidth (can’t maximize the line)
+ - Connection stability
-## Time to Transfer
+### Time to Transfer
-Data | 100 Mbps | 1Gbps | 10Gbps
-10 TB | 12 days | 30 hours | 3 hours
-100 TB | 124 days | 12 days | 30 hours
-1 PB | 3 years | 124 days | 12 days
+| Data | 100 Mbps | 1 Gbps | 10 Gbps |
+| ------ | -------- | -------- | -------- |
+| 10 TB | 12 days | 30 hours | 3 hours |
+| 100 TB | 124 days | 12 days | 30 hours |
+| 1 PB | 3 years | 124 days | 12 days |
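The order of magnitude in this table can be reproduced with a short calculation. The sketch assumes decimal units (1 TB = 10^12 bytes). The `utilization` parameter is my own assumption: at full line rate the results come out lower than the table, and a value around 0.75 lands close to it, but the notes do not state which factor was used.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Days needed to move data_tb terabytes over a link of link_mbps megabits per second."""
    bits = data_tb * 1e12 * 8                        # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86_400                          # seconds per day


print(round(transfer_days(10, 100), 1))         # 9.3  (ideal line rate)
print(round(transfer_days(10, 100, 0.75), 1))   # 12.3 (close to the table's 12 days)
print(round(transfer_days(100, 100, 0.75), 1))  # 123.5 (close to the table's 124 days)
```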
-## Snowball Edge (for data transfers)
+### Snowball Edge (for data transfers)
-* Physical data transport solution: move TBs or PBs of data in or out of AWS
-* Alternative to moving data over the network (and paying network fees)
-* Pay per data transfer job
-* Provide block storage and Amazon S3-compatible object storage
-* Snowball Edge Storage Optimized
- * 80 TB of HDD capacity for block volume and S3 compatible object storage
-* Snowball Edge Compute Optimized
- * 42 TB of HDD capacity for block volume and S3 compatible object storage
-* Use cases: large data cloud migrations, DC decommission, disaster recovery
+- Physical data transport solution: move TBs or PBs of data in or out of AWS
+- Alternative to moving data over the network (and paying network fees)
+- Pay per data transfer job
+- Provide block storage and Amazon S3-compatible object storage
+- Snowball Edge Storage Optimized
+ - 80 TB of HDD capacity for block volume and S3 compatible object storage
+- Snowball Edge Compute Optimized
+ - 42 TB of HDD capacity for block volume and S3 compatible object storage
+- Use cases: large data cloud migrations, DC decommission, disaster recovery
-## AWS Snowcone
+### AWS Snowcone
-* Small, portable computing, anywhere, rugged & secure, withstands harsh environments
-* Light (4.5 pounds, 2.1 kg)
-* Device used for edge computing, storage, and data transfer
-* **8 TBs of usable storage**
-* Use Snowcone where Snowball does not fit (space-constrained environment)
-* Must provide your own battery / cables
-* Can be sent back to AWS offline, or connect it to internet and use **AWS DataSync** to send data
+- Small, portable computing, anywhere, rugged & secure, withstands harsh environments
+- Light (4.5 pounds, 2.1 kg)
+- Device used for edge computing, storage, and data transfer
+- **8 TBs of usable storage**
+- Use Snowcone where Snowball does not fit (space-constrained environment)
+- Must provide your own battery / cables
+- Can be sent back to AWS offline, or connect it to internet and use **AWS DataSync** to send data
-## AWS Snowmobile
+### AWS Snowmobile
-* Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TBs)
-* Each Snowmobile has 100 PB of capacity (use multiple in parallel)
-* High security: temperature controlled, GPS, 24/7 video surveillance
-* **Better than Snowball if you transfer more than 10 PB**
+- Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TBs)
+- Each Snowmobile has 100 PB of capacity (use multiple in parallel)
+- High security: temperature controlled, GPS, 24/7 video surveillance
+- **Better than Snowball if you transfer more than 10 PB**
-Properties | Snowcone | Snowball Edge Storage Optimized | Snowmobile
----- | ---- | ---- | ----
-Storage Capacity | 8 TB usable | 80 TB usable | < 100 PB
-Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up to exabytes, offline
+| Properties | Snowcone | Snowball Edge Storage Optimized | Snowmobile |
+| ---------------- | ------------------------------- | ------------------------------- | ----------------------- |
+| Storage Capacity | 8 TB usable | 80 TB usable | < 100 PB |
+| Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up to exabytes, offline |
-## Snow Family – Usage Process
+### Snow Family - Usage Process
1. Request Snowball devices from the AWS console for delivery
2. Install the snowball client / AWS OpsHub on your servers
@@ -307,78 +346,78 @@ Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up
## What is Edge Computing?
-* Process data while it’s being created on an edge location
- * A truck on the road, a ship on the sea, a mining station underground...
-* These locations may have
- * Limited / no internet access
- * Limited / no easy access to computing power
-* We setup a **Snowball Edge / Snowcone** device to do edge computing
-* Use cases of Edge Computing:
- * Preprocess data
- * Machine learning at the edge
- * Transcoding media streams
-* Eventually (if need be) we can ship back the device to AWS (for transferring data for example)
+- Process data while it’s being created on an edge location
+ - A truck on the road, a ship on the sea, a mining station underground...
+- These locations may have
+ - Limited / no internet access
+ - Limited / no easy access to computing power
+- We set up a **Snowball Edge / Snowcone** device to do edge computing
+- Use cases of Edge Computing:
+ - Preprocess data
+ - Machine learning at the edge
+ - Transcoding media streams
+- Eventually (if need be) we can ship back the device to AWS (for transferring data for example)
-## Snow Family – Edge Computing
+## Snow Family - Edge Computing
-* **Snowcone (smaller)**
- * 2 CPUs, 4 GB of memory, wired or wireless access
- * USB-C power using a cord or the optional battery
-* **Snowball Edge – Compute Optimized**
- * 52 vCPUs, 208 GiB of RAM
- * Optional GPU (useful for video processing or machine learning)
- * 42 TB usable storage
-* **Snowball Edge – Storage Optimized**
- * Up to 40 vCPUs, 80 GiB of RAM
- * Object storage clustering available
-* All: Can run EC2 Instances & AWS Lambda functions (using AWS IoT Greengrass)
-* Long-term deployment options: 1 and 3 years discounted pricing
+- **Snowcone (smaller)**
+ - 2 CPUs, 4 GB of memory, wired or wireless access
+ - USB-C power using a cord or the optional battery
+- **Snowball Edge – Compute Optimized**
+ - 52 vCPUs, 208 GiB of RAM
+ - Optional GPU (useful for video processing or machine learning)
+ - 42 TB usable storage
+- **Snowball Edge – Storage Optimized**
+ - Up to 40 vCPUs, 80 GiB of RAM
+ - Object storage clustering available
+- All: Can run EC2 Instances & AWS Lambda functions (using AWS IoT Greengrass)
+- Long-term deployment options: 1 and 3 years discounted pricing
## AWS OpsHub
-* Historically, to use Snow Family devices, you needed a CLI (Command Line Interface tool)
-* Today, you can use **AWS OpsHub** (a software you install on your computer / laptop) to manage your Snow Family Device
- * Unlocking and configuring single or clustered devices
- * Transferring files
- * Launching and managing instances running on Snow Family Devices
- * Monitor device metrics (storage capacity, active instances on your device)
- * Launch compatible AWS services on your devices (ex: Amazon EC2 instances, AWS DataSync, Network File System (NFS))
+- Historically, to use Snow Family devices, you needed a CLI (Command Line Interface tool)
+- Today, you can use **AWS OpsHub** (software you install on your computer / laptop) to manage your Snow Family Device
+ - Unlocking and configuring single or clustered devices
+ - Transferring files
+ - Launching and managing instances running on Snow Family Devices
+ - Monitor device metrics (storage capacity, active instances on your device)
+ - Launch compatible AWS services on your devices (ex: Amazon EC2 instances, AWS DataSync, Network File System (NFS))
## Hybrid Cloud for Storage
-* AWS is pushing for ”hybrid cloud”
- * Part of your infrastructure is on-premises
- * Part of your infrastructure is on the cloud
-* This can be due to
- * Long cloud migrations
- * Security requirements
- * Compliance requirements
- * IT strategy
-* S3 is a proprietary storage technology (unlike EFS / NFS), so how do you expose the S3 data on-premise?
-* AWS Storage Gateway!
+- AWS is pushing for "hybrid cloud"
+ - Part of your infrastructure is on-premises
+ - Part of your infrastructure is on the cloud
+- This can be due to
+ - Long cloud migrations
+ - Security requirements
+ - Compliance requirements
+ - IT strategy
+- S3 is a proprietary storage technology (unlike EFS / NFS), so how do you expose S3 data on-premises?
+- AWS Storage Gateway!
## AWS Storage Gateway
-* Bridge between on-premise data and cloud data in S3
-* Hybrid storage service to allow on- premises to seamlessly use the AWS Cloud
-* Use cases: disaster recovery, backup & restore, tiered storage
-* Types of Storage Gateway:
- * File Gateway
- * Volume Gateway
- * Tape Gateway
-* No need to know the types at the exam
+- Bridge between on-premises data and cloud data in S3
+- Hybrid storage service that lets on-premises environments seamlessly use the AWS Cloud
+- Use cases: disaster recovery, backup & restore, tiered storage
+- Types of Storage Gateway:
+ - File Gateway
+ - Volume Gateway
+ - Tape Gateway
+- No need to know the types for the exam
-## Amazon S3 – Summary
+## Amazon S3 - Summary
-* Buckets vs Objects: global unique name, tied to a region
-* S3 security: IAM policy, S3 Bucket Policy (public access), S3 Encryption
-* S3 Websites: host a static website on Amazon S3
-* S3 Versioning: multiple versions for files, prevent accidental deletes
-* S3 Access Logs: log requests made within your S3 bucket
-* S3 Replication: same-region or cross-region, must enable versioning
-* S3 Storage Classes: Standard, IA, 1Z-IA, Intelligent, Glacier, Glacier Deep Archive
-* S3 Lifecycle Rules: transition objects between classes
-* S3 Glacier Vault Lock / S3 Object Lock: WORM (Write Once Read Many)
-* Snow Family: import data onto S3 through a physical device, edge computing
-* OpsHub: desktop application to manage Snow Family devices
-* Storage Gateway: hybrid solution to extend on-premises storage to S3
\ No newline at end of file
+- Buckets vs Objects: globally unique bucket name, tied to a region
+- S3 security: IAM policy, S3 Bucket Policy (public access), S3 Encryption
+- S3 Websites: host a static website on Amazon S3
+- S3 Versioning: multiple versions for files, prevent accidental deletes
+- S3 Access Logs: log requests made to your S3 bucket
+- S3 Replication: same-region or cross-region, must enable versioning
+- S3 Storage Classes: Standard, Standard-IA, One Zone-IA, Intelligent-Tiering, Glacier, Glacier Deep Archive
+- S3 Lifecycle Rules: transition objects between classes
+- S3 Glacier Vault Lock / S3 Object Lock: WORM (Write Once Read Many)
+- Snow Family: import data onto S3 through a physical device, edge computing
+- OpsHub: desktop application to manage Snow Family devices
+- Storage Gateway: hybrid solution to extend on-premises storage to S3