[Modified] Table Of Contents added

kananinirav
2022-08-16 10:20:01 +09:00
parent bfe63bf998
commit a2ec3e9877
6 changed files with 949 additions and 825 deletions

@@ -4,7 +4,6 @@
### Table of contents ### Table of contents
- AWS Fundamentals
- [What is Cloud Computing?](sections/cloud_computing.md) - [What is Cloud Computing?](sections/cloud_computing.md)
- [IAM: Identity Access & Management](sections/iam.md) - [IAM: Identity Access & Management](sections/iam.md)
- [EC2: Virtual Machines](sections/ec2.md) - [EC2: Virtual Machines](sections/ec2.md)

@@ -1,37 +1,64 @@
# Databases # Databases & Analytics
- [Databases & Analytics](#databases--analytics)
- [Databases Intro](#databases-intro)
- [Relational Databases](#relational-databases)
- [NoSQL Databases](#nosql-databases)
- [NoSQL data example: JSON](#nosql-data-example-json)
- [Databases & Shared Responsibility on AWS](#databases--shared-responsibility-on-aws)
- [AWS RDS Overview](#aws-rds-overview)
- [Advantage over using RDS versus deploying DB on EC2](#advantage-over-using-rds-versus-deploying-db-on-ec2)
- [RDS Deployments: Read Replicas, Multi-AZ](#rds-deployments-read-replicas-multi-az)
- [RDS Deployments: Multi-Region](#rds-deployments-multi-region)
- [Amazon Aurora](#amazon-aurora)
- [Amazon ElastiCache Overview](#amazon-elasticache-overview)
- [DynamoDB](#dynamodb)
- [DynamoDB Accelerator - DAX](#dynamodb-accelerator---dax)
- [DynamoDB - Global Tables](#dynamodb---global-tables)
- [Redshift Overview](#redshift-overview)
- [Amazon EMR](#amazon-emr)
- [Amazon Athena](#amazon-athena)
- [Amazon QuickSight](#amazon-quicksight)
- [DocumentDB](#documentdb)
- [Amazon Neptune](#amazon-neptune)
- [Amazon QLDB](#amazon-qldb)
- [Amazon Managed Blockchain](#amazon-managed-blockchain)
- [AWS Glue](#aws-glue)
- [DMS - Database Migration Service](#dms---database-migration-service)
- [Databases & Analytics Summary](#databases--analytics-summary)
## Databases Intro ## Databases Intro
* Storing data on disk (EFS, EBS, EC2 Instance Store, S3) can have its limits - Storing data on disk (EFS, EBS, EC2 Instance Store, S3) can have its limits
* Sometimes, you want to store data in a database… - Sometimes, you want to store data in a database…
* You can structure the data - You can structure the data
* You build indexes to efficiently query / search through the data - You build indexes to efficiently query / search through the data
* You define relationships between your datasets - You define relationships between your datasets
* Databases are optimized for a purpose and come with different features, shapes and constraints - Databases are optimized for a purpose and come with different features, shapes and constraints
## Relational Databases ## Relational Databases
* Looks just like Excel spreadsheets, with links between them! - Looks just like Excel spreadsheets, with links between them!
* Can use the SQL language to perform queries / lookups - Can use the SQL language to perform queries / lookups
## NoSQL Databases ## NoSQL Databases
* NoSQL = non-SQL = non relational databases - NoSQL = non-SQL = non relational databases
* NoSQL databases are purpose built for specific data models and have flexible schemas for building modern applications. - NoSQL databases are purpose built for specific data models and have flexible schemas for building modern applications.
* Benefits: - Benefits:
* Flexibility: easy to evolve data model - Flexibility: easy to evolve data model
* Scalability: designed to scale-out by using distributed clusters - Scalability: designed to scale-out by using distributed clusters
* High-performance: optimized for a specific data model - High-performance: optimized for a specific data model
* Highly functional: types optimized for the data model - Highly functional: types optimized for the data model
* Examples: Key-value, document, graph, in-memory, search databases - Examples: Key-value, document, graph, in-memory, search databases
### NoSQL data example: JSON ### NoSQL data example: JSON
* JSON = JavaScript Object Notation - JSON = JavaScript Object Notation
* JSON is a common form of data that fits into a NoSQL model - JSON is a common form of data that fits into a NoSQL model
* Data can be nested - Data can be nested
* Fields can change over time - Fields can change over time
* Support for new types: arrays, etc… - Support for new types: arrays, etc…
```json ```json
{ {
@@ -52,213 +79,213 @@
## Databases & Shared Responsibility on AWS ## Databases & Shared Responsibility on AWS
* AWS offers to manage different databases for us - AWS offers to manage different databases for us
* Benefits include: - Benefits include:
* Quick Provisioning, High Availability, Vertical and Horizontal Scaling - Quick Provisioning, High Availability, Vertical and Horizontal Scaling
* Automated Backup & Restore, Operations, Upgrades - Automated Backup & Restore, Operations, Upgrades
* Operating System Patching is handled by AWS - Operating System Patching is handled by AWS
* Monitoring, alerting - Monitoring, alerting
* Note: many database technologies could be run on EC2, but you must handle the resiliency, backup, patching, high availability, fault tolerance and scaling yourself - Note: many database technologies could be run on EC2, but you must handle the resiliency, backup, patching, high availability, fault tolerance and scaling yourself
## AWS RDS Overview ## AWS RDS Overview
* RDS stands for Relational Database Service - RDS stands for Relational Database Service
* It's a managed DB service for databases that use SQL as a query language. - It's a managed DB service for databases that use SQL as a query language.
* It allows you to create databases in the cloud that are managed by AWS - It allows you to create databases in the cloud that are managed by AWS
* Postgres - Postgres
* MySQL - MySQL
* MariaDB - MariaDB
* Oracle - Oracle
* Microsoft SQL Server - Microsoft SQL Server
* **Aurora (AWS Proprietary database)** - **Aurora (AWS Proprietary database)**
### Advantage over using RDS versus deploying DB on EC2 ### Advantage over using RDS versus deploying DB on EC2
* RDS is a managed service: - RDS is a managed service:
* Automated provisioning, OS patching - Automated provisioning, OS patching
* Continuous backups and restore to specific timestamp (Point in Time Restore)! - Continuous backups and restore to specific timestamp (Point in Time Restore)!
* Monitoring dashboards - Monitoring dashboards
* Read replicas for improved read performance - Read replicas for improved read performance
* Multi AZ setup for DR (Disaster Recovery) - Multi AZ setup for DR (Disaster Recovery)
* Maintenance windows for upgrades - Maintenance windows for upgrades
* Scaling capability (vertical and horizontal) - Scaling capability (vertical and horizontal)
* Storage backed by EBS (gp2 or io1) - Storage backed by EBS (gp2 or io1)
* BUT you can't SSH into your instances - BUT you can't SSH into your instances
## Amazon Aurora ### RDS Deployments: Read Replicas, Multi-AZ
* Aurora is a proprietary technology from AWS (not open sourced) | Read Replicas | Multi-AZ |
* PostgreSQL and MySQL are both supported as Aurora DB | ----------------------------------- | ------------------------------------------------- |
* Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS | Scale the read workload of your DB | Failover in case of AZ outage (high availability) |
* Aurora storage automatically grows in increments of 10GB, up to 64 TB. | Can create up to 5 Read Replicas | Data is only read/written to the main database |
* Aurora costs more than RDS (20% more) but is more efficient | Data is only written to the main DB | Can only have 1 other AZ as failover |
* Not in the free tier
## RDS Deployments: Read Replicas, Multi-AZ
Read Replicas | Multi-AZ
---- | ----
Scale the read workload of your DB | Failover in case of AZ outage (high availability)
Can create up to 5 Read Replicas | Data is only read/written to the main database
Data is only written to the main DB | Can only have 1 other AZ as failover
![Read Replicas | Multi-AZ](/images/read_replicas_multi_AZ.png) ![Read Replicas | Multi-AZ](/images/read_replicas_multi_AZ.png)
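The split between the two deployment styles can be sketched in a few lines of Python. This is a toy in-memory model, not RDS itself: the class name `ReplicatedDatabase` and its methods are hypothetical, and replication is treated as instantaneous, which is a simplifying assumption.

```python
import itertools


class ReplicatedDatabase:
    """Toy model: one writable main database plus read-only replicas."""

    MAX_READ_REPLICAS = 5  # limit quoted in the table above

    def __init__(self, replica_count):
        if not 0 <= replica_count <= self.MAX_READ_REPLICAS:
            raise ValueError("replica_count must be between 0 and 5")
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        # Round-robin cursor over the replicas (None when there are none).
        self._cursor = itertools.cycle(range(replica_count)) if replica_count else None

    def write(self, key, value):
        # Data is only written to the main database...
        self.primary[key] = value
        # ...and then copied to every replica (instantly here, a simplification).
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads are spread over the replicas to scale the read workload.
        if self._cursor is None:
            return self.primary.get(key)
        return self.replicas[next(self._cursor)].get(key)


db = ReplicatedDatabase(replica_count=2)
db.write("user:1", "Alice")
first_read = db.read("user:1")   # served by replica 0
second_read = db.read("user:1")  # served by replica 1
```

A Multi-AZ standby would differ from this sketch in that the application never reads from it; it only takes over if the main database's AZ fails.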
## RDS Deployments: Multi-Region ### RDS Deployments: Multi-Region
* Multi-Region (Read Replicas) - Multi-Region (Read Replicas)
* Disaster recovery in case of region issue - Disaster recovery in case of region issue
* Local performance for global reads - Local performance for global reads
* Replication cost - Replication cost
![Multi-Region](/images/multi_region.png) ![Multi-Region](/images/multi_region.png)
## Amazon Aurora
- Aurora is a proprietary technology from AWS (not open sourced)
- PostgreSQL and MySQL are both supported as Aurora DB
- Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
- Aurora storage automatically grows in increments of 10GB, up to 64 TB.
- Aurora costs more than RDS (20% more) but is more efficient
- Not in the free tier
## Amazon ElastiCache Overview ## Amazon ElastiCache Overview
* The same way RDS is to get managed Relational Databases… - The same way RDS is to get managed Relational Databases…
* ElastiCache is to get managed Redis or Memcached - ElastiCache is to get managed Redis or Memcached
* Caches are in-memory databases with high performance, low latency - Caches are in-memory databases with high performance, low latency
* Helps reduce load off databases for read intensive workloads - Helps reduce load off databases for read intensive workloads
* AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring, failure recovery and backup - AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring, failure recovery and backup
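How a cache takes read load off a database can be illustrated with a minimal read-through sketch. Everything here is hypothetical and in-memory: `SlowDatabase` stands in for a relational database and `CachedReader` for an application using a cache such as Redis or Memcached; no ElastiCache API is used.

```python
class SlowDatabase:
    """Stand-in for a database; counts how many queries reach it."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.query_count = 0

    def get(self, key):
        self.query_count += 1
        return self.rows.get(key)


class CachedReader:
    """Checks the in-memory cache first and falls back to the database."""

    def __init__(self, database):
        self.database = database
        self.cache = {}

    def get(self, key):
        if key in self.cache:
            return self.cache[key]       # cache hit: no database query
        value = self.database.get(key)   # cache miss: query the database
        self.cache[key] = value          # remember the answer for next time
        return value


database = SlowDatabase({"product:42": "Blue T-shirt"})
reader = CachedReader(database)
results = [reader.get("product:42") for _ in range(3)]
```

After three reads of the same key the database has been queried only once. A real setup would also need an invalidation or expiry strategy, which this sketch leaves out.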
## DynamoDB ## DynamoDB
* Fully Managed Highly available with replication across 3 AZ - Fully Managed Highly available with replication across 3 AZ
* NoSQL database - not a relational database - NoSQL database - not a relational database
* Scales to massive workloads, distributed “serverless” database - Scales to massive workloads, distributed “serverless” database
* Millions of requests per second, trillions of rows, 100s of TB of storage - Millions of requests per second, trillions of rows, 100s of TB of storage
* Fast and consistent in performance - Fast and consistent in performance
* Single-digit millisecond latency (low-latency retrieval) - Single-digit millisecond latency (low-latency retrieval)
* Integrated with IAM for security, authorization and administration - Integrated with IAM for security, authorization and administration
* Low cost and auto scaling capabilities - Low cost and auto scaling capabilities
* Standard & Infrequent Access (IA) Table Class - Standard & Infrequent Access (IA) Table Class
### DynamoDB Accelerator - DAX ### DynamoDB Accelerator - DAX
* Fully Managed in-memory cache for DynamoDB - Fully Managed in-memory cache for DynamoDB
* 10x performance improvement: from single-digit millisecond latency to microsecond latency when accessing your DynamoDB tables - 10x performance improvement: from single-digit millisecond latency to microsecond latency when accessing your DynamoDB tables
* Secure, highly scalable & highly available - Secure, highly scalable & highly available
* Difference with ElastiCache at the CCP level: DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases - Difference with ElastiCache at the CCP level: DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases
### DynamoDB Global Tables ### DynamoDB - Global Tables
* Make a DynamoDB table accessible with low latency in multiple-regions - Make a DynamoDB table accessible with low latency in multiple-regions
* Active-Active replication (read/write to any AWS Region) - Active-Active replication (read/write to any AWS Region)
## Redshift Overview ## Redshift Overview
* Redshift is based on PostgreSQL, but it's not used for OLTP (Online Transactional Processing) - Redshift is based on PostgreSQL, but it's not used for OLTP (Online Transactional Processing)
* It's OLAP: online analytical processing (analytics and data warehousing) - It's OLAP: online analytical processing (analytics and data warehousing)
* Load data once every hour, not every second - Load data once every hour, not every second
* 10x better performance than other data warehouses, scale to PBs of data - 10x better performance than other data warehouses, scale to PBs of data
* Columnar storage of data (instead of row based) - Columnar storage of data (instead of row based)
* Massively Parallel Query Execution (MPP), highly available - Massively Parallel Query Execution (MPP), highly available
* Pay as you go based on the instances provisioned - Pay as you go based on the instances provisioned
* Has a SQL interface for performing the queries - Has a SQL interface for performing the queries
* BI tools such as AWS Quicksight or Tableau integrate with it - BI tools such as AWS Quicksight or Tableau integrate with it
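The idea behind columnar storage can be shown with plain Python lists. This is only an illustration of the layout, with made-up sample rows, and says nothing about how Redshift stores data internally. It assumes every row has the same fields.

```python
# Row-based layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def to_columns(records):
    """Pivot row-based records into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An analytical query such as "total amount" now touches a single column
# instead of reading every field of every row.
total_amount = sum(columns["amount"])
```

Analytical queries typically aggregate a few columns over many rows, which is why this layout suits data warehousing better than OLTP.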
## Amazon EMR ## Amazon EMR
* EMR stands for “Elastic MapReduce” - EMR stands for “Elastic MapReduce”
* EMR helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data - EMR helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
* The clusters can be made of hundreds of EC2 instances - The clusters can be made of hundreds of EC2 instances
* Also supports Apache Spark, HBase, Presto, Flink - Also supports Apache Spark, HBase, Presto, Flink
* EMR takes care of all the provisioning and configuration - EMR takes care of all the provisioning and configuration
* Auto-scaling and integrated with Spot instances - Auto-scaling and integrated with Spot instances
* Use cases: data processing, machine learning, web indexing, big data - Use cases: data processing, machine learning, web indexing, big data
## Amazon Athena ## Amazon Athena
* Serverless query service to analyze data stored in Amazon S3 - Serverless query service to analyze data stored in Amazon S3
* Uses standard SQL language to query the files - Uses standard SQL language to query the files
* Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto) - Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
* Pricing: $5.00 per TB of data scanned - Pricing: $5.00 per TB of data scanned
* Use compressed or columnar data for cost-savings (less scan) - Use compressed or columnar data for cost-savings (less scan)
* Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc... - Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc...
* **analyze data in S3 using serverless SQL, use Athena** - **analyze data in S3 using serverless SQL, use Athena**
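The per-TB pricing makes the cost of a query easy to estimate. The sketch below uses the $5.00 figure from the list above; the function name, the 1 TB = 2^40 bytes convention, and the 10% scan ratio for columnar data are assumptions chosen for illustration, not published Athena figures.

```python
PRICE_PER_TB_USD = 5.00   # figure quoted in the list above
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabytes


def athena_scan_cost(bytes_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Estimate the cost in USD of a query that scans `bytes_scanned` bytes."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


# Scanning 2 TB of raw CSV files in full.
full_scan_cost = athena_scan_cost(2 * BYTES_PER_TB)

# Hypothetical: the same data in a columnar format where the query
# only needs to read 10% of the bytes.
columnar_scan_cost = athena_scan_cost(2 * BYTES_PER_TB * 0.10)
```

Under these assumptions the full scan costs about ten times as much as the columnar one, which is the reasoning behind the "less scan" advice.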
## Amazon QuickSight ## Amazon QuickSight
* Serverless machine learning-powered business intelligence service to create interactive dashboards - Serverless machine learning-powered business intelligence service to create interactive dashboards
* Fast, automatically scalable, embeddable, with per-session pricing - Fast, automatically scalable, embeddable, with per-session pricing
* Use cases: - Use cases:
* Business analytics - Business analytics
* Building visualizations - Building visualizations
* Perform ad-hoc analysis - Perform ad-hoc analysis
* Get business insights using data - Get business insights using data
* Integrated with RDS, Aurora, Athena, Redshift, S3… - Integrated with RDS, Aurora, Athena, Redshift, S3…
## DocumentDB ## DocumentDB
* Aurora is an “AWS-implementation” of PostgreSQL / MySQL … - Aurora is an “AWS-implementation” of PostgreSQL / MySQL …
* DocumentDB is the same for MongoDB (which is a NoSQL database) - DocumentDB is the same for MongoDB (which is a NoSQL database)
* MongoDB is used to store, query, and index JSON data - MongoDB is used to store, query, and index JSON data
* Similar “deployment concepts” as Aurora - Similar “deployment concepts” as Aurora
* Fully Managed, highly available with replication across 3 AZ - Fully Managed, highly available with replication across 3 AZ
* DocumentDB storage automatically grows in increments of 10GB, up to 64 TB. - DocumentDB storage automatically grows in increments of 10GB, up to 64 TB.
* Automatically scales to workloads with millions of requests per second - Automatically scales to workloads with millions of requests per second
## Amazon Neptune ## Amazon Neptune
* Fully managed graph database - Fully managed graph database
* A popular graph dataset would be a social network - A popular graph dataset would be a social network
* Users have friends - Users have friends
* Posts have comments - Posts have comments
* Comments have likes from users - Comments have likes from users
* Users share and like posts… - Users share and like posts…
* Highly available across 3 AZ, with up to 15 read replicas - Highly available across 3 AZ, with up to 15 read replicas
* Build and run applications working with highly connected datasets optimized for these complex and hard queries - Build and run applications working with highly connected datasets optimized for these complex and hard queries
* Can store up to billions of relations and query the graph with milliseconds latency - Can store up to billions of relations and query the graph with milliseconds latency
* Highly available with replications across multiple AZs - Highly available with replications across multiple AZs
* Great for knowledge graphs (Wikipedia), fraud detection, recommendation engines, social networking - Great for knowledge graphs (Wikipedia), fraud detection, recommendation engines, social networking
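The social network example can be modelled as a small graph to show what a "highly connected" query looks like. This is a plain in-memory sketch with hypothetical names (`SocialGraph`, `friends_of_friends`); it does not use Neptune or any graph query language.

```python
from collections import defaultdict


class SocialGraph:
    """Undirected friendship graph stored as adjacency sets."""

    def __init__(self):
        self.friends = defaultdict(set)

    def add_friendship(self, user_a, user_b):
        self.friends[user_a].add(user_b)
        self.friends[user_b].add(user_a)

    def friends_of_friends(self, user):
        """Users two hops away who are not already direct friends."""
        direct = self.friends.get(user, set())
        two_hops = set()
        for friend in direct:
            two_hops |= self.friends[friend]
        return two_hops - direct - {user}


graph = SocialGraph()
graph.add_friendship("alice", "bob")
graph.add_friendship("bob", "carol")
graph.add_friendship("carol", "dave")
suggestions = graph.friends_of_friends("alice")
```

In a relational database the same question needs a self-join per hop, which is the kind of query a graph database is meant to handle more naturally.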
## Amazon QLDB ## Amazon QLDB
* QLDB stands for “Quantum Ledger Database” - QLDB stands for “Quantum Ledger Database”
* A ledger is a book **recording financial transactions** - A ledger is a book **recording financial transactions**
* Fully Managed, Serverless, Highly available, Replication across 3 AZ - Fully Managed, Serverless, Highly available, Replication across 3 AZ
* Used to **review history of all the changes made to your application data** over time - Used to **review history of all the changes made to your application data** over time
* **Immutable** system: no entry can be removed or modified, cryptographically verifiable - **Immutable** system: no entry can be removed or modified, cryptographically verifiable
* 2-3x better performance than common ledger blockchain frameworks, manipulate data using SQL - 2-3x better performance than common ledger blockchain frameworks, manipulate data using SQL
* Difference with Amazon Managed Blockchain: no decentralization component, in accordance with financial regulation rules - Difference with Amazon Managed Blockchain: no decentralization component, in accordance with financial regulation rules
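What "immutable" and "cryptographically verifiable" mean in practice can be sketched with a hash chain. This is a generic illustration built on SHA-256 from Python's standard library; the `Journal` class is hypothetical and is not a description of how QLDB is implemented.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash preceding the first entry


class Journal:
    """Append-only log where each entry's hash covers the previous hash."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(previous_hash, payload):
        return hashlib.sha256((previous_hash + payload).encode("utf-8")).hexdigest()

    def append(self, data):
        previous_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        payload = json.dumps(data, sort_keys=True)
        entry_hash = self._digest(previous_hash, payload)
        self._entries.append(
            {"data": payload, "previous_hash": previous_hash, "hash": entry_hash}
        )
        return entry_hash

    def verify(self):
        """Recompute every hash; any edited entry breaks the chain."""
        previous_hash = GENESIS_HASH
        for entry in self._entries:
            expected = self._digest(previous_hash, entry["data"])
            if entry["previous_hash"] != previous_hash or entry["hash"] != expected:
                return False
            previous_hash = entry["hash"]
        return True


journal = Journal()
journal.append({"account": "A", "change": -100})
journal.append({"account": "B", "change": 100})
valid_before_tampering = journal.verify()

# Simulate someone rewriting history behind the journal's back.
journal._entries[0]["data"] = json.dumps({"account": "A", "change": -1})
valid_after_tampering = journal.verify()
```

Because every hash depends on all earlier entries, changing one past record is detectable without trusting whoever stores the log.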
## Amazon Managed Blockchain ## Amazon Managed Blockchain
* Blockchain makes it possible to build applications where multiple parties can execute transactions without the need for a trusted, central authority. - Blockchain makes it possible to build applications where multiple parties can execute transactions without the need for a trusted, central authority.
* Amazon Managed Blockchain is a managed service to: - Amazon Managed Blockchain is a managed service to:
* Join public blockchain networks - Join public blockchain networks
* Or create your own scalable private network - Or create your own scalable private network
* Compatible with the frameworks Hyperledger Fabric & Ethereum - Compatible with the frameworks Hyperledger Fabric & Ethereum
## AWS Glue ## AWS Glue
* Managed extract, transform, and load (ETL) service - Managed extract, transform, and load (ETL) service
* Useful to prepare and transform data for analytics - Useful to prepare and transform data for analytics
* Fully serverless service - Fully serverless service
* Glue Data Catalog: catalog of datasets - Glue Data Catalog: catalog of datasets
* can be used by Athena, Redshift, EMR - can be used by Athena, Redshift, EMR
## DMS Database Migration Service ## DMS - Database Migration Service
* Quickly and securely migrate databases to AWS, resilient, self healing - Quickly and securely migrate databases to AWS, resilient, self healing
* The source database remains available during the migration - The source database remains available during the migration
* Supports: - Supports:
* Homogeneous migrations: ex Oracle to Oracle - Homogeneous migrations: ex Oracle to Oracle
* Heterogeneous migrations: ex Microsoft SQL Server to Aurora - Heterogeneous migrations: ex Microsoft SQL Server to Aurora
## Databases & Analytics Summary in AWS ## Databases & Analytics Summary
* Relational Databases - OLTP: RDS & Aurora (SQL) - Relational Databases - OLTP: RDS & Aurora (SQL)
* Differences between Multi-AZ, Read Replicas, Multi-Region - Differences between Multi-AZ, Read Replicas, Multi-Region
* In-memory Database: ElastiCache - In-memory Database: ElastiCache
* Key/Value Database: DynamoDB (serverless) & DAX (cache for DynamoDB) - Key/Value Database: DynamoDB (serverless) & DAX (cache for DynamoDB)
* Warehouse - OLAP: Redshift (SQL) - Warehouse - OLAP: Redshift (SQL)
* Hadoop Cluster: EMR - Hadoop Cluster: EMR
* Athena: query data on Amazon S3 (serverless & SQL) - Athena: query data on Amazon S3 (serverless & SQL)
* QuickSight: dashboards on your data (serverless) - QuickSight: dashboards on your data (serverless)
* DocumentDB: “Aurora for MongoDB” (JSON NoSQL database) - DocumentDB: “Aurora for MongoDB” (JSON NoSQL database)
* Amazon QLDB: Financial Transactions Ledger (immutable journal, cryptographically verifiable) - Amazon QLDB: Financial Transactions Ledger (immutable journal, cryptographically verifiable)
* Amazon Managed Blockchain: managed Hyperledger Fabric & Ethereum blockchains - Amazon Managed Blockchain: managed Hyperledger Fabric & Ethereum blockchains
* Glue: Managed ETL (Extract Transform Load) and Data Catalog service - Glue: Managed ETL (Extract Transform Load) and Data Catalog service
* Database Migration: DMS - Database Migration: DMS
* Neptune: graph database - Neptune: graph database

@@ -1,221 +1,243 @@
# Deploying and Managing Infrastructure at Scale # Deploying and Managing Infrastructure at Scale
## What is CloudFormation - [Deploying and Managing Infrastructure at Scale](#deploying-and-managing-infrastructure-at-scale)
- [What is CloudFormation?](#what-is-cloudformation)
- [Benefits of AWS CloudFormation](#benefits-of-aws-cloudformation)
- [CloudFormation Stack Designer](#cloudformation-stack-designer)
- [AWS Cloud Development Kit (CDK)](#aws-cloud-development-kit-cdk)
- [Developer problems on AWS](#developer-problems-on-aws)
- [AWS Elastic Beanstalk Overview](#aws-elastic-beanstalk-overview)
- [Elastic Beanstalk - Health Monitoring](#elastic-beanstalk---health-monitoring)
- [AWS CodeDeploy](#aws-codedeploy)
- [AWS CodeCommit](#aws-codecommit)
- [AWS CodeBuild](#aws-codebuild)
- [AWS CodePipeline](#aws-codepipeline)
- [AWS CodeArtifact](#aws-codeartifact)
- [AWS CodeStar](#aws-codestar)
- [AWS Cloud9](#aws-cloud9)
- [AWS Systems Manager (SSM)](#aws-systems-manager-ssm)
- [How Systems Manager works](#how-systems-manager-works)
- [Systems Manager - SSM Session Manager](#systems-manager---ssm-session-manager)
- [AWS OpsWorks](#aws-opsworks)
- [Deployment - Summary](#deployment---summary)
- [Developer Services - Summary](#developer-services---summary)
* CloudFormation is a declarative way of outlining your AWS Infrastructure, for any resources (most of them are supported). ## What is CloudFormation?
* For example, within a CloudFormation template, you say:
* I want a security group - CloudFormation is a declarative way of outlining your AWS Infrastructure, for any resources (most of them are supported).
* I want two EC2 instances using this security group - For example, within a CloudFormation template, you say:
* I want an S3 bucket - I want a security group
* I want a load balancer (ELB) in front of these machines - I want two EC2 instances using this security group
* Then CloudFormation creates those for you, in the right order, with the exact configuration that you specify - I want an S3 bucket
- I want a load balancer (ELB) in front of these machines
- Then CloudFormation creates those for you, in the right order, with the exact configuration that you specify
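The example in the list above can be written out as a template and the "right order" derived from the references between resources. The sketch below builds the template as a Python dictionary and prints it as JSON. The logical names (`WebSecurityGroup`, `WebServerA`, and so on) and the `ami-PLACEHOLDER` image ID are hypothetical, the template has not been validated against a real deployment, and the ordering function is only an illustration of the idea, not CloudFormation's actual engine. It needs Python 3.9+ for `graphlib`.

```python
import json
from graphlib import TopologicalSorter


def web_server():
    # Hypothetical instance definition; the AMI ID is a placeholder.
    return {
        "Type": "AWS::EC2::Instance",
        "Properties": {
            "ImageId": "ami-PLACEHOLDER",
            "InstanceType": "t2.micro",
            "SecurityGroupIds": [{"Fn::GetAtt": ["WebSecurityGroup", "GroupId"]}],
        },
    }


template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "WebSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {"GroupDescription": "Allow HTTP traffic"},
        },
        "WebServerA": web_server(),
        "WebServerB": web_server(),
        "AssetsBucket": {"Type": "AWS::S3::Bucket"},
        "LoadBalancer": {
            "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
            "Properties": {
                "AvailabilityZones": {"Fn::GetAZs": ""},
                "Instances": [{"Ref": "WebServerA"}, {"Ref": "WebServerB"}],
                "Listeners": [
                    {"LoadBalancerPort": "80", "InstancePort": "80", "Protocol": "HTTP"}
                ],
            },
        },
    },
}


def find_references(node, resource_names):
    """Collect logical names referenced through Ref or Fn::GetAtt."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Ref" and isinstance(value, str) and value in resource_names:
                found.add(value)
            elif key == "Fn::GetAtt" and isinstance(value, list) and value[0] in resource_names:
                found.add(value[0])
            else:
                found |= find_references(value, resource_names)
    elif isinstance(node, list):
        for item in node:
            found |= find_references(item, resource_names)
    return found


def creation_order(cfn_template):
    """Return resource names so that dependencies come before dependents."""
    resources = cfn_template["Resources"]
    names = set(resources)
    graph = {name: find_references(body, names) for name, body in resources.items()}
    return list(TopologicalSorter(graph).static_order())


order = creation_order(template)
print(json.dumps(template, indent=2))
print(order)
```

The template only declares what should exist; the order (security group before instances, instances before the load balancer) falls out of the references rather than being written by hand.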
### Benefits of AWS CloudFormation ### Benefits of AWS CloudFormation
* Infrastructure as code - Infrastructure as code
* No resources are manually created, which is excellent for control - No resources are manually created, which is excellent for control
* Changes to the infrastructure are reviewed through code - Changes to the infrastructure are reviewed through code
* Cost - Cost
* Each resource within the stack is tagged with an identifier so you can easily see how much a stack costs you - Each resource within the stack is tagged with an identifier so you can easily see how much a stack costs you
* You can estimate the costs of your resources using the CloudFormation template - You can estimate the costs of your resources using the CloudFormation template
* Savings strategy: In Dev, you could automate deletion of templates at 5 PM and recreation at 8 AM, safely - Savings strategy: In Dev, you could automate deletion of templates at 5 PM and recreation at 8 AM, safely
* Productivity - Productivity
* Ability to destroy and re-create an infrastructure on the cloud on the fly - Ability to destroy and re-create an infrastructure on the cloud on the fly
* Automated generation of Diagram for your templates! - Automated generation of Diagram for your templates!
* Declarative programming (no need to figure out ordering and orchestration) - Declarative programming (no need to figure out ordering and orchestration)
* Don't re-invent the wheel - Don't re-invent the wheel
* Leverage existing templates on the web! - Leverage existing templates on the web!
* Leverage the documentation - Leverage the documentation
* Supports (almost) all AWS resources: - Supports (almost) all AWS resources:
* Everything we'll see in this course is supported - Everything we'll see in this course is supported
* You can use “custom resources” for resources that are not supported - You can use “custom resources” for resources that are not supported
### CloudFormation Stack Designer ### CloudFormation Stack Designer
* Example: WordPress CloudFormation Stack - Example: WordPress CloudFormation Stack
* We can see all the resources - We can see all the resources
* We can see the relations between the components - We can see the relations between the components
## AWS Cloud Development Kit (CDK) ## AWS Cloud Development Kit (CDK)
* Define your cloud infrastructure using a familiar language: - Define your cloud infrastructure using a familiar language:
* JavaScript/TypeScript, Python, Java, and .NET - JavaScript/TypeScript, Python, Java, and .NET
* The code is “compiled” into a CloudFormation template (JSON/YAML) - The code is “compiled” into a CloudFormation template (JSON/YAML)
* You can therefore deploy infrastructure and application runtime code together - You can therefore deploy infrastructure and application runtime code together
* Great for Lambda functions - Great for Lambda functions
* Great for Docker containers in ECS / EKS - Great for Docker containers in ECS / EKS
## Developer problems on AWS ## Developer problems on AWS
* Managing infrastructure - Managing infrastructure
* Deploying Code - Deploying Code
* Configuring all the databases, load balancers, etc - Configuring all the databases, load balancers, etc
* Scaling concerns - Scaling concerns
* Most web apps have the same architecture (ALB + ASG) - Most web apps have the same architecture (ALB + ASG)
* All the developers want is for their code to run! - All the developers want is for their code to run!
* Possibly, consistently across different applications and environments - Possibly, consistently across different applications and environments
## AWS Elastic Beanstalk Overview ## AWS Elastic Beanstalk Overview
* Elastic Beanstalk is a developer centric view of deploying an application on AWS - Elastic Beanstalk is a developer centric view of deploying an application on AWS
* It uses all the components we've seen before: EC2, ASG, ELB, RDS, etc… - It uses all the components we've seen before: EC2, ASG, ELB, RDS, etc…
* But it's all in one view that's easy to make sense of! - But it's all in one view that's easy to make sense of!
* We still have full control over the configuration - We still have full control over the configuration
* Beanstalk = Platform as a Service (PaaS) - Beanstalk = Platform as a Service (PaaS)
* Beanstalk is free but you pay for the underlying instances - Beanstalk is free but you pay for the underlying instances
* Managed service - Managed service
* Instance configuration / OS is handled by Beanstalk - Instance configuration / OS is handled by Beanstalk
* Deployment strategy is configurable but performed by Elastic Beanstalk - Deployment strategy is configurable but performed by Elastic Beanstalk
* Capacity provisioning - Capacity provisioning
* Load balancing & auto-scaling - Load balancing & auto-scaling
* Application health-monitoring & responsiveness - Application health-monitoring & responsiveness
* Just the application code is the responsibility of the developer - Just the application code is the responsibility of the developer
* Three architecture models: - Three architecture models:
* Single Instance deployment: good for dev - Single Instance deployment: good for dev
* LB + ASG: great for production or pre-production web applications - LB + ASG: great for production or pre-production web applications
* ASG only: great for non-web apps in production (workers, etc..) - ASG only: great for non-web apps in production (workers, etc..)
* Support for many platforms: - Support for many platforms:
* Go - Go
* Java SE - Java SE
* Java with Tomcat - Java with Tomcat
* .NET on Windows Server with IIS - .NET on Windows Server with IIS
* Node.js - Node.js
* PHP - PHP
* Python - Python
* Ruby - Ruby
* Packer Builder - Packer Builder
* Single Container Docker - Single Container Docker
* Multi-Container Docker - Multi-Container Docker
* Preconfigured Docker - Preconfigured Docker
### Elastic Beanstalk Health Monitoring ### Elastic Beanstalk - Health Monitoring
* Health agent pushes metrics to CloudWatch - Health agent pushes metrics to CloudWatch
* Checks for app health, publishes health events - Checks for app health, publishes health events
## AWS CodeDeploy ## AWS CodeDeploy
* We want to deploy our application automatically - We want to deploy our application automatically
* Works with EC2 Instances - Works with EC2 Instances
* Works with On-Premises Servers - Works with On-Premises Servers
* Hybrid service - Hybrid service
* Servers / Instances must be provisioned and configured ahead of time with the CodeDeploy Agent - Servers / Instances must be provisioned and configured ahead of time with the CodeDeploy Agent
## AWS CodeCommit ## AWS CodeCommit
* Before pushing the application code to servers, it needs to be stored somewhere - Before pushing the application code to servers, it needs to be stored somewhere
* Developers usually store code in a repository, using the Git technology - Developers usually store code in a repository, using the Git technology
* A famous public offering is GitHub; AWS' competing product is CodeCommit - A famous public offering is GitHub; AWS' competing product is CodeCommit
* CodeCommit: - CodeCommit:
* Source-control service that hosts Git-based repositories - Source-control service that hosts Git-based repositories
* Makes it easy to collaborate with others on code - Makes it easy to collaborate with others on code
* The code changes are automatically versioned - The code changes are automatically versioned
* Benefits: - Benefits:
* Fully managed - Fully managed
* Scalable & highly available - Scalable & highly available
* Private, Secured, Integrated with AWS - Private, Secured, Integrated with AWS
## AWS CodeBuild ## AWS CodeBuild
* Code building service in the cloud (name is obvious) - Code building service in the cloud (name is obvious)
* Compiles source code, runs tests, and produces packages that are ready to be deployed (by CodeDeploy for example) - Compiles source code, runs tests, and produces packages that are ready to be deployed (by CodeDeploy for example)
* Benefits: - Benefits:
* Fully managed, serverless - Fully managed, serverless
* Continuously scalable & highly available - Continuously scalable & highly available
* Secure - Secure
* Pay-as-you-go pricing: only pay for the build time - Pay-as-you-go pricing: only pay for the build time
## AWS CodePipeline ## AWS CodePipeline
* Orchestrate the different steps to have the code automatically pushed to production - Orchestrate the different steps to have the code automatically pushed to production
* Code => Build => Test => Provision => Deploy - Code => Build => Test => Provision => Deploy
* Basis for CICD (Continuous Integration & Continuous Delivery) - Basis for CICD (Continuous Integration & Continuous Delivery)
* Benefits: - Benefits:
* Fully managed, compatible with CodeCommit, CodeBuild, CodeDeploy, Elastic Beanstalk, CloudFormation, GitHub, 3rd-party services & custom plugins… - Fully managed, compatible with CodeCommit, CodeBuild, CodeDeploy, Elastic Beanstalk, CloudFormation, GitHub, 3rd-party services & custom plugins…
* Fast delivery & rapid updates - Fast delivery & rapid updates
* CodePipeline: orchestration layer - CodePipeline: orchestration layer
* CodeCommit => CodeBuild => CodeDeploy => Elastic Beanstalk - CodeCommit => CodeBuild => CodeDeploy => Elastic Beanstalk
## AWS CodeArtifact ## AWS CodeArtifact
* Software packages depend on each other to be built (also called code dependencies), and new ones are created - Software packages depend on each other to be built (also called code dependencies), and new ones are created
* Storing and retrieving these dependencies is called artifact management - Storing and retrieving these dependencies is called artifact management
* Traditionally you need to set up your own artifact management system - Traditionally you need to set up your own artifact management system
* CodeArtifact is a secure, scalable, and cost-effective artifact management for software development - CodeArtifact is a secure, scalable, and cost-effective artifact management for software development
* Works with common dependency management tools such as Maven, Gradle, npm, yarn, twine, pip, and NuGet - Works with common dependency management tools such as Maven, Gradle, npm, yarn, twine, pip, and NuGet
* Developers and CodeBuild can then retrieve dependencies straight from CodeArtifact - Developers and CodeBuild can then retrieve dependencies straight from CodeArtifact
## AWS CodeStar ## AWS CodeStar
* Unified UI to easily manage software development activities in one place - Unified UI to easily manage software development activities in one place
* “Quick way” to get started and correctly set up CodeCommit, CodePipeline, CodeBuild, CodeDeploy, Elastic Beanstalk, EC2, etc… - “Quick way” to get started and correctly set up CodeCommit, CodePipeline, CodeBuild, CodeDeploy, Elastic Beanstalk, EC2, etc…
* Can edit the code “in-the-cloud” using AWS Cloud9 - Can edit the code “in-the-cloud” using AWS Cloud9
## AWS Cloud9 ## AWS Cloud9
* AWS Cloud9 is a cloud IDE (Integrated Development Environment) for writing, running and debugging code - AWS Cloud9 is a cloud IDE (Integrated Development Environment) for writing, running and debugging code
* “Classic” IDEs (like IntelliJ, Visual Studio Code…) are downloaded on a computer before being used - “Classic” IDEs (like IntelliJ, Visual Studio Code…) are downloaded on a computer before being used
* A cloud IDE can be used within a web browser, meaning you can work on your projects from your office, home, or anywhere with internet with no setup necessary - A cloud IDE can be used within a web browser, meaning you can work on your projects from your office, home, or anywhere with internet with no setup necessary
* AWS Cloud9 also allows for code collaboration in real-time (pair programming) - AWS Cloud9 also allows for code collaboration in real-time (pair programming)
## AWS Systems Manager (SSM) ## AWS Systems Manager (SSM)
* Helps you manage your EC2 and On-Premises systems at scale - Helps you manage your EC2 and On-Premises systems at scale
* Another Hybrid AWS service - Another Hybrid AWS service
* Get operational insights about the state of your infrastructure - Get operational insights about the state of your infrastructure
* Suite of 10+ products - Suite of 10+ products
* Most important features are: - Most important features are:
* Patching automation for enhanced compliance - Patching automation for enhanced compliance
* Run commands across an entire fleet of servers - Run commands across an entire fleet of servers
* Store parameter configuration with the SSM Parameter Store - Store parameter configuration with the SSM Parameter Store
* Works for both Windows and Linux OS - Works for both Windows and Linux OS
### How Systems Manager works ### How Systems Manager works
* We need to install the SSM agent onto the systems we control - We need to install the SSM agent onto the systems we control
* Installed by default on Amazon Linux AMI & some Ubuntu AMI - Installed by default on Amazon Linux AMI & some Ubuntu AMI
* If an instance can't be controlled with SSM, it's probably an issue with the SSM agent! - If an instance can't be controlled with SSM, it's probably an issue with the SSM agent!
* Thanks to the SSM agent, we can run commands, patch & configure our servers - Thanks to the SSM agent, we can run commands, patch & configure our servers
### Systems Manager SSM Session Manager ### Systems Manager - SSM Session Manager
* Allows you to start a secure shell on your EC2 and on-premises servers - Allows you to start a secure shell on your EC2 and on-premises servers
* No SSH access, bastion hosts, or SSH keys needed - No SSH access, bastion hosts, or SSH keys needed
* No port 22 needed (better security) - No port 22 needed (better security)
* Supports Linux, macOS, and Windows - Supports Linux, macOS, and Windows
* Send session log data to S3 or CloudWatch Logs - Send session log data to S3 or CloudWatch Logs
## AWS OpsWorks ## AWS OpsWorks
* Chef & Puppet help you perform server configuration and repetitive actions automatically - Chef & Puppet help you perform server configuration and repetitive actions automatically
* They work great with EC2 & On-Premises VM - They work great with EC2 & On-Premises VM
* AWS OpsWorks = Managed Chef & Puppet - AWS OpsWorks = Managed Chef & Puppet
* It's an alternative to AWS SSM - It's an alternative to AWS SSM
* Only provision standard AWS resources: - Only provision standard AWS resources:
* EC2 Instances, Databases, Load Balancers, EBS volumes… - EC2 Instances, Databases, Load Balancers, EBS volumes…
* **Chef or Puppet needed => AWS OpsWorks** - **Chef or Puppet needed => AWS OpsWorks**
## Deployment - Summary ## Deployment - Summary
* CloudFormation: (AWS only) - CloudFormation: (AWS only)
* Infrastructure as Code, works with almost all of AWS resources - Infrastructure as Code, works with almost all of AWS resources
* Repeat across Regions & Accounts - Repeat across Regions & Accounts
* Beanstalk: (AWS only) - Beanstalk: (AWS only)
* Platform as a Service (PaaS), limited to certain programming languages or Docker - Platform as a Service (PaaS), limited to certain programming languages or Docker
* Deploy code consistently with a known architecture: ex, ALB + EC2 + RDS - Deploy code consistently with a known architecture: ex, ALB + EC2 + RDS
* CodeDeploy (hybrid): deploy & upgrade any application onto servers - CodeDeploy (hybrid): deploy & upgrade any application onto servers
* Systems Manager (hybrid): patch, configure and run commands at scale - Systems Manager (hybrid): patch, configure and run commands at scale
* OpsWorks (hybrid): managed Chef and Puppet in AWS - OpsWorks (hybrid): managed Chef and Puppet in AWS
## Developer Services - Summary ## Developer Services - Summary
* CodeCommit: Store code in private git repository (version controlled) - CodeCommit: Store code in private git repository (version controlled)
* CodeBuild: Build & test code in AWS - CodeBuild: Build & test code in AWS
* CodeDeploy: Deploy code onto servers - CodeDeploy: Deploy code onto servers
* CodePipeline: Orchestration of pipeline (from code to build to deploy) - CodePipeline: Orchestration of pipeline (from code to build to deploy)
* CodeArtifact: Store software packages / dependencies on AWS - CodeArtifact: Store software packages / dependencies on AWS
* CodeStar: Unified view for allowing developers to do CICD and code - CodeStar: Unified view for allowing developers to do CICD and code
* Cloud9: Cloud IDE (Integrated Development Environment) with collab - Cloud9: Cloud IDE (Integrated Development Environment) with collab
* AWS CDK: Define your cloud infrastructure using a programming language - AWS CDK: Define your cloud infrastructure using a programming language


@@ -1,136 +1,154 @@
# EC2 Instance Storage # EC2 Instance Storage
* [EBS volumes](#ebs-volume) - [EC2 Instance Storage](#ec2-instance-storage)
* [EFS: network file system, can be attached to 100s of instances in a region](#efs-elastic-file-system) - [EBS Volumes](#ebs-volumes)
* [EFS-IA: cost-optimized storage class for infrequently accessed files](#efs-infrequent-access-efs-ia) - [What's an EBS Volume?](#whats-an-ebs-volume)
* [FSx for Windows: Network File System for Windows servers](#amazon-fsx-for-windows-file-server) - [EBS Volume](#ebs-volume)
* [FSx for Lustre: High Performance Computing Linux file system](#amazon-fsx-for-lustre) - [EBS Delete on Termination attribute](#ebs--delete-on-termination-attribute)
- [EBS Snapshots](#ebs-snapshots)
- [EBS Snapshots Features](#ebs-snapshots-features)
- [EFS: Elastic File System](#efs-elastic-file-system)
- [EFS Infrequent Access (EFS-IA)](#efs-infrequent-access-efs-ia)
- [Amazon FSx Overview](#amazon-fsx--overview)
- [Amazon FSx for Windows File Server](#amazon-fsx-for-windows-file-server)
- [Amazon FSx for Lustre](#amazon-fsx-for-lustre)
- [EC2 Instance Store](#ec2-instance-store)
- [Shared Responsibility Model for EC2 Storage](#shared-responsibility-model-for-ec2-storage)
- [AMI Overview](#ami-overview)
- [AMI Process (from an EC2 instance)](#ami-process-from-an-ec2-instance)
- [EC2 Image Builder](#ec2-image-builder)
- EBS: Elastic Block Store, Volume is a network drive you can attach to your instances while they run
- EFS: network file system, can be attached to 100s of instances in a region
- EFS-IA: cost-optimized storage class for infrequently accessed files
- FSx for Windows: Network File System for Windows servers
- FSx for Lustre: High Performance Computing Linux file system
## EBS Volumes ## EBS Volumes
### What's an EBS Volume? ### What's an EBS Volume?
* An EBS (Elastic Block Store) Volume is a network drive you can attach to your instances while they run - An EBS (Elastic Block Store) Volume is a network drive you can attach to your instances while they run
* It allows your instances to persist data, even after their termination - It allows your instances to persist data, even after their termination
* They can only be mounted to one instance at a time (at the CCP level) - They can only be mounted to one instance at a time (at the CCP level)
* They are bound to a specific availability zone - They are bound to a specific availability zone
* Analogy: Think of them as a “network USB stick” - Analogy: Think of them as a “network USB stick”
* Free tier: 30 GB of free EBS storage of type General Purpose (SSD) or Magnetic per month - Free tier: 30 GB of free EBS storage of type General Purpose (SSD) or Magnetic per month
### EBS Volume ### EBS Volume
* It's a network drive (i.e. not a physical drive) - It's a network drive (i.e. not a physical drive)
* It uses the network to communicate with the instance, which means there might be a bit of latency - It uses the network to communicate with the instance, which means there might be a bit of latency
* It can be detached from an EC2 instance and attached to another one quickly - It can be detached from an EC2 instance and attached to another one quickly
* It's locked to an Availability Zone (AZ) - It's locked to an Availability Zone (AZ)
* An EBS Volume in us-east-1a cannot be attached to us-east-1b - An EBS Volume in us-east-1a cannot be attached to us-east-1b
* To move a volume across, you first need to snapshot it - To move a volume across, you first need to snapshot it
* Have a provisioned capacity (size in GBs, and IOPS) - Have a provisioned capacity (size in GBs, and IOPS)
* You get billed for all the provisioned capacity - You get billed for all the provisioned capacity
* You can increase the capacity of the drive over time - You can increase the capacity of the drive over time
### EBS Delete on Termination attribute ### EBS Delete on Termination attribute
* Controls the EBS behaviour when an EC2 instance terminates - Controls the EBS behaviour when an EC2 instance terminates
* By default, the root EBS volume is deleted (attribute enabled) - By default, the root EBS volume is deleted (attribute enabled)
* By default, any other attached EBS volume is not deleted (attribute disabled) - By default, any other attached EBS volume is not deleted (attribute disabled)
* This can be controlled by the AWS console / AWS CLI - This can be controlled by the AWS console / AWS CLI
* Use case: preserve root volume when instance is terminated - Use case: preserve root volume when instance is terminated
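The default behaviour described above can be modelled in a few lines. This is only an illustrative sketch, not AWS code: the `Volume` class, its fields and the volume names are hypothetical, and `None` is used here to mean "no explicit setting, fall back to the default".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    """Hypothetical model of an EBS volume attached to one instance."""
    name: str
    is_root: bool
    delete_on_termination: Optional[bool] = None  # None = use the default

    def effective_delete_on_termination(self) -> bool:
        # Default from the notes: enabled for the root volume,
        # disabled for any other attached volume.
        if self.delete_on_termination is None:
            return self.is_root
        return self.delete_on_termination


def volumes_after_termination(volumes: List[Volume]) -> List[str]:
    """Return the names of the volumes that survive instance termination."""
    return [v.name for v in volumes if not v.effective_delete_on_termination()]


if __name__ == "__main__":
    defaults = [Volume("root", is_root=True), Volume("data", is_root=False)]
    print(volumes_after_termination(defaults))

    # Use case from the notes: preserve the root volume on termination.
    keep_root = [
        Volume("root", is_root=True, delete_on_termination=False),
        Volume("data", is_root=False),
    ]
    print(volumes_after_termination(keep_root))
```

With the defaults only the non-root volume remains; disabling the attribute on the root volume keeps both.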
### EBS Snapshots ### EBS Snapshots
* Make a backup (snapshot) of your EBS volume at a point in time - Make a backup (snapshot) of your EBS volume at a point in time
* Not necessary to detach volume to do snapshot, but recommended - Not necessary to detach volume to do snapshot, but recommended
* Can copy snapshots across AZ or Region - Can copy snapshots across AZ or Region
### EBS Snapshots Features ### EBS Snapshots Features
* EBS Snapshot Archive - EBS Snapshot Archive
* Move a Snapshot to an “archive tier” that is 75% cheaper - Move a Snapshot to an “archive tier” that is 75% cheaper
* Restoring from the archive takes 24 to 72 hours - Restoring from the archive takes 24 to 72 hours
* Recycle Bin for EBS Snapshots - Recycle Bin for EBS Snapshots
* Set up rules to retain deleted snapshots so you can recover them after an accidental deletion - Set up rules to retain deleted snapshots so you can recover them after an accidental deletion
* Specify retention (from 1 day to 1 year) - Specify retention (from 1 day to 1 year)
## EFS: Elastic File System ## EFS: Elastic File System
* Managed NFS (network file system) that can be mounted on 100s of EC2 - Managed NFS (network file system) that can be mounted on 100s of EC2
* EFS works with Linux EC2 instances in multi-AZ - EFS works with Linux EC2 instances in multi-AZ
* Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning - Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning
## EFS Infrequent Access (EFS-IA) ## EFS Infrequent Access (EFS-IA)
* Storage class that is cost-optimized for files not accessed every day - Storage class that is cost-optimized for files not accessed every day
* Up to 92% lower cost compared to EFS Standard - Up to 92% lower cost compared to EFS Standard
* EFS will automatically move your files to EFS-IA based on the last time they were accessed - EFS will automatically move your files to EFS-IA based on the last time they were accessed
* Enable EFS-IA with a Lifecycle Policy - Enable EFS-IA with a Lifecycle Policy
* Example: move files that are not accessed for 60 days to EFS-IA - Example: move files that are not accessed for 60 days to EFS-IA
* Transparent to the applications accessing EFS - Transparent to the applications accessing EFS
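The lifecycle rule in the example above (files not accessed for 60 days move to EFS-IA) amounts to a simple date comparison. The sketch below is an assumption-laden illustration, not how EFS is implemented: `storage_class` and its parameters are hypothetical names, and the rule is applied here as "idle for at least N days".

```python
from datetime import date


def storage_class(last_access: date, today: date, ia_after_days: int = 60) -> str:
    """Return the storage class a file would be in under the lifecycle rule.

    Hypothetical model: a file idle for at least `ia_after_days` days is
    treated as moved to EFS-IA, otherwise it stays in EFS Standard.
    """
    idle_days = (today - last_access).days
    return "EFS-IA" if idle_days >= ia_after_days else "EFS Standard"


if __name__ == "__main__":
    today = date(2022, 8, 16)
    print(storage_class(date(2022, 8, 1), today))   # accessed 15 days ago
    print(storage_class(date(2022, 5, 1), today))   # accessed 107 days ago
```

The calling application never sees this decision, which is what "transparent to the applications" means: the file path and contents stay the same, only the billing class changes.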
## Amazon FSx Overview ## Amazon FSx Overview
* Launch 3rd party high-performance file systems on AWS - Launch 3rd party high-performance file systems on AWS
* Fully managed service - Fully managed service
* FSx for Lustre - FSx for Lustre
* FSx for Windows File Server - FSx for Windows File Server
* FSx for NetApp ONTAP - FSx for NetApp ONTAP
### Amazon FSx for Windows File Server ### Amazon FSx for Windows File Server
* A fully managed, highly reliable, and scalable Windows native shared file system - A fully managed, highly reliable, and scalable Windows native shared file system
* Built on Windows File Server - Built on Windows File Server
* Supports SMB protocol & Windows NTFS - Supports SMB protocol & Windows NTFS
* Integrated with Microsoft Active Directory - Integrated with Microsoft Active Directory
* Can be accessed from AWS or your on-premises infrastructure - Can be accessed from AWS or your on-premises infrastructure
### Amazon FSx for Lustre ### Amazon FSx for Lustre
* A fully managed, high-performance, scalable file storage for High Performance Computing (HPC) - A fully managed, high-performance, scalable file storage for High Performance Computing (HPC)
* The name Lustre is derived from “Linux” and “cluster” - The name Lustre is derived from “Linux” and “cluster”
* Machine Learning, Analytics, Video Processing, Financial Modeling - Machine Learning, Analytics, Video Processing, Financial Modeling
* Scales up to 100s GB/s, millions of IOPS, sub-ms latencies - Scales up to 100s GB/s, millions of IOPS, sub-ms latencies
## EC2 Instance Store ## EC2 Instance Store
* EBS volumes are network drives with good but “limited” performance - EBS volumes are network drives with good but “limited” performance
* If you need a high-performance hardware disk, use EC2 Instance Store - If you need a high-performance hardware disk, use EC2 Instance Store
* Better I/O performance - Better I/O performance
* EC2 Instance Store volumes lose their data if the instance is stopped (ephemeral) - EC2 Instance Store volumes lose their data if the instance is stopped (ephemeral)
* Good for buffer / cache / scratch data / temporary content - Good for buffer / cache / scratch data / temporary content
* Risk of data loss if hardware fails - Risk of data loss if hardware fails
* Backups and Replication are your responsibility - Backups and Replication are your responsibility
## Shared Responsibility Model for EC2 Storage ## Shared Responsibility Model for EC2 Storage
AWS | USER | AWS | USER |
---- | ---- | ------------------------------------------------- | -------------------------------------------------- |
Infrastructure | Setting up backup / snapshot procedures | Infrastructure | Setting up backup / snapshot procedures |
Replication for data for EBS volumes & EFS drives | Setting up data encryption | Replication for data for EBS volumes & EFS drives | Setting up data encryption |
Replacing faulty hardware | Responsibility of any data on the drives | Replacing faulty hardware | Responsibility of any data on the drives |
Ensuring their employees cannot access your data | Understanding the risk of using EC2 Instance Store | Ensuring their employees cannot access your data | Understanding the risk of using EC2 Instance Store |
## AMI Overview ## AMI Overview
* AMI = Amazon Machine Image - AMI = Amazon Machine Image
* An AMI is a customization of an EC2 instance - An AMI is a customization of an EC2 instance
* You add your own software, configuration, operating system, monitoring… - You add your own software, configuration, operating system, monitoring…
* Faster boot / configuration time because all your software is pre-packaged - Faster boot / configuration time because all your software is pre-packaged
* AMIs are built for a specific region (and can be copied across regions) - AMIs are built for a specific region (and can be copied across regions)
* You can launch EC2 instances from: - You can launch EC2 instances from:
* A Public AMI: AWS provided - A Public AMI: AWS provided
* Your own AMI: you make and maintain them yourself - Your own AMI: you make and maintain them yourself
* An AWS Marketplace AMI: an AMI someone else made (and potentially sells) - An AWS Marketplace AMI: an AMI someone else made (and potentially sells)
### AMI Process (from an EC2 instance) ### AMI Process (from an EC2 instance)
* Start an EC2 instance and customize it - Start an EC2 instance and customize it
* Stop the instance (for data integrity) - Stop the instance (for data integrity)
* Build an AMI: this will also create EBS snapshots - Build an AMI: this will also create EBS snapshots
* Launch instances from other AMIs - Launch instances from other AMIs
## EC2 Image Builder ## EC2 Image Builder
* Used to automate the creation of Virtual Machines or container images - Used to automate the creation of Virtual Machines or container images
* => Automate the creation, maintenance, validation and testing of EC2 AMIs - => Automate the creation, maintenance, validation and testing of EC2 AMIs
* Can be run on a schedule (weekly, whenever packages are updated, etc…) - Can be run on a schedule (weekly, whenever packages are updated, etc…)
* Free service (only pay for the underlying resources) - Free service (only pay for the underlying resources)


@@ -1,173 +1,192 @@
# Other Compute # Other Compute
What is Docker? - [Other Compute](#other-compute)
- [What is Docker?](#what-is-docker)
- [Where Docker images are stored?](#where-docker-images-are-stored)
- [Docker versus Virtual Machines](#docker-versus-virtual-machines)
- [ECS](#ecs)
- [Fargate](#fargate)
- [ECR](#ecr)
- [What's serverless?](#whats-serverless)
- [Why AWS Lambda ?](#why-aws-lambda-)
- [Benefits of AWS Lambda](#benefits-of-aws-lambda)
- [AWS Lambda language support](#aws-lambda-language-support)
- [AWS Lambda Pricing: example](#aws-lambda-pricing-example)
- [Amazon API Gateway](#amazon-api-gateway)
- [AWS Batch](#aws-batch)
- [Batch vs Lambda](#batch-vs-lambda)
- [Amazon Lightsail](#amazon-lightsail)
- [Lambda Summary](#lambda-summary)
- [Other Compute Summary](#other-compute-summary)
* Docker is a software development platform to deploy apps ## What is Docker?
* Apps are packaged in containers that can be run on any OS
* Apps run the same, regardless of where they're run
* Any machine
* No compatibility issues
* Predictable behavior
* Less work
* Easier to maintain and deploy
* Works with any language, any OS, any technology
* Scale containers up and down very quickly (seconds)
Where Docker images are stored? - Docker is a software development platform to deploy apps
- Apps are packaged in containers that can be run on any OS
- Apps run the same, regardless of where they're run
- Any machine
- No compatibility issues
- Predictable behavior
- Less work
- Easier to maintain and deploy
- Works with any language, any OS, any technology
- Scale containers up and down very quickly (seconds)
* Docker images are stored in Docker Repositories ### Where Docker images are stored?
* Public: Docker Hub <https://hub.docker.com/>
* Find base images for many technologies or OS:
* Ubuntu
* MySQL
* NodeJS, Java…
* Private: Amazon ECR (Elastic Container Registry)
## Docker versus Virtual Machines - Docker images are stored in Docker Repositories
- Public: Docker Hub <https://hub.docker.com/>
- Find base images for many technologies or OS:
- Ubuntu
- MySQL
- NodeJS, Java…
- Private: Amazon ECR (Elastic Container Registry)
* Docker is “sort of” a virtualization technology, but not exactly ### Docker versus Virtual Machines
* Resources are shared with the host => many containers on one server
- Docker is “sort of” a virtualization technology, but not exactly
- Resources are shared with the host => many containers on one server
## ECS ## ECS
* ECS = Elastic Container Service - ECS = Elastic Container Service
* Launch Docker containers on AWS - Launch Docker containers on AWS
* You must provision & maintain the infrastructure (the EC2 instances) - You must provision & maintain the infrastructure (the EC2 instances)
* AWS takes care of starting / stopping containers - AWS takes care of starting / stopping containers
* Has integrations with the Application Load Balancer - Has integrations with the Application Load Balancer
## Fargate ## Fargate
* Launch Docker containers on AWS - Launch Docker containers on AWS
* You do not provision the infrastructure (no EC2 instances to manage) simpler! - You do not provision the infrastructure (no EC2 instances to manage) simpler!
* Serverless offering - Serverless offering
* AWS just runs containers for you based on the CPU / RAM you need - AWS just runs containers for you based on the CPU / RAM you need
## ECR ## ECR
* Elastic Container Registry - Elastic Container Registry
* Private Docker Registry on AWS - Private Docker Registry on AWS
* This is where you store your Docker images so they can be run by ECS or Fargate - This is where you store your Docker images so they can be run by ECS or Fargate
## What's serverless? ## What's serverless?
* Serverless is a new paradigm in which the developers don't have to manage servers anymore… - Serverless is a new paradigm in which the developers don't have to manage servers anymore…
* They just deploy code - They just deploy code
* They just deploy… functions! - They just deploy… functions!
* Initially... Serverless == FaaS (Function as a Service) - Initially... Serverless == FaaS (Function as a Service)
* Serverless was pioneered by AWS Lambda but now also includes anything that's managed: “databases, messaging, storage, etc.” - Serverless was pioneered by AWS Lambda but now also includes anything that's managed: “databases, messaging, storage, etc.”
* Serverless does not mean there are no servers… - Serverless does not mean there are no servers…
* it means you just don't manage / provision / see them - it means you just don't manage / provision / see them
## Why AWS Lambda ? ## Why AWS Lambda ?
EC2 | Lambda | EC2 | Lambda |
---- | ---- | -------------------------------------------------- | ----------------------------------------- |
Virtual Servers in the Cloud | Virtual functions no servers to manage! | Virtual Servers in the Cloud | Virtual functions no servers to manage! |
Limited by RAM and CPU | Limited by time - short executions | Limited by RAM and CPU | Limited by time - short executions |
Continuously running | Run on-demand | Continuously running | Run on-demand |
Scaling means intervention to add / remove servers | Scaling is automated! | Scaling means intervention to add / remove servers | Scaling is automated! |
## Benefits of AWS Lambda ### Benefits of AWS Lambda
* Easy Pricing: - Easy Pricing:
* Pay per request and compute time - Pay per request and compute time
* Free tier of 1,000,000 AWS Lambda requests and 400,000 GB-seconds of compute time - Free tier of 1,000,000 AWS Lambda requests and 400,000 GB-seconds of compute time
* Integrated with the whole AWS suite of services - Integrated with the whole AWS suite of services
* Event-Driven: functions get invoked by AWS when needed - Event-Driven: functions get invoked by AWS when needed
* Integrated with many programming languages - Integrated with many programming languages
* Easy monitoring through AWS CloudWatch - Easy monitoring through AWS CloudWatch
* Easy to get more resources per function (up to 10GB of RAM!) - Easy to get more resources per function (up to 10GB of RAM!)
* Increasing RAM will also improve CPU and network! - Increasing RAM will also improve CPU and network!
## AWS Lambda language support ### AWS Lambda language support
* Node.js (JavaScript) - Node.js (JavaScript)
* Python - Python
* Java (Java 8 compatible) - Java (Java 8 compatible)
* C# (.NET Core) - C# (.NET Core)
* Golang - Golang
* C# / PowerShell - C# / PowerShell
* Ruby - Ruby
* Custom Runtime API (community supported, example Rust) - Custom Runtime API (community supported, example Rust)
* Lambda Container Image - Lambda Container Image
* The container image must implement the Lambda Runtime API - The container image must implement the Lambda Runtime API
* ECS / Fargate is preferred for running arbitrary Docker images - ECS / Fargate is preferred for running arbitrary Docker images
## AWS Lambda Pricing: example ### AWS Lambda Pricing: example
* You can find overall pricing information here: <https://aws.amazon.com/lambda/pricing/> - You can find overall pricing information here: <https://aws.amazon.com/lambda/pricing/>
* Pay per call: - Pay per call:
* First 1,000,000 requests are free - First 1,000,000 requests are free
* $0.20 per 1 million requests thereafter ($0.0000002 per request) - $0.20 per 1 million requests thereafter ($0.0000002 per request)
* Pay per duration: (in increments of 1 ms) - Pay per duration: (in increments of 1 ms)
* 400,000 GB-seconds of compute time per month for FREE - 400,000 GB-seconds of compute time per month for FREE
* == 400,000 seconds if function is 1GB RAM - == 400,000 seconds if function is 1GB RAM
* == 3,200,000 seconds if function is 128 MB RAM - == 3,200,000 seconds if function is 128 MB RAM
* After that $1.00 for 600,000 GB-seconds - After that $1.00 for 600,000 GB-seconds
* It is usually **very cheap** to run AWS Lambda so it's **very popular** - It is usually **very cheap** to run AWS Lambda so it's **very popular**
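The pricing arithmetic above can be checked with a short calculation. This is a rough sketch that uses only the figures quoted in these notes (current AWS prices may differ). The function names and the simplification that every invocation has the same average duration are assumptions made for illustration.

```python
# Figures as quoted in the notes above; treat them as illustrative only.
FREE_REQUESTS = 1_000_000
PRICE_PER_MILLION_REQUESTS = 0.20        # USD
FREE_GB_SECONDS = 400_000
PRICE_PER_GB_SECOND = 1.00 / 600_000     # USD, "$1.00 for 600,000 GB-seconds"


def free_seconds(memory_mb: int) -> float:
    """Seconds of free compute per month for a function of the given size."""
    return FREE_GB_SECONDS / (memory_mb / 1024)


def monthly_cost(requests: int, avg_duration_s: float, memory_mb: int) -> float:
    """Estimated monthly bill in USD after subtracting the free tier."""
    billable_requests = max(0, requests - FREE_REQUESTS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS

    # GB-seconds = invocations x seconds per invocation x memory in GB
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    duration_cost = max(0.0, gb_seconds - FREE_GB_SECONDS) * PRICE_PER_GB_SECOND

    return request_cost + duration_cost


if __name__ == "__main__":
    print(free_seconds(1024))   # 1 GB function
    print(free_seconds(128))    # 128 MB function
    print(round(monthly_cost(3_000_000, 1.0, 1024), 2))
```

`free_seconds` reproduces the two equivalences in the list: 400,000 seconds at 1 GB and 3,200,000 seconds at 128 MB, because a 128 MB function uses one eighth of a GB-second per second of run time.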
## Amazon API Gateway ## Amazon API Gateway
* Example: building a serverless API - Example: building a serverless API
* Fully managed service for developers to easily create, publish, maintain, monitor, and secure APIs - Fully managed service for developers to easily create, publish, maintain, monitor, and secure APIs
* Serverless and scalable - Serverless and scalable
* Supports RESTful APIs and WebSocket APIs - Supports RESTful APIs and WebSocket APIs
* Support for security, user authentication, API throttling, API keys, monitoring. - Support for security, user authentication, API throttling, API keys, monitoring.
## AWS Batch ## AWS Batch
* Fully managed batch processing at any scale - Fully managed batch processing at any scale
* Efficiently run 100,000s of computing batch jobs on AWS - Efficiently run 100,000s of computing batch jobs on AWS
* A “batch” job is a job with a start and an end (as opposed to continuous) - A “batch” job is a job with a start and an end (as opposed to continuous)
* Batch will dynamically launch EC2 instances or Spot Instances - Batch will dynamically launch EC2 instances or Spot Instances
* AWS Batch provisions the right amount of compute / memory - AWS Batch provisions the right amount of compute / memory
* You submit or schedule batch jobs and AWS Batch does the rest! - You submit or schedule batch jobs and AWS Batch does the rest!
* Batch jobs are defined as Docker images and run on ECS - Batch jobs are defined as Docker images and run on ECS
* Helpful for cost optimizations and focusing less on the infrastructure - Helpful for cost optimizations and focusing less on the infrastructure
## Batch vs Lambda ## Batch vs Lambda
Batch | Lambda | Batch | Lambda |
---- | ---- | ------------------------------------------------------ | ---------------------------- |
No time limit | Time limit | No time limit | Time limit |
Any runtime as long as it's packaged as a Docker image | Limited runtime | Any runtime as long as it's packaged as a Docker image | Limited runtime |
Rely on EBS / instance store for disk space | Limited temporary disk space | Rely on EBS / instance store for disk space | Limited temporary disk space |
Relies on EC2 (can be managed by AWS) | Serverless | Relies on EC2 (can be managed by AWS) | Serverless |
## Amazon Lightsail ## Amazon Lightsail
* Virtual servers, storage, databases, and networking - Virtual servers, storage, databases, and networking
* Low & predictable pricing - Low & predictable pricing
* Simpler alternative to using EC2, RDS, ELB, EBS, Route 53… - Simpler alternative to using EC2, RDS, ELB, EBS, Route 53…
* Great for people with little cloud experience! - Great for people with little cloud experience!
* Can set up notifications and monitoring of your Lightsail resources - Can set up notifications and monitoring of your Lightsail resources
* Use cases: - Use cases:
* Simple web applications (has templates for LAMP, Nginx, MEAN, Node.js…) - Simple web applications (has templates for LAMP, Nginx, MEAN, Node.js…)
* Websites (templates for WordPress, Magento, Plesk, Joomla) - Websites (templates for WordPress, Magento, Plesk, Joomla)
* Dev / Test environment - Dev / Test environment
* Has high availability but no auto-scaling, limited AWS integrations - Has high availability but no auto-scaling, limited AWS integrations
## Lambda Summary ## Lambda Summary
* Lambda is Serverless, Function as a Service, seamless scaling, reactive - Lambda is Serverless, Function as a Service, seamless scaling, reactive
* Lambda Billing: - Lambda Billing:
* By the time run x by the RAM provisioned - By the time run x by the RAM provisioned
* By the number of invocations - By the number of invocations
* Language Support: many programming languages except (arbitrary) Docker - Language Support: many programming languages except (arbitrary) Docker
* Invocation time: up to 15 minutes - Invocation time: up to 15 minutes
* Use cases: - Use cases:
* Create Thumbnails for images uploaded onto S3 - Create Thumbnails for images uploaded onto S3
* Run a Serverless cron job - Run a Serverless cron job
* API Gateway: expose Lambda functions as HTTP API - API Gateway: expose Lambda functions as HTTP API
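The two billing dimensions listed above (run time multiplied by provisioned RAM, plus the number of invocations) can be sketched as a small calculation. This is a minimal sketch only: the function name and the GB-second unit are assumptions for illustration, and no prices are included because the notes give none.

```python
MAX_INVOCATION_SECONDS = 15 * 60  # invocation time limit stated in the summary


def lambda_usage(invocations: int, avg_duration_ms: float, memory_mb: int) -> dict:
    """Return the two billed dimensions: request count and GB-seconds.

    Hypothetical helper; it only multiplies duration by provisioned memory.
    """
    duration_s = avg_duration_ms / 1000.0
    if duration_s > MAX_INVOCATION_SECONDS:
        raise ValueError("a single invocation cannot run longer than 15 minutes")
    memory_gb = memory_mb / 1024.0
    return {
        "requests": invocations,
        "gb_seconds": invocations * duration_s * memory_gb,
    }
```

For example, 1,000 invocations of 500 ms each at 512 MB come to 1,000 requests and 250 GB-seconds.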
## Other Compute Summary ## Other Compute Summary
* Docker: container technology to run applications - Docker: container technology to run applications
* ECS: run Docker containers on EC2 instances - ECS: run Docker containers on EC2 instances
* Fargate: - Fargate:
* Run Docker containers without provisioning the infrastructure - Run Docker containers without provisioning the infrastructure
* Serverless offering (no EC2 instances) - Serverless offering (no EC2 instances)
* ECR: Private Docker Images Repository - ECR: Private Docker Images Repository
* Batch: run batch jobs on AWS across managed EC2 instances - Batch: run batch jobs on AWS across managed EC2 instances
* Lightsail: predictable & low pricing for simple application & DB stacks - Lightsail: predictable & low pricing for simple application & DB stacks

View File

@@ -1,71 +1,109 @@
# Amazon S3 # Amazon S3
- [Amazon S3](#amazon-s3)
- [S3 Use cases](#s3-use-cases)
- [Amazon S3 Overview - Buckets](#amazon-s3-overview---buckets)
- [Amazon S3 Overview - Objects](#amazon-s3-overview---objects)
- [S3 Security](#s3-security)
- [S3 Bucket Policies](#s3-bucket-policies)
- [Bucket settings for Block Public Access](#bucket-settings-for-block-public-access)
- [S3 Websites](#s3-websites)
- [S3 - Versioning](#s3---versioning)
- [S3 Access Logs](#s3-access-logs)
- [S3 Replication (CRR & SRR)](#s3-replication-crr--srr)
- [S3 Storage Classes](#s3-storage-classes)
- [S3 Durability and Availability](#s3-durability-and-availability)
- [S3 Standard General Purpose](#s3-standard-general-purpose)
- [S3 Storage Classes - Infrequent Access](#s3-storage-classes---infrequent-access)
- [S3 Standard Infrequent Access (S3 Standard-IA)](#s3-standard-infrequent-access-s3-standard-ia)
- [S3 One Zone Infrequent Access (S3 One Zone-IA)](#s3-one-zone-infrequent-access-s3-one-zone-ia)
- [Amazon S3 Glacier Storage Classes](#amazon-s3-glacier-storage-classes)
- [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
- [Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
- [Amazon S3 Glacier Deep Archive - for long term storage](#amazon-s3-glacier-deep-archive---for-long-term-storage)
- [S3 Intelligent-Tiering](#s3-intelligent-tiering)
- [S3 Object Lock & Glacier Vault Lock](#s3-object-lock--glacier-vault-lock)
- [Shared Responsibility Model for S3](#shared-responsibility-model-for-s3)
- [AWS Snow Family](#aws-snow-family)
- [Data Migrations with AWS Snow Family](#data-migrations-with-aws-snow-family)
- [Time to Transfer](#time-to-transfer)
- [Snowball Edge (for data transfers)](#snowball-edge-for-data-transfers)
- [AWS Snowcone](#aws-snowcone)
- [AWS Snowmobile](#aws-snowmobile)
- [Snow Family - Usage Process](#snow-family---usage-process)
- [What is Edge Computing?](#what-is-edge-computing)
- [Snow Family - Edge Computing](#snow-family---edge-computing)
- [AWS OpsHub](#aws-opshub)
- [Hybrid Cloud for Storage](#hybrid-cloud-for-storage)
- [AWS Storage Gateway](#aws-storage-gateway)
- [Amazon S3 - Summary](#amazon-s3---summary)
## S3 Use cases ## S3 Use cases
* Backup and storage - Backup and storage
* Disaster Recovery - Disaster Recovery
* Archive - Archive
* Hybrid Cloud storage - Hybrid Cloud storage
* Application hosting - Application hosting
* Media hosting - Media hosting
* Data lakes & big data analytics - Data lakes & big data analytics
* Software delivery - Software delivery
* Static website - Static website
## Amazon S3 Overview - Buckets ## Amazon S3 Overview - Buckets
* Amazon S3 allows people to store objects (files) in “buckets” (directories) - Amazon S3 allows people to store objects (files) in “buckets” (directories)
* Buckets must have a globally unique name (across all regions all accounts) - Buckets must have a globally unique name (across all regions all accounts)
* Buckets are defined at the region level - Buckets are defined at the region level
* S3 looks like a global service but buckets are created in a region - S3 looks like a global service but buckets are created in a region
* Naming convention - Naming convention
* No uppercase - No uppercase
* No underscore - No underscore
* 3-63 characters long - 3-63 characters long
* Not an IP - Not an IP
* Must start with lowercase letter or number - Must start with lowercase letter or number
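The naming rules above can be checked mechanically. The sketch below covers only the five rules listed in these notes (the real service enforces more), and `check_bucket_name` is a hypothetical helper, not an AWS API.

```python
import re
import string

# Four dot-separated groups of digits, e.g. 192.168.1.1
_IP_LIKE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")
_ALLOWED_FIRST = string.ascii_lowercase + string.digits


def check_bucket_name(name: str) -> list[str]:
    """Return the list of violated rules (empty list means the name passes)."""
    problems = []
    if not 3 <= len(name) <= 63:
        problems.append("must be 3-63 characters long")
    if any(c.isupper() for c in name):
        problems.append("no uppercase")
    if "_" in name:
        problems.append("no underscore")
    if _IP_LIKE.match(name):
        problems.append("not an IP")
    if not name or name[0] not in _ALLOWED_FIRST:
        problems.append("must start with lowercase letter or number")
    return problems
```

Returning every violated rule, rather than stopping at the first, makes the helper easier to use in a form or CLI that wants to report all problems at once.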
## Amazon S3 Overview - Objects ## Amazon S3 Overview - Objects
* Objects (files) have a Key - Objects (files) have a Key
* The key is the FULL path of the object (shown here after the bucket name): - The key is the FULL path of the object (shown here after the bucket name):
* s3://my-bucket/my_file.txt - s3://my-bucket/my_file.txt
* s3://my-bucket/my_folder1/another_folder/my_file.txt - s3://my-bucket/my_folder1/another_folder/my_file.txt
* The key is composed of **prefix** + **object name** - The key is composed of **prefix** + **object name**
* s3://my-bucket/my_folder1/another_folder/my_file.txt - s3://my-bucket/my_folder1/another_folder/my_file.txt
* There's no concept of “directories” within buckets (although the UI will trick you into thinking otherwise) - There's no concept of “directories” within buckets (although the UI will trick you into thinking otherwise)
* Just keys with very long names that contain slashes (“/”) - Just keys with very long names that contain slashes (“/”)
* Object values are the content of the body: - Object values are the content of the body:
* Max Object Size is 5TB (5000GB) - Max Object Size is 5TB (5000GB)
* If uploading more than 5GB, must use “multi-part upload” - If uploading more than 5GB, must use “multi-part upload”
* Metadata (list of text key / value pairs, system or user metadata) - Metadata (list of text key / value pairs, system or user metadata)
* Tags (Unicode key / value pairs, up to 10), useful for security / lifecycle - Tags (Unicode key / value pairs, up to 10), useful for security / lifecycle
* Version ID (if versioning is enabled) - Version ID (if versioning is enabled)
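To make the prefix + object name split concrete, here is a small sketch that takes one of the `s3://` paths above apart. `split_s3_uri` is a hypothetical helper, and it treats everything after the bucket name as the key.

```python
def split_s3_uri(uri: str) -> tuple[str, str, str]:
    """Split an s3:// URI into (bucket, prefix, object name)."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError("expected a URI starting with s3://")
    bucket, _, key = uri[len(scheme):].partition("/")
    # The prefix is everything up to and including the last slash in the key.
    prefix, sep, object_name = key.rpartition("/")
    return bucket, prefix + sep, object_name


bucket, prefix, name = split_s3_uri(
    "s3://my-bucket/my_folder1/another_folder/my_file.txt"
)
# bucket == "my-bucket"
# prefix == "my_folder1/another_folder/"
# name   == "my_file.txt"
```

The slashes are ordinary characters in the key, which is why plain string splitting is enough here.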
## S3 Security ## S3 Security
* **User based** - **User based**
* IAM policies - which API calls should be allowed for a specific user from IAM console - IAM policies - which API calls should be allowed for a specific user from IAM console
* **Resource Based** - **Resource Based**
* Bucket Policies - bucket wide rules from the S3 console - allows cross account - Bucket Policies - bucket wide rules from the S3 console - allows cross account
* Object Access Control List (ACL) - finer grain - Object Access Control List (ACL) - finer grain
* Bucket Access Control List (ACL) - less common - Bucket Access Control List (ACL) - less common
* **Note:** an IAM principal can access an S3 object if - **Note:** an IAM principal can access an S3 object if
* the user IAM permissions allow it OR the resource policy ALLOWS it - the user IAM permissions allow it OR the resource policy ALLOWS it
* AND there's no explicit DENY - AND there's no explicit DENY
* **Encryption:** encrypt objects in Amazon S3 using encryption keys - **Encryption:** encrypt objects in Amazon S3 using encryption keys
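The access note above boils down to one boolean expression. The sketch below is a deliberate simplification of it (real policy evaluation has more inputs), and the function and parameter names are hypothetical.

```python
def can_access(iam_allows: bool, resource_policy_allows: bool, explicit_deny: bool) -> bool:
    """(IAM allow OR resource policy allow) AND no explicit deny."""
    return (iam_allows or resource_policy_allows) and not explicit_deny
```

An explicit DENY wins regardless of how many ALLOWs exist, so `can_access(True, True, True)` is `False`.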
S3 Bucket Policies ## S3 Bucket Policies
* JSON based policies - JSON based policies
* Resources: buckets and objects - Resources: buckets and objects
* Actions: Set of API to Allow or Deny - Actions: Set of API to Allow or Deny
* Effect: Allow / Deny - Effect: Allow / Deny
* Principal: The account or user to apply the policy to - Principal: The account or user to apply the policy to
* Use S3 bucket policies to: - Use S3 bucket policies to:
* Grant public access to the bucket - Grant public access to the bucket
* Force objects to be encrypted at upload - Force objects to be encrypted at upload
* Grant access to another account (Cross Account) - Grant access to another account (Cross Account)
```json ```json
{ {
@@ -88,215 +126,216 @@ S3 Bucket Policies
## Bucket settings for Block Public Access ## Bucket settings for Block Public Access
* Block all public access: On - Block all public access: On
* Block public access to buckets and objects granted through new access control lists (ACLs): On - Block public access to buckets and objects granted through new access control lists (ACLs): On
* Block public access to buckets and objects granted through any access control lists (ACLs): On - Block public access to buckets and objects granted through any access control lists (ACLs): On
* Block public access to buckets and objects granted through new public bucket or access point policies: On - Block public access to buckets and objects granted through new public bucket or access point policies: On
* Block public and cross-account access to buckets and objects through any public bucket or access point policies: On - Block public and cross-account access to buckets and objects through any public bucket or access point policies: On
* These settings were created to prevent company data leaks - These settings were created to prevent company data leaks
* If you know your bucket should never be public, leave these on - If you know your bucket should never be public, leave these on
* Can be set at the account level - Can be set at the account level
## S3 Websites ## S3 Websites
* S3 can host static websites and have them accessible on the www - S3 can host static websites and have them accessible on the www
* The website URL will be: - The website URL will be:
* bucket-name.s3-website-AWS-region.amazonaws.com - bucket-name.s3-website-AWS-region.amazonaws.com
OR OR
* bucket-name.s3-website.AWS-region.amazonaws.com - bucket-name.s3-website.AWS-region.amazonaws.com
* **If you get a 403 (Forbidden) error, make sure the bucket policy allows public reads!** - **If you get a 403 (Forbidden) error, make sure the bucket policy allows public reads!**
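As a small illustration of the two URL shapes above, the sketch below builds both from a bucket name and a region code. `website_endpoints` is a hypothetical helper; which of the two forms applies depends on the region, so it simply returns both.

```python
def website_endpoints(bucket: str, region: str) -> tuple[str, str]:
    """Return the dash-style and dot-style S3 website hostnames."""
    dash_style = f"{bucket}.s3-website-{region}.amazonaws.com"
    dot_style = f"{bucket}.s3-website.{region}.amazonaws.com"
    return dash_style, dot_style
```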
## S3 - Versioning ## S3 - Versioning
* You can version your files in Amazon S3 - You can version your files in Amazon S3
* It is enabled at the bucket level - It is enabled at the bucket level
* Same key overwrite will increment the “version”: 1, 2, 3…. - Same key overwrite will increment the “version”: 1, 2, 3….
* It is best practice to version your buckets - It is best practice to version your buckets
* Protect against unintended deletes (ability to restore a version) - Protect against unintended deletes (ability to restore a version)
* Easy roll back to previous version - Easy roll back to previous version
* Notes: - Notes:
* Any file that is not versioned prior to enabling versioning will have version “null” - Any file that is not versioned prior to enabling versioning will have version “null”
* Suspending versioning does not delete the previous versions - Suspending versioning does not delete the previous versions
## S3 Access Logs ## S3 Access Logs
* For audit purposes, you may want to log all access to S3 buckets - For audit purposes, you may want to log all access to S3 buckets
* Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket - Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket
* That data can be analyzed using data analysis tools… - That data can be analyzed using data analysis tools…
* Very helpful to get to the root cause of an issue, audit usage, view suspicious patterns, etc… - Very helpful to get to the root cause of an issue, audit usage, view suspicious patterns, etc…
## S3 Replication (CRR & SRR) ## S3 Replication (CRR & SRR)
* Must enable versioning in source and destination - Must enable versioning in source and destination
* Cross Region Replication (CRR) - Cross Region Replication (CRR)
* Same Region Replication (SRR) - Same Region Replication (SRR)
* Buckets can be in different accounts - Buckets can be in different accounts
* Copying is asynchronous - Copying is asynchronous
* Must give proper IAM permissions to S3 - Must give proper IAM permissions to S3
* CRR - Use cases: compliance, lower latency access, replication across accounts - CRR - Use cases: compliance, lower latency access, replication across accounts
* SRR - Use cases: log aggregation, live replication between production and test accounts - SRR - Use cases: log aggregation, live replication between production and test accounts
## S3 Storage Classes ## S3 Storage Classes
* [Amazon S3 Standard - General Purpose](#s3-standard-general-purpose) - [Amazon S3 Standard - General Purpose](#s3-standard-general-purpose)
* [Amazon S3 Standard - Infrequent Access (IA)](#s3-standard-infrequent-access-s3-standard-ia) - [Amazon S3 Standard - Infrequent Access (IA)](#s3-standard-infrequent-access-s3-standard-ia)
* [Amazon S3 One Zone - Infrequent Access](#s3-one-zone-infrequent-access-s3-one-zone-ia) - [Amazon S3 One Zone - Infrequent Access](#s3-one-zone-infrequent-access-s3-one-zone-ia)
* [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval) - [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
* [Amazon S3 Glacier Flexible Retrieval](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier) - [Amazon S3 Glacier Flexible Retrieval](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
* [Amazon S3 Glacier Deep Archive](#amazon-s3-glacier-deep-archive--for-long-term-storage) - [Amazon S3 Glacier Deep Archive](#amazon-s3-glacier-deep-archive---for-long-term-storage)
* [Amazon S3 Intelligent Tiering](#s3-intelligent-tiering) - [Amazon S3 Intelligent Tiering](#s3-intelligent-tiering)
* Can move between classes manually or using S3 Lifecycle configurations - Can move between classes manually or using S3 Lifecycle configurations
## S3 Durability and Availability ### S3 Durability and Availability
* Durability: - Durability:
* High durability (99.999999999%, 11 9's) of objects across multiple AZs - High durability (99.999999999%, 11 9's) of objects across multiple AZs
* If you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years - If you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years
* Same for all storage classes - Same for all storage classes
* Availability: - Availability:
* Measures how readily available a service is - Measures how readily available a service is
* Varies depending on storage class - Varies depending on storage class
* Example: S3 standard has 99.99% availability = not available 53 minutes a year - Example: S3 standard has 99.99% availability = not available 53 minutes a year
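The “53 minutes a year” figure follows directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is hypothetical):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR
```

At 99.99% this gives about 52.6 minutes, which rounds to the 53 minutes quoted above. The same function applies to the other availability figures in these notes, for example 99.9% for S3 Standard-IA and 99.5% for S3 One Zone-IA.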
## S3 Standard General Purpose ### S3 Standard General Purpose
* 99.99% Availability - 99.99% Availability
* Used for frequently accessed data - Used for frequently accessed data
* Low latency and high throughput - Low latency and high throughput
* Sustain 2 concurrent facility failures - Sustain 2 concurrent facility failures
* Use Cases: Big Data analytics, mobile & gaming applications, content distribution… - Use Cases: Big Data analytics, mobile & gaming applications, content distribution…
## S3 Storage Classes Infrequent Access ### S3 Storage Classes - Infrequent Access
* For data that is less frequently accessed, but requires rapid access when needed - For data that is less frequently accessed, but requires rapid access when needed
* Lower cost than S3 Standard - Lower cost than S3 Standard
### S3 Standard Infrequent Access (S3 Standard-IA) #### S3 Standard Infrequent Access (S3 Standard-IA)
* 99.9% Availability - 99.9% Availability
* Use cases: Disaster Recovery, backups - Use cases: Disaster Recovery, backups
### S3 One Zone Infrequent Access (S3 One Zone-IA) #### S3 One Zone Infrequent Access (S3 One Zone-IA)
* High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed - High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed
* 99.5% Availability - 99.5% Availability
* Use Cases: Storing secondary backup copies of on-premises data, or data you can recreate - Use Cases: Storing secondary backup copies of on-premises data, or data you can recreate
## Amazon S3 Glacier Storage Classes ### Amazon S3 Glacier Storage Classes
* Low-cost object storage meant for archiving / backup - Low-cost object storage meant for archiving / backup
* Pricing: price for storage + object retrieval cost - Pricing: price for storage + object retrieval cost
### Amazon S3 Glacier Instant Retrieval #### Amazon S3 Glacier Instant Retrieval
* Millisecond retrieval, great for data accessed once a quarter - Millisecond retrieval, great for data accessed once a quarter
* Minimum storage duration of 90 days - Minimum storage duration of 90 days
### Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier) #### Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)
* Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours, free) - Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours, free)
* Minimum storage duration of 90 days - Minimum storage duration of 90 days
### Amazon S3 Glacier Deep Archive for long term storage #### Amazon S3 Glacier Deep Archive - for long term storage
* Standard (12 hours), Bulk (48 hours) - Standard (12 hours), Bulk (48 hours)
* Minimum storage duration of 180 days - Minimum storage duration of 180 days
## S3 Intelligent-Tiering ### S3 Intelligent-Tiering
* Small monthly monitoring and auto-tiering fee - Small monthly monitoring and auto-tiering fee
* Moves objects automatically between Access Tiers based on usage - Moves objects automatically between Access Tiers based on usage
* There are no retrieval charges in S3 Intelligent-Tiering - There are no retrieval charges in S3 Intelligent-Tiering
* Frequent Access tier (automatic): default tier - Frequent Access tier (automatic): default tier
* Infrequent Access tier (automatic): objects not accessed for 30 days - Infrequent Access tier (automatic): objects not accessed for 30 days
* Archive Instant Access tier (automatic): objects not accessed for 90 days - Archive Instant Access tier (automatic): objects not accessed for 90 days
* Archive Access tier (optional): configurable from 90 days to 700+ days - Archive Access tier (optional): configurable from 90 days to 700+ days
* Deep Archive Access tier (optional): config. from 180 days to 700+ days - Deep Archive Access tier (optional): config. from 180 days to 700+ days
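The three automatic tiers above can be read as thresholds on the number of days since an object was last accessed. The sketch below models only those automatic tiers and ignores the optional archive tiers; the function name and the exact boundary handling (treating day 30 and day 90 as inclusive) are assumptions.

```python
def automatic_tier(days_since_last_access: int) -> str:
    """Map idle days to the automatic Intelligent-Tiering access tier."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if days_since_last_access >= 90:
        return "Archive Instant Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"  # default tier
```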
## S3 Object Lock & Glacier Vault Lock ## S3 Object Lock & Glacier Vault Lock
* S3 Object Lock - S3 Object Lock
* Adopt a WORM (Write Once Read Many) model - Adopt a WORM (Write Once Read Many) model
* Block an object version deletion for a specified amount of time - Block an object version deletion for a specified amount of time
* Glacier Vault Lock - Glacier Vault Lock
* Adopt a WORM (Write Once Read Many) model - Adopt a WORM (Write Once Read Many) model
* Lock the policy for future edits (can no longer be changed) - Lock the policy for future edits (can no longer be changed)
* Helpful for compliance and data retention - Helpful for compliance and data retention
## Shared Responsibility Model for S3 ## Shared Responsibility Model for S3
AWS | YOU | AWS | YOU |
---- | ---- | ------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- |
Infrastructure (global security, durability, availability, sustain concurrent loss of data in two facilities) | S3 Versioning, S3 Bucket Policies, S3 Replication Setup | Infrastructure (global security, durability, availability, sustain concurrent loss of data in two facilities) | S3 Versioning, S3 Bucket Policies, S3 Replication Setup |
Configuration and vulnerability analysis | Logging and Monitoring, S3 Storage Classes | Configuration and vulnerability analysis | Logging and Monitoring, S3 Storage Classes |
Compliance validation | Data encryption at rest and in transit | Compliance validation | Data encryption at rest and in transit |
## AWS Snow Family ## AWS Snow Family
* Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS - Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
* Data migration: - Data migration:
* Snowcone - Snowcone
* Snowball Edge - Snowball Edge
* Snowmobile - Snowmobile
* Edge computing: - Edge computing:
* Snowcone - Snowcone
* Snowball Edge - Snowball Edge
## Data Migrations with AWS Snow Family ### Data Migrations with AWS Snow Family
* **AWS Snow Family: offline devices to perform data migrations.** If it takes more than a week to transfer over the network, use Snowball devices! - **AWS Snow Family: offline devices to perform data migrations.** If it takes more than a week to transfer over the network, use Snowball devices!
* Challenges: - Challenges:
* Limited connectivity - Limited connectivity
* Limited bandwidth - Limited bandwidth
* High network cost - High network cost
* Shared bandwidth (can't maximize the line) - Shared bandwidth (can't maximize the line)
* Connection stability - Connection stability
## Time to Transfer ### Time to Transfer
Data | 100 Mbps | 1 Gbps | 10 Gbps | Data | 100 Mbps | 1 Gbps | 10 Gbps |
10 TB | 12 days | 30 hours | 3 hours | ------ | -------- | -------- | -------- |
100 TB | 124 days | 12 days | 30 hours | 10 TB | 12 days | 30 hours | 3 hours |
1 PB | 3 years | 124 days | 12 days | 100 TB | 124 days | 12 days | 30 hours |
| 1 PB | 3 years | 124 days | 12 days |
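The figures in the table can be approximated with simple arithmetic: data size in bits divided by link speed. The sketch below uses decimal units (1 TB = 10^12 bytes). The `utilization` parameter is an assumption, not something the notes state; it is included because the table's numbers are somewhat higher than the ideal line-rate figures, and a value of about 0.75 reproduces them approximately.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Days needed to move data_tb terabytes over a link of link_mbps megabits/s."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    bits = data_tb * 1e12 * 8                      # terabytes -> bits
    effective_bps = link_mbps * 1e6 * utilization  # usable bits per second
    return bits / effective_bps / 86_400           # seconds -> days


ideal = transfer_days(10, 100)            # about 9.3 days at full line rate
realistic = transfer_days(10, 100, 0.75)  # about 12.3 days, close to the table's 12 days
```

Either way, the result for 10 TB at 100 Mbps is already past the one-week rule of thumb for choosing a Snowball device.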
## Snowball Edge (for data transfers) ### Snowball Edge (for data transfers)
* Physical data transport solution: move TBs or PBs of data in or out of AWS - Physical data transport solution: move TBs or PBs of data in or out of AWS
* Alternative to moving data over the network (and paying network fees) - Alternative to moving data over the network (and paying network fees)
* Pay per data transfer job - Pay per data transfer job
* Provide block storage and Amazon S3-compatible object storage - Provide block storage and Amazon S3-compatible object storage
* Snowball Edge Storage Optimized - Snowball Edge Storage Optimized
* 80 TB of HDD capacity for block volume and S3 compatible object storage - 80 TB of HDD capacity for block volume and S3 compatible object storage
* Snowball Edge Compute Optimized - Snowball Edge Compute Optimized
* 42 TB of HDD capacity for block volume and S3 compatible object storage - 42 TB of HDD capacity for block volume and S3 compatible object storage
* Use cases: large data cloud migrations, DC decommission, disaster recovery - Use cases: large data cloud migrations, DC decommission, disaster recovery
## AWS Snowcone ### AWS Snowcone
* Small, portable computing, anywhere, rugged & secure, withstands harsh environments - Small, portable computing, anywhere, rugged & secure, withstands harsh environments
* Light (4.5 pounds, 2.1 kg) - Light (4.5 pounds, 2.1 kg)
* Device used for edge computing, storage, and data transfer - Device used for edge computing, storage, and data transfer
* **8 TBs of usable storage** - **8 TBs of usable storage**
* Use Snowcone where Snowball does not fit (space-constrained environment) - Use Snowcone where Snowball does not fit (space-constrained environment)
* Must provide your own battery / cables - Must provide your own battery / cables
* Can be sent back to AWS offline, or connect it to internet and use **AWS DataSync** to send data - Can be sent back to AWS offline, or connect it to internet and use **AWS DataSync** to send data
## AWS Snowmobile ### AWS Snowmobile
* Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TBs) - Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TBs)
* Each Snowmobile has 100 PB of capacity (use multiple in parallel) - Each Snowmobile has 100 PB of capacity (use multiple in parallel)
* High security: temperature controlled, GPS, 24/7 video surveillance - High security: temperature controlled, GPS, 24/7 video surveillance
* **Better than Snowball if you transfer more than 10 PB** - **Better than Snowball if you transfer more than 10 PB**
Properties | Snowcone | Snowball Edge Storage Optimized | Snowmobile | Properties | Snowcone | Snowball Edge Storage Optimized | Snowmobile |
---- | ---- | ---- | ---- | ---------------- | ------------------------------- | ------------------------------- | ----------------------- |
Storage Capacity | 8 TB usable | 80 TB usable | < 100 PB | Storage Capacity | 8 TB usable | 80 TB usable | < 100 PB |
Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up to exabytes, offline | Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up to exabytes, offline |
## Snow Family Usage Process ### Snow Family - Usage Process
1. Request Snowball devices from the AWS console for delivery 1. Request Snowball devices from the AWS console for delivery
2. Install the snowball client / AWS OpsHub on your servers 2. Install the snowball client / AWS OpsHub on your servers
@@ -307,78 +346,78 @@ Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up
## What is Edge Computing? ## What is Edge Computing?
* Process data while it's being created on an edge location - Process data while it's being created on an edge location
* A truck on the road, a ship on the sea, a mining station underground... - A truck on the road, a ship on the sea, a mining station underground...
* These locations may have - These locations may have
* Limited / no internet access - Limited / no internet access
* Limited / no easy access to computing power - Limited / no easy access to computing power
* We set up a **Snowball Edge / Snowcone** device to do edge computing - We set up a **Snowball Edge / Snowcone** device to do edge computing
* Use cases of Edge Computing: - Use cases of Edge Computing:
* Preprocess data - Preprocess data
* Machine learning at the edge - Machine learning at the edge
* Transcoding media streams - Transcoding media streams
* Eventually (if need be) we can ship back the device to AWS (for transferring data for example) - Eventually (if need be) we can ship back the device to AWS (for transferring data for example)
## Snow Family Edge Computing ## Snow Family - Edge Computing
* **Snowcone (smaller)** - **Snowcone (smaller)**
* 2 CPUs, 4 GB of memory, wired or wireless access - 2 CPUs, 4 GB of memory, wired or wireless access
* USB-C power using a cord or the optional battery - USB-C power using a cord or the optional battery
* **Snowball Edge Compute Optimized** - **Snowball Edge Compute Optimized**
* 52 vCPUs, 208 GiB of RAM - 52 vCPUs, 208 GiB of RAM
* Optional GPU (useful for video processing or machine learning) - Optional GPU (useful for video processing or machine learning)
* 42 TB usable storage - 42 TB usable storage
* **Snowball Edge Storage Optimized** - **Snowball Edge Storage Optimized**
* Up to 40 vCPUs, 80 GiB of RAM - Up to 40 vCPUs, 80 GiB of RAM
* Object storage clustering available - Object storage clustering available
* All: Can run EC2 Instances & AWS Lambda functions (using AWS IoT Greengrass) - All: Can run EC2 Instances & AWS Lambda functions (using AWS IoT Greengrass)
* Long-term deployment options: 1 and 3 years discounted pricing - Long-term deployment options: 1 and 3 years discounted pricing
## AWS OpsHub ## AWS OpsHub
* Historically, to use Snow Family devices, you needed a CLI (Command Line Interface tool) - Historically, to use Snow Family devices, you needed a CLI (Command Line Interface tool)
* Today, you can use **AWS OpsHub** (software you install on your computer / laptop) to manage your Snow Family device - Today, you can use **AWS OpsHub** (software you install on your computer / laptop) to manage your Snow Family device
* Unlocking and configuring single or clustered devices - Unlocking and configuring single or clustered devices
* Transferring files - Transferring files
* Launching and managing instances running on Snow Family Devices - Launching and managing instances running on Snow Family Devices
* Monitor device metrics (storage capacity, active instances on your device) - Monitor device metrics (storage capacity, active instances on your device)
* Launch compatible AWS services on your devices (ex: Amazon EC2 instances, AWS DataSync, Network File System (NFS)) - Launch compatible AWS services on your devices (ex: Amazon EC2 instances, AWS DataSync, Network File System (NFS))
## Hybrid Cloud for Storage ## Hybrid Cloud for Storage
* AWS is pushing for “hybrid cloud” - AWS is pushing for “hybrid cloud”
* Part of your infrastructure is on-premises - Part of your infrastructure is on-premises
* Part of your infrastructure is on the cloud - Part of your infrastructure is on the cloud
* This can be due to - This can be due to
* Long cloud migrations - Long cloud migrations
* Security requirements - Security requirements
* Compliance requirements - Compliance requirements
* IT strategy - IT strategy
* S3 is a proprietary storage technology (unlike EFS / NFS), so how do you expose the S3 data on-premises? - S3 is a proprietary storage technology (unlike EFS / NFS), so how do you expose the S3 data on-premises?
* AWS Storage Gateway! - AWS Storage Gateway!
## AWS Storage Gateway ## AWS Storage Gateway
* Bridge between on-premises data and cloud data in S3 - Bridge between on-premises data and cloud data in S3
* Hybrid storage service that lets on-premises environments seamlessly use the AWS Cloud - Hybrid storage service that lets on-premises environments seamlessly use the AWS Cloud
* Use cases: disaster recovery, backup & restore, tiered storage - Use cases: disaster recovery, backup & restore, tiered storage
* Types of Storage Gateway: - Types of Storage Gateway:
* File Gateway - File Gateway
* Volume Gateway - Volume Gateway
* Tape Gateway - Tape Gateway
* No need to know the types at the exam - No need to know the types at the exam
## Amazon S3 Summary ## Amazon S3 - Summary
* Buckets vs Objects: global unique name, tied to a region - Buckets vs Objects: global unique name, tied to a region
* S3 security: IAM policy, S3 Bucket Policy (public access), S3 Encryption - S3 security: IAM policy, S3 Bucket Policy (public access), S3 Encryption
* S3 Websites: host a static website on Amazon S3 - S3 Websites: host a static website on Amazon S3
* S3 Versioning: multiple versions for files, prevent accidental deletes - S3 Versioning: multiple versions for files, prevent accidental deletes
* S3 Access Logs: log requests made within your S3 bucket - S3 Access Logs: log requests made within your S3 bucket
* S3 Replication: same-region or cross-region, must enable versioning - S3 Replication: same-region or cross-region, must enable versioning
* S3 Storage Classes: Standard, IA, 1Z-IA, Intelligent, Glacier, Glacier Deep Archive - S3 Storage Classes: Standard, IA, 1Z-IA, Intelligent, Glacier, Glacier Deep Archive
* S3 Lifecycle Rules: transition objects between classes - S3 Lifecycle Rules: transition objects between classes
* S3 Glacier Vault Lock / S3 Object Lock: WORM (Write Once Read Many) - S3 Glacier Vault Lock / S3 Object Lock: WORM (Write Once Read Many)
* Snow Family: import data onto S3 through a physical device, edge computing - Snow Family: import data onto S3 through a physical device, edge computing
* OpsHub: desktop application to manage Snow Family devices - OpsHub: desktop application to manage Snow Family devices
* Storage Gateway: hybrid solution to extend on-premises storage to S3 - Storage Gateway: hybrid solution to extend on-premises storage to S3