[Modified] Table Of Contents added
@@ -4,7 +4,6 @@
|
||||
|
||||
### Table of contents
|
||||
|
||||
- AWS Fundamentals
|
||||
- [What is Cloud Computing?](sections/cloud_computing.md)
|
||||
- [IAM: Identity Access & Management](sections/iam.md)
|
||||
- [EC2: Virtual Machines](sections/ec2.md)
|
||||
|
||||
@@ -1,37 +1,64 @@
|
||||
# Databases
|
||||
# Databases & Analytics
|
||||
|
||||
- [Databases & Analytics](#databases--analytics)
|
||||
- [Databases Intro](#databases-intro)
|
||||
- [Relational Databases](#relational-databases)
|
||||
- [NoSQL Databases](#nosql-databases)
|
||||
- [NoSQL data example: JSON](#nosql-data-example-json)
|
||||
- [Databases & Shared Responsibility on AWS](#databases--shared-responsibility-on-aws)
|
||||
- [AWS RDS Overview](#aws-rds-overview)
|
||||
- [Advantage over using RDS versus deploying DB on EC2](#advantage-over-using-rds-versus-deploying-db-on-ec2)
|
||||
- [RDS Deployments: Read Replicas, Multi-AZ](#rds-deployments-read-replicas-multi-az)
|
||||
- [RDS Deployments: Multi-Region](#rds-deployments-multi-region)
|
||||
- [Amazon Aurora](#amazon-aurora)
|
||||
- [Amazon ElastiCache Overview](#amazon-elasticache-overview)
|
||||
- [DynamoDB](#dynamodb)
|
||||
- [DynamoDB Accelerator - DAX](#dynamodb-accelerator---dax)
|
||||
- [DynamoDB - Global Tables](#dynamodb---global-tables)
|
||||
- [Redshift Overview](#redshift-overview)
|
||||
- [Amazon EMR](#amazon-emr)
|
||||
- [Amazon Athena](#amazon-athena)
|
||||
- [Amazon QuickSight](#amazon-quicksight)
|
||||
- [DocumentDB](#documentdb)
|
||||
- [Amazon Neptune](#amazon-neptune)
|
||||
- [Amazon QLDB](#amazon-qldb)
|
||||
- [Amazon Managed Blockchain](#amazon-managed-blockchain)
|
||||
- [AWS Glue](#aws-glue)
|
||||
- [DMS - Database Migration Service](#dms---database-migration-service)
|
||||
- [Databases & Analytics Summary](#databases--analytics-summary)
|
||||
|
||||
## Databases Intro
|
||||
|
||||
* Storing data on disk (EFS, EBS, EC2 Instance Store, S3) can have its limits
|
||||
* Sometimes, you want to store data in a database…
|
||||
* You can structure the data
|
||||
* You build indexes to efficiently query / search through the data
|
||||
* You define relationships between your datasets
|
||||
* Databases are optimized for a purpose and come with different features, shapes and constraints
|
||||
- Storing data on disk (EFS, EBS, EC2 Instance Store, S3) can have its limits
|
||||
- Sometimes, you want to store data in a database…
|
||||
- You can structure the data
|
||||
- You build indexes to efficiently query / search through the data
|
||||
- You define relationships between your datasets
|
||||
- Databases are optimized for a purpose and come with different features, shapes and constraints
|
||||
|
||||
## Relational Databases
|
||||
|
||||
* Looks just like Excel spreadsheets, with links between them!
|
||||
* Can use the SQL language to perform queries / lookups
|
||||
- Looks just like Excel spreadsheets, with links between them!
|
||||
- Can use the SQL language to perform queries / lookups
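
As a minimal sketch of the two points above, the following uses Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and their columns are hypothetical, chosen only to show a link (foreign key) between two tables and a SQL lookup across that link.

```python
import sqlite3


def total_spent_per_customer():
    """Build two linked tables in memory and join them with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount REAL NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 5.5), (3, 2, 12.0);
        """
    )
    # The JOIN follows the link from each order to its customer.
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(total_spent_per_customer())
```

The same query shape should carry over to any of the relational engines listed later in these notes, although SQL dialects differ in details.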
|
||||
|
||||
## NoSQL Databases
|
||||
|
||||
* NoSQL = non-SQL = non-relational databases
|
||||
* NoSQL databases are purpose-built for specific data models and have flexible schemas for building modern applications.
|
||||
* Benefits:
|
||||
* Flexibility: easy to evolve data model
|
||||
* Scalability: designed to scale-out by using distributed clusters
|
||||
* High-performance: optimized for a specific data model
|
||||
* Highly functional: types optimized for the data model
|
||||
* Examples: Key-value, document, graph, in-memory, search databases
|
||||
- NoSQL = non-SQL = non-relational databases
|
||||
- NoSQL databases are purpose-built for specific data models and have flexible schemas for building modern applications.
|
||||
- Benefits:
|
||||
- Flexibility: easy to evolve data model
|
||||
- Scalability: designed to scale-out by using distributed clusters
|
||||
- High-performance: optimized for a specific data model
|
||||
- Highly functional: types optimized for the data model
|
||||
- Examples: Key-value, document, graph, in-memory, search databases
|
||||
|
||||
### NoSQL data example: JSON
|
||||
|
||||
* JSON = JavaScript Object Notation
|
||||
* JSON is a common form of data that fits into a NoSQL model
|
||||
* Data can be nested
|
||||
* Fields can change over time
|
||||
* Support for new types: arrays, etc…
|
||||
- JSON = JavaScript Object Notation
|
||||
- JSON is a common form of data that fits into a NoSQL model
|
||||
- Data can be nested
|
||||
- Fields can change over time
|
||||
- Support for new types: arrays, etc…
|
||||
|
||||
```json
|
||||
{
|
||||
@@ -52,213 +79,213 @@
|
||||
|
||||
## Databases & Shared Responsibility on AWS
|
||||
|
||||
* AWS offers managed services for different databases
|
||||
* Benefits include:
|
||||
* Quick Provisioning, High Availability, Vertical and Horizontal Scaling
|
||||
* Automated Backup & Restore, Operations, Upgrades
|
||||
* Operating System Patching is handled by AWS
|
||||
* Monitoring, alerting
|
||||
* Note: many database technologies could be run on EC2, but you must then handle resiliency, backup, patching, high availability, fault tolerance and scaling yourself
|
||||
- AWS offers managed services for different databases
|
||||
- Benefits include:
|
||||
- Quick Provisioning, High Availability, Vertical and Horizontal Scaling
|
||||
- Automated Backup & Restore, Operations, Upgrades
|
||||
- Operating System Patching is handled by AWS
|
||||
- Monitoring, alerting
|
||||
- Note: many database technologies could be run on EC2, but you must then handle resiliency, backup, patching, high availability, fault tolerance and scaling yourself
|
||||
|
||||
## AWS RDS Overview
|
||||
|
||||
* RDS stands for Relational Database Service
|
||||
* It’s a managed DB service for databases that use SQL as a query language.
|
||||
* It allows you to create databases in the cloud that are managed by AWS
|
||||
* Postgres
|
||||
* MySQL
|
||||
* MariaDB
|
||||
* Oracle
|
||||
* Microsoft SQL Server
|
||||
* **Aurora (AWS Proprietary database)**
|
||||
- RDS stands for Relational Database Service
|
||||
- It’s a managed DB service for databases that use SQL as a query language.
|
||||
- It allows you to create databases in the cloud that are managed by AWS
|
||||
- Postgres
|
||||
- MySQL
|
||||
- MariaDB
|
||||
- Oracle
|
||||
- Microsoft SQL Server
|
||||
- **Aurora (AWS Proprietary database)**
|
||||
|
||||
### Advantage over using RDS versus deploying DB on EC2
|
||||
|
||||
* RDS is a managed service:
|
||||
* Automated provisioning, OS patching
|
||||
* Continuous backups and restore to specific timestamp (Point in Time Restore)!
|
||||
* Monitoring dashboards
|
||||
* Read replicas for improved read performance
|
||||
* Multi AZ setup for DR (Disaster Recovery)
|
||||
* Maintenance windows for upgrades
|
||||
* Scaling capability (vertical and horizontal)
|
||||
* Storage backed by EBS (gp2 or io1)
|
||||
* BUT you can’t SSH into your instances
|
||||
- RDS is a managed service:
|
||||
- Automated provisioning, OS patching
|
||||
- Continuous backups and restore to specific timestamp (Point in Time Restore)!
|
||||
- Monitoring dashboards
|
||||
- Read replicas for improved read performance
|
||||
- Multi AZ setup for DR (Disaster Recovery)
|
||||
- Maintenance windows for upgrades
|
||||
- Scaling capability (vertical and horizontal)
|
||||
- Storage backed by EBS (gp2 or io1)
|
||||
- BUT you can’t SSH into your instances
|
||||
|
||||
## Amazon Aurora
|
||||
### RDS Deployments: Read Replicas, Multi-AZ
|
||||
|
||||
* Aurora is a proprietary technology from AWS (not open sourced)
|
||||
* PostgreSQL and MySQL are both supported as Aurora DB
|
||||
* Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
|
||||
* Aurora storage automatically grows in increments of 10GB, up to 64 TB.
|
||||
* Aurora costs more than RDS (20% more) – but is more efficient
|
||||
* Not in the free tier
|
||||
|
||||
## RDS Deployments: Read Replicas, Multi-AZ
|
||||
|
||||
Read Replicas | Multi-AZ
|
||||
---- | ----
|
||||
Scale the read workload of your DB | Failover in case of AZ outage (high availability)
|
||||
Can create up to 5 Read Replicas | Data is only read/written to the main database
|
||||
Data is only written to the main DB | Can only have 1 other AZ as failover
|
||||
| Read Replicas | Multi-AZ |
|
||||
| ----------------------------------- | ------------------------------------------------- |
|
||||
| Scale the read workload of your DB | Failover in case of AZ outage (high availability) |
|
||||
| Can create up to 5 Read Replicas | Data is only read/written to the main database |
|
||||
| Data is only written to the main DB | Can only have 1 other AZ as failover |
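
The Read Replicas column can be illustrated with a toy model. This is a sketch in plain Python, not an RDS API: the `ReplicatedDatabase` class, its method names, and the explicit `replicate()` step (a stand-in for replication catching up) are all assumptions made for illustration. It assumes at least one replica, and it does not model Multi-AZ failover.

```python
import itertools


class ReplicatedDatabase:
    """Toy model: one writable primary plus read-only replicas."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # Writes only ever go to the main database.
        self.primary[key] = value

    def replicate(self):
        # Stand-in for replication bringing every replica up to date.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.primary)

    def read(self, key):
        # Reads are spread across the replicas in round-robin order.
        index = next(self._next_replica)
        return index, self.replicas[index].get(key)


if __name__ == "__main__":
    db = ReplicatedDatabase(replica_count=2)
    db.write("user:1", "Alice")
    print(db.read("user:1"))  # replica has not caught up yet
    db.replicate()
    print(db.read("user:1"))
```

The first read returns no value because the write has not yet reached the replica, which is why a replica can briefly serve stale data.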
|
||||
|
||||

|
||||
|
||||
## RDS Deployments: Multi-Region
|
||||
### RDS Deployments: Multi-Region
|
||||
|
||||
* Multi-Region (Read Replicas)
|
||||
* Disaster recovery in case of region issue
|
||||
* Local performance for global reads
|
||||
* Replication cost
|
||||
- Multi-Region (Read Replicas)
|
||||
- Disaster recovery in case of region issue
|
||||
- Local performance for global reads
|
||||
- Replication cost
|
||||
|
||||

|
||||
|
||||
## Amazon Aurora
|
||||
|
||||
- Aurora is a proprietary technology from AWS (not open sourced)
|
||||
- PostgreSQL and MySQL are both supported as Aurora DB
|
||||
- Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
|
||||
- Aurora storage automatically grows in increments of 10GB, up to 64 TB.
|
||||
- Aurora costs more than RDS (20% more) – but is more efficient
|
||||
- Not in the free tier
|
||||
|
||||
## Amazon ElastiCache Overview
|
||||
|
||||
* The same way RDS is to get managed Relational Databases…
|
||||
* ElastiCache is to get managed Redis or Memcached
|
||||
* Caches are in-memory databases with high performance, low latency
|
||||
* Helps reduce load off databases for read intensive workloads
|
||||
* AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring, failure recovery and backup
|
||||
- The same way RDS is to get managed Relational Databases…
|
||||
- ElastiCache is to get managed Redis or Memcached
|
||||
- Caches are in-memory databases with high performance, low latency
|
||||
- Helps reduce load off databases for read intensive workloads
|
||||
- AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring, failure recovery and backup
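
One common way a cache takes read load off a database is the cache-aside pattern, sketched below. This is an assumption-laden illustration: a plain dictionary stands in for Redis or Memcached, and `CacheAsideReader` and `load_from_database` are hypothetical names, not part of any ElastiCache client.

```python
class CacheAsideReader:
    """Check the cache first; fall back to the database only on a miss."""

    def __init__(self, load_from_database):
        self._load = load_from_database
        self._cache = {}  # stand-in for an in-memory cache such as Redis
        self.database_reads = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]  # cache hit: no database work
        self.database_reads += 1  # cache miss: query the database once
        value = self._load(key)
        self._cache[key] = value
        return value


if __name__ == "__main__":
    reader = CacheAsideReader(lambda key: key.upper())
    for _ in range(3):
        reader.get("profile")
    print(reader.database_reads)
```

A real deployment would also need an expiry or invalidation strategy so that cached values do not go stale; that is left out here.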
|
||||
|
||||
## DynamoDB
|
||||
|
||||
* Fully managed, highly available with replication across 3 AZs
|
||||
* NoSQL database - not a relational database
|
||||
* Scales to massive workloads, distributed “serverless” database
|
||||
* Millions of requests per second, trillions of rows, 100s of TB of storage
|
||||
* Fast and consistent in performance
|
||||
* Single-digit millisecond latency – low latency retrieval
|
||||
* Integrated with IAM for security, authorization and administration
|
||||
* Low cost and auto scaling capabilities
|
||||
* Standard & Infrequent Access (IA) Table Class
|
||||
- Fully managed, highly available with replication across 3 AZs
|
||||
- NoSQL database - not a relational database
|
||||
- Scales to massive workloads, distributed “serverless” database
|
||||
- Millions of requests per second, trillions of rows, 100s of TB of storage
|
||||
- Fast and consistent in performance
|
||||
- Single-digit millisecond latency – low latency retrieval
|
||||
- Integrated with IAM for security, authorization and administration
|
||||
- Low cost and auto scaling capabilities
|
||||
- Standard & Infrequent Access (IA) Table Class
|
||||
|
||||
### DynamoDB Accelerator - DAX
|
||||
|
||||
* Fully Managed in-memory cache for DynamoDB
|
||||
* 10x performance improvement – from single-digit millisecond latency to microsecond latency – when accessing your DynamoDB tables
|
||||
* Secure, highly scalable & highly available
|
||||
* Difference with ElastiCache at the CCP level: DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases
|
||||
- Fully Managed in-memory cache for DynamoDB
|
||||
- 10x performance improvement – from single-digit millisecond latency to microsecond latency – when accessing your DynamoDB tables
|
||||
- Secure, highly scalable & highly available
|
||||
- Difference with ElastiCache at the CCP level: DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases
|
||||
|
||||
### DynamoDB – Global Tables
|
||||
### DynamoDB - Global Tables
|
||||
|
||||
* Make a DynamoDB table accessible with low latency in multiple Regions
|
||||
* Active-Active replication (read/write to any AWS Region)
|
||||
- Make a DynamoDB table accessible with low latency in multiple Regions
|
||||
- Active-Active replication (read/write to any AWS Region)
|
||||
|
||||
## Redshift Overview
|
||||
|
||||
* Redshift is based on PostgreSQL, but it’s not used for OLTP (Online Transactional Processing)
|
||||
* It’s OLAP – online analytical processing (analytics and data warehousing)
|
||||
* Load data once every hour, not every second
|
||||
* 10x better performance than other data warehouses, scale to PBs of data
|
||||
* Columnar storage of data (instead of row based)
|
||||
* Massively Parallel Query Execution (MPP), highly available
|
||||
* Pay as you go based on the instances provisioned
|
||||
* Has a SQL interface for performing the queries
|
||||
* BI tools such as Amazon QuickSight or Tableau integrate with it
|
||||
- Redshift is based on PostgreSQL, but it’s not used for OLTP (Online Transactional Processing)
|
||||
- It’s OLAP – online analytical processing (analytics and data warehousing)
|
||||
- Load data once every hour, not every second
|
||||
- 10x better performance than other data warehouses, scale to PBs of data
|
||||
- Columnar storage of data (instead of row based)
|
||||
- Massively Parallel Query Execution (MPP), highly available
|
||||
- Pay as you go based on the instances provisioned
|
||||
- Has a SQL interface for performing the queries
|
||||
- BI tools such as Amazon QuickSight or Tableau integrate with it
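
To make the "columnar storage" point concrete, here is a small sketch of the difference between row-based and column-based layouts. The `ROWS` data and both function names are hypothetical, and this says nothing about how Redshift stores data internally; it only shows why an aggregate over one column needs to touch less data in a columnar layout.

```python
ROWS = [
    {"order_id": 1, "region": "eu", "amount": 20},
    {"order_id": 2, "region": "us", "amount": 35},
    {"order_id": 3, "region": "eu", "amount": 10},
]


def to_columns(rows):
    """Pivot row-oriented records into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_amount(columns):
    # The aggregate reads a single column, not every field of every row.
    return sum(columns["amount"])


if __name__ == "__main__":
    columns = to_columns(ROWS)
    print(columns["amount"], total_amount(columns))
```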
|
||||
|
||||
## Amazon EMR
|
||||
|
||||
* EMR stands for “Elastic MapReduce”
|
||||
* EMR helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
|
||||
* The clusters can be made of hundreds of EC2 instances
|
||||
* Also supports Apache Spark, HBase, Presto, Flink
|
||||
* EMR takes care of all the provisioning and configuration
|
||||
* Auto-scaling and integrated with Spot instances
|
||||
* Use cases: data processing, machine learning, web indexing, big data
|
||||
- EMR stands for “Elastic MapReduce”
|
||||
- EMR helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
|
||||
- The clusters can be made of hundreds of EC2 instances
|
||||
- Also supports Apache Spark, HBase, Presto, Flink
|
||||
- EMR takes care of all the provisioning and configuration
|
||||
- Auto-scaling and integrated with Spot instances
|
||||
- Use cases: data processing, machine learning, web indexing, big data
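
The "MapReduce" in the service name refers to a programming model that can be sketched in a few lines. The version below runs in a single Python process and uses hypothetical function names. On a real cluster the map and reduce phases would be distributed across many machines, which this sketch does not attempt to show.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```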
|
||||
|
||||
## Amazon Athena
|
||||
|
||||
* Serverless query service to analyze data stored in Amazon S3
|
||||
* Uses standard SQL language to query the files
|
||||
* Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
|
||||
* Pricing: $5.00 per TB of data scanned
|
||||
* Use compressed or columnar data for cost-savings (less scan)
|
||||
* Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc...
|
||||
* **analyze data in S3 using serverless SQL, use Athena**
|
||||
- Serverless query service to analyze data stored in Amazon S3
|
||||
- Uses standard SQL language to query the files
|
||||
- Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
|
||||
- Pricing: $5.00 per TB of data scanned
|
||||
- Use compressed or columnar data for cost-savings (less scan)
|
||||
- Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc...
|
||||
- **analyze data in S3 using serverless SQL, use Athena**
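
The pricing and cost-saving points above reduce to simple arithmetic, sketched here. The $5.00 per TB figure is the one quoted in these notes. The reduction ratio passed to `cost_after_reduction` is a hypothetical input, since the actual saving from compression or a columnar format depends on the data and the query.

```python
PRICE_PER_TB_USD = 5.00  # figure quoted in the notes above


def scan_cost_usd(terabytes_scanned):
    """Cost of one query that scans the given amount of data."""
    return terabytes_scanned * PRICE_PER_TB_USD


def cost_after_reduction(terabytes_raw, reduction_ratio):
    """Cost if compression or a columnar layout divides the bytes scanned
    by `reduction_ratio` (an assumed, workload-dependent number)."""
    return scan_cost_usd(terabytes_raw / reduction_ratio)


if __name__ == "__main__":
    print(scan_cost_usd(2))            # 2 TB scanned as-is
    print(cost_after_reduction(2, 4))  # same data, 4x less scanned
```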
|
||||
|
||||
## Amazon QuickSight
|
||||
|
||||
* Serverless machine learning-powered business intelligence service to create interactive dashboards
|
||||
* Fast, automatically scalable, embeddable, with per-session pricing
|
||||
* Use cases:
|
||||
* Business analytics
|
||||
* Building visualizations
|
||||
* Perform ad-hoc analysis
|
||||
* Get business insights using data
|
||||
* Integrated with RDS, Aurora, Athena, Redshift, S3…
|
||||
- Serverless machine learning-powered business intelligence service to create interactive dashboards
|
||||
- Fast, automatically scalable, embeddable, with per-session pricing
|
||||
- Use cases:
|
||||
- Business analytics
|
||||
- Building visualizations
|
||||
- Perform ad-hoc analysis
|
||||
- Get business insights using data
|
||||
- Integrated with RDS, Aurora, Athena, Redshift, S3…
|
||||
|
||||
## DocumentDB
|
||||
|
||||
* Aurora is an “AWS-implementation” of PostgreSQL / MySQL …
|
||||
* DocumentDB is the same for MongoDB (which is a NoSQL database)
|
||||
* MongoDB is used to store, query, and index JSON data
|
||||
* Similar “deployment concepts” as Aurora
|
||||
* Fully Managed, highly available with replication across 3 AZ
|
||||
* DocumentDB storage automatically grows in increments of 10 GB, up to 64 TB.
|
||||
* Automatically scales to workloads with millions of requests per second
|
||||
- Aurora is an “AWS-implementation” of PostgreSQL / MySQL …
|
||||
- DocumentDB is the same for MongoDB (which is a NoSQL database)
|
||||
- MongoDB is used to store, query, and index JSON data
|
||||
- Similar “deployment concepts” as Aurora
|
||||
- Fully Managed, highly available with replication across 3 AZ
|
||||
- DocumentDB storage automatically grows in increments of 10 GB, up to 64 TB.
|
||||
- Automatically scales to workloads with millions of requests per second
|
||||
|
||||
## Amazon Neptune
|
||||
|
||||
* Fully managed graph database
|
||||
* A popular graph dataset would be a social network
|
||||
* Users have friends
|
||||
* Posts have comments
|
||||
* Comments have likes from users
|
||||
* Users share and like posts…
|
||||
* Highly available across 3 AZ, with up to 15 read replicas
|
||||
* Build and run applications working with highly connected datasets – optimized for these complex and hard queries
|
||||
* Can store up to billions of relations and query the graph with milliseconds latency
|
||||
* Highly available with replication across multiple AZs
|
||||
* Great for knowledge graphs (Wikipedia), fraud detection, recommendation engines, social networking
|
||||
- Fully managed graph database
|
||||
- A popular graph dataset would be a social network
|
||||
- Users have friends
|
||||
- Posts have comments
|
||||
- Comments have likes from users
|
||||
- Users share and like posts…
|
||||
- Highly available across 3 AZ, with up to 15 read replicas
|
||||
- Build and run applications working with highly connected datasets – optimized for these complex and hard queries
|
||||
- Can store up to billions of relations and query the graph with milliseconds latency
|
||||
- Highly available with replication across multiple AZs
|
||||
- Great for knowledge graphs (Wikipedia), fraud detection, recommendation engines, social networking
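
The social network example above is a typical graph traversal. The sketch below uses a plain Python adjacency map with hypothetical user names. It is not written in a Neptune query language and only illustrates the kind of "highly connected" question a graph database is built for: suggesting friends of friends.

```python
FRIENDS = {
    "ana": {"ben", "cho"},
    "ben": {"ana", "dev"},
    "cho": {"ana", "dev"},
    "dev": {"ben", "cho"},
}


def friends_of_friends(graph, user):
    """Return users two hops away who are not already direct friends."""
    direct = graph.get(user, set())
    candidates = set()
    for friend in direct:
        candidates |= graph.get(friend, set())
    # Exclude the user and the people they already know.
    return candidates - direct - {user}


if __name__ == "__main__":
    print(friends_of_friends(FRIENDS, "ana"))
```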
|
||||
|
||||
## Amazon QLDB
|
||||
|
||||
* QLDB stands for “Quantum Ledger Database”
|
||||
* A ledger is a book **recording financial transactions**
|
||||
* Fully managed, serverless, highly available, with replication across 3 AZs
|
||||
* Used to **review history of all the changes made to your application data** over time
|
||||
* **Immutable** system: no entry can be removed or modified, cryptographically verifiable
|
||||
* 2-3x better performance than common ledger blockchain frameworks, manipulate data using SQL
|
||||
* Difference with Amazon Managed Blockchain: no decentralization component, in accordance with financial regulation rules
|
||||
- QLDB stands for “Quantum Ledger Database”
|
||||
- A ledger is a book **recording financial transactions**
|
||||
- Fully managed, serverless, highly available, with replication across 3 AZs
|
||||
- Used to **review history of all the changes made to your application data** over time
|
||||
- **Immutable** system: no entry can be removed or modified, cryptographically verifiable
|
||||
- 2-3x better performance than common ledger blockchain frameworks, manipulate data using SQL
|
||||
- Difference with Amazon Managed Blockchain: no decentralization component, in accordance with financial regulation rules
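
"Immutable" and "cryptographically verifiable" can be illustrated with a generic hash chain, where each entry's hash covers the previous entry's hash. This is a sketch of the general idea only. The `Journal` class is hypothetical and is not a description of QLDB's actual journal format or API.

```python
import hashlib
import json


class Journal:
    """Append-only log in which every entry is chained to the one before."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(previous_hash, data):
        payload = previous_hash + json.dumps(data, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, data):
        previous_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append(
            {
                "data": data,
                "previous_hash": previous_hash,
                "hash": self._digest(previous_hash, data),
            }
        )

    def verify(self):
        """Recompute every hash; any edited entry breaks the chain."""
        previous_hash = self.GENESIS
        for entry in self.entries:
            if entry["previous_hash"] != previous_hash:
                return False
            if entry["hash"] != self._digest(previous_hash, entry["data"]):
                return False
            previous_hash = entry["hash"]
        return True


if __name__ == "__main__":
    journal = Journal()
    journal.append({"account": "a", "delta": 10})
    journal.append({"account": "a", "delta": -3})
    print(journal.verify())
    journal.entries[0]["data"]["delta"] = 999  # tamper with history
    print(journal.verify())
```

Because every hash depends on all earlier entries, changing any past record is detectable by anyone who re-runs the verification.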
|
||||
|
||||
## Amazon Managed Blockchain
|
||||
|
||||
* Blockchain makes it possible to build applications where multiple parties can execute transactions without the need for a trusted, central authority.
|
||||
* Amazon Managed Blockchain is a managed service to:
|
||||
* Join public blockchain networks
|
||||
* Or create your own scalable private network
|
||||
* Compatible with the frameworks Hyperledger Fabric & Ethereum
|
||||
- Blockchain makes it possible to build applications where multiple parties can execute transactions without the need for a trusted, central authority.
|
||||
- Amazon Managed Blockchain is a managed service to:
|
||||
- Join public blockchain networks
|
||||
- Or create your own scalable private network
|
||||
- Compatible with the frameworks Hyperledger Fabric & Ethereum
|
||||
|
||||
## AWS Glue
|
||||
|
||||
* Managed extract, transform, and load (ETL) service
|
||||
* Useful to prepare and transform data for analytics
|
||||
* Fully serverless service
|
||||
* Glue Data Catalog: catalog of datasets
|
||||
* can be used by Athena, Redshift, EMR
|
||||
- Managed extract, transform, and load (ETL) service
|
||||
- Useful to prepare and transform data for analytics
|
||||
- Fully serverless service
|
||||
- Glue Data Catalog: catalog of datasets
|
||||
- can be used by Athena, Redshift, EMR
|
||||
|
||||
## DMS – Database Migration Service
|
||||
## DMS - Database Migration Service
|
||||
|
||||
* Quickly and securely migrate databases to AWS, resilient, self-healing
|
||||
* The source database remains available during the migration
|
||||
* Supports:
|
||||
* Homogeneous migrations: e.g. Oracle to Oracle
|
||||
* Heterogeneous migrations: e.g. Microsoft SQL Server to Aurora
|
||||
- Quickly and securely migrate databases to AWS, resilient, self-healing
|
||||
- The source database remains available during the migration
|
||||
- Supports:
|
||||
- Homogeneous migrations: e.g. Oracle to Oracle
|
||||
- Heterogeneous migrations: e.g. Microsoft SQL Server to Aurora
|
||||
|
||||
## Databases & Analytics Summary in AWS
|
||||
## Databases & Analytics Summary
|
||||
|
||||
* Relational Databases - OLTP: RDS & Aurora (SQL)
|
||||
* Differences between Multi-AZ, Read Replicas, Multi-Region
|
||||
* In-memory Database: ElastiCache
|
||||
* Key/Value Database: DynamoDB (serverless) & DAX (cache for DynamoDB)
|
||||
* Warehouse - OLAP: Redshift (SQL)
|
||||
* Hadoop Cluster: EMR
|
||||
* Athena: query data on Amazon S3 (serverless & SQL)
|
||||
* QuickSight: dashboards on your data (serverless)
|
||||
* DocumentDB: “Aurora for MongoDB” (JSON – NoSQL database)
|
||||
* Amazon QLDB: Financial Transactions Ledger (immutable journal, cryptographically verifiable)
|
||||
* Amazon Managed Blockchain: managed Hyperledger Fabric & Ethereum blockchains
|
||||
* Glue: Managed ETL (Extract Transform Load) and Data Catalog service
|
||||
* Database Migration: DMS
|
||||
* Neptune: graph database
|
||||
- Relational Databases - OLTP: RDS & Aurora (SQL)
|
||||
- Differences between Multi-AZ, Read Replicas, Multi-Region
|
||||
- In-memory Database: ElastiCache
|
||||
- Key/Value Database: DynamoDB (serverless) & DAX (cache for DynamoDB)
|
||||
- Warehouse - OLAP: Redshift (SQL)
|
||||
- Hadoop Cluster: EMR
|
||||
- Athena: query data on Amazon S3 (serverless & SQL)
|
||||
- QuickSight: dashboards on your data (serverless)
|
||||
- DocumentDB: “Aurora for MongoDB” (JSON – NoSQL database)
|
||||
- Amazon QLDB: Financial Transactions Ledger (immutable journal, cryptographically verifiable)
|
||||
- Amazon Managed Blockchain: managed Hyperledger Fabric & Ethereum blockchains
|
||||
- Glue: Managed ETL (Extract Transform Load) and Data Catalog service
|
||||
- Database Migration: DMS
|
||||
- Neptune: graph database
|
||||
|
||||
@@ -1,221 +1,243 @@
|
||||
# Deploying and Managing Infrastructure at Scale
|
||||
|
||||
## What is CloudFormation
|
||||
- [Deploying and Managing Infrastructure at Scale](#deploying-and-managing-infrastructure-at-scale)
|
||||
- [What is CloudFormation?](#what-is-cloudformation)
|
||||
- [Benefits of AWS CloudFormation](#benefits-of-aws-cloudformation)
|
||||
- [CloudFormation Stack Designer](#cloudformation-stack-designer)
|
||||
- [AWS Cloud Development Kit (CDK)](#aws-cloud-development-kit-cdk)
|
||||
- [Developer problems on AWS](#developer-problems-on-aws)
|
||||
- [AWS Elastic Beanstalk Overview](#aws-elastic-beanstalk-overview)
|
||||
- [Elastic Beanstalk - Health Monitoring](#elastic-beanstalk---health-monitoring)
|
||||
- [AWS CodeDeploy](#aws-codedeploy)
|
||||
- [AWS CodeCommit](#aws-codecommit)
|
||||
- [AWS CodeBuild](#aws-codebuild)
|
||||
- [AWS CodePipeline](#aws-codepipeline)
|
||||
- [AWS CodeArtifact](#aws-codeartifact)
|
||||
- [AWS CodeStar](#aws-codestar)
|
||||
- [AWS Cloud9](#aws-cloud9)
|
||||
- [AWS Systems Manager (SSM)](#aws-systems-manager-ssm)
|
||||
- [How Systems Manager works](#how-systems-manager-works)
|
||||
- [Systems Manager - SSM Session Manager](#systems-manager---ssm-session-manager)
|
||||
- [AWS OpsWorks](#aws-opsworks)
|
||||
- [Deployment - Summary](#deployment---summary)
|
||||
- [Developer Services - Summary](#developer-services---summary)
|
||||
|
||||
* CloudFormation is a declarative way of outlining your AWS Infrastructure, for any resources (most of them are supported).
|
||||
* For example, within a CloudFormation template, you say:
|
||||
* I want a security group
|
||||
* I want two EC2 instances using this security group
|
||||
* I want an S3 bucket
|
||||
* I want a load balancer (ELB) in front of these machines
|
||||
* Then CloudFormation creates those for you, in the right order, with the exact configuration that you specify
|
||||
## What is CloudFormation?
|
||||
|
||||
- CloudFormation is a declarative way of outlining your AWS Infrastructure, for any resources (most of them are supported).
|
||||
- For example, within a CloudFormation template, you say:
|
||||
- I want a security group
|
||||
- I want two EC2 instances using this security group
|
||||
- I want an S3 bucket
|
||||
- I want a load balancer (ELB) in front of these machines
|
||||
- Then CloudFormation creates those for you, in the right order, with the exact configuration that you specify
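
"In the right order" means that dependencies are created before the resources that need them. The sketch below shows that idea with a topological sort from Python's standard `graphlib` module (Python 3.9+). The resource names and the `DEPENDS_ON` map are hypothetical, and this is an illustration of dependency ordering in general, not of how CloudFormation is implemented.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on.
DEPENDS_ON = {
    "SecurityGroup": set(),
    "Bucket": set(),
    "InstanceA": {"SecurityGroup"},
    "InstanceB": {"SecurityGroup"},
    "LoadBalancer": {"InstanceA", "InstanceB"},
}


def creation_order(depends_on):
    """Return an order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(creation_order(DEPENDS_ON))
```

Several valid orders exist (the bucket, for instance, can be created at any point), so the exact output is not fixed; what is guaranteed is that the security group precedes both instances and both instances precede the load balancer.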
|
||||
|
||||
### Benefits of AWS CloudFormation
|
||||
|
||||
* Infrastructure as code
|
||||
* No resources are manually created, which is excellent for control
|
||||
* Changes to the infrastructure are reviewed through code
|
||||
* Cost
|
||||
* Each resource within the stack is tagged with an identifier so you can easily see how much a stack costs you
|
||||
* You can estimate the costs of your resources using the CloudFormation template
|
||||
* Savings strategy: in Dev, you could automate the deletion of stacks at 5 PM and their re-creation at 8 AM, safely
|
||||
* Productivity
|
||||
* Ability to destroy and re-create an infrastructure on the cloud on the fly
|
||||
* Automated generation of Diagram for your templates!
|
||||
* Declarative programming (no need to figure out ordering and orchestration)
|
||||
* Don’t re-invent the wheel
|
||||
* Leverage existing templates on the web!
|
||||
* Leverage the documentation
|
||||
* Supports (almost) all AWS resources:
|
||||
* Everything we’ll see in this course is supported
|
||||
* You can use “custom resources” for resources that are not supported
|
||||
- Infrastructure as code
|
||||
- No resources are manually created, which is excellent for control
|
||||
- Changes to the infrastructure are reviewed through code
|
||||
- Cost
|
||||
- Each resource within the stack is tagged with an identifier so you can easily see how much a stack costs you
|
||||
- You can estimate the costs of your resources using the CloudFormation template
|
||||
- Savings strategy: in Dev, you could automate the deletion of stacks at 5 PM and their re-creation at 8 AM, safely
|
||||
- Productivity
|
||||
- Ability to destroy and re-create an infrastructure on the cloud on the fly
|
||||
- Automated generation of Diagram for your templates!
|
||||
- Declarative programming (no need to figure out ordering and orchestration)
|
||||
- Don’t re-invent the wheel
|
||||
- Leverage existing templates on the web!
|
||||
- Leverage the documentation
|
||||
- Supports (almost) all AWS resources:
|
||||
- Everything we’ll see in this course is supported
|
||||
- You can use “custom resources” for resources that are not supported
|
||||
|
||||
### CloudFormation Stack Designer
|
||||
|
||||
* Example: WordPress CloudFormation Stack
|
||||
* We can see all the resources
|
||||
* We can see the relations between the components
|
||||
- Example: WordPress CloudFormation Stack
|
||||
- We can see all the resources
|
||||
- We can see the relations between the components
|
||||
|
||||
## AWS Cloud Development Kit (CDK)
|
||||
|
||||
* Define your cloud infrastructure using a familiar language:
|
||||
* JavaScript/TypeScript, Python, Java, and .NET
|
||||
* The code is “compiled” into a CloudFormation template (JSON/YAML)
|
||||
* You can therefore deploy infrastructure and application runtime code together
|
||||
* Great for Lambda functions
|
||||
* Great for Docker containers in ECS / EKS
|
||||
- Define your cloud infrastructure using a familiar language:
|
||||
- JavaScript/TypeScript, Python, Java, and .NET
|
||||
- The code is “compiled” into a CloudFormation template (JSON/YAML)
|
||||
- You can therefore deploy infrastructure and application runtime code together
|
||||
- Great for Lambda functions
|
||||
- Great for Docker containers in ECS / EKS
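
The idea of code being "compiled" into a template can be sketched without the CDK itself. The example below is plain Python that emits a CloudFormation-style JSON document. It deliberately does not use the CDK library, and the helper names (`bucket`, `synthesize`) and bucket names are hypothetical. It only shows why a general-purpose language helps: a loop generates one resource per environment.

```python
import json


def bucket(name):
    """Describe one S3 bucket as a CloudFormation resource."""
    return {"Type": "AWS::S3::Bucket", "Properties": {"BucketName": name}}


def synthesize(environments):
    """Turn a list of environment names into a JSON template string."""
    resources = {
        f"{env.capitalize()}Bucket": bucket(f"example-{env}-assets")
        for env in environments
    }
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": resources,
    }
    return json.dumps(template, indent=2)


if __name__ == "__main__":
    print(synthesize(["dev", "prod"]))
```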
|
||||
|
||||
## Developer problems on AWS
|
||||
|
||||
* Managing infrastructure
|
||||
* Deploying Code
|
||||
* Configuring all the databases, load balancers, etc
|
||||
* Scaling concerns
|
||||
* Most web apps have the same architecture (ALB + ASG)
|
||||
* All the developers want is for their code to run!
|
||||
* Possibly, consistently across different applications and environments
|
||||
- Managing infrastructure
|
||||
- Deploying Code
|
||||
- Configuring all the databases, load balancers, etc
|
||||
- Scaling concerns
|
||||
- Most web apps have the same architecture (ALB + ASG)
|
||||
- All the developers want is for their code to run!
|
||||
- Possibly, consistently across different applications and environments
|
||||
|
||||
## AWS Elastic Beanstalk Overview
|
||||
|
||||
* Elastic Beanstalk is a developer-centric view of deploying an application on AWS
|
||||
* It uses all the components we’ve seen before: EC2, ASG, ELB, RDS, etc…
|
||||
* But it’s all in one view that’s easy to make sense of!
|
||||
* We still have full control over the configuration
|
||||
* Beanstalk = Platform as a Service (PaaS)
|
||||
* Beanstalk is free but you pay for the underlying instances
|
||||
* Managed service
|
||||
* Instance configuration / OS is handled by Beanstalk
|
||||
* Deployment strategy is configurable but performed by Elastic Beanstalk
|
||||
* Capacity provisioning
|
||||
* Load balancing & auto-scaling
|
||||
* Application health-monitoring & responsiveness
|
||||
* Just the application code is the responsibility of the developer
|
||||
* Three architecture models:
|
||||
* Single Instance deployment: good for dev
|
||||
* LB + ASG: great for production or pre-production web applications
|
||||
* ASG only: great for non-web apps in production (workers, etc..)
|
||||
- Elastic Beanstalk is a developer-centric view of deploying an application on AWS
|
||||
- It uses all the components we’ve seen before: EC2, ASG, ELB, RDS, etc…
|
||||
- But it’s all in one view that’s easy to make sense of!
|
||||
- We still have full control over the configuration
|
||||
- Beanstalk = Platform as a Service (PaaS)
|
||||
- Beanstalk is free but you pay for the underlying instances
|
||||
- Managed service
|
||||
- Instance configuration / OS is handled by Beanstalk
|
||||
- Deployment strategy is configurable but performed by Elastic Beanstalk
|
||||
- Capacity provisioning
|
||||
- Load balancing & auto-scaling
|
||||
- Application health-monitoring & responsiveness
|
||||
- Just the application code is the responsibility of the developer
|
||||
- Three architecture models:
|
||||
- Single Instance deployment: good for dev
|
||||
- LB + ASG: great for production or pre-production web applications
|
||||
- ASG only: great for non-web apps in production (workers, etc..)
|
||||
|
||||
* Support for many platforms:
|
||||
* Go
|
||||
* Java SE
|
||||
* Java with Tomcat
|
||||
* .NET on Windows Server with IIS
|
||||
* Node.js
|
||||
* PHP
|
||||
* Python
|
||||
* Ruby
|
||||
* Packer Builder
|
||||
* Single Container Docker
|
||||
* Multi-Container Docker
|
||||
* Preconfigured Docker
|
||||
- Support for many platforms:
|
||||
- Go
|
||||
- Java SE
|
||||
- Java with Tomcat
|
||||
- .NET on Windows Server with IIS
|
||||
- Node.js
|
||||
- PHP
|
||||
- Python
|
||||
- Ruby
|
||||
- Packer Builder
|
||||
- Single Container Docker
|
||||
- Multi-Container Docker
|
||||
- Preconfigured Docker
|
||||
|
||||
### Elastic Beanstalk – Health Monitoring
|
||||
### Elastic Beanstalk - Health Monitoring
|
||||
|
||||
* Health agent pushes metrics to CloudWatch
|
||||
* Checks for app health, publishes health events
|
||||
- Health agent pushes metrics to CloudWatch
|
||||
- Checks for app health, publishes health events
|
||||
|
||||
## AWS CodeDeploy
|
||||
|
||||
* We want to deploy our application automatically
|
||||
* Works with EC2 Instances
|
||||
* Works with On-Premises Servers
|
||||
* Hybrid service
|
||||
* Servers / Instances must be provisioned and configured ahead of time with the CodeDeploy Agent
|
||||
- We want to deploy our application automatically
|
||||
- Works with EC2 Instances
|
||||
- Works with On-Premises Servers
|
||||
- Hybrid service
|
||||
- Servers / Instances must be provisioned and configured ahead of time with the CodeDeploy Agent
|
||||
|
||||
## AWS CodeCommit
|
||||
|
||||
* Before pushing the application code to servers, it needs to be stored somewhere
|
||||
* Developers usually store code in a repository, using the Git technology
|
||||
* A famous public offering is GitHub, AWS’ competing product is CodeCommit
|
||||
* CodeCommit:
|
||||
* Source-control service that hosts Git-based repositories
|
||||
* Makes it easy to collaborate with others on code
|
||||
* The code changes are automatically versioned
|
||||
* Benefits:
|
||||
* Fully managed
|
||||
* Scalable & highly available
|
||||
* Private, Secured, Integrated with AWS
|
||||
- Before pushing the application code to servers, it needs to be stored somewhere
|
||||
- Developers usually store code in a repository, using the Git technology
|
||||
- A famous public offering is GitHub, AWS’ competing product is CodeCommit
|
||||
- CodeCommit:
|
||||
- Source-control service that hosts Git-based repositories
|
||||
- Makes it easy to collaborate with others on code
|
||||
- The code changes are automatically versioned
|
||||
- Benefits:
|
||||
- Fully managed
|
||||
- Scalable & highly available
|
||||
- Private, Secured, Integrated with AWS
|
||||
|
||||
## AWS CodeBuild
|
||||
|
||||
* Code building service in the cloud (name is obvious)
|
||||
* Compiles source code, runs tests, and produces packages that are ready to be deployed (by CodeDeploy for example)
|
||||
* Benefits:
|
||||
* Fully managed, serverless
|
||||
* Continuously scalable & highly available
|
||||
* Secure
|
||||
* Pay-as-you-go pricing – only pay for the build time
|
||||
- Code building service in the cloud (name is obvious)
|
||||
- Compiles source code, runs tests, and produces packages that are ready to be deployed (by CodeDeploy for example)
|
||||
- Benefits:
|
||||
- Fully managed, serverless
|
||||
- Continuously scalable & highly available
|
||||
- Secure
|
||||
- Pay-as-you-go pricing – only pay for the build time
|
||||
|
||||
## AWS CodePipeline
|
||||
|
||||
* Orchestrate the different steps to have the code automatically pushed to production
|
||||
* Code => Build => Test => Provision => Deploy
|
||||
* Basis for CICD (Continuous Integration & Continuous Delivery)
|
||||
* Benefits:
|
||||
* Fully managed, compatible with CodeCommit, CodeBuild, CodeDeploy, Elastic Beanstalk, CloudFormation, GitHub, other 3rd-party services & custom plugins
|
||||
* Fast delivery & rapid updates
|
||||
- Orchestrate the different steps to have the code automatically pushed to production
|
||||
- Code => Build => Test => Provision => Deploy
|
||||
- Basis for CICD (Continuous Integration & Continuous Delivery)
|
||||
- Benefits:
|
||||
- Fully managed, compatible with CodeCommit, CodeBuild, CodeDeploy, Elastic Beanstalk, CloudFormation, GitHub, other 3rd-party services & custom plugins
|
||||
- Fast delivery & rapid updates
|
||||
|
||||
* CodePipeline: orchestration layer
|
||||
* CodeCommit => CodeBuild => CodeDeploy => Elastic Beanstalk
|
||||
- CodePipeline: orchestration layer
|
||||
- CodeCommit => CodeBuild => CodeDeploy => Elastic Beanstalk
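
The stage ordering above (Code => Build => Test => Provision => Deploy) can be sketched as a small sequential runner. This is only an illustration of the orchestration idea, not the CodePipeline API; `run_pipeline` and the stage callables are hypothetical names.

```python
from typing import Callable, List, Tuple

# A stage is a name plus an action that reports success (True) or failure (False).
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails."""
    completed = []
    for name, action in stages:
        if not action():
            break  # later stages never run, so nothing broken reaches production
        completed.append(name)
    return completed


stages = [
    ("Code", lambda: True),
    ("Build", lambda: True),
    ("Test", lambda: False),  # simulate a failing test suite
    ("Provision", lambda: True),
    ("Deploy", lambda: True),
]
completed = run_pipeline(stages)
print(completed)
```

Because the simulated Test stage fails, Provision and Deploy are skipped, which is the main safety property an orchestration layer provides.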
|
||||
|
||||
## AWS CodeArtifact
|
||||
|
||||
* Software packages depend on each other to be built (also called code dependencies), and new ones are created
|
||||
* Storing and retrieving these dependencies is called artifact management
|
||||
* Traditionally you need to set up your own artifact management system
|
||||
* CodeArtifact is a secure, scalable, and cost-effective artifact management service for software development
|
||||
* Works with common dependency management tools such as Maven, Gradle, npm, yarn, twine, pip, and NuGet
|
||||
* Developers and CodeBuild can then retrieve dependencies straight from CodeArtifact
|
||||
- Software packages depend on each other to be built (also called code dependencies), and new ones are created
|
||||
- Storing and retrieving these dependencies is called artifact management
|
||||
- Traditionally you need to set up your own artifact management system
|
||||
- CodeArtifact is a secure, scalable, and cost-effective artifact management service for software development
|
||||
- Works with common dependency management tools such as Maven, Gradle, npm, yarn, twine, pip, and NuGet
|
||||
- Developers and CodeBuild can then retrieve dependencies straight from CodeArtifact
|
||||
|
||||
## AWS CodeStar
|
||||
|
||||
* Unified UI to easily manage software development activities in one place
|
||||
* “Quick way” to get started with a correct setup of CodeCommit, CodePipeline, CodeBuild, CodeDeploy, Elastic Beanstalk, EC2, etc…
|
||||
* Can edit the code “in-the-cloud” using AWS Cloud9
|
||||
- Unified UI to easily manage software development activities in one place
|
||||
- “Quick way” to get started with a correct setup of CodeCommit, CodePipeline, CodeBuild, CodeDeploy, Elastic Beanstalk, EC2, etc…
|
||||
- Can edit the code “in-the-cloud” using AWS Cloud9
|
||||
|
||||
## AWS Cloud9
|
||||
|
||||
* AWS Cloud9 is a cloud IDE (Integrated Development Environment) for writing, running and debugging code
|
||||
* “Classic” IDEs (like IntelliJ, Visual Studio Code…) are downloaded and installed on a computer before being used
|
||||
* A cloud IDE can be used within a web browser, meaning you can work on your projects from your office, home, or anywhere with internet with no setup necessary
|
||||
* AWS Cloud9 also allows for code collaboration in real-time (pair programming)
|
||||
- AWS Cloud9 is a cloud IDE (Integrated Development Environment) for writing, running and debugging code
|
||||
- “Classic” IDEs (like IntelliJ, Visual Studio Code…) are downloaded and installed on a computer before being used
|
||||
- A cloud IDE can be used within a web browser, meaning you can work on your projects from your office, home, or anywhere with internet with no setup necessary
|
||||
- AWS Cloud9 also allows for code collaboration in real-time (pair programming)
|
||||
|
||||
## AWS Systems Manager (SSM)
|
||||
|
||||
* Helps you manage your EC2 and On-Premises systems at scale
|
||||
* Another Hybrid AWS service
|
||||
* Get operational insights about the state of your infrastructure
|
||||
* Suite of 10+ products
|
||||
* Most important features are:
|
||||
* Patching automation for enhanced compliance
|
||||
* Run commands across an entire fleet of servers
|
||||
* Store parameter configuration with the SSM Parameter Store
|
||||
* Works for both Windows and Linux OS
|
||||
- Helps you manage your EC2 and On-Premises systems at scale
|
||||
- Another Hybrid AWS service
|
||||
- Get operational insights about the state of your infrastructure
|
||||
- Suite of 10+ products
|
||||
- Most important features are:
|
||||
- Patching automation for enhanced compliance
|
||||
- Run commands across an entire fleet of servers
|
||||
- Store parameter configuration with the SSM Parameter Store
|
||||
- Works for both Windows and Linux OS
|
||||
|
||||
### How Systems Manager works
|
||||
|
||||
* We need to install the SSM agent onto the systems we control
|
||||
* Installed by default on Amazon Linux AMIs & some Ubuntu AMIs
|
||||
* If an instance can’t be controlled with SSM, it’s probably an issue with the SSM agent!
|
||||
* Thanks to the SSM agent, we can run commands, patch & configure our servers
|
||||
- We need to install the SSM agent onto the systems we control
|
||||
- Installed by default on Amazon Linux AMIs & some Ubuntu AMIs
|
||||
- If an instance can’t be controlled with SSM, it’s probably an issue with the SSM agent!
|
||||
- Thanks to the SSM agent, we can run commands, patch & configure our servers
|
||||
|
||||
### Systems Manager – SSM Session Manager
|
||||
### Systems Manager - SSM Session Manager
|
||||
|
||||
* Allows you to start a secure shell on your EC2 and on-premises servers
|
||||
* No SSH access, bastion hosts, or SSH keys needed
|
||||
* No port 22 needed (better security)
|
||||
* Supports Linux, macOS, and Windows
|
||||
* Send session log data to S3 or CloudWatch Logs
|
||||
- Allows you to start a secure shell on your EC2 and on-premises servers
|
||||
- No SSH access, bastion hosts, or SSH keys needed
|
||||
- No port 22 needed (better security)
|
||||
- Supports Linux, macOS, and Windows
|
||||
- Send session log data to S3 or CloudWatch Logs
|
||||
|
||||
## AWS OpsWorks
|
||||
|
||||
* Chef & Puppet help you automate server configuration and repetitive actions
|
||||
* They work great with EC2 & On-Premises VMs
|
||||
* AWS OpsWorks = Managed Chef & Puppet
|
||||
* It’s an alternative to AWS SSM
|
||||
* Only provision standard AWS resources:
|
||||
* EC2 Instances, Databases, Load Balancers, EBS volumes…
|
||||
* **Chef or Puppet needed => AWS OpsWorks**
|
||||
- Chef & Puppet help you automate server configuration and repetitive actions
|
||||
- They work great with EC2 & On-Premises VMs
|
||||
- AWS OpsWorks = Managed Chef & Puppet
|
||||
- It’s an alternative to AWS SSM
|
||||
- Only provision standard AWS resources:
|
||||
- EC2 Instances, Databases, Load Balancers, EBS volumes…
|
||||
- **Chef or Puppet needed => AWS OpsWorks**
|
||||
|
||||
## Deployment - Summary
|
||||
|
||||
* CloudFormation: (AWS only)
|
||||
* Infrastructure as Code, works with almost all AWS resources
|
||||
* Repeat across Regions & Accounts
|
||||
* Beanstalk: (AWS only)
|
||||
* Platform as a Service (PaaS), limited to certain programming languages or Docker
|
||||
* Deploy code consistently with a known architecture: ex, ALB + EC2 + RDS
|
||||
* CodeDeploy (hybrid): deploy & upgrade any application onto servers
|
||||
* Systems Manager (hybrid): patch, configure and run commands at scale
|
||||
* OpsWorks (hybrid): managed Chef and Puppet in AWS
|
||||
- CloudFormation: (AWS only)
|
||||
- Infrastructure as Code, works with almost all AWS resources
|
||||
- Repeat across Regions & Accounts
|
||||
- Beanstalk: (AWS only)
|
||||
- Platform as a Service (PaaS), limited to certain programming languages or Docker
|
||||
- Deploy code consistently with a known architecture: ex, ALB + EC2 + RDS
|
||||
- CodeDeploy (hybrid): deploy & upgrade any application onto servers
|
||||
- Systems Manager (hybrid): patch, configure and run commands at scale
|
||||
- OpsWorks (hybrid): managed Chef and Puppet in AWS
|
||||
|
||||
## Developer Services - Summary
|
||||
|
||||
* CodeCommit: Store code in private git repository (version controlled)
|
||||
* CodeBuild: Build & test code in AWS
|
||||
* CodeDeploy: Deploy code onto servers
|
||||
* CodePipeline: Orchestration of pipeline (from code to build to deploy)
|
||||
* CodeArtifact: Store software packages / dependencies on AWS
|
||||
* CodeStar: Unified view for allowing developers to do CICD and code
|
||||
* Cloud9: Cloud IDE (Integrated Development Environment) with collab
|
||||
* AWS CDK: Define your cloud infrastructure using a programming language
|
||||
- CodeCommit: Store code in private git repository (version controlled)
|
||||
- CodeBuild: Build & test code in AWS
|
||||
- CodeDeploy: Deploy code onto servers
|
||||
- CodePipeline: Orchestration of pipeline (from code to build to deploy)
|
||||
- CodeArtifact: Store software packages / dependencies on AWS
|
||||
- CodeStar: Unified view for allowing developers to do CICD and code
|
||||
- Cloud9: Cloud IDE (Integrated Development Environment) with collab
|
||||
- AWS CDK: Define your cloud infrastructure using a programming language
|
||||
|
||||
@@ -1,136 +1,154 @@
|
||||
# EC2 Instance Storage
|
||||
|
||||
* [EBS volumes](#ebs-volume)
|
||||
* [EFS: network file system, can be attached to 100s of instances in a region](#efs-elastic-file-system)
|
||||
* [EFS-IA: cost-optimized storage class for infrequently accessed files](#efs-infrequent-access-efs-ia)
|
||||
* [FSx for Windows: Network File System for Windows servers](#amazon-fsx-for-windows-file-server)
|
||||
* [FSx for Lustre: High Performance Computing Linux file system](#amazon-fsx-for-lustre)
|
||||
- [EC2 Instance Storage](#ec2-instance-storage)
|
||||
- [EBS Volumes](#ebs-volumes)
|
||||
- [What’s an EBS Volume?](#whats-an-ebs-volume)
|
||||
- [EBS Volume](#ebs-volume)
|
||||
- [EBS – Delete on Termination attribute](#ebs--delete-on-termination-attribute)
|
||||
- [EBS Snapshots](#ebs-snapshots)
|
||||
- [EBS Snapshots Features](#ebs-snapshots-features)
|
||||
- [EFS: Elastic File System](#efs-elastic-file-system)
|
||||
- [EFS Infrequent Access (EFS-IA)](#efs-infrequent-access-efs-ia)
|
||||
- [Amazon FSx – Overview](#amazon-fsx--overview)
|
||||
- [Amazon FSx for Windows File Server](#amazon-fsx-for-windows-file-server)
|
||||
- [Amazon FSx for Lustre](#amazon-fsx-for-lustre)
|
||||
- [EC2 Instance Store](#ec2-instance-store)
|
||||
- [Shared Responsibility Model for EC2 Storage](#shared-responsibility-model-for-ec2-storage)
|
||||
- [AMI Overview](#ami-overview)
|
||||
- [AMI Process (from an EC2 instance)](#ami-process-from-an-ec2-instance)
|
||||
- [EC2 Image Builder](#ec2-image-builder)
|
||||
|
||||
- EBS: Elastic Block Store; an EBS volume is a network drive you can attach to your instances while they run
|
||||
- EFS: network file system, can be attached to 100s of instances in a region
|
||||
- EFS-IA: cost-optimized storage class for infrequently accessed files
|
||||
- FSx for Windows: Network File System for Windows servers
|
||||
- FSx for Lustre: High Performance Computing Linux file system
|
||||
|
||||
## EBS Volumes
|
||||
|
||||
### What’s an EBS Volume?
|
||||
|
||||
* An EBS (Elastic Block Store) Volume is a network drive you can attach to your instances while they run
|
||||
* It allows your instances to persist data, even after their termination
|
||||
* They can only be mounted to one instance at a time (at the CCP level)
|
||||
* They are bound to a specific availability zone
|
||||
* Analogy: Think of them as a “network USB stick”
|
||||
* Free tier: 30 GB of free EBS storage of type General Purpose (SSD) or Magnetic per month
|
||||
- An EBS (Elastic Block Store) Volume is a network drive you can attach to your instances while they run
|
||||
- It allows your instances to persist data, even after their termination
|
||||
- They can only be mounted to one instance at a time (at the CCP level)
|
||||
- They are bound to a specific availability zone
|
||||
- Analogy: Think of them as a “network USB stick”
|
||||
- Free tier: 30 GB of free EBS storage of type General Purpose (SSD) or Magnetic per month
|
||||
|
||||
### EBS Volume
|
||||
|
||||
* It’s a network drive (i.e. not a physical drive)
|
||||
* It uses the network to communicate with the instance, which means there might be a bit of latency
|
||||
* It can be detached from an EC2 instance and attached to another one quickly
|
||||
* It’s locked to an Availability Zone (AZ)
|
||||
* An EBS Volume in us-east-1a cannot be attached to us-east-1b
|
||||
* To move a volume across AZs, you first need to snapshot it
|
||||
* It has a provisioned capacity (size in GB, and IOPS)
|
||||
* You get billed for all the provisioned capacity
|
||||
* You can increase the capacity of the drive over time
|
||||
- It’s a network drive (i.e. not a physical drive)
|
||||
- It uses the network to communicate with the instance, which means there might be a bit of latency
|
||||
- It can be detached from an EC2 instance and attached to another one quickly
|
||||
- It’s locked to an Availability Zone (AZ)
|
||||
- An EBS Volume in us-east-1a cannot be attached to us-east-1b
|
||||
- To move a volume across AZs, you first need to snapshot it
|
||||
- It has a provisioned capacity (size in GB, and IOPS)
|
||||
- You get billed for all the provisioned capacity
|
||||
- You can increase the capacity of the drive over time
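
The AZ lock and the snapshot-based move described above can be modelled with a small sketch. This is a toy model, not the EC2 API: the classes and the `can_attach`, `snapshot`, and `restore` helpers are hypothetical names, as are the volume IDs.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    volume_id: str
    az: str
    size_gb: int


@dataclass
class Snapshot:
    source_volume_id: str
    size_gb: int  # a snapshot is not tied to a single AZ


def can_attach(volume: Volume, instance_az: str) -> bool:
    """A volume can only be attached to an instance in the same AZ."""
    return volume.az == instance_az


def snapshot(volume: Volume) -> Snapshot:
    """Take a point-in-time copy of the volume."""
    return Snapshot(source_volume_id=volume.volume_id, size_gb=volume.size_gb)


def restore(snap: Snapshot, target_az: str, new_volume_id: str) -> Volume:
    """Create a new volume from the snapshot in any AZ."""
    return Volume(volume_id=new_volume_id, az=target_az, size_gb=snap.size_gb)


data_volume = Volume("vol-a", "us-east-1a", 100)
print(can_attach(data_volume, "us-east-1b"))  # the original volume stays in its AZ

copy = restore(snapshot(data_volume), "us-east-1b", "vol-b")
print(can_attach(copy, "us-east-1b"))
```

The original volume never moves; the "move" is really a new volume created from the snapshot in the target AZ.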
|
||||
|
||||
### EBS – Delete on Termination attribute
|
||||
|
||||
* Controls the EBS behaviour when an EC2 instance terminates
|
||||
* By default, the root EBS volume is deleted (attribute enabled)
|
||||
* By default, any other attached EBS volume is not deleted (attribute disabled)
|
||||
* This can be controlled by the AWS console / AWS CLI
|
||||
* Use case: preserve root volume when instance is terminated
|
||||
- Controls the EBS behaviour when an EC2 instance terminates
|
||||
- By default, the root EBS volume is deleted (attribute enabled)
|
||||
- By default, any other attached EBS volume is not deleted (attribute disabled)
|
||||
- This can be controlled by the AWS console / AWS CLI
|
||||
- Use case: preserve root volume when instance is terminated
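
As a rough illustration of the "preserve root volume" use case, the sketch below builds a block device mapping with the attribute disabled. The `DeviceName` / `Ebs` / `DeleteOnTermination` field names follow the EC2 block device mapping structure; the helper `block_device_mapping` is hypothetical, and `/dev/xvda` is an assumed root device name that depends on the AMI.

```python
import json


def block_device_mapping(device_name: str, delete_on_termination: bool) -> dict:
    """Describe one attached EBS volume and its Delete on Termination setting."""
    return {
        "DeviceName": device_name,
        "Ebs": {"DeleteOnTermination": delete_on_termination},
    }


# Assumption: the root volume is exposed as /dev/xvda on this AMI.
mappings = [block_device_mapping("/dev/xvda", delete_on_termination=False)]
mappings_json = json.dumps(mappings)
print(mappings_json)
```

A JSON document of this shape is what you would supply as the block device mappings when launching or modifying an instance through the CLI or an SDK; check the exact parameter name in the tool you use.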
|
||||
|
||||
### EBS Snapshots
|
||||
|
||||
* Make a backup (snapshot) of your EBS volume at a point in time
|
||||
* Not necessary to detach volume to do snapshot, but recommended
|
||||
* Can copy snapshots across AZs or Regions
|
||||
- Make a backup (snapshot) of your EBS volume at a point in time
|
||||
- Not necessary to detach volume to do snapshot, but recommended
|
||||
- Can copy snapshots across AZs or Regions
|
||||
|
||||
### EBS Snapshots Features
|
||||
|
||||
* EBS Snapshot Archive
|
||||
* Move a Snapshot to an “archive tier” that is 75% cheaper
|
||||
* Restoring from the archive takes 24 to 72 hours
|
||||
* Recycle Bin for EBS Snapshots
|
||||
* Set up rules to retain deleted snapshots so you can recover them after an accidental deletion
|
||||
* Specify retention (from 1 day to 1 year)
|
||||
- EBS Snapshot Archive
|
||||
- Move a Snapshot to an “archive tier” that is 75% cheaper
|
||||
- Restoring from the archive takes 24 to 72 hours
|
||||
- Recycle Bin for EBS Snapshots
|
||||
- Set up rules to retain deleted snapshots so you can recover them after an accidental deletion
|
||||
- Specify retention (from 1 day to 1 year)
|
||||
|
||||
## EFS: Elastic File System
|
||||
|
||||
* Managed NFS (network file system) that can be mounted on 100s of EC2 instances
|
||||
* EFS works with Linux EC2 instances in multi-AZ
|
||||
* Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning
|
||||
- Managed NFS (network file system) that can be mounted on 100s of EC2 instances
|
||||
- EFS works with Linux EC2 instances in multi-AZ
|
||||
- Highly available, scalable, expensive (3x gp2), pay per use, no capacity planning
|
||||
|
||||
## EFS Infrequent Access (EFS-IA)
|
||||
|
||||
* Storage class that is cost-optimized for files not accessed every day
|
||||
* Up to 92% lower cost compared to EFS Standard
|
||||
* EFS will automatically move your files to EFS-IA based on the last time they were accessed
|
||||
* Enable EFS-IA with a Lifecycle Policy
|
||||
* Example: move files that are not accessed for 60 days to EFS-IA
|
||||
* Transparent to the applications accessing EFS
|
||||
- Storage class that is cost-optimized for files not accessed every day
|
||||
- Up to 92% lower cost compared to EFS Standard
|
||||
- EFS will automatically move your files to EFS-IA based on the last time they were accessed
|
||||
- Enable EFS-IA with a Lifecycle Policy
|
||||
- Example: move files that are not accessed for 60 days to EFS-IA
|
||||
- Transparent to the applications accessing EFS
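
The 60-day lifecycle example above boils down to comparing each file's last access time with a cutoff. The sketch below simulates that rule locally; it is not how EFS is configured (the real policy is set on the file system), and `files_to_move_to_ia` plus the sample paths and dates are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def files_to_move_to_ia(
    last_access: Dict[str, datetime], now: datetime, days: int = 60
) -> List[str]:
    """Return the files whose last access is at least `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return sorted(path for path, accessed in last_access.items() if accessed <= cutoff)


now = datetime(2024, 6, 1)
last_access = {
    "reports/2023-q4.pdf": datetime(2024, 1, 15),  # idle for months
    "logs/app.log": datetime(2024, 5, 30),         # accessed two days ago
    "media/intro.mp4": datetime(2024, 4, 20),      # idle for about six weeks
}
to_move = files_to_move_to_ia(last_access, now)
print(to_move)
```

Only the file that has been idle for more than 60 days is selected; applications keep reading the same paths either way, which is what "transparent" means here.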
|
||||
|
||||
## Amazon FSx – Overview
|
||||
|
||||
* Launch 3rd party high-performance file systems on AWS
|
||||
* Fully managed service
|
||||
* FSx for Lustre
|
||||
* FSx for Windows File Server
|
||||
* FSx for NetApp ONTAP
|
||||
- Launch 3rd party high-performance file systems on AWS
|
||||
- Fully managed service
|
||||
- FSx for Lustre
|
||||
- FSx for Windows File Server
|
||||
- FSx for NetApp ONTAP
|
||||
|
||||
### Amazon FSx for Windows File Server
|
||||
|
||||
* A fully managed, highly reliable, and scalable Windows native shared file system
|
||||
* Built on Windows File Server
|
||||
* Supports SMB protocol & Windows NTFS
|
||||
* Integrated with Microsoft Active Directory
|
||||
* Can be accessed from AWS or your on-premises infrastructure
|
||||
- A fully managed, highly reliable, and scalable Windows native shared file system
|
||||
- Built on Windows File Server
|
||||
- Supports SMB protocol & Windows NTFS
|
||||
- Integrated with Microsoft Active Directory
|
||||
- Can be accessed from AWS or your on-premises infrastructure
|
||||
|
||||
### Amazon FSx for Lustre
|
||||
|
||||
* A fully managed, high-performance, scalable file storage for High Performance Computing (HPC)
|
||||
* The name Lustre is derived from “Linux” and “cluster”
|
||||
* Machine Learning, Analytics, Video Processing, Financial Modeling
|
||||
* Scales up to 100s of GB/s, millions of IOPS, sub-ms latencies
|
||||
- A fully managed, high-performance, scalable file storage for High Performance Computing (HPC)
|
||||
- The name Lustre is derived from “Linux” and “cluster”
|
||||
- Machine Learning, Analytics, Video Processing, Financial Modeling
|
||||
- Scales up to 100s of GB/s, millions of IOPS, sub-ms latencies
|
||||
|
||||
## EC2 Instance Store
|
||||
|
||||
* EBS volumes are network drives with good but “limited” performance
|
||||
* If you need a high-performance hardware disk, use EC2 Instance Store
|
||||
* Better I/O performance
|
||||
* EC2 Instance Store volumes lose their data when the instance is stopped (ephemeral)
|
||||
* Good for buffer / cache / scratch data / temporary content
|
||||
* Risk of data loss if hardware fails
|
||||
* Backups and Replication are your responsibility
|
||||
- EBS volumes are network drives with good but “limited” performance
|
||||
- If you need a high-performance hardware disk, use EC2 Instance Store
|
||||
- Better I/O performance
|
||||
- EC2 Instance Store volumes lose their data when the instance is stopped (ephemeral)
|
||||
- Good for buffer / cache / scratch data / temporary content
|
||||
- Risk of data loss if hardware fails
|
||||
- Backups and Replication are your responsibility
|
||||
|
||||
## Shared Responsibility Model for EC2 Storage
|
||||
|
||||
AWS | USER
---- | ----
Infrastructure | Setting up backup / snapshot procedures
Replication for data for EBS volumes & EFS drives | Setting up data encryption
Replacing faulty hardware | Responsibility of any data on the drives
Ensuring their employees cannot access your data | Understanding the risk of using EC2 Instance Store
|
||||
| AWS                                               | USER                                               |
| ------------------------------------------------- | -------------------------------------------------- |
| Infrastructure                                    | Setting up backup / snapshot procedures            |
| Replication for data for EBS volumes & EFS drives | Setting up data encryption                         |
| Replacing faulty hardware                         | Responsibility of any data on the drives           |
| Ensuring their employees cannot access your data  | Understanding the risk of using EC2 Instance Store |
|
||||
|
||||
## AMI Overview
|
||||
|
||||
* AMI = Amazon Machine Image
|
||||
* An AMI is a customization of an EC2 instance
|
||||
* You add your own software, configuration, operating system, monitoring…
|
||||
* Faster boot / configuration time because all your software is pre-packaged
|
||||
* AMIs are built for a specific region (and can be copied across regions)
|
||||
* You can launch EC2 instances from:
|
||||
* A Public AMI: AWS provided
|
||||
* Your own AMI: you make and maintain them yourself
|
||||
* An AWS Marketplace AMI: an AMI someone else made (and potentially sells)
|
||||
- AMI = Amazon Machine Image
|
||||
- An AMI is a customization of an EC2 instance
|
||||
- You add your own software, configuration, operating system, monitoring…
|
||||
- Faster boot / configuration time because all your software is pre-packaged
|
||||
- AMIs are built for a specific region (and can be copied across regions)
|
||||
- You can launch EC2 instances from:
|
||||
- A Public AMI: AWS provided
|
||||
- Your own AMI: you make and maintain them yourself
|
||||
- An AWS Marketplace AMI: an AMI someone else made (and potentially sells)
|
||||
|
||||
### AMI Process (from an EC2 instance)
|
||||
|
||||
* Start an EC2 instance and customize it
|
||||
* Stop the instance (for data integrity)
|
||||
* Build an AMI – this will also create EBS snapshots
|
||||
* Launch instances from other AMIs
|
||||
- Start an EC2 instance and customize it
|
||||
- Stop the instance (for data integrity)
|
||||
- Build an AMI – this will also create EBS snapshots
|
||||
- Launch instances from other AMIs
|
||||
|
||||
## EC2 Image Builder
|
||||
|
||||
* Used to automate the creation of Virtual Machines or container images
|
||||
* => Automate the creation, maintenance, validation and testing of EC2 AMIs
|
||||
* Can be run on a schedule (weekly, whenever packages are updated, etc…)
|
||||
* Free service (only pay for the underlying resources)
|
||||
- Used to automate the creation of Virtual Machines or container images
|
||||
- => Automate the creation, maintenance, validation and testing of EC2 AMIs
|
||||
- Can be run on a schedule (weekly, whenever packages are updated, etc…)
|
||||
- Free service (only pay for the underlying resources)
|
||||
|
||||
@@ -1,173 +1,192 @@
|
||||
# Other Compute
|
||||
|
||||
What is Docker?
|
||||
- [Other Compute](#other-compute)
|
||||
- [What is Docker?](#what-is-docker)
|
||||
- [Where Docker images are stored?](#where-docker-images-are-stored)
|
||||
- [Docker versus Virtual Machines](#docker-versus-virtual-machines)
|
||||
- [ECS](#ecs)
|
||||
- [Fargate](#fargate)
|
||||
- [ECR](#ecr)
|
||||
- [What’s serverless?](#whats-serverless)
|
||||
- [Why AWS Lambda ?](#why-aws-lambda-)
|
||||
- [Benefits of AWS Lambda](#benefits-of-aws-lambda)
|
||||
- [AWS Lambda language support](#aws-lambda-language-support)
|
||||
- [AWS Lambda Pricing: example](#aws-lambda-pricing-example)
|
||||
- [Amazon API Gateway](#amazon-api-gateway)
|
||||
- [AWS Batch](#aws-batch)
|
||||
- [Batch vs Lambda](#batch-vs-lambda)
|
||||
- [Amazon Lightsail](#amazon-lightsail)
|
||||
- [Lambda Summary](#lambda-summary)
|
||||
- [Other Compute Summary](#other-compute-summary)
|
||||
|
||||
* Docker is a software development platform to deploy apps
|
||||
* Apps are packaged in containers that can be run on any OS
|
||||
* Apps run the same, regardless of where they’re run
|
||||
* Any machine
|
||||
* No compatibility issues
|
||||
* Predictable behavior
|
||||
* Less work
|
||||
* Easier to maintain and deploy
|
||||
* Works with any language, any OS, any technology
|
||||
* Scale containers up and down very quickly (seconds)
|
||||
## What is Docker?
|
||||
|
||||
Where Docker images are stored?
|
||||
- Docker is a software development platform to deploy apps
|
||||
- Apps are packaged in containers that can be run on any OS
|
||||
- Apps run the same, regardless of where they’re run
|
||||
- Any machine
|
||||
- No compatibility issues
|
||||
- Predictable behavior
|
||||
- Less work
|
||||
- Easier to maintain and deploy
|
||||
- Works with any language, any OS, any technology
|
||||
- Scale containers up and down very quickly (seconds)
|
||||
|
||||
* Docker images are stored in Docker Repositories
|
||||
* Public: Docker Hub <https://hub.docker.com/>
|
||||
* Find base images for many technologies or OS:
|
||||
* Ubuntu
|
||||
* MySQL
|
||||
* NodeJS, Java…
|
||||
* Private: Amazon ECR (Elastic Container Registry)
|
||||
### Where Docker images are stored?
|
||||
|
||||
## Docker versus Virtual Machines
|
||||
- Docker images are stored in Docker Repositories
|
||||
- Public: Docker Hub <https://hub.docker.com/>
|
||||
- Find base images for many technologies or OS:
|
||||
- Ubuntu
|
||||
- MySQL
|
||||
- NodeJS, Java…
|
||||
- Private: Amazon ECR (Elastic Container Registry)
|
||||
|
||||
* Docker is “sort of” a virtualization technology, but not exactly
|
||||
* Resources are shared with the host => many containers on one server
|
||||
### Docker versus Virtual Machines
|
||||
|
||||
- Docker is “sort of” a virtualization technology, but not exactly
|
||||
- Resources are shared with the host => many containers on one server
|
||||
|
||||
## ECS
|
||||
|
||||
* ECS = Elastic Container Service
|
||||
* Launch Docker containers on AWS
|
||||
* You must provision & maintain the infrastructure (the EC2 instances)
|
||||
* AWS takes care of starting / stopping containers
|
||||
* Has integrations with the Application Load Balancer
|
||||
- ECS = Elastic Container Service
|
||||
- Launch Docker containers on AWS
|
||||
- You must provision & maintain the infrastructure (the EC2 instances)
|
||||
- AWS takes care of starting / stopping containers
|
||||
- Has integrations with the Application Load Balancer
|
||||
|
||||
## Fargate
|
||||
|
||||
* Launch Docker containers on AWS
|
||||
* You do not provision the infrastructure (no EC2 instances to manage) – simpler!
|
||||
* Serverless offering
|
||||
* AWS just runs containers for you based on the CPU / RAM you need
|
||||
- Launch Docker containers on AWS
|
||||
- You do not provision the infrastructure (no EC2 instances to manage) – simpler!
|
||||
- Serverless offering
|
||||
- AWS just runs containers for you based on the CPU / RAM you need
|
||||
|
||||
## ECR
|
||||
|
||||
* Elastic Container Registry
|
||||
* Private Docker Registry on AWS
|
||||
* This is where you store your Docker images so they can be run by ECS or Fargate
|
||||
- Elastic Container Registry
|
||||
- Private Docker Registry on AWS
|
||||
- This is where you store your Docker images so they can be run by ECS or Fargate
|
||||
|
||||
## What’s serverless?
|
||||
|
||||
* Serverless is a new paradigm in which the developers don’t have to manage servers anymore…
|
||||
* They just deploy code
|
||||
* They just deploy… functions !
|
||||
* Initially... Serverless == FaaS (Function as a Service)
|
||||
* Serverless was pioneered by AWS Lambda but now also includes anything that’s managed: “databases, messaging, storage, etc.”
|
||||
* Serverless does not mean there are no servers…
|
||||
* it means you just don’t manage / provision / see them
|
||||
- Serverless is a new paradigm in which the developers don’t have to manage servers anymore…
|
||||
- They just deploy code
|
||||
- They just deploy… functions !
|
||||
- Initially... Serverless == FaaS (Function as a Service)
|
||||
- Serverless was pioneered by AWS Lambda but now also includes anything that’s managed: “databases, messaging, storage, etc.”
|
||||
- Serverless does not mean there are no servers…
|
||||
- it means you just don’t manage / provision / see them
|
||||
|
||||
## Why AWS Lambda ?
|
||||
|
||||
EC2 | Lambda
|
||||
---- | ----
|
||||
Virtual Servers in the Cloud | Virtual functions – no servers to manage!
|
||||
Limited by RAM and CPU | Limited by time - short executions
|
||||
Continuously running | Run on-demand
|
||||
Scaling means intervention to add / remove servers | Scaling is automated!
|
||||
| EC2 | Lambda |
|
||||
| -------------------------------------------------- | ----------------------------------------- |
|
||||
| Virtual Servers in the Cloud | Virtual functions – no servers to manage! |
|
||||
| Limited by RAM and CPU | Limited by time - short executions |
|
||||
| Continuously running | Run on-demand |
|
||||
| Scaling means intervention to add / remove servers | Scaling is automated! |
|
||||
|
||||
## Benefits of AWS Lambda
|
||||
### Benefits of AWS Lambda
|
||||
|
||||
* Easy Pricing:
|
||||
* Pay per request and compute time
|
||||
* Free tier of 1,000,000 AWS Lambda requests and 400,000 GB-seconds of compute time per month
|
||||
* Integrated with the whole AWS suite of services
|
||||
* Event-Driven: functions get invoked by AWS when needed
|
||||
* Integrated with many programming languages
|
||||
* Easy monitoring through AWS CloudWatch
|
||||
* Easy to get more resources per function (up to 10GB of RAM!)
|
||||
* Increasing RAM will also improve CPU and network!
|
||||
- Easy Pricing:
|
||||
- Pay per request and compute time
|
||||
- Free tier of 1,000,000 AWS Lambda requests and 400,000 GB-seconds of compute time per month
|
||||
- Integrated with the whole AWS suite of services
|
||||
- Event-Driven: functions get invoked by AWS when needed
|
||||
- Integrated with many programming languages
|
||||
- Easy monitoring through AWS CloudWatch
|
||||
- Easy to get more resources per function (up to 10GB of RAM!)
|
||||
- Increasing RAM will also improve CPU and network!
|
||||
|
||||
## AWS Lambda language support
|
||||
### AWS Lambda language support
|
||||
|
||||
* Node.js (JavaScript)
|
||||
* Python
|
||||
* Java (Java 8 compatible)
|
||||
* C# (.NET Core)
|
||||
* Golang
|
||||
* C# / Powershell
|
||||
* Ruby
|
||||
* Custom Runtime API (community supported, example Rust)
|
||||
* Lambda Container Image
|
||||
* The container image must implement the Lambda Runtime API
|
||||
* ECS / Fargate is preferred for running arbitrary Docker images
|
||||
- Node.js (JavaScript)
|
||||
- Python
|
||||
- Java (Java 8 compatible)
|
||||
- C# (.NET Core)
|
||||
- Golang
|
||||
- C# / Powershell
|
||||
- Ruby
|
||||
- Custom Runtime API (community supported, example Rust)
|
||||
- Lambda Container Image
|
||||
- The container image must implement the Lambda Runtime API
|
||||
- ECS / Fargate is preferred for running arbitrary Docker images
|
||||
|
||||
## AWS Lambda Pricing: example
|
||||
### AWS Lambda Pricing: example
|
||||
|
||||
* You can find overall pricing information here: <https://aws.amazon.com/lambda/pricing/>
|
||||
* Pay per calls:
|
||||
* First 1,000,000 requests are free
|
||||
* $0.20 per 1 million requests thereafter ($0.0000002 per request)
|
||||
* Pay per duration: (in increment of 1 ms)
|
||||
* 400,000 GB-seconds of compute time per month for FREE
|
||||
* == 400,000 seconds if function is 1GB RAM
|
||||
* == 3,200,000 seconds if function is 128 MB RAM
|
||||
* After that, roughly $1.00 per 60,000 GB-seconds ($0.0000166667 per GB-second on x86; check the pricing page for your region)
|
||||
* It is usually **very cheap** to run AWS Lambda so it’s **very popular**
|
||||
- You can find overall pricing information here: <https://aws.amazon.com/lambda/pricing/>
|
||||
- Pay per calls:
|
||||
- First 1,000,000 requests are free
|
||||
- $0.20 per 1 million requests thereafter ($0.0000002 per request)
|
||||
- Pay per duration: (in increment of 1 ms)
|
||||
- 400,000 GB-seconds of compute time per month for FREE
|
||||
- == 400,000 seconds if function is 1GB RAM
|
||||
- == 3,200,000 seconds if function is 128 MB RAM
|
||||
- After that, roughly $1.00 per 60,000 GB-seconds ($0.0000166667 per GB-second on x86; check the pricing page for your region)
|
||||
- It is usually **very cheap** to run AWS Lambda so it’s **very popular**
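
The pricing rules above can be turned into a rough estimator. This is a minimal sketch, not an official calculator: the free-tier figures and the request price come from the notes, while `PRICE_PER_GB_SECOND` is an assumed x86 rate that varies by region and architecture, and the function names are hypothetical.

```python
FREE_REQUESTS = 1_000_000          # free requests per month (from the notes)
FREE_GB_SECONDS = 400_000          # free GB-seconds per month (from the notes)
PRICE_PER_MILLION_REQUESTS = 0.20  # USD per million requests (from the notes)
# Assumption: x86 rate in USD; confirm on the Lambda pricing page.
PRICE_PER_GB_SECOND = 0.0000166667


def free_tier_seconds(memory_mb: int) -> float:
    """Seconds of free compute per month for a function with this much RAM."""
    return FREE_GB_SECONDS / (memory_mb / 1024)


def monthly_cost(requests: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimated monthly cost in USD once the free tier is used up."""
    billable_requests = max(0, requests - FREE_REQUESTS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS

    # GB-seconds = invocations x seconds per invocation x GB of RAM
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    compute_cost = billable_gb_seconds * PRICE_PER_GB_SECOND

    return request_cost + compute_cost
```

For example, `monthly_cost(1_000_000, 100, 128)` stays entirely inside the free tier, which is why small workloads often cost nothing.
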
|
||||
|
||||
## Amazon API Gateway
|
||||
|
||||
* Example: building a serverless API
|
||||
* Fully managed service for developers to easily create, publish, maintain, monitor, and secure APIs
|
||||
* Serverless and scalable
|
||||
* Supports RESTful APIs and WebSocket APIs
|
||||
* Support for security, user authentication, API throttling, API keys, monitoring.
|
||||
- Example: building a serverless API
|
||||
- Fully managed service for developers to easily create, publish, maintain, monitor, and secure APIs
|
||||
- Serverless and scalable
|
||||
- Supports RESTful APIs and WebSocket APIs
|
||||
- Support for security, user authentication, API throttling, API keys, monitoring.
|
||||
|
||||
## AWS Batch
|
||||
|
||||
* Fully managed batch processing at any scale
|
||||
* Efficiently run 100,000s of computing batch jobs on AWS
|
||||
* A “batch” job is a job with a start and an end (as opposed to continuous)
|
||||
* Batch will dynamically launch EC2 instances or Spot Instances
|
||||
* AWS Batch provisions the right amount of compute / memory
|
||||
* You submit or schedule batch jobs and AWS Batch does the rest!
|
||||
* Batch jobs are defined as Docker images and run on ECS
|
||||
* Helpful for cost optimizations and focusing less on the infrastructure
|
||||
- Fully managed batch processing at any scale
|
||||
- Efficiently run 100,000s of computing batch jobs on AWS
|
||||
- A “batch” job is a job with a start and an end (as opposed to continuous)
|
||||
- Batch will dynamically launch EC2 instances or Spot Instances
|
||||
- AWS Batch provisions the right amount of compute / memory
|
||||
- You submit or schedule batch jobs and AWS Batch does the rest!
|
||||
- Batch jobs are defined as Docker images and run on ECS
|
||||
- Helpful for cost optimizations and focusing less on the infrastructure
|
||||
|
||||
## Batch vs Lambda
|
||||
|
||||
Batch | Lambda
|
||||
---- | ----
|
||||
No time limit | Time limit
|
||||
Any runtime as long as it’s packaged as a Docker image | Limited runtime
|
||||
Rely on EBS / instance store for disk space | Limited temporary disk space
|
||||
Relies on EC2 (can be managed by AWS) | Serverless
|
||||
| Batch | Lambda |
|
||||
| ------------------------------------------------------ | ---------------------------- |
|
||||
| No time limit | Time limit |
|
||||
| Any runtime as long as it’s packaged as a Docker image | Limited runtime |
|
||||
| Rely on EBS / instance store for disk space | Limited temporary disk space |
|
||||
| Relies on EC2 (can be managed by AWS) | Serverless |
|
||||
|
||||
## Amazon Lightsail
|
||||
|
||||
* Virtual servers, storage, databases, and networking
|
||||
* Low & predictable pricing
|
||||
* Simpler alternative to using EC2, RDS, ELB, EBS, Route 53…
|
||||
* Great for people with little cloud experience!
|
||||
* Can set up notifications and monitoring of your Lightsail resources
|
||||
* Use cases:
|
||||
* Simple web applications (has templates for LAMP, Nginx, MEAN, Node.js…)
|
||||
* Websites (templates for WordPress, Magento, Plesk, Joomla)
|
||||
* Dev / Test environment
|
||||
* Has high availability but no auto-scaling, limited AWS integrations
|
||||
- Virtual servers, storage, databases, and networking
|
||||
- Low & predictable pricing
|
||||
- Simpler alternative to using EC2, RDS, ELB, EBS, Route 53…
|
||||
- Great for people with little cloud experience!
|
||||
- Can set up notifications and monitoring of your Lightsail resources
|
||||
- Use cases:
|
||||
- Simple web applications (has templates for LAMP, Nginx, MEAN, Node.js…)
|
||||
- Websites (templates for WordPress, Magento, Plesk, Joomla)
|
||||
- Dev / Test environment
|
||||
- Has high availability but no auto-scaling, limited AWS integrations
|
||||
|
||||
## Lambda Summary
|
||||
|
||||
* Lambda is Serverless, Function as a Service, seamless scaling, reactive
|
||||
* Lambda Billing:
|
||||
* By execution time × RAM provisioned (GB-seconds)
|
||||
* By the number of invocations
|
||||
* Language Support: many programming languages except (arbitrary) Docker
|
||||
* Invocation time: up to 15 minutes
|
||||
* Use cases:
|
||||
* Create Thumbnails for images uploaded onto S3
|
||||
* Run a Serverless cron job
|
||||
* API Gateway: expose Lambda functions as HTTP API
|
||||
- Lambda is Serverless, Function as a Service, seamless scaling, reactive
|
||||
- Lambda Billing:
|
||||
- By execution time × RAM provisioned (GB-seconds)
|
||||
- By the number of invocations
|
||||
- Language Support: many programming languages except (arbitrary) Docker
|
||||
- Invocation time: up to 15 minutes
|
||||
- Use cases:
|
||||
- Create Thumbnails for images uploaded onto S3
|
||||
- Run a Serverless cron job
|
||||
- API Gateway: expose Lambda functions as HTTP API
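
To make the “expose Lambda functions as HTTP API” use case concrete, here is a minimal sketch of a Python handler. It assumes the function sits behind an API Gateway proxy integration, so `event` carries `queryStringParameters` and the return value needs `statusCode` and `body`; the greeting logic is purely illustrative.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls on each invocation.

    Assumes `event` is an API Gateway proxy-integration request.
    """
    # queryStringParameters can be absent or None when no query string is sent
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is a plain function, it can be exercised locally by passing a dictionary as `event` and `None` as `context`.
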
|
||||
|
||||
## Other Compute Summary
|
||||
|
||||
* Docker: container technology to run applications
|
||||
* ECS: run Docker containers on EC2 instances
|
||||
* Fargate:
|
||||
* Run Docker containers without provisioning the infrastructure
|
||||
* Serverless offering (no EC2 instances)
|
||||
* ECR: Private Docker Images Repository
|
||||
* Batch: run batch jobs on AWS across managed EC2 instances
|
||||
* Lightsail: predictable & low pricing for simple application & DB stacks
|
||||
- Docker: container technology to run applications
|
||||
- ECS: run Docker containers on EC2 instances
|
||||
- Fargate:
|
||||
- Run Docker containers without provisioning the infrastructure
|
||||
- Serverless offering (no EC2 instances)
|
||||
- ECR: Private Docker Images Repository
|
||||
- Batch: run batch jobs on AWS across managed EC2 instances
|
||||
- Lightsail: predictable & low pricing for simple application & DB stacks
|
||||
|
||||
565
sections/s3.md
@@ -1,71 +1,109 @@
|
||||
# Amazon S3
|
||||
|
||||
- [Amazon S3](#amazon-s3)
|
||||
- [S3 Use cases](#s3-use-cases)
|
||||
- [Amazon S3 Overview - Buckets](#amazon-s3-overview---buckets)
|
||||
- [Amazon S3 Overview - Objects](#amazon-s3-overview---objects)
|
||||
- [S3 Security](#s3-security)
|
||||
- [S3 Bucket Policies](#s3-bucket-policies)
|
||||
- [Bucket settings for Block Public Access](#bucket-settings-for-block-public-access)
|
||||
- [S3 Websites](#s3-websites)
|
||||
- [S3 - Versioning](#s3---versioning)
|
||||
- [S3 Access Logs](#s3-access-logs)
|
||||
- [S3 Replication (CRR & SRR)](#s3-replication-crr--srr)
|
||||
- [S3 Storage Classes](#s3-storage-classes)
|
||||
- [S3 Durability and Availability](#s3-durability-and-availability)
|
||||
- [S3 Standard General Purpose](#s3-standard-general-purpose)
|
||||
- [S3 Storage Classes - Infrequent Access](#s3-storage-classes---infrequent-access)
|
||||
- [S3 Standard Infrequent Access (S3 Standard-IA)](#s3-standard-infrequent-access-s3-standard-ia)
|
||||
- [S3 One Zone Infrequent Access (S3 One Zone-IA)](#s3-one-zone-infrequent-access-s3-one-zone-ia)
|
||||
- [Amazon S3 Glacier Storage Classes](#amazon-s3-glacier-storage-classes)
|
||||
- [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
|
||||
- [Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
|
||||
- [Amazon S3 Glacier Deep Archive - for long term storage](#amazon-s3-glacier-deep-archive---for-long-term-storage)
|
||||
- [S3 Intelligent-Tiering](#s3-intelligent-tiering)
|
||||
- [S3 Object Lock & Glacier Vault Lock](#s3-object-lock--glacier-vault-lock)
|
||||
- [Shared Responsibility Model for S3](#shared-responsibility-model-for-s3)
|
||||
- [AWS Snow Family](#aws-snow-family)
|
||||
- [Data Migrations with AWS Snow Family](#data-migrations-with-aws-snow-family)
|
||||
- [Time to Transfer](#time-to-transfer)
|
||||
- [Snowball Edge (for data transfers)](#snowball-edge-for-data-transfers)
|
||||
- [AWS Snowcone](#aws-snowcone)
|
||||
- [AWS Snowmobile](#aws-snowmobile)
|
||||
- [Snow Family - Usage Process](#snow-family---usage-process)
|
||||
- [What is Edge Computing?](#what-is-edge-computing)
|
||||
- [Snow Family - Edge Computing](#snow-family---edge-computing)
|
||||
- [AWS OpsHub](#aws-opshub)
|
||||
- [Hybrid Cloud for Storage](#hybrid-cloud-for-storage)
|
||||
- [AWS Storage Gateway](#aws-storage-gateway)
|
||||
- [Amazon S3 - Summary](#amazon-s3---summary)
|
||||
|
||||
## S3 Use cases
|
||||
|
||||
* Backup and storage
|
||||
* Disaster Recovery
|
||||
* Archive
|
||||
* Hybrid Cloud storage
|
||||
* Application hosting
|
||||
* Media hosting
|
||||
* Data lakes & big data analytics
|
||||
* Software delivery
|
||||
* Static website
|
||||
- Backup and storage
|
||||
- Disaster Recovery
|
||||
- Archive
|
||||
- Hybrid Cloud storage
|
||||
- Application hosting
|
||||
- Media hosting
|
||||
- Data lakes & big data analytics
|
||||
- Software delivery
|
||||
- Static website
|
||||
|
||||
## Amazon S3 Overview - Buckets
|
||||
|
||||
* Amazon S3 allows people to store objects (files) in “buckets” (directories)
|
||||
* Buckets must have a globally unique name (across all regions all accounts)
|
||||
* Buckets are defined at the region level
|
||||
* S3 looks like a global service but buckets are created in a region
|
||||
* Naming convention
|
||||
* No uppercase
|
||||
* No underscore
|
||||
* 3-63 characters long
|
||||
* Not an IP
|
||||
* Must start with lowercase letter or number
|
||||
- Amazon S3 allows people to store objects (files) in “buckets” (directories)
|
||||
- Buckets must have a globally unique name (across all regions all accounts)
|
||||
- Buckets are defined at the region level
|
||||
- S3 looks like a global service but buckets are created in a region
|
||||
- Naming convention
|
||||
- No uppercase
|
||||
- No underscore
|
||||
- 3-63 characters long
|
||||
- Not an IP
|
||||
- Must start with lowercase letter or number
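
The naming rules listed above can be checked with a short helper. This is a sketch covering only the rules in these notes (S3 enforces a few more, so treat a `True` result as necessary rather than sufficient); `is_valid_bucket_name` is a hypothetical name.

```python
import ipaddress
import re

# Starts with a lowercase letter or digit, then lowercase letters, digits,
# dots or hyphens; 3 to 63 characters in total. No uppercase, no underscore.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{2,62}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a bucket name against the rules listed in the notes."""
    if not _BUCKET_RE.fullmatch(name):
        return False
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True   # not formatted like an IP address, so acceptable
    return False      # looks like an IP address, which is not allowed
```
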
|
||||
|
||||
## Amazon S3 Overview - Objects
|
||||
|
||||
* Objects (files) have a Key
|
||||
* The key is the FULL path:
|
||||
* s3://my-bucket/my_file.txt
|
||||
* s3://my-bucket/my_folder1/another_folder/my_file.txt
|
||||
* The key is composed of **prefix** + **object name**
|
||||
* s3://my-bucket/my_folder1/another_folder/my_file.txt
|
||||
* There’s no concept of “directories” within buckets (although the UI will trick you into thinking otherwise)
|
||||
* Just keys with very long names that contain slashes (“/”)
|
||||
* Object values are the content of the body:
|
||||
* Max Object Size is 5TB (5000GB)
|
||||
* If uploading more than 5GB, must use “multi-part upload”
|
||||
* Metadata (list of text key / value pairs – system or user metadata)
|
||||
* Tags (Unicode key / value pair – up to 10) – useful for security / lifecycle
|
||||
* Version ID (if versioning is enabled)
|
||||
- Objects (files) have a Key
|
||||
- The key is the FULL path:
|
||||
- s3://my-bucket/my_file.txt
|
||||
- s3://my-bucket/my_folder1/another_folder/my_file.txt
|
||||
- The key is composed of **prefix** + **object name**
|
||||
- s3://my-bucket/my_folder1/another_folder/my_file.txt
|
||||
- There’s no concept of “directories” within buckets (although the UI will trick you into thinking otherwise)
|
||||
- Just keys with very long names that contain slashes (“/”)
|
||||
- Object values are the content of the body:
|
||||
- Max Object Size is 5TB (5000GB)
|
||||
- If uploading more than 5GB, must use “multi-part upload”
|
||||
- Metadata (list of text key / value pairs – system or user metadata)
|
||||
- Tags (Unicode key / value pair – up to 10) – useful for security / lifecycle
|
||||
- Version ID (if versioning is enabled)
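
Since a key is just a string with slashes, the prefix and object name can be recovered with plain string handling. A minimal sketch, where `split_key` is a hypothetical helper and the key is given without the `s3://my-bucket/` part:

```python
def split_key(key: str) -> tuple[str, str]:
    """Split an S3 object key into (prefix, object name)."""
    # Everything up to and including the last "/" is the prefix
    prefix, sep, name = key.rpartition("/")
    return prefix + sep, name
```

A key with no slash, such as `my_file.txt`, simply has an empty prefix.
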
|
||||
|
||||
## S3 Security
|
||||
|
||||
* **User based**
|
||||
* IAM policies - which API calls should be allowed for a specific user from IAM console
|
||||
* **Resource Based**
|
||||
* Bucket Policies - bucket wide rules from the S3 console - allows cross account
|
||||
* Object Access Control List (ACL) – finer grain
|
||||
* Bucket Access Control List (ACL) – less common
|
||||
* **Note:** an IAM principal can access an S3 object if
|
||||
* the user IAM permissions allow it OR the resource policy ALLOWS it
|
||||
* AND there’s no explicit DENY
|
||||
* **Encryption:** encrypt objects in Amazon S3 using encryption keys
|
||||
- **User based**
|
||||
- IAM policies - which API calls should be allowed for a specific user from IAM console
|
||||
- **Resource Based**
|
||||
- Bucket Policies - bucket wide rules from the S3 console - allows cross account
|
||||
- Object Access Control List (ACL) – finer grain
|
||||
- Bucket Access Control List (ACL) – less common
|
||||
- **Note:** an IAM principal can access an S3 object if
|
||||
- the user IAM permissions allow it OR the resource policy ALLOWS it
|
||||
- AND there’s no explicit DENY
|
||||
- **Encryption:** encrypt objects in Amazon S3 using encryption keys
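
The access rule in the note above reads as a single boolean expression. The sketch below is a simplification that ignores everything S3 evaluates beyond those three inputs; `can_access` and its parameters are hypothetical names.

```python
def can_access(iam_allows: bool, resource_policy_allows: bool, explicit_deny: bool) -> bool:
    """(IAM permissions allow OR resource policy allows) AND no explicit deny."""
    return (iam_allows or resource_policy_allows) and not explicit_deny
```

An explicit deny always wins, regardless of how many policies allow the request.
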
|
||||
|
||||
S3 Bucket Policies
|
||||
## S3 Bucket Policies
|
||||
|
||||
* JSON based policies
|
||||
* Resources: buckets and objects
|
||||
* Actions: Set of API to Allow or Deny
|
||||
* Effect: Allow / Deny
|
||||
- JSON based policies
|
||||
- Resources: buckets and objects
|
||||
- Actions: Set of API to Allow or Deny
|
||||
- Effect: Allow / Deny
|
||||
- Principal: The account or user to apply the policy to
|
||||
* Use an S3 bucket policy to:
|
||||
* Grant public access to the bucket
|
||||
* Force objects to be encrypted at upload
|
||||
* Grant access to another account (Cross Account)
|
||||
- Use an S3 bucket policy to:
|
||||
- Grant public access to the bucket
|
||||
- Force objects to be encrypted at upload
|
||||
- Grant access to another account (Cross Account)
|
||||
|
||||
```json
|
||||
{
|
||||
@@ -88,215 +126,216 @@ S3 Bucket Policies
|
||||
|
||||
## Bucket settings for Block Public Access
|
||||
|
||||
* Block all public access: On
|
||||
* Block public access to buckets and objects granted through new access control lists (ACLs): On
|
||||
* Block public access to buckets and objects granted through any access control lists (ACLs): On
|
||||
* Block public access to buckets and objects granted through new public bucket or access point policies: On
|
||||
* Block public and cross-account access to buckets and objects through any public bucket or access point policies: On
|
||||
- Block all public access: On
|
||||
- Block public access to buckets and objects granted through new access control lists (ACLs): On
|
||||
- Block public access to buckets and objects granted through any access control lists (ACLs): On
|
||||
- Block public access to buckets and objects granted through new public bucket or access point policies: On
|
||||
- Block public and cross-account access to buckets and objects through any public bucket or access point policies: On
|
||||
|
||||
* These settings were created to prevent company data leaks
|
||||
* If you know your bucket should never be public, leave these on
|
||||
* Can be set at the account level
|
||||
- These settings were created to prevent company data leaks
|
||||
- If you know your bucket should never be public, leave these on
|
||||
- Can be set at the account level
|
||||
|
||||
## S3 Websites
|
||||
|
||||
* S3 can host static websites and have them accessible on the www
|
||||
* The website URL will be:
|
||||
* bucket-name.s3-website-AWS-region.amazonaws.com
|
||||
- S3 can host static websites and have them accessible on the www
|
||||
- The website URL will be:
|
||||
- bucket-name.s3-website-AWS-region.amazonaws.com
|
||||
OR
|
||||
* bucket-name.s3-website.AWS-region.amazonaws.com
|
||||
* **If you get a 403 (Forbidden) error, make sure the bucket policy allows public reads!**
|
||||
- bucket-name.s3-website.AWS-region.amazonaws.com
|
||||
- **If you get a 403 (Forbidden) error, make sure the bucket policy allows public reads!**
|
||||
|
||||
## S3 - Versioning
|
||||
|
||||
* You can version your files in Amazon S3
|
||||
* It is enabled at the bucket level
|
||||
* Same key overwrite will increment the “version”: 1, 2, 3….
|
||||
* It is best practice to version your buckets
|
||||
* Protect against unintended deletes (ability to restore a version)
|
||||
* Easy roll back to previous version
|
||||
* Notes:
|
||||
* Any file that is not versioned prior to enabling versioning will have version “null”
|
||||
* Suspending versioning does not delete the previous versions
|
||||
- You can version your files in Amazon S3
|
||||
- It is enabled at the bucket level
|
||||
- Same key overwrite will increment the “version”: 1, 2, 3….
|
||||
- It is best practice to version your buckets
|
||||
- Protect against unintended deletes (ability to restore a version)
|
||||
- Easy roll back to previous version
|
||||
- Notes:
|
||||
- Any file that is not versioned prior to enabling versioning will have version “null”
|
||||
- Suspending versioning does not delete the previous versions
|
||||
|
||||
## S3 Access Logs
|
||||
|
||||
* For audit purposes, you may want to log all access to S3 buckets
|
||||
* Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket
|
||||
* That data can be analyzed using data analysis tools…
|
||||
* Very helpful to get to the root cause of an issue, audit usage, or spot suspicious patterns
|
||||
- For audit purposes, you may want to log all access to S3 buckets
|
||||
- Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket
|
||||
- That data can be analyzed using data analysis tools…
|
||||
- Very helpful to get to the root cause of an issue, audit usage, or spot suspicious patterns
|
||||
|
||||
## S3 Replication (CRR & SRR)
|
||||
|
||||
* Must enable versioning in source and destination
|
||||
* Cross Region Replication (CRR)
|
||||
* Same Region Replication (SRR)
|
||||
* Buckets can be in different accounts
|
||||
* Copying is asynchronous
|
||||
* Must give proper IAM permissions to S3
|
||||
* CRR - Use cases: compliance, lower latency access, replication across accounts
|
||||
* SRR – Use cases: log aggregation, live replication between production and test accounts
|
||||
- Must enable versioning in source and destination
|
||||
- Cross Region Replication (CRR)
|
||||
- Same Region Replication (SRR)
|
||||
- Buckets can be in different accounts
|
||||
- Copying is asynchronous
|
||||
- Must give proper IAM permissions to S3
|
||||
- CRR - Use cases: compliance, lower latency access, replication across accounts
|
||||
- SRR – Use cases: log aggregation, live replication between production and test accounts
|
||||
|
||||
## S3 Storage Classes
|
||||
|
||||
* [Amazon S3 Standard - General Purpose](#s3-standard-general-purpose)
|
||||
* [Amazon S3 Standard - Infrequent Access (IA)](#s3-standard-infrequent-access-s3-standard-ia)
|
||||
* [Amazon S3 One Zone - Infrequent Access](#s3-one-zone-infrequent-access-s3-one-zone-ia)
|
||||
* [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
|
||||
* [Amazon S3 Glacier Flexible Retrieval](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
|
||||
* [Amazon S3 Glacier Deep Archive](#amazon-s3-glacier-deep-archive-–-for-long-term-storage)
|
||||
* [Amazon S3 Intelligent Tiering](#s3-intelligent-tiering)
|
||||
- [Amazon S3 Standard - General Purpose](#s3-standard-general-purpose)
|
||||
- [Amazon S3 Standard - Infrequent Access (IA)](#s3-standard-infrequent-access-s3-standard-ia)
|
||||
- [Amazon S3 One Zone - Infrequent Access](#s3-one-zone-infrequent-access-s3-one-zone-ia)
|
||||
- [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
|
||||
- [Amazon S3 Glacier Flexible Retrieval](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
|
||||
- [Amazon S3 Glacier Deep Archive](#amazon-s3-glacier-deep-archive---for-long-term-storage)
|
||||
- [Amazon S3 Intelligent Tiering](#s3-intelligent-tiering)
|
||||
|
||||
* Can move between classes manually or using S3 Lifecycle configurations
|
||||
- Can move between classes manually or using S3 Lifecycle configurations
|
||||
|
||||
## S3 Durability and Availability
|
||||
### S3 Durability and Availability
|
||||
|
||||
* Durability:
|
||||
* High durability (99.999999999%, 11 9’s) of objects across multiple AZ
|
||||
* If you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years
|
||||
* Same for all storage classes
|
||||
* Availability:
|
||||
* Measures how readily available a service is
|
||||
* Varies depending on storage class
|
||||
* Example: S3 standard has 99.99% availability = not available 53 minutes a year
|
||||
- Durability:
|
||||
- High durability (99.999999999%, 11 9’s) of objects across multiple AZ
|
||||
- If you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years
|
||||
- Same for all storage classes
|
||||
- Availability:
|
||||
- Measures how readily available a service is
|
||||
- Varies depending on storage class
|
||||
- Example: S3 standard has 99.99% availability = not available 53 minutes a year
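
Both figures above follow from simple arithmetic, sketched here with `Decimal` to avoid floating-point noise around the eleven nines. The function names are hypothetical; the percentages are the ones quoted in the notes.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability_percent: str) -> Decimal:
    """Expected unavailable minutes per year for a given availability."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * MINUTES_PER_YEAR


def years_per_lost_object(object_count: int, durability_percent: str) -> Decimal:
    """Average number of years between single-object losses."""
    annual_loss_rate = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return Decimal(1) / (annual_loss_rate * object_count)
```

With 99.99% availability the first function gives 52.56 minutes, which the notes round to 53; with 10,000,000 objects at eleven nines the second gives 10,000 years.
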
|
||||
|
||||
## S3 Standard General Purpose
|
||||
### S3 Standard General Purpose
|
||||
|
||||
* 99.99% Availability
|
||||
* Used for frequently accessed data
|
||||
* Low latency and high throughput
|
||||
* Sustain 2 concurrent facility failures
|
||||
* Use Cases: Big Data analytics, mobile & gaming applications, content distribution…
|
||||
- 99.99% Availability
|
||||
- Used for frequently accessed data
|
||||
- Low latency and high throughput
|
||||
- Sustain 2 concurrent facility failures
|
||||
- Use Cases: Big Data analytics, mobile & gaming applications, content distribution…
|
||||
|
||||
## S3 Storage Classes – Infrequent Access
|
||||
### S3 Storage Classes - Infrequent Access
|
||||
|
||||
* For data that is less frequently accessed, but requires rapid access when needed
|
||||
* Lower cost than S3 Standard
|
||||
- For data that is less frequently accessed, but requires rapid access when needed
|
||||
- Lower cost than S3 Standard
|
||||
|
||||
### S3 Standard Infrequent Access (S3 Standard-IA)
|
||||
#### S3 Standard Infrequent Access (S3 Standard-IA)
|
||||
|
||||
* 99.9% Availability
|
||||
* Use cases: Disaster Recovery, backups
|
||||
- 99.9% Availability
|
||||
- Use cases: Disaster Recovery, backups
|
||||
|
||||
### S3 One Zone Infrequent Access (S3 One Zone-IA)
|
||||
#### S3 One Zone Infrequent Access (S3 One Zone-IA)
|
||||
|
||||
* High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed
|
||||
* 99.5% Availability
|
||||
* Use Cases: Storing secondary backup copies of on-premise data, or data you can recreate
|
||||
- High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed
|
||||
- 99.5% Availability
|
||||
- Use Cases: Storing secondary backup copies of on-premise data, or data you can recreate
|
||||
|
||||
## Amazon S3 Glacier Storage Classes
|
||||
### Amazon S3 Glacier Storage Classes
|
||||
|
||||
* Low-cost object storage meant for archiving / backup
|
||||
* Pricing: price for storage + object retrieval cost
|
||||
- Low-cost object storage meant for archiving / backup
|
||||
- Pricing: price for storage + object retrieval cost
|
||||
|
||||
### Amazon S3 Glacier Instant Retrieval
|
||||
#### Amazon S3 Glacier Instant Retrieval
|
||||
|
||||
* Millisecond retrieval, great for data accessed once a quarter
|
||||
* Minimum storage duration of 90 days
|
||||
- Millisecond retrieval, great for data accessed once a quarter
|
||||
- Minimum storage duration of 90 days
|
||||
|
||||
### Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)
|
||||
#### Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)
|
||||
|
||||
* Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours) – free
|
||||
* Minimum storage duration of 90 days
|
||||
- Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours) – free
|
||||
- Minimum storage duration of 90 days
|
||||
|
||||
### Amazon S3 Glacier Deep Archive – for long term storage
|
||||
#### Amazon S3 Glacier Deep Archive - for long term storage
|
||||
|
||||
* Standard (12 hours), Bulk (48 hours)
|
||||
* Minimum storage duration of 180 days
|
||||
- Standard (12 hours), Bulk (48 hours)
|
||||
- Minimum storage duration of 180 days
|
||||
|
||||
## S3 Intelligent-Tiering
|
||||
### S3 Intelligent-Tiering
|
||||
|
||||
* Small monthly monitoring and auto-tiering fee
|
||||
* Moves objects automatically between Access Tiers based on usage
|
||||
* There are no retrieval charges in S3 Intelligent-Tiering
|
||||
* Frequent Access tier (automatic): default tier
|
||||
* Infrequent Access tier (automatic): objects not accessed for 30 days
|
||||
* Archive Instant Access tier (automatic): objects not accessed for 90 days
|
||||
* Archive Access tier (optional): configurable from 90 days to 700+ days
|
||||
* Deep Archive Access tier (optional): config. from 180 days to 700+ days
|
||||
- Small monthly monitoring and auto-tiering fee
|
||||
- Moves objects automatically between Access Tiers based on usage
|
||||
- There are no retrieval charges in S3 Intelligent-Tiering
|
||||
- Frequent Access tier (automatic): default tier
|
||||
- Infrequent Access tier (automatic): objects not accessed for 30 days
|
||||
- Archive Instant Access tier (automatic): objects not accessed for 90 days
|
||||
- Archive Access tier (optional): configurable from 90 days to 700+ days
|
||||
- Deep Archive Access tier (optional): config. from 180 days to 700+ days
|
||||
|
||||
## S3 Object Lock & Glacier Vault Lock
|
||||
|
||||
* S3 Object Lock
|
||||
* Adopt a WORM (Write Once Read Many) model
|
||||
* Block an object version deletion for a specified amount of time
|
||||
* Glacier Vault Lock
|
||||
* Adopt a WORM (Write Once Read Many) model
|
||||
* Lock the policy for future edits (can no longer be changed)
|
||||
* Helpful for compliance and data retention
|
||||
- S3 Object Lock
|
||||
- Adopt a WORM (Write Once Read Many) model
|
||||
- Block an object version deletion for a specified amount of time
|
||||
- Glacier Vault Lock
|
||||
- Adopt a WORM (Write Once Read Many) model
|
||||
- Lock the policy for future edits (can no longer be changed)
|
||||
- Helpful for compliance and data retention
|
||||
|
||||
## Shared Responsibility Model for S3

AWS | YOU
---- | ----
Infrastructure (global security, durability, availability, sustain concurrent loss of data in two facilities) | S3 Versioning, S3 Bucket Policies, S3 Replication Setup
Configuration and vulnerability analysis | Logging and Monitoring, S3 Storage Classes
Compliance validation | Data encryption at rest and in transit
| AWS | YOU |
| ------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- |
| Infrastructure (global security, durability, availability, sustain concurrent loss of data in two facilities) | S3 Versioning, S3 Bucket Policies, S3 Replication Setup |
| Configuration and vulnerability analysis | Logging and Monitoring, S3 Storage Classes |
| Compliance validation | Data encryption at rest and in transit |

## AWS Snow Family

* Highly secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
* Data migration:
  * Snowcone
  * Snowball Edge
  * Snowmobile
* Edge computing:
  * Snowcone
  * Snowball Edge
- Highly secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
- Data migration:
  - Snowcone
  - Snowball Edge
  - Snowmobile
- Edge computing:
  - Snowcone
  - Snowball Edge

## Data Migrations with AWS Snow Family
### Data Migrations with AWS Snow Family

* **AWS Snow Family: offline devices to perform data migrations.** If it takes more than a week to transfer over the network, use Snowball devices!
- **AWS Snow Family: offline devices to perform data migrations.** If it takes more than a week to transfer over the network, use Snowball devices!

* Challenges:
  * Limited connectivity
  * Limited bandwidth
  * High network cost
  * Shared bandwidth (can’t maximize the line)
  * Connection stability
- Challenges:
  - Limited connectivity
  - Limited bandwidth
  - High network cost
  - Shared bandwidth (can’t maximize the line)
  - Connection stability

## Time to Transfer
### Time to Transfer

Data | 100 Mbps | 1 Gbps | 10 Gbps
10 TB | 12 days | 30 hours | 3 hours
100 TB | 124 days | 12 days | 30 hours
1 PB | 3 years | 124 days | 12 days
| Data | 100 Mbps | 1 Gbps | 10 Gbps |
| ------ | -------- | -------- | -------- |
| 10 TB | 12 days | 30 hours | 3 hours |
| 100 TB | 124 days | 12 days | 30 hours |
| 1 PB | 3 years | 124 days | 12 days |

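The figures in the table follow from simple arithmetic: data size in bits divided by the effective link speed. The Python sketch below is a rough estimate, not an official calculator. The function names are hypothetical, and the default `utilization` of 0.8 is an assumption chosen because it roughly reproduces the table (about 11.6 days for 10 TB at 100 Mbps); the source does not say how its numbers were derived.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Days needed to move data_tb terabytes over a link_mbps link.

    Uses decimal units (1 TB = 10**12 bytes, 1 Mbps = 10**6 bits/s).
    utilization is the assumed fraction of the line rate actually achieved.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86_400


def prefer_snow_device(data_tb: float, link_mbps: float, utilization: float = 0.8) -> bool:
    """Rule of thumb from the text: more than a week over the network means ship a device."""
    return transfer_days(data_tb, link_mbps, utilization) > 7


for tb, mbps in [(10, 100), (100, 1_000), (1_000, 10_000)]:
    print(f"{tb} TB at {mbps} Mbps: {transfer_days(tb, mbps):.1f} days")
```

At full line rate (`utilization=1.0`) the same 10 TB over 100 Mbps takes about 9.3 days, which shows how sensitive the estimate is to shared or unstable links.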
## Snowball Edge (for data transfers)
### Snowball Edge (for data transfers)

* Physical data transport solution: move TBs or PBs of data in or out of AWS
* Alternative to moving data over the network (and paying network fees)
* Pay per data transfer job
* Provides block storage and Amazon S3-compatible object storage
* Snowball Edge Storage Optimized
  * 80 TB of HDD capacity for block volume and S3-compatible object storage
* Snowball Edge Compute Optimized
  * 42 TB of HDD capacity for block volume and S3-compatible object storage
* Use cases: large data cloud migrations, DC decommission, disaster recovery
- Physical data transport solution: move TBs or PBs of data in or out of AWS
- Alternative to moving data over the network (and paying network fees)
- Pay per data transfer job
- Provides block storage and Amazon S3-compatible object storage
- Snowball Edge Storage Optimized
  - 80 TB of HDD capacity for block volume and S3-compatible object storage
- Snowball Edge Compute Optimized
  - 42 TB of HDD capacity for block volume and S3-compatible object storage
- Use cases: large data cloud migrations, DC decommission, disaster recovery

## AWS Snowcone
### AWS Snowcone

* Small, portable computing device: rugged, secure, and able to withstand harsh environments
* Light (4.5 pounds, 2.1 kg)
* Device used for edge computing, storage, and data transfer
* **8 TB of usable storage**
* Use Snowcone where Snowball does not fit (space-constrained environment)
* Must provide your own battery / cables
* Can be sent back to AWS offline, or connected to the internet to send data with **AWS DataSync**
- Small, portable computing device: rugged, secure, and able to withstand harsh environments
- Light (4.5 pounds, 2.1 kg)
- Device used for edge computing, storage, and data transfer
- **8 TB of usable storage**
- Use Snowcone where Snowball does not fit (space-constrained environment)
- Must provide your own battery / cables
- Can be sent back to AWS offline, or connected to the internet to send data with **AWS DataSync**

## AWS Snowmobile
### AWS Snowmobile

* Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TB)
* Each Snowmobile has 100 PB of capacity (use multiple in parallel)
* High security: temperature controlled, GPS, 24/7 video surveillance
* **Better than Snowball if you transfer more than 10 PB**
- Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TB)
- Each Snowmobile has 100 PB of capacity (use multiple in parallel)
- High security: temperature controlled, GPS, 24/7 video surveillance
- **Better than Snowball if you transfer more than 10 PB**

Properties | Snowcone | Snowball Edge Storage Optimized | Snowmobile
---- | ---- | ---- | ----
Storage Capacity | 8 TB usable | 80 TB usable | < 100 PB
Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up to exabytes, offline
| Properties | Snowcone | Snowball Edge Storage Optimized | Snowmobile |
| ---------------- | ------------------------------- | ------------------------------- | ----------------------- |
| Storage Capacity | 8 TB usable | 80 TB usable | < 100 PB |
| Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up to exabytes, offline |

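The sizing guidance in the comparison above can be written as a small decision helper. This is a sketch under stated assumptions: the function names are hypothetical, the cut-offs (24 TB for Snowcone, 10 PB for Snowmobile) are taken directly from the notes, and real device selection depends on more than data volume.

```python
import math


def suggest_snow_device(migration_tb: float) -> str:
    """Pick a Snow Family device from the migration size alone (in TB)."""
    if migration_tb <= 0:
        raise ValueError("migration_tb must be positive")
    if migration_tb <= 24:          # "Up to 24 TB" per the comparison table
        return "Snowcone"
    if migration_tb <= 10_000:      # 10 PB = 10,000 TB
        return "Snowball Edge Storage Optimized"
    return "Snowmobile"             # "Better than Snowball ... more than 10 PB"


def devices_needed(migration_tb: float, usable_tb_per_device: float) -> int:
    """Number of devices required if each one holds usable_tb_per_device."""
    return math.ceil(migration_tb / usable_tb_per_device)


print(suggest_snow_device(500))   # Snowball Edge Storage Optimized
print(devices_needed(500, 80))    # 7
```

A 500 TB migration, for instance, falls in the Snowball Edge range and would need seven devices at 80 TB usable each.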
## Snow Family – Usage Process
### Snow Family - Usage Process

1. Request Snowball devices from the AWS console for delivery
2. Install the Snowball client / AWS OpsHub on your servers
@@ -307,78 +346,78 @@ Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up

## What is Edge Computing?

* Process data while it’s being created at an edge location
  * A truck on the road, a ship on the sea, a mining station underground...
* These locations may have
  * Limited / no internet access
  * Limited / no easy access to computing power
* We set up a **Snowball Edge / Snowcone** device to do edge computing
* Use cases of Edge Computing:
  * Preprocess data
  * Machine learning at the edge
  * Transcoding media streams
* Eventually (if need be), we can ship the device back to AWS (to transfer data, for example)
- Process data while it’s being created at an edge location
  - A truck on the road, a ship on the sea, a mining station underground...
- These locations may have
  - Limited / no internet access
  - Limited / no easy access to computing power
- We set up a **Snowball Edge / Snowcone** device to do edge computing
- Use cases of Edge Computing:
  - Preprocess data
  - Machine learning at the edge
  - Transcoding media streams
- Eventually (if need be), we can ship the device back to AWS (to transfer data, for example)

## Snow Family – Edge Computing
## Snow Family - Edge Computing

* **Snowcone (smaller)**
  * 2 CPUs, 4 GB of memory, wired or wireless access
  * USB-C power using a cord or the optional battery
* **Snowball Edge – Compute Optimized**
  * 52 vCPUs, 208 GiB of RAM
  * Optional GPU (useful for video processing or machine learning)
  * 42 TB usable storage
* **Snowball Edge – Storage Optimized**
  * Up to 40 vCPUs, 80 GiB of RAM
  * Object storage clustering available
* All: Can run EC2 Instances & AWS Lambda functions (using AWS IoT Greengrass)
* Long-term deployment options: 1-year and 3-year discounted pricing
- **Snowcone (smaller)**
  - 2 CPUs, 4 GB of memory, wired or wireless access
  - USB-C power using a cord or the optional battery
- **Snowball Edge – Compute Optimized**
  - 52 vCPUs, 208 GiB of RAM
  - Optional GPU (useful for video processing or machine learning)
  - 42 TB usable storage
- **Snowball Edge – Storage Optimized**
  - Up to 40 vCPUs, 80 GiB of RAM
  - Object storage clustering available
- All: Can run EC2 Instances & AWS Lambda functions (using AWS IoT Greengrass)
- Long-term deployment options: 1-year and 3-year discounted pricing

## AWS OpsHub

* Historically, to use Snow Family devices, you needed a CLI (Command Line Interface tool)
* Today, you can use **AWS OpsHub** (software you install on your computer / laptop) to manage your Snow Family device
  * Unlocking and configuring single or clustered devices
  * Transferring files
  * Launching and managing instances running on Snow Family devices
  * Monitoring device metrics (storage capacity, active instances on your device)
  * Launching compatible AWS services on your devices (e.g., Amazon EC2 instances, AWS DataSync, Network File System (NFS))
- Historically, to use Snow Family devices, you needed a CLI (Command Line Interface tool)
- Today, you can use **AWS OpsHub** (software you install on your computer / laptop) to manage your Snow Family device
  - Unlocking and configuring single or clustered devices
  - Transferring files
  - Launching and managing instances running on Snow Family devices
  - Monitoring device metrics (storage capacity, active instances on your device)
  - Launching compatible AWS services on your devices (e.g., Amazon EC2 instances, AWS DataSync, Network File System (NFS))

## Hybrid Cloud for Storage

* AWS is pushing for “hybrid cloud”
  * Part of your infrastructure is on-premises
  * Part of your infrastructure is in the cloud
* This can be due to
  * Long cloud migrations
  * Security requirements
  * Compliance requirements
  * IT strategy
* S3 is a proprietary storage technology (unlike EFS / NFS), so how do you expose the S3 data on-premises?
* AWS Storage Gateway!
- AWS is pushing for “hybrid cloud”
  - Part of your infrastructure is on-premises
  - Part of your infrastructure is in the cloud
- This can be due to
  - Long cloud migrations
  - Security requirements
  - Compliance requirements
  - IT strategy
- S3 is a proprietary storage technology (unlike EFS / NFS), so how do you expose the S3 data on-premises?
- AWS Storage Gateway!

## AWS Storage Gateway

* Bridge between on-premises data and cloud data in S3
* Hybrid storage service that lets on-premises environments seamlessly use the AWS Cloud
* Use cases: disaster recovery, backup & restore, tiered storage
* Types of Storage Gateway:
  * File Gateway
  * Volume Gateway
  * Tape Gateway
* No need to know the types for the exam
- Bridge between on-premises data and cloud data in S3
- Hybrid storage service that lets on-premises environments seamlessly use the AWS Cloud
- Use cases: disaster recovery, backup & restore, tiered storage
- Types of Storage Gateway:
  - File Gateway
  - Volume Gateway
  - Tape Gateway
- No need to know the types for the exam

## Amazon S3 – Summary
## Amazon S3 - Summary

* Buckets vs Objects: globally unique name, tied to a region
* S3 security: IAM policy, S3 Bucket Policy (public access), S3 Encryption
* S3 Websites: host a static website on Amazon S3
* S3 Versioning: multiple versions for files, prevent accidental deletes
* S3 Access Logs: log requests made to your S3 bucket
* S3 Replication: same-region or cross-region, must enable versioning
* S3 Storage Classes: Standard, IA, One Zone-IA, Intelligent-Tiering, Glacier, Glacier Deep Archive
* S3 Lifecycle Rules: transition objects between classes
* S3 Glacier Vault Lock / S3 Object Lock: WORM (Write Once Read Many)
* Snow Family: import data onto S3 through a physical device, edge computing
* OpsHub: desktop application to manage Snow Family devices
* Storage Gateway: hybrid solution to extend on-premises storage to S3
- Buckets vs Objects: globally unique name, tied to a region
- S3 security: IAM policy, S3 Bucket Policy (public access), S3 Encryption
- S3 Websites: host a static website on Amazon S3
- S3 Versioning: multiple versions for files, prevent accidental deletes
- S3 Access Logs: log requests made to your S3 bucket
- S3 Replication: same-region or cross-region, must enable versioning
- S3 Storage Classes: Standard, IA, One Zone-IA, Intelligent-Tiering, Glacier, Glacier Deep Archive
- S3 Lifecycle Rules: transition objects between classes
- S3 Glacier Vault Lock / S3 Object Lock: WORM (Write Once Read Many)
- Snow Family: import data onto S3 through a physical device, edge computing
- OpsHub: desktop application to manage Snow Family devices
- Storage Gateway: hybrid solution to extend on-premises storage to S3
