[Modify/Add] Add ELB, ASG, S3 Doc.

This commit is contained in:
Kanani Nirav
2024-10-20 15:37:58 +09:00
parent eaf8d31e74
commit 4224f41b73
3 changed files with 558 additions and 0 deletions


@@ -17,6 +17,10 @@
- What is Amazon EC2?, Introduction to Security Groups, Classic Ports to know, EC2 Instance Launch Types, Which purchasing option is right for me?, Shared Responsibility Model for EC2
- [EC2 Instance Storage](./sections/ec2_storage.md)
- EBS Volumes, EFS: Elastic File System, EFS Infrequent Access (EFS-IA), Amazon FSx Overview, EC2 Instance Store, Shared Responsibility Model for EC2 Storage
- [Elastic Load Balancing & Auto Scaling Groups](./sections/elb_asg.md)
- Scalability & High Availability, Vertical Scalability, Horizontal Scalability, High Availability, High Availability & Scalability for EC2, Scalability vs Elasticity (vs Agility), What is Load Balancing?, What's an Auto Scaling Group?
- [Amazon S3](./sections/s3.md)
- S3 Use cases, Amazon S3 Overview - Buckets, Amazon S3 Overview - Objects, S3 Websites, S3 Storage Classes, S3 Object Lock & Glacier Vault Lock, Shared Responsibility Model for S3, AWS Snow Family, What is Edge Computing?, Snow Family - Edge Computing, AWS OpsHub, Hybrid Cloud for Storage, AWS Storage Gateway
## Practice Exams ( dumps )

130 sections/elb_asg.md Normal file

@@ -0,0 +1,130 @@
# Elastic Load Balancing & Auto Scaling Groups
- [Elastic Load Balancing \& Auto Scaling Groups](#elastic-load-balancing--auto-scaling-groups)
- [Scalability \& High Availability](#scalability--high-availability)
- [Vertical Scalability](#vertical-scalability)
- [Horizontal Scalability](#horizontal-scalability)
- [High Availability](#high-availability)
- [High Availability \& Scalability for EC2](#high-availability--scalability-for-ec2)
- [Scalability vs Elasticity (vs Agility)](#scalability-vs-elasticity-vs-agility)
- [What is Load Balancing?](#what-is-load-balancing)
- [Why Use a Load Balancer?](#why-use-a-load-balancer)
- [Why Use an Elastic Load Balancer?](#why-use-an-elastic-load-balancer)
- [Types of ELB](#types-of-elb)
- [What's an Auto Scaling Group?](#whats-an-auto-scaling-group)
- [Auto Scaling Group Scaling Strategies](#auto-scaling-group-scaling-strategies)
- [ELB \& ASG Summary](#elb--asg-summary)
## Scalability & High Availability
- **Scalability**: Ability of a system to handle an increase in load by adapting to the demand.
- **High Availability**: Ensures a system is operational and accessible for a high percentage of time, often achieved by reducing the impact of failures.
- There are two kinds of scalability:
- Vertical Scalability
- Horizontal Scalability (= elasticity)
- Scalability is linked to, but different from, High Availability
## Vertical Scalability
- Increasing the capacity of a single instance (e.g., moving from t3.medium to t3.large).
- Suitable for databases or applications where upgrading a single resource is more efficient.
- Limited by hardware constraints (can only scale up to a certain point).
## Horizontal Scalability
- Adding more instances (servers) to distribute the load across multiple resources.
- Achieved through technologies like **Auto Scaling Groups (ASG)** and **Elastic Load Balancing (ELB)**.
- Preferred for applications needing resilience and distributed workloads.
- Horizontal scaling implies distributed systems.
## High Availability
- Implemented by deploying resources across multiple **Availability Zones** (AZs).
- Ensures failover and redundancy in case of failures in one AZ.
- High Availability usually goes hand in hand with horizontal scaling
## High Availability & Scalability for EC2
- Vertical Scaling: Increase instance size (= scale up / down)
- From: t2.nano - 0.5 GB of RAM, 1 vCPU
- To: u-12tb1.metal - 12.3 TB of RAM, 448 vCPUs
- Horizontal Scaling: Increase number of instances (= scale out / in)
- Auto Scaling Group
- Load Balancer
- High Availability: Run instances for the same application across multiple AZs
- Auto Scaling Group across multiple AZs
- Load Balancer across multiple AZs
## Scalability vs Elasticity (vs Agility)
| **Term** | **Definition** |
|--------------------|--------------------------------------------------------------------------------------------------|
| **Scalability** | Ability to increase or decrease the capacity to handle varying levels of traffic or load. |
| **Elasticity** | Automatically adjusts resources up or down based on the load in real-time, preventing under or over-provisioning. |
| **Agility** | The ability to deploy and manage resources quickly and efficiently in response to changing demands. |
## What is Load Balancing?
- Distributes incoming traffic across multiple targets (EC2 instances, containers, IP addresses) to ensure that no single resource is overwhelmed.
### Why Use a Load Balancer?
- Ensures application fault tolerance and high availability by spreading the load across multiple servers.
- Protects against failures in a single resource by rerouting traffic automatically.
- Performs regular health checks on your instances
- Provides SSL termination (HTTPS) for your websites
### Why Use an Elastic Load Balancer?
- **Elastic Load Balancer (ELB)** is a fully managed service that automatically distributes incoming application traffic across multiple targets in one or more Availability Zones.
- It improves fault tolerance, enhances performance, and scales according to demand.
- AWS guarantees that it will be working
- AWS takes care of upgrades, maintenance, high availability
- AWS provides only a few configuration knobs
#### Types of ELB
1. **Application Load Balancer (ALB)**: For HTTP and HTTPS traffic, operates at Layer 7 (application level).
2. **Network Load Balancer (NLB)**: Handles high-performance traffic at Layer 4 (transport level).
3. **Classic Load Balancer (CLB)**: Legacy, being retired; operates at both Layer 4 & 7.
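
The notes above stay at slide level, so here is a minimal, illustrative boto3 (AWS SDK for Python) sketch of wiring up an ALB: the load balancer, a target group with a health check, and a listener. All names and IDs (demo-alb, subnet/security-group/VPC IDs) are placeholders, not values from this course.

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Application Load Balancer spanning two AZs (one subnet per AZ)
alb = elbv2.create_load_balancer(
    Name="demo-alb",
    Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],
    SecurityGroups=["sg-0123456789abcdef0"],
    Scheme="internet-facing",
    Type="application",
)["LoadBalancers"][0]

# Target group with a health check; the ALB only forwards to healthy targets
tg = elbv2.create_target_group(
    Name="demo-targets",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",
    TargetType="instance",
    HealthCheckPath="/health",
)["TargetGroups"][0]

# Listener: forward HTTP :80 traffic to the target group
elbv2.create_listener(
    LoadBalancerArn=alb["LoadBalancerArn"],
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg["TargetGroupArn"]}],
)
```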
## What's an Auto Scaling Group?
- An **Auto Scaling Group (ASG)** ensures the right number of EC2 instances are running to handle the load.
- Automatically adjusts the number of instances based on metrics such as CPU utilization or custom-defined thresholds.
- Can span across multiple AZs to ensure high availability.
- In real life, the load on your websites and applications can change
- In the cloud, you can create and get rid of servers very quickly
- The goal of an Auto Scaling Group (ASG) is to:
- Scale out (add EC2 instances) to match an increased load
- Scale in (remove EC2 instances) to match a decreased load
- Ensure we have a minimum and a maximum number of machines running
- Automatically register new instances to a load balancer
- Replace unhealthy instances
- Cost Savings: only run at an optimal capacity (principle of the cloud)
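
As a hedged companion to the list above, this boto3 sketch creates an ASG that spans two AZs and registers its instances with the ALB target group from the previous example. The launch template name, subnet IDs, and target group ARN are placeholders.

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# ASG spanning two AZs, keeping between 2 and 10 instances,
# automatically registering new instances with the load balancer's target group
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="demo-asg",
    LaunchTemplate={"LaunchTemplateName": "demo-template", "Version": "$Latest"},
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",
    TargetGroupARNs=[
        "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/demo-targets/abc123"
    ],
    HealthCheckType="ELB",        # replace instances the load balancer marks unhealthy
    HealthCheckGracePeriod=300,
)
```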
### Auto Scaling Group Scaling Strategies
- **Manual Scaling**: Adjusting the number of instances manually based on load prediction.
- **Dynamic Scaling**: Automatically adjusts the number of instances based on demand (e.g., CPU usage).
- Simple / Step Scaling
- When a CloudWatch alarm is triggered (example: CPU > 70%), then add 2 units
- When a CloudWatch alarm is triggered (example: CPU < 30%), then remove 1 unit
- Target Tracking Scaling
- Example: I want the average ASG CPU to stay at around 40%
- Scheduled Scaling
- Anticipate a scaling based on known usage patterns
- Example: increase the min. capacity to 10 at 5 pm on Fridays
- **Predictive Scaling**: Uses machine learning to predict future traffic patterns and scales proactively.
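
To make the strategies above concrete, here is an illustrative boto3 sketch of a Target Tracking policy (keep average CPU near 40%) and a Scheduled action (raise minimum capacity at 5 pm on Fridays). The group name and schedule are example values, not course material.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Target Tracking: keep the ASG's average CPU utilization around 40%
autoscaling.put_scaling_policy(
    AutoScalingGroupName="demo-asg",
    PolicyName="keep-cpu-at-40",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 40.0,
    },
)

# Scheduled Scaling: raise the minimum capacity to 10 at 17:00 UTC every Friday
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="demo-asg",
    ScheduledActionName="friday-evening-rush",
    Recurrence="0 17 * * 5",   # cron expression: 17:00 on Fridays
    MinSize=10,
)
```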
## ELB & ASG Summary
- High Availability vs Scalability (vertical and horizontal) vs Elasticity vs Agility in the Cloud
- Elastic Load Balancers (ELB)
- Distribute traffic across backend EC2 instances, can be Multi-AZ
- Supports health checks
- 3 types: Application LB (HTTP L7), Network LB (TCP L4), Classic LB (old)
- Auto Scaling Groups (ASG)
- Implement Elasticity for your application, across multiple AZs
- Scale EC2 instances based on the demand on your system, replace unhealthy instances
- Integrated with the ELB

424 sections/s3.md Normal file

@@ -0,0 +1,424 @@
# Amazon S3
- [Amazon S3](#amazon-s3)
- [S3 Use cases](#s3-use-cases)
- [Amazon S3 Overview - Buckets](#amazon-s3-overview---buckets)
- [Amazon S3 Overview - Objects](#amazon-s3-overview---objects)
- [S3 Security](#s3-security)
- [S3 Bucket Policies](#s3-bucket-policies)
- [Bucket settings for Block Public Access](#bucket-settings-for-block-public-access)
- [S3 Websites](#s3-websites)
- [S3 - Versioning](#s3---versioning)
- [S3 Access Logs](#s3-access-logs)
- [S3 Replication (CRR \& SRR)](#s3-replication-crr--srr)
- [S3 Storage Classes](#s3-storage-classes)
- [S3 Durability and Availability](#s3-durability-and-availability)
- [S3 Standard General Purpose](#s3-standard-general-purpose)
- [S3 Storage Classes - Infrequent Access](#s3-storage-classes---infrequent-access)
- [S3 Standard Infrequent Access (S3 Standard-IA)](#s3-standard-infrequent-access-s3-standard-ia)
- [S3 One Zone Infrequent Access (S3 One Zone-IA)](#s3-one-zone-infrequent-access-s3-one-zone-ia)
- [Amazon S3 Glacier Storage Classes](#amazon-s3-glacier-storage-classes)
- [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
- [Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
- [Amazon S3 Glacier Deep Archive - for long term storage](#amazon-s3-glacier-deep-archive---for-long-term-storage)
- [S3 Intelligent-Tiering](#s3-intelligent-tiering)
- [S3 Object Lock \& Glacier Vault Lock](#s3-object-lock--glacier-vault-lock)
- [Shared Responsibility Model for S3](#shared-responsibility-model-for-s3)
- [AWS Snow Family](#aws-snow-family)
- [Data Migrations with AWS Snow Family](#data-migrations-with-aws-snow-family)
- [Time to Transfer](#time-to-transfer)
- [Snowball Edge (for data transfers)](#snowball-edge-for-data-transfers)
- [AWS Snowcone](#aws-snowcone)
- [AWS Snowmobile](#aws-snowmobile)
- [Snow Family - Usage Process](#snow-family---usage-process)
- [What is Edge Computing?](#what-is-edge-computing)
- [Snow Family - Edge Computing](#snow-family---edge-computing)
- [AWS OpsHub](#aws-opshub)
- [Hybrid Cloud for Storage](#hybrid-cloud-for-storage)
- [AWS Storage Gateway](#aws-storage-gateway)
- [Amazon S3 - Summary](#amazon-s3---summary)
## S3 Use cases
- Backup and storage
- Disaster Recovery
- Archive
- Hybrid Cloud storage
- Application hosting
- Media hosting
- Data lakes & big data analytics
- Software delivery
- Static website
## Amazon S3 Overview - Buckets
- Amazon S3 allows people to store objects (files) in “buckets” (directories)
- Buckets must have a globally unique name (across all regions all accounts)
- Buckets are defined at the region level
- S3 looks like a global service but buckets are created in a region
- Naming convention
- No uppercase letters
- No underscores
- 3-63 characters long
- Must not be formatted as an IP address
- Must start with a lowercase letter or number
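
A minimal boto3 sketch of creating a bucket, mainly to show that the name must be globally unique while the bucket itself is created in a specific region; the bucket name and region are placeholders.

```python
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")

# Bucket names are globally unique, but the bucket lives in one region.
# Outside us-east-1 the region must be passed as a LocationConstraint.
s3.create_bucket(
    Bucket="my-globally-unique-bucket-name-2024",
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)
```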
## Amazon S3 Overview - Objects
- Objects (files) have a Key
- The key is the FULL path:
- s3://my-bucket/my_file.txt
- s3://my-bucket/my_folder1/another_folder/my_file.txt
- The key is composed of **prefix** + **object name**
- s3://my-bucket/my_folder1/another_folder/my_file.txt
- There's no concept of “directories” within buckets (although the UI will trick you into thinking otherwise)
- Just keys with very long names that contain slashes (“/”)
- Object values are the content of the body:
- Max object size is 5 TB (5,000 GB)
- If uploading more than 5 GB, must use “multi-part upload”
- Metadata (list of text key/value pairs - system or user metadata)
- Tags (Unicode key/value pairs, up to 10) - useful for security / lifecycle
- Version ID (if versioning is enabled)
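
The following boto3 sketch uploads one object, illustrating that the key is simply prefix + object name, and showing where metadata and tags fit. The bucket name, key, metadata, and tag values are made up for the example.

```python
import boto3

s3 = boto3.client("s3")

# Key = prefix ("my_folder1/another_folder/") + object name ("my_file.txt").
# There is no real directory; the slashes are just part of the key string.
s3.put_object(
    Bucket="my-globally-unique-bucket-name-2024",
    Key="my_folder1/another_folder/my_file.txt",
    Body=b"hello from S3",
    Metadata={"owner": "demo"},             # user-defined metadata (text key/value pairs)
    Tagging="environment=dev&team=docs",    # up to 10 tags, passed as a URL-encoded string
)
```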
## S3 Security
- **User based**
- IAM policies - which API calls should be allowed for a specific user from IAM console
- **Resource Based**
- Bucket Policies - bucket wide rules from the S3 console - allows cross account
- Object Access Control List (ACL) - finer grain
- Bucket Access Control List (ACL) - less common
- **Note:** an IAM principal can access an S3 object if
- the user's IAM permissions allow it OR the resource policy ALLOWS it
- AND there's no explicit DENY
- **Encryption:** encrypt objects in Amazon S3 using encryption keys
## S3 Bucket Policies
- JSON-based policies
- Resources: buckets and objects
- Actions: set of APIs to Allow or Deny
- Effect: Allow / Deny
- Principal: the account or user to apply the policy to
- Use S3 bucket policies to:
- Grant public access to the bucket
- Force objects to be encrypted at upload
- Grant access to another account (Cross Account)
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::examplebucket/*"]
    }
  ]
}
```
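
For completeness, here is how the policy document above could be attached with boto3; a sketch only, reusing the examplebucket name from the policy.

```python
import json
import boto3

s3 = boto3.client("s3")

public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicRead",
        "Effect": "Allow",
        "Principal": "*",
        "Action": ["s3:GetObject"],
        "Resource": ["arn:aws:s3:::examplebucket/*"],
    }],
}

# Attach the bucket policy (the policy document is passed as a JSON string)
s3.put_bucket_policy(Bucket="examplebucket", Policy=json.dumps(public_read_policy))
```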
## Bucket settings for Block Public Access
- Block all public access: On
- Block public access to buckets and objects granted through new access control lists (ACLs): On
- Block public access to buckets and objects granted through any access control lists (ACLs): On
- Block public access to buckets and objects granted through new public bucket or access point policies: On
- Block public and cross-account access to buckets and objects through any public bucket or access point policies: On
- These settings were created to prevent company data leaks
- If you know your bucket should never be public, leave these on
- Can be set at the account level
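
A short boto3 sketch that turns on the four Block Public Access settings for a single bucket (the account-level equivalent lives in the S3 Control API); the bucket name is a placeholder.

```python
import boto3

s3 = boto3.client("s3")

# Turn on all four "Block Public Access" settings for one bucket
s3.put_public_access_block(
    Bucket="examplebucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```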
## S3 Websites
- S3 can host static websites and have them accessible on the www
- The website URL will be one of (depending on the region):
- bucket-name.s3-website-AWS-region.amazonaws.com
- bucket-name.s3-website.AWS-region.amazonaws.com
- **If you get a 403 (Forbidden) error, make sure the bucket policy allows public reads!**
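
An illustrative boto3 sketch of enabling static website hosting; index.html and error.html are assumed object keys, and the bucket still needs a public-read bucket policy to avoid the 403 mentioned above.

```python
import boto3

s3 = boto3.client("s3")

# Enable static website hosting on the bucket
s3.put_bucket_website(
    Bucket="examplebucket",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)
```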
## S3 - Versioning
- You can version your files in Amazon S3
- It is enabled at the bucket level
- Same key overwrite will increment the “version”: 1, 2, 3….
- It is best practice to version your buckets
- Protect against unintended deletes (ability to restore a version)
- Easy roll back to previous version
- Notes:
- Any file that is not versioned prior to enabling versioning will have version “null”
- Suspending versioning does not delete the previous versions
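
Enabling versioning is a single call; a minimal boto3 sketch, assuming the same placeholder bucket name:

```python
import boto3

s3 = boto3.client("s3")

# Versioning is a bucket-level setting; objects uploaded before enabling keep version "null"
s3.put_bucket_versioning(
    Bucket="examplebucket",
    VersioningConfiguration={"Status": "Enabled"},   # use "Suspended" to stop creating new versions
)
```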
## S3 Access Logs
- For audit purpose, you may want to log all access to S3 buckets
- Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket
- That data can be analyzed using data analysis tools…
- Very helpful to get to the root cause of an issue, or audit usage, view suspicious patterns, etc…
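
A hedged boto3 sketch of turning on server access logging into a separate bucket; both bucket names and the prefix are placeholders, and the target bucket must already grant S3 log delivery permission to write.

```python
import boto3

s3 = boto3.client("s3")

# Deliver server access logs for "examplebucket" into a separate logging bucket
s3.put_bucket_logging(
    Bucket="examplebucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "example-logging-bucket",
            "TargetPrefix": "access-logs/examplebucket/",
        }
    },
)
```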
## S3 Replication (CRR & SRR)
- Must enable versioning in source and destination
- Cross Region Replication (CRR)
- Same Region Replication (SRR)
- Buckets can be in different accounts
- Copying is asynchronous
- Must give proper IAM permissions to S3
- CRR - Use cases: compliance, lower latency access, replication across accounts
- SRR - Use cases: log aggregation, live replication between production and test accounts
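
As an illustration of the points above (versioning required, an IAM role for S3, destination can be another bucket/account/region), here is a minimal boto3 replication sketch; the role ARN and bucket names are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Versioning must already be enabled on both buckets; the role ARN is a placeholder
s3.put_bucket_replication(
    Bucket="examplebucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
        "Rules": [{
            "ID": "replicate-everything",
            "Priority": 1,
            "Filter": {},                                   # empty filter = all objects
            "Status": "Enabled",
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": "arn:aws:s3:::examplebucket-replica"},
        }],
    },
)
```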
## S3 Storage Classes
- [Amazon S3 Standard - General Purpose](#s3-standard-general-purpose)
- [Amazon S3 Standard - Infrequent Access (IA)](#s3-standard-infrequent-access-s3-standard-ia)
- [Amazon S3 One Zone - Infrequent Access](#s3-one-zone-infrequent-access-s3-one-zone-ia)
- [Amazon S3 Glacier Instant Retrieval](#amazon-s3-glacier-instant-retrieval)
- [Amazon S3 Glacier Flexible Retrieval](#amazon-s3-glacier-flexible-retrieval-formerly-amazon-s3-glacier)
- [Amazon S3 Glacier Deep Archive](#amazon-s3-glacier-deep-archive--for-long-term-storage)
- [Amazon S3 Intelligent Tiering](#s3-intelligent-tiering)
- Can move between classes manually or using S3 Lifecycle configurations
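
Both options are sketched below with boto3: choosing a storage class manually at upload time, and a lifecycle rule that transitions objects automatically; bucket name, key, prefix, and day counts are example values.

```python
import boto3

s3 = boto3.client("s3")

# Manual choice at upload time: store this object directly in Standard-IA
s3.put_object(
    Bucket="examplebucket",
    Key="backups/2024-10-20.tar.gz",
    Body=b"...",
    StorageClass="STANDARD_IA",
)

# Lifecycle rule: transition "logs/" objects to Standard-IA after 30 days,
# then to Glacier Flexible Retrieval after 90 days
s3.put_bucket_lifecycle_configuration(
    Bucket="examplebucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }],
    },
)
```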
### S3 Durability and Availability
- Durability:
- High durability (99.999999999%, 11 9's) of objects across multiple AZs
- If you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years
- Same for all storage classes
- Availability:
- Measures how readily available a service is
- Varies depending on storage class
- Example: S3 standard has 99.99% availability = not available 53 minutes a year
### S3 Standard General Purpose
- 99.99% Availability
- Used for frequently accessed data
- Low latency and high throughput
- Can sustain 2 concurrent facility failures
- Use Cases: Big Data analytics, mobile & gaming applications, content distribution…
### S3 Storage Classes - Infrequent Access
- For data that is less frequently accessed, but requires rapid access when needed
- Lower cost than S3 Standard
#### S3 Standard Infrequent Access (S3 Standard-IA)
- 99.9% Availability
- Use cases: Disaster Recovery, backups
#### S3 One Zone Infrequent Access (S3 One Zone-IA)
- High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed
- 99.5% Availability
- Use Cases: Storing secondary backup copies of on-premises data, or data you can recreate
### Amazon S3 Glacier Storage Classes
- Low-cost object storage meant for archiving / backup
- Pricing: price for storage + object retrieval cost
#### Amazon S3 Glacier Instant Retrieval
- Millisecond retrieval, great for data accessed once a quarter
- Minimum storage duration of 90 days
#### Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)
- Three retrieval options: Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours, free)
- Minimum storage duration of 90 days
#### Amazon S3 Glacier Deep Archive - for long term storage
- Standard (12 hours), Bulk (48 hours)
- Minimum storage duration of 180 days
### S3 Intelligent-Tiering
- Small monthly monitoring and auto-tiering fee
- Moves objects automatically between Access Tiers based on usage
- There are no retrieval charges in S3 Intelligent-Tiering
- Frequent Access tier (automatic): default tier
- Infrequent Access tier (automatic): objects not accessed for 30 days
- Archive Instant Access tier (automatic): objects not accessed for 90 days
- Archive Access tier (optional): configurable from 90 days to 700+ days
- Deep Archive Access tier (optional): configurable from 180 days to 700+ days
## S3 Object Lock & Glacier Vault Lock
- S3 Object Lock
- Adopt a WORM (Write Once Read Many) model
- Block an object version deletion for a specified amount of time
- Glacier Vault Lock
- Adopt a WORM (Write Once Read Many) model
- Lock the policy for future edits (can no longer be changed)
- Helpful for compliance and data retention
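
A hedged boto3 sketch of S3 Object Lock: it must be enabled when the bucket is created, after which a default WORM retention can be configured; the bucket name, mode, and retention period are example values.

```python
import boto3

s3 = boto3.client("s3")

# Object Lock must be enabled at bucket creation (this also enables versioning)
s3.create_bucket(
    Bucket="example-worm-bucket",
    ObjectLockEnabledForBucket=True,
)

# Default retention: every new object version is protected from deletion for 30 days
s3.put_object_lock_configuration(
    Bucket="example-worm-bucket",
    ObjectLockConfiguration={
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 30}},
    },
)
```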
## Shared Responsibility Model for S3
| AWS | YOU |
| ------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- |
| Infrastructure (global security, durability, availability, sustain concurrent loss of data in two facilities) | S3 Versioning, S3 Bucket Policies, S3 Replication Setup |
| Configuration and vulnerability analysis | Logging and Monitoring, S3 Storage Classes |
| Compliance validation | Data encryption at rest and in transit |
## AWS Snow Family
- Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
- Data migration:
- Snowcone
- Snowball Edge
- Snowmobile
- Edge computing:
- Snowcone
- Snowball Edge
### Data Migrations with AWS Snow Family
- **AWS Snow Family: offline devices to perform data migrations.** If it takes more than a week to transfer over the network, use Snowball devices!
- Challenges:
- Limited connectivity
- Limited bandwidth
- High network cost
- Shared bandwidth (can't maximize the line)
- Connection stability
### Time to Transfer
| Data | 100 Mbps | 1Gbps | 10Gbps |
| ------ | -------- | -------- | -------- |
| 10 TB | 12 days | 30 hours | 3 hours |
| 100 TB | 124 days | 12 days | 30 hours |
| 1 PB | 3 years | 124 days | 12 days |
### Snowball Edge (for data transfers)
- Physical data transport solution: move TBs or PBs of data in or out of AWS
- Alternative to moving data over the network (and paying network fees)
- Pay per data transfer job
- Provide block storage and Amazon S3-compatible object storage
- Snowball Edge Storage Optimized
- 80 TB of HDD capacity for block volume and S3 compatible object storage
- Snowball Edge Compute Optimized
- 42 TB of HDD capacity for block volume and S3 compatible object storage
- Use cases: large data cloud migrations, DC decommission, disaster recovery
### AWS Snowcone
- Small, portable computing, anywhere, rugged & secure, withstands harsh environments
- Light (4.5 pounds, 2.1 kg)
- Device used for edge computing, storage, and data transfer
- **8 TBs of usable storage**
- Use Snowcone where Snowball does not fit (space-constrained environment)
- Must provide your own battery / cables
- Can be sent back to AWS offline, or connect it to internet and use **AWS DataSync** to send data
### AWS Snowmobile
- Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TB)
- Each Snowmobile has 100 PB of capacity (use multiple in parallel)
- High security: temperature controlled, GPS, 24/7 video surveillance
- **Better than Snowball if you transfer more than 10 PB**
| Properties | Snowcone | Snowball Edge Storage Optimized | Snowmobile |
| ---------------- | ------------------------------- | ------------------------------- | ----------------------- |
| Storage Capacity | 8 TB usable | 80 TB usable | < 100 PB |
| Migration Size | Up to 24 TB, online and offline | Up to petabytes, offline | Up to exabytes, offline |
### Snow Family - Usage Process
1. Request Snowball devices from the AWS console for delivery
2. Install the Snowball client / AWS OpsHub on your servers
3. Connect the Snowball to your servers and copy files using the client
4. Ship back the device when you're done (it goes to the right AWS facility)
5. Data will be loaded into an S3 bucket
6. Snowball is completely wiped
## What is Edge Computing?
- Process data while it's being created on an edge location
- A truck on the road, a ship on the sea, a mining station underground...
- These locations may have
- Limited / no internet access
- Limited / no easy access to computing power
- We set up a **Snowball Edge / Snowcone** device to do edge computing
- Use cases of Edge Computing:
- Preprocess data
- Machine learning at the edge
- Transcoding media streams
- Eventually (if need be) we can ship back the device to AWS (for transferring data for example)
## Snow Family - Edge Computing
- **Snowcone (smaller)**
- 2 CPUs, 4 GB of memory, wired or wireless access
- USB-C power using a cord or the optional battery
- **Snowball Edge Compute Optimized**
- 52 vCPUs, 208 GiB of RAM
- Optional GPU (useful for video processing or machine learning)
- 42 TB usable storage
- **Snowball Edge Storage Optimized**
- Up to 40 vCPUs, 80 GiB of RAM
- Object storage clustering available
- All: Can run EC2 Instances & AWS Lambda functions (using AWS IoT Greengrass)
- Long-term deployment options: 1 and 3 years discounted pricing
## AWS OpsHub
- Historically, to use Snow Family devices, you needed a CLI (Command Line Interface tool)
- Today, you can use **AWS OpsHub** (a software you install on your computer / laptop) to manage your Snow Family Device
- Unlocking and configuring single or clustered devices
- Transferring files
- Launching and managing instances running on Snow Family Devices
- Monitor device metrics (storage capacity, active instances on your device)
- Launch compatible AWS services on your devices (ex: Amazon EC2 instances, AWS DataSync, Network File System (NFS))
## Hybrid Cloud for Storage
- AWS is pushing for “hybrid cloud”
- Part of your infrastructure is on-premises
- Part of your infrastructure is in the cloud
- This can be due to
- Long cloud migrations
- Security requirements
- Compliance requirements
- IT strategy
- S3 is a proprietary storage technology (unlike EFS / NFS), so how do you expose the S3 data on-premises?
- AWS Storage Gateway!
## AWS Storage Gateway
- Bridge between on-premises data and cloud data in S3
- Hybrid storage service that allows on-premises infrastructure to seamlessly use the AWS Cloud
- Use cases: disaster recovery, backup & restore, tiered storage
- Types of Storage Gateway:
- File Gateway
- Volume Gateway
- Tape Gateway
- No need to know the types for the exam
## Amazon S3 - Summary
- Buckets vs Objects: globally unique name, tied to a region
- S3 security: IAM policy, S3 Bucket Policy (public access), S3 Encryption
- S3 Websites: host a static website on Amazon S3
- S3 Versioning: multiple versions for files, prevent accidental deletes
- S3 Access Logs: log requests made within your S3 bucket
- S3 Replication: same-region or cross-region, must enable versioning
- S3 Storage Classes: Standard, IA, 1Z-IA, Intelligent, Glacier, Glacier Deep Archive
- S3 Lifecycle Rules: transition objects between classes
- S3 Glacier Vault Lock / S3 Object Lock: WORM (Write Once Read Many)
- Snow Family: import data onto S3 through a physical device, edge computing
- OpsHub: desktop application to manage Snow Family devices
- Storage Gateway: hybrid solution to extend on-premises storage to S3