AWS-EC2-1
Elastic Compute Cloud (EC2)
Overview Structure
Generally, EC2 consists of three parts, as shown in the figure below: hardware, a hypervisor, and virtual machine instances.
- The hypervisor is the layer responsible for interacting with the physical components of the host. VirtualBox, Hyper-V, and VMware are examples of hypervisors.
- Each virtual machine (EC2 instance) reaches the physical components through its operating system, which in turn goes through the hypervisor.
EC2 Instance
- The "E" in "EC2" stands for Elastic: computing capacity can be scaled automatically (scale-out or scale-in) by changing the number of EC2 instances through an Auto Scaling group. In short, scaling adjusts the number of EC2 instances, not the resources of a specific EC2 instance.
- The user has root permission. EC2 is an unmanaged service, which means AWS does not know or care what your data is and cannot inspect or control it.
- The availability is 99.95% (S3 is 99.99%), which corresponds to roughly 22 minutes of downtime per month.
- Tenancy
- Shared: instances of one AWS account may share a host with instances of other accounts.
- Dedicated Instance (Reserved Instance): all dedicated instances on the host belong to the same account, but after the instances are stopped they may be placed on a different host when started again.
- Dedicated Host (Reserved Instance): similar to a Dedicated Instance; the only difference is that after the instances are stopped, they are placed on the same physical host the next time they start.
- Each AWS account can run up to 20 EC2 instances, but this is a soft limit. Users can contact AWS Support to raise the limit above 20.
- Operations
- Start
- Stop
- The EC2 instance is shut down
- The instance ID and root volume are kept (an instance-store-backed instance cannot be stopped, because its data would be lost)
- A stopped instance is not charged, but its EBS volumes are still charged
- In this state, EBS volumes can be attached to or detached from the EC2 instance
- IP addresses
- Private IP will be kept
- Public IP will be released
- Elastic IP will be kept
- Reboot
- The EC2 instance can be rebooted from the AWS dashboard or from inside its operating system
- Both EBS-backed and instance-store-backed EC2 instances can be rebooted without losing any data or settings (IPs)
- Terminate
- There is an EC2 Termination Protection option; while it is checked, the EC2 instance cannot be terminated. Only instances with the option unchecked can be terminated
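The 99.95% availability figure listed above can be turned into a downtime budget with simple arithmetic. A minimal sketch, assuming a 30-day month; the function name is only illustrative:

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Return the downtime budget in minutes for a period of `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # EC2 figure from the notes: about 21.6 minutes per 30-day month
    print(round(allowed_downtime_minutes(99.95), 1))
    # S3 figure from the notes, for comparison
    print(round(allowed_downtime_minutes(99.99), 1))
```

With 43,200 minutes in a 30-day month, 0.05% of unavailability comes to about 21.6 minutes, which matches the "22 mins" quoted in the notes.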
Amazon Machine Image (AMI)
- An AMI is the image of an EC2 instance
- Linux
- Amazon Linux (Red Hat + CentOS)
- Amazon Linux 2 (improved Amazon Linux)
- Ubuntu
- Original Linux
- Windows
- AMI marketplace
- Users can purchase third-party AMIs there to match their requirements, such as WordPress, Nginx, etc.
EC2 Instance Type
- General-purpose
- The most popular one is M4: 2 cores, 8 GB memory, with balanced compute performance
- Compute-optimized
- Provides more computing power: 2 cores and 4 GB memory
- GPU instance
- Provides a GPU and GPU memory: 1 GPU, 8 GB GPU memory, 8 cores, 61 GB memory
- Memory-optimized
- 2 cores and 8 GB memory
- Storage-optimized
- Provides local storage (SSD), 2 cores, and 16 GB memory (the other types are EBS-only)
Storage volume
Each EC2 instance should have at least one storage volume. Generally, there are two types of storage volume: instance store and EBS (Elastic Block Store)
- Elastic Block Store (EBS)
- Network-attached: AWS separates different functions into modules located in different physical places. AWS places thousands of SSDs in one place, and users attach a volume from there over the AWS network
- Persistent storage: if the instance is stopped, terminated, or fails, the data stored in EBS still exists
- By default, the root EBS volume is deleted when the instance is terminated (this is controlled by the volume's DeleteOnTermination setting)
- Because AWS keeps redundant copies of EBS data, users' data is not lost when a single EBS unit fails. Each client's data is stored across multiple EBS units.
- Instance Store
- Locally attached: the space is allocated from the host's local storage
- Non-persistent storage: if the instance is stopped or fails, the data is erased
- EBS-backed
- The root volume is an EBS volume
- Instance-store-backed
- The root volume is an instance store volume
- Can only be configured when the user creates the EC2 instance, using an AMI that supports instance-store-backed instances
- Cannot be attached to another instance as an additional storage volume
EBS Categories
- Solid-State Drives (SSD)
- IOPS (I/O operations per second, i.e. how many blocks per second) and throughput
- Generally, data is split into blocks, the smallest unit for storing data in a volume; the block size depends on the storage format (the default is 4 KB, 4096 bytes)
- IOPS * Block size = Throughput
- Example:
- One big file, like 1 GB
- The big file is split into (1 * 1024 * 1024 * 1024 bytes) / (4 * 1024 bytes) = 262,144 blocks
- 1,048,576 small files of 1 KB each, 1 GB in total
- Each small file occupies one block, so 1,048,576 blocks are used. If every block were completely filled, only (1 * 1024 * 1024 * 1024 bytes) / (4 * 1024 bytes) = 262,144 blocks would be needed, so this scenario uses 4 times as many blocks as the previous one
- With fixed IOPS and block size, fuller blocks mean more throughput, while many small files consume more I/O operations and yield less throughput. A workload with many small files therefore rarely reaches the maximum throughput
- General-purpose SSD (gp2)
- Provisioned IOPS SSD (io1)
- Optimized bandwidth to keep the available IOPS consistently high
- Because EBS is network-attached storage, IOPS is affected by network bandwidth. A Provisioned IOPS SSD volume reserves a certain amount of bandwidth for the user, so its IOPS is not affected by it.
- Hard-Disk Drives (HDD)
- IOPS is fixed, always at 500
- Cold HDD (sc1)
- Fixed throughput of 250 MiB/s
- Throughput Optimized HDD (st1)
- Optimized throughput of 500 MiB/s
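The relation IOPS * block size = throughput and the two file-layout scenarios from the SSD notes above can be checked numerically. A minimal sketch, assuming the 4 KB block size from the notes; the 3,000 IOPS value is an arbitrary illustration, not a figure for any particular volume type, and the function names are hypothetical:

```python
import math

BLOCK_SIZE = 4 * 1024  # bytes, the default block size used in the notes
GIB = 1024 ** 3


def blocks_needed(file_size: int, file_count: int = 1, block_size: int = BLOCK_SIZE) -> int:
    """Blocks occupied when each file starts on its own block."""
    return file_count * math.ceil(file_size / block_size)


def useful_throughput(iops: int, bytes_per_io: int) -> int:
    """Useful bytes per second when every I/O carries `bytes_per_io` bytes of data."""
    return iops * bytes_per_io


one_big_file = blocks_needed(GIB)                    # one 1 GB file
many_small_files = blocks_needed(1024, GIB // 1024)  # 1 GB of 1 KB files

if __name__ == "__main__":
    print(one_big_file, many_small_files)
    print(useful_throughput(3000, BLOCK_SIZE))  # full 4 KB blocks
    print(useful_throughput(3000, 1024))        # blocks holding only 1 KB each
```

The small-file layout needs four times as many blocks, and at the same IOPS it moves a quarter of the useful data per second, which is why many small files make it hard to reach the maximum throughput.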
Instance Storage vs EBS
- Because the instance store is bound to the physical host, its IOPS is extremely high (read 3.3 million, write 1.4 million) compared to EBS (read/write 6,400)
- For the same reason, its throughput is extremely high too (16 GB/s) compared to EBS (1,000 MB/s)
- EBS volumes can be attached as additional storage volumes; the maximum number depends on the instance type
EBS Snapshot
A snapshot is a point-in-time copy of data. When the user takes an EBS snapshot, AWS copies the data from the chosen EBS volume and stores it in S3 (Simple Storage Service).
- Because copying the data and storing it in S3 takes time, any data modified after the snapshot operation is started is not included in the snapshot.
- Each snapshot is incremental: it stores only the difference between the current point in time and the previous snapshot
- The snapshot is stored in S3, but the user cannot access it there directly.
- A snapshot only works in its own region. To restore a volume in a different region, the snapshot must first be copied to the region where the volume will be restored.
- The EBS volume restored from a snapshot must have a capacity greater than or equal to that of the volume the snapshot was taken from
- I/O is slower than normal while a snapshot is being created, because some of the bandwidth is used for creating the snapshot
- Taking a snapshot
- Non-root volume
- Stop writing data to the volume
- Detach the volume
- Start snapshot
- Reattach the volume (no need to wait for the snapshot to complete)
- Root volume
- Stop the EC2 instance
- Start snapshot
- Start the EC2 instance
- Create a new EBS volume
- Attach the EBS volume to the EC2 instance
- Format and mount the EBS volume in the OS of the EC2 instance
- Use a disk management tool or another migration tool to copy or migrate the data from the instance store to the EBS volume
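The incremental behaviour described above, where each snapshot stores only what changed since the previous one, can be illustrated with a toy model. This is a conceptual sketch only: it does not reflect how AWS implements snapshots internally, it ignores deleted blocks, and all names are hypothetical.

```python
from typing import Dict, List

Volume = Dict[int, bytes]  # block index -> block contents


def restore(snapshots: List[Volume]) -> Volume:
    """Rebuild the volume by replaying snapshots from oldest to newest."""
    volume: Volume = {}
    for snap in snapshots:
        volume.update(snap)
    return volume


def take_snapshot(volume: Volume, previous: List[Volume]) -> Volume:
    """Return only the blocks that differ from the state already captured."""
    known = restore(previous)
    return {i: data for i, data in volume.items() if known.get(i) != data}


volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}
snap1 = take_snapshot(volume, [])        # first snapshot holds every block
volume[1] = b"data-v2"                   # one block changes afterwards
snap2 = take_snapshot(volume, [snap1])   # second snapshot holds only block 1
```

Replaying `snap1` and then `snap2` with `restore` yields the current volume contents, even though `snap2` stores a single block.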
EBS Encryption
The whole EBS volume is encrypted by the EC2 instance, which must support EBS encryption (some EC2 instance types do not support EBS encryption because of their physical computing capability)
- EBS Encryption Approaches
- AWS EBS Encryption (use KMS to encrypt EBS)
- Encryption keys are managed through KMS (Key Management Service).
- By default, AWS EBS Encryption uses the AWS managed CMK for EBS, but users can choose their own CMK instead
- Third-party encrypting tool
- Encrypted file system
- Key Management Service (KMS)
- An AWS service that manages encryption keys for the user
- Key types
- CMK (Customer Master Key)
- Customer Managed CMK
- the user can see it and manage it
- created by the user; the key material can be generated inside KMS or imported from outside KMS
- AWS Managed CMK
- the user can see it but cannot manage it
- created inside KMS on the user's behalf when an AWS service needs it
- AWS Owned CMK
- the user can neither see nor manage it
- generated by AWS
- Data key (the system uses this key to encrypt the EBS volume, and the user is responsible for managing and storing it)
- Switch between Encrypted and Unencrypted EBS
- There is no direct way to turn encryption on or off for an existing EBS volume
- Encrypt EBS - 1
- Attach a brand-new encrypted EBS volume to the EC2 instance that has the unencrypted EBS volume attached
- Manually copy or migrate the data from the unencrypted EBS volume to the encrypted one
- Encrypt EBS - 2
- Snapshot the unencrypted EBS volume, then copy the snapshot with the encryption option checked
- Create a new EBS volume from the encrypted snapshot
- Attach the new encrypted EBS volume to the EC2 instance
- Detach the unencrypted EBS volume from the EC2 instance and delete it
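The second approach can be sketched against the EC2 API. In that API, a snapshot of an unencrypted volume is itself unencrypted, and encryption is applied when the snapshot is copied with `Encrypted=True`. The sketch below assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`); the function name, the device name, and all IDs are placeholders, and it has not been run against a real account.

```python
def encrypt_volume_via_snapshot(ec2, volume_id, instance_id, region, zone,
                                device="/dev/sdf", kms_key_id=None):
    """Replace an unencrypted volume with an encrypted copy; return the new volume ID."""
    # 1. Snapshot the unencrypted volume and wait until the snapshot is complete.
    snap = ec2.create_snapshot(VolumeId=volume_id, Description="before encryption")
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

    # 2. Copy the snapshot with encryption enabled.
    copy_args = {"SourceRegion": region,
                 "SourceSnapshotId": snap["SnapshotId"],
                 "Encrypted": True}
    if kms_key_id:  # otherwise the account's default key for EBS is used
        copy_args["KmsKeyId"] = kms_key_id
    encrypted = ec2.copy_snapshot(**copy_args)
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[encrypted["SnapshotId"]])

    # 3. Create a new volume from the encrypted snapshot.
    new_volume = ec2.create_volume(SnapshotId=encrypted["SnapshotId"],
                                   AvailabilityZone=zone)
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume["VolumeId"]])

    # 4. Attach the encrypted volume, then detach and delete the old one.
    ec2.attach_volume(VolumeId=new_volume["VolumeId"],
                      InstanceId=instance_id, Device=device)
    ec2.detach_volume(VolumeId=volume_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.delete_volume(VolumeId=volume_id)
    return new_volume["VolumeId"]
```

Because the client is passed in as a parameter, the flow can be exercised with a stub object before pointing it at real resources. For a root volume the instance would normally be stopped first, which this sketch leaves out.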
Sharing EBS Snapshot
- Users can make any unencrypted EBS snapshot public (encrypted EBS snapshots cannot be made public)
- Sharing an encrypted EBS snapshot
- The EBS snapshot has to be private
- The snapshot must not have been encrypted with the AWS managed CMK
- The receiving user must have permission to use the sharing user's CMK
- When sharing the snapshot, the sharing user can allow the copied snapshot to be encrypted with the receiving user's own CMK
- The receiving user should copy the EBS snapshot before using it.
- During the copy, AWS decrypts the snapshot with the sharing user's CMK (which the receiving user already has permission to use) and re-encrypts it with the receiving user's own CMK.
- Any changes to the copied EBS snapshot do not affect the original EBS snapshot
- Only completed EBS snapshots can be copied
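The sharing rules above can be condensed into a small decision function. This only restates the notes as code: the function name and the key-type labels are hypothetical, and it does not call any AWS API.

```python
from typing import Optional, Set


def sharing_options(encrypted: bool, key_type: Optional[str] = None) -> Set[str]:
    """Return the ways a snapshot may be shared: 'public' and/or 'private'.

    key_type is 'customer_managed' or 'aws_managed' and only matters
    when the snapshot is encrypted.
    """
    if not encrypted:
        return {"public", "private"}
    if key_type == "customer_managed":
        # Private sharing only; the receiving account also needs
        # permission to use the CMK.
        return {"private"}
    # Encrypted with the AWS managed CMK: cannot be shared.
    return set()
```

For example, `sharing_options(True, "aws_managed")` returns an empty set, which corresponds to the rule that such a snapshot has to be re-encrypted with a customer managed CMK before it can be shared.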
Creating Amazon Machine Image (AMI)
- Instance-store-backed EC2 instance
- The user should specify an S3 bucket to store the created AMI
- After the AMI is created, the user should register it to make it available for use
- If the AMI is no longer needed, the user should deregister it and then delete it from the S3 bucket
- EBS-backed EC2 instance
- Stop the EC2 instance to ensure data consistency (at this time, every attached EBS volume gets its own EBS snapshot automatically)
- Create the AMI (the user does not need to specify an S3 bucket to store it; AWS snapshots all attached EBS volumes and registers the AMI automatically)
- If the AMI is no longer needed, the user should deregister it and then delete its EBS snapshots
Redundant Array of Independent Disks (RAID)
Most operating systems support RAID, and it works with EBS volumes too
- RAID 0 (stripe, like "sharding")
- combines multiple EBS volumes in series to increase I/O throughput (no redundancy)
- if any member EBS volume fails, the whole RAID fails
- the user should balance the throughput of the RAID against that of the EC2 instance; otherwise the RAID is pointless
- RAID 1 (mirror, like "replica")
- writes the same data to multiple EBS volumes in parallel (redundancy)
- improves redundancy, but I/O throughput stays the same
- if any member EBS volume fails, the RAID keeps working
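The trade-off between the two RAID levels can be made concrete with a small calculation. A minimal sketch under idealized assumptions (identical volumes, perfect striping, write throughput only); the 100 GiB and 250 MiB/s numbers are made up for illustration and the names are hypothetical:

```python
def raid_summary(level: int, volumes: int, size_gib: int, throughput_mibs: int) -> dict:
    """Idealized capacity, write throughput, and fault tolerance of an array."""
    if level == 0:  # striping: capacity and throughput add up, no redundancy
        return {"capacity_gib": volumes * size_gib,
                "write_mibs": volumes * throughput_mibs,
                "survives_failures": 0}
    if level == 1:  # mirroring: every volume holds the same data
        return {"capacity_gib": size_gib,
                "write_mibs": throughput_mibs,
                "survives_failures": volumes - 1}
    raise ValueError("only RAID 0 and RAID 1 are modelled")


def usable_throughput(array_mibs: int, instance_limit_mibs: int) -> int:
    """The instance's own EBS bandwidth caps whatever the array could deliver."""
    return min(array_mibs, instance_limit_mibs)


if __name__ == "__main__":
    stripe = raid_summary(0, volumes=4, size_gib=100, throughput_mibs=250)
    mirror = raid_summary(1, volumes=2, size_gib=100, throughput_mibs=250)
    print(stripe, mirror)
    # A 4-volume stripe behind an instance limited to 500 MiB/s gains nothing past 500.
    print(usable_throughput(stripe["write_mibs"], 500))
```

The last line illustrates the balancing point from the notes: striping more volumes than the instance can drive adds cost without adding throughput.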
Key pair
Used for SSH login (the public key plays the role of a username, the private key that of a password)
- The public key is registered on the EC2 instance, while the private key must be kept by the user.
- The same key pair can be used for multiple EC2 instances
- AWS cannot log the user in to the EC2 instance if the user loses the private key, which is downloaded when the user creates the EC2 instance
Placement Group
A placement group is a logical grouping that defines how instances are allocated to physical hosts
- A placement group can span multiple VPCs through VPC peering
- A placement group can span multiple AZs but not regions
- Placement group types
- Cluster: all EC2 instances are in the same AZ
- Partition: instances in different partitions are hosted on different hosts
- Spread: each instance is hosted on a different host
Spot Instance
Depending on various factors, AWS can start or terminate specific EC2 instances automatically
- Configurations
- Minimum computing unit, minimum CPU performance
- Total target capacity, the maximum number of EC2 instances
- Optional on-demand portion, the number of EC2 instances to use once the price goes over the target price
- Request valid from/to, request period
EC2 Monitoring
AWS performs a status check once per minute, and this functionality cannot be disabled
- AWS evaluates the status of the EC2 instance as Pass or Fail, based on hardware and software
- After several consecutive failures, the status of the EC2 instance becomes Impaired
- If the impaired instance is EBS-backed, AWS creates a brand-new EC2 instance on a different host and attaches the old EBS volume.
- The check results are reported to CloudWatch, where the user can set up related actions (for example, notifications through SNS)
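The transition from Pass/Fail results to the Impaired status can be sketched as a counter of consecutive failures. The notes do not say how many failures are needed, so the threshold of 2 below is purely an assumption, as are all the names:

```python
from typing import Iterable


def instance_status(checks: Iterable[bool], threshold: int = 2) -> str:
    """Return 'impaired' if the most recent `threshold` checks all failed, else 'ok'.

    `checks` holds one boolean per minute, True meaning the check passed.
    """
    consecutive_failures = 0
    for passed in checks:
        # A passing check resets the streak; a failing one extends it.
        consecutive_failures = 0 if passed else consecutive_failures + 1
    return "impaired" if consecutive_failures >= threshold else "ok"
```

A single failed check followed by a pass leaves the instance in the `ok` state, which mirrors the idea that only repeated failures mark it as Impaired.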
EC2 Metadata and User data
- Metadata
- Configuration information, which includes IP addresses, DNS name, image ID, instance ID, etc.
- User data
- After the instance is launched, the script in user data is executed automatically
- User data is not encrypted
- The developer can use the AWS API to read the user data or the metadata
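From inside an instance, metadata and user data are served by the instance metadata service at the link-local address 169.254.169.254. A minimal sketch using only the standard library and the token-based (IMDSv2) flow; it only returns data when run on an EC2 instance, and the helper names are hypothetical:

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"


def imds_url(path: str) -> str:
    """Build the URL for a path such as 'meta-data/instance-id' or 'user-data'."""
    return f"{IMDS}/{path.lstrip('/')}"


def imds_get(path: str, timeout: float = 2.0) -> str:
    """Fetch one value from the instance metadata service using a session token."""
    # Step 1: request a short-lived session token.
    token_request = urllib.request.Request(
        imds_url("api/token"),
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()

    # Step 2: present the token when reading the actual value.
    request = urllib.request.Request(
        imds_url(path), headers={"X-aws-ec2-metadata-token": token}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()
```

On an instance, `imds_get("meta-data/instance-id")` would return the instance ID and `imds_get("user-data")` the raw user data script. Since user data is not encrypted, anything that can reach this endpoint from the instance can read it.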
Elastic Network Interface (ENI)
- Each ENI can be assigned multiple IPs
- When an EC2 instance is launched, only one ENI is set up by default: eth0, the primary ENI, which cannot be detached
- The user can attach an additional ENI, eth1, which can be in a different subnet
- If an EC2 instance has two ENIs bound (eth0 and eth1), AWS does not assign a public IP to it automatically, but the user can add an Elastic IP manually.
Bastion Host
- Issue
- If every EC2 instance can be accessed directly through SSH/RDP with a key pair, that raises a security issue.
- Solution
- The user can use a bastion host as a "springboard": set the security groups of the EC2 instances to accept SSH/RDP inbound traffic only from the bastion host, and let the bastion host accept SSH/RDP inbound traffic only from a specific company IP.
- Example
- The user, coming from the company's IP address with a key pair, accesses the bastion host, and then accesses the EC2 instance through the bastion host
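The bastion rule set can be modelled with the standard `ipaddress` module. This is a toy check, not real security group evaluation; the addresses (203.0.113.10 for the company, 10.0.0.5 for the bastion) are placeholders chosen for illustration:

```python
import ipaddress
from typing import List, Tuple

Rule = Tuple[str, int]  # (allowed source CIDR, allowed port)


def allowed(source_ip: str, port: int, rules: List[Rule]) -> bool:
    """True if any inbound rule matches both the source address and the port."""
    address = ipaddress.ip_address(source_ip)
    return any(
        address in ipaddress.ip_network(cidr) and port == rule_port
        for cidr, rule_port in rules
    )


COMPANY_IP = "203.0.113.10"
BASTION_IP = "10.0.0.5"
SSH = 22

bastion_rules = [(f"{COMPANY_IP}/32", SSH)]   # bastion: SSH only from the company
instance_rules = [(f"{BASTION_IP}/32", SSH)]  # instances: SSH only from the bastion

if __name__ == "__main__":
    print(allowed(COMPANY_IP, SSH, bastion_rules))   # company -> bastion
    print(allowed(COMPANY_IP, SSH, instance_rules))  # company -> instance directly
    print(allowed(BASTION_IP, SSH, instance_rules))  # bastion -> instance
```

The company address reaches the bastion but not the instances directly, and only the bastion reaches the instances, which is the two-hop path described in the example.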
EC2 import and export
- Import
- Import instance settings from hypervisors such as VMware, Hyper-V, Xen, etc. to Amazon EC2
- Export
- Export instance settings in the same way as importing
- Only settings from EC2 instances that were created from imported instance settings can be exported