Question #1
A company collects data for temperature, humidity, and atmospheric pressure in cities across multiple continents. The average volume of data that the company collects from each site daily is 500 GB. Each site has a high-speed Internet connection. The company wants to aggregate the data from all these global sites as quickly as possible in a single Amazon S3 bucket. The solution must minimize operational complexity. Which solution meets these requirements?
- A. Turn on S3 Transfer Acceleration on the destination S3 bucket. Use multipart uploads to directly upload site data to the destination S3 bucket.
- B. Upload the data from each site to an S3 bucket in the closest Region. Use S3 Cross-Region Replication to copy objects to the destination S3 bucket. Then remove the data from the origin S3 bucket.
- C. Schedule AWS Snowball Edge Storage Optimized device jobs daily to transfer data from each site to the closest Region. Use S3 Cross-Region Replication to copy objects to the destination S3 bucket.
- D. Upload the data from each site to an Amazon EC2 instance in the closest Region. Store the data in an Amazon Elastic Block Store (Amazon EBS) volume. At regular intervals, take an EBS snapshot and copy it to the Region that contains the destination S3 bucket. Restore the EBS volume in that Region.
Why S3 Transfer Acceleration?
Amazon S3 Transfer Acceleration (S3TA) can speed up content transfers to and from Amazon S3 by as much as 50-500% for long-distance transfer of larger objects. With S3TA, you pay only for transfers that are accelerated. S3TA improves transfer performance by routing traffic over AWS backbone networks and by using network protocol optimizations.
Transfers are sped up by routing them over the AWS private network from an edge location (credit to Stéphane Maarek). To turn it on, go to the Properties tab of your bucket and find the S3 Transfer Acceleration setting. It is disabled by default.
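As a rough sketch of what option A involves on the client side, the snippet below builds the accelerate endpoint host name and picks a multipart part size. It assumes the documented multipart limits (parts of 5 MiB to 5 GiB, at most 10,000 parts). The bucket name, the 64 MiB preferred part size, and the object sizes are hypothetical values chosen for illustration.

```python
import math
from typing import Tuple

MIB = 1024 ** 2
GIB = 1024 ** 3

# Multipart upload limits, as documented for Amazon S3.
MIN_PART_SIZE = 5 * MIB   # every part except the last must be at least 5 MiB
MAX_PART_SIZE = 5 * GIB
MAX_PARTS = 10_000


def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration endpoint host for a bucket."""
    return f"{bucket}.s3-accelerate.amazonaws.com"


def plan_multipart(total_bytes: int, preferred_part_size: int = 64 * MIB) -> Tuple[int, int]:
    """Return (part_size, part_count) that stays within the multipart limits."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Grow the part size when the preferred size would need more than MAX_PARTS parts.
    part_size = max(preferred_part_size, MIN_PART_SIZE, math.ceil(total_bytes / MAX_PARTS))
    if part_size > MAX_PART_SIZE:
        raise ValueError("object is too large for a single multipart upload")
    return part_size, math.ceil(total_bytes / part_size)


# Hypothetical bucket name and object size.
print(accelerate_endpoint("example-destination-bucket"))
print(plan_multipart(100 * GIB))  # (67108864, 1600): 64 MiB parts, 1600 of them
```

In practice an SDK or the AWS CLI handles the part splitting; the point is only that each site uploads directly to one bucket through the accelerate endpoint, with no intermediate buckets to manage.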
Arguments About Other Options:
B: This option increases complexity and operational overhead. First, you need to enable versioning on both buckets, and then you need IAM permissions. "The minimum configuration must provide the following: The destination bucket or buckets where you want Amazon S3 to replicate objects. An AWS Identity and Access Management (IAM) role that Amazon S3 can assume to replicate objects on your behalf."
Replicating objects
C: This option does not fit the requirements. Snowball Edge Storage Optimized devices are intended for migrating data ranging from 80 TB to 210 TB, and S3 Cross-Region Replication carries its own overhead, as discussed above.
AWS Snowball Features | Amazon Web Services
D: This option is far more complex than the others, while our task is to transfer the data quickly and minimize operational overhead.
Question #2
A company needs the ability to analyze the log files of its proprietary application. The logs are stored in JSON format in an Amazon S3 bucket. Queries will be simple and will run on-demand. A solutions architect needs to perform the analysis with minimal changes to the existing architecture. What should the solutions architect do to meet these requirements with the LEAST amount of operational overhead?
- A. Use Amazon Redshift to load all the content into one place and run the SQL queries as needed.
- B. Use Amazon CloudWatch Logs to store the logs. Run SQL queries as needed from the Amazon CloudWatch console.
- C. Use Amazon Athena directly with Amazon S3 to run the queries as needed.
- D. Use AWS Glue to catalog the logs. Use a transient Apache Spark cluster on Amazon EMR to run the SQL queries as needed
Reference:
Amazon Athena is an interactive query service that makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL. With a few actions in the AWS Management Console, you can point Athena at your data stored in Amazon S3 and begin using standard SQL to run ad-hoc queries and get results in seconds.
With Athena, you run queries directly against the data in the S3 bucket, so the existing architecture stays as it is.
Question #3
A company uses AWS Organizations to manage multiple AWS accounts for different departments. The management account has an Amazon S3 bucket that contains project reports. The company wants to limit access to this S3 bucket to only users of accounts within the organization in AWS Organizations. Which solution meets these requirements with the LEAST amount of operational overhead?
- A. Add the aws:PrincipalOrgID global condition key with a reference to the organization ID to the S3 bucket policy.
- B. Create an organizational unit (OU) for each department. Add the aws:PrincipalOrgPaths global condition key to the S3 bucket policy.
- C. Use AWS CloudTrail to monitor the CreateAccount, InviteAccountToOrganization, LeaveOrganization, and RemoveAccountFromOrganization events. Update the S3 bucket policy accordingly.
- D. Tag each user that needs access to the S3 bucket. Add the aws:PrincipalTag global condition key to the S3 bucket policy.
Reference:
"... you can define the aws:PrincipalOrgID condition and set the value to your organization ID in the bucket policy."
An easier way to control access to AWS resources by using the AWS organization of IAM principals
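To make the quoted sentence concrete, here is a minimal sketch of such a bucket policy, built as a Python dictionary and printed as JSON. The bucket name `example-project-reports`, the organization ID `o-exampleorgid`, and the choice of `s3:GetObject` as the allowed action are assumptions for illustration and do not come from the question.

```python
import json


def org_only_bucket_policy(bucket: str, org_id: str) -> dict:
    """Build a bucket policy that allows reads only from principals in one organization."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowOrgPrincipalsOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",  # assumed action; adjust to what users need
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Only principals whose account belongs to this organization match.
                "Condition": {"StringEquals": {"aws:PrincipalOrgID": org_id}},
            }
        ],
    }


# Hypothetical bucket name and organization ID.
policy = org_only_bucket_policy("example-project-reports", "o-exampleorgid")
print(json.dumps(policy, indent=2))
```

Because the condition refers to the organization as a whole, accounts that join or leave the organization are covered without editing the policy, which is why this option has the least operational overhead.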
Question #4
An application runs on an Amazon EC2 instance in a VPC. The application processes logs that are stored in an Amazon S3 bucket. The EC2 instance needs to access the S3 bucket without connectivity to the internet. Which solution will provide private network connectivity to Amazon S3?
- A. Create a gateway VPC endpoint to the S3 bucket.
- B. Stream the logs to Amazon CloudWatch Logs. Export the logs to the S3 bucket.
- C. Create an instance profile on Amazon EC2 to allow S3 access.
- D. Create an Amazon API Gateway API with a private link to access the S3 endpoint.
Reference:
"A VPC endpoint enables customers to privately connect to supported AWS services and VPC endpoint services powered by AWS PrivateLink. Amazon VPC instances do not require public IP addresses to communicate with resources of the service."
What are VPC endpoints?
Question #5
A company is hosting a web application on AWS using a single Amazon EC2 instance that stores user-uploaded documents in an Amazon EBS volume. For better scalability and availability, the company duplicated the architecture and created a second EC2 instance and EBS volume in another Availability Zone, placing both behind an Application Load Balancer. After completing this change, users reported that, each time they refreshed the website, they could see one subset of their documents or the other, but never all of the documents at the same time. What should a solutions architect propose to ensure users see all of their documents at once?
- A. Copy the data so both EBS volumes contain all the documents
- B. Configure the Application Load Balancer to direct a user to the server with the documents
- C. Copy the data from both EBS volumes to Amazon EFS. Modify the application to save new documents to Amazon EFS
- D. Configure the Application Load Balancer to send the request to both servers. Return each document from the correct server
Reference:
Option C is generally a better solution for ensuring data consistency between multiple instances in different Availability Zones.
EFS does not require manual effort to keep the data synchronized in the long term.
Question #6
A company uses NFS to store large video files in on-premises network attached storage. Each video file ranges in size from 1 MB to 500 GB. The total storage is 70 TB and is no longer growing. The company decides to migrate the video files to Amazon S3. The company must migrate the video files as soon as possible while using the least possible network bandwidth. Which solution will meet these requirements?
- A. Create an S3 bucket. Create an IAM role that has permissions to write to the S3 bucket. Use the AWS CLI to copy all files locally to the S3 bucket.
- B. Create an AWS Snowball Edge job. Receive a Snowball Edge device on premises. Use the Snowball Edge client to transfer data to the device. Return the device so that AWS can import the data into Amazon S3.
- C. Deploy an S3 File Gateway on premises. Create a public service endpoint to connect to the S3 File Gateway. Create an S3 bucket. Create a new NFS file share on the S3 File Gateway. Point the new file share to the S3 bucket. Transfer the data from the existing NFS file share to the S3 File Gateway.
- D. Set up an AWS Direct Connect connection between the on-premises network and AWS. Deploy an S3 File Gateway on premises. Create a public virtual interface (VIF) to connect to the S3 File Gateway. Create an S3 bucket. Create a new NFS file share on the S3 File Gateway. Point the new file share to the S3 bucket. Transfer the data from the existing NFS file share to the S3 File Gateway.
Reference/Solution:
Option B is the most suitable for this case because:
Minimizing Network Bandwidth: AWS Snowball Edge is designed for large-scale data transfers in situations where moving data over the network can be time-consuming and costly. By using a Snowball Edge device, you can copy the data locally, which eliminates the need for significant network bandwidth usage.
AWS Snowball Features | Amazon Web Services
Arguments about others:
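Option A: This option uses far more network bandwidth. Suppose the link sustains 100 MB per second, and take 70 TB as 70,000,000 MB:
Time (seconds) = 70,000,000 MB / 100 MB per second = 700,000 seconds
Time (hours) = 700,000 seconds / 3,600 seconds per hour
Time (hours) ≈ 194.44 hours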
So, it would take approximately 194.44 hours to transfer 70 TB of data at a constant network speed of 100 MB per second using AWS CLI. This is equivalent to about 8.1 days.
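The same arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes decimal units (1 TB = 1,000,000 MB) and a constant 100 MB per second, which a real link rarely sustains.

```python
def transfer_time_seconds(total_tb: float, rate_mb_per_s: float) -> float:
    """Time to move total_tb terabytes at a constant rate, in seconds."""
    total_mb = total_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / rate_mb_per_s


seconds = transfer_time_seconds(70, 100)
hours = seconds / 3600
days = hours / 24
print(f"{seconds:,.0f} s = {hours:.2f} h = {days:.1f} days")
# prints: 700,000 s = 194.44 h = 8.1 days
```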
Option C, which involves deploying an S3 File Gateway on premises, does not minimize network bandwidth usage as it relies on the network to transfer data to S3 via the S3 File Gateway.
Option D, which involves using AWS Direct Connect and an S3 File Gateway, adds complexity and potential costs associated with Direct Connect, which may not be necessary for a one-time data migration.
Question #7
A company has an application that ingests incoming messages. Dozens of other applications and microservices then quickly consume these messages. The number of messages varies drastically and sometimes increases suddenly to 100,000 each second. The company wants to decouple the solution and increase scalability. Which solution meets these requirements?
- A. Persist the messages to Amazon Kinesis Data Analytics. Configure the consumer applications to read and process the messages.
- B. Deploy the ingestion application on Amazon EC2 instances in an Auto Scaling group to scale the number of EC2 instances based on CPU metrics.
- C. Write the messages to Amazon Kinesis Data Streams with a single shard. Use an AWS Lambda function to preprocess messages and store them in Amazon DynamoDB. Configure the consumer applications to read from DynamoDB to process the messages.
- D. Publish the messages to an Amazon Simple Notification Service (Amazon SNS) topic with multiple Amazon Simple Queue Service (Amazon SQS) subscriptions. Configure the consumer applications to process the messages from the queues.
Reference/Solution:
The SNS-to-SQS fan-out pattern is ideal for this case. SNS and SQS are used together to decouple the sending and receiving components.
Fanout to Amazon SQS queues
Application integration is a common scenario for these two services.
Common Amazon SNS scenarios
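The fan-out idea can be illustrated with a toy in-memory model. This is not the AWS API: `Topic` and the queue names are hypothetical stand-ins for an SNS topic and its SQS subscriptions, used only to show that every subscribed queue receives its own copy of each message and is drained independently.

```python
from collections import deque


class Topic:
    """Toy stand-in for an SNS topic that fans out to subscribed queues."""

    def __init__(self):
        self.queues = []

    def subscribe(self, queue: deque) -> None:
        self.queues.append(queue)

    def publish(self, message: str) -> None:
        # Each subscribed queue gets its own copy of the message.
        for queue in self.queues:
            queue.append(message)


# Hypothetical consumers, each with its own queue.
billing_queue = deque()
analytics_queue = deque()

topic = Topic()
topic.subscribe(billing_queue)
topic.subscribe(analytics_queue)

for i in range(3):
    topic.publish(f"message-{i}")

# One consumer draining its queue does not affect the other.
billing_queue.popleft()
print(len(billing_queue), len(analytics_queue))  # 2 3
```

The producer only knows about the topic, so consumers can be added or removed without touching the ingestion application, and each queue buffers bursts for its own consumer.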
Arguments about others:
Option A, using Amazon Kinesis Data Analytics, is designed for stream processing and analytics rather than decoupling and message distribution, making it less suitable for this specific use case.
Option B, deploying the ingestion application on EC2 instances in an Auto Scaling group, might not provide the same level of flexibility and scalability as the publish-subscribe model offered by SNS and SQS. It could be more challenging to handle sudden spikes in message loads effectively with this option.
Option C, which suggests using Amazon Kinesis Data Streams and DynamoDB, could also work but may introduce unnecessary complexity for this use case. Kinesis Data Streams are typically used for real-time stream processing, and DynamoDB may not be the most efficient storage solution for this scenario, especially given the varying message loads.
Therefore, Option D is the recommended choice for this scenario.
Question #8
A company is migrating a distributed application to AWS. The application serves variable workloads. The legacy platform consists of a primary server that coordinates jobs across multiple compute nodes. The company wants to modernize the application with a solution that maximizes resiliency and scalability. How should a solutions architect design the architecture to meet these requirements?
- A. Configure an Amazon Simple Queue Service (Amazon SQS) queue as a destination for the jobs. Implement the compute nodes with Amazon EC2 instances that are managed in an Auto Scaling group. Configure EC2 Auto Scaling to use scheduled scaling.
- **B. Configure an Amazon Simple Queue Service (Amazon SQS) queue as a destination for the jobs. Implement the compute nodes with Amazon EC2 instances that are managed in an Auto Scaling group. Configure EC2 Auto Scaling based on the size of the queue.**
- C. Implement the primary server and the compute nodes with Amazon EC2 instances that are managed in an Auto Scaling group. Configure AWS CloudTrail as a destination for the jobs. Configure EC2 Auto Scaling based on the load on the primary server.
- D. Implement the primary server and the compute nodes with Amazon EC2 instances that are managed in an Auto Scaling group. Configure Amazon EventBridge (Amazon CloudWatch Events) as a destination for the jobs. Configure EC2 Auto Scaling based on the load on the compute nodes.
Reference/Solution:
SQS is resilient and highly available.
Resilience in Amazon SQS
For scalability, we use an Auto Scaling group that scales on an appropriate SQS metric, such as the size of the queue.
Scaling based on Amazon SQS
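One common way to turn queue size into a scaling signal is a "backlog per instance" calculation: divide the acceptable latency by the average time to process one message to get the backlog one instance can tolerate, then size the group so the queue length divided by that backlog is covered. The sketch below assumes that approach. The function name, the instance limits, and the sample numbers are illustrative and not taken from the question.

```python
import math


def desired_instances(
    queue_length: int,
    avg_processing_seconds: float,
    acceptable_latency_seconds: float,
    min_instances: int = 1,
    max_instances: int = 100,
) -> int:
    """Estimate how many instances keep the per-instance backlog acceptable."""
    # Messages one instance can have waiting while still meeting the latency goal.
    acceptable_backlog = acceptable_latency_seconds / avg_processing_seconds
    needed = math.ceil(queue_length / acceptable_backlog)
    # Clamp to the Auto Scaling group's assumed size limits.
    return max(min_instances, min(max_instances, needed))


# Illustrative numbers: 1,500 queued messages, 0.1 s per message, 10 s acceptable latency.
print(desired_instances(1500, 0.1, 10))  # 15
```

With an empty queue the result falls back to the minimum, which is the behavior scheduled scaling cannot provide for a variable workload.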
Arguments about others:
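Option A: Scheduled scaling does not fit this scenario, because the application serves variable workloads rather than load that follows a predictable schedule.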
Option C, configuring AWS CloudTrail as a destination for jobs, is not suitable for handling job coordination and workload scalability. CloudTrail is primarily used for auditing and monitoring AWS API activities, not for managing distributed application workloads.