Get Instant Access to AWS-Certified-Machine-Learning-Specialty Practice Exam Questions [Q92-Q117]


Reliable Study Materials & Testing Engine for AWS-Certified-Machine-Learning-Specialty Exam Success!


Achieving the AWS Certified Machine Learning - Specialty certification can help individuals advance their careers in machine learning and increase their earning potential. The certification is recognized by industry leaders and can open up new opportunities in industries such as healthcare, finance, and retail.

 

NEW QUESTION # 92
A financial services company is building a robust serverless data lake on Amazon S3. The data lake should be flexible and meet the following requirements:
* Support querying old and new data on Amazon S3 through Amazon Athena and Amazon Redshift Spectrum.
* Support event-driven ETL pipelines.
* Provide a quick and easy way to understand metadata.
Which approach meets these requirements?

  • A. Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Batch job, and an AWS Glue Data Catalog to search and discover metadata.
  • B. Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Glue ETL job, and an AWS Glue Data Catalog to search and discover metadata.
  • C. Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Batch job, and an external Apache Hive metastore to search and discover metadata.
  • D. Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Glue ETL job, and an external Apache Hive metastore to search and discover metadata.

Answer: B
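
The event-driven part of option B can be sketched as a Lambda handler that starts a Glue job whenever new objects land in S3. This is a minimal illustration, not a production design: the job name and the argument names (`--source_bucket`, `--source_key`) are hypothetical, and the optional `glue_client` parameter exists only so the function can be exercised without AWS credentials.

```python
import urllib.parse

GLUE_JOB_NAME = "example-etl-job"  # hypothetical job name


def handler(event, context, glue_client=None):
    """Start one Glue job run per S3 object in the triggering event."""
    if glue_client is None:
        import boto3  # imported lazily so the sketch can be tested without AWS
        glue_client = boto3.client("glue")

    run_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            # Hypothetical job arguments; Glue passes these to the ETL script.
            Arguments={"--source_bucket": bucket, "--source_key": key},
        )
        run_ids.append(response["JobRunId"])
    return run_ids
```

In a real deployment the handler would be wired to an S3 event notification, and the Glue crawler would keep the Data Catalog in sync so Athena and Redshift Spectrum can query the output.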


NEW QUESTION # 93
A Data Science team is designing a dataset repository where it will store a large amount of training data commonly used in its machine learning models. As Data Scientists may create an arbitrary number of new datasets every day, the solution has to scale automatically and be cost-effective. Also, it must be possible to explore the data using SQL.
Which storage scheme is BEST suited to this scenario?

  • A. Store datasets as files in Amazon S3.
  • B. Store datasets as global tables in Amazon DynamoDB.
  • C. Store datasets as files in an Amazon EBS volume attached to an Amazon EC2 instance.
  • D. Store datasets as tables in a multi-node Amazon Redshift cluster.

Answer: A


NEW QUESTION # 94
A large JSON dataset for a project has been uploaded to a private Amazon S3 bucket. The Machine Learning Specialist wants to securely access and explore the data from an Amazon SageMaker notebook instance. A new VPC was created and assigned to the Specialist. How can the privacy and integrity of the data stored in Amazon S3 be maintained while granting access to the Specialist for analysis?

  • A. Launch the SageMaker notebook instance within the VPC with SageMaker-provided internet access enabled. Use an S3 ACL to open read privileges to the everyone group.
  • B. Launch the SageMaker notebook instance within the VPC and create an S3 VPC endpoint for the notebook to access the data. Define a custom S3 bucket policy to only allow requests from your VPC to access the S3 bucket.
  • C. Launch the SageMaker notebook instance within the VPC and create an S3 VPC endpoint for the notebook to access the data. Copy the JSON dataset from Amazon S3 into the ML storage volume on the SageMaker notebook instance and work against the local dataset.
  • D. Launch the SageMaker notebook instance within the VPC with SageMaker-provided internet access enabled. Generate an S3 pre-signed URL for access to data in the bucket.

Answer: B
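
The VPC-restricted bucket policy described in option B can be sketched as follows. The bucket name and VPC ID are hypothetical placeholders, and the policy is an illustration of the deny-unless-from-VPC pattern rather than a vetted configuration. The `aws:SourceVpc` condition key only applies to requests that arrive through a VPC endpoint.

```python
import json

BUCKET = "example-ml-datasets"    # hypothetical bucket name
VPC_ID = "vpc-0123456789abcdef0"  # hypothetical VPC ID


def vpc_only_bucket_policy(bucket, vpc_id):
    """Return a policy that denies S3 requests not originating from vpc_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessFromOutsideVpc",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpc": vpc_id}},
            }
        ],
    }


policy_json = json.dumps(vpc_only_bucket_policy(BUCKET, VPC_ID), indent=2)
print(policy_json)
```

A blanket deny like this also blocks console and other non-VPC access to the bucket, so in practice it is usually narrowed or paired with explicit exceptions.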


NEW QUESTION # 95
A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-based ensemble model using Amazon SageMaker with Area Under the ROC Curve (AUC) as the objective metric. This workflow will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model click-through on data that goes stale every 24 hours. With the goal of decreasing the amount of time it takes to train these models, and ultimately to decrease costs, the Specialist wants to reconfigure the input hyperparameter range(s). Which visualization will accomplish this?

  • A. A scatter plot with points colored by target variable that uses t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the large number of input variables in an easier-to-read dimension.
  • B. A scatter plot showing the correlation between maximum tree depth and the objective metric.
  • C. A histogram showing whether the most important input feature is Gaussian.
  • D. A scatter plot showing the performance of the objective metric over each training iteration.

Answer: B


NEW QUESTION # 96
A health care company is planning to use neural networks to classify their X-ray images into normal and abnormal classes. The labeled data is divided into a training set of 1,000 images and a test set of 200 images.
The initial training of a neural network model with 50 hidden layers yielded 99% accuracy on the training set, but only 55% accuracy on the test set.
What changes should the Specialist consider to solve this issue? (Choose three.)

  • A. Enable early stopping
  • B. Include all the images from the test set in the training set
  • C. Choose a smaller learning rate
  • D. Choose a lower number of layers
  • E. Choose a higher number of layers
  • F. Enable dropout

Answer: A,D,F
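
The gap between 99% training accuracy and 55% test accuracy points to overfitting, and early stopping (option A) is one way to limit it. The sketch below shows the usual patience-based rule in plain Python. The validation losses are made-up numbers, and the function name and parameters are illustrative rather than taken from any particular framework.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs. If that never happens,
    the last epoch index is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical validation losses: improvement stalls after epoch 3.
losses = [0.90, 0.70, 0.60, 0.55, 0.58, 0.61, 0.65, 0.70]
stop_at = early_stopping_epoch(losses, patience=3)
print(stop_at)
```

Dropout and a shallower network attack the same problem from the model-capacity side, while early stopping only limits how long the existing network is allowed to fit the training set.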


NEW QUESTION # 97
A machine learning specialist is running an Amazon SageMaker endpoint using the built-in object detection algorithm on a P3 instance for real-time predictions in a company's production application. When evaluating the model's resource utilization, the specialist notices that the model is using only a fraction of the GPU.
Which architecture changes would ensure that provisioned resources are being utilized effectively?

  • A. Redeploy the model on a P3dn instance.
  • B. Deploy the model onto an Amazon Elastic Container Service (Amazon ECS) cluster using a P3 instance.
  • C. Redeploy the model on an M5 instance. Attach Amazon Elastic Inference to the instance.
  • D. Redeploy the model as a batch transform job on an M5 instance.

Answer: C


NEW QUESTION # 98
A company will use Amazon SageMaker to train and host a machine learning (ML) model for a marketing campaign. The majority of data is sensitive customer data. The data must be encrypted at rest. The company wants AWS to maintain the root of trust for the master keys and wants encryption key usage to be logged.
Which implementation will meet these requirements?

  • A. Use encryption keys that are stored in AWS Cloud HSM to encrypt the ML data volumes, and to encrypt the model artifacts and data in Amazon S3.
  • B. Use customer managed keys in AWS Key Management Service (AWS KMS) to encrypt the ML data volumes, and to encrypt the model artifacts and data in Amazon S3.
  • C. Use AWS Security Token Service (AWS STS) to create temporary tokens to encrypt the ML storage volumes, and to encrypt the model artifacts and data in Amazon S3.
  • D. Use SageMaker built-in transient keys to encrypt the ML data volumes. Enable default encryption for new Amazon Elastic Block Store (Amazon EBS) volumes.

Answer: B


NEW QUESTION # 99
A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-based ensemble model using Amazon SageMaker with Area Under the ROC Curve (AUC) as the objective metric.
This workflow will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model click-through on data that goes stale every 24 hours.
With the goal of decreasing the amount of time it takes to train these models, and ultimately to decrease costs, the Specialist wants to reconfigure the input hyperparameter range(s).
Which visualization will accomplish this?

  • A. A scatter plot with points colored by target variable that uses t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the large number of input variables in an easier-to-read dimension.
  • B. A scatter plot showing the performance of the objective metric over each training iteration.
  • C. A scatter plot showing the correlation between maximum tree depth and the objective metric.
  • D. A histogram showing whether the most important input feature is Gaussian.

Answer: C

Explanation:
https://medium.com/all-things-ai/in-depth-parameter-tuning-for-random-forest-d67bb7e920d


NEW QUESTION # 100
The displayed graph is from a forecasting model for testing a time series.

Considering the graph only, which conclusion should a Machine Learning Specialist make about the behavior of the model?

  • A. The model does not predict the trend or the seasonality well.
  • B. The model predicts the trend well, but not the seasonality.
  • C. The model predicts both the trend and the seasonality well.
  • D. The model predicts the seasonality well, but not the trend.

Answer: A


NEW QUESTION # 101
A Machine Learning Specialist is working with a large cybersecurity company that manages security events in real time for companies around the world. The cybersecurity company wants to design a solution that will allow it to use machine learning to score malicious events as anomalies on the data as it is being ingested. The company also wants to be able to save the results in its data lake for later processing and analysis. What is the MOST efficient way to accomplish these tasks?

  • A. Ingest the data and store it in Amazon S3. Have an AWS Glue job that is triggered on demand transform the new data. Then use the built-in Random Cut Forest (RCF) model within Amazon SageMaker to detect anomalies in the data.
  • B. Ingest the data and store it in Amazon S3. Use AWS Batch along with the AWS Deep Learning AMIs to train a k-means model using TensorFlow on the data in Amazon S3.
  • C. Ingest the data using Amazon Kinesis Data Firehose, and use Amazon Kinesis Data Analytics Random Cut Forest (RCF) for anomaly detection. Then use Kinesis Data Firehose to stream the results to Amazon S3.
  • D. Ingest the data into Apache Spark Streaming using Amazon EMR, and use Spark MLlib with k-means to perform anomaly detection. Then store the results in an Apache Hadoop Distributed File System (HDFS) using Amazon EMR with a replication factor of three as the data lake.

Answer: C


NEW QUESTION # 102
A large mobile network operating company is building a machine learning model to predict customers who are likely to unsubscribe from the service. The company plans to offer an incentive for these customers as the cost of churn is far greater than the cost of the incentive.
The model produces the following confusion matrix after evaluating on a test dataset of 100 customers:

Based on the model evaluation results, why is this a viable model for production?

  • A. The model is 86% accurate and the cost incurred by the company as a result of false positives is less than the false negatives.
  • B. The model is 86% accurate and the cost incurred by the company as a result of false negatives is less than the false positives.
  • C. The precision of the model is 86%, which is greater than the accuracy of the model.
  • D. The precision of the model is 86%, which is less than the accuracy of the model.

Answer: A
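
The confusion matrix itself is not reproduced on this page, so the sketch below uses assumed counts that add up to 100 customers and give 86% accuracy. It shows how accuracy, precision, and recall are derived and how the cost of false positives compares with the cost of false negatives. The counts and the per-customer costs are hypothetical.

```python
# Assumed confusion-matrix counts (not taken from the missing figure).
true_negatives = 76   # predicted "stay", actually stayed
false_positives = 10  # predicted "churn", actually stayed
false_negatives = 4   # predicted "stay", actually churned
true_positives = 10   # predicted "churn", actually churned

total = true_negatives + false_positives + false_negatives + true_positives

accuracy = (true_positives + true_negatives) / total
precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)

# Hypothetical per-customer costs: an unneeded incentive is cheap,
# a lost customer is expensive.
incentive_cost = 10
churn_cost = 100

false_positive_cost = false_positives * incentive_cost
false_negative_cost = false_negatives * churn_cost

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
print(f"FP cost={false_positive_cost} FN cost={false_negative_cost}")
```

Under these assumptions the model wastes some incentives on customers who would have stayed, but that cost is small next to the cost of the churners it misses.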


NEW QUESTION # 103
A credit card company wants to build a credit scoring model to help predict whether a new credit card applicant will default on a credit card payment. The company has collected data from a large number of sources with thousands of raw attributes. Early experiments to train a classification model revealed that many attributes are highly correlated, the large number of features slows down the training speed significantly, and that there are some overfitting issues.
The Data Scientist on this project would like to speed up the model training time without losing a lot of information from the original dataset.
Which feature engineering technique should the Data Scientist use to meet the objectives?

  • A. Run self-correlation on all features and remove highly correlated features
  • B. Normalize all numerical values to be between 0 and 1
  • C. Use an autoencoder or principal component analysis (PCA) to replace original features with new features
  • D. Cluster raw data using k-means and use sample data from each cluster to build a new dataset

Answer: C


NEW QUESTION # 104
Given the following confusion matrix for a movie classification model, what is the true class frequency for Romance and the predicted class frequency for Adventure?

  • A. The true class frequency for Romance is 77.56% * 0.78 and the predicted class frequency for Adventure is 20.85% * 0.32.
  • B. The true class frequency for Romance is 57.92% and the predicted class frequency for Adventure is 13.12%.
  • C. The true class frequency for Romance is 77.56% and the predicted class frequency for Adventure is 20.85%.
  • D. The true class frequency for Romance is 0.78 and the predicted class frequency for Adventure is (0.47 - 0.32).

Answer: B

Explanation:
https://docs.aws.amazon.com/machine-learning/latest/dg/multiclass-model-insights.html


NEW QUESTION # 105
A real estate company wants to create a machine learning model for predicting housing prices based on a historical dataset. The dataset contains 32 features.
Which model will meet the business requirement?

  • A. Principal component analysis (PCA)
  • B. K-means
  • C. Linear regression
  • D. Logistic regression

Answer: C


NEW QUESTION # 106
A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a Machine Learning Specialist would like to build a binary classifier based on two features: age of account and transaction month. The class distribution for these features is illustrated in the figure provided.

Based on this information, which model would have the HIGHEST accuracy?

  • A. Support vector machine (SVM) with non-linear kernel
  • B. Long short-term memory (LSTM) model with scaled exponential linear unit (SELU)
  • C. Single perceptron with tanh activation function
  • D. Logistic regression

Answer: A


NEW QUESTION # 107
A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries on this data.
Which solution requires the LEAST effort to be able to query this data?

  • A. Use AWS Glue to catalogue the data and Amazon Athena to run queries.
  • B. Use AWS Batch to run ETL on the data and Amazon Aurora to run the queries.
  • C. Use AWS Data Pipeline to transform the data and Amazon RDS to run queries.
  • D. Use AWS Lambda to transform the data and Amazon Kinesis Data Analytics to run queries.

Answer: A


NEW QUESTION # 108
A Machine Learning Specialist is working with a large company to leverage machine learning within its products. The company wants to group its customers into categories based on which customers will and will not churn within the next 6 months. The company has labeled the data available to the Specialist.
Which machine learning model type should the Specialist use to accomplish this task?

  • A. Clustering
  • B. Reinforcement learning
  • C. Linear regression
  • D. Classification

Answer: D

Explanation:
The goal of classification is to determine to which class or category a data point (a customer, in our case) belongs. For classification problems, data scientists use historical data with predefined target variables, also known as labels (churner/non-churner), which are the answers that need to be predicted, to train an algorithm.
With classification, businesses can answer the following questions:
Will this customer churn or not?
Will a customer renew their subscription?
Will a user downgrade a pricing plan?
Are there any signs of unusual customer behavior?
https://www.kdnuggets.com/2019/05/churn-prediction-machine-learning.html


NEW QUESTION # 109
A Machine Learning Specialist is building a prediction model for a large number of features using linear models, such as linear regression and logistic regression. During exploratory data analysis, the Specialist observes that many features are highly correlated with each other. This may make the model unstable. What should be done to reduce the impact of having such a large number of features?

  • A. Create a new feature space using principal component analysis (PCA).
  • B. Use matrix multiplication on highly correlated features.
  • C. Apply the Pearson correlation coefficient.
  • D. Perform one-hot encoding on highly correlated features.

Answer: A
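
To see why PCA helps with highly correlated features, the sketch below works through the two-feature case in plain Python, where the eigenvalues of the covariance matrix have a closed form. The data is hypothetical and the helper names are illustrative. With many features, a library implementation would be used instead.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def explained_variance_2d(xs, ys):
    """Share of total variance captured by each principal component (2 features)."""
    a = covariance(xs, xs)
    b = covariance(xs, ys)
    c = covariance(ys, ys)
    # Eigenvalues of the symmetric 2x2 covariance matrix [[a, b], [b, c]].
    mid = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = (mid + spread, mid - spread)
    total = a + c
    return tuple(ev / total for ev in eigenvalues)


# Hypothetical data: the second feature is roughly twice the first.
feature_1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
feature_2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
first, second = explained_variance_2d(feature_1, feature_2)
print(f"PC1 share={first:.4f} PC2 share={second:.4f}")
```

Because the two features move together, nearly all of the variance falls on the first component, so the second can be dropped with little loss of information.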


NEW QUESTION # 110
A company offers an online shopping service to its customers. The company wants to enhance the site's security by requesting additional information when customers access the site from locations that are different from their normal location. The company wants to update the process to call a machine learning (ML) model to determine when additional information should be requested.
The company has several terabytes of data from its existing ecommerce web servers containing the source IP addresses for each request made to the web server. For authenticated requests, the records also contain the login name of the requesting user.
Which approach should an ML specialist take to implement the new security feature in the web application?

  • A. Use Amazon SageMaker to train a model using the Object2Vec algorithm. Schedule updates and retraining of the model using new log data nightly.
  • B. Use Amazon SageMaker Ground Truth to label each record as either a successful or failed access attempt. Use Amazon SageMaker to train a binary classification model using the IP Insights algorithm.
  • C. Use Amazon SageMaker to train a model using the IP Insights algorithm. Schedule updates and retraining of the model using new log data nightly.
  • D. Use Amazon SageMaker Ground Truth to label each record as either a successful or failed access attempt. Use Amazon SageMaker to train a binary classification model using the factorization machines (FM) algorithm.

Answer: C


NEW QUESTION # 111
A Machine Learning Specialist trained a regression model, but the first iteration needs optimizing. The Specialist needs to understand whether the model is more frequently overestimating or underestimating the target.
What option can the Specialist use to determine whether it is overestimating or underestimating the target value?

  • A. Residual plots
  • B. Area under the curve
  • C. Root Mean Square Error (RMSE)
  • D. Confusion matrix

Answer: A
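
The information a residual plot conveys (option A) can be illustrated numerically: the sign of each residual shows whether the model overshot or undershot that point. The sketch below uses hypothetical targets and predictions and defines the residual as actual minus predicted. Some texts use the opposite sign convention.

```python
def residual_summary(actual, predicted):
    """Count over- and under-estimates and report the mean residual.

    A residual is defined here as actual - predicted, so a negative
    residual means the model overestimated the target.
    """
    residuals = [a - p for a, p in zip(actual, predicted)]
    over = sum(1 for r in residuals if r < 0)
    under = sum(1 for r in residuals if r > 0)
    mean_residual = sum(residuals) / len(residuals)
    return over, under, mean_residual


# Hypothetical targets and predictions.
actual = [10.0, 12.0, 15.0, 20.0, 22.0]
predicted = [11.0, 13.5, 14.0, 23.0, 24.0]
over, under, mean_residual = residual_summary(actual, predicted)
print(over, under, mean_residual)
```

RMSE squares the errors and so discards their sign, while AUC and confusion matrices apply to classification rather than regression.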


NEW QUESTION # 112
A company is converting a large number of unstructured paper receipts into images. The company wants to create a model based on natural language processing (NLP) to find relevant entities such as date, location, and notes, as well as some custom entities such as receipt numbers.
The company is using optical character recognition (OCR) to extract text for data labeling. However, documents are in different structures and formats, and the company is facing challenges with setting up the manual workflows for each document type. Additionally, the company trained a named entity recognition (NER) model for custom entity detection using a small sample size. This model has a very low confidence score and will require retraining with a large dataset.
Which solution for text extraction and entity detection will require the LEAST amount of effort?

  • A. Extract text from receipt images by using Amazon Textract. Use the Amazon SageMaker BlazingText algorithm to train on the text for entities and custom entities.
  • B. Extract text from receipt images by using a deep learning OCR model from the AWS Marketplace. Use Amazon Comprehend for entity detection, and use Amazon Comprehend custom entity recognition for custom entity detection.
  • C. Extract text from receipt images by using a deep learning OCR model from the AWS Marketplace. Use the NER deep learning model to extract entities.
  • D. Extract text from receipt images by using Amazon Textract. Use Amazon Comprehend for entity detection, and use Amazon Comprehend custom entity recognition for custom entity detection.

Answer: D


NEW QUESTION # 113
A Machine Learning Specialist trained a regression model, but the first iteration needs optimizing. The Specialist needs to understand whether the model is more frequently overestimating or underestimating the target.
What option can the Specialist use to determine whether it is overestimating or underestimating the target value?

  • A. Residual plots
  • B. Area under the curve
  • C. Root Mean Square Error (RMSE)
  • D. Confusion matrix

Answer: A


NEW QUESTION # 114
A Marketing Manager at a pet insurance company plans to launch a targeted marketing campaign on social media to acquire new customers. Currently, the company has the following data in Amazon Aurora:
* Profiles for all past and existing customers
* Profiles for all past and existing insured pets
* Policy-level information
* Premiums received
* Claims paid
What steps should be taken to implement a machine learning model to identify potential new customers on social media?

  • A. Use clustering on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
  • B. Use a decision tree classifier engine on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
  • C. Use a recommendation engine on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
  • D. Use regression on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.

Answer: A


NEW QUESTION # 115
A retail chain has been ingesting purchasing records from its network of 20,000 stores to Amazon S3 using Amazon Kinesis Data Firehose. To support training an improved machine learning model, training records will require new but simple transformations, and some attributes will be combined. The model needs to be retrained daily. Given the large number of stores and the legacy data ingestion, which change will require the LEAST amount of development effort?

  • A. Spin up a fleet of Amazon EC2 instances with the transformation logic, have them transform the data records accumulating on Amazon S3, and output the transformed records to Amazon S3.
  • B. Insert an Amazon Kinesis Data Analytics stream downstream of the Kinesis Data Firehose stream that transforms raw record attributes into simple transformed values using SQL.
  • C. Deploy an Amazon EMR cluster running Apache Spark with the transformation logic, and have the cluster run each day on the accumulating records in Amazon S3, outputting new/transformed records to Amazon S3.
  • D. Require the stores to switch to capturing their data locally on AWS Storage Gateway for loading into Amazon S3, then use AWS Glue to do the transformation.

Answer: B


NEW QUESTION # 116
A retail chain has been ingesting purchasing records from its network of 20,000 stores to Amazon S3 using Amazon Kinesis Data Firehose. To support training an improved machine learning model, training records will require new but simple transformations, and some attributes will be combined. The model needs to be retrained daily.
Given the large number of stores and the legacy data ingestion, which change will require the LEAST amount of development effort?

  • A. Spin up a fleet of Amazon EC2 instances with the transformation logic, have them transform the data records accumulating on Amazon S3, and output the transformed records to Amazon S3.
  • B. Insert an Amazon Kinesis Data Analytics stream downstream of the Kinesis Data Firehose stream that transforms raw record attributes into simple transformed values using SQL.
  • C. Require the stores to switch to capturing their data locally on AWS Storage Gateway for loading into Amazon S3, then use AWS Glue to do the transformation.
  • D. Deploy an Amazon EMR cluster running Apache Spark with the transformation logic, and have the cluster run each day on the accumulating records in Amazon S3, outputting new/transformed records to Amazon S3.

Answer: B


NEW QUESTION # 117
......


Understanding functional and technical aspects of AWS Certified Machine Learning Specialty Exam Data Engineering

The following will be discussed here:

  • Create data repositories for machine learning
  • Identify and implement a data-transformation solution
  • Identify and implement a data-ingestion solution

The AWS Certified Machine Learning - Specialty Exam covers a wide range of topics, including data preparation, feature engineering, model selection and evaluation, deep learning, and deployment. It is designed to test an individual's ability to design, implement, deploy, and maintain machine learning solutions using AWS services. The exam also covers AWS services such as Amazon SageMaker, Amazon Rekognition, and Amazon Comprehend, which are essential tools for machine learning on AWS. By passing the exam, individuals can demonstrate their ability to design and implement effective machine learning solutions on the AWS platform, which can help them advance their careers in the field.

 

Validate your Skills with Updated AWS-Certified-Machine-Learning-Specialty Exam Questions & Answers and Test Engine: https://www.itpassleader.com/Amazon/AWS-Certified-Machine-Learning-Specialty-dumps-pass-exam.html

Tested & Approved AWS-Certified-Machine-Learning-Specialty Study Materials Download: https://drive.google.com/open?id=1r8MgP_MolP45bjWe3u08HOVP-ukthQI2
