# Troubleshooting

## Introduction

Deploying with CloudFormation avoids many of the errors that can occur during installation, especially in the configuration of security groups and other connectivity-related settings. However, errors may still occur during installation. This page covers the following subjects:

* Installation logs
* AWS account limits
* AWS instance launch error
* Launch in placement group error
* Instance type not supported in AZ
* ClusterBootCondition timeout
* Clients failed to join cluster

## Installation Logs

As explained in [Self-Service Installation](https://docs.weka.io/3.14/install/aws/self-service-portal), each instance launched in a Weka CloudFormation template starts by installing Weka on itself. This is performed by a script named `wekaio-instance-boot.sh`, which is launched by cloud-init. All logs generated by this script are written to the instance’s Syslog.

Additionally, the [CloudWatch Logs Agent](http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/EC2NewInstanceCWL.html) is installed on each instance and forwards Syslog to CloudWatch under a log-group named `/wekaio/<stack-name>`. For example, if the stack is named `cluster1`, a log-group named `/wekaio/cluster1` should appear in CloudWatch a few moments after the template shows that the instances have reached the CREATE\_COMPLETE state.

Under the log-group, there should be a log-stream for each instance Syslog matching the instance name in the CloudFormation template. For example, in a cluster with 6 backend instances, log-streams named `Backend0-syslog` through `Backend5-syslog` should be observed.
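The naming scheme above can be sketched in a few lines of Python. This is a minimal illustration, not part of the Weka tooling: the helper names are hypothetical, and `list_streams` assumes that `boto3` is installed and that AWS credentials with CloudWatch Logs read access are configured.

```python
from typing import List


def log_group_name(stack_name: str) -> str:
    """Return the log-group name for a stack, e.g. /wekaio/cluster1."""
    return f"/wekaio/{stack_name}"


def backend_stream_names(backend_count: int) -> List[str]:
    """Return the expected stream names Backend0-syslog .. Backend<N-1>-syslog."""
    return [f"Backend{i}-syslog" for i in range(backend_count)]


def missing_streams(expected: List[str], found: List[str]) -> List[str]:
    """Return the expected stream names that were not found in CloudWatch."""
    present = set(found)
    return [name for name in expected if name not in present]


def list_streams(stack_name: str) -> List[str]:
    """List log-stream names in the stack's log-group (assumes boto3 and credentials)."""
    import boto3  # imported lazily so the pure helpers above work without it

    logs = boto3.client("logs")
    names: List[str] = []
    paginator = logs.get_paginator("describe_log_streams")
    for page in paginator.paginate(logGroupName=log_group_name(stack_name)):
        names.extend(stream["logStreamName"] for stream in page["logStreams"])
    return names
```

For a 6-backend stack named `cluster1`, `missing_streams(backend_stream_names(6), list_streams("cluster1"))` would list any backend whose Syslog has not yet reached CloudWatch.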

## AWS Account Limits

When deploying the stack, the description of a CREATE\_FAILED event for one or more instances may report that more instances (N) have been requested than the current instance limit (L) permits for the specified instance type. To request an adjustment to this limit, go to [aws.amazon.com](http://aws.amazon.com/contact-us/ec2-request) and open a support case with AWS.
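The CREATE\_FAILED descriptions can also be collected programmatically instead of being read in the console. The following is a rough sketch under stated assumptions: the function names are hypothetical, and `fetch_stack_events` assumes that `boto3` is installed with credentials that may call the CloudFormation `DescribeStackEvents` API.

```python
from typing import Dict, Iterable, List


def failed_events(events: Iterable[Dict]) -> List[Dict[str, str]]:
    """Keep only CREATE_FAILED events, as resource name plus failure reason."""
    return [
        {
            "resource": event.get("LogicalResourceId", ""),
            "reason": event.get("ResourceStatusReason", ""),
        }
        for event in events
        if event.get("ResourceStatus") == "CREATE_FAILED"
    ]


def fetch_stack_events(stack_name: str) -> List[Dict]:
    """Fetch all events of a stack (assumes boto3 and AWS credentials)."""
    import boto3  # imported lazily so failed_events works without it

    cfn = boto3.client("cloudformation")
    events: List[Dict] = []
    paginator = cfn.get_paginator("describe_stack_events")
    for page in paginator.paginate(StackName=stack_name):
        events.extend(page["StackEvents"])
    return events
```

Calling `failed_events(fetch_stack_events("cluster1"))` returns one entry per failed resource. The `reason` text is the description to compare against the errors covered on this page.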

## AWS Instance Launch Error

If the error *Instance i-0a41ba7327062338e failed to stabilize. Current state: shutting-down. Reason: Server.InternalError: Internal error on launch* is received, one of the instances was unable to start. This is an internal AWS error; try to deploy the stack again.

## Launch in Placement Group Error

If the error *We currently do not have sufficient capacity to launch all of the additional requested instances into Placement Group 'PG'* is received, it was not possible to place all the requested instances in one placement-group.

The CloudFormation template creates all instances in one placement-group to guarantee best performance. Consequently, if the deployment fails with this error, try to deploy in another AZ.

## Instance Type Not Supported In AZ

If the error *The requested configuration is currently not supported. Please check the documentation for supported configurations* or *Your requested instance type (T) is not supported in your requested Availability Zone (AZ). Please retry your request by not specifying an Availability Zone or choosing az1, az2, az3* is received, the instance type that you tried to provision is not supported in the specified AZ. Try selecting another subnet to deploy the cluster in, which will implicitly select another AZ.
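Before choosing another subnet, it can help to check which AZs offer the instance type. The sketch below uses the EC2 `DescribeInstanceTypeOfferings` API through `boto3`. The function names are hypothetical, and the live query assumes that `boto3` is installed and that credentials and a default region are configured.

```python
from typing import Dict, Iterable, List


def zones_from_offerings(offerings: Iterable[Dict], instance_type: str) -> List[str]:
    """Return the sorted, de-duplicated AZ names that offer the given instance type."""
    return sorted(
        {
            offering["Location"]
            for offering in offerings
            if offering.get("InstanceType") == instance_type
        }
    )


def zones_offering(instance_type: str) -> List[str]:
    """Query EC2 for the AZs in the current region that offer instance_type."""
    import boto3  # imported lazily so zones_from_offerings works without it

    ec2 = boto3.client("ec2")
    offerings: List[Dict] = []
    paginator = ec2.get_paginator("describe_instance_type_offerings")
    pages = paginator.paginate(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    )
    for page in pages:
        offerings.extend(page["InstanceTypeOfferings"])
    return zones_from_offerings(offerings, instance_type)
```

A subnet in any AZ returned by `zones_offering` is a candidate for the redeployment. Offering the type does not guarantee that capacity is available at launch time.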

## ClusterBootCondition Timeout

When a *ClusterBootCondition timeout* occurs, there was a problem creating the initial Weka system cluster. To debug this error, look in the `Backend0-syslog` log-stream (as described above). The first backend instance is responsible for creating the cluster and therefore, its log should provide the information necessary to debug this error.

## Clients Failed to Join

When the message *Clients failed to join for uniqueId: ClientN* is received while in the WaitCondition, one of the clients was unable to join the cluster. Look at the Syslog of the client specified in uniqueId as described above.

{% hint style="success" %}
**For Example:** If the error message specifies that client 3 failed to join, a message ending with `uniqueId: Client3` should be displayed. Look at the log-stream named `Client3-syslog`.
{% endhint %}
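The mapping from the error message to the log-stream name can be expressed as a small helper. This is an illustrative sketch only: the function name is hypothetical, and the pattern assumes that the message contains `uniqueId: Client<number>` as in the example above.

```python
import re
from typing import Optional

# Matches the client identifier in messages such as
# "Clients failed to join for uniqueId: Client3".
_UNIQUE_ID = re.compile(r"uniqueId:\s*(Client\d+)")


def client_log_stream(message: str) -> Optional[str]:
    """Return the log-stream name for the client named in the message, or None."""
    match = _UNIQUE_ID.search(message)
    if match is None:
        return None
    return f"{match.group(1)}-syslog"
```

The returned name is the log-stream to open under the stack's log-group.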

## Health Monitoring

You can monitor the cluster instances by checking the cluster EC2 instances in the AWS EC2 service. You can also set up [CloudWatch](https://aws.amazon.com/documentation/cloudwatch/) for external monitoring of the cluster.

The Weka system cluster GUI shows the [Weka cluster health](https://docs.weka.io/3.14/getting-started-with-weka/gui), where you can see whether any component is not functioning well and view system [alerts](https://docs.weka.io/3.14/usage/alerts), [events](https://docs.weka.io/3.14/usage/events), and [statistics](https://docs.weka.io/3.14/usage/statistics).
