When a PagerDuty incident notification wakes you up at 2 AM, the first question that runs through your mind is:
How do I troubleshoot this problem?
You and your team have practiced this before. You have a list of things to check. What if a workflow could do all of this automatically, as soon as a problem happens?
Every team that manages a complex SaaS application goes through a series of troubleshooting steps that can be automated.
Why aren't these automated already?
The challenge in automating these troubleshooting workflows is that they interact with many different services (e.g. Datadog, AppDynamics, Splunk) and require codified logic to connect those services in a unified fashion. Fylamynt is built to solve exactly this problem. Use our powerful connectors and actions to build any troubleshooting workflow that you can imagine.
A common approach is to quickly check all the containers in your EKS clusters and grab their logs for later analysis. How do you do it? Check out our EKS troubleshooting workflow in the library at <link to EKS-troubleshooting workflow>
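To make the container check concrete, here is a minimal Python sketch of what such a step might look like if you scripted it by hand. It is not the Fylamynt workflow itself. It assumes you have already captured the JSON printed by `kubectl get pods --all-namespaces -o json`; the function names, the `max_restarts` threshold, and the sample data are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; names and thresholds are assumptions, not Fylamynt's API.

def find_unhealthy_containers(pod_list, max_restarts=3):
    """Return one record per container that is not ready or restarts too often.

    pod_list is the parsed JSON from `kubectl get pods -o json`.
    """
    unhealthy = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        # Pods that ran to completion (e.g. finished Jobs) are not failures.
        if status.get("phase") == "Succeeded":
            continue
        for cs in status.get("containerStatuses", []):
            restarts = cs.get("restartCount", 0)
            if cs.get("ready") and restarts <= max_restarts:
                continue
            waiting = cs.get("state", {}).get("waiting") or {}
            unhealthy.append({
                "namespace": meta.get("namespace", "default"),
                "pod": meta.get("name"),
                "container": cs.get("name"),
                "reason": waiting.get("reason", ""),
                "restarts": restarts,
            })
    return unhealthy


def log_command(record, tail=200):
    """Build the kubectl command that would fetch recent logs for one container."""
    return [
        "kubectl", "logs", record["pod"],
        "-c", record["container"],
        "-n", record["namespace"],
        f"--tail={tail}",
    ]


# Hypothetical sample shaped like kubectl's JSON output.
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "api-7d9f", "namespace": "prod"},
            "status": {
                "phase": "Running",
                "containerStatuses": [
                    {"name": "api", "ready": True, "restartCount": 0,
                     "state": {"running": {}}},
                ],
            },
        },
        {
            "metadata": {"name": "worker-5c2a", "namespace": "prod"},
            "status": {
                "phase": "Running",
                "containerStatuses": [
                    {"name": "worker", "ready": False, "restartCount": 12,
                     "state": {"waiting": {"reason": "CrashLoopBackOff"}}},
                ],
            },
        },
    ]
}

for record in find_unhealthy_containers(SAMPLE):
    print(record, " ".join(log_command(record)))
```

Each command built by `log_command` could then be run with `subprocess.run` and its output saved for later analysis. Doing this across several clusters and tools by hand is the kind of glue code an automated workflow is meant to replace.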
OK, you have identified the problem. How do you fix it? Suppose you determined that the VMs hosting your EKS containers are not large enough for the load you are receiving. You can resize the VMs, but how do you do it safely?
Sign up and check out our Performance Management workflows in the Fylamynt Workflows library.