Skylar AI - The Future of AIOps and Observability
Part 1 - An All-Too-Familiar Crisis
Everything’s down. Frustrated customers are calling, emailing and posting on social media. You assemble a war room. News of the problem is spreading like wildfire and has just reached the CEO at the most inopportune moment (think dinner/vacation/out with the kids). Your cell phone and Slack feed are going nuts.
Meanwhile, the front-line technical teams are looking at dashboards and digging through logs, events and metrics, trying to understand what happened. They’re not making much progress, so you escalate and call in more experts. An hour later there’s still no obvious answer. You escalate it again, but this time to the development team. And, eventually, 14 very long hours later, a “rockstar” developer figures it out.
The immediate crisis is over, but there will be painful days of work ahead for you and your team dealing with disgruntled customers, driving a detailed postmortem process and then coming up with an action plan to prevent a repeat occurrence of what just happened.
Enter Skylar AI - The Future of AIOps and Observability
Now imagine a different scenario:
Skylar: “Your orders server just went down because it hit a race condition in one of the open source components. Good news, there’s a quick solution. You are running v2.3.1.2 of the component and the issue has been fixed in v2.3.1.4. Would you like me to show you how to upgrade to the fixed version?”
The goal of Skylar AI is to reason over not just telemetry, but also the stored knowledge of an organization to deliver accurate insights, recommendations, and predictions, so that:
- When something fails, it will tell you in plain language what happened and how to fix it.
- If something is going to fail, it will tell you how to prevent it from failing and impacting production.
- It will also be able to deliver insights and answer any question asked of it by drawing on telemetry together with relevant information from a company’s stored knowledge (e.g. KB articles, support tickets, bug databases, product documentation, etc.).
Gen AI is Everywhere. How is Skylar AI Different?
The Fundamental Challenge
Before explaining how Skylar AI is different, it’s important to understand the problem Skylar AI was designed to address: today’s best-of-breed AIOps, monitoring and observability tools rely too heavily on human expertise.
The situation above is a prime example. The problem had never been seen before (an “unknown/unknown”) and so there were no rules in place to catch it. This meant human experts with the right tribal knowledge were needed to figure things out. The tools provided all the information needed to ultimately solve the problem, but at each step of the way, the right human expert(s) needed to interpret and analyze the data, decide on the next course of action and keep iterating until the problem was finally solved. Ultimately, this required multiple escalations and a lot of wasted time and frustration.
The goal of Skylar AI is to make Level 1 and 2 teams more effective and productive, and able to solve a broader range of problems more quickly, so that situations like the one above can be handled without the pain and wasted time.
So Why Not Follow the Industry Trend and Build an AI Assistant?
The industry is quickly moving towards the use of Generative AI (GenAI) and large language models (LLMs) in the form of “AI Assistants” or “AI Chatbots”. These appear as small chat panels in a product’s UI and allow a user to ask questions of the tool in plain language. So, instead of learning how to construct a complex query and visualize it on a chart, with an assistant, you can simply ask, “create a dashboard showing average latency for the top 10 devices over the last 3 months”, and it will do just that.
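To make the contrast concrete, here is a minimal sketch of the kind of query a user would otherwise have to build by hand for that request. Everything in it is an assumption for illustration: the function name, the (device, timestamp, latency) sample format, the reading of "top 10" as the ten devices with the highest average latency, and "3 months" approximated as 90 days. It does not represent any actual product API.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def top_devices_by_avg_latency(samples, now, days=90, limit=10):
    """Return (device, average latency) pairs, highest average first.

    `samples` is a hypothetical iterable of (device, timestamp, latency_ms)
    tuples; a real tool would read these from its own metrics store.
    """
    cutoff = now - timedelta(days=days)  # "3 months" approximated as 90 days

    # Group the latency readings inside the time window by device.
    by_device = defaultdict(list)
    for device, timestamp, latency_ms in samples:
        if timestamp >= cutoff:
            by_device[device].append(latency_ms)

    # Average per device, then keep the `limit` highest averages.
    averages = {device: mean(values) for device, values in by_device.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]
```

Even in this small form, the user has to decide on the time window, the grouping and the ranking rule. An assistant hides that syntax, but not the decision about what to ask for.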
However, although assistants simplify the way a user interacts with a tool, it is important to understand that a skilled user still needs to know what questions to ask of the assistant at each step of the way. In other words, deep expertise is still required to get the most out of the tool.
Our Breakthrough Invention: An AI Advisor, rather than an AI Assistant
To deliver on our vision, a new paradigm was needed: an “AI Advisor.”
The concept behind Skylar Advisor is that a skilled user no longer has to know which questions to ask of the tool. Instead, the Advisor automatically answers a curated set of questions tailored to the user’s role, without the user having to ask them in the first place!
In other words, Skylar Advisor automatically tells the user what the user needs to know in the form of easy-to-understand insights, predictions, and actions to take. This allows teams of all levels to be far more effective, to avoid wasting precious time and to perform more tasks with less effort.
Creating Skylar Advisor didn’t just require building a pipeline of AI and machine learning (ML) technologies, including a self-hosted LLM; it also required a completely reimagined user experience:
- An uncluttered and intuitive UI (not a traditional dashboard/event list display)
- A curated list of what’s most important to the user, based on the user’s role and areas of responsibility
- A multi-modal interface to describe a problem
- The use of contextual “Quick Prompts,” where Skylar suggests what actions a skilled expert would take at each step of the way
To see a demonstration of Skylar Advisor, please visit here.
Make sure to read Part 2 of this blog, which explains the components of Skylar AI and how it works with SL1.