Okay, so you’ve decided it’s time to use adaptive thresholding for one or more of your service KPIs. Awesome!

Building on foundational guidance from Ensuring Success with Splunk ITSI Part 1 and Part 2, let's get deep into the best practices around adaptive thresholding. Once again, getting the right configurations in place to generate meaningful thresholds and alerts with adaptive thresholding requires both guidance and experience.

Severities Aren’t What They Used to Be

Before we get into configurations, we need to amend our definition and understanding of KPI severity. In Splunk IT Service Intelligence (ITSI), as I’m sure you recall, we have six severities: critical, high, medium, normal, low, and info. With static thresholds, we tend to associate these states with some degree of "working" or "broken," and we attempt to threshold them accurately so that a KPI is only critical when we’re pretty sure something is really broken. However, adaptive thresholds aren’t necessarily about "working" and "broken"; they are about "normal" and "abnormal." Therefore, it will help if you mentally redefine these states to something more like extremely high, abnormally high, normal, abnormally low, and extremely low. This distinction, while subtle, is very important: it will shape your mindset on threshold values and alert configurations when using adaptive thresholds.

Which Algorithm Should I Use?

For clarification, this section does not focus on which pre-configured thresholding template to use; we’ll talk about that in a moment. Here, we simply want to discuss each of the three algorithms and which might be best for your situation. We won't go deep into the mathematics of each equation because it is pretty straightforward and plenty of general information can be found online. What I want to do is call out the general behavior and practical pros and cons for each algorithm, so that you can choose the one that best suits your needs. Keep in mind that each algorithm will re-compute future threshold values every night based on historical data spanning the configured training window for that KPI.

The quantile algorithm allows you to put threshold bounds at various percentiles based on historical data. As a simple example, we might choose to set critical severity for data points falling below the 1st percentile (0.01) and above the 99th percentile (0.99). Because you must choose percentile threshold values between 0 and 1, this is the only algorithm of the three that will never produce thresholds above or below your historical data min and max values.
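To make the quantile idea concrete, here is a minimal Python sketch of percentile-based bounds computed from a window of historical KPI values. It is an illustration only, not ITSI's actual implementation: the function names, the linear interpolation rule, and the sample data are all assumptions made for this example.

```python
from typing import Dict, Sequence


def quantile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of pre-sorted data, with 0 <= q <= 1."""
    if not sorted_values:
        raise ValueError("need at least one data point")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def quantile_thresholds(history: Sequence[float], levels: Dict[str, float]) -> Dict[str, float]:
    """Map each named percentile level to a threshold value taken from the history."""
    ordered = sorted(history)
    return {name: quantile(ordered, q) for name, q in levels.items()}


# Hypothetical training window: 101 evenly spaced KPI samples from 0 to 100.
history = [float(v) for v in range(101)]

# Critical below the 1st percentile and above the 99th percentile.
bounds = quantile_thresholds(history, {"critical_low": 0.01, "critical_high": 0.99})
print(bounds)
```

Because every bound is interpolated between two observed values, it can never land outside the minimum and maximum of the training data, which is the property called out above.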