Browsed by
Month: March 2018

Dialogue: What do We Mean by Predictive Analytics?

Predictive analytics is a phrase that often gets listed as a feature in tech offerings. Predicting when a problem is likely to occur allows either a human or an automated system to take some action to mitigate a potential issue before it occurs and has potentially catastrophic effects. If we take data storage, for example, the worst thing that can happen on your storage array is the loss of data. Data loss is permanent (unless you have a backup), so when it’s gone, it’s gone.

Mitigating problems proactively involves modeling system behavior (whatever that system happens to be) according to some metric or set of metrics, then evaluating new data as it arrives and deciding if this new data is “abnormal” or indicative of a potential problem. This means there are three steps: modeling the behavior of the system or aspect of the system you are trying to monitor, determining what is normal (and by complement, abnormal), and then deciding how you will handle an abnormal data point, event, or other metric.

These three steps, while simple to articulate, are three very difficult and distinct problems, each involving a whole host of interrelated considerations. For instance:

  • How do we decide which metrics to use in our modeling? 
  • How do we verify that this model is an accurate representation of system behavior?
  • Once we have a model, how do we define unusual or anomalous behavior?
  • Once we define anomalous behavior, how do we decide our courses of action? Do we act on any “weird” point that crosses some threshold, or should we see the threshold crossed repeatedly, or something else?
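
To make the three steps concrete, here is a minimal sketch in Python. It is only one of many possible approaches, and everything in it is an assumption of mine for illustration: the baseline is just a mean and standard deviation, “abnormal” means a z-score above a threshold, and the action rule waits for several abnormal readings in a row. The function names and the latency numbers are hypothetical.

```python
from statistics import mean, stdev

def fit_baseline(history):
    """Step 1: model 'normal' as the mean and sample standard deviation of past readings."""
    return mean(history), stdev(history)

def is_abnormal(value, baseline, threshold=3.0):
    """Step 2: call a reading abnormal if its z-score exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any deviation at all counts as abnormal.
        return value != mu
    return abs(value - mu) / sigma > threshold

def should_alert(readings, baseline, threshold=3.0, consecutive=3):
    """Step 3: act only once `consecutive` abnormal readings arrive in a row."""
    run = 0
    for value in readings:
        run = run + 1 if is_abnormal(value, baseline, threshold) else 0
        if run >= consecutive:
            return True
    return False

# Hypothetical latency readings in milliseconds (not real data).
history = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0, 10.0, 11.0, 9.0, 10.0]
baseline = fit_baseline(history)

single_spike = [10.0, 30.0, 10.0, 11.0]   # one weird point, then back to normal
sustained = [10.0, 30.0, 31.0, 29.0]      # three weird points in a row

print(should_alert(single_spike, baseline))
print(should_alert(sustained, baseline))
```

With these made-up numbers, the single spike does not trigger an alert while the sustained run does. Whether that is the right behavior is exactly the kind of question raised in the last bullet: a rule that waits for repeated crossings ignores one-off noise, but it also reacts more slowly to a real problem.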

Proactively mitigating system issues is a well-justified desire of many companies, because it increases system reliability. I watched Starwind present their Virtual Tape Library at Storage Field Day 15, and they, like many other companies, strive to create a way to detect impending failure patterns and take preventative measures before a catastrophic failure. The presentation was only two hours long and covered their entire architecture, not just the specific feature regarding failure pattern detection, so we were unable to take the time to discuss the specifics of Starwind’s Proactive Support, as they call it.

Detecting any kind of pattern in data is difficult, especially a failure pattern. There are always tradeoffs. If we set our tolerance for what we consider “normal behavior” to be too low, we risk alerting on potential issues too often. When this happens, alerts get ignored, and real problems are assumed to be just another “false alarm.” On the other hand, if we set the tolerance for what we consider normal too high, we run the risk of not detecting an issue at all. 
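
The tradeoff can be illustrated with a small Python sketch. The readings, the failure labels, and the assumed baseline (mean 10, standard deviation 1) are all invented for illustration, and the z-score threshold stands in for the “tolerance” discussed above; this is not how any particular vendor does it.

```python
def count_errors(readings, labels, mu, sigma, threshold):
    """Return (false_alarms, missed_failures) for a given z-score threshold."""
    false_alarms = 0
    missed = 0
    for value, is_failure in zip(readings, labels):
        flagged = abs(value - mu) / sigma > threshold
        if flagged and not is_failure:
            false_alarms += 1   # we alerted, but nothing was wrong
        elif is_failure and not flagged:
            missed += 1         # something was wrong, and we stayed quiet
    return false_alarms, missed

# Hypothetical baseline and readings; True marks a reading that preceded a real failure.
mu, sigma = 10.0, 1.0
readings = [10.0, 11.5, 9.0, 12.5, 13.5, 10.5, 8.0, 16.0]
labels = [False, False, False, True, False, False, False, True]

for threshold in (1.0, 3.0, 5.0):
    print(threshold, count_errors(readings, labels, mu, sigma, threshold))
```

On this toy data, the tightest tolerance (1.0) catches both failures but raises three false alarms, the loosest (5.0) raises none but misses a failure, and the middle setting (3.0) still gets one of each. No single threshold is clean here, which hints that the threshold may not be the only thing worth questioning.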

At this point, I’d like to open dialogue in the comments, particularly because these subjects are deep; in many cases so deep that an entire two-hour presentation can be devoted to just a few aspects of this very large challenge. How do we balance these tradeoffs? How do we decide whether an unusual data point or set of points is really something bad? If we are generating too many “false alarms,” is it possible that our original behavior model is off?

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Commentary: White Papers Don’t Impress Me Much

I spent the last week at an event called Tech Field Day (my second time). In a nutshell, it’s a traveling panel of 12-15 delegates who are generally IT professionals (and me) that visits 8-10 companies over three days to hear various presentations on their technology. Sometimes it’s storage tech, sometimes networking, or cloud, or a mixture of all sorts of things. The common thread, in theory, is that these presentations are supposed to be “deep dives”, to use an industry buzzword. The delegates around the table are all highly proficient in their fields, and are expected to ask questions to drill into claims made and get more details about various IT architectures presented. In my case, I am obviously interested in uncovering the interesting mathematics behind various enterprise technologies. From erasure coding to graph theory to the statistics underneath the vague “analytics” every company claims to do, my interest lies in discussing how they’re employing mathematics to make their tech better or drive business decisions.

Typically, most companies release white papers that claim to detail their architecture (or math, as one claimed). In reality, and with rare exception (Datrium actually comes to mind here), they’re little more than five to seven pages of marketing-style technical claims with no citations or justification. As an overview, I understand keeping the lengths shorter, but references to more detailed publications and reports should be available when making certain claims. Therefore, as part of the Tech Field Day panel, I felt a responsibility to press the presenters on some of these claims, earnestly hoping for more details. My thought was that they were putting out a “teaser”, so to speak, and just waiting excitedly for someone interested to ask about technology they built and are proud of. For the most part, my initial thought was wrong. From dismissing my questions to hiding behind the curtain of “secret sauces” and “proprietary” code, the responses left me disappointed.

My frustration can be traced to the very Silicon Valley-style idea that flashy marketing must pervade everything, which blurs opinion and fact. White papers, which should contain technical details and references, become little more than press releases disguised as objective reports. I debated how to really articulate my opinion, and decided to do something a bit out of character for my typical article. With apologies to Shania Twain, I present my version of the song “That Don’t Impress Me Much”:

That Don’t Impress Me Much (Tech Edition)

I’ve noticed in tech they think they’re pretty smart
They’ve clearly got their marketing down to an art.
The white papers are “genius”; it drives me up a wall
There’s nothing original, not at all

Oh-oo-oh, you think you’re special
Oh-oo-oh you think you’re something else

Okay, so the erasure coding’s novel
That don’t impress me much
So you made the claim, but have you got the proof?
Don’t get me wrong, yea, I think you’re all right
But that won’t give me inspiration in the night
That don’t impress me much.

Every white paper says they’re the best on the market
“Independently verified”—just in case
Writing uncited claims, publishing as fact (I want to vomit)
Cause we all know tech’s really a private arms race

Oh-oo-oh, you think you’re special
Oh-oo-oh you think you’re something else

Okay, so it’s “secret sauce”
That don’t impress me much
So you got some code, but have you got some proof?
Don’t get me wrong, yea, I think you’re all right
But that won’t give me inspiration in the night
That don’t impress me much.

So you’re one of those firms using learning machines
But you’ve no earthly clue what’s going on underneath
I can’t believe you think that it’s all right
Come on baby tell me, you must be joking right?

Oh-oo-oh, you think you’re special
Oh-oo-oh you think you’re something else

Okay, so you’ve got analytics
That don’t impress me much
So you can “predict” but have you got some proof?
Don’t get me wrong, yea, I think you’re all right
But that won’t give me inspiration in the night
That don’t impress me much.


Isomorphisms: Making Mathematics More Convenient

Much of pure mathematics exists to simplify our world, even if it means entering an abstract realm (or creating one) to do it. The isomorphism is one of the most powerful tools for discovering structural similarities between two groups that on the surface look completely unrelated, or for showing that the two are structurally identical. In this post, we’ll look at what an isomorphism is.
