Woah, what a title…
What I want to talk about isn’t as complicated as it sounds. Basically, when you collect daily site-specific metrics for SEO, you sometimes get bad data from your third-party sources.
These bad data points can skew your graphs and make it almost impossible to visually derive any useful information from them.
For example:
The graph above shows the number of pages over a 6-month date range. In early January the data spikes dramatically upward. This is clearly an anomaly, since the very next day the data is back in its normal range. The trouble is that this one anomalous point stretches the scale of the whole graph, so the normal day-to-day variation gets flattened and the graph is rendered useless.
Here is the same graph WITHOUT that one bad piece of data:
It’s amazing how one bad data point can skew your entire graph, isn’t it? Now we have an image that is useful!
Here’s some (hackish & sketchy) code to identify these statistical anomalies and replace them:
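As a minimal sketch of the idea in Python (not necessarily the exact approach used for the graphs above): compare each day's value with the median of the days around it, and if it sits too far away, swap in that local median. The function name `replace_anomalies` and the parameters `window`, `threshold` and `min_scale` are illustrative assumptions, and the defaults are guesses you would want to tune against your own metrics.

```python
from statistics import median


def replace_anomalies(values, window=7, threshold=5.0, min_scale=1.0):
    """Return a copy of `values` with isolated spikes replaced.

    A point counts as anomalous when it is more than `threshold` times the
    median absolute deviation (MAD) of its neighbours away from their median.
    `min_scale` is an assumed floor so perfectly flat data does not flag
    every tiny change.
    """
    cleaned = list(values)
    for i, value in enumerate(values):
        start = max(0, i - window)
        end = min(len(values), i + window + 1)
        # Neighbours on both sides, excluding the point being tested.
        neighbours = [values[j] for j in range(start, end) if j != i]
        if not neighbours:
            continue
        local_median = median(neighbours)
        mad = median([abs(v - local_median) for v in neighbours])
        scale = max(mad, min_scale)
        if abs(value - local_median) > threshold * scale:
            # Replace the bad point with the typical nearby value.
            cleaned[i] = local_median
    return cleaned


# Made-up daily page counts with one bad reading in the middle.
daily_pages = [100, 102, 101, 99, 5000, 100, 103, 98, 101, 100]
cleaned_pages = replace_anomalies(daily_pages, window=3)
```

Using the median rather than the mean matters here, because the spike itself would drag a mean (and a standard deviation) upward and partly hide the very point you are trying to catch.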


