
So, today I learned that Amazon S3 limits the number of buckets you are allowed to have at any one time. This was a HUGE pain in the ass for me, as I just rewrote my S3 caching libraries to separate my caching into daily buckets.

The idea I had was that I would keep 30 days' worth of buckets before automatically shuffling all those buckets into long-term archival at the end of each month.
You see, Amazon S3 doesn’t allow for folders within buckets (not technically, I’ll get to that later), and I think it’s bloody ridiculous for me to store the ~15 million XML cache files I need daily access to in just one massive bucket. So, I thought my plan to create daily buckets was pretty damn good. Apparently Amazon disagrees, as they have set a limit on the number of buckets I can have associated with my account (FYI, I have ~90 buckets right now and that’s where the limit is).
So, I’ve been looking into my options, and I have decided that I’m going to have to go back to my original setup of storing all 15 million (and growing) XML files in one bucket. As some of you may know, S3 GUIs like S3Fox are able to show sub-folders within buckets, so I figure I will go this route. My first step was to take a crack at the Ruby S3 library from Amazon (not the AWS gem, which is crap for multi-threaded environments) to see how I can create folders within buckets. Turns out, you can’t. At least, not out of the box. You see, Amazon S3 doesn’t actually SUPPORT sub-folders within buckets.
Stupid, right? I know.
So, how does S3Fox do it? It turns out they create virtualized folders by creating a special object that acts as a folder, then you access your stored objects by prefixing the actual file key with that folder's name. For a directory named “foo”, you would create an object with the key “foo_$folder$”. Then, to get a directory listing of all files stored under the foo path, you just query S3 for objects with keys that start with “foo/”, and you ignore any objects that end with “_$folder$”.
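Here's a rough sketch of that convention in plain Ruby. It makes no S3 calls at all: the "bucket" is just an array of key strings, and the helper names (folder_marker, file_key, list_folder) are made up for illustration, not part of any S3 library. It also assumes files under a folder get keys shaped like "foo/name".

```ruby
# Suffix that marks a placeholder object as a "folder".
# Single quotes so Ruby doesn't try to treat $folder as a variable.
FOLDER_SUFFIX = '_$folder$'

# Key of the placeholder object that stands in for a virtual folder.
def folder_marker(folder)
  "#{folder}#{FOLDER_SUFFIX}"
end

# Key for a file stored "inside" a virtual folder (assumed "folder/name" shape).
def file_key(folder, name)
  "#{folder}/#{name}"
end

# Emulates a prefix listing: keep keys under the folder,
# drop any placeholder objects that only exist to mark folders.
def list_folder(keys, folder)
  prefix = "#{folder}/"
  keys.select { |k| k.start_with?(prefix) && !k.end_with?(FOLDER_SUFFIX) }
end

# Hypothetical bucket contents: two virtual folders, three cache files.
keys = [
  folder_marker('foo'),
  file_key('foo', 'a.xml'),
  file_key('foo', 'b.xml'),
  folder_marker('bar'),
  file_key('bar', 'c.xml')
]

puts list_folder(keys, 'foo')
```

Against real S3 the select would be replaced by a prefix query on the bucket, but the filtering of “_$folder$” keys would stay the same.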
I’m about to waste my day setting this up, and I’m none too happy about it. It seems like a hackish and shitty workaround for an obvious service flaw. I’m sure there is some technically reasonable answer for why Amazon has set an arbitrary limit for buckets and also why they don’t allow you to create folders, but I don’t know what it is. Anyone have any answers?
Hat tip to the Dead Programmer Society for the code to create virtualized folders in S3.
UPDATE: I’ve had to remove the presentation due to a DMCA complaint. *sigh*
I wanted to share some interesting tidbits with you all from a recent Ipsos Reid presentation (June 18th) on Social Network Marketing.
This presentation was made to BC Hydro, the major power authority in British Columbia (that’s in Canada, dummy).
The average Canadian:
- 5.4 hrs/week spent on Social network sites, about 1/3 of total Online time.
- LinkedIn captures 70% of the share of Social Networking in the workplace.
- The stuff teens do online is very different from adults, and VERY limited: socializing, music, and gaming.
- 88% of Adults use the internet to visit a news or informational website, versus only 44% of teens.
- 68% of Adults have clicked a website advertisement versus 28% of teens.
- 70% of teens are weekly social network users, versus 36% of adults.
Highlights:
- Advertising budgets are not aligned to where people are spending their time – online expenditure lags significantly behind traditional mediums
- 58% of Canadians receive 51+ unsolicited commercial emails a week (17% don’t even know)
- 64% of people don’t open ANY unsolicited emails
- Of the people that do open unsolicited emails, 53% do so out of curiosity, 28% because they thought it was legitimate
- Permission-based (opt-in) emails are still one of the best ways to market to your website's audience. Be diligent about collecting email addresses
- 77% of Canadians are registered to receive some kind of opt-in email (2008 average number of sites registered with: 15.3)
- Top 3 reasons for opting-in: Personal Interest (42%), Entertainment (38%), News and Information (32%)
- It’s worth noting that E-Commerce and Retail is the 7th most common reason (27%)
- Activities resulting from opting-in to an email list:
  - 60% entered an advertiser’s contest
  - 52% visited the advertiser’s website
  - 17% purchased or received products at a later date
- 68% of Canadians are willing to provide their email address, depending on the reason given
I’ve uploaded the full presentation for you all, which can be viewed here:
My good friends over at The Search Agency sent me some interesting data yesterday.
Frank Lee, Senior VP of Client Services, and his team found that their paid search clients are seeing dramatically improved results on Bing compared to Live Search. Even though impressions in the first 3 weeks of Bing have dropped 22% compared to the last 3 weeks of Live Search, Microsoft now seems to be serving far more relevant ads on each keyword.
The result for The Search Agency’s advertisers:
Click Through Rate up 15%
Conversions up 6%
Conversion rate up 18%
Cost per acquisition down 3%
Until now, all the industry reports on Bing have covered the increase in search volume, share of search, or total clicks. Their data is the first analysis of Bing’s impact on paid search advertising performance. With this growth in search volume, along with these improved metrics, advertisers might want to consider shifting more of their paid search spend to Bing.
You can read Frank’s full post on their blog, The Search Agents.
I suggest you subscribe to their blog, as these guys really know what they’re talking about.
While many agencies seem to base marketing strategy on the sole opinion of some conference-hopping SEO ‘guru’, The Search Agency always surprises me with their dedication to gathering real, actionable & measurable market data. They are one of those rare Search Marketing agencies that build strategies based on real, hard-earned market research, not a magic eight-ball.
One of my in-development projects was triggering a ton of “Mysql::Error: Lost connection” errors recently, and I couldn’t figure out the root cause of it.
There are many different reasons why this error can occur, and in my case, it was a reason that no one else has written about (that I could find), so I figured I’d post up the cause of my problem (and the solution).
This application was on a shared development server with a few other applications. They all used the same MySQL server. One of my other applications had corruption in a few of its tables, and whenever this application was called, it would crash the MySQL server, triggering a “Mysql::Error: Lost connection” for all my other applications connected to the MySQL server at that same time.
I tracked down this issue by checking the MySQL logs, and then I simply reloaded last night’s backup of the corrupt database, and everything was fine! It just goes to show you how important it is to always perform regular backups of your databases.
I hope this helps someone!
OK, I’ll admit, my last post was downright shameful…
The code was poorly presented, and poorly formatted. It wasn’t at ALL my normal caliber of work, so I went ahead and rewrote the concept into a tighter little wrapper.
I am using the Gattica gem to interact with the Google Analytics API.
This little script provides a wrapper around the gem that easily allows you to extract your site’s organic keywords and sort them based on the number of unique entrances to your site via each keyword.