As part of Pulse’s recent announcement of crossing the 11M user mark (up 10x since last year!), we’ve written a set of blog posts to share how we’ve scaled our backend infrastructure to keep up with our new users and support some powerful new features. Here’s a quick recap of our systems on both Amazon Web Services (AWS) and Google App Engine (GAE), along with links to the detailed posts describing each.
Livecount
Livecount is an implementation of real-time counters that leverages the performance of memcache and task queues on Google AppEngine.
Building a solid analytics platform is often a combination of real-time and batch processing. Batch processing, with a tool like Hadoop, is great for digging into large amounts of past data and asking questions that cannot be anticipated. In contrast, when it is known ahead of time that certain aggregates will be required, the best solution is usually to count each event as it happens. Livecount makes it easier to address this second use-case.
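To make the "count each event as it happens" idea concrete, here is a minimal sketch of a write-behind counter. It is not Livecount's actual API: the class name `WriteBehindCounter` and its methods are hypothetical, and plain Python containers stand in for memcache, the task queue, and the datastore so the example runs anywhere.

```python
from collections import defaultdict, deque


class WriteBehindCounter:
    """Toy real-time counter: fast in-memory increments, deferred persistence.

    Assumption: the dicts and deque below are stand-ins for memcache,
    a task queue, and a datastore. None of this is the Livecount API.
    """

    def __init__(self):
        self.cache = defaultdict(int)   # stands in for memcache
        self.dirty = set()              # names with unflushed increments
        self.queue = deque()            # stands in for a task queue
        self.store = defaultdict(int)   # stands in for durable storage

    def increment(self, name, delta=1):
        """Count an event immediately; enqueue one write-back per dirty name."""
        self.cache[name] += delta
        if name not in self.dirty:
            self.dirty.add(name)
            self.queue.append(name)

    def run_tasks(self):
        """Drain the queue, copying each cached total to the durable store."""
        while self.queue:
            name = self.queue.popleft()
            self.dirty.discard(name)
            self.store[name] = self.cache[name]

    def count(self, name):
        """Read the live value, falling back to the durable store."""
        if name in self.cache:
            return self.cache[name]
        return self.store[name]


counter = WriteBehindCounter()
for _ in range(3):
    counter.increment("story_read")
counter.increment("story_shared")

live_before_flush = counter.count("story_read")
pending_tasks = len(counter.queue)
counter.run_tasks()
print(live_before_flush, pending_tasks, dict(counter.store))
```

The point of the design is that the aggregate is readable the moment an event is counted, while the slower durable write happens later and only once per counter, however many increments arrived in between.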
New Eng Blog / Using Data Analysis to Discover Top Stories
In addition to the Pulse Blog, where we regularly share updates about our latest features and new content, Pulse now has an Engineering Blog! The goal is to share some of the exciting engineering work that goes into bringing users the Pulse experience they’ve come to expect. To kick things off, I added a post about Using Data Analysis to Discover Top Stories. In the post I share a bit about how we use AWS to collect and analyse our data, along with how we serve up the feeds we build via AppEngine. Check it out!
Map(Reduce) Analytics on Google AppEngine
Google AppEngine is a great tool for building simple web applications that scale automatically. All of the basic building blocks are readily available and accessible from both Python and Java. This includes a database, a caching layer, and support for background tasks.
What about the big data analytics and informatics that made Google famous? Does AppEngine help us there as well? The answer is yes, although with some serious limitations.
The End of Paper-based Bills
Question / Pain #1: Why must bills come in the mail and be paid by snail-mail check (or sometimes over the phone during business hours)?
Question / Pain #2: Why must the bill payment workflow be manual, resulting in wasted time and forgotten bills?
Answer: Legacy systems & dinosaur companies
Use Case: Bill from medical lab arrives via snail-mail 6 weeks after appointment. No further reminders. Did the customer get/pay the bill?
Solution Idea: