Lessons Learned in Engineering & Management

Posted November 11th, 2013 in Observations, Pulse by Greg Bayer
                     

This post is also available on LinkedIn’s new Professional Publishing Platform.

In September 2010, I joined Pulse as the 2nd engineer with the goal of building a backend for what was a small but already very successful mobile app. As Pulse grew, regularly doubling in both user base and team size, we were repeatedly presented with new challenges.

Pulse Team

The challenges that arose in this environment were extremely varied. We tackled scalable distributed systems architecture from the beginning, without allowing ourselves time to slow down. Having launched out of the Stanford d.school, Pulse’s culture was built around an unusual approach. Every team member was encouraged to take part in user experience research and product design. Both our engineering and our management techniques were constantly tested as we grew, and we were forced to improvise and iterate regularly.

Having had this amazing opportunity to learn from many mistakes, as well as from an extraordinary set of peers and mentors, it’s interesting to look back at some of the lessons.

Engineering

Don’t over-engineer. One of the most important lessons I’ve learned at Pulse is to keep it simple. The simpler, the better. This doesn’t mean skipping system architecture/design and diving directly into the code. It’s always worth putting some thought into the architecture of whatever system you are building. Sometimes it can take quite a bit of thought to find a simple solution. But it does mean that if it seems too complicated, it probably is.

When building any system, even a backend platform, start with the minimum viable product and then test (with users) to see if you need more. Repeat this process as many times as necessary, but you’ll be surprised how often the simplest solution is actually what you want. Next steps often take a new direction that couldn’t have been anticipated without testing with users. At Pulse, we never built a system that was too simple. But there were definitely a few that were over-engineered.

Focus on building the core product. In the early days at Pulse we outsourced as much as we could out of necessity (infrastructure, ops, secondary features, frameworks). In the long run, we benefited greatly from this approach, especially outsourcing most of our infrastructure to Google App Engine and Amazon Web Services. With very little infrastructure ops work and almost no related emergencies to worry about, the team was able to stay focused on the product, even as our systems were required to scale much faster than expected.

Pulse 2011

Similarly, using open source frameworks and libraries was essential to staying focused on the value provided to users. Some of our biggest engineering opportunities came from leveraging recent, freely available engineering advancements to make it possible to build something that used to be hard in a simple way.

For more detail on Pulse’s architecture and engineering approach, see Scaling with the Kindle Fire, Scaling to 10M on AWS, and the Pulse Engineering Blog.

Product

Create something unique. Pulse was created by Akshay and Ankit as a Stanford class project. It was simple and solved a very personal pain. It was also well-timed with the release of the first iPad. From the beginning, Pulse tried to differentiate itself by making mobile content reading a delightful experience.

Pulse-on-Kindle-Fire

Competing with slow mobile web pages with tons of distracting chrome around the content, Pulse put the content first. It provided a fast, native experience that worked seamlessly even if the device was disconnected from the internet. It also allowed users to access content very efficiently, without the stressful, inbox-like experience common to RSS readers at the time. Each of these features was relatively simple, but together they created a product that stood out in the market.

Set big goals. Since speed is critical in startups (and in most tech companies), it’s important to focus on approaches that have a chance of getting you to your long term goals and skip ones that don’t. As Pulse grew, we set our sights on building a content platform, not just an app. We wanted users to be able to access the Pulse experience on any device and to connect users with the best content available anywhere.

To that end, we built critical features like user accounts, cross-device syncing, and a rich, personalized catalog where users could discover thousands of vetted content sources. We built a platform that allowed users to share and discuss content with each other. We worked hard to build a brand that people recognized for quality and that both business partners and users loved being associated with. We also said no to many small features that would have distracted us from our vision and to other potentially promising directions like building a shopping app on top of the Pulse experience.

Team / Project Management

Balance quality / speed / features. From the beginning, Pulse had an extremely high quality bar. At the same time, we prided ourselves on a very short release cycle. Since those two goals are often at odds, testing features early and “failing fast” was critical. Together with minimizing features, we worked hard to maximize both quality and speed by building an amazing team and keeping them happy and efficient.

Pulse Team

Pulse’s approach was to maximize team member ownership and build a positive, open culture, where communication was rarely an issue. Traditions like Show & Tell and I like / I wish / I wonder, along with the right office space design, played a critical role in maintaining this culture.

Collaboration / Mentoring

Make sure team members feel ownership. This was a central tenet of our management style at Pulse. We tried to ensure that each team member felt ownership for their tasks, all the way from technical implementation to user experience. The team was encouraged to take ownership in the product direction, system architecture, and the team’s overall success.

We hired new team members who appreciated this approach and regularly communicated ownership by assigning a Directly Responsible Individual (DRI) for every feature. The team was always intimately involved with the product direction through user testing, ideation and prototyping. We encouraged everyone to learn the Stanford d.school’s design thinking process. If someone didn’t feel ownership of the team’s goals, it was very important to find a way to restore this feeling through open communication.

Not everyone is the same. At Pulse we always made time for 1-on-1s and learning more about each other. One of the first painful lessons I learned in my early pre-Pulse years leading engineering teams was to stop assuming everyone was like me. I had to learn to be aware of my own and other people’s social styles. This may sound easy, but there’s a lot to learn. A good social styles class was well worth the investment. Recognizing other social styles is just the beginning. The real challenge is being able to communicate effectively with all types of people and learning how to help others do the same.

Really listen when talking with someone. Beyond awareness of different social styles, it’s important to really listen and to try to see things from the other person’s point of view. If you can’t, ask more questions. Stop thinking about what you’re going to say next. It seems so simple, but it took me a long time to realize that I wasn’t always doing this.

When you clear your mind, listen fully, and ask questions instead of ‘telling’, that’s when people open up. Only after really hearing someone can you know how to help and guide them. Walking or eating during a 1-on-1 can also help people open up and feel more comfortable.

Give more than you take. As a young team, outside relationships with mentors and allies were critical to our success. One of the things I like most about Silicon Valley is the culture of helping others without asking for anything in return. Every time I reached out for help, I found an overwhelmingly positive response.

I still reach out regularly, but I also make time to help others. It’s a virtuous cycle. Help someone and you will start to build a relationship. Care about them and their challenges. Build a team of allies.

Full Pulse Team

As those on the Pulse team know, we’re all still working on these lessons. We continue to make mistakes and hopefully learn from them. As someone who loves learning and growing, I’m thankful for the opportunity to be faced with such a steady stream of challenges.

Have you faced similar challenges? Do you have a lesson that could help others in your position? I invite you to share and discuss in the comments below.

 

Learn iOS Faster

Posted September 27th, 2013 in Development, Opportunities by Greg Bayer
                     

iOS is one of the most sought after skills in the software industry. More importantly, it’s a lot of fun to work on a major app with the potential to impact millions of users, and it can be even more rewarding to launch your own.

I started learning iOS several years ago, as many developers do, by diving into Xcode. If you have experience in other languages, it’s easy to work off of a few examples and look up anything else via Google. Right?

Unfortunately, what I got was a mess of an app that works but is very difficult to maintain and iterate on. In retrospect, it’s a good idea to learn the fundamentals and best practices first.

Top iOS developers (like Ankit Gupta) suggest starting with the Stanford iOS class. The course content is well structured and easy to follow. It’s also available completely free on iTunes U. Just open iTunes, navigate to iTunes U, and search for the Stanford class listed as Coding Together – Developing Apps for iPhone and iPad (Winter 2013). You’ll find an excellent recording of Stanford’s CS193p along with lecture slides, assignments, etc.

cs193p

All of the lectures contain worthwhile content, but even watching the first few will help you do things the right way the first time.

Tip: If the lectures are too slow for you, you can speed them up. After you download a lecture in iTunes U (you can do this for all lectures ahead of time if you want to watch offline), control-click it and select Show in Finder. Then control-click the file in Finder and select Open with QuickTime Player. From here you can watch the lecture in fast forward!

Most people will probably find the 2x speed to be a bit fast, so increase the speed in 10% increments by option-clicking on the fast forward button. I find that 1.7x works well if I’m giving the lecture my full attention.

Updates from Google IO 2013

Posted May 21st, 2013 in Development by Greg Bayer
                     

I’m happy to see that Google has been continuing to invest in App Engine and the broader Google Cloud platform. At this year’s Google IO there were many exciting new feature announcements from across Google. There was also a strong set of new features announced for Google Cloud. Here are some of the highlights from my perspective. To see how things have evolved, check out my wish list from Google IO 2012.

Google-Cloud-Platform

App Engine

Here’s the full list of production and experimental features for App Engine.

Google Compute Engine

 

App Engine Datastore: How to Efficiently Export Your Data

Posted November 8th, 2012 in Big Data, Development by Greg Bayer
                     

While Google App Engine has many strengths, as with all platforms, there are some challenges to be aware of. Over the last two years, one of our biggest challenges at Pulse has been how difficult it can be to export large amounts of data for migration, backup, and integration with other systems. While there are several options and tools, so far none have been feasible for large datasets (10GB+).

Since we have many TBs of data in Datastore, we’ve been actively looking for a solution to this for some time. I’m excited to share a very effective approach based on Google Cloud Storage and Datastore Backups, along with a method for converting the data to other formats!

Existing Options For Data Export

These options have been around for some time. They are often promoted as making it easy to access datastore data, but the reality can be very different when dealing with big data.

  1. Using the Remote API Bulk Loader. Although convenient, this official tool only works well for smaller datasets. Large datasets can easily take 24 hours to download and often fail without explanation. This tool has pretty much remained the same (without any further development) since App Engine’s early days. All official Google instructions point to this approach.
  2. Writing a map reduce job to push the data to another server. This approach can be painfully manual and often requires significant infrastructure elsewhere (e.g. on AWS).
  3. Using the Remote API directly or writing a handler to access datastore entities one query at a time, you can run a parallelizable script or map reduce job to pull the data to where you need it. Unfortunately this has the same issues as #2.

A New Approach – Export Data via Google Cloud Storage

The recent introduction of Google Cloud Storage has finally made exporting large datasets out of Google App Engine’s datastore possible and fairly easy. The setup steps are annoying, but thankfully it’s mostly a one-time cost. Here’s how it works.

One-time setup

  • Create a new task queue in your App Engine app called ‘backups’ with the maximum 500/s rate limit (optional).
  • Sign up for a Google Cloud Storage account with billing enabled. Download and configure gsutil for your account.
  • Create a bucket for your data in Google Cloud Storage. You can use the online browser to do this. Note: There’s an unresolved bug that causes backups to buckets with underscores to fail.
  • Use gsutil to set the ACL and default ACL for that bucket to include your app’s service account email address with WRITE and FULL_CONTROL respectively.

 Steps to export data

  • Navigate to the datastore admin tab in the App Engine console for your app. Click the checkbox next to the Entity Kinds you want to export, and push the Backup button.
  • Select your ‘backups’ queue (optional) and Google Cloud Storage as the destination. Enter the bucket name as /gs/your_bucket_name/your_path.
  • A map reduce job with 256 shards will be run to copy your data. It should be quite fast (see below).

Steps to download data

  • On the machine where you want the data, run the following command. Optionally you can include the -m flag before cp to enable multi-threaded downloads.
gsutil cp -R gs://your_bucket_name/your_path /local_target
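The export step names the destination in the App Engine style, /gs/your_bucket_name/your_path, while gsutil addresses buckets with gs:// URLs. Here is a minimal sketch of a helper that translates one into the other and assembles the download command. The function names are hypothetical and are not part of gsutil; the sketch only builds the argument list and does not run anything.

```python
def to_gs_url(appengine_path):
    """Convert an App Engine style '/gs/bucket/path' into a 'gs://bucket/path' URL."""
    prefix = "/gs/"
    if not appengine_path.startswith(prefix):
        raise ValueError("expected a path starting with /gs/: %r" % appengine_path)
    return "gs://" + appengine_path[len(prefix):]


def build_download_command(appengine_path, local_target, parallel=True):
    """Return the gsutil argument list for a recursive copy to local_target."""
    command = ["gsutil"]
    if parallel:
        # -m is a top-level gsutil option, so it goes before the cp subcommand.
        command.append("-m")
    command += ["cp", "-R", to_gs_url(appengine_path), local_target]
    return command


if __name__ == "__main__":
    # Print the command instead of executing it.
    print(" ".join(build_download_command("/gs/your_bucket_name/your_path", "/local_target")))
```

To actually run the download, the returned list could be handed to subprocess.check_call on a machine where gsutil is installed and configured.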

 

Reading Your Data

Unfortunately, even though you now have an efficient way to export data, this approach doesn’t include a built-in way to convert your data to common formats like CSV or JSON. If you stop here, you’re basically stuck using this data only to backup/restore App Engine. While that is useful, there are many other use-cases we have for exporting data at Pulse. So how do we read the data? It turns out there’s an undocumented, but relatively simple way of converting Google’s LevelDB-formatted backup files into simple Python dictionaries matching the structure of your original datastore entities. Here’s a Python snippet to get you started.

# Make sure the App Engine SDK is available
#import sys
#sys.path.append('/usr/local/google_appengine')
from google.appengine.api.files import records
from google.appengine.datastore import entity_pb
from google.appengine.api import datastore

# Backup files are binary, so open them in binary mode
with open('path_to_a_datastore_output_file', 'rb') as raw:
    reader = records.RecordsReader(raw)
    for record in reader:
        # Each record is a serialized entity protocol buffer
        entity_proto = entity_pb.EntityProto(contents=record)
        entity = datastore.Entity.FromPb(entity_proto)
        # entity is now available as a dictionary!

Note: If you use this approach to read all files in an output directory, you may get a ProtocolBufferDecodeError exception for the first record. It should be safe to ignore that error and continue reading the rest of the records.
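To go from dictionaries to a common format like JSON while tolerating the occasional undecodable record, something along these lines may help. This is a sketch under assumptions, not part of the App Engine SDK: decode_records and entity_to_json are hypothetical helper names, the decode argument stands in for the EntityProto / FromPb step shown above, and default=str is a blunt fallback for values that JSON cannot encode natively (such as datetimes).

```python
import json
from datetime import datetime


def decode_records(records, decode, skip_errors=(Exception,)):
    """Yield decode(record) for each record, skipping records that fail to decode."""
    for index, record in enumerate(records):
        try:
            yield decode(record)
        except skip_errors as exc:
            # Report the bad record and keep reading the rest.
            print("skipping record %d: %s" % (index, exc))


def entity_to_json(entity):
    """Serialize one dict-like entity to a single JSON line."""
    return json.dumps(dict(entity), default=str, sort_keys=True)


if __name__ == "__main__":
    # fake_decode is a stand-in for the real protocol buffer decoding step.
    def fake_decode(record):
        if record == b"bad":
            raise ValueError("cannot decode")
        return {"name": record.decode("ascii"), "seen": datetime(2012, 11, 8)}

    for entity in decode_records([b"bad", b"a"], fake_decode, skip_errors=(ValueError,)):
        print(entity_to_json(entity))
```

With the real SDK, the exception type passed as skip_errors would be the decode error mentioned in the note, and each JSON line could be appended to an output file.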

Performance Comparison

Remote API Bulk Loader

  • 10GB / 10 hours ~ 291KB/s
  • 100GB – never finishes!

Backup to Google Cloud Storage + Download with gsutil

  • 10GB / 10 mins + 10 mins ~ 8.5MB/s
  • 100GB / 35 mins + 100 mins ~ 12.6MB/s
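The throughput figures above can be reproduced with a few lines of arithmetic. This sketch assumes binary units (1 GB = 1024 MB, 1 MB = 1024 KB), which is the convention that matches the numbers listed; the function name is just for illustration.

```python
def throughput_mb_per_s(size_gb, minutes):
    """Average throughput in MB/s, treating 1 GB as 1024 MB."""
    return size_gb * 1024.0 / (minutes * 60.0)


# (label, size in GB, total minutes including backup and download time)
runs = [
    ("Bulk Loader, 10GB", 10, 10 * 60),
    ("Cloud Storage, 10GB", 10, 10 + 10),
    ("Cloud Storage, 100GB", 100, 35 + 100),
]

for label, size_gb, minutes in runs:
    print("%s: %.2f MB/s" % (label, throughput_mb_per_s(size_gb, minutes)))
```

For the 10GB case, the Cloud Storage route takes 20 minutes in total against 600 minutes for the Bulk Loader, a 30x difference.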

App Engine Wish List – Updates From Google IO 2012

Posted June 28th, 2012 in Development by Greg Bayer
                     

We’ve been using Google App Engine at Pulse since 2010, back when we had only one backend engineer. In that time, App Engine has served us very well. There are many things Google App Engine does very well; the most obvious advantage is saving us lots of Ops work and letting us stay focused on our application. Over the last two years, it has grown with us both in terms of scale (from 200k users, to 15M+) and in terms of features.

As I’m writing this post (from Google I/O 2012), I’m happy to report that App Engine continues to grow with us. This year, Google’s App Engine team has announced that they are fixing our number one wish list item! They have also started addressing several other important concerns. For some context, here is Pulse’s App Engine wish list as of about a month ago.

  1. SSL support for custom domains
  2. Faster bulk import & export of datastore data
  3. Faster datastore snapshotting
  4. Tunable memcache eviction policies & capacity
  5. Improved support for searching / browsing / downloading high volume application logs
  6. Faster (diff-based) deployment for large applications
  7. Support for naked domains (without www. in front)
  8. Unlimited developer accounts per application

Barb Darrow from GigaOm published part of this list earlier this week (before I/O started). Check out the article Google App Engine: What developers want at Google I/O to see more common wish list items from other developers.

As of yesterday, (with the release of SDK version 1.7.0), SSL for custom domains is now officially supported either via SNI for $9/month or via a custom IP for $99/month. This means that you can now host a domain like www.pulse.me on App Engine and support https throughout your site. Previously it had only been possible to use http with your domain, and any secure transactions had to be routed to the less appealing xxxxx.appspot.com domain. This meant you had to break the user’s flow or use some complicated hacks to hide the domain switching. Now it is finally possible to present a seamless, secure experience without ever leaving your custom domain.

There were many other great features released with 1.7.0 (see the link above). As for the rest of our wish list, here’s how it stands now!

  1. SSL support for custom domains
    – Supported now!
  2. Faster bulk import & export of datastore data
    – Update 2: App Engine Datastore: How to Efficiently Export Your Data
  3. Faster datastore snapshotting
    – Update 3: The internal settings for map reduce-based snapshotting have been increased to use 256 shards. It’s actually pretty fast now! Still hoping for incremental backups in the future.
  4. Tunable memcache eviction policies & capacity
    – I hear that we will soon be able to segment applications and control capacity. Eviction policy controls are likely to take longer.
  5. Improved support for searching / browsing / downloading high volume application logs
    – It was announced that this is coming very soon!!
  6. Faster (diff-based) deployment for large applications
    – Update 4: This is supported and working for us now!
  7. Support for naked domains (without www. in front)
    – Pending. No ETA.
  8. Unlimited developer accounts per application
    – This is now supported for premier accounts!

Let me know in the comments if you have any questions about these or want to share some of your wish list items. I’m always happy to discuss App Engine issues with other developers.

Update: Just now, at the second Google I/O keynote, Urs Hölzle has announced Google’s push into the IaaS space with Google Compute Engine. It should be interesting to see if this offers serious competition to Amazon’s EC2 for future Pulse systems and features. 771886 cores available to the demo Genome app was pretty impressive! I’ll post here and/or at eng.pulse.me when we get a chance to try it out!

How to Buy a Basic SSL Certificate

Posted April 6th, 2012 in Development by Greg Bayer
                     

In order to support SSL for a simple Tornado server on EC2, a certificate is required. This process seems harder than it should be, so I thought I’d share the process that recently worked for me.

There are several tradeoffs to consider:

  • Certificate Authority (CA) Reputation (‘Self Sign’ – VeriSign)
  • Price (Free – $3000/year)
  • Domain Coverage: (Single, Multi, Wildcard)

After considering these options and reading about other people’s experiences, I concluded that GoDaddy is the least expensive, reasonably well respected CA. At GoDaddy the wildcard option is 15 times as expensive as the standard single domain certificate (with discount), so it’s a better deal to buy single domain certs even if you need a few.

Steps I took:

  1. Search Google for GoDaddy SSL deal.
  2. Login to GoDaddy and buy a single domain certificate for $12.99/year.
  3. Go to ‘My Account’, click SSL Certificates. Activate your purchased token. Wait a few minutes.
  4. Configure your cert. Choose “Third party server”. Provide a Certificate Signing Request (CSR) for your domain (see below).
  5. Download the cert. Use the cert along with your .key file from the CSR generation process to set up SSL on your server(s).
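For the last step, Tornado's HTTPServer accepts an ssl_options dictionary whose certfile and keyfile entries point at the downloaded cert and the .key file. A minimal sketch, assuming both files are already on the server; tornado_ssl_options is a hypothetical helper name, and it only checks that the files exist before building the dictionary.

```python
import os


def tornado_ssl_options(cert_path, key_path):
    """Return an ssl_options dict after checking that both files exist."""
    for path in (cert_path, key_path):
        if not os.path.isfile(path):
            raise IOError("missing SSL file: %s" % path)
    # certfile and keyfile are the argument names used by the standard ssl module.
    return {"certfile": cert_path, "keyfile": key_path}


# With Tornado installed, the result would be used roughly like this
# (app is your tornado.web.Application; the paths are placeholders):
#
#   server = tornado.httpserver.HTTPServer(
#       app, ssl_options=tornado_ssl_options("your_domain.crt", "your_domain.key"))
#   server.listen(443)
```

Failing early on a missing file tends to be easier to debug than an SSL error on the first incoming connection.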

Continue Reading »

Scaling Pulse to 11M Users

Posted February 16th, 2012 in Pulse by Greg Bayer
                     

As part of Pulse’s recent announcement of crossing the 11M user mark (up 10x since last year!), we’ve written a set of blog posts to share how we’ve scaled our backend infrastructure to keep up with our new users and support some powerful new features. Here’s a quick recap of our systems on both Amazon Web Services (AWS) and Google App Engine (GAE), along with links to the detailed posts describing each.

Continue Reading »

Scaling with the Kindle Fire

Posted December 1st, 2011 in Pulse by Greg Bayer
                     

Earlier this week I wrote a guest post for the Google App Engine Blog on how Pulse has scaled up our backend infrastructure to prepare for the recent Kindle Fire launch.

The Kindle Fire includes Pulse as one of the only preloaded apps and is projected to sell over five million units this quarter alone. This meant we had to prepare for nearly doubling our user-base in a very short period. We also needed to be ready for spikes in load due to press events and the holiday season.

Continue Reading »

Livecount

Posted July 11th, 2011 in Projects by Greg Bayer
                     

Livecount is an implementation of real-time counters that leverages the performance of memcache and task queues on Google AppEngine.

Building a solid analytics platform is often a combination of real-time and batch processing. Batch processing, with a tool like Hadoop, is great for digging into large amounts of past data and asking questions that cannot be anticipated.  In contrast, when it is known ahead of time that certain aggregates will be required, the best solution is usually to count each event as it happens. Livecount makes it easier to address this second use-case.
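The general idea of counting each event as it happens can be sketched in plain Python. This is not Livecount's API: the class below is a hypothetical illustration in which one dict stands in for memcache (fast, in-memory increments) and another for the datastore, with flush() playing the role of a background task that writes the totals back.

```python
from collections import defaultdict


class WriteBehindCounter(object):
    """Counts events in a fast in-memory cache and flushes totals to a slower store."""

    def __init__(self, store):
        self.store = store               # stand-in for durable storage (a plain dict here)
        self.pending = defaultdict(int)  # stand-in for memcache: increments not yet persisted

    def increment(self, name, delta=1):
        # Hot path: touch only the in-memory cache.
        self.pending[name] += delta

    def get(self, name):
        # Current value is the persisted total plus any unflushed increments.
        return self.store.get(name, 0) + self.pending.get(name, 0)

    def flush(self):
        # Stand-in for a background task: write all pending increments in one pass.
        pending, self.pending = self.pending, defaultdict(int)
        for name, delta in pending.items():
            self.store[name] = self.store.get(name, 0) + delta
        return len(pending)


if __name__ == "__main__":
    counter = WriteBehindCounter(store={})
    counter.increment("story_read")
    counter.increment("story_read")
    print(counter.get("story_read"), counter.flush(), counter.store)
```

Reads stay accurate between flushes because get() combines both layers, while writes to the slower store are batched.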

Continue Reading »

How To Use a Commuter Check Card to Purchase a Caltrain Monthly Pass

Posted June 27th, 2011 in Observations by Greg Bayer
                     

Warning

Before I start, let me recommend that you don’t try this. The potential savings you gain from using pre-tax Commuter Check cards likely won’t be worth the pain of actually trying to buy something with them. Return the cards to your employer and ask them to enroll in another option for funding your commute costs pre-tax!

Update: The Autoload program via Clipper works great. Instead of buying a pass in person with a commuter check card, you tag on/off once at the beginning of the month to load a new pass.

Goal

Use two Commuter Check cards issued by an employer (each containing $100) to purchase a zone 1-3 monthly Caltrain pass on a Clipper card (for $179).
Continue Reading »