As the Chief Technology Officer (CTO) and one of the four founders of Bold Commerce, Eric Boisjoli has both the real-life experience of making Black Friday a success, and a wealth of knowledge surrounding the technical side of running an industry-leading software development company.
I’ve got a quote I sometimes say - well, I don’t actually say it, but it’s mounted on my wall:
“Merchants first, then tech.”
This quote keeps something very important in the forefront of our minds: although we are always trying to build amazing, innovative tech, our mission is to make sure merchants are able to make a living while doing what they love and are passionate about.
It gives us a proper perspective when deciding whether or not something should be done, because we can ask ourselves “Does it ultimately benefit the merchant?”
Having a server outage on Black Friday or Cyber Monday doesn’t benefit them.
During Black Friday 2013, shit hit the fan; we had some pretty significant issues with a few of our apps that we didn’t foresee, and it felt awful. However, we learned a lot from that experience and it has drastically changed how we prepare for Black Friday. Hopefully I’ll be able to share some of what we learned so that you can avoid the pain we suffered through.
Black Friday planning starts early at Bold
We begin our Black Friday preparation around June. That might sound crazy, but it’s extremely important we start this early because Black Friday is such a busy time for us, and it’s very important to the merchants who rely on our apps and solutions.
We start by reviewing the previous year’s Black Friday performance and determining what we did well, what exceeded our expectations, and what didn’t go so well. We don’t ignore the things that went badly; we learn from those failures and improve.
By August, we’ve made our predictions about the volume of sales, traffic, and support we’re expecting based on last year’s numbers, and trends we’ve seen so far this year.
With this information in mind, we put together a single, department-wide calendar spanning the next three months, which includes all initiatives that are geared towards the Black Friday & Cyber Monday weekend. We put everything on one calendar because this isn’t a single team initiative; it’s an event that requires collaboration from all departments in our company, and we need everyone working together to make this event successful.
Measure and optimize everything
One of the most glaring reasons we had setbacks during Black Friday 2013 was that we had insufficient insight into the systems we built:
- We didn’t put enough importance on paying attention to traffic.
- We couldn’t anticipate the influx of server requests or ensure they were being served fast enough.
- We were unsure if we could keep up with a level of demand that we hadn’t experienced before.
- We didn’t know that we didn’t know.
Every year we introduce a lot of new systems and processes, and we do our best to track the relevant stats and key performance indicators (KPIs) for those systems.
Leading up to Black Friday, we go over everything to ensure we have all the needed KPIs, and to ensure that there are systems in place to monitor and alert us to any looming problems.
Beyond the insights these KPIs give us, I like to challenge myself and the team to see how much we can improve them in order to position ourselves better for Black Friday/Cyber Monday. Although the majority of what I’m personally looking at is ‘requests per minute’, ‘response time’, ‘server load’, ‘cache hit %’, and ‘uptime’, this extends to the support and marketing teams as well.
Between August and late October, my personal goal is to reduce the overall server time usage by roughly 30%, and we’ve successfully accomplished that goal four years in a row.
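To make the monitoring idea concrete, here is a minimal sketch in Python of a KPI threshold check. It is an illustration only, not Bold's actual tooling: the metric keys and every limit below are assumptions chosen for the example. Note that a missing metric is reported too, since not knowing is itself a warning sign.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    name: str            # metric key (hypothetical naming)
    limit: float         # hypothetical limit, not a real Bold value
    higher_is_bad: bool  # True for latency/load, False for hit rate/uptime


# Assumed limits for illustration only.
THRESHOLDS = [
    Threshold("response_time_ms", 500.0, True),
    Threshold("server_load", 0.80, True),
    Threshold("cache_hit_pct", 90.0, False),
    Threshold("uptime_pct", 99.9, False),
]


def breached(sample: dict) -> list:
    """Return one alert message per KPI that is missing or past its limit."""
    alerts = []
    for t in THRESHOLDS:
        value = sample.get(t.name)
        if value is None:
            # A metric we are not collecting is a blind spot.
            alerts.append(f"{t.name}: no data")
            continue
        bad = value > t.limit if t.higher_is_bad else value < t.limit
        if bad:
            alerts.append(f"{t.name}: {value} breaches limit {t.limit}")
    return alerts
```

Calling `breached(...)` on each fresh sample and forwarding any non-empty result to a pager or chat channel would be one simple way to get alerted before a problem grows.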
Testing for scale
When I go camping, regardless of the forecast I set up a rain shelter; I’d rather be prepared than have to set it up during a storm. Bold uses this same logic during its Black Friday preparations.
It’s important to note that it takes roughly the whole year to reach the amount of traffic that we experienced over the previous Black Friday/Cyber Monday weekend; each subsequent holiday we’ll see double or triple the amount of traffic… and it’s hitting us all at once.
We have to take on the very difficult task of putting measures in place to prepare us for traffic we’ve never experienced before. Plus, it’s hard to mimic real traffic because we can’t predict which apps will be used most, and we don’t know which of our newer releases will get the most attention from our customers.
Despite these unknowns, we leverage tools such as ‘Hey’, which allow us to stress test our servers by sending tremendous amounts of fabricated traffic to our apps in order to mimic real traffic.
Picture: Hey sending traffic to our servers
By monitoring these tests, we get insight into how our environment will handle the volume, which lets us scale it to meet the needs of the upcoming holiday and identify any bottlenecks in our systems.
Our intention is to slowly (and safely) increase the amount of test traffic each week, so that once Black Friday hits, we are confident and ready for the real deal.
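A gradual ramp like this can be planned ahead of time. The sketch below, in Python, computes one load-test target per week, growing by the same factor each week until the final week reaches the expected peak. The function name, the requests-per-minute unit, and the choice of a geometric ramp are assumptions for illustration; the numbers it produces would then be fed to whatever load-testing tool is in use.

```python
def weekly_ramp(start_rpm: float, target_rpm: float, weeks: int) -> list:
    """Return a load-test target (requests per minute) for each week.

    Traffic grows by the same factor every week, so the early steps
    are small and the last week lands on the target.
    """
    if weeks < 1:
        raise ValueError("weeks must be at least 1")
    if start_rpm <= 0 or target_rpm < start_rpm:
        raise ValueError("need 0 < start_rpm <= target_rpm")
    ratio = (target_rpm / start_rpm) ** (1 / weeks)
    return [round(start_rpm * ratio ** week) for week in range(1, weeks + 1)]
```

For example, ramping from 1,000 to 8,000 requests per minute over three weeks doubles the test traffic each week.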
When we set up processes, we try to put redundancies or backup mechanisms in place so that if something were to break, we can be confident there’s a backup system that should be able to pick up the slack.
That’s all well and good, but as everyone should know, a backup is only valid if it’s tested. Because of this, we try to have planned service outages every few months to ensure that the backups kick in and perform as expected.
Be ready; sometimes shit goes wrong
Even though you tested for everything you can think of, you still don’t know what you don’t know. It’s likely impossible to know every possible scenario that can potentially occur, so it’s important to have an “Oh, F*&%” plan.
In a time of chaos, it’s nice to have a simple cheat sheet of priorities. Deciding which limb to cut off in order to save the rest of the body is never a fun conversation, but it’s better to discuss it when things are going well, rather than while you’re stressed and panicking because you’re bleeding out.
At Bold we have a few such priority lists:
- Shoppers should be able to check out on a store.
- Shoppers should be able to leverage the functionality of the app.
- Merchants should be able to manage the apps.
- Bold’s support team should be able to help merchants.
- Background tasks should complete in a timely manner.
Within that, we’ve got a prioritized list of applications, sorted by how much merchants would be affected if the app stopped functioning properly.
We want to be as prepared as possible, and even when we can’t be, we’ve got a guide to help us respond in the correct order.
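The two-level ordering described above (capability first, then app impact) can be sketched as a simple sort in Python. This is an illustrative sketch only: the capability keys, the app names, and the numeric `merchant_impact` score are all hypothetical, not Bold's real list.

```python
from dataclasses import dataclass

# Capability order taken from the priority list above; the short keys
# are hypothetical labels for this example.
CAPABILITY_ORDER = [
    "checkout",           # shoppers can check out on a store
    "app_functionality",  # shoppers can use the app's functionality
    "merchant_admin",     # merchants can manage the apps
    "support_tools",      # support team can help merchants
    "background_tasks",   # background tasks finish in a timely manner
]
RANK = {name: position for position, name in enumerate(CAPABILITY_ORDER)}


@dataclass(frozen=True)
class Incident:
    capability: str
    app: str
    merchant_impact: int  # hypothetical score; higher means more merchants hurt


def triage(incidents: list) -> list:
    """Order incidents by capability priority, then by impact (worst first)."""
    for incident in incidents:
        if incident.capability not in RANK:
            raise ValueError(f"unknown capability: {incident.capability}")
    return sorted(incidents, key=lambda i: (RANK[i.capability], -i.merchant_impact))
```

Writing the order down as data, rather than deciding it in the moment, is the point: during an outage the list answers "what do we fix first?" without a debate.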
Keep your finger on the pulse
Although we work on the initial plan together, the collaboration doesn’t end there.
We have a Black Friday/Cyber Monday committee that meets regularly. As we get closer to the date, the stand-ups we do become more frequent (in fact, daily). We do this because it’s critical to ensure that everyone is aware of how each team and department is doing so that we can make smarter decisions regarding our plan.
It’s important to know that we have the support capacity before sending out a big marketing campaign. As well, if a specific Black Friday app feature is going to be delayed, the other teams should be made aware sooner rather than later, so they can change their strategies accordingly.
There have been several instances where the demands on our onboarding and installs teams were so high that we have had to put out a request for volunteers from within the company, and we received an overwhelming response from other teams to help out.
This year, I’m actually planning to start a weekly Black Friday/Cyber Monday status email that details our weekly initiatives to make sure the company is better aware and aligned.
Implement a soft code freeze
Bold is a very agile company and we release code very frequently. We’ve put in place a lot of measures to ensure that application changes can be deployed very easily, and we believe that releasing smaller, less complicated changes will have fewer unexpected side effects.
However, during the holiday season we get a lot more particular and we begin to scrutinize what kind of releases we’re putting out. We ask ourselves: “Is this going to help the merchant?”; “Is this needed for Black Friday?”; “Does this add potential additional risk that we don’t need to take on?”
We need to make sure that the things we’re putting out around this time are going to provide enough value to the merchant, or are very much Black Friday-related. That way, we can prioritize the important pieces for Black Friday and hold off on other releases, so as not to add incidental bugs.
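The three questions above can be read as a small release gate. Here is a minimal Python sketch of one possible interpretation; the exact rule (block anything risky, otherwise require merchant value or Black Friday relevance), the field names, and the freeze dates are assumptions for illustration, not a documented Bold policy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Release:
    name: str
    high_merchant_value: bool      # "Is this going to help the merchant?"
    needed_for_black_friday: bool  # "Is this needed for Black Friday?"
    adds_avoidable_risk: bool      # "Does this add risk we don't need?"


def allowed(release: Release, today: date, freeze_start: date, freeze_end: date) -> bool:
    """Decide whether a release may ship under a soft code freeze."""
    if not (freeze_start <= today <= freeze_end):
        return True  # outside the freeze, normal release rules apply
    if release.adds_avoidable_risk:
        return False
    return release.high_merchant_value or release.needed_for_black_friday
```

Because the freeze is "soft", the gate still lets valuable or Black Friday-related changes through; it only holds back work that can safely wait.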
It’s showtime! Preparing for the main event
Every year we set up our Black Friday/Cyber Monday “Command Center” in our main office atrium. Each team has staff working around the clock, and others are on-call.
We are actively answering tickets, watching our KPIs, investigating any little blips, and communicating with all of our teams; that way, we have a continuous pulse on how things are going. And if things are going smoothly at 3 a.m., we’ll probably also be watching Billy Madison.
We use the previous year’s traffic to anticipate and plan for all the spikes to help us determine the staffing requirements. When do we need the most support, and when can I sleep? (I have a hammock in my office over the weekend.)
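As a rough illustration of turning last year's numbers into a staffing plan, the Python sketch below scales an hourly ticket curve by an expected growth factor and divides by how many tickets one person can handle per hour. The growth factor, the per-agent rate, and the minimum of one person on shift are all assumed inputs for the example, not figures from Bold.

```python
import math


def staffing_plan(last_year_hourly: list, growth: float,
                  per_agent_per_hour: float, minimum: int = 1) -> list:
    """Estimate how many people are needed in each hour.

    last_year_hourly: tickets (or requests) seen in each hour last year
    growth: expected multiplier over last year (assumed, e.g. 2.0)
    per_agent_per_hour: volume one person can handle in an hour (assumed)
    """
    if growth <= 0 or per_agent_per_hour <= 0:
        raise ValueError("growth and per_agent_per_hour must be positive")
    return [
        max(minimum, math.ceil(volume * growth / per_agent_per_hour))
        for volume in last_year_hourly
    ]
```

The quiet hours in the result are the ones where most of the team can rest, and the peaks show where the on-call staff should be pulled in.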
If there’s anything to take away from this, it's that Black Friday should be treated like any other project.
- Have a detailed plan.
- Find out what your risks are and what you can do to eliminate or mitigate them.
- Test and improve your systems.
- Have contingencies.
- Get an office hammock.
Bold's 2017 Black Friday Traffic
Even though I’m the CTO and my responsibility is to make sure our apps and solutions are built to a high standard, run properly, and don’t go down (especially around the holidays), I feel the role extends way beyond just that.
Because communication is so important, not only year-round but especially at this time of year, my role extends into making sure there is good communication between all departments. Black Friday isn’t just about us, our code, or our servers; it’s about the merchants being successful. It’s a people event.
The thing I love most about Black Friday is that it brings the entire company together with the unified mission of helping our merchants to be successful.
No matter what happens, we’re going to be ready to act. Even when we can’t anticipate, we have a plan.