100 Million Rows, One Master Key
Levchin: The sacred thing at PayPal was the database — roughly 100 million rows of credit cards and bank account credentials. Very, very scary to lose it. We encrypted everything. My background was in computer security — it was my obsession.
But we had one big vulnerability — a master key that would decrypt the entire database. That just bugged the hell out of me. So one day I read this paper called 'Sharing Secrets Among Friends' and implemented sharded key sharing. Proudly got everybody together to replace the master key.
One master key for 100 million credit cards bugged the hell out of me. I implemented secret sharing.
Five and a Half Hours of Pandemonium
Levchin: We took down the site. Decrypted with the single master key. Re-encrypted with the sharded key. Brought the site back up. People typed in their shards. It didn't decrypt correctly.
For the next five and a half hours, we were beating our heads against the wall. We were so secure, there was no backup. Peter would call at 5 AM — hey, why is the site still down? CNN was running a headline: PayPal went down last night, nobody knows when it's coming back. Total pandemonium.
It didn't decrypt. No backup. Peter calling at 5 AM. CNN headline: PayPal is down. Total pandemonium.
8 vs. 256
Levchin: I went to my cubicle and stared into the black CRT for 30 minutes trying to figure out what could have possibly gone wrong. The next thing I did was type 'man getpass' on both machines.
getpass on Solaris can capture up to 256 characters. On Linux, it cuts off after 8. So our 24-character passwords were being reassembled with only the first 8 characters. Ten minutes later we were back up.
The embarrassing part wasn't the bug — it's that it took me so long to figure it out. I was the one who coded it. There weren't enough people to pair-program with me.
getpass: 256 characters on Solaris, 8 on Linux. Our 24-character passwords only used the first 8. That was it.
How You Behave in a Crisis
Levchin: One of our engineers literally screamed — we're going to have to start brute-forcing this key, it'll only take a few billion years. You get to watch how people behave under extreme pressure.
We were a year from IPO. 100 million credentials. Everything going our way. I was about to pull the pin out of the hand grenade. My proudest part was that the only right thing to do was to think like an engineer — put out the fear of total failure and try to reason what went wrong.
I didn't tell Peter the real story for over a decade. The first five years, I told no one.
The only right thing: put out the fear of total failure and reason what went wrong. I told no one for 5 years.