My job at Twitter is to ensure user trust, protect user rights, and keep users safe, both from each other and, at times, from themselves.

Let's talk about what scale looks like at Twitter. Back in January 2009, we saw more than two million new tweets each day on the platform. January 2014, more than 500 million. We were seeing two million tweets in less than six minutes. That's a 24,900-percent increase.

Now, the vast majority of activity on Twitter puts no one in harm's way. There's no risk involved. My job is to root out and prevent activity that might. Sounds straightforward, right? You might even think it'd be easy, given that I just said the vast majority of activity on Twitter puts no one in harm's way. Why spend so much time searching for potential calamities in innocuous activities?

Given the scale that Twitter is at, a one-in-a-million chance happens 500 times a day. It's the same for other companies dealing at this sort of scale. For us, edge cases, those rare situations that are unlikely to occur, are more like norms.

Say 99.999 percent of tweets pose no risk to anyone. There's no threat involved. Maybe people are documenting travel landmarks like Australia's Heart Reef, or tweeting about a concert they're attending, or sharing pictures of cute baby animals. After you take out that 99.999 percent, the tiny percentage of tweets remaining works out to roughly 150,000 per month. The sheer scale of what we're dealing with makes for a challenge.

You know what else makes my role particularly challenging? People do weird things. (Laughter) And I have to figure out what they're doing, why, and whether or not there's risk involved, often without much in terms of context or background.
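The figures quoted above are internally consistent, and a quick back-of-the-envelope check makes the scale argument concrete. This is a sketch added for illustration; the 30-day month is an assumption.

```python
# Sanity check of the scale figures quoted above.
daily_2009 = 2_000_000        # tweets per day, January 2009
daily_2014 = 500_000_000      # tweets per day, January 2014

# Growth from 2M/day to 500M/day:
growth_pct = (daily_2014 - daily_2009) / daily_2009 * 100
print(f"{growth_pct:,.0f}% increase")            # 24,900% increase

# A one-in-a-million event at 500M tweets/day:
print(f"{daily_2014 / 1_000_000:,.0f} per day")  # 500 per day

# The tiny fraction left after removing the risk-free 99.999 percent,
# assuming a 30-day month:
risky_daily = daily_2014 * (1 - 0.99999)
print(f"{risky_daily * 30:,.0f} per month")      # 150,000 per month
```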
I'm going to show you some examples that I've run into during my time at Twitter (these are all real examples) of situations that at first seemed cut and dried, but where the truth of the matter was something altogether different. The details have been changed to protect the innocent, and sometimes the guilty.

We'll start off easy. ["Yo bitch"] If you saw a tweet that only said this, you might think to yourself, "That looks like abuse." After all, why would you want to receive the message "Yo, bitch"? Now, I try to stay relatively hip to the latest trends and memes, so I knew that "yo, bitch" was also often a common greeting between friends, as well as being a popular "Breaking Bad" reference. I will admit that I did not expect to encounter a fourth use case. It turns out it is also used on Twitter when people are role-playing as dogs. (Laughter) And in fact, in that case, it's not only not abusive, it's technically just an accurate greeting. (Laughter)

So, okay: determining whether or not something is abusive without context is definitely hard.

Let's look at spam. Here's an example of an account engaged in classic spammer behavior: sending the exact same message to thousands of people. While this is a mockup I put together using my account, we see accounts doing this all the time. Seems pretty straightforward. We should just automatically suspend accounts engaging in this kind of behavior.

Turns out there are some exceptions to that rule. That message could also be a notification you signed up for, telling you that the International Space Station is passing overhead, because you wanted to go outside and see if you could see it. You're not going to get that chance if we mistakenly suspend the account thinking it's spam.

Okay. Let's make the stakes higher.
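To make the trap concrete: a naive "same message to thousands of people" rule, like the toy sketch below, flags the space-station notifier just as readily as a real spammer. This is my illustration, not Twitter's actual detector, and the threshold is arbitrary.

```python
# Toy duplicate-message heuristic: flag any account that sends
# identical text to many recipients.
from collections import Counter

def flags_as_spam(outgoing_messages, threshold=1000):
    """outgoing_messages: list of (recipient, text) pairs for one account."""
    repeats = Counter(text for _recipient, text in outgoing_messages)
    return any(count >= threshold for count in repeats.values())

# A legitimate ISS-pass notifier trips the rule exactly like a spammer:
iss_bot = [(f"@user{i}", "The ISS passes overhead at 9:42 PM tonight!")
           for i in range(5000)]
print(flags_as_spam(iss_bot))  # True -- a false positive
```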
Back to my account, again exhibiting classic behavior. This time it's sending the same message and link. This is often indicative of something called phishing: somebody trying to steal another person's account information by directing them to another website. That's pretty clearly not a good thing. We want to, and do, suspend accounts engaging in that kind of behavior.

So why are the stakes higher for this? Well, this could also be a bystander at a rally who managed to record a video of a police officer beating a non-violent protester, and who's trying to let the world know what's happening. We don't want to gamble on potentially silencing that crucial speech by classifying it as spam and suspending it. That means we evaluate hundreds of parameters when looking at account behaviors, and even then, we can still get it wrong and have to reevaluate.

Now, given the sorts of challenges I'm up against, it's crucial that I not only predict but also design protections for the unexpected. And that's not just an issue for me, or for Twitter; it's an issue for you. It's an issue for anybody who's building or creating something that you think is going to be amazing and will let people do awesome things.

So what do I do? I pause and I think: how could all of this go horribly wrong? I visualize catastrophe. And that's hard. There's a sort of inherent cognitive dissonance in doing that, like when you're writing your wedding vows at the same time as your prenuptial agreement. (Laughter) But you still have to do it, particularly if you're marrying 500 million tweets per day.

What do I mean by "visualize catastrophe"? I try to think of how something as benign and innocuous as a picture of a cat could lead to death, and what to do to prevent that. Which happens to be my next example. This is my cat, Eli.
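One way to picture "hundreds of parameters" is as a weighted combination of behavioral signals, where no single behavior decides the outcome and signals of genuine relationships can offset repetition. The sketch below is hypothetical; the signal names, weights, and threshold are all invented for illustration.

```python
# Hypothetical multi-signal spam score: repetition alone doesn't
# suspend an account if other signals point to legitimate use.
SIGNAL_WEIGHTS = {
    "identical_message_rate": 0.4,   # fraction of recent tweets that repeat
    "link_in_every_tweet":    0.3,
    "recipients_follow_back": -0.5,  # mutual follows suggest real relationships
    "account_age_days":       -0.001,
}

def spam_score(signals: dict) -> float:
    """Weighted sum of behavioral signals; higher means more spam-like."""
    return sum(SIGNAL_WEIGHTS.get(name, 0.0) * value
               for name, value in signals.items())

# A rally bystander mass-sharing one video link still scores low:
protester = {"identical_message_rate": 0.9, "link_in_every_tweet": 1.0,
             "recipients_follow_back": 0.7, "account_age_days": 800}
print(spam_score(protester))  # -0.49, well below a threshold of, say, 0.6
```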
We wanted to give users the ability to add photos to their tweets. A picture is worth a thousand words. You only get 140 characters. You add a photo to your tweet, look at how much more content you've got now. There are all sorts of great things you can do by adding a photo to a tweet. My job isn't to think of those. It's to think of what could go wrong.

How could this picture lead to my death? Well, here's one possibility. There's more in that picture than just a cat. There's geodata. When you take a picture with your smartphone or digital camera, there's a lot of additional information saved along with that image. In fact, this image also contains the equivalent of this, more specifically, this. Sure, it's not likely that someone's going to try to track me down and do me harm based upon image data associated with a picture I took of my cat, but I start by assuming the worst will happen. That's why, when we launched photos on Twitter, we made the decision to strip that geodata out. (Applause)

If I start by assuming the worst and work backwards, I can make sure that the protections we build work for both expected and unexpected use cases.

Given that I spend my days and nights imagining the worst that could happen, it wouldn't be surprising if my worldview was gloomy. (Laughter) It's not. The vast majority of interactions I see, and I see a lot, believe me, are positive: people reaching out to help, or to connect, or to share information with each other. It's just that for those of us dealing with scale, for those of us tasked with keeping people safe, we have to assume the worst will happen, because for us, a one-in-a-million chance is pretty good odds.

Thank you. (Applause)
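For a sense of what the geodata stripping described above involves: camera GPS coordinates live in the EXIF metadata of a JPEG, and removing them is a small, well-understood operation. Here's a minimal sketch using the third-party piexif library; the approach and file names are illustrative assumptions, not Twitter's actual pipeline.

```python
# A minimal sketch of scrubbing location metadata from a JPEG
# before publishing it. Not Twitter's real implementation.
import piexif

def strip_gps(src: str, dst: str) -> None:
    """Write a copy of `src` to `dst` with every GPS EXIF tag removed."""
    exif = piexif.load(src)       # parse EXIF into per-IFD dicts
    exif["GPS"] = {}              # drop latitude, longitude, altitude, ...
    piexif.insert(piexif.dump(exif), src, dst)  # re-serialize without GPS

# Hypothetical file names for illustration:
strip_gps("eli_the_cat.jpg", "eli_the_cat_clean.jpg")
```

Clearing only the "GPS" block keeps harmless metadata like orientation and timestamps intact while removing exactly the fields that could place the photographer on a map.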