It’s bad that Backblaze did not do their due diligence while integrating with Facebook pixel, but the bigger problem is the tendency of third party integrations having default settings that are overly aggressive when it comes to access of user/site data.
This should be a warning to every developer: when you integrate with any third party, don’t just copy the default snippet of code they recommend. Read the docs thoroughly and then test/monitor what is being sent back to the third party. And if you are writing a third party widget, be considerate and make the defaults the least aggressive possible when it comes to accessing user/site/server data.
At this point, every developer should (and needs to) be aware of how invasive any tracking code they blindly copy/paste into a website can be. I think it's required even of people who operate normal websites, blogs, etc., but missing it is somewhat forgivable, especially for people just starting out.
But the level of irresponsibility with customers' most private data from a company whose MAIN JOB is to protect it is absolutely shocking. Yes, it's the freaking pixel that does the tracking, BUT IT'S THEIR RESPONSIBILITY TO KNOW WHAT THE HECK IS HAPPENING ON THEIR WEBSITE. Don't they have any sort of vulnerability assessment or security code review? Their reply tweet is almost as infuriating as how something like that could even happen to begin with. Like yeah, sorry, ain't our fault! It's astounding.
I considered using backblaze a couple of times and now I'm very happy that I eventually didn't.
> Don't they have any sort of vulnerability assessment or security code review?
I haven’t seen any evidence that they do. The last time I brought up their history of bad security practices on HN, one of their co-founders decided that the correct course of action was to come on here, accuse me of being a bad actor, and repeatedly make up quotes I didn’t say.[0] All because I tried to warn others in the community that something just like this was likely to happen again. And now it has. So, you know.
Wow. After reading about the FB tracking I was wary about Backblaze but teetering on the edge of willing to give it a pass if they fixed it. But after reading brianwski's comments in that thread, the arrogance and unprofessionalism (especially in his last comment) just completely turned me off. That attitude goes beyond a technical fuckup or bad marketing move. I'm moving my backups out of Backblaze today and won't be looking back.
OK, except I’m not a Backblaze user and I never have been (except for the 14-day free trial). I haven’t had any private correspondence with them since reporting the vulnerability I discovered in 2019. This exchange on HN was the first time I’ve ever knowingly[0] interacted with this guy. If he has actually been ‘dealing with [me] for years’ in the way you imply, it has been a very one-sided relationship.
As far as having an axe to grind goes… if wanting to protect others is grinding an axe, I guess I’m guilty of that. I don’t feel like a handful of topical messages warning people of a legitimate and clearly ongoing problem is some abusive behaviour on my part, but maybe I’m wrong. I’m happy to learn from others’ perspectives, since I’m sure I could be a more effective communicator.
[0] I suspect he was the one who replied to my vulnerability report since the same attitude was on display in those messages too, but I don’t know since that account was just named “bbqa”.
Astonishing arrogance. Their replies on the current incident are also dismissive and arrogant. I can’t even imagine what kind of culture they have internally.
> Don't they have any sort of vulnerability assessment or security code review?
I bet people just don't realize that the frontend code could end up being a source of major data confidentiality vulnerabilities. The threat modeling, auditing, etc. usually just concentrate on attack scenarios involving the backend, to save money and keep frontend development a bit lighter on the security review process side.
That doesn't make it excusable, of course; it just means their threat modeling was inadequate. But it probably explains how this was able to slip into production.
I dunno, you don’t need to be particularly concerned with security to understand that you do not need a facebook tracking pixel on your ‘paying customer’ UI.
Facebook will be pitching to your Marketing folks that that's exactly where you need it.
Facebook want data on what actions users took before signing up, which users actually signed up and started paying, and how that relates to revenue. This UI is exactly where they can determine these types of actions.
Whether this actually makes Facebook better at marketing or not is a good question.
That's why it's a killer mistake to let "marketing folk" dictate _anything_ that has to do with the app. They can suggest, but not dictate. If this points to anything, it's the dysfunctional development process at Backblaze.
> I considered using backblaze a couple of times and now I'm very happy that I eventually didn't.
Same. I ended up going with spideroak because it was too hard to figure out what other providers were even promising for privacy, and it’s been fine. I’m thinking maybe of trying rsync.net with duplicity one day
> Because otherwise how do you know when they suddenly change their script to do something entirely different?
Even though it's inconvenient maybe we should treat it as just another 3rd party dependency that needs to be downloaded, screened, and then used from the internal store. Pretty dangerous to dynamically load a script from a site like facebook.com.
As much as I loathe and despise Facebook, this one is on Backblaze: they integrated with a well-known evil (seriously this surprises absolutely nobody) and should have been more careful with their settings. Or, you know, not have done it at all.
> but the bigger problem is the tendency of third party integrations having default settings
It's certainly a problem, but the biggest problem is simply that Backblaze doesn't need to integrate the Facebook pixel into their web interface at all, especially when users are logged in.
There is absolutely zero benefit to this for their users, but I also fail to see what it could bring to them as a company??
They have now created a major PR problem for themselves, and what did they get in exchange?
(Ok, as the saying goes, there's no such thing as bad publicity. But still.)
The FB pixel is how FB gets to know about conversions. If your company runs FB ads, FB encourages you to let them know which users converted to paying customers so it can improve the targeting of your ad campaign. And if you offer a free trial and the upgrade to paying customer happens inside the user dashboard, presto-bingo: Marketing now needs you to install the FB pixel there.
It's disappointing that this is what the internet has become, and I'm happy to see issues like this brought forward. I can see value in feeding back conversion events to an ad network, but the "just let us run code on your page" style of integration needs to stop. Give developers some API to explicitly send such an event, if they really need to.
>And if you are writing a third party widget, be considerate and make the defaults the least aggressive possible when it comes to accessing user/site/server data.
Unless your business model is literally collecting all personal and private information, just don't do it at all.
>>It’s bad that Backblaze did not do their due diligence while integrating with Facebook pixel
How much "due diligence" should be required to know that adding anything from facebook INSIDE your backup / file storage product is a bad idea... hell, adding anything from facebook to any product should be considered a bad idea...
> and then test/monitor what is being sent back to the third party.
Trust, but verify. Especially FB.
I see a lot of comments about a paid service using tracking pixels. I understand the backlash but this is also how companies that are taking your hard earned money improve and iterate on their products. I know people don’t like tracking but usage analytics are invaluable to improving the product. There’s only so much customer engagement you can do, and often that leaves really unexpected outcomes in the dark (if you asked most customers, they’d ask for a faster horse).
The issue here is that Backblaze injected untrusted code into their core product which by its very nature is extremely sensitive. I’m not really sure what they were thinking. Rolling your own tracking is a pain, but some security contexts demand it.
Pet peeve of mine: "Trust but verify" is an oxymoron. If you "trust" someone/something you don't need to verify. I understand the sentiment, but it's based on a faulty assumption that trust is a binary state.
Its most exemplary case is how security teams in most IT shops use it: "Yes, we did put endpoint protection on your laptop, it's not that we don't trust you -- but you know what they say...". I wish they would simply say: "Of course we don't trust you, not on this. You are going to click on that shady link or download and run random executables from the interwebs. But we do trust that you're good at your job".
The historical usage of the saying "trust but verify" is that the verification happens after they had the chance to do the thing (possibly badly), as opposed to mistrust where you'd verify everything that they'd do before allowing the thing to be done.
It's essentially "audits instead of pre-approval".
I totally trust Facebook. To fuck me over at every possible opportunity, even if they’ve pinky promised “we’d never do that!”. Because they have a track record of doing exactly that.
‘Trust’ is defined as wilful forbearance toward those agencies that we can conceive of as being sources of harm to us.
If you “trust Facebook to fuck [you] over” then you’re not trusting Facebook. You’re expecting Facebook to fuck you over and minimising your exposure to that process.
If you ask a computing security (or indeed any security) professional, "trust" has a different definition. If A trusts B, then B has the capability of doing something bad to A. This is regardless of whether A has granted B that capability. So, when you drive down the road, you trust the oncoming car not to swerve into your path and crash into you. This is just another way of saying that the oncoming car has the capability to swerve into your path and crash into you. There's no wilful forbearance involved.
I disagree insofar as the scenarios you describe (implicit trust in computer security, trust that another driver won’t swerve into your vehicle’s path) are perfectly good examples of the overall definition I posited above.
Indeed, if one didn’t have forbearance of (say) a software vendor or perhaps a dependency then one would be entirely free to not use that vendor’s product or that particular package. Similarly, if you don’t swerve right to prudentially make space for an approaching vehicle to swerve into your path, then you’re showing them forbearance and trusting them (or swerve left, depending on which side of the road is legally mandated in one’s location on the Earth).
So, yep: those are perfect examples of Trust as embodied in the definition I presented above. And of course it lies at the root of initiatives such as the Trusted Computing Initiative, et cetera.
Unfortunately Facebook has lots of information about you whether you have an account or not. You’re a big black hole with known characteristics inferred from those acquaintances of yours who interact with you in a manner Facebook can track. You might not have a name and an account, but there’s a big you-shaped golem in their data and there’s absolutely nothing you can do about it, unfortunately, and they’ll use it as they see fit with or without your consent.
Makes me think of that Junji Ito comic with the person-shaped holes in the rock after an earthquake. Only now, it's an allegory for social media giants and their corrupting influence on the people they lure into their system.
You can if you live in a jurisdiction which gives you the right to demand all information a company has on you, and the right to be forgotten (GDPR in the EU, maybe California as well?).
My guess would be that developers installed a single Google Tag Manager script and left the tracking to the marketing/analytics team. From that point they manage what third party scripts are added to the site, not engineers.
This setup causes tunnel vision, which can unfortunately lead to situations like this.
Facebook doesn't make a usage analytics service. And I really, really doubt 99.9% of the people using this abundance of tracking services ever discover any of the worthwhile corner cases you extol their virtues for. "Analytics" is just the 2000s word for "Reporting": clueless middle managers demand shiny graphics.
> this is also how companies that are taking your hard earned money improve and iterate on their products.
What analytics do you think they can do using Facebook that they cannot do from their own httpd access logs, a technology that's 30 years old at this point?
Or, if that is not sufficient, one of the increasing number of ethical and/or self-hosted services, like Plausible? There is no need whatsoever to ship this kind of data to an outside party.
Tracking is not ethical. Corporations only get away with it because most users don't realise it's even happening.
If you went into a retail store and an employee followed you around the whole time with a notebook and stopwatch writing down everywhere you walked and every product you looked at, you would rightly be creeped the fuck out and tell him to stop.
This is exactly what online tracking is, but done virtually.
> If you went into a retail store and an employee followed you around the whole time with a notebook and stopwatch writing down everywhere you walked and every product you looked at
I hate to break it to you, but retail companies are doing this today using security camera footage, to figure out which parts of the store customers start at, or spend the most time in.
This is one of key applications of GDPR in Europe - the fact that you can collect data for one purpose (e.g. security cameras) does not necessarily imply that you're permitted to use the same data for any other purpose (e.g. marketing analysis of customer movements).
For the former purpose, it would generally be sufficient to inform visitors with a sign at the entrance citing a legitimate interest clause. For the latter example, IMHO the only practical compliant solution would require anonymization of the data: you could make and store density data iff you have no way to tie it back to customer identities, including the purchases they made. That is a key difference from the Facebook example, which (as far as I understand) uses unique IDs to link the conversions to specific FB accounts.
The difference is that with a human nearby I feel like I am being judged, and when they are not there I do not feel that kind of pressure. The grocery store I've gone to my whole life has always had security cameras tracking people, for reasons I assume have to do with theft. It does not affect me at all, and the store has useful data that it can use. It benefits all parties.
>The difference is that with a human near by I feel like I am being judged and when they are not I do not feel that kind of pressure
I will offer you another perspective to consider.
Technology is extremely subversive in that it bypasses all of our brain's instinctual responses. Someone or something monitoring and tracking you should be setting off warning sirens in your brain. At best they are trying to study you, at worst they are trying to exploit or harm you.
Through hundreds of thousands of years of evolution our brains have built up warning systems to make us feel fear and unease when we realise we are being tracked. But since humans have spent 99.99% of evolution entirely in the physical world these systems have no concept of the digital.
The reason you feel extremely uneasy when being monitored by a person, but not when being monitored by a computer system that is collecting the exact same information (or more), is that your subconscious brain doesn't understand computers.
>our brains have built up warning systems to make us feel fear and unease when we realise we are being tracked
>The reason you feel extremely uneasy when being monitored by a person
I don't though. If someone walked up to me and asked me what my favorite color was and they wrote it down I don't feel any negative feeling.
Even if that were true, just because we have a warning system doesn't mean that something is actually bad. Tracking just gives people more information to allow better decisions to be made, and it can make things more efficient.
"but the bigger problem is the tendency of third party integrations having default settings that are overly aggressive when it comes to access of user/site data.
"
Our permission schemes are way too broad in general. I have done some tests with Zapier and IFTTT. For every integration you consent to them seeing ALL your data, being able to modify it, and so on. You also don't get a log of how often the permission was used.
We're looking to invert this equation at Transcend. All client side network emissions can be regulated in accordance with tracking consent following a block/quarantine-by-default model.
This means that any network emissions you add to your site would have to be categorized by URL or domain instead of relying on shaky special-cased integrations that can fail as soon as an API changes.
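Unrelated to any particular vendor, but for anyone wondering what a block-by-default, allow-by-domain model can look like using purely browser-native tools: a strict Content-Security-Policy refuses to load any script, image, or fetch destination that isn't explicitly allowlisted. A rough Express-style sketch with placeholder values:

```ts
import express from 'express';

const app = express();

// Illustration only. With a policy this strict, any third-party script, tracking pixel,
// or XHR/fetch to a domain not listed below is simply refused by the browser.
app.use((_req, res, next) => {
  res.setHeader(
    'Content-Security-Policy',
    [
      "default-src 'self'",
      "script-src 'self'",   // no third-party JS unless a domain is added here
      "img-src 'self'",      // blocks tracking pixels pointed at other origins
      "connect-src 'self'",  // fetch/XHR confined to your own origin
    ].join('; ')
  );
  next();
});

app.listen(3000);
```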
PSA for any devs out there implementing FB Pixels:
– Facebook's pixel will, by default, attach click listeners to the page and send back associated metadata. This simplifies implementation needs, but can create unintentional information leaks in privileged contexts. To disable this behavior, there's a flag[1] you can use. After which, you can manually trigger FB Pixel hits and control both when they're fired and what information is included in them.
– There's a feature for the FB Pixel called Advanced Matching[2] that allows you to send hashed PII as parameters with your FB events. "Automatic Advanced Matching" can be enabled at any time via a toggle in the FB interface. I believe that setting autoConfig to false as mentioned above will similarly prevent Automatic Advanced Matching from working (since it disables the auto-creation of all those listeners to begin with). When manually triggering pixel calls like above, you can use this functionality via "Manual Advanced Matching"[3].
As a general rule, I'd strongly encourage anyone implementing a Facebook pixel to also include the autoConfig = false flag. This makes it work like most other pixels, where the base tag just instantiates an object. After that, hits only occur when explicitly defined in the site code, and include only the details you specify. That way you're fully aware of the scope of data disclosure happening, and any need from marketing to include sensitive (or potentially sensitive) information in these calls has to be explicitly requested (and theoretically vetted) as part of the standard dev process.
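For anyone who wants to see it concretely, a rough sketch of the above (the pixel ID is a placeholder, and the flag/event names reflect my reading of the pixel docs, so double-check them against Facebook's current documentation before relying on them):

```ts
// Assumes the standard fbevents.js base snippet has already run and defined fbq().
declare function fbq(...args: unknown[]): void;

const PIXEL_ID = '000000000000000'; // placeholder

// Disable automatic configuration (auto click listeners, metadata scraping) BEFORE init.
fbq('set', 'autoConfig', false, PIXEL_ID);

// Init; Manual Advanced Matching fields could be passed here if you deliberately choose to share them.
fbq('init', PIXEL_ID);

// From here on, nothing is sent unless you explicitly fire it, with only the fields you pass:
fbq('track', 'PageView');
fbq('track', 'StartTrial', { currency: 'USD', value: 0 });
```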
I don't necessarily disagree with you, and whether a pixel should be there at all is definitely a discussion in itself.
But for those who are implementing FB Pixels, I wanted to put out some potentially useful information that can help protect against unintended data disclosure, after mentioning the auto-listener behavior in a reply to another comment and being met with surprise[1].
Given their (paying) customer base, which skews more towards content producers, I suspect it's more likely intended to ease setup for less-technically-savvy users.
ie they don't really want your truly-security-critical customer data. But if they can boost their conversion rate with sites like dogfoodreviews.com by 5%, and the price is sending backblaze.com's fantastically-sensitive paid customer data into an unsecured data path, they will absolutely do it.
Comparable to the absolute havoc that Zoom wreaked on browser security to save one click on starting a call.
> I suspect it's more likely intended to ease setup for less-technically-savvy users.
> ie they don't really want your truly-security-critical customer data.
It's both. It eases implementation with a one-and-done snippet, and then slaps a user-friendly GUI on the other side for marketers to sort through the firehose and use what they want.
While making it also trivially easy for marketers to toggle a button that OKs the turbo-boost mode that siphons up (hashed) sensitive customer information, which can then be used to claim credit for additional conversions by cross-referencing the (hashed) PII siphoned up against what Facebook has for those exposed to your ads.
> I guess they can’t afford to cover costs given current prices so they sell customer metadata to Facebook
This sentence, your thesis, is absurd. Where does it say they make money from selling data?
Furthermore, "I guess they can’t afford to cover costs given current prices" is a really strange foundation to leap from. Do you have any facts? Or are you just speculating on BackBlaze and making wild assumptions?
> now that other like-minded crazies can find each other faster than ever
You are right, there is nowhere else online where "crazies" gather - not 4chan, not reddit, not voat, not twitter, just Facebook?
The last point I’ll admit exposes my bias against Facebook.
But the previous two points make no difference on why tracking is baked into an internal portal so I speculated as to the reasoning as anyone would do — it’s not a stretch to think that data owners could sell customer data to an aggregator like Facebook for an additional line item of revenue.
While this FB pixel debacle is obviously a very big screw up, it's pretty much a "screw up" and unintentional from what I understand so far. And they have fixed it already which is a positive step towards redemption.
From my speculation - the screw up seems to have happened from including the googletagmanager. They probably only wanted it to stay on the home page of B2 (for ad conversion tracking if I were to guess), not on the dashboard itself after login. The screw up caused it to be on the dashboard too.
Running and measuring ads is one of many things that delivers value to a business, yes. The privacy issue in this case is clearly an implementation mistake and seems to have been resolved.
Ignoring the situation and context to make a comical statement doesn't really add anything to the discussion.
The overwhelming mentality on HN is that all ads are bad.
And all targeted advertising is bad, because by their definition all ads are tracking ads.
This mentality also fits the current Internet and Twitter narrative. Especially when it involves Facebook, which happens to be pure evil on HN, in the Twittersphere, and in the mainstream media.
Ads are micropayments that work. I don’t like them per se and run an ad blocker and PiHole, but the fact that others don’t allows me to micropay for a lot of content with my time.
When I read a 5 minute Medium article, I'm paying with 5 minutes of my time. If I decide to bail out 1 minute into the article, I have still lost 1 minute of my time.
The creator isn't getting any benefit from it, but I'm still paying.
> Because by their definition, All ads are tracking ads.
John Gruber has an ad at daringfireball.net, currently for a company called Simris. IIRC the ad is pure text, not loaded by a script, and does not track you. Other blogs (usually security professionals ime) have text-based ads that are probably part of the theme in a static site generator.
I assume they get business value by retargeting site visitors. To do this, just run the FB pixel (properly configured!) on the marketing pages of the website, not in the logged-in part!
Why? Using ads to increase business is completely valid. This issue is data leakage due to an implementation error and has nothing to do with using advertising services from Facebook or other companies.
By this reasoning, using guns to shoot people is completely valid; the issue is stray shots due to inadequate aim and has nothing to do with being a criminal engaged in a drug war.
I'll resist the temptation to draw a parallel between "advertising services from Facebook or other companies" and a crime syndicate.
No, that's not the same reasoning at all. It's an irrelevant and outrageous strawman where you compared the use of advertising to "using guns ... as a criminal engaged in a drug war". Ridiculous at best and I'm not sure what temptation you resisted.
If you have a real rebuttal against advertising then reply with that instead and we can discuss how technical implementations can be fraught with security mistakes and errors, regardless of industry or product.
The point is that "technical implementations", such as how to shoot properly, shouldn't be discussed "regardless of industry or product", such as being a gangster: sharing PII with Facebook is something most web sites should avoid, not something they should do properly.
Why not? Technical implementations can always be discussed separately from the context they're used in, and even your extreme example of guns has perfectly valid uses in the police and military. Yet you're making the strange comparison to being "a gangster". Why? What's the point of this convoluted analogy?
> "sharing PII with Facebook is something most web sites should avoid, not something they should do properly"
Again, why? You seem to claim a lot without any basis. Data has valid uses, and being used properly is foundational to providing privacy.
It shouldn't be controversial that not sharing sensitive data with Facebook is "foundational to providing privacy" and therefore using a Facebook tracker to fuck users needs a solid, extraordinary justification like "valid" gun use by the police and military.
You seem to believe that this particular breach is accidental, but reckless incompetence on Backblaze's part isn't much better than deliberate disregard for user privacy: any online service from Facebook should raise a red flag.
If you agree that the sensitive part was in error, then you're just against sharing data with Facebook for ads? That's certainly not some unanimous global perspective as I'm sure you know, since that's their actual business used by millions of other companies.
There are ways to use ads without violating privacy or breaking the law (remember that this practice is illegal under the GDPR).
Either way, if you must do ad tracking, do so on your homepage. Once the user is logged in and has paid you money for a service there shouldn’t be any ads nor tracking.
> Finding new users that are similar to your existing customers is a completely valid strategy.
What on earth does “valid” mean here? It’s certainly not acceptable (to me as a customer) if it involves exposing your existing customers to these risks. Those ends can not justify those means.
Valid as in it's a common, reliable and efficient way to gain new customers.
Customers weren't intentionally exposed to that risk nor was it part of a trade-off, it was an implementation mistake for many reasons, something I've repeated 3 times now. What is so complicated to understand here?
Customers were intentionally exposed to the risk, because they intentionally added this third-party code. If they’re not thinking in terms of risk management when they add third-party trackers to their site they do not have an adequate security process. There is a trade-off to security whenever you allow code like that in your product. They can’t just wave it off as a mistake, because it’s a mistake that is very telling about their priorities.
It’s very simple: if you include un-vettable third-party code in your system, and system also handles sensitive data, you are dealing with a huge risk. You need to make sure that the code is unable to touch the sensitive data. As it turns out, it’s a lot harder than not having untrusted code and sensitive data in the same system in the first place. The direct mistake was probably that the wrong code was included on the wrong page, but if the risks involved had been taken seriously, such a small mistake would not have been able to have such a catastrophic effect.
> Finding new users that are similar to your existing customers is a completely valid strategy.
But this can be achieved with tracking in the homepage without embedding trackers in the actual product right next to sensitive data?
> Most people in this thread are making wild statements from the typical emotional/outrage driven pile-on when anything happens.
This doesn't make these statements any less valid though? Most people are indeed outraged that a paid professional product is ratting them out to Facebook which makes total sense as nobody would've expected that.
The thing that concerns me about the FB Pixel (and GTM) is that the host is completely free to do anything and everything to the page. Even if they don't do anything "evil" today, tomorrow is a different story completely. This scares the pants off of me and makes me want to rip out any "tracking" that I've ever installed on any site anywhere. Actually, that's probably not a bad idea.
Are there no browser level protections for this type of thing? I thought CORS was supposed to prevent these activities from happening.
Virtually all tracking boils down to 1x1-sized images getting embedded on the page, with various metadata attached to that image call. The javascript libraries may include other functionality (like additional fingerprinting and such), but are primarily just convenient abstractions that generate and embed the tracking images for you. Most provide the details needed[1] to build your own generator function, which would allow you to integrate the tracking you want while reducing your security exposure to third party code.
As for GTM – a deployed container is self-contained. If you don't want to expose your site to third party code, but want to use GTM as a convenient control plane for configuration of tags and tagging rules, you can do that. Instead of using the standard snippet that loads the container from Google, you can just grab the generated javascript file for the container after a new deploy and self-host it. It gives you the convenience of GTM (central control plane for tagging-related stuff, versioning and commenting, etc) but without the security exposure of embedding externally hosted scripts.
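To make that concrete, a minimal sketch of the loader pointed at your own origin (the container path is hypothetical, and this only mimics what the standard GTM snippet does; it is not Google's official code):

```ts
// Sketch only: serve a reviewed copy of the generated container JS from your own origin
// instead of loading it live from googletagmanager.com. Re-download and redeploy the file
// whenever you publish a new container version.
(function loadSelfHostedGtm(containerPath: string) {
  const w = window as any;
  w.dataLayer = w.dataLayer || [];
  w.dataLayer.push({ 'gtm.start': Date.now(), event: 'gtm.js' });
  const s = document.createElement('script');
  s.async = true;
  s.src = containerPath; // your origin, not googletagmanager.com
  document.head.appendChild(s);
})('/static/gtm-container-v42.js'); // hypothetical self-hosted path
```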
The actual 1x1 pixel is a leftover from the previous generation of tracking tools, and even the page you linked to recommends _against_ using that method because it can’t spy on users enough.
Here we are talking about a tracking _script_ embedded in the page and sending to Facebook everything the user does (“standard or custom events triggered by UI interactions”).
Using only a pixel to track how users move around the app wouldn’t have landed Backblaze in as much hot water. Instead, it looks like the Facebook _tracking script_ (automatically) exfiltrated sensitive data like file names, and that crosses a limit.
It's not a leftover – the core premise of how these scripts work uses the exact same principle. Even when using the JS tracking library, if you look at the network calls to Facebook after the initial script download, they're all hits to https://www.facebook.com/tr/ with the metadata for the call in query parameters, and they return an image content type (image/gif).
As I mentioned in my original comment, the tracking scripts are more than just generator functions for the image pixels. They also do stuff like browser fingerprinting and cookie management[1], and ensure these things get tacked onto generated pixel calls. This improves the fidelity of the data sent back to Facebook, but ultimately it all boils down to image calls with tracking data tacked on as query parameters to the call.
The reason Facebook (and others) don't recommend doing this is because
– As you mentioned, they have way more freedom to do what they want on the page when you load their actual script. So of course that's going to be their preference.
– Advertisers use these pixels for attribution purposes, but ad networks also use the opportunity to further fingerprint and profile users for targeting within their platform.
– The tracking script abstracts away the actual tracking protocol being used (i.e. the query parameters and their associated values). Which helps ensure calls are made correctly, as well as provides flexibility to make changes in the underlying protocol while retaining a stable interface via the JS SDK.
– Takes care of things like generating a unique user id, looking for and saving Facebook Click IDs when seen on incoming traffic, and tacking those values onto pixel calls when they occur.
Any user ID can actually be used, so long as it's unique (and Facebook's methodology is documented and easily replicated in [1], if you want to be consistent with the SDK). And persisting a query parameter into a cookie is actually more robust if done by a first-party script, since ITP has made the lifespan for cookies written by third-party scripts so short.
As long as your custom image generator accounts for those two components (generates a client id if none exists and persists + includes a fbclid if seen on incoming traffic), you will get close to parity with the JS tracking library as far as attribution in Facebook Ads without any need to load third party scripts from Facebook (or other advertisers). Which, as an advertiser, is the only part that you care about. What isn't at parity is all of the secondary fingerprinting that ad networks do, but that's the ad network's problem and preventing that shady shit from happening on your site is the precise reason you'd want to roll your own tracking calls to begin with.
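For illustration, a rough sketch of such a first-party generator. The cookie names (_fbp/_fbc), ID formats, endpoint, and query parameter names reflect my understanding of the pixel's network traffic and the parameter docs referenced above; treat them as assumptions to verify, and the pixel ID as a placeholder:

```ts
const PIXEL_ID = '000000000000000'; // placeholder

function getCookie(name: string): string | undefined {
  const match = document.cookie.match(new RegExp('(?:^|; )' + name + '=([^;]*)'));
  return match ? decodeURIComponent(match[1]) : undefined;
}

function setCookie(name: string, value: string, days: number): void {
  const expires = new Date(Date.now() + days * 864e5).toUTCString();
  document.cookie = `${name}=${encodeURIComponent(value)}; expires=${expires}; path=/; SameSite=Lax`;
}

// 1. Client id: reuse an existing _fbp cookie if present, otherwise mint one in the same format.
let fbp = getCookie('_fbp');
if (!fbp) {
  fbp = `fb.1.${Date.now()}.${Math.floor(Math.random() * 1e10)}`;
  setCookie('_fbp', fbp, 90);
}
const clientId: string = fbp;

// 2. Click id: persist an incoming fbclid into a first-party _fbc cookie.
const fbclid = new URLSearchParams(location.search).get('fbclid');
if (fbclid) {
  setCookie('_fbc', `fb.1.${Date.now()}.${fbclid}`, 90);
}

// 3. Fire an event by requesting the tracking image; only the fields listed here are ever sent.
function track(eventName: string, customData: Record<string, string> = {}): void {
  const params = new URLSearchParams({
    id: PIXEL_ID,
    ev: eventName,
    dl: location.href, // page URL
    fbp: clientId,
    ...customData,
  });
  const fbc = getCookie('_fbc');
  if (fbc) params.set('fbc', fbc);
  new Image().src = `https://www.facebook.com/tr/?${params.toString()}`;
}

track('Purchase', { 'cd[currency]': 'USD', 'cd[value]': '6.00' });
```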
As a first-party site owner, subresource integrity checks[0] (that someone else already linked elsewhere in this thread) lets you at least determine, at the browser request level, if a third-party script has changed since you installed (and hopefully audited) it.
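For reference, a minimal sketch of what that looks like (placeholder URL and hash); the static HTML form shown in the comment is the more common way to write it:

```ts
// Subresource Integrity: the browser refuses to execute the script if its contents no
// longer match the hash of the version you audited. Equivalent static form:
//   <script src="https://vendor.example.com/tracker.js"
//           integrity="sha384-...hash of the reviewed version..."
//           crossorigin="anonymous"></script>
const s = document.createElement('script');
s.src = 'https://vendor.example.com/tracker.js'; // placeholder third-party script
s.integrity = 'sha384-AAAA...';                  // placeholder hash
s.crossOrigin = 'anonymous';                     // required for SRI on cross-origin loads
document.head.appendChild(s);
```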
For various reasons including this, advertising tracking is moving server-side, where the company can much more tightly control what gets sent to the vendors, and where third party JavaScript no longer has access to the DOM, network requests, or cookies.
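To illustrate the server-side pattern (a sketch, not a description of how any particular company does it): Facebook's Conversions API, for example, accepts events as plain HTTP requests from your backend, so your server decides exactly which fields ever leave your infrastructure. Field names are from memory and the ID/token are placeholders, so verify against the current API docs:

```ts
import { createHash } from 'node:crypto';

const PIXEL_ID = '000000000000000';                    // placeholder
const ACCESS_TOKEN = process.env.FB_CAPI_TOKEN ?? '';  // placeholder

// Report a conversion from the backend; nothing runs in the customer's browser,
// and only the fields explicitly listed here are transmitted.
async function reportPurchase(email: string, valueUsd: number): Promise<void> {
  const event = {
    event_name: 'Purchase',
    event_time: Math.floor(Date.now() / 1000),
    action_source: 'website',
    user_data: {
      // Hashed identifier only; nothing else about the account is sent.
      em: [createHash('sha256').update(email.trim().toLowerCase()).digest('hex')],
    },
    custom_data: { currency: 'USD', value: valueUsd },
  };

  await fetch(
    `https://graph.facebook.com/v18.0/${PIXEL_ID}/events?access_token=${ACCESS_TOKEN}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ data: [event] }),
    },
  );
}
```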
The upside of third-party trackers is that you can completely block all of them by just blocking third-party javascript. What are we going to do once all of this tracking code starts getting served from the first party domain instead? Or even served inside the same source files as site code?
I imagine we will start seeing a new class of privacy extensions that behave more like anti-virus. Checking for known hashes of tracking scripts, monitoring for certain patterns of behaviour during execution.
The future is entirely server-side tracking, with no JavaScript executed in the client unless for UX tracking like Hotjar or A/B testing like Target or Optimize.
Personally, I haven't seen a desire in companies to skirt GDPR. Rather companies just want to be compliant and not have to worry about data breaches or reputational damage from their marketing tools. This example with Backblaze is exactly what companies are trying to avoid.
As a protection for the users, addons like Facebook Container for Firefox [0] can isolate all Facebook tracking and prevent the scripts from running on pages that are not facebook.com.
And if that doesn't tick your creepy boxes, let's try the financials: if a user hits your tracking pixel, they (and those like them) will be more likely to see ads similar to yours, meaning potential customers will now be more expensive to obtain.
One easy way to avoid this kind of mistake in your own product: make a clear distinction between your publicly-facing web site ("corpweb") and your web app for logged-in users. Preferably, they should be served from separate infrastructure.
Corpweb should be as static as possible, except for whatever third-party JS the marketing professionals think is necessary. It's their job, they know what's best.
Your app should have zero third-party JS except for technical analytics (New Relic, Datadog, whatever).
(This distinction can be fuzzier for free services, and for consumer stuff with non-sensitive data; Backblaze is neither of those.)
That made me LOL. You'd like to think they know best. Most of the time, they do things because other people do things, but don't truly understand the true ramifications of their requests.
Marketing: "What's the big deal? All you have to do is add the 2 or 3 lines of JS to the site."
Devs: "Do you know exactly what that will do?"
Marketing: "It'll give us all sorts of useful metrics for free"
Devs: "Do you know if it is secure or will cause our site to become less secure due to vulns in the included JS? Will it cause the site's performance to become sluggish where we will get blamed? Do you know exactly what data is being collected, and will it affect any of our other obligations of maintaining this site?"
Marketing: "Um..., that's your job. We just want the data"
I'm speaking from the perspective of a site, like Backblaze, where web app and the site fulfill two separate functions. There are lots of cool metrics that marketing wants; any code they put into www.backblaze.com is pretty low cost, and usually done by a separate team than product.
The product site itself (usually app.example.com, but Backblaze seems to use secure.backblaze.com) actually contains customer data in the browser context, is under much higher base resource loads from its core functionality, and is used repeatedly in workflows where poor performance is painful to the user.
No one gives a shit if your pricing page takes 500ms to load instead of 100ms, or if a dozen social media companies who already know where you work learn what kinds of professional products you're looking for.
They do care if a file listing takes longer, a recipe opens slower, if word frequencies in confidential data are leaked to the world.
In most paid products, the non-paid part of a site has radically different performance and security requirements from the paid part, and forcing one to be built to the requirements of the other (in either direction) is wasteful, or dangerous, or both.
This is essentially true for every single human, and it's the reason discussion forums have discussions: we keep being unable to see the other person's perspective and think we need to spread "the truth" to them.
If you are doing business in the EU, then you have to be careful about using 3rd-party stuff (Google fonts, embedded Youtube videos, embedded maps, chat widgets, ads, pixels, analytics etc. etc.) on the public-facing website as well.
> It's their job, they know what's best.
That's like sending an alcoholic down the spirits aisle.
As long as they don't have the car keys, they can go knock themselves out.
Re: the more substantive legal points - there are off-the-shelf solutions (CMPs, for example) and easy checklists for complying in the setting of a static public-facing website. The web designers and brand managers I've worked with are more than capable of meeting clear industry standards.
It's inside a web app, where customer data is on the page and in JS scopes, where the product team is essential in safeguarding customer data.
> there are off-the-shelf solutions (CMPs, for example) and easy checklists for complying in the setting of a static public-facing website.
In my experience, the people implementing these often don't understand enough about the technology to be allowed to implement this. I wonder if "easy to use" tag managers are to blame, by allowing non-experts to add JS and other includes to webpages without process or scrutiny.
Check a few big brand-name websites, and look at whether they place (third party) cookies before the CMP has even been interacted with.
I can think of some major high street labels where the consent prompt is mere theatre.
> there are off-the-shelf solutions (CMPs, for example) and easy checklists for complying in the setting of a static public-facing website.
Consent Management Platforms, things like Cookiebot?
In my experience, blocking 3rd-party HTTP requests, cookies, LocalStorage access etc. before consent is given – is easy in simple cases, but can quickly get technical and tricky.
I agree with this approach but there's an issue with "it's their job, they know best". They will inevitably want to put the same tracking crap in the logged-in site as well, so they can see how visitors converted into users, how they used the site once they were users, how valuable that made them as customers, and so on. As long as FB/other analytics firm is saying "we can help you market better with additional data", marketers are going to advocate sharing it.
Which is why it has to be a two way street - when it comes to the product, specifically its security and performance, engineering needs to absolutely own the thing.
> Corpweb should be as static as possible, except for whatever third-party JS the marketing professionals think is necessary. It's their job, they know what's best.
I strongly disagree. Marketing professionals often lack technical understanding and are superficial about the consequences. This mentality is how you end up with engineers working for companies making morally questionable choices, because they just want to be a cog in the system instead of being concerned about the direction and how the company does business. Separation between services is a shared illusion; if something is against your values, please tell your fellow human beings.
In a static website, where there are standard tools and checklists and web designers to walk them through it, marketing's lack of technical understanding is less of an issue. And in a B2B web app like Backblaze's and my own, the data exposed to the public web site is just not all that sensitive.
And I'm not talking as a prospective employee who "just wants to be a cog in the machine"; I'm talking as a founder and CTO who sets company goals. I'm worried about data leaks caused by poor implementation and short-sightedness, not those caused by company policy that I disagree with. If I disagree with company policy, I change it.
I'd be interested to see a site where marketing professionals with limited technical understanding knocked themselves out, but they used standard tools and checklists, and it came out OK. Do you have any examples?
I agree with this delineation, and if necessary, it lets the sales/marketing people go absolutely wild with the tracking, analytics and CRM-system integration on the public facing marketing website, should the C-level people decide to allow them to do so.
I’m baffled by this mentality but I guess it explains a lot. At which point are you concerned about what other services or even the company is doing? That’s a real slippery slope for me.
* Dealing with sensitive data is their core business, not a side concern. i.e. their customer base is security-conscious enough that they know incompetence on that front will kill their business
* Both of those specific services omit lots of specific identifiers in their data collection, and require you to go out of your way to send truly sensitive stuff to their servers
* By necessity, a technical analytics service gives you lots more control over where exactly it hooks in to your code.
The goal of marketing is to get product in front of the right audience. The proposed solution you mention would include plenty of folks who don’t convert. That would likely create less effective results for marketing.
Also, what would you use to monitor user behavior to improve your product if third party tracking is frowned upon?
> Also, what would you use to monitor user behavior to improve your product if third party tracking is frowned upon?
* user testing, user interviews
* (there's probably a fancy word for this) generate usage metrics from production data. If the product is e.g. a To Do app, you could measure "engagement" by counting how many To Do items each user has created
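A toy sketch of that second point, with an invented schema: the "metric" is just a query over data the product already stores, so nothing leaves your own infrastructure:

```ts
import { Pool } from 'pg';

const db = new Pool({ connectionString: process.env.DATABASE_URL });

// "Engagement" derived from production data: how many To Do items each user created
// in the last week. Feeds an internal dashboard instead of a third-party tracker.
async function weeklyEngagement(): Promise<Array<{ user_id: string; todos_created: number }>> {
  const { rows } = await db.query(`
    SELECT user_id, COUNT(*)::int AS todos_created
    FROM todo_items
    WHERE created_at > now() - interval '7 days'
    GROUP BY user_id
    ORDER BY todos_created DESC
  `);
  return rows;
}
```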
99% of the time, 1st-party tracking just means hosting the tracking script on their own domain, the data still gets sent to 3rd-party analytics services.
Well, here is where my hot take and general principle needs some nuance :-P
> The proposed solution you mention would include plenty of folks who don’t convert.
The one bit of data I currently have going from the app to corpweb analytics is precisely this - associating a conversion with a website user. That conversion info is sent with hand-coded triggers, the relevant third-party libraries are self-hosted (a good thing generally), and code doesn't call out to it when the user is outside of the billing/subscription flow.
> Also, what would you use to monitor user behavior to improve your product if third party tracking is frowned upon?
I'm open to A/B-testing stuff - the important part in my mind is less specifically about third-party tracking, and more about ensuring that the product/engineering team makes decisions about the product. Pulling third-party code of any kind, but especially tracking code, is a process full of footguns that should be under the control of people who know what they're doing and are empowered to say "no".
I would still be very cautious about giving analytics code full access to all user activity.
Google Analytics custom events and Datadog are in fact the only third-party analytics code I run in my app. Tools like GA that gather less-identifiable user behavior info are borderline in my book between technical and marketing; and I honestly just trust Google as an organization not to mix my own GA data into their marketing data.
Yup! I happen to not do that; part of why I trust Google on this (in the site owner role) is that they're good at making privacy-compromising options clearly-delineated and optional.
And this is a small subset of filenames that could provide not only PII but also potentially embarrassing or private information that isn't identifying on its own but would be accompanied by files that are personally identifying.
Just from looking at my own documents folder, "Stephen Tordoff - ESA Appeal SSCS1.pdf". That would reveal that I have a disability, and that I am (or have) claiming benefits for it. Not everyone would be happy with that being public knowledge, and I'd be less that thrilled if things like it were shared with Facebook for no reason.
I’ve worked for multiple FAANGs and can assure you that all have considered file names to be PII. Pretty much anything that contains user input should be treated as PII.
Law firms often utilize automatic content-derived filenames, up to the maximum OS-supported length. As a result you'll find all sorts of private information in the filenames within their backups.
I should also mention, those automagic naming schemes use the beginning text in the document. So it's basically the personal details of plaintiff vs. defendant.
I've also seen similar schemes used by doctor offices/hospitals, where you'll see the patient name and ailment. I once had to troubleshoot a backup problem for what turned out to obviously be an OB-GYN. Imagine my horror as I saw long filenames containing patient names and common STDs scroll by in the thousands.
No, that’s neither true nor “technically correct”. Services that the user provides health information to for their own purposes aren’t considered covered entities or their business associates, and thus HIPAA rules don’t apply. This is why Dropbox, and GMail don’t have to be HIPAA compliant.
If you say “Read the law strictly enough” please do...
You might be right. I believe I saw some definition that simply stated that a "system receiving or storing PHI" would be required to be HIPAA compliant, regardless of how the data got there.
I'm still wondering, because Backblaze will sign a BAA (their website says so), making them a business associate. I'm not talking about some private person uploading their own documents. My concern is that given that Backblaze will sign a BAA, then some companies must be using Backblaze and potentially storing PHI data there. Yes?
Backblaze then need to follow: "§ 164.312 Technical safeguards. (e)(1) Standard: Transmission security. Implement technical security measures to guard against unauthorized access to electronic protected health information that is being transmitted over an electronic communications network."
Facebook isn't authorized to access this data, but that might be more of a problem for Backblaze, even if Facebook could be required to delete the data.
Yeah, that seems like it would be on Backblaze, but this might not apply here, as 164.312 only applies to PHI, and it's quite likely that Backblaze will only sign a BAA for their B2 service, and not their standard backup service.
Pay careful attention to the response from Backblaze:
> "Hi Brett! The pixels we use are primarily for audience building when we advertise on other platforms like Facebook for example." [...]
The carefully calculated cutesy "Hi Brett!" with the exclamation point is the same reason big tech companies use infantile graphics [0]: by seeming playful, they create the illusion they are a Safe Friend you can Trust.
Using a salutation, and addressing someone by name is not a conspiracy to make people trust you. The things that you should care about are the banality of evil, and that no one believes that they've done anything wrong. I live in the Midwest and my job is to make low-impact CRUD applications for a small car insurance company. I would use the same salutation, because I have been taught that that is what I'm supposed to do. I wasn't coached in some session on how to trick people into thinking I'm their buddy - it just becomes part of the shared social vocabulary.
Modern business vocabulary has shifted from "Dear Mr. LastName," to "Hi FirstName!". This shift happened first in more "trendy" places, although most everyone has already moved on to using informal language in customer relations.
I do agree with your point about banality of evil.
Ha - you’re right. I suppose I object to "Hi $FirstName!" becoming the standard in professional communication. Often there’s nothing exciting that follows that exclamation mark, and the comma has been the standard so far.
Backblaze is role-playing 'trusted friend' on twitter the same way McDonalds and Wendys get into 'fights' on twitter. It's just corporate playbook stuff; I wouldn't say it's a conspiracy. Here's the latest posts on backblaze's twitter: https://i.imgur.com/mMkylym.png
I believe that your analysis has flaws. It would be quite awkward on social media to use a more formal way of addressing people. Twitter and other platforms have a "style" of conversation, and trying to fit the square peg of formal writing into the round hole of internet conversation sounds stilted. I do not understand why you think corporations would decide to do that, no matter what their intentions are.
I humbly await your response.
Yours Truly,
Mr. rPlayer
P.S. I hope your Grandmother is doing well. Please send my regards.
P.P.S. Please invest in my new cloud computing blockchain biotech startup where we sell NFTs.
Please update your signature to conform with the current standards, as outlined in the last month's circular.
Best Regards, TeMPOraL
--
TeMPOraL, Internet Compliance Officer (ICO)
ACME LLC - Synergizing Creative Accounting
ACME LLC, NaN NaN, Null Islands.
The content of this message is confidential and intended for the recipient specified in message only. It is strictly forbidden to share any part of this message with any third party, without a written consent of the sender. If you received this message by mistake, please reply to this message and follow with its deletion, so that we can ensure such a mistake does not occur in the future.
Please do not print this message unless it is necessary. Every unprinted message helps the environment. Think of the trees!
I don't use twitter so I don't know what the etiquette is over there, but outside of emails I wouldn't normally expect any salutation in an internet message.
So for me it's not so much that the salutation isn't formal enough, it's more that it's odd that it exists at all.
But then again maybe usages differ in twitterworld.
(For the curious: I couldn't be bothered to install https://projectnaptha.com/ (probably would've worked), I just resized a terminal to the same column width and blindly retyped the text into a shell printf statement.)
And also, when you upload something to Backblaze, you are already handing your files to a third party. There shouldn't be anything sensitive that you didn't encrypt client side.
Except... are the filenames encrypted? For many, if not most, filenames are plaintext. Backblaze's default "SSE-B2" encryption leaves the filenames in plaintext. GPG encryption typically leaves the filenames in plaintext. There's still information people consider private and didn't expect to be shared with Facebook.
That's server side encryption. My point is if you store sensitive files on a remote server not under your complete control, you should always use client side encryption. Good client side encryption will not leak filenames/directories either. A lot of them do, unfortunately.
There are programs that automatically encrypt files before storing them. A particularly enticing one is Cryptomator, which acts as a ‘disk’ keeping files on various providers and thus syncs between devices—however its Android app isn't open-source so I never actually tried it.
It's still revealing two levels of filename extensions: the type of encryption and the file format. The full filename should simply be a random alphanumeric ASCII string without any extensions, although this requires managing and storing a map of keys and filenames.
Random alphanumeric filenames (as generated by, for example, `head -c 30 /dev/urandom | base64 | tr -d '+/='`) are extremely suspicious-looking. To me that screams "obsessively paranoid". I reckon that if I were in law enforcement, the fact that there is absolutely nothing I can infer from these filenames, combined with the obvious complexity associated with correctly maintaining something like this, would actually make me that much more interested in decrypting this information, simply to take a look at it and rule it out.
Which is exactly why this would be a scenario in which I _would_ want to "reuse someone else's password", if you will, and I'd theoretically go digging for common archival file patterns, and use the most common I came across.
On-disk filenames should be hashes, much like .git contains lots of hash-named files. My Restic backup folders don’t show much beyond the fact that they are Restic backup folders (if even that?).
>I reckon that if I were in law enforcement, the fact that there is absolutely nothing I can infer from these filenames, combined with the obvious complexity associated with correctly maintaining something like this, would actually make me that much more interested in decrypting this information simply to take a look at it and rule it out.
If I ever did something shady, I'd absolutely make a ton of honeypots like this.
rsync.net was posted here on HN a few days ago.
Also, tarsnap is another popular service.
Neither has the special additions that make Backblaze so popular, but they could be popular alternatives.
From a quick look at pricing pages it looks like rsync.net is 5x as expensive as backblaze b2, and has a minimum of 400gb per month (it also looks like you might have to preallocate vs pay on demand). And tarsnap is 10x as expensive as rsync.net.
My guess is that the bulk of that price difference is due to economies of scale.
One big difference is that with B2 you can't have append-only backups. (or maybe it's very non-obvious from the docs?) That's pretty much why I use rsync.net instead - I can configure a separate key for uploads. Otherwise what's to stop anyone lifting the B2 access key from the host with automatic backups configured and just deleting all old data?
I don't have huge backup needs (around 500GB, doesn't grow much over time), so I just use duplicity and upload to AWS S3 with the "infrequent access" setting, and have the bucket auto-replicate to another region in another country. Costs me around $12/mo just for the data storage. The AWS calculator tells me if I had to retrieve all that data it would cost under $10 to do so.
Given it's all personal data that I can stand to be without for days if needed, I could probably use Glacier (or even Glacier Deep Archive) and pay less than a third of the cost (or less than a twelfth!), but the absolute dollar amount isn't high enough for me to go through the trouble of changing up my backup scripts.
Sure, if you're running a business and are constantly generating lots of new data that needs to be backed up, that gets more expensive, but I also would expect a business with that much data could easily afford to spend several orders of magnitude more than I do on backups.
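For what it's worth, if you script uploads directly instead of going through duplicity, the storage class is just a per-object parameter; a rough sketch with the AWS SDK for JavaScript v3 (bucket and key names are made up):

```
import { readFile } from "node:fs/promises";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "eu-west-1" });

// Upload one archive file, choosing the cheaper storage tier at write time.
async function uploadArchive(path: string): Promise<void> {
  await s3.send(new PutObjectCommand({
    Bucket: "my-backup-bucket",                  // hypothetical bucket
    Key: `archives/${path.split("/").pop()}`,
    Body: await readFile(path),
    // STANDARD_IA = infrequent access; DEEP_ARCHIVE is cheaper still, but
    // restores take hours, so only use it for data you can wait for.
    StorageClass: "STANDARD_IA",
  }));
}
```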
I use Linode Object Storage (an S3-like service) with rclone to manage my backups. (My backups are only a few dozen GB, so I don't need something like duplicity at the moment.)
So far I'm happy with it; pricing is $5/month per 250GB plus 1TB of egress. And, most importantly, none of the AWS complexity overhead. I have enough of that at work.
So, looking at some potential things to switch to...
OVH has two interesting products here:
OVH Cloud Archive works with rsync, costs roughly half as much for storage as Backblaze, roughly the same for egress, but charges for ingress (at the same rate as egress).
OVH object storage is S3-compatible (like Backblaze), charges roughly the same for bandwidth, and 2x for storage.
DigitalOcean has a blob store with the same pricing on bandwidth, 4x the pricing on storage, a minimum spend of $5/month on storage, but the first TB ($10 worth) of bandwidth free.
Before you experience a major problem, you tend to guess what is good enough to mitigate it. Once you've experienced a major problem, you tend to ensure that exact failure mode never happens again.
I would be surprised if OVH had another fire in the future.
The only reason I use Backblaze instead of tarsnap is it is 62 times cheaper for the same amount of storage. tarsnap is dramatically overpriced if you ask me.
I get the feeling the reason for the Tarsnap price is simply a lack of scale and its bootstrapped nature (i.e. there isn't a VC feeding money into hosting costs to drive prices down and grow the userbase). The cost seems fair, but noncompetitive against other vendors on most easy-to-find quantitative measures (cost, features, SLAs, locations, etc.).
However I do think there is value in Tarsnap if security is really that important. Colin is really switched on, and I trust both the service, and if anything happened, he would deal with it quickly and professionally. If I had the kind of profile that meant I needed to protect myself against determined attackers, then tarsnap would be a no-brainer for me.
Backblaze is also bootstrapped, no VC investment except a small initial round. They’re likely investing in the business using the profits thrown off by their unlimited desktop storage offering.
And tarsnap possibly uses the most expensive tier of S3:
> the original version, which can survive the loss of 2 datacenters, not the "reduced redundancy" version which can only survive the loss of a single datacenter
These days, there's not just reduced redundancy, but also infrequent access, which seems better for backups…
Keep in mind that their storage boxes only use RAID and do not provide any other kind of redundancy by default. If the machine goes up in flames, your data is gone.
First, we only offer SSH / TCP22 so the transit is encrypted.
Second, we have installed, and maintain on the server side, tools like 'borg' and 'rclone'. So while you might just 'rsync' or 'sftp' your data to us (in which case it would not be encrypted on our end), you can also use sophisticated, encrypted backup tools (borg, duplicity, git-annex, rclone, restic, etc.).
If you choose a tool like that, rsync.net does not hold the encryption keys. The data appears to be random from our viewpoint.
Can someone here translate the PR/marketing speak here for us mere mortals? How does having Facebook tracking on the web front-end of existing and paying users help with lead generation?
Within Facebook, you can use the event stream collected by your FB Pixel to both define conversion criteria as well as create audiences and define inclusion/exclusion criteria for that audience. When it comes to tracking on pages behind auth, primarily it's for audience building which can be used for
– Cross-sell/Up-sell campaigns. Build an audience based on usage patterns, and create a campaign for a complementary service or higher tier (say, for example, when someone clicks the button for a gated feature they don't have access to).
– Suppression lists. If you don't want your campaigns to target existing users, you can build an audience from pixel data on your authenticated pages and suppress against that.
– Lookalike audiences. After you create an audience in Facebook, you can create a "lookalike audience" from that. So even if you aren't actively doing either of the above, you'd derive value from tracking your "best" customers and using it as a seed list for a lookalike audience.
You're also not limited to using the FB Pixel for any of the above. In addition to a browser-side pixel, FB allows you to upload hashed customer information and use those for conversion tracking and audience building. Which used to be completely transparent to end users, but now you're able to see a list of companies that have uploaded your info to FB in this manner (I can't recall where it's buried in the user settings, off the top of my head).
All of that said, it's entirely likely that Backblaze wasn't intentionally sending any of this data to FB to begin with. An insidious aspect of FB's Pixel is that it automatically attaches listeners to a bunch of stuff on the page such as buttons and sends back interactions and associated metadata[1]. The flag to disable this isn't mentioned in the implementation instructions that are generated upfront, and it's actually a fairly uncommon trait for ad pixels. So a typical implementation tends to leave it on out of ignorance rather than make a deliberate determination on whether to use or disable that functionality.
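For anyone who wants to check their own implementation: if I remember Facebook's docs correctly, the auto-attached listeners are its "automatic configuration" feature, and it can be switched off before init, roughly like the sketch below (placeholder pixel ID; verify against the current docs rather than taking this from memory).

```
// fbq is the global function installed by Facebook's base pixel snippet.
declare const fbq: (...args: unknown[]) => void;

// Turn off "automatic configuration" (auto button listeners + page metadata)
// BEFORE init, so only the events you explicitly send will leave the page.
fbq("set", "autoConfig", false, "000000000000000"); // placeholder pixel ID
fbq("init", "000000000000000");
fbq("track", "PageView");
```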
"An insidious aspect of FB's Pixel is that it automatically attaches listeners to a bunch of stuff on the page such as buttons and sends back interactions and associated metadata[1]. "
Um. Holy crap. Is this common knowledge? I don't get shocked easily these days but... wow
I'm not really sure if it's common knowledge, but I'd say it's less common than it should be. I went ahead and made a top level comment[1] with some info about controlling data disclosure to FB's Pixel in case it's helpful to others.
In the analytics space, you traditionally had to:
– Initialize a tracking object on page load
– Explicitly call methods on that tracking object when you wanted to actually send a hit
This is how it works for Adobe Analytics and Google Analytics historically. Many of the newer analytics providers instantiate auto-listeners, which gave them an edge on the out-of-the-box analytics features. And Google Analytics 4 (the newest release), also does this.
So it's not unheard of for site analytics. And a quick glance at a particular provider's website can usually make it obvious if this is occurring, based on the advertised features.
Ad pixels tend to be different though. You create a conversion event within the ad platform, and you're given a snippet of code to fire when that specific event occurs, which both instantiates the tracking object and calls the tracking method with the conversion event's configuration details.
Facebook's pixel works far more like a modern analytics library than an ad pixel. It vacuums up the hit data from the site and the marketer is able to sort it out after the fact in Facebook's interface and use what they want from it. Marketers working within Facebook can see this is happening because they set up the conversions and audiences against the hit data, but that's "just the way things work" in Facebook so they think nothing of it. Marketers coming from other channels will notice how different it is, but won't realize what's actually happening nor the implications behind it. Devs would realize pretty quickly what's happening after a few minutes exposure to the FB Pixel interface and it'd trigger a red flag for them, but that's marketing's territory and all devs see are the snippets provided for implementation. So the only time most people become aware of it is if marketing has someone technical working directly within FB's interface or if the person tasked with implementation has a reason to dig into Facebook's dev documentation rather than just plop the snippet on the page like they were told.
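To make the contrast concrete, the "classic ad pixel" pattern described above looks roughly like this: nothing is reported until the page explicitly decides a conversion happened and chooses what to attach. (The event name is one of FB's standard events; the handler is hypothetical.)

```
declare const fbq: (...args: unknown[]) => void;

// Old-school pattern: the page decides when a conversion happened and
// exactly which fields to report, instead of auto-vacuuming interactions.
function onSignupComplete(planName: string): void {
  fbq("track", "CompleteRegistration", { content_name: planName });
}
```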
Thanks for the link, I didn't know that was a thing.
Having said that, this sounds like "Hey, guess what? We are gonna snoop on you, profile the hell out of you, and leak your sensitive data all over the place (filenames can be sensitive), all because you are a paying customer."
That's about the worst way to disrespect a paying customer. Is there a way to easily identify companies that do this, so I can avoid them?
No, they don't need to snoop on you. Just fire a single pixel saying this is a user that they would like to find more of. FB does the rest based on data in FB, not on Backblaze. The mistake was them just adding it to the online dashboard because that's the easy way to automatically include paying users.
> "easily identify companies that does this"
No. And as explained above, this can happen for a lot of reasons, which makes it even harder to check for.
That PR/marketing tweet comes across as being from someone who doesn't understand how big a deal this actually is and why customers won't be comfortable with a FB pixel on their dashboards alongside data like filenames and sizes.
Some might find this tweet useful for the time being:
> What happens if you resolve http://facebook.com to 127.0.0.1 via hosts-file?
(Or put it into your Pi-Hole DNS Ad-Blocker, or the like.) Does the Backblaze UI still work?
> Answer: Seems to work well. You need to add the main domain and sub domains (www. in this case), btw.
Including the filenames seems to have been unintentional; it looks like they were logging analytics events to Facebook, probably an event (form submission), but it uploaded the form HTML with its contents.
But why they need to submit that to Facebook for paying users I don't understand. The only thing I can think of is excluding active users from advertising... But is that worth the privacy intrusion?
They answered in the Twitter thread: they send data about paying users to Facebook so they can build lookalike audience targeting for new user acquisition. Other major ad platforms (Google, LinkedIn) have similar features.
This doesn’t look like the right data to be sending FB for that, though.
Why not just use client side encryption and be done with it? Why would anyone browse their personal files remotely on a web ui? If you care about privacy, take the proper precautions.
There really isn't any competitor to B2 on price, or convenience (e.g. them sending you a NAS device for recovery).
Signing up for Wasabi resulted in a deluge of unwanted marketing spam, all written to be "cheerful and your friend"
Also, huge warning for Wasabi newbies so you are not surprised. This might wreck your wallet.
Your egress is capped at your total amount stored. You cannot store 5TB and have 6TB of downloads against that account. The front page is covered in 'No Charges For Egress', 'no additional charges for egress or API requests', etc., but the cap is not written anywhere on the main marketing pages, only behind a small link.
Another important point is file deletion. You are required to pay for multiple billing cycles (months) of storage per file. Be very cautious about this. Using it as a temporary bucket will incur significant costs: if you upload a file, YOU WILL PAY FOR 90 DAYS OF THAT FILE. It is almost certainly not cost effective to store files in Wasabi for anything less than the long term. Deleting a file that you just uploaded will incur a deletion fee equivalent to 3 months of storing that file.
One thing people should bear in mind when looking at Wasabi is that they have a 90 day minimum retention period for their PAYG plans. So if you upload a large file and overwrite it tomorrow, you'll be charged the storage fee for the day plus the storage fee for the remaining 89 days, and then the storage fee for the new file, and so on.
The same goes for AWS IA and Glacier classes and Google's Nearline and Coldline. Unless you're storing files for a long time, always factor in the minimum retention period before estimating costs. It'll prevent any nasty billing surprises -- speaking from experience unfortunately.
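A rough worked example of that gotcha, assuming Wasabi's list price of roughly $6/TB-month (check current pricing): upload 1 TB today and delete it tomorrow, and the 90-day minimum means you're billed for about three months of storage, roughly 3 × $6 ≈ $18, instead of the ~$0.20 a single day of storage would otherwise cost.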
They could maybe salvage goodwill with genuine corporate soul-searching that ends up asserting/reasserting values -- and leads them to focus on providing trustworthy service to their users, and conspicuously away from some "tech" industry norms of selling out one's users.
As a provider of a paid service, it seems like they're in a better position to take the high road than a lot of tech companies are, but they have to decide that's who they are, and be clear they mean it.
Yev from Backblaze here -> we’ve looked into and verified the issue and have pushed out a fix. We will continue to investigate and will provide updates as we have them.
The exact phrasing you have used here is repeated multiple times in the Twitter thread, and I can only conclude that "pushed out a fix" is what your marketing department has decided to call what you have done.
What you have right here and right now is a public relations disaster. Trust in your brand has been damaged. It cannot be repaired by you providing minimal information. Your standardised message is akin to "Don't worry your little heads over the details - trust us, everything is fine now", and to be honest I find it a bit insulting. As far as we know, "pushed out a fix" could mean that you have hidden the tracking, so it is harder to find. Your short message is making the public relations disaster worse, not better.
These are the steps that you need to take:
1. Provide an explanation of why tracking was being performed in the first place, including an analysis of how much of that was a mistake.
2. Make an apology for breaching your customers' trust. This is a really important step, and it should be repeated in each of your press releases.
3. Provide details on the steps you have taken to fix the problem, and what that means for tracking data.
4. Make a promise that strictly limits the level of tracking that you will be allowing yourself to make in the future. Ideally we would all want that to be zero, and if you intend to do business with certain jurisdictions then you are limited to what is legal, but you must in any case be clear about what tracking you will ever do.
Honesty and transparency are the keys at this point to restoring your brand. I do not think the community will accept anything less.
Yev, if you are not in the upper management chain of Backblaze, please show mnw21cam's message to someone who is.
The problem is not that there was a little bug which caused the Facebook tracker to get a few little pieces of information it shouldn't have.
The problem is that Backblaze failed to understand how to distinguish appropriate and inappropriate uses of third-party trackers for signed-in users on a security-critical application. The Facebook pixel should never have been there at all. It shouldn't even have been considered. It should've been an absolute no-brainer that Facebook has no business being on secure pages on a critical infrastructure service for paying customers.
The fact that the pixel even showed up at all on a logged in page represents a breach of trust for customers and casts doubt on Backblaze's competence in handling security issues. This warrants a serious reply from the CEO, not a copy-pasted meaningless reassurance.
"What you have right here and right now is a public relations disaster. Trust in your brand has been damaged. It cannot be repaired by you providing minimal information."
I'm not sure that's true.
There are a number of barely-technical subreddits, typically centered around Plex or some flavor of bittorrent, that consist of an all-day-every-day request for "as much cloud storage as possible for the lowest possible cost and please say it's free".
These are not HN readers. It's as if they attained consciousness two minutes ago and one minute ago they decided they needed cloud storage.
This is the audience these kind of analytical tools are geared toward and I don't think this changes their "engagement" or their "convertibility" or their "lifetime value".
The real information here is that in 2021, and in conjunction with a much more sophisticated product offering (B2), Backblaze is very aggressively pursuing flat-rate, loss-leaders who can be influenced and targeted by facebook.
I understand that is a lower price than your own product, but it doesn't necessarily follow that they're losing money.
You seem to be implying that they're covering the losses on storage by selling customer data. Is that correct? In which case, would you like to be a bit more specific and explicit?
No - I am drawing a distinction between B2 and the "unlimited" plans that I assume to be loss-leaders.
Aren't those the kind of users you can monetize through social media "engagement" ? I don't think it's the B2 users... which class of users had data leaks ?
The floor on keeping HDDs spinning is about $4/spindle-month (assuming $0.60/kWh total cost in a data center), and 150 drives probably eat more power than the rest of the enclosure and a rack switch, so call it $6/spindle. Backblaze aims at $30/TB upfront cost and seems to have only >=4TB spindles, so there's plenty of room for unlimited-plan users to keep making them money at $6/month. I, as a fairly nerdy person, only need to back up a hair under 2TB at this point, and I assume I'm past the 90th percentile of BB's unlimited customer distribution, and they would still make money off of me if I kept 2TB there >15 months on the unlimited plan ($30 / (($5 − $6 per spindle-month × 2TB / 4TB per spindle) per month) = 15 months to break even).
I think their best advertising for B2 is the quarterly hard drive stats; it's a shame they made this mistake with the FB tracker. I can't store data in my garage as cheaply as they do: ~100W continuous for a box with drives at $0.30/kWh (CA PG&E) is $20/month.
I'm jumping through the hoop of s3backer to turn B2 into a vdev for a zpool with encrypted datasets; I'll have to try something similar on rsync.net with sshfs.
But just to be clear: is your claim that the unlimited storage consumer backup plan is a loss leader, and they cover their costs by selling the backed up data (or metadata) to the likes of Facebook?
Then I genuinely have no idea what your goal here is. What is your claim?
It seems like you're trying to waft a very serious claim in the direction of competitor, but you're being very careful to avoid saying anything explicit or specific.
My long held contention - which I have repeated in this forum many times and wrote into a formalized blog posting 12 years ago[1] - is that flat rate data service offerings pit the provider against the consumer in an antagonistic relationship.
Which is to say: the provider has a vested interest in minimizing your usage and incurred costs which runs directly counter to the consumers desire to use as many resources as possible. This antagonistic relationship leads to all manner of dysfunctions and bad patterns.
When I see serious businesses "enhancing engagement" with facebook pixels, I think that perhaps that is one more side effect of that antagonistic provider/customer relationship.
HOWEVER, it turns out that the tracking code was on the B2 side of things - the people-paying-money side of things - and not on the who-will-let-me-upload-movies-forever side of things.
So my sense was wrong.
I was suggesting that this might not be as brand damaging - and trust eroding - as my parent suggested. After all, both sides of that unlimited flat rate storage relationship are pretty dysfunctional. If this was on the B2 side of things then I take it back - it's probably quite damaging.
Regardless: I stand by my disdain - and continue to warn against - flat-rate service offerings. You want your provider to happily enable you to use more of their product.
While on a personal level I agree with you, there is a saying in Dutch: "je moet niet wrijven in een vlek", which translates to "don't rub in a stain".
Sort of like the Streisand effect: if BB starts posting press releases about "breaching trust", filled with apologies, the audience that becomes aware of this issue will be a lot bigger than it currently is.
So from a PR perspective, a one-line, low-key "we pushed a fix" reply makes total sense.
Being open and making clear that you understood you goofed would be total opposite of the Streisand effect - where you'd try to _silence_ discourse with the unintended effect of amplifying it.
Right now, the issue is being downplayed, much to the chagrin of an increasingly knowledgeable set of computer users.
Yev here -> re: fixing the issue - we removed the offending code from the logged-in web pages. Rest assured that we are taking this very seriously internally and we will continue to investigate and provide updates as we have them.
Yev here -> Absolutely hear you on that. We wanted to make sure we knew what we were talking about before we made any declarative statements. We just finished our root cause analysis and updated our blog post with updated information: https://www.backblaze.com/blog/privacy-update-third-party-tr....
Yev here -> we wanted to make sure we were able to investigate the root cause so we knew what we were talking about. We've completed that root cause analysis and have updated our blog post with additional information -> https://www.backblaze.com/blog/privacy-update-third-party-tr...
Thanks for participating in this thread. As a longtime paying customer, I consider this a monumental security breach and I will be leaving the service. It’s clear that Backblaze have prioritized growth hacking or whatever over the security and privacy of me as a paying customer, and that your security processes are woefully inadequate.
I might very well start using backblaze next year or maybe even next month[1], but that is depending on the outcome of this event.
For a comparison: a good friend once called me to apologize because he had been laughing behind my back with some friends.
Guess who I definitely trust today? The one who admitted his mistake. He always was a nice bloke and I guess he will never do anything like that again.
[1]: I won't start using it this week or the next however.
What kind of scenario do you have in mind? I think it’s possible to turn an incident like this around, PR-wise, but I can’t see how they will explain how they can sell something as secure and trusted, when their security process was unable to discover that they had deployed spyware in production. If it was there for a few hours before it was discovered and removed, well maybe.
Just hope they announce the changes to security and implementation processes rather than just if they fix this issue or not. This really shouldn't have occurred in the first place, so you want to know they've fixed the root cause, bad process, not just the symptom.
As another data point, Backblaze has pretty much until this weekend to provide an update that includes "we have removed Facebook and are getting a 3rd party review for other security holes in our product". After this weekend, I won't care because I'll be on a different product.
This is embarrassingly bad, as in, I'm now embarrassed for recommending using Backblaze at my company.
Crap, I just realized I got Backblaze installed at two previous companies. Thanks!
Thanks. However, it is just not excusable and a breach of trust. The Backblaze Twitter communication makes it pretty clear that they don't see a problem with tracking paying customers. We are moving somewhere else.
If a company writes childish messages in a place where childish messages are expected, I wouldn't use that to judge the rest of the organization.
That might be and I am not judging you for this perspective. From a company handling my data I however expect them to be professional wherever they articulate themselves about a serious problem.
Does Wasabi do something similar? Have you checked? There’s probably a very limited window to do that, I’d imagine everyone in this space is checking their trackers now.
I just signed up for a free trial with Wasabi and they include trackers from Google analytics and LogRocket. I'm not sure if they send filename data across the wire.
I've been considering using Backblaze for both personal and company needs -- and we're talking 50+ TB here -- but this incident made me reconsider.
I'd still use Backblaze, but that's VERY dependent on how you handle this. Just saying "we fixed it" doesn't answer the much more fundamental question of "what is the FB tracking pixel doing on a privacy-critical page in the first place?".
Please, do a thorough post-mortem analysis and publish it. Looking at the comments here, this could mean you get or lose the business of many.
Fixing the problem is only half of it, you need to make a commitment to a comprehensive and transparent review of the engineering practices that allowed this to go to production. And fully disclose how long it has been going on.
Hi Yev. It's great to know that you fixed it. Could you share since when you had this issue? I'm a long-time paying customer and feel somehow betrayed here.
According to Backblaze's own policies, they will be emailing you about this data breach "without undue delay" — at least if you've logged in while the breach was present:
> In the unlikely event of a data breach, as defined in the GDPR, Backblaze will without undue delay send its affected customers a notification email, and provide at its discretion, updates through other communications channels. This notification will describe the nature of the data breach, including where possible, the categories and approximate number of data subjects concerned, the categories and approximate number of personal data records concerned, the contact point where more information can be obtained, the likely consequences of the personal data breach, and the measures taken or proposed to be taken by Backblaze to address the data breach, including, where appropriate, measures to mitigate its possible adverse effects.
Will you be pushing out a fix to address the complete lack of any process that should have flagged and prevented this from happening in the first place? The quick fix I am pushing out for my clients who use Backblaze is moving them to another backup provider.
Yev here with a brief update on the fix that was pushed out - we removed the offending code from the logged in web pages. We will continue to investigate and provide updates as we have them.
I would think the type of customer that uses Backblaze is orthogonal to a typical Facebook user and is actively hostile to Facebook's practices. Even considering using their spyware in any shape or form is an egregious breach of trust, and the response from your company needs to be significantly better to restore confidence.
I'm in engineering at a financial services company. When we built our front-end UI for eKYC, our marketing team requested Google Tag Manager, the Facebook pixel, and various other tracking features to be built in.
I had to fight hard as an engineer to make sure that it did not happen. We had meeting after meeting, and it took a lot of effort for me to explain the risk of data leakage. I was questioned on my "insecurity" for not "trusting" people. It was not a nice experience. I had to inform them that tracking needs to be dealt with properly, not just lazily installing Google Tag Manager because it gives marketing 'flexibility'.
The only thing I knew about 'tag manager' before this was that it was always blocked by NoScript. Your comment made me go look up what it does and now I know that I will never unblock it.
Apparently it lets people drop in random code from a bunch of different analytics platforms, so it's pretty much guaranteed to consist entirely of the sort of stuff I have NoScript enabled to block in the first place.
I've seen GTM take down production multiple times because of marketing shipping random JS with no approvals.
Some random guy in his basement assured someone in marketing they could handle our volume? Chuck their tag in and watch their website get DDoS'ed with millions of requests per minute, which takes out our website because marketing made it fail loudly.
I'm surprised that GTM doesn't handle that. They would have a good idea of how long requests are taking to different domains and limit requests to the slower ones.
I mean, yes, true, but it kind of misses the point: marketing doesn't ask you, they go over your head. And we're just the BOFH pinheads who make everything so harrrrrd with our stupid "concerns."
IT can often be "we make someone else's bad idea happen," and that's because we simply lack veto power.
That's par for the course for technically orientated roles. I was labelled 'defensive' and denied a pay increase because a business development executive wanted to make our documentation dynamic based on user access, and I pointed out it was difficult to find a solution when users can have over 400 access permissions, which varied by country, and our documentation was 900 HTML pages, some of which were equivalent to 200 A4 pages.
If you are the most knowledgeable person then you get blamed for their bullshit fantasies being impossible or unwise (or illegal)
I was questioned on my "insecurity" for not "trusting" people.
"I trust people just fine. I trust people working for outside companies dependent on information gathering to gather information. Google has no fiduciary duty to our clients. Same with Facebook. We DO have a fiduciary duty to our clients and it includes not doing things that may send their confidential information to third parties because SOME of the information used may be useful to our marketing department."
The reason for that is I'm the CTO, responsible for building out the tech / engineering team.
Hiring people is difficult as there is a lack of supply of talented people. We hire InfoSec on a contract basis, not full-time, and they don't join such meetings due to the nature of the contract. So all the responsibility fell on me to defend our technical decisions at that point in time.
I'm working on building out the engineering culture / awareness within management now, to ensure these things do not happen, and I don't have to be questioned as to why we cannot install "google tag manager" in our front-end.
It all comes down to creating awareness, and making people understand. Fortunately for me our CEO gets it, he ended up siding with me.
Honest question: how can a financial services company not have an in-house infoSec team?
To me, this is an even more concerning issue. But then, I have no idea how the finance services world works, so maybe this is more common than I think?
FinTech doesn't always mean global mega bank. There's lots of small scale start-ups that fit into the financial services category that wouldn't/couldn't afford full time InfoSec roles.
Outsourced CISO/InfoSec is a valid and reasonable thing for some companies.
I feel like a small scale startup needs internal infosec and audit teams even more. Unlike the incumbents, who are "too big to fail" and therefore are able to get away with blatant insecurity, a startup's in a much more vulnerable position, and any security breach is significantly riskier in terms of corporate longevity.
If I was running a financial services startup, those groups would be near the front of my list in terms of internal hiring.
There are huge repercussions. We are governed by PDPA (our GDPR equivalent) law where penalties are extreme. Thailand takes Data Privacy as seriously as EU.
But again, since it's not always easy to find the right people I end up having to fill in for everything we don't have a team member to execute on.
Worth sharing this thread around your company, maybe?
Could help with convincing people that there are negative marketing consequences to careless and seemingly harmless decisions.
No, it has nothing to do with tax. Usually contractors have a very fixed scope; they focus on doing what is in the contract. Things that come up ad hoc, like marketing requesting the installation of Google Tag Manager, are outside the scope of the contract. It would require a lot of giving them context, amending the contract, etc. It's not necessarily convenient to have to ask them to come in every time there is a problem.
Usually I try to reason with management first. If it can be resolved internally we would not include outside consultants, however if it gets serious beyond something we can handle internally we would ask outside consultant to come in.
LOL, I worked at a place where we uploaded all of our inventory for a Facebook integration. We charge customers millions for this data, but "we need to show up on Facebook" takes the cake.
Sure, in the end I asked our head of compliance to join the meeting. I made it very clear to everyone what was at stake, and that I’ve done my duty in raising awareness. If they would like to proceed I hold 0 responsibility. Usually when you do it like that no one wants to put their neck on the line. Our head of compliance take this stuff really seriously as he has to report to central bank, so him and our CEO ended up agreeing to not use GTM.
You really need to present well and be careful with arguments like "millions of websites use GTM". I did days of research and presented that while using GTM on WordPress sites that hold no sensitive data might be fine, we are a financial services company and we collect customers' private data. So getting everyone on the same page and presenting alternative ways of solving the problem was critical.
It's scary to think that a company that seems to have a decent policy on privacy / data collection practices at one moment is just one step away from some marketing manager or MBA changing that. It's really hard to regain customer trust once you lose it, and in Backblaze's case it seems to be for marginal, if any, monetary benefit.
I think part of the reason is that most of these companies don't value customer trust.
It's a lot more than just the person in marketing not realizing. You can't expect marketing to understand these things.
It means there's either nobody reviewing the privacy implications of marketing decisions, or that somebody who knows better is reviewing these decisions and decided leaking data like this is acceptable.
Both those possibilities make backblaze a non-starter for me now.
> You can't expect marketing to understand these things.
I work in marketing ops and I know how difficult it is to get marketing folks to understand or care about how the tracking they use works. There’s a small but growing number of us trying to change behaviour and awareness from the inside out but if marketers fuck up on privacy an example should be made of them.
Exactly this. If this was possible, and either nobody said it was a bad, bad idea or somebody did and got overruled, it makes me think this product is no longer safe. I'll give them some time to publish a post mortem, and in the meantime I will look at where to move my data.
It's been a problem almost everywhere I've worked. Marketing / business folks just want their analytics, and they don't want to spend any money on it (e.g. they don't want to pay for some privacy-respecting 3rd party service, and they don't want to pay devops to stand up an internal OSS alternative).
Unless you have really strong dev leadership, the trackers will end up in your product.
I'd love to hear from folks who have successfully blocked their business team on this. What tactics did you use?
* Announces this breach to the relevant data protection authorities. They have 72 hours from learning about it to give an initial report to the UK's ICO.
* Makes a blog post apologizing, explaining how it happened, and what they've done to prevent it happening again.
I doubt it’s intentional, but companies should seriously consider if the benefits of integrating things like Facebook Analytics tools outweighs the negatives. It seems like considering their audience they would not use Facebook of all things.
I operate quite a few apps and recently launched a website too, which handles a lot of sensitive user data. I decided to make not having any analytics, trackers, or ads a selling point of my apps and sites. I get a lot of positive emails from customers thanking me for that. I was recently even wondering whether Google penalizes sites in its search results for not putting its analytics on them.
I legit don't understand why a paid storage service would put a FB pixel on their dashboard which handles user files. It's a completely foreign concept to me. This seems like a screw up but also erodes a lot of trust which is unfortunate as I had been looking at them for past 2 months actually.
I even made a post just yesterday, and another a couple of weeks ago, on how Backblaze's inability to set a specific file name, file size limit, and expiry date on pre-signed URLs is preventing some of us from switching from S3 to Backblaze for our app data storage needs. And surprisingly, I wasn't the only one, as a few people responded with the same concern.
> A limitation I ran across when using B2 was that their pre-signed URL generation doesn't allow you to set file-size limits, nor does it allow you to set the file name in the pre-signed URL. It simply gives you a pod URL to upload to. So if you are using B2 as storage for, let's say, image uploads from the browser, a malicious user has the ability to modify the network request with whatever file name or file size they want. Next thing you know, you have 5GB-sized image uploads happening... This pretty much prevents me from using B2 for now.
> I ran into the same limitation! IIRC, there also wasn't a way to expire a signed upload URL sooner than whatever the default was, which was hours or maybe a day. I had the exact use case you mentioned, too - image uploads bypassing my backend server. I didn't want the generation of a signed url to, say, upload a profile photo, give carte blanche to create a hidden image host when combined with the limitation that you highlighted. All sorts of bad things could come of that. I ended up just going back to S3 - costs more, but still worth it.
Since this is for a site/app which lets users upload data, I am really trying to avoid S3 due to crazy costs. I might look into DigitalOcean's offerings. Anyone have any other recommendations?
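For comparison with the B2 limitation quoted above, this is roughly what the S3 side looks like: a presigned POST can pin the object key, cap the size, and expire quickly. A sketch with the AWS SDK v3 (bucket name and expiry are made-up values):

```
import { S3Client } from "@aws-sdk/client-s3";
import { createPresignedPost } from "@aws-sdk/s3-presigned-post";

const s3 = new S3Client({ region: "us-east-1" });

// Short-lived upload grant locked to one object key and a maximum size,
// so the browser can't pick its own filename or upload a 5GB "image".
async function signedAvatarUpload(userId: string) {
  return createPresignedPost(s3, {
    Bucket: "my-app-uploads",                                   // hypothetical
    Key: `avatars/${userId}.jpg`,
    Conditions: [["content-length-range", 0, 5 * 1024 * 1024]], // max 5 MB
    Expires: 60,                                                // seconds
  });
}
```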
> Since this is for a site/app which lets users upload data, I am really trying to avoid S3 due to crazy costs. I might look into DigitalOcean's offerings. Anyone have any other recommendations?
Dumb suggestion: Run it yourself? Minio is easy to use, even in multi-server mode.
That's one of the things I was looking into too actually. Though if I want to go with Minio, I would also want to do that on my own physical servers instead of cloud to completely cut out third parties. Do you have experience in that? I guess I would need a proper dedicated high speed internet for it too for any decent traffic?
Sorry, I've never needed to use it like that (and, full disclosure, I've never run minio in prod either); I can recommend Hetzner dedicated servers because they don't bill for egress but that's it
I get why they might be doing it (audience building), but they should have restricted it to their homepage or some info pages. (Check azernik's response on how to do this correctly!)
It's easy to attack Backblaze.
Before attacking them for this, please make sure the company that you are working for or building doesn't do the same thing. (I know for a fact that a lot of startups make heavy use of audience building.)
> Before attacking them for this, please make sure the company that you are working for or building doesn't do the same thing.
I don't get why these two are related at all. One can do both the second and the first. By their own admission in other HN threads, Backblaze earns several million dollars a year and is proud not to have VC backing. So it doesn't seem like anybody is attacking an underdog who's struggling to change the status quo and needs to be held to lower standards.
Have been a Backblaze customer for many years, mostly because of their hard drive stats posts here on HN. Lost all confidence in Backblaze.
Alternatives? I had been using rsync.net at startups for many years, but it was more expensive (I'm now using Backblaze for storing GBs of raw images from DSLRs).
Did you read anything more than the incredibly poorly worded headline?
The data was harvested by the Facebook pixel as part of their audience-building tech for acquiring new customers. So Facebook is the one doing the selling; Backblaze just happens to have been quite careless here, but they are not "selling your data".
The breach of trust is BB letting third party code on their platform and especially from a particularly untrustworthy third party. That's it, it's egregious and should be taken seriously, that also means discussing it seriously.
This kind of hyperbole is counter-productive, it only makes it easier to ignore your concerns as "crazy overreacting".
Wow they pretty much flushed all user trust and good will down the toilet with this. I had considered using them in the past but I sure won't be considering them now. Even if this were an accident how are there not procedures in place that would have required approval and vetting?
Does anyone have any idea how long this has been going on?
And: it only happens when rendering the filenames to a browser window right? So only when I browse folder y in bucket x are my filenames for that folder shared with FB?
Backblaze, I really enjoyed being a paying customer. Until now. Bunch of dorks.
Edit: goes to show you have to encrypt EVERYTHING at rest, even file names...
The ad hominem stuff isn't classy at all. I agree.
It was a heat of the moment thing. Mostly I'm being angry at invasive tracking being the norm when going down the 'growth hacking' path. I must lack perspective but it saddens me that contextual advertising and focus groups apparently aren't enough.
edit: to make up by adding something actually useful to the discussion... I checked and I don't see any DNS requests being made for any facebook domain when browsing my B2 buckets. Maybe by now they got rid of the tracking pixel?
Great. I finally shook off my lethargy and did some research and thought they were perfect feature and price-wise. Setup a backup script using rclone and did a backup yesterday - worked great too - now this.
Had a look at rsync.net - too pricey for my puny sub-500GB data.
I'm curious to know what emailing you would help with. Is it a lower than published pricing tier plus no 400GB minimum order? Or some special plan based on geographical location? Something else?
FWIW, I agree with you on your previous posts/thoughts and I'm all for pay for use and against flat rate/unlimited plans, as long as it fits in a budget.
This should be trivially and universally fixable everywhere:
Never include third-party marketing scripts on any page where a user is authenticated (a minimal sketch of such a gate is below).
But of course that would deprive many companies of mountains of valuable data so it ain't ever happening, right?
---
Also, am I missing something obvious here? If Backblaze -- or anyone else really -- wants analytics, what do they need the Facebook pixel for? There are so many good analytics services out there.
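The gate mentioned above doesn't have to be fancy; a minimal sketch (the auth check and tag URL are placeholders): don't even inject the third-party script when a session exists.

```
// Hypothetical helper: however your app knows that a user is signed in.
declare function isAuthenticated(): boolean;

// Only inject third-party marketing tags on public, logged-out pages.
function loadMarketingTags(): void {
  if (isAuthenticated()) return; // authenticated pages get no trackers at all

  const s = document.createElement("script");
  s.async = true;
  s.src = "https://example-tag-vendor.invalid/tag.js"; // placeholder URL
  document.head.appendChild(s);
}
```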
I'm also a long term paying customer and I'm leaving for sure.
If they can have FB pixels in the admin area, they basically have no security processes working. If marketing drives their tech decisions, this is not a company to trust your data to.
Not all of the various silos within an evilCorp structure know what the others are doing, if the groups within the same silo even know what is going on.
It's inexcusable, honestly. These are the kind of decisions that not only need department head approval but also CTO approval. If the CTO didn't see it coming, they failed.
HIPAA isn't some random cert you have to satisfy a single big customer. It needs to be a priority and all business decisions have to be made around it. GDPR is another one.
This only happens if you use their web interface, right? If you only interact with your B2 storage via their API, such as through a backup program like Arq, there would be no Facebook involvement?
And they're back. As far as I can tell, they've removed tracking. ublock is no longer blocking anything on that page, and I do not see the cd[buttonFeatures] property in headers.
Imagine if S3 did this. I don't like Amazon, but at least they are security professionals.
Ad tracking pixels in your object store dashboard is just clownshoes from a security engineering standpoint, over and above the fact that it's a slimy, dickhead move for a paid service.
It's possible to manage the chrome browser centrally via g suite, and push out DoH settings. I use this in conjunction with NextDNS to put all browsers organization-wide onto a single blocklist, and block the common tracker hosts (GA, GTM, Facebook's domains, et c).
I know that many computer systems have to be certified or compliant with a spec to be used (HIPAA). Is there a possibility that such data being sent across the wire to a 3rd party would break such compliance?
The thing is, even if it was a dumb mistake, it's one of those mistakes a company like Backblaze can't afford to make.
If they don't pay attention to stuff like this, then why should I trust them with anything at all? This isn't some minor oopsie, this is failing to deliver on their core product[1]:
> Top Backblaze B2 Use Case Solutions
> Backup & Archive
> Store securely to the cloud incl. safeguarding data on VMs, servers, NAS, and computers
Backblaze is not led by dumb people. This was a conscious decision that they made, and they got caught. I used to recommend them to people and I never will again.
A dumb mistake that leaks private information to a third party seems like a good reason not to trust or use their service. For me, the ideologue, the fact that they're sending any tracking data from the dash to Facebook is reason enough.
I know this is a common experience on many sites, but I'm a paying customer on flickr.com (have been since ~2004) and the other day I visited in a new browser or in private mode and got the "we use cookies..." dialog; it lists over 90 advertising companies and probably over 500 tracking options.
Want to see it, go to flickr.com in a private window, it should pop up something about cookies. Pick "Manage Settings". It's insane.
Just an FYI in case this isn't clear to people. Even though this is a serious problem, it's super unlikely that FB is using the filenames in any way, or aware they were being sent. That part is probably just a (really awful) mistake.
This is the same company that has published iOS apps that have silently escaped the sandbox. I think pathological is probably an appropriate word for them and wouldn't trust anything remotely related to them with a single bit of personal data.
Possibly, but company culture transcends the immediate team. And if your business model is based on data siphoned off every possible channel, the concern is very justified.
super unlikely that FB is using the filenames in any way
First, we don't know that. Second, it doesn't matter even if they are not using it in any way. Backblaze shouldn't be sending this data to FB. And lastly, even if FB isn't using it in any way today, how do we know they won't use in future?
Nobody should be having access to any data that they don't absolutely need. It doesn't matter how mundane the data is, which company the data is going to, etc etc.
More so with all the machine learning code being deployed to gather better information about people and thrust more ads on them, it's possible that even those who work at Facebook may not know the extent to which all this data is funneled into various systems and attempts are made to extract monetary value from it.
Like I said, it's definitely a problem regardless.
But this is like saying it that if you misplace your cell phone, it doesn't matter whether you left it in your friend's car or a taxi. Of course it matters whether or not data is being (mis)used.
Facebook, like other GDPR compliant companies, deletes most data after 90 days by default. My guess is that this data was not intentionally stored, and any place where it was will be deleted manually within a few days as part of remediation for this issue. Even if that doesn't happen, it would automatically disappear after 90 days.
> Facebook, like other GDPR compliant companies, ...
Isn't there a lawsuit or similar going through the EU courts atm about Facebook not really being GDPR compliant? eg They claim they are, but the court case is about them not being.
Surely Backblaze needs to do more than just fix the pixel and "investigate". At a minimum:
- provide customers with an exact timeline of when the pixel was introduced so they can work out for themselves what data has been leaked
- report themselves to relevant regulators in each jurisdiction that they operate in for leaking customers' sensitive data – under the EU GDPR there is a time limit for this.
- get on the phone to Facebook and beg them to demonstrably delete the data
Filenames are not the worst possible thing to leak, but they can be sensitive and it's not good enough to just go "oh, oops, we implemented it wrong, we've fixed it now".
I appreciate that their Twitter account is looking into it, but feel like this should be a pretty quick fix. I'm hesitant to even log into my Backblaze account again until it's sorted.
Considering this, does Backblaze regularly undergo audits to make sure there aren't any security and privacy holes that go undetected? I know not all services do this, but Mullvad, at least, have built a nice VPN reputation on account of it.
Might be worth looking into, if nothing else then for public perception reasons.
I'm a long term B2 customer and left FB a long time ago and block trackers.
The major point to me is: If marketing at Backblaze drives decisions that influence security and tech has either no say in this or is not competent enough, it's not a company for me to trust my data with.
I think Backblaze will think two or three times before integrating spyware into their product in the future, so I agree that jumping ship is the wrong answer. Now, if it happens again, then yes. I think jumping ship is the right move.
In a sense, that makes life easier for me. One less alternative to consider. The name "Backblaze" is burned forever. You will never win me as a customer. I write this here because people in similar positions at similar companies might read this. And I do not think I am alone.
Backblaze signs BAAs with companies storing personally identifiable medical information. I wouldn't believe anyone who told me that this Facebook data leak was turned off for those customers; they should immediately be investigated and fined if any such breaches did indeed happen.
Full agreement, any interest and goodwill towards the company is now completely gone.
Unfortunately HIPAA is actually specific in this scenario. I highly doubt file sizes have anything to do with PHI. However, if you can show that they are sending the actual data in the backup, that would be a reportable offense.
Disclaimer: I am not a lawyer, this is not legal advice, YMMV.
Been a huge fan of Backblaze for years and b2 was my plan for server backup; always seemed like a great underdog company in my eyes- they just lost me. I already canceled my Spotify for a similar —Facebook-related— reason.
Probably because Spotify is using the Facebook SDK in all their apps. As a user you cannot disable it and a few years back the Facebook SDK caused an outage where you couldn't start Spotify (and many other major apps).
One possible, slight, softening factor - if it's an accident/bug, I might consider them, if they can explain exactly how it happened and what extreme measures they'll take to ensure that it never happens again. But... their initial answer on that thread is
> Believe that's the Facebook pixel we use for tracking, we've forwarded to our web team for review in case that is not intended behavior.
Just... "in case" it's unintended is not promising.
It's disappointing that a company dealing with sensitive data resorts to third party tracking solutions (and then Facebook of all things!)
Especially a tech shop like backblaze who have engineers building amazing tech in general. But then you cheap out on implementing some basic metrics for the web UI. Do you even need all the bells and whistles Facebook offers?
It is just lazy engineering. They want lookalikes. Fine. Collect, clean, and mask the data in house and send it to FB. Dumping data to a 3rd party directly from the page is irresponsible to say the least, as such "oopsies" are bound to happen.
they can’t do that. they don’t have the original data (it is owned by fb) to create the lookalike audience. methinks you don’t understand how these tracking doohickeys work
FB allows a custom audience to be built from identifiers like email, phone, etc. They claim it is hashed before use so they never see the raw data (not sure if they mean hashed client side or server side though).
This will rely on a user's FB account having the same email as used for BB, which could be unlikely in the case a company is paying for it. But it should work well enough for retail targeting.
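For reference, the hashing in question is (as far as FB's customer-list docs describe it) plain SHA-256 over a normalized identifier, so matching works without shipping the raw address. A small sketch, assuming email and Node's crypto module:

```
import { createHash } from "node:crypto";

// Normalize the way customer-list matching expects (trim + lowercase),
// then SHA-256; two spellings of the same address produce the same hash.
function hashEmailForUpload(email: string): string {
  const normalized = email.trim().toLowerCase();
  return createHash("sha256").update(normalized).digest("hex");
}

// hashEmailForUpload("  Alice@Example.com ") === hashEmailForUpload("alice@example.com")
```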
All they say is that they don't learn anything new about your customers. I would take that statement with a lot of caution. For one, they now know that I am a BB customer. Having a graph of all the companies and products I use is hugely valuable personal information.
> They claim it is hashed before use so they never see the raw data
If they can match hashed data with real data, they can know more than they did before. Depending on what algorithm they use for hashing (there is no mention of it), they could be using a similarity hash, which would tolerate minor differences in the dataset.
Let's say I find a profile by comparing the hash of an email to emails in Facebook's database. I can then compare additional information to see if a customer has provided incorrect information to Facebook as a user. Facebook could check if my address is similar to the one on online shopping sites I use and, if not, flag the account.
Yeah there is a lot of room to hide in the gaps of what they said.
Still, main point is you only need an identifier and none of the other data Facebook has. Pixels are not required for this as noted in the original comment, they probably have enough in the account details already.
> they don’t have the original data (it is owned by fb) to create the lookalike audience.
They have the original data on their own audience. So they can send it explicitly to FB (instead of FB sucking it from the page's pixel, so to speak) and FB builds the lookalike audience from there, using the wider FB-owned data.
The data they hold on their audience is irrelevant here; e.g. BB is not collecting age and gender, and they don't have the web history of the user (which the FB pixel and thumbs icon enable). The lookalike audience is based on the FB profile.
I have a question about creating lookalike audiences by sending data to FB (either separately or through pixel tracking). Is that data not by definition PII, and so are they likely violating the GDPR by doing this?
The data sent by a pixel is not PII; it enables lookup of existing PII. The company putting the pixel on their page isn't collecting PII, and this is outside of the GDPR. The accidental transmission of file names doesn't seem to be PII to me.
I'm guessing Backblaze will come back with a more detailed report on this, but I wonder if the data already given to Facebook can be deleted properly (so that Facebook doesn't have it to use for other purposes elsewhere). Is that even possible to get done for a relatively large company like Backblaze?
Also, isn't this a violation of GDPR in EU, or is there a "you opted in by default when you logged in and so it's not our fault" argument in play here?
Not even sending an advertising id (or anything else) to a Facebook url when I visit a public non-Facebook page is ok. This is unbelievably far from ok.
Yes, but if marketing drives their security decisions, where else do they leak? What data do they sell? If they play loose with your privacy, in what areas do they also play loose with your security? With your login email? Payment data? Do they sell your credit history? How do they vet their employees?
I was just thinking the other day about whether it was worth using rclone to encrypt the file names in the B2 bucket I use for backup, given that it limits you to the rclone tool for accessing files during recovery.
Given Backblaze's good track record, I am assuming this was default FB pixel tracking behaviour. A company like BB should however know better than to use FB tracking on their paid products.
Seems like a poor decision / dumb mistake. Fix it and move on. I can understand the impetus to need FB ads & lookalike audiences to grow the company -- it serves no one if BackBlaze goes bust. It's obviously wrong to send filenames to FB as well.
If they fix this, goodwill will be burned, but it's not a dealbreaker for using them. If they don't fix it, then yes by all means leave the service.
Tag manager is used for managing all of the various tracking and analytics snippets for a site in a central location, so if your ad blocker stops tag manager itself, you often preempt many other things that would run. In this case I’d assume the facebook pixel was installed using tag manager.
They also don't allow you to encrypt data on first setup, so you need to first upload a few files unencrypted before you can set your key. That's why I'm not using them. The basic encryption feature is not well thought out...
Is this just getting press because it’s Facebook? Anyone that works at a company that runs a proxy server to access the Internet should just check the referrer fields leaving the company for requests to sites like Gravatar.
It's getting press because it is siphoning off the page contents which include the names of files you have stored. It is quite a different scenario than just loading an image through an img-tag (which can be controlled through things like Referrer-Policy these days)