notes from an opinionated talk about running IPv6 in production

A few years ago, I was at SCaLE, and attended an excellent talk by someone who operated several campus-wide internetworks, and their hard won experience with IPv6.  They were very opinionated.  I loved it.  Here are some of the notes from that talk:

QoS is a bad word.
Control freaks love QoS.
They can debug it themselves.
People who have are held to SLAs operating production networks have better things to waste their time on, and better ways to crash their switches.

"But I'm not running IPv6!"  That means you actually are, and are nor longer in control of your network.
"I will block IPv6!".  Say goodbye to all the grants that pay your salary.  And everyone's desktops and devices will just make tunnels anyway.

Say NAT one more time, I dare you.

If you think that NAT is protecting you, let me know who you are, so I can blackhole your address range and your IS.

Turning off v4 ICMP is just stupid.
There are lots of stupid people.

You cannot turn off icmp6.
There is no frag in v6.
Thus mtu detect must be on.
Thus icmp6 must be on.
Live with it.

dhcp6 is port 547 not 67


Regarding that article about gender bias in GitHub Pull Requests

Regarding "Gender Bias In Open Source: Pull Request Acceptance Of Women Vs. Men", or even worse, regarding all the uncritical and breathless articles by the BBC, Vice, HuffPost, and so forth:

First of all, anyone who names their project "DeveloperLiberationFront" and uses an icon of a raised fist in woodcut style, has already predeclared their bias away from objective truth.

Second, the authors of the paper exhibit little knowledge of about the large differences in workflow between different projects, and no knowledge about all the different ways that PRs are used and all the different meanings of an "abandoned PR", and also their definition of "project insider" is broken, as for many projects, an "insider" has write access, and may never use PRs at all.

Third, despite GitHub's growing influence, just grabbing tens of thousands of GH PRs is not in the slightest bit representative.

Fourth, their process for computing the gender of PR authors is laughably bad, for reasons that went on for 3 paragraphs before I edited down this text.

Fifth, how many have heard of "p-hacking"? or even have ever actually computed a p value since you took that really annoying stats class in college?  Did you even notice that the this paper both obviously did p-hacking, and then didn't even report the p values?

Finally, allow me to present the following disruption to the breathless and self-reinforcing narrative:

"So, let’s review. A non-peer-reviewed paper shows that women get more requests accepted than men. In one subgroup, unblinding gender gives women a bigger advantage; in another subgroup, unblinding gender gives men a bigger advantage. When gender is unblinded, both men and women do worse; it’s unclear if there are statistically significant differences in this regard. Only one of the study’s subgroups showed lower acceptance for women than men, and the size of the difference was 63% vs. 64%, which may or may not be statistically significant. This may or may not be related to the fact, demonstrated in the study, that women propose bigger and less useful changes on average; no attempt was made to control for this. This tiny amount of discrimination against women seems to be mostly from other women, not from men."
// ScottAlexander

If this was a real paper, submitted for real peer review, a good peer review would be:

"1. Report gender-unblinding results for the entire population before you get into the insiders-vs.-outsiders dichotomy.
2. Give all numbers represented on graphs as actual numbers too.
3. Declare how many different subgroup groupings you tried, and do appropriate Bonferroni corrections.
4. Report the magnitude of the male drop vs. the female drop after gender-unblinding, test if they’re different, and report the test results.
5. Add the part about men being harder on men and vice versa, give numbers, and do significance tests.
6. Try to find an explanation for why both groups’ rates dropped with gender-unblinding. If you can’t, at least say so in the Discussion and propose some possibilities.
7. Fix the way you present “Women’s acceptance rates are 71.8% when they use gender neutral profiles, but drop to 62.5% when their gender is identifiable”, at the very least by adding the comparable numbers about the similar drop for men in the same sentence. Otherwise this will be the heading for every single news article about the study and nobody will acknowledge that the drop for men exists at all. This will happen anyway no matter what you do, but at least it won’t be your fault.
8. If possible, control for your finding that women’s changes are larger and less-needed and see how that affects results. If this sounds complicated, I bet you could find people here who are willing to help you.
9. Please release an anonymized version of the data."
// ScottAlexander

I am willing to bet money that doing real honest academic statistical analysis of their raw data will invalidate their implications and their claims.


Why SSH keys dont have metadata

And other tech rant. It was recently asked, in a forum that I read, the following: "Why is it that SSH public keys don’t have an embedded expiration date, anyway? PKI certificates have them."

My response:

Because as soon as you start adding all sorts of metadata to a key, then everyone will start adding all sorts of metadata to keys, with all sorts of obscure rules about how metadata interact with the environment and various implementations whether a key works or not.

And then the lawyers will show up and insist that you imbed 30 page PDFs of Word docs of someone’s T&Cs and their contracts of adhesion and their “don't hold anyone with money responsible for anything” disclaimers into metadata (you think I joke, I do not at all, this literally regularly happens with “standards based” PKI certs).

And then your keys are going to be huge weirdly encoded binary blobs of shit that you don’t have good tools to manipulate. And you will need to keep special indexes of them, and “bundles” of them, in multiple conflicting filesystem paths and “key stores”.

Part of why SSH took off at all in the first is because it doesn’t have this complex garbage wankery . An SSH public key is a SINGLE LINE, of printable ASCII7. You can edit and clean up your ~/.ssh/authorized_keys file with a textmode text editor.

The lack of metadata in SSH is a feature, not a problem.


This is how to do it, or waving my cane.

1. Design a data abstraction that solves a class of problems.

2. Design a good wire protocol for that abstraction.

3. Better yet, design 2 protocols: one server-to-server and one client-to-server. Federation is the only model that has ever scaled large enough.

4. Implement a simple as possible server. Do not try too hard to make it performant, just very easy to install and very easy to understand. This is the protocol reference implementation.

5. Implement an open source client library, that completely covers the entire data model and the entire wire protocol.

6. Implement another open source client library, in a very different programming language. If this is difficult, you let your knowledge of your favorite language overconstrain the wire protocol. Go back to step 2 and fix it.

7. Implement a command line client on one of those libraries. Again, it must completely cover the entire data model.

8. Implement an ok GUI app.

9. Implement a very high performance highly scalable server. If you are tempted to change the wire protocol to do this, you screwed up.

10. Now, and only now, you can implement a very nice easy to use GUI. At this point, and at this point only, do you bring in any "designers", "UX" people, or anyone who uses Photoshop as working tool.

Of course, for the past 15 years, everyone has been doing this backwards, with disastrous results. It takes huge amounts of wasted CPU and wasted money by the millions and billions to make all the resulting garbage work at all.