BSides SF 2022: Read the Fantastic Manual

This is the blog version of my BSides SF 2022 talk, “Read the Fantastic Manual: Writing Security Docs People Will Actually Read”. I’ll embed video here when it’s available. If you came here from my conference talk: hello! I’m glad you’re here.

Let’s talk documents. Every company needs them, and security in particular needs to be able to effectively educate at scale. We’re going to look at the ins and outs of creating (or reviving) a document repository that explains security needs to the rest of your company, though most of what’s discussed here is useful for anyone who needs to relay information to anyone else. What I’m describing is best for companies of up to perhaps 4,000 people and not so much your very large companies, which often have comms teams for this kind of work. The strategies here help people doing that work too, but as those companies can often afford a content strategist or two, it’s sometimes less critical for a security engineer to put the time into effective doc creation.

How do we figure out how to write?

Before we throw a bunch of work into this, we need a strategy. We need to work to understand what needs to be written and what needs we’re looking to meet with our documentation.

Learn from the present

The people we’re hoping to serve with documentation can give us a good idea of where to start. If it’s safe for you to be back in the office, this is a great time to eavesdrop. If you’re still remote, it’s time to read through the chat and probably a lot of it. If you don’t have a channel called something like #ask-security, a place where it’s clear people can ask questions about security and get answers, it’s time to make one. Not only does this tell people that you want to answer their questions, it also gives you a primary place to see what concepts and processes people are struggling with. What questions come up over and over? What security-adjacent work is making development proceed slowly or unsafely? See what people ask, and you’ll get your first ideas of what documents could benefit the people around you.

We also need to get a sense of what’s going on with the existing docs. In addition to understanding what’s already out there, check the views, opens, or whatever metric your document repository of choice offers. It can be gutting to see that some important doc has been opened all of 13 times (particularly if you wrote it!), but it’s information that we need. We also need to ask people what they think of the existing docs: are they written the right way? Do they cover the right material? What gaps are there? Getting honest feedback can be tricky in some work cultures, but we’ll talk about ways to work with that later.

Learn from the past

We can learn a lot about what to do and what to avoid by talking to the person responsible for existing documentation (though that might, of course, be you). If that person is no longer at the company, try to find someone who knows why that person is no longer there. If there are predictable frustrations, find out about them and figure out some ways to prevent them affecting your work.

Go back through chat histories, old all-hands decks, or other places the past lives and figure out what docs people have historically shared. What’s commonly passed around, and what does it seem like people are unaware of? If something important isn’t mentioned or looked at much, why do you think people might be unaware it exists? Keep at this until you have a good sense of the behaviors of olde and how well they’ve worked.

Pace yourself

Documentation is a marathon, but there are things to keep in mind that can keep you going. For instance, one perk of a nonexistent set of documents (or docs that haven’t gotten the attention they need for a while) is that people will notice when you put some effort into it. A little love applied to something that’s been known to be subpar for a while will be appreciated.

Also reassuring? Perfection is impossible in this work. No, I swear that’s actually awesome. We’re looking for good enough, better, functional, and improved – not perfect. This frees us to aim to do good work without smothering it with an impossible quest.

And while you’re diving into this work and figuring out where your efforts are most needed, try to work on just a couple of questions at a time. We can’t boil nor drink the ocean, and we certainly can’t address all of an engineering org’s or company’s security questions at once. Looking at just a couple questions at a time – questions like “How does this company want to deal with phishing threats?” and “What are some simple ways to boost broad security awareness?” – keeps you from being overburdened and gives you the chance to find interesting connections that being single-minded about this work can obscure.

In short: how do we know what to write?

  • Listen to conversations and read the chat to see what people know and need to know
  • Check stats on existing documents so you don’t duplicate work
  • Ask around to see what’s working with the existing documentation
  • Work to answer two questions at a time as you proceed through this work

What makes a good document?

Bought a triple cat bowl holder so when they're eating the mogs can line up perfe…
Oh never mind.
(Photo of a wooden frame holding three bowls of kibble, followed by a photo of three cats eating from it, all overlapping each other instead of lined up one in front of each bowl.)

It’s a good document when it enables people to carry out the behavior you want. It’s not uncommon for docs to be like this well-intended and very cute cat feeding trough: you thought you were making one thing, but your users thought you made something else. 

How do we build a better document?

First, we need to be thorough. There’s a tendency in some tech documentation to make a lot of unquestioned assumptions about the reader, leaving out vital details, specifics the writer deemed unnecessary, or other key pieces of information that can derail the reader as they go read something else to try to fill in the gaps.

Instead, add whatever you know about the subject. If you know specific commands, file locations, steps, links to background information, or anything else that can help the reader accomplish their goal, just include it. We want this to be a complete resource for finishing the task the doc is meant to illuminate. What if your entire team got food poisoning or, you know, Covid? We’re writing for the survivor. We’re writing to help Sigourney Weaver get to safety in Alien. Do it for Ripley.

Docs should also serve people having a bad day, which is often the case when they’re trying to figure out something difficult that stands between them and finishing their work. Aim to write a friendly document – or, at the very least, not an actively adversarial one. We want the user to believe – to know – that the doc’s author is on their side.

To that end, the doc should speak the user’s language rather than a vocabulary the intended reader doesn’t share. It’s fine and normal to introduce potentially new terms, but we need to explain them. This can mean tooltips, explaining acronyms before relying solely on abbreviations, and linking to explanations.

To best accomplish this, we need to have a specific audience in mind. When you’re writing a doc, complete this sentence before you start writing:

I am writing this doc for $audience to describe $information for $purpose.

Some examples:

I am writing this doc for front-end engineers to describe our preferred form libraries for XSS prevention.

I am writing this doc for everyone to describe common phishing techniques for thwarting recent spear phishing attacks.

I am writing this doc for that executive to describe why we need this tool for upcoming budget planning.

(We all know that executive, don’t we?)

Each of these gives you a sense of the reader’s context, what the content will be, and a definition of success or failure. Once you know this (and I suggest pasting your completed sentence at the top of your nascent doc for guidance as you start to shape it), you’re ready to write. Once you’re finished, ask someone who’s a member of your stated audience to review it to make sure it does what you think it should.

How do we spot a good document?

  • It includes everything we know that a reader might need to know about the subject or process
  • It’s friendly and speaks the reader’s language
  • It has an intended audience, information to be conveyed, and a purpose for existing

How should we write it?

Alas, for you, I have the perpetual security answer:

A vector graphic of a Magic 8-Ball with a blue triangle reading "IT DEPENDS"

There are so many different kinds of documents we may be called to write. Here are a few of them:

  • Process descriptions
  • Reference
  • Architectural decision records
  • Where the bodies are hidden
  • How to recover from an emergency
  • What teams do
  • How to reach people
  • Lunch menus
  • On-call schedules
  • etc. etc. etc. etc.

Your purpose determines what you need to write.

Let’s look more closely at a few common possibilities.

Playbooks

You had a thing happen; here’s what to do. Some examples: you need to record a vulnerability. You need to get PII out of the logs. You need to create a new shard for your database.

These tend to be either frequently referred to or rarely looked at but still extremely critical. List out the steps, be thorough, and expect to refine a playbook a few times, especially if you’re writing for people who are just learning this stuff.

Other process

This can describe what a team does, how to do something essential but less pressing than what’s described in a playbook, or any other how-to.

This also includes process flowcharts, which I will confess I often hate. It’s not the flowcharts’ fault; it’s that I’ve most often had them lobbed at me by teams with a stronger interest in getting me to go away than in helping anyone. If it’s a team’s first offering when you ask for more information, I become immediately leery. They aren’t as accessible as people think, especially if you’re throwing out a rat’s nest of overlapping swim lanes.

A very involved flow chart with a ton of text, including groupings that say "ANSF TACTICAL," "POPULATION CONDITIONS AND BELIEFS," and "POPULAR SUPPORT." It is too tangled to get any real information from. You are not missing anything here.

I googled “terrible flow chart,” and the internet delivered. It’s funny, but it’s also not far off from experiences I’ve had with these from unhelpful teams. We can do better.

If you must use one of these – and yes, there are legitimate reasons for it, though fewer than some people think – ask someone outside of your team to review it before you throw it at someone who depends on it to get their work done. If you’ve only run it by people on your team, there will be opaque parts that need to be fixed before you call it done.

A grab bag of writing advice

Keep it friendly; informal is probably best, because security can be scary.

Remember, your doc may be someone’s best hope on a really bad day of work. We want to be friendly, and being informal where you can can pay dividends. Remember too that lots of people have had bad experiences with people on security teams. Getting to be the friendly helper on a bad day is an incredible opportunity to make the future different from the past.

Explain all abbreviations and acronyms.

Tech runs on acronyms, but we need to anchor them. When you first use an acronym, abbreviation, or other specialized vocabulary, explain it in some way. Write out what the abbreviation represents, and provide a resource to explain other terms the reader may not know yet.

Link generously, both internally and externally.

Linking to other internal docs is a great way to both show the user what other resources are available and make what they need available without them searching for it. Linking externally, to the resources you inevitably come across and rely on when creating a doc, gives the user who wants to learn more resources you know are good without making them search and vet on their own. You already did the work; let someone else benefit.

Make it clear what person is behind a doc; someone will need to know.

Sometimes, documents can seem to exist in a void, but a human always started it. Make it clear what person or team wrote the thing the user is reading. It humanizes things, and it makes it easier for them to reach out to ask for more, request changes, or just say thank you.

Chunk information by complexity with headers to enable skimming.

People read in lots of different ways, and we can support them by sprinkling a little information architecture on. Use headers liberally, bold critical information, and enable skimming wherever you can. Because we can…

Never assume anyone will read the whole doc.

I know, I know. I tend to read entire pieces of documentation. Maybe you do too. Most people, however, do not. Explain things concisely; some people may read just the paragraph they need, and that’s totally ok. In fact…

A doc must work in isolation from any supporting material.

We sometimes need to write lovingly crafted suites of documentation. I love that for us. However, we can’t assume anyone will actually consume all of it. Instead, if you’re detailing a process, assume someone will never follow the internal links you crafted or follow the divine path of information you’ve laid for them. Never make it necessary to read across three or four docs to get someone through a problem if you can help it.

If in doubt: gifs and animal pictures keep people’s attention.

A grumpy tortie cat sitting on carpet, her front legs crossed, looking at the camera as if she thinks you've done something really dumb


It works for me, anyway. If you need to hold people’s attention through something dry, throw in fun things. Surprise can keep people engaged for longer.

Include how to give feedback in each doc: survey, chat, comments, etc.

If people want to ask for more, offer a correction, or compliment you, we want them to be able to find us. Have something at the bottom that tells them where to provide feedback. This is, you see, a gift.

How do we write our docs?

  • Know what kind of doc you’re writing.
  • Remember that your job is to explain AND be approachable.
  • Act like each doc must exist in isolation.
  • Make it skim-friendly for people in a hurry.

Where should our completed docs live?

First, we have to find all the options. There seem to always be at least two official doc repositories; usually one is Google Drive, but not always. It’s our job to figure out which one to use. Both may be in play (or more!), but usually only one is the right answer.

We also have to keep an eye on chat and what gets said there. Chat is not documentation, but it is an information repository. Watch how links are shared, what information is posted, and how people react to it.

When you choose your one true doc repository, figure out how it works. For instance, how are docs marked as drafts vs. completed?

We can learn a lot of this through asking questions and observing. We can also learn by choosing incorrectly and getting corrected (and probably will). This is normal and not a problem, so long as we absorb the information given to us.

We also have to understand how people look for information. Do they hoard and share links, do they roll the dice with search, or do they graze until they find what they need?

As we figure this out, it’s important to remember that there is no perfect solution here. No, seriously, this is actually great. Documentation can be more or less effective, but it’s unavoidably a subjective exercise. We can aim for better, but we’ll never achieve perfection. And if you’re tempted to suggest importing your immaculate documentation storage solution from your previous job, thinking that will surely solve all the problems you’re facing, I invite you to remember the joke about how you now have 24 JavaScript frameworks.

Instead, we’re looking for more like a 75 percent solution to our problem. You know how Cs get degrees? This is that situation, except our C is consistency. Here’s a secret: mostly it doesn’t matter where our documents live, so long as people know where to look. Whatever you do, just stick with it until you have a good reason to change course. Then iterate and communicate what you’re doing to the people who depend on you. No, really, that’s it. Yes, there are some really painful places to put documentation out there, but if we’re dealing with those, we’re probably fine so long as we stay the course.

How do we figure out where to put the docs?

  • Figure out all the places docs can live
  • Figure out where people most often search
  • Figure out how they search
  • Match it when you can, especially at first
  • Stay the course and iterate deliberately and clearly

Circulating our work in the company

I know some people hide in documentation to avoid humans, but I regret to inform you that we need to talk to people so they know our work exists. I promise it can be rewarding, though. Here’s how.

Chat, especially in this more remote era, is one of our best tools. As before, we want to watch for opportunities to provide links to resources we’ve written at just the right time. Now, we’re looking for when questions come up, even if they’re not directly phrased as questions. We also want to make announcements when we’ve created something new.

Meetings, though painful in too great a number, remain a great tool for socializing our work. I like team meetings for this, especially if I’ve written something because of a need surfaced by a particular team. It’s genuinely fun when you know someone’s struggled with something and their life will become easier because of your work.

All-hands and other larger, higher-level meetings are great for this too. Less targeted, they make up for the broader aim with sheer numbers. Better still is if you have a regular segment about new documentation so that people know to expect these things monthly or quarterly.

A word on bribes and incentives

It can be enough to make written resources people need, in a way that works for them, provided in the right medium for the way they glean new information. That’s great. However, if we want to influence behavior, bribes and incentives are very useful.

Stickers

Stickers are a classic for a reason. Why do we tech types go apeshit for these? I don’t fully know. Not everyone who loves them was a Lisa Frank freak as a kid like I was, so I can’t account for it, but it works very well, especially if you do limited-edition ones for an event or other achievement. They’re inexpensive but fun, and people will do things you want of them if you provide stickers. I’ve gotten them from Gibson Print and Sticker Giant. My current company gets nice ones from Printfection.

Badges

People like flair. Give them flair. You may need to make friends with your company’s design department, but I swear this is actually a fun errand.

Rewarding ambassadors

There are lots of ways to do this. Having an organized kudos program where people nominate each other for doing good security work is one way; complimenting someone to their manager is another. Call out good work when you see it.

Coffee

Most offices have free coffee, and a lot of them (especially in the Bay Area) have legitimately good coffee. Despite all this, the $5 Starbucks card still feels special. Maybe it’s the permission to go get something covered in whip, I don’t know. But it works.

The common thread here is that effective incentives don’t need to be expensive. These are not bonuses; those are done by another team and likely above your paygrade. Instead, the goal of this kind of treat is to make a compliment more tangible. We want people to know we see and appreciate their efforts, which makes them more likely to do it the next time they get the opportunity.

Get comfortable repeating yourself

We rarely learn something on the first encounter with it. Instead, the journey to full documentation awareness for any given user is more likely chat + seeing it in the docs repo + a mention in a meeting + a link provided by a teammate at just the right time. This is to be expected; it’s normal and fine.

If you feel uncomfortable repeating yourself (as most non-tedious people do), make a calendar for when you’ll announce things in meetings, chat, and other places. This way, when you feel like you’ve said it too many times, you can look at your schedule and know exactly how many times you’ve said it. This also makes (yes) iterating easier, because you can see at a glance what you’ve done and decide whether more, less, or different would be a good idea.

How do we get our work out there?

  • Promote where the people are: chat, meetings, all-hands.
  • You are going to have to repeat yourself.
  • You are going to have to repeat yourself.
  • Bribe people toward the behavior you want.

How do we know if our efforts are working?

We have to evaluate what we’ve done.

We can ask people via regular check-ins and one-on-ones with people who gave us pointers on where to put our efforts. We can do surveys, though some people have had bad experiences with them. If you keep them short, to the point, and connected to a personal appeal, you’ll likely get some good information. We can look at metrics, going back to the views and opens we discussed earlier.

We can also find a goal person. This can be a persona, but if you can, try to find a real person whose needs you’re looking to meet. Once you’ve made the resources they need, you can find another goal person and start working to make other people’s lives easier.

We also need to review our body of work regularly. I like quarterly for this, but if that sounds punishing, twice a year can work too, so long as you don’t skip it. We need to know what we’ve published, how it’s being used, and what it tells us about the future of our work.

It’s also a great time to review feedback, which ideally we’ve been collecting all along via our feedback call to action in every doc we’ve published. If you haven’t done that already, this is a great time to do it.

A note on soliciting feedback

If we are very lucky, we work at a company that prioritizes psychological safety. However, if you don’t, you may need to do some extra work to get feedback from people. At psychologically unsafe companies, what you know to be a friendly request for an opinion can sound like a trap to people who’ve gotten used to living in these cultures of no. Honest feedback depends on people feeling safe enough to be honest. Alas, if you’re in charge of security documentation, you probably don’t have the stature in the company to make a culture-wide change of this kind.

However, you can make a psychological safety bubble around you and your team. Like the rest of this work, it takes steady effort across time, but it can be done. When you ask for feedback, make it clear that you really want to hear the tough stuff, because you really want to improve things. Emphasize that you’re asking for opinions on a document, not on you or your team. And, when you hear difficult things, take that feedback and use it.

Across time, people will begin to trust you (if they didn’t at first) and be more immediately forthcoming; seeing their needs addressed warms people up, even if they’ve had reason to be withdrawn.

This can be tough to do, especially if you’re not used to it, but this work can’t go without honest reactions and opinions.

What next?

We get to start the next cycle. Ask around, do your research, determine new needs, and keep mapping the future.

How do we measure success?

  • Do surveys, ask people
  • Review regularly
  • Create a psychological safety bubble if wider culture doesn’t support it
  • Start the next cycle!

Keep with this cycle of listen, learn, implement, and iterate, and across time even the worst problems of documentation will begin to lessen. As documentarians, it’s our job to serve our colleagues. It’s tough work, but it scales better than almost anything we can do, and it’s available even when we’re not there to answer questions. I find it to always be worth the work, and I hope you think so too.

PancakesCon 3: CLI Server Surveillance and Writing Your First Novel

This is the blog component of my PancakesCon 3 talk on 16 January 2022. Video to come.

My monthly writing newsletter lives here.

Part one: CLI Server Surveillance

This part of the talk concentrates on CLI tools that enable OSINT: Open Source INTelligence gathering. This is when you use publicly available information to learn more about a site, company, network, or other entity with the goal of getting a more complete picture than the individually released pieces of information were probably intended to create. The commands and how I’ll suggest using them are all legal to use, but I’ve noted certain applications here and there that cross that line. If in doubt, remember that typically one person using one command at a time is fine, but scripting something that hammers a target, particularly without permission, is a bad idea for lots of reasons.

I primarily work on a Mac, so the commands below are ones I’ve tested there. Usually (but not always!) the commands and flags are the same on Linux. Try them out and learn some stuff about your system. If all else fails, add the -h flag to the command. You’ll be fine. I’ve noted Windows equivalents wherever possible, but the specific flags and placement of parameters will be yours to discover.

You have to be specific to be terrific, so let’s establish what we want to learn about a website or server with these tools. We’ll be looking to answer these:

  • What’s this server doing?
  • What’s this server running?
  • What ports are open?
  • Who owns this server? 
  • Who is the domain registered with?
  • What nameservers hold the records for this domain?

Why do we want to know these things about a server? I’ve most often done this work in the early stages of researching a target for my job – with the specific, written permission of the owners of the servers. In my last job, I sometimes did timeboxed pen testing for potential vendors, and I would start out in a big-picture way: what can I figure out about how this site runs from the outside? Without setting off alarms for the folks who own the servers, and without being disruptive or possibly breaking the law, what can I learn?

The tools I’m about to describe are either available on a fresh OS install or are so standard that the steps to install are well documented. I had to send a few texts to my boyfriend asking him to run commands on his normal-person Mac, as I couldn’t remember if I’d installed some of these myself or if they came standard. There’s a blog post version of this talk, and it’ll include links to installation docs if needed. 

The tools

ping

ping answers questions like whether the server is up and what the server’s IP is, based on providing the domain. It uses ICMP – internet control message protocol. It’s a protocol like HTTP, HTTPS, and FTP (plus a bunch of others), although it’s part of the internet layer – those other three are application layer.

Traceroute, described below, uses ICMP too, and it’s a common protocol for diagnostic stuff. It’s an IPv4 tool, though there’s an IPv6 version too: ping6. It’s used for diagnostics because you get an echo reply if the host you’re pointing it toward is there. Implementations of ping vary a little: payload size, TTL, time to wait for a response. The echo reply is sometimes called “pong.” :)

My most common use of ping is when I realize my internet connection is being a little funky, and I want a quick sense of how broken stuff might be. It’s a nice quick diagnostic tool when you’re trying to figure out where to start with troubleshooting. Is it zero percent packet loss or not? Beyond that, it’s one way to get an IP from a URL or quietly check if a server is live. It’s also possible to change the origin of the ping (though that rules out getting a response, of course).

Ping is a lovely, ordinary tool that’s been put to a couple of weird, bad uses (which you should not try at home). A ping flood is an avalanche of ping requests that don’t wait for a response, with the goal of overwhelming the target and creating a denial of service. Another nefarious use of innocent old ping is the (wait for it) PING OF DEATH. Instead of a ton of individual ping requests, ONE GIANT PING would be sent, because older systems would crash (or experience buffer overflow) if you sent a packet bigger than 65,535 bytes (the maximum size of a correctly formed IPv4 packet). This is mostly a thing of the past – firewalls and other mitigations make these generally a nonissue now. However, it was exploited in 2013 against a version of Windows! It’s patched now, but 2013!

Some sample commands

ping $url or $ip

ping -a $url or $ip (dings with each response, because you need that)

Stop after five pings: ping -c 5 $url

Intervals between: ping -i $seconds $url

It’s the same on Windows: cmd > ping.
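If you want the commands above wrapped into a quick yes/no check – say, at the top of a troubleshooting session – a small script does nicely. This is a minimal sketch, assuming a Unix-ish ping that exits zero when it gets at least one reply; the target here is loopback just so it runs anywhere, so swap in the host you actually care about:

```shell
# Quick reachability check: one ping, and we read the exit status rather than
# parsing output. ping exits 0 if it got at least one echo reply.
target="127.0.0.1"   # stand-in target; replace with the host or IP you care about
if ping -c 1 "$target" >/dev/null 2>&1; then
  reachable=yes
else
  reachable=no
fi
echo "$target reachable: $reachable"
```

Reading the exit status is sturdier than grepping for “0% packet loss,” since ping’s output wording differs a bit between macOS and Linux.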

curl

curl – short for “client URL” – debuted in 1996, so it’s a more recent arrival. (Ping was born in December 1983.) It can answer questions like:

  • What’s this server serving?
  • What server software is it running, and what version?
  • What HTTP response code does it return to various requests?

I reach for it when I either want to see something from a server but not in a browser, or I want to see information from the server that typically doesn’t show in a browser.

Some sample commands

Plain old curl $url if I’m curious what’s behind phishing URLs in texts

curl -I $url for headers (case sensitive!!!), which is the same as curl --head $url

Add -v for a little more detail about certs, TLS handshakes, IP addresses, ports, and more nuts and bolts of what’s going on

And yes, it works in Windows too: cmd > curl

When I read through the headers, I’m looking for indications of server software, and most especially for outdated versions of that server software.
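To practice that header-reading workflow without poking anyone else’s server, you can stand one up yourself. A self-contained sketch, assuming python3 and curl are installed (the port number is arbitrary):

```shell
# Stand up a throwaway local server, then read its headers with curl -I,
# the same move you'd use against a real host.
python3 -m http.server 8080 >/dev/null 2>&1 &
server_pid=$!
sleep 1   # give the server a beat to start listening
# -s silences the progress bar; -I fetches headers only.
server_header=$(curl -sI http://127.0.0.1:8080/ | grep -i '^server:')
echo "$server_header"
kill "$server_pid" 2>/dev/null
```

The Server header here happily announces its software and version, which is exactly the kind of thing you’re scanning for in the wild.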

nmap 

nmap is short for network mapper. You might have to install this one, but the instructions are clear, or: brew install nmap. nmap can answer questions like:

  • What ports are open that I could possibly connect to?
  • What software is running? (With a little extra research)

nmap does lots of different kinds of scans. You can go deep on this one. It can also be very noisy if you’re trying to scan things quietly (but legally!). Hitting all the ports can draw attention you don’t want, so be strategic. It uses ping but can also use TCP, UDP, and other options as you do your recon.

nmap is one of those tools where you can use five percent of it and still do a ton. There are amazing depths to be found – it rewards putting time into it. We’re mostly focusing on port scanning here, though, since we’re looking at other people’s servers, not investigating networks we’re already in.

Some sample commands

nmap -p# $url or $ip (can be -p22 to see if SSH is available, or -p0-1023 for reserved ports, or a different selection)

nmap -F $url or $ip for faster, fewer ports than default scan

And yes, it’s available on Windows too.


nmap has a ton of resources available, including a guide to port scanning basics and one of many, many broader guides.
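If nmap isn’t installed where you’re working, you can still see what a port scan is at heart: try to connect to each port in a list and note which ones answer. This sketch probes loopback only (the port list is arbitrary, and 8090 is opened on purpose so at least one shows as open); scanning hosts you don’t own requires permission:

```shell
# A miniature stand-in for nmap's port scanning: attempt a TCP connection to
# each port in a list and report which ones accept.
python3 -m http.server 8090 >/dev/null 2>&1 &
server_pid=$!
sleep 1
scan_output=$(
python3 - <<'EOF'
import socket

for port in (22, 80, 8090):      # arbitrary sample of ports to probe
    s = socket.socket()
    s.settimeout(0.5)
    state = "open" if s.connect_ex(("127.0.0.1", port)) == 0 else "closed"
    print(f"port {port}: {state}")
    s.close()
EOF
)
echo "$scan_output"
kill "$server_pid" 2>/dev/null
```

Real nmap does vastly more (service detection, timing control, different scan types), but this is the core loop underneath it all.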

netcat

netcat is another networking tool. For our purposes, it can answer questions like:

  • Is this port open?
  • How do I write a GET or other request?

netcat is generally good for networking things; it can be a quick way to see if a port is open. Beyond that, it’s a neat tool to put you right in the middle of how requests work when you send them to a server. Connect to a server and then send a get request for /index.html. You can get some visibility into networks with it, and also it’s just fun to mess with. What can you GET? 

Some sample commands

netcat -vz $url $port to see if the port is open (just a little quicker than nmap sometimes)

  • -z for zero I/O (so just scanning)
  • -v or -vv for verbosity, of course
  • -T for telnet!
  • -t for TCP mode

netcat -vt google.com 80 (verbose; tcp) and then GET /index.html

Try ncat for Windows.
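If you want to see what that -z style check is doing under the hood, here’s a minimal sketch using Python’s stdlib socket module. The function name and timeout default are my own choices, not anything netcat defines – it just attempts a TCP connection the same way:

```python
import socket

def is_port_open(host, port, timeout=2.0):
    """Attempt a TCP connection, roughly like `netcat -vz host port`.

    Returns True if the connection succeeds, False otherwise.
    """
    try:
        # create_connection does the DNS lookup and TCP handshake for us
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

A refused connection comes back almost instantly; a filtered port usually means waiting out the full timeout, which is why scanners let you tune it.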

traceroute (or tracert on Windows)

traceroute can answer questions like:

  • What equipment sits between me and this server?
  • Who hosts this site?
  • What does a request’s journey across the internet look like?

If you’re also a networking nerd, traceroute is really interesting. It’s another diagnostic program in the tradition of ping. It shows possible routes between your computer and the server you’re trying to reach, plus transit delays across a network. Ping only does round-trip times with success or failure; traceroute shows you everything in between. It’s a fun way to get to know some networking stuff: the private IPs for your home devices, what it looks like when your request makes its way out of your ISP’s network onto the open internet, and when the request finally resolves.

The route is probably going to be at least a little different every time, so run it multiple times to see a little more of the world. You’ll probably get some output that just has asterisks. This can mean timeouts, firewalls that prevent data being sent back to you, or equipment configured to reject ICMP packets (which are sometimes not prioritized).

Generally I’ve used this without flags and just watched the output. It was a favorite tool in the networking class I took at the Bradfield School of Computer Science. The various flags let you control TCP vs. ICMP for your requests, the number of probe packets per hop (defaulting to three), wait times, ports, TTLs, and stuff like that. 

Some sample commands

traceroute $url or $ip gets you a lot of what you’ll want here. Tinker with flags if you’re curious, though; you’ll learn something cool (even if it’s “don’t do that again.”)

It defaults to IPv4, but you can specify either with the -4 or -6 flag.

whois

whois can answer questions like:

  • What are the (possible) nameservers for this domain?
  • Who owns this block of IPs (maybe)?
  • Who’s the (likely) registrar for this domain?

Note the qualifiers. I’ll explain those in a minute.

whois works with both URLs and IPs as parameters. IPs are extra fun, because you might get details about a CIDR range, which is a block of IPs designated using a 0.0.0.0/0 syntax. Check out cidr.xyz to learn more about what that means.  
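If you want to get a feel for CIDR notation without leaving your terminal, Python’s stdlib ipaddress module is handy for playing with it locally. This is just an illustration of the notation, not something whois itself does:

```python
import ipaddress

# 8.8.8.0/24 means: the first 24 bits are fixed, the last 8 can vary
net = ipaddress.ip_network("8.8.8.0/24")

print(net.num_addresses)       # 256 addresses in a /24
print(net.network_address)     # 8.8.8.0
print(net.broadcast_address)   # 8.8.8.255

# Membership checks work too: is this IP inside the block?
print(ipaddress.ip_address("8.8.8.8") in net)  # True
```

Handy when a whois result hands you a block like 8.8.8.0/24 and you want to know exactly what it covers.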

For URLs, you get nameserver details and other stuff IANA (the Internet Assigned Numbers Authority) stores. It can begin to tell you who owns a domain or IP, where their information is stored, where the site is hosted, where the domains are registered, and other technical details. There may be contact information in here, but most often it’s business details these days.

whois gets its information from a whois server – think a database of “we expect things to look like this,” which is (as we know) not the same as “things look like this in practice.” Sometimes it’s useful to see what things were once like or were expected to be like. In OSINT, additional information, even if it’s no longer accurate, can still provide helpful detail.

Some sample commands

whois $url

whois 8.8.8.8 (or any IP)

The whois command is the same on Windows.

dig

dig is short for domain information groper. It can answer questions like:

  • What’s lurking in the DNS records for this domain?
  • What do the domain’s actual nameservers have to say about all this right now?

dig provides DNS records for a domain, which can include some fun, weird stuff, especially in the TXT records. The information comes from the domain’s nameservers – the same data used to resolve a domain typed into a browser’s address bar. DNS is the authoritative source here. If in doubt, go with dig.

Some sample commands

dig $url

dig +trace $url to see how it does the search 

dig google.com TXT (or MX or CNAME or any other type of record) to poke around some more. TXT records are my favorite because they’ve been used for so many weird things: SPF (Sender Policy Framework) records to reduce spam and fraud via email, but also verification records for marketing software and other services that confirm domain ownership that way.

If there are many IPs that resolve for a domain, it’ll show you four.

dig is available on Windows if you install BIND. You can also try nslookup.

host does something similar but only returns a little information: IP if you provide the domain, domain name pointer if you give it an IP. If you’re using it in a script and want to just use a little bit of the output, maybe processing with awk, this can be simpler and more predictable to process.
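For a scriptable version of that simple lookup, Python’s stdlib can do the same A-record resolution. One caveat: socket.getaddrinfo asks the system resolver (the same path a browser takes), whereas dig queries nameservers directly, and the stdlib can’t fetch TXT or MX records at all. The function name here is my own; to keep the example runnable offline it resolves localhost:

```python
import socket

def resolve(hostname):
    """Resolve a hostname to its unique IPv4 addresses via the system resolver."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP. Deduplicate and sort for stable output.
    return sorted({info[4][0] for info in infos})

print(resolve("localhost"))  # typically ['127.0.0.1']
```

For anything beyond A/AAAA records, reach for dig itself or a library like dnspython.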

A few other useful tools for your OSINT 

Wappalyzer offers Chrome and Firefox extensions that tell you the tech stack of sites you visit.

Company press releases, articles about the company, and interviews with employees can give you insight too.

Look the company up on GitHub and see what’s public.

Check LinkedIn to get an idea of staffing (especially in the security teams).

Never underestimate viewing the page source (use this at your own risk if you live in Missouri, though).

And clever people have written other libraries and projects that bundle these tools, like this osint Python project.

These are fun and useful ways to learn about tools and computers, get a sense of how networks work, and learn things about a target. Individually, they’re powerful and useful. If you combine these things into scripts that give pretty output that matches your goal and how your brain works, you can do some amazing stuff.

Right! Onto part two of this talk.

Writing your first novel

Our goals here are to learn:

  • How to complete a first draft without feeling terrible all the time
  • What a first draft is and our working definition of a novel

A first draft is the first attempt at telling a long-form story. It is not a final draft; it’s material to be used when you revise later. Your early goals are to get from A to B, figure out some of who your characters are, get a sense of what’s important to you in this story, and give yourself something to work with in revisions. Nothing more.

Most likely, though, you’ll manage more than that as you go along. You might find that you’ve created whole people, for instance. 

When I say “a novel,” I’m referring to a work of fiction that’s likely between 70k and 100k words (though epic fantasy can go longer). It can (and likely will, at some point) incorporate something from your life, if only feelings, but overall, it’s full of stuff you’ve made up.

The great thing about a first draft is that it is, by definition, not a final draft. When the first draft is done, the arrangement of all the islands in your little imagination archipelago is probably not final, but they’re there, and you’ve written the whole thing down, for your definition of “the whole thing.”

Revisions come later and are their own very large subject. Today, I’m trying to get you from “I’ve written zero novels” to “I feel like I could totally write a first-draft novel.”

Because I bet you can if you want to.

Why write a novel at all?

For some of us, it’s really fun! Maybe you have a character/scene/place in your head that you want to put into words, maybe in support of a goal. Or perhaps you have an experience you want to write about but want to fictionalize it a little or otherwise just don’t want to write a memoir.

Or maybe you’re me, and harmless escapism is really useful sometimes.

What you need to get started

You need to want to do this. This sounds obvious, but a lot of writing advice can get unhelpfully abstract about this point in the process. But seriously, the single most important thing (and the same advice I give people considering getting into tech) is that you have to like this. You can figure out the rest later, but if you like the idea of having written a novel better than the reality, you’re going to have a much harder time.

You need a seed of an idea. A scene is enough. A character you want to follow around and create adventures for is enough. You just need a place to start tugging the thread and trust that the rest can follow. 

How should you write, practically speaking?

Any of these work:

  • Longhand, if you want
  • Google Drive (my choice)
  • A notes app on your phone
  • Git
  • Microsoft Word (seriously; it’s pretty standard among professional writing/publishing types too)
  • Scrivener

Anything that fits how your brain works and lets you sit and write with minimal friction is the right tool. It should mostly invisibly support how you do things.

A little more on first drafts

I know, I know, but the thing is that a lot of lofty “how to write a novel” advice doesn’t emphasize the special and rather wonderful qualities of writing a first draft.

A first draft is:

  • A way of getting from point A to point B in one of many possible ways
  • A great chance to delight yourself
  • A place to push boundaries of the story, because logic and characterization aren’t set yet
  • Probably going to have repetition in it
  • An inelegant first route that can still be super fun to write

A first draft is not:

  • A chance at perfection (that comes maybe later but probably never)
  • Your only chance
  • Required to contain everything that needs to be there (I just changed the name of a major character almost two years after she first entered my head because I realized something didn’t work)
  • Required to be written linearly – it’s totally ok to bounce around, leave placeholders, write yourself notes, etc. I sometimes put BREANNE in places so that I can cmd-F and find the things my past self wanted me to put some time into. 

A first draft is woven cloth, not a finished dress. It’s clay, not a sculpture. Go wild.

Planning

Writers and their ways of planning fall across a spectrum. At one end, we have plotters; at the other, pantsers.

Plotters like to know everywhere the story is going before they start (still usually leaving room for change along the way). This is perfectly valid; lots of plotters start with an outline and then write.

Pantsers write by the seat of their pants. They know some things when they start; they know more when they’re done.

Some people (including me) are in between. I tend to write out the parts I know (this character, this scene, this setting), get about to the halfway point of a draft, and then create an outline to start better organizing the stuff I’m coming up with.

See where you feel comfortable, mostly. Play with both ends of the spectrum. Expect your process to evolve if you choose to keep up with this. 

Mindfulness

Honestly, I think this is probably the most important skill for finishing the first draft of your first novel, more than beautiful sentences and perfect plot twists. You need to be able to keep putting words down with the knowledge that you’re not perfectly nailing it today. But nailing it is not today’s goal; that’s your future self’s problem when you revise. Unlike a lot of “good luck, future self” things, this is a healthy one. There is drafting you and revising you. Today, drafting you is in charge. Revising you will use different skills and priorities.

Our goal, with endless thanks to Anne Lamott, is a shitty first draft. I say this joyfully.

Here’s the great news: your first draft doesn’t have to be good, and odds are it won’t be. We’re not setting records for the marathon, we’re crossing the line in our own time, content enough not to have collapsed on the way.

While making yourself believe your story, you also need to make yourself believe that this will get better – but the thing is, it really will! You can remind yourself that I promised you the work will get better in time, and I mean it: if you choose to keep going with this, it just will.

You’ll also start to learn things about your own writing. Everyone – yes, everyone – has patterns in their writing that, left unchecked, would be undesirable in a final draft. Luckily for us, we’re only writing a first draft, and we don’t have to fix these things! You can ignore them entirely, or you can choose to passively witness them with the goal of addressing them on another day. My characters love to gaze at things pensively before finally saying something out loud. They can do this as much as they wish in a first draft.

The same is true of themes, which are the higher-level “abouts” of a story. A story might be about two people who meet at work and end up on a road trip from San Francisco to Montana, but the theme might be discovery of the true self, the compromises we make in love, or the American dream. Themes are a concern of the revising you in the future. You have the option of totally ignoring it in a first draft. If you want, you can watch and get a sense of what some likely themes are… and then put them away until the time is right.

Remind yourself: perfection is not required this day. Only word count and progress.

How to keep going on this long solo journey

Have a place and a time to do this thing. Yes, routine again. But it helps to not have to do the work of where and when anew every time. 

Bribe yourself. Maybe you go on a walk afterward or text a friend. I don’t love using food for stuff like this, because we don’t need to earn food. But put an incentive on the other side, and routine will come together faster.

Practice accountability, even if it’s just posting about what you’re doing on Twitter. More on this in a minute.

Practice being very nice to yourself. When I write a note about a to-do, I often write “please.” Writing can be punishing. Be soft where you can. 

Celebrate victories. Victories can take lots of forms. Embrace most of them, even and especially the small ones. 

How to know you’re doing actual work

The simplest way to do this is to track your word or page count. This can be very official (I have a spreadsheet) or just by taking a look at things (shortcut in Google Docs: cmd-shift-c). However, this can also be emotionally unhelpful for some of us, so don’t feel like this is how you need to do this. If you suspect this kind of rigorous measurement will psych you out, skip it!

Instead, consider tracking your butt-in-seat-hands-on-keyboard time. And this doesn’t have to mean writing down your minutes. It can just mean that you always sit down to write for a half hour after work. You may write, you may not, but you created the conditions where writing can take place. That can be victory too.

Related: consider finding a routine. Feel happy when you stick to it. I like London Writers’ Hour, Australia edition, which is roughly at lunch in my time zone. On those days, I get to know exactly when I’ll write with no thinking or life rearranging. Awesome.

Set goals for yourself if you want, but keep them gentle at first. Ambition can come later. People don’t start out wanting to climb a mountain; first, they usually climb an indoor wall of low difficulty, and then they work their way up. Writing’s the same. 

Accountability

If you want to do this, figure out if you do best with accountability to yourself or someone else.

I do a mix of bullet journaling and spreadsheets to keep on top of my life and progress on most things in it. It means a lot to me to check certain boxes every day, and I really dislike missing one. So internal accountability works well for me.

But you may do better with external accountability. In that case, you can try posting on social media when you’ve done your work for the day, texting a friend, or reaching out to writing community. Speaking of…

Community and why it’s useful

Writing’s a lonely journey. It’s nice to have parts of it be less so.

I’ve found community largely through Slacks: Clarion West’s during their summer writing event, dedicated channels in various tech Slacks I’m in. But I’ve also done things on Twitter, and there are sub-Reddits that are super active: r/writing, r/KeepWriting, r/writers, and others dedicated to specific genres are great places to start. When in-person events are going on, local readings, writing classes, and conferences are really useful for this. And writing classes still exist these days – there are tons being held over Zoom, so you can commune remotely and still meet people. Classes held across multiple days are great for this.

Beyond that, keep going, and you’ll get to a point where you need beta readers, which are the first people who read something you’re working on once it gets to a place where it’s ready for someone else. (For a lot of people this is the second, third, or fourth draft.) It’s a great idea to make friends with people whose work is in a similar place to yours; you can learn together, and quite possibly (if you have ambitions for your work) help each other succeed.

And then: just one damned thing after another

It’s useful to have an idea of where to start and where you’re going (or I feel reassured by that, anyway). But you don’t have to. Write the thing you’re excited by first. Then figure out the next thing you’re excited by and keep going. This gives you momentum, but it’s also good data. If you’re excited to write it, probably someone will be excited to read it. Conversely, if you find yourself putting off writing something over and over again or it feels like a drag, that may well shine through to the reader too. Write the boring stuff too, but note where the electricity is.

Troubleshooting

All writers are different, but many of us experience the same problems.

“I keep rereading what I’ve written and trying to optimize it.”

Write yourself a note about what you want to accomplish the next time you sit down. Then, if you must, only reread the last couple of paragraphs before you start typing. Minimize the reorientation you have to do, and you’ll be less likely to get lost in the work.

“I hate what I’ve written lately.”

Totally ok. We all do sometimes. The more you do this, the more you’ll cringe when rereading old stuff – you’re growing, so this is actually a great sign. It’s great to have more than one thing to write going at any given time, so your ENTIRE WRITING SUCCESS doesn’t depend on solving the one and only problem before you right now. Alternatively: walk away for a while. It’s totally normal and good to put things down and not pick them back up for a bit. A lot of times, answers creep in at the quiet moments. My brain likes to give me surprise answers when I take walks and when I’m falling asleep.

“I don’t know what happens at the end.”

Very normal. Meander with your characters more. Dive into the middle. Explore your world, figure out colors and textures and smells. The end, like all the rest, is a work in progress. Things can change wildly between drafts, and that’s totally valid. I also like writing monologues from characters if I feel like I don’t understand their motivations or goals. Usually they’ll reveal something to me if I just give them space to talk. 

“I don’t like this novel business after all.”

Congratulations, my friend. You get to choose another hobby if this really just feels bad for you. And cheers to you for trying something new!

Takeaways

Thanks for staying with me through two very different subjects! Here’s what I hope you carry away with you.

The command line can do so much powerful, cool stuff, and you can do awesome things with the information it can give you.

If you want to write a first draft novel, you absolutely can.

As with so many things: butt in seat, hands on keyboard. You can do it.

Thank you <3

Diana Initiative 2021: The System Call Is Coming from Inside the House

Like ghosts, security vulnerabilities are the result of us going about our lives, doing the best we can, and then experiencing things going awry just because that’s usually what happens. We plan systems, we build software, we try to create the best teams we can, and yet there will always be echoes in the world from our suboptimal choices. They can come from lots of places: one team member’s work not being double checked when a large decision is at stake, a committee losing the thread, or – worst of all – absolutely nothing out of the ordinary. Alas, it’s only true: security issues are a natural side effect of creating and using technology.

In this post (and the talk it accompanies; video to come when it’s up), we’re going to talk about the energy signatures, cryptids, orbs, poltergeists, strange sounds, code smells, and other things that go bump in the night. That’s right: we’re talking about security horror stories.

Before we get started, here’s a little disclaimer. Everything I’m about to talk about is something I’ve encountered, but I’ve put a light veil of fiction on everything. It’s just professionalism and good manners, a dash of fear of NDAs, and a little superstition too. The universe likes to punish arrogance, so I’m not pretending I’m immune to any of this. No one is – it’s why appsec exists!

Now: let’s get to the haunts.

Automatic Updates

I’m starting with one that’s probably fresh in everyone’s minds, and I’m mentioning it first because it’s such a dramatic betrayal. You think you’re doing the right thing. You’re doing the thing your security team TOLD you to do by keeping your software up to date. And then suddenly, you hear a familiar name in the news, and you have an incident on your hands. 

a white man in a blue shirt who has been stabbed by his own wooden stake. A dialogue bubble reads, "Mr. Pointy, no!"

This one is like getting stabbed with your own stake when you’re fending off a vampire; I thought we had a good thing going!

A drawing of a newspaper that says "OH NO" and then "YIKES YIKES YIKES"

Ideally, you’ll learn about it via responsible disclosure from the company. Realistically, it might be CNN. Such is the way of supply chain attacks.

You invite it in by doing the right thing. Using components, tools, and libraries with known vulnerabilities is in the most current version of the OWASP top ten for a reason, so there’s endless incentive to keep your stuff up to date. The problem comes from the tricky balance of updates vs. stability: the old ops vs. security argument.

A couple of jobs ago, we were suddenly having trouble after our hourly server update. Ansible was behaving itself, but a couple of our server types weren’t working right. After an intense afternoon, some of my coworkers narrowed it down to a dependency that had been updated after a long time with no new releases. It wasn’t even what you’d call a load-bearing library, but it was important enough that there’d been a breaking change. The problem was solved by pinning the requirement to an earlier version and reverting the update.

My own sad little contribution toward this not being the state of things forever was to make a ticket in our deep, frozen, iceboxy backlog saying to revisit it in six months. I was an SRE then, but I was leaning toward security, and I was a little perturbed by the resolution – though I couldn’t have suggested a better solution for keeping business going that day.

(It did get fixed later, by the way. My coworkers reached out to tell me, which was very kind of them.)

This has become one of those stories that stays with me when I review code or a web UI and find that an old version of server software or a dependency has a known issue. Even if it doesn’t have a documented vulnerability, not throwing the parking brake on until it’s updated to the most recent major version feels like doing badly by our future selves, even if it’s what makes sense that day. 

The best way to fix this is to pay attention to changes, check hashes on new packages, and generally pay close attention to what’s flowing into your environment. It isn’t easy, but it’s what we’ve got.

Chrome extensions and other free software

The horrors that lurk in free software are akin to the risks of bringing a Ouija board you found on the street into the house.

Behold: my favorite piece of art I made for this talk/post.

a Ouija board sits on top of a pile of trash bags in a brown puddle

You can identify it by listening for you or someone around you saying, “Look at this cool free thing! Thanks, Captain Howdy, I feel so much more efficient now!” Question everything that’s free, especially if it executes on your own computer. That’s how we invite it in: sometimes it’s really hard to resist something useful and free, even if you know better.

A particular problem with Chrome extensions is that you can’t publish them without automatic updates enabled, which means someone has a direct line to change the code in your browser for as long as you keep the extension installed.

Captain Howdy Efficiency Extension with picture of Regan from The Exorcist

Last fall, The Great Suspender, a popular extension for suspending tabs and reducing Chrome’s memory usage, was taken over by malicious maintainers. As a result, people who had, once upon a time, done their due diligence were still sometimes unpleasantly surprised.

That’s the tough thing about evaluating some of these: you have to consider both the current risks (permissions used, things lurking in code) and the possible future risks. What happens if this program installed on 500 computers with access to your company’s shared drive goes rogue? It makes it difficult to be something other than that stereotype of security, the endless purveyors of NO. But a small risk across a big enough attack surface ends up being a much larger risk.

In short: it’s the Facebook principle, where if you get a service for free, you might consider the possibility that you’re the product. Security isn’t necessarily handled as carefully as it should be for the product rather than paying customers. Pick your conveniences carefully and prune them now and then. (Or get fancy, if you control it within your company, and consider an allowlist model rather than a blocklist. Make conveniences prove themselves before you bring them in the house.)

unsafe-eval and unsafe-inline in CSP

watercolor of a brown book that says "Super Safe Spells, gonna be fine!"

Our next monster: an overly permissive content security policy. I’d compare it to a particular episode of Buffy the Vampire Slayer or any movie that uses that trope of people reading an old book out loud without understanding what they’re saying.

Fortunately, a content security policy is a lot easier to read than inspecting every old leather book someone might drag into your house.

watercolor of a brown book that says "script-src 'unsafe-eval'" and "Good CSP Ideas"

We invite it in because sometimes, the risky choice just seems easier. You just need to run a little code from a CDN that you need to be able to update easily. You know.

For fending it off, I would gently urge you not to put anything in your CSP that literally contains the word “unsafe.” I honestly understand that there are workarounds that can make sense in the moment, when you’re dealing with a tricky problem, when you just need a little flexibility to make the thing work.

In that case, I urge you to follow my quarantine problem-solving process, for moments where you need to figure something out or finish a task, but the brain won’t cooperate.

  1. Have you eaten recently? (If not, fix that.)
  2. Does this just feel impossible right now? Set a timer, work on it for a bit, then stop.
  3. Can you not think around this problem usefully? Write out your questions.

I would suggest starting with “why do I think unsafe-eval is the best option right now, and what might I google to persuade myself otherwise?”

Sometimes you want to keep things flexible: “I’ll just allow scripts from this wide-open CDN URL wildcard pattern.” But what can this enable? What if the CDN gets compromised? Use partners you trust, sure, but it’s also a good idea to have a failsafe in place, rather than having something in place that allows any old code from a certain subdomain.
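As a sketch of the difference (the domain here is a placeholder, not a real CDN), compare a wide-open policy with a tighter one:

```http
# Risky: any script on any subdomain of this CDN can run, and so can eval
Content-Security-Policy: script-src 'self' *.example-cdn.com 'unsafe-eval'

# Tighter: pin to the exact host and path you use; better still, use
# nonces or hashes so only scripts you vouch for will execute
Content-Security-Policy: script-src 'self' https://cdn.example-cdn.com/libs/
```

The tighter version still lets you update that library freely; it just stops every other script on the CDN from riding along.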

Look, it’s hard to make things work. You have to be a glutton for punishment to do this work, even if you love it. (And I usually do.) But you can’t just say yes to everything because of future-proofing or whatever feels like a good reason today, influenced by your last month or so of work pains. You can’t do it. I’m telling you, as an internet professional, not to do it. Because your work might go through my team, and I will say NOT TODAY, INTERNET SATAN, and then you’re back at square one anyway.

Stacktraces that tell on you

“The demon is a liar. He will lie to confuse us; but he will also mix lies with the truth to attack us.”

William Peter Blatty, eminent appsec engineer and author of The Exorcist

Logs and stacktraces contain the truth. Right? They’re there to collect context and provide you information to make things better. Logs: they’re great! Except…

Exception in thread "oh_no" java.lang.RuntimeException: AHHHHHH!!!
    at com.theentirecompany.module.MyProject.superProprietaryMethod(MyActualLivelihood.java:50)
    at com.theentirecompany.module.MyProject.catOutOfBagMethod(MyActualLivelihood.java:34)
    at com.theentirecompany.module.MyProject.underNDAMethod(MyActualLivelihood.java:27)
    at com.theentirecompany.module.MyProject.sensitiveSecretMethod(MyActualLivelihood.java:11)
    at com.theentirecompany.module.MyProject.oh_no(MyActualLivelihood.java:6)

…except for when they get out of their cage. Or when they contain information that can be used to hurt you or your users.

We invite it in by adding values to debug statements that don’t need to be there. Or maybe by writing endpoints so that errors spill big old stacktraces that tell on you. Maybe you space out and leave debugging mode on once you’ve deployed. Or you just haven’t read up on OWASP’s cheat sheet on security misconfiguration.

How to fend it off: conjure a good culturally shared sense of safe log construction and remember what shouldn’t be in logs or errors:

  • Secrets
  • PII
  • PHI
  • Anything that could hurt your users
  • Anything that could get you sued.

Make a commit hook that checks for debug statements of any kind.
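One cheap backstop is scrubbing known-sensitive keys before a record ever hits the log. A minimal sketch – the key list and function name are mine, and a real deployment would want something sturdier (ideally an allowlist of fields you know are safe):

```python
# Denylist of field names we never want to see in log output (illustrative)
SENSITIVE_KEYS = {"password", "ssn", "api_key", "token", "email"}

def scrub(record):
    """Return a copy of a log record dict with sensitive values masked."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in record.items()
    }

print(scrub({"user_id": 42, "password": "hunter2"}))
# {'user_id': 42, 'password': '[REDACTED]'}
```

This doesn’t replace keeping secrets out of debug statements in the first place; it’s a second layer for when someone inevitably forgets.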

Secrets in code

“To speak the name is to control the thing.”

Ursula K. Le Guin

The monster I’d compare this to is when the fae (or a wizard, depending on your taste in fantasy) has your true name and can control you.

API_KEY = "86fbd8bf-deadbeef-ae69-01b26ddb4b22"

How to identify it: read your code. Is there something in there you wouldn’t want any internet random to see? Yank it! Use less-sensitive identifiers like internal account numbers if you need to, just not the thing that lets someone pretend to be you or one of your users if they have it.

You know you’re summoning this one if you find yourself saying, “Oh, it’s a private repo, it’ll never matter.” Or maybe, “It’s a key that’s not a big deal, it’s fine if it’s out there.”

We fend this one off with safe secret storage and not depending on GitHub as a critical layer of security.

We all want to do it. Sometimes, it’s way easier than environment variables or figuring out a legit secret storage system. Who wants to spend time caring for and feeding a Vault cluster? Come on, it’s just on our servers, it’s not a big deal. It’s only deployed within our private network.

It is a big deal. It is my job to tell people not to do this. Today I will tell you for free: don’t do this!

I’ve argued with people about this one sometimes. It surprises me when it happens, but the thing is that people really just want to get their job done, which can mean the temptation of doing things in the way that initially seems simpler. As a security professional, it becomes your job to help them understand that saving time now can create so many time-consuming problems later.

It’s great to have a commit check that looks for secrets, but even better is never ever doing that. Best of all is both!
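The simplest step up from a hardcoded key is pulling it from the environment and failing loudly when it’s missing. A sketch, where the variable name and function are hypothetical:

```python
import os

def get_api_key(var="MY_SERVICE_API_KEY"):
    """Fetch a secret from the environment instead of committing it to code."""
    key = os.environ.get(var)
    if not key:
        # Fail at startup, not at 3am when the first request needs the key
        raise RuntimeError(f"{var} is not set; refusing to start without it")
    return key
```

Environment variables aren’t perfect either, but they keep secrets out of version control; graduate to a real secrets manager when you can.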

A single layer of input validation

Our next monster: a single layer of input validation for your web UI. I quote Zombieland: always double-tap.

many snaky hydra heads baring pointy teeth

Our comparison for this one is anything tenacious. Let’s say zombies, vampires, hydras. Anything that travels in a pack. And we identify it by siccing Burp Suite on it. Maybe the form doesn’t let you put a li’l <script> tag in, but the HTTP request might be only too happy to oblige. We invite it in by getting a little complacent.

The best way to fend it off is to remember that regular users aren’t your real threat (and shoddier validation is probably just irritating people whose names don’t meet its definition of “normal,” which can do some awful racist things). There’s a reason injection, especially SQL injection, is a perennial member of the OWASP top ten.

Say it with me: client-side and server-side validation and sanitization! And then add appropriate encoding too, depending on what you’re doing with this input.

Most people do only interact with your server via your front end, and bless them. But the world is filled with jerks like me, both professional jerks and people who are jerks for fun, and we will bypass your front end to send requests directly to your server. Never take only one measure when you can take two.
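A minimal sketch of that doubled-up defense in Python (the allowlist pattern and field names are illustrative only, not a recommendation for your product; real rules depend on your users and locales):

```python
import html
import re

# Illustrative allowlist for a username field. Real rules need care:
# overly narrow "normal" definitions exclude real people's names.
USERNAME_RE = re.compile(r"^[A-Za-z0-9_.-]{1,32}$")

def validate_username(value):
    """Server-side check that runs even when client-side validation is bypassed."""
    if not USERNAME_RE.fullmatch(value):
        raise ValueError("invalid username")
    return value

def render_comment(comment):
    """Encode user input before putting it in HTML, so <script> stays inert text."""
    return "<p>{}</p>".format(html.escape(comment))
```

The point isn’t these particular rules; it’s that the server enforces its own rules and encodes its own output, no matter what the front end promised.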

Your big-mouthed server

curl -I big-mouthed-server.com

How to identify this one? curl -I thyself

We invite this one in by not changing defaults, which aren’t always helpful. This also relates to security misconfiguration from the OWASP top ten. The early internet sometimes defaulted too much toward trust (read The Cuckoo’s Egg by Cliff Stoll for a great demonstration of the practices and perils of this mindset), and we can still see this in defaults for some software configs.

Protect yourself from this one by changing your server defaults so your server isn’t saying anything you wouldn’t want to broadcast. The primary concern here is about telling someone you’re using a vulnerable version of your server software, so keep your stuff updated and muffle your server.
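Checking for the leak can be partially automated. A toy sketch, assuming you’ve already collected response headers into a dict (header names are the common offenders, not an exhaustive list):

```python
import re

# A Server header with a version number tells attackers exactly what to look up.
VERSION_RE = re.compile(r"\d+\.\d+")

def leaky_headers(headers):
    """Flag response headers that advertise software versions."""
    leaks = []
    for name in ("Server", "X-Powered-By"):
        value = headers.get(name, "")
        if VERSION_RE.search(value):
            leaks.append((name, value))
    return leaks
```

Run it against your own `curl -I` output and see what your server is volunteering.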

Let’s go X-Files about it: trust no one, including your computers.

ESPECIALLY your computers.

Don’t trust computers! They just puke your details out everywhere if someone sends them a special combination of a few characters.

In conclusion, sometimes I wish I had been born in the neolithic.

Laissez-faire auth (or yolosec, if you prefer)

Our next monster: authentication that accepts any old key and authorization that assumes that, if you’re logged in, you’re good to go. Its horror movie counterpart is that member of your zombie-hunting group that cares more about being a chill dude than being alive.

You can identify it by doing a little token swapping. Can you complete operations for one user using another’s token? The monster is in your house.

Here we are again at the OWASP top ten: broken authentication is a common and very serious issue. The problem compounds because we need to do better than “Are you allowed in?” We also need to ask, “Are you allowed to do THIS operation?”

We invite this one in by assuming no one will check to see if this is an issue or by assuming the only people who interact with your software have a user mindset. Or, sometimes worst of all, just not giving your devs enough time and resources to do this right.

We can fend it off by using a trusted authentication solution (OAuth is popular for a reason) and ensuring there are permissions checks, especially on state-changing operations – ones that go beyond “if you’re not an admin, you don’t have certain links in your view of things.”

I’ve seen a fair amount of “any token will do” stuff: tokens that don’t invalidate on logout, tokens that work if you fudge a few characters, things like that. It’s like giving a red paper square as a movie ticket: anyone can come in anytime. Our systems and users need us to do better.
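To make the “are you allowed to do THIS operation?” point concrete, here’s a toy sketch with an invented session store; the names are illustrative, and a real system would use a proper auth library rather than dicts:

```python
# Toy stand-ins for real token validation and a real data store.
SESSIONS = {"token-alice": "alice", "token-bob": "bob"}
DOCUMENT_OWNERS = {"doc-1": "alice"}

def delete_document(token, doc_id):
    """Authentication AND authorization: a valid login alone is not enough."""
    user = SESSIONS.get(token)
    if user is None:
        raise PermissionError("not logged in")      # authentication
    if DOCUMENT_OWNERS.get(doc_id) != user:
        raise PermissionError("not your document")  # authorization
    del DOCUMENT_OWNERS[doc_id]
    return True
```

The second check is the one that so often goes missing: Bob has a perfectly valid token, and he still shouldn’t be able to delete Alice’s document.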

The elder gods of technology

Ah, yes, the technology that will never leave us. Think ancient WordPress, old versions of Windows running on an untold number of servers, a certain outdated version of Gunicorn, and other software with published CVEs.

“That is not dead which can eternal lie.”

Which monster would I compare this to? Well, I’m not going to say his name, am I?

You know you’re at risk of summoning it when you hear someone say “legacy software,” though newer projects are NOT immune to this. Saying “we’re really focusing our budget on new features right now” is another great way to find… you know… on your doorstep.

We can fend it off by allocating budget to review and update dependencies and fix problems when they’re found. And those problems should be sought out on a regular schedule.
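That regular seeking-out can be as unglamorous as comparing your pinned versions against an advisory feed. Real setups use tools like pip-audit or Dependabot, but the core idea is just this (the advisory list here is hypothetical, for illustration):

```python
# Hypothetical advisory list; in practice this comes from a vulnerability database.
KNOWN_BAD = {
    ("gunicorn", "19.3.0"),
    ("django", "1.11.0"),
}

def audit(dependencies):
    """Return the pinned dependencies that appear in the advisory list."""
    return sorted(
        (name, version)
        for name, version in dependencies.items()
        if (name, version) in KNOWN_BAD
    )
```

Schedule it, surface the results somewhere people actually look, and the elder gods have a much harder time eternal-lying in your stack.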

No tool, framework, or language is bad. They all have potential issues that need to be considered when you choose and use them. For instance, there are still valid reasons to use XML, and PHP is a legitimate language that just needs to be lovingly tended.

Yep, we’re back at the risk of using components with known vulnerabilities from the OWASP top ten. Some versions have known problems, and some tools have common issues if you don’t work around them. It’s not on them; it’s on you and how you use and tend your tools.

The real incantation to keep this one away is understanding why we keep these things around. No engineering team keeps their resident technical debt nightmare because they like it. They do it because it’s worked so far, and rewriting things is expensive and finicky, particularly if outside entities depend on your software or services.

I’ve never been on an engineering team that didn’t have lots of other things to do rather than address the thing that mostly hasn’t given them problems… even if none of the engineers without vivid memories of the 90s understand it very well.

Sometimes the risk of breaking changes is scarier than whatever might be waiting down the road, once your old dependencies cause problems… or someone on the outside realizes what your house of cards is built on.

Javascript :D

Speaking of no tool, framework, or language being bad: let’s talk about Javascript. Specifically Javascript used incorrectly.

Our horror comparison? It’s a little 12 Monkeys, a little Cabin Fever. Anything where they realize the contagion is in the water or has gone airborne. “Oh god, it’s everywhere, it’s unavoidable, it’s a… global… pandemic…” Oh no.

The way to summon this one is simple. Say it with me:

internet

Anything… internet. It’ll be there, whether you think you invited it or not.

To fend it off? Just… be very careful, please.

I’m not going to make fun of Javascript. Once I was a child and enjoyed childish things, like making fun of Javascript. Ok, that was like three years ago, but now I am an adult and have quit my childish ways. In fact, I made myself do it somewhere between when I realized that mocking other people’s tech is not actually an interesting contribution to a conversation and when I became an appsec engineer, so roughly between 2016 and the end of 2019.

The internet and thus life and commerce as we know it runs largely on Javascript and its various frameworks. It’s just there! And there have been lots of leaps forward to make it less nightmarish, thanks to the really hard work of a lot of people, but still, things like doctor’s appointments and vaccination slots and other giant matters of safety and wellbeing hang out on top of the peculiarities of a programming language that was created in ten days.

Unfortunately, new dragons will just continue to appear, because that’s the side effect of making new things: new features, new problems. The great thing for appsec that’s an unfortunate thing about humanity is that we make the same mistakes over and over, because if something worked once, we want to think we have it sorted. And unfortunately, the internet and technology insist on continuing to evolve.

How to fix it? Invest in appsec.

Sorry. It’s a great and important goal to develop security-conscious software engineers and to talk security as early as possible in the development process. But there’s a reason writers don’t proofread their work – or shouldn’t. Writing and reviewing the work are two different jobs. It’s why healthy places don’t let people approve their own PRs. If we created it, we can’t see it objectively. And most of us become sharper when honed by other perspectives and ideas.

The most permissive of permissions

And finally, the monster that’s nearest and dearest to my heart: overly permissive roles, most specifically AWS permissions. I’d compare this to sleeping unprotected in the woods when you know very well they’re full of threats.

You can identify it by checking out your various IAM role permissions. And yes, this is totally a broken authentication issue too, a la OWASP.
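Part of that checking can be automated. A hedged sketch that flags the classic wide-open statement (the dict structure mirrors AWS policy JSON, but this is a toy first pass, not a real scanner):

```python
def overly_permissive(policy):
    """Flag statements granting every action on every resource (the classic '*')."""
    findings = []
    for statement in policy.get("Statement", []):
        actions = statement.get("Action", [])
        resources = statement.get("Resource", [])
        # AWS allows a bare string or a list in both fields; normalize to lists.
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        if "*" in actions and "*" in resources:
            findings.append(statement)
    return findings
```

It won’t catch subtler over-grants (wildcarded services, overly broad resource ARNs), but it finds the sleeping-unprotected-in-the-woods cases.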

The best way I know to invite this one in is by starting out as a smaller shop and then moving from a team of, say, three people who maybe even all work in the same room, doing everything, to thirty or fifty or more people, who have greater specialization… yet your permissions never got the memo.

And the best way I know to fend it off is to make more refined roles as early as you can. I know, it’s hard! It isn’t a change that enables a feature or earns more money, so it’s easy to ignore. Worst of all, as you refine the roles, reduce access, and iterate, you’re probably going to piss off a bunch of people as you work to get it right. IAM: the monster that doesn’t need a clever analogy because getting it right sucks so bad on its own sometimes!

It’s also the monster that lurks around basically every corner, because IAM is ubiquitous. So it’s everywhere, and it’s endlessly difficult: awesome! Alas, we have to put the work in, because messing it up is the most efficient way I know to undermine so much other carefully done security work.

AWS permissions that allow access to all resources and actions

And yet I feel warmly toward it, because IAM was my gateway into security work. It’s the thing I tend to quietly recommend to the security-aspiring, because most people don’t seem to like doing it very much, yet the work always needs to be done. Just showing up and being like, “IAM? I’d love to!” is a highly distinctive professional posture. Get your crucifix ready and have some incantations at hand, and you’ll never run out of things to do. It’s never not useful. Sorry, it’s just like the rest of tech: if you’re willing to do the grunt stuff and get good at it, you’ll probably have work for as long as you want it.

Whew. That’s a lot of things that go bump in the night.

Let’s end with some positive things. I swear there are some. And if a security engineer tells you that there are still some beautiful, sparkly, pure things in the world, that’s probably worth listening to.

Strategies for vulnerability ghost hunting

Don’t be the security jerk

Be easy to work with. This doesn’t mean being in a perpetually good mood – I am NOT, and former coworkers will attest to this if you ask – but it means reliably not killing the messenger.

People outside of your security team – including and especially people who aren’t engineers – are your best resource for knowing what’s actually going on. Here’s the thing: if you’re a security engineer, people don’t act normally around you anymore. You probably don’t get to witness the full spectrum of people’s natural behavior. Unfortunately, it just comes with the territory. And it means we have to rely on people telling us the truth when they see something concerning.

Even people who are cooperative with security will hedge a little sometimes when dealing with us. I know this because I’ve done it. But you’ll reduce this problem if everyone knows that talking to security will be an easy, gracious thing where they’re thanked at the end. Make awards for people who are willing to bring you problems instead of creating a culture of fear and covering them up! Make people feel like they’re your ally and a valued resource, because they are!

Be an effective communicator

Being able to communicate in writing is so important in this work. Whether it’s vulnerability reports, responsible disclosure, blog posts warning others about haunted terrain, or corresponding with people affected by security poltergeists, being able to write clearly for a variety of audiences is one of our best tools.

If you think you’re losing your grip on how regular people talk about this stuff, bribe your best blunt, nontechnical friend to listen to you explain things. Then have that blunt friend tell you when what you said didn’t make a goddamn lick of sense… then revise and explain again. Do this until you’re able to explain things in plain language accessible to your users and spur them to action using motivations that make sense to them.

Now let’s leave this haunted house together and greet the coming dawn.

I hope you encounter none of these terrifying creatures and phenomena that I described, that your trails from server to server in our increasingly connected world are paved with good intentions and mortared together with only the best outcomes. But should you wander into dark woods full of glowing red eyes and skittering sounds belonging to creatures just out of sight… I hope you are better equipped to recognize and banish them than you were earlier. Thank you for reading.

Day of Shecurity 2021: Intro to Command Line Awesomeness!

Recently, I hugged a water buffalo calf. It was approximately as exciting as giving this talk is, which is to say: VERY.

Updated: now with video!

First, the important stuff: here’s the repo with a big old markdown file of commands and how to use them. It also includes my talk slides, which duplicate the markdown (just with a prettier theme, in the way of these things).

Second: I went to the Lookout Security Bootcamp in 2017, one of my first forays into security things (after some WISP events in San Francisco and DEF CON in 2016). That’s where I conceived of the idea of this talk. There was a session where we used Trufflehog and other command line tools, and we concluded with a tiny CTF. Two of the three flags involved being able to parse files (with grep, among other things), and I was familiar with that from my ops work, so I won two of the three gift cards up for grabs. I used the money to buy, as I remember, cat litter, the Golang book, and a pair of sparkly Doc Martens I’d seen on a cool woman in the office that day. I still wear the hell out of the boots, including at DEF CON, and I still refer to them as my security boots.

I spent the rest of that session teaching the rad women around me the stuff I knew that let me win those challenges. This had two important effects on me. The first was that I thought, “Wait, it might be that I have something to offer security after all.” The second was that I wanted to do a session someday and teach these exact skills.

I went to Day of Shecurity in 2018 and 2019 too. It’s a fabulous event. At the last one, just a handful of months before we all went and hid in our houses for more than a year, I went to a great session on AWS security (a favorite subject of mine) by Emily Gladstone Cole. And I thought: oh, there it is. I’m ready. I told my friends that day that I wanted to come back to DoS as a presenter. And I pitched the session, it got accepted, and after a fairly dreamless year, one of mine came true.

So if you’re reading this: hello! These events really do change lives. The things you do really can bring you closer to what you want. And, as I like to say in lots of my talks, there is a place for you here if you want to be here. We need you. Keep trying.

I wrote about my own journey into security here. Feel free to ask questions, if you have them! I love talking about this, and I would like to help you get to where you want to go.

You Can Put WHAT in DNS TXT records?! A blog post for !!con West 2020

Why use a phone book metaphor to explain DNS when you can get even more dated and weird? (This will make more sense when I link the video, but in the meantime, enjoy.)

It Is Known that DNS contains multitudes. Beyond its ten official record types are myriad RFC-described and wildly off-brand uses of them, and DNS TXT records contain the most possibility of weird and creative use. I quote Corey Quinn:

“I mean for God’s sake, my favorite database is Route 53, so I have other problems that I have to work through.”

(From this interview.)

He’s also described it as the only database that meets its SLAs 100 percent of the time. (Route 53 is AWS’s DNS product, encompassing TXT records and many other things, if you have not had the pleasure.)

What is this mystery that is woven through the internet? Let me introduce you to (or reacquaint you with, if you’ve met) the DNS TXT record.

DNS and its ten beautiful children

There are ten kinds of DNS records, each of which will include a reference to a specific domain or subdomain, which usually exists to enable access to that domain’s server or otherwise help with business connected to that domain (email settings, for instance).

The one you might’ve seen or made the most is an A record, or an address mapping record. This is the one that matches a hostname to an IP address – IPv4, in this case. AAAA does the same for IPv6. There are CNAMEs, or canonical name records, which alias a hostname to another hostname, often used for things like $marketingproject.business.com, when the marketing project site is being hosted on Heroku or somewhere other than your company’s primary business. You can read about them all here. This post, however, and its accompanying talk (link to come) are about my favorite of them all: TXT records.

TXT records are used for important, fairly official things, but it’s only by agreed-upon practice. While you’ll see very consistent formatting in them for things like SPF, DKIM, DMARC, or domain ownership verification (often in the form of a long random string value for a key that likely starts with _g), the truth is that you can put almost anything in there. My favorite off-brand but still computer-related one I heard about was a large university that put lat/long information in each server’s TXT records, for the sake of finding it more efficiently on a sprawling campus.

For the records’ contents, there are a few restrictions:

  • You cannot exceed 255 characters per string
  • You can include multiple strings in a single TXT record, but they must be enclosed in straight quotes and separated by commas. These can be concatenated into necessarily longer records, like DKIM with longer keys or very elaborate SPF records
  • All characters must come from the printable subset of the 128-character ASCII set (no emoji allowed, and no curly quotes or apostrophes either)
  • At least on AWS, you can’t exceed 512 bytes per record, whether it’s a single string or several
  • They are not returned in the order they were added (which made the Emily Dickinson poem I added as three records come out a little funny in my terminal; it still kind of worked, though)
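Those constraints are easy to sanity-check before you hand a string to your DNS provider. A quick sketch (provider rules vary, so treat this as a first pass, not gospel):

```python
def check_txt_string(value):
    """Check one TXT string against the constraints above (provider rules vary)."""
    problems = []
    if len(value) > 255:
        problems.append("over 255 characters")
    # Printable ASCII spans code points 32 through 126.
    if not all(32 <= ord(ch) <= 126 for ch in value):
        problems.append("contains non-printable-ASCII characters")
    return problems
```

An empty list means the string should be safe to submit; anything else tells you what to fix first.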

I cribbed that together from a mix of (provider-accented) experimentation and anecdotal information from others who have pondered this stuff. The official docs are often a little hand-wavy on this level of detail (appropriately, I’d say). RFC 1035, for instance, states: “The semantics of the text depends on the domain where it is found.” For its RDATA packet, it offers this:

3.3.14. TXT RDATA format

    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    /                   TXT-DATA                    /
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

where:

TXT-DATA        One or more <character-string>s.

I mean, fair. (Data goes in data place; I cannot argue with this.) Future compatibility and planning for unexpected needs to come are a part of every RFC I’ve dug into. Meanwhile, RFC 1464 more directly portends some of the weirdness that’s possible, while also explaining the most common format of TXT records I’ve seen:

        host.widgets.com   IN   TXT   "printer=lpr5"
        sam.widgets.com    IN   TXT   "favorite drink=orange juice"

   The general syntax is:

        <owner> <class> <ttl> TXT "<attribute name>=<attribute value>"

I am accustomed, when dealing with infrastructure tags, to having the key-value format be required, either through a web UI that has two associated fields to complete or through a CLI tool that is very prepared to tell you when you’re doing it wrong.

I have not found this to be the case with TXT records. Whether you’re in a web UI, a CLI, or Terraform, you can just put anything – no keys or values required. Like many standards, it’s actually an optional format that’s just become normal. But you can do what you want, really.

And there are peculiarities. When doing my own DNS TXT poking for this presentation and post, I worked with Dreamhost and AWS, and they acted very differently. AWS wanted only one TXT record per domain and subdomain (so you could have one on example.com and another on wow.example.com), while Dreamhost let me make dozens – but it made DNS propagation really sluggish, sometimes getting records I’d deleted an hour ago, even after clearing the cache. (Dreamhost, meanwhile, has a hardcoded four-minute TTL for its DNS records, which you have to talk to an administrator to change, specially, on your account. It’s always interesting in DNS.) In short, the system is not prepared for that much creativity.

Too bad for the system, though. :)

DNS, ARPANET, hostnames, and the internet as we know it™️

DNS did not always exist in its current scalable, decentralized state, of course. Prior to around 1984, the connection of hostname to IP was done in a file called hosts.txt, which was maintained by the Stanford Research Institute for the ARPANET membership. The oldest example I found online is from 1974, and you can see other examples here and chart the growth of the protointernet. It went from a physical directory to more machine-readable formats, telling you and/or your computer how to reach the host identified as, say, EGLIN or HAWAII-ALOHA. These were static, updated and replaced as needed, and distributed weeklyish.

hosts.txt began its saunter out of our lives when DNS was described in 1983 and implemented in 1984, allowing the internet to scale more gracefully and its users to avoid the risk of stale files. Instead, independent queries began flying around a decentralized infrastructure, with local caches, recursive resolvers, root servers that pointed to top-level domain servers, and nameservers that kept the up-to-date IP addresses and other details for the domains in their care. (You can find a less breezy, more detailed description of this technology you used today here.)

The joys and sorrows of successful adoption

The great and terrible thing about DNS is that so many things rely on it. So if DNS is having a bad day (a much-used DNS server is down, for instance), it can interrupt a lot of processes.

That means, though, that it can also be used to do all sorts of interesting stuff. For instance, a DNS amplification attack involves sending out a ton of DNS queries from lots of sources and spoofing the source address in the packets so they all appear to come from one place, so the responses all go to one targeted server, possibly taking it down. 

TXT records figure into some of this weirdness. Let’s get into some of the interesting backflips folks have done with DNS and its most flexible of record types.

Shenanigans, various

DNS tunneling

This is, so far as I’ve been able to tell, the OG of DNS weirdness. It’s been going on for about 20 years and was first officially described at Black Hat in 2004 by Dan Kaminsky (who stays busy finding weird shit in DNS; if you like this stuff, you’ll enjoy digging into his work).

There are a few different ways to do this, but the central part is always smuggling some sort of information in a DNS query packet that isn’t supposed to be there. 

DNS packets are often not monitored in the same way as regular web traffic (but my Google ads, in the wake of researching this piece, will tell you that there are plenty of companies out there who’d love to help you with that). The permissiveness of DNS query packet movement makes a great vector for exfiltrating data or getting malware into places it would otherwise be hard to reach.

Data is sometimes smuggled via nonexistent subdomains in the URLs the packet seems to be querying for (c2VyaW91cyBldmlsIGJ1c2luZXNzIGlzIGFmb290IGhlcmUgb2ggbm8.evil.com, for instance), but if your packet is designed to return, say, a nice chunk of TXT records? You can really stuff some information or code in there. DNS queries: they smuggle stuff AND evade lots of firewalls. Awesome!
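To make the smuggling concrete, here’s a toy sketch of the chunking step: payload bytes become base32 labels, each under DNS’s 63-character label limit. The domain is made up, and this is for recognizing the pattern in your own DNS logs, not a working tunnel:

```python
import base64

MAX_LABEL = 63  # DNS caps each label at 63 characters

def to_dns_labels(data, domain):
    """Chunk bytes into base32 DNS labels, the way tunneling tools smuggle payloads."""
    # base32 keeps everything in DNS-safe characters; padding is stripped.
    encoded = base64.b32encode(data).decode().rstrip("=").lower()
    labels = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    return ".".join(labels + [domain])
```

If your resolver logs show long, high-entropy hostnames shaped like this, it’s worth a closer look.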

Thwarting internet censorship

The more common DNS association with censorship is avoiding government DNS poisoning by manually setting a DNS server to 8.8.8.8. This isn’t a perfect solution and is getting less useful as more sophisticated technology is put to monitoring and controlling tech we all rely on. However, there are other ways, like David Leadbeater’s 2008 project, which put truncated Wikipedia articles in TXT records. The records aren’t live anymore, but there are so many possible uses for this! Mayhem, genuine helpfulness… why not both?

DNSFS OMG OMG

Ben Cox, a British programmer, found DNS resolvers that were open to the internet and used TXT records to cache a blog post on servers all around the world. He used 250-character base64 strings of about 187 bytes each to accomplish this, and he worked out that the caches would be viable for at least a day.

I love all of this stuff, but this is probably my favorite off-brand TXT use. I honestly screamed in my apartment when I saw his animated capture of the sending and reassembly of the blog post cache proof of concept. 

Contributing to the shenanigans corpus

So naturally, I wanted in on this. One does not spend weeks reading about DNS TXT record peculiarities without wanting to play too. And naturally, as a creator of occasionally inspirational zines, I wanted to leave a little trove of glitter and encouragement in an unlikely part of the internet that was designed for no such thing.

Pick a number between 1 and 50. (It was going to be 0 and 49, but Dreamhost will allow 00 as a subdomain but not 0. Go figure!) Use that number as the subdomain of maybethiscould.work. And look up the TXT record. For example:

dig txt 3.maybethiscould.work

will yield

3.maybethiscould.work. 14400 IN TXT "Tilt your head (or the thing) and look at it 90 or 180 degrees off true."

Do a dig txt on only maybethiscould.work, and you’ll see them all, if DNS doesn’t choke on this. My own results have varied. If you like spoilers, or not having to run dig 50 times to see the whole of one thing you’re curious about, you can also see the entire list here.

In the meantime, now you know a little bit more about a thread of the internet that you’ve been close to and benefited from for some time. And next time: always dig the TXT record too. Start with dns.google if you want a strong beginning.

/etc/services Is Made of Ports (and People!)

Hospital switchboard, 1970s
I’ve been using a switchboard/phone extension metaphor to explain ports to my non-engineer friends who have been asking about my talk progress. Feels fitting to start the blog version of it with this switchboard professional.

You can view the video version of this talk here.

I learned about /etc/services in my first year of engineering. I was working with another engineer to figure out why a Jenkins master wasn’t connecting to a worker. Peering through the logs and netstat output, my coworker spied that a service was already listening on port 8080. “That’s Jenkins,” he said.

“But how did you know that?”

“Oh, /etc/services,” he replied. “It has all the service-port pairings for stuff like this.”

Jenkins is not, in fact, in /etc/services, but http-alt is listed at port 8080. The more immediately relevant answer was probably “through experience, because I’ve seen this before, o junior engineer,” but his broader answer got me curious. I spent some time that afternoon scrolling through the 13,000-plus-line file, interested in the ports but especially curious about the signatures attached to so many of them: commented lines with names, email addresses, and sometimes dates, attribution for a need as yet unknown to me.

I got on with the business of learning my job, but /etc/services stayed in my head as a mystery of one of my favorite kinds: everyone was using it, but many well-qualified engineers of my acquaintance had only partial information about it. They knew what used it, or they at least knew that it existed, but not where it came from. The names in particular seemed to surprise folks, when I asked colleagues for their knowledge about this as I was doing this research.

This post, a longer counterpart to my !!con West talk on the same subject, digs into a process and a file that was once commonplace knowledge for a certain kind of back-end and network engineer and has fallen out of more regular use and interaction.  I’ll take you through some familiar services, faces old and new, correspondence with contributors, and how you – yes, you – can make your mark in the /etc/services file.

What is it, where does it live

In *nix systems, including Mac OS, /etc/services lives exactly where you think it does. Windows also has a version of this file, which lives at C:\Windows\System32\drivers\etc\services. Even if you’ve never opened it, it’s been there, providing port name and number information as you go about your life.

The file is set up like this: name, port/protocol, aliases, and then usually a separate line for any comments, which is where you’ll often find names, email addresses, and sometimes dates. Like so:

ssh      22/udp  # SSH Remote Login Protocol

ssh      22/tcp  # SSH Remote Login Protocol

#                   Tatu Ylonen <ylo@cs.hut.fi>

The most common protocols are UDP and TCP, as those were the only ones you could reserve until a few years ago. As of RFC 6335 in August 2011 (more on that later), you can now snag a port to use with SCTP and/or DCCP as well. That RFC also changed the old practice of assigning a port for both UDP and TCP for a service: now the port is allocated only for the protocol requested and merely reserved for the others, which will only be used if port availability dwindles significantly.

Incidentally, the presence of a service in /etc/services does not mean the service is running on your computer. The file is a list of possibilities, not attendance on your machine (which is why your computer is probably not currently on fire).
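For a feel of the file’s shape, a few lines of Python can pull the name/port/protocol triples out of /etc/services-style lines. This is a sketch: real files also carry alias columns and those standalone comment lines with contributor names.

```python
import re

# Matches "name  port/protocol" at the start of a line; comment-only lines
# (including the contributor signatures) won't match.
SERVICE_RE = re.compile(r"^(\S+)\s+(\d+)/(\w+)")

def parse_services(text):
    """Return (name, port, protocol) tuples from /etc/services-style lines."""
    entries = []
    for line in text.splitlines():
        match = SERVICE_RE.match(line)
        if match:
            entries.append((match.group(1), int(match.group(2)), match.group(3)))
    return entries
```

Point it at your own copy of the file and you can answer questions like “what claims port 8080?” without scrolling all 13,000-plus lines.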

Going through the first 500-odd lines of the file will show you some familiar friends. ssh is assigned port 22. However, ssh also has an author: Tatu Ylonen. His bio includes a lot of pretty typical information for someone who’s listed this far up in this file: he designed the protocol, but he has also authored several RFCs, plus the IETF standards on ssh.

Jon Postel is another common author here, with 23 entries. His representation in this file just hints at the depth of his contributions – he was the editor of the Request for Comments document series, he created SMTP (Simple Mail Transfer Protocol), and he ran IANA until he died in 1998. A high /etc/services count is more a side effect of the sheer scope of his work than an accomplishment unto itself.

It’s cool to see this grand, ongoing repository of common service information, with bonus attribution. However, that first time I scrolled (and scrolled, and scrolled) through the entirety of /etc/services, what stayed with me were how many services and names I wasn’t familiar with – all this other work, separate of my path in tech thus far, with contact information and a little indicator of what that person was up to in, say, August 2006.

For instance: what’s chipper on port 17219? (It’s a research rabbit hole that took me about 25 minutes and led me through Google Translate, the Wayback Machine, LinkedIn, Wikipedia, and a 2004 paper from The European Money and Finance Forum, AMONG OTHER RESOURCES. Cough.) chipper, by Ronald Jimmink, is one of two competing e-purse schemes that once existed in the Netherlands; the longer-lasting competitor, Chipknip, concluded business in 2015. The allure of these cards, over a more traditional debit card, was that the value was stored in the chip, so merchants could conduct transactions without connectivity for their card readers. Races like this played out across Europe in the time before the standardization of the euro and banking protocols, and chipper is an artifact of the Netherlands’s own efforts to find an easier way to pay for things in a time before wifi was largely assumed.

Then there’s octopus on port 10008 (a port which apparently also earned some notoriety for being used by a worm once upon a time). Octopus is a professional Multi-Program Transport Stream (MPTS) software multiplexer, and you can learn more about it, including diagrams, here.

There are, of course, more than 49,000 others; if you have some time to kill, I recommend scrolling through and researching one whose service name, author, or clever port number sparks your imagination. Better still, run some of the service names by the longer-tenured engineers in your life for a time capsule opening they won’t expect.

Port numbers and types

Ports are divided into three ranges, splitting up the full range of 0-65535 (the range created by 16-bit numbers).

  • 0-1023 are system ports (also called well-known ports or privileged ports)
  • 1024-49151 are user ports (or registered ports)
  • And 49152-65535 are private ports (or dynamic ports)

Any service that runs on a system port must be started by the root user, not a less-privileged user. The idea behind this (per W3) is that you’re less likely to get a spoofed server process on a typically trusted port with this restriction in place.

Ok, but what actually uses this file? Why is it still there?

Most commonly, five C library routines use this file. They are, per (YES) the services man page:

  • getservent(), which reads the next entry from the services database (see services(5)) and returns a servent structure containing the broken-out fields from the entry.
  • getservbyname(), which returns a servent structure for the entry from the database that matches the service name using protocol proto
  • getservbyport(), which returns a servent structure for the entry from the database that matches the port port (given in network byte order) using protocol proto
  • setservent(), which opens a connection to the database, and sets the next entry to the first entry
  • endservent(), which closes the connection to the database

Together, these routines make the service name available by port number and vice versa. Thus, these two commands are equivalent:

  • telnet localhost 25
  • telnet localhost smtp

And it’s because of information pulled from /etc/services.

The use you’ve most likely encountered is netstat, if you give it flags to show service names. The names it shows are taken directly from /etc/services (meaning that you can futz with netstat’s output, if you have a little time).

In short: /etc/services used to match service to port to give some order and identity to things, and it’s used to tell developers when a certain port is off limits so that confusion isn’t introduced. Human readable, machine usable.

Enough about ports; let’s talk about the people

First, let’s talk about the method. Shortly after getting the acceptance for my !!con talk, I went through the entire /etc/services file, looking for people to write to. I scanned for a few things:

  • Email addresses with domains that looked personal and thus might still exist
  • Interesting service names
  • Email addresses from employers whose employees tended to have long tenures
  • Anything that sparked my interest

I have a lot of interest, and so I ended up with a list of 288 people to try to contact. The latest date in my local copy of /etc/services is in 2006, so I figured I’d be lucky to get responses from three people. And while more than half of the emails certainly bounced (and the varieties of bounce messages and protocols had enough detail to support their own fairly involved blog post), I got a number of replies to my questions about how their work came to involve requesting a port assignment, how it was that they knew what to do, and how the port assignment experience went for them.

I will say that the process revealed an interesting difference between how I’d write to folks as a writer and researcher (my old career) vs. how one writes questions as an engineer. As a writer working on a story that would eventually involve talking to people about a subject, I would research independently and only later approach the people involved; my questions would start a few steps back from what I already knew to be true from my research. This allows room for people to feel like experts, to provide color and detail, and to offer nuance that doesn’t come out if the person asking questions charges in declaring what they know and asking smaller, more closed questions.

This is… not the case in computer science, where questions are typically prefaced with a roundup of all 47 things you’ve googled, attempted, and wondered about, in the interest of expediency. This meant that my very second-person questions, in the vein of “how did you do this” and “what was the nature of your process,” were sometimes taken as some email rando not being able to, how you say, navigate the internet in a most basic way.

The more you know.

Happily, I got more than three responses, and people were incredibly generous in sharing their experiences, details of their work, and sometimes relevant messages from their astonishingly deep email archives.

bb, port 1984: Sean MacGuire

The first response I got, delightfully, was for the service named after my initials: bb, on port 1984. More delightfully, this turned out to be the port reserved for software called Big Brother, “the first Web-based Systems and Network Monitor.” Its author, Sean MacGuire, figured out the process for reserving a port after researching IANA’s role in it. At the time (January 1999), it was approved in 12 days. He described it as “totally painless.” Fun fact: Sean also registered the 65th domain registered in Canada, which required faxing proof of the company and making a phone call to the registrar.

The thing I started to learn with Sean’s response was how this was, at one point, pretty ordinary. Most web-based services restrict themselves to ports 80 and 443 now, in large part because a lot of enterprise security products clamp down on access by closing less commonly used ports, so reserving a port for your new service isn’t always a necessary step.

In which I am written to by one of the chief architects of the internet as we know it

The next response I got was from a little further back in computer science history. For context: I’d picked my 288 contacts across about ten days, which meant that, by the time I got to the end of the file, I couldn’t have recounted to you the first folks I’d selected.

This is all to say that I was a little startled to read this response:

> How did you come to be the person in charge of reserving the port

I designed the HTTP protocol

Which reminded me that I had indeed selected this entry as one worth diving into, when I was first getting a handle on this research:

http          80/udp www www-http # World Wide Web HTTP
http          80/tcp www www-http # World Wide Web HTTP
#                       Tim Berners-Lee <timbl@W3.org>

He was, I am pleased to say, impeccably polite in his brief response, and he recommended his book, Weaving the Web, which is such a wonderful look at finding compromise between competing standards and design decisions across dozens of institutions, countries, and strong-willed people. As he said, more information on his place within this work and that file can be found there, and if you’re at all curious, I so recommend it.

More contributors

I also liked that some people had fun with this or considered it, as Christian Treczoks of Digivote, port 3223, put it, a “real ‘YEAH!’ moment.” Barney Wolff, of LUPA, worked a few layers of meaning into his assignment of port 1212: “I picked 1212 because the telephone info number was <area>-555-1212. And LUPA (an acronym for Look Up Phone Access) was a pun on my last name. I don’t know if my bosses at ATT or anyone at IANA ever noticed that.”

Christian Catchpole claimed port 1185, appropriately named catchpole. He requested a low port number in the interest of claiming something memorable. He explained: “The original project back in 2002 involved a compact wire protocol for streaming objects, data structures and commands etc. While the original software written is no longer in the picture, the current purpose of the port number involves the same object streaming principal.  I am currently using the port for async communications for my autonomous marine robotics project.” The original uses of many ports have shifted into computer science history, but Christian’s projects live on.

Alan Clifford (mice, port 5022) claimed his space for a personal project; approval took 21 days. (He, like several people I contacted, keeps a deep and immaculate email archive.) Mark Valence (keysrvr at 19283 and keyshadow at 19315) recounted his involvement thusly: “I was writing the code, and part of that process is choosing a port to use.” He ended up in /etc/services around 1990 or 1991; his team had begun adding TCP/IP as an option for their network service a year or so earlier, enabling Macs, PCs, and various Unix systems to communicate with each other.

Ulrich Kortenkamp (port 3770, cindycollab) was one of two developers of Cinderella, and he claimed a port in /etc/services to codify their use of a private port for data exchange. He added: “And I am still proud to be in that file :)”

Greg Hudson’s contributions date to his time as a staff engineer at MIT, when he became a contributor to and then a maintainer of the school’s Zephyr IM protocol (zephyr-hm in the file) and then similarly with Subversion, the open-source version control system now distributed by Apache. His name is connected to ports 2102-2104 for Zephyr and port 3690 for Subversion.

Jeroen Massar has his name connected to four ports.

He noted that one of them, AURORA, also has an SCTP allocation, which is still fairly rare, despite that protocol being eligible for port assignments since 2011. He remarked, “[This] is actually likely the ‘cool’ thing about having ‘your own port number’: there is only 65536 of them, and my name is on 4 of them ;)”

I asked people how they knew what to do; some were basically like :shrug: “The RFC?” But others explained their context at the time. Mostly, folks seemed to have general industry awareness of this process and file just because of the work they did. (“I was the ‘Network Guy’ in the company,” said Christian Treczoks.)  Some knew the RFC or knew to look for it; others had been involved with the IETF and were around for the formation of these standards. My anecdotal impression was that it was, at that point, just part of the work. If you were on a project that was likely to need a port, it was known how you’d go about getting it.

Who controls this file? Where does the official version come from?

Like so many things in the big world of the internet, /etc/services and its contents are controlled by IANA. IANA’s official, most up-to-date version deviates some from what you might find locally; the version of /etc/services on my Mac, as I’ve mentioned, is about 13 years out of date. However, people are still claiming ports, and you can see the most current port assignments in IANA’s online version.

On most Unixes, the version of /etc/services you see is a snapshot of the file taken from IANA’s official version at the time that version of the OS was released. When installing new services, often you’ll want to tweak your local copy of /etc/services to reflect the new service, if it’s not already there, even if only as a reminder.
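If you do add one, the line follows the same columns as the rest of the file – service name, port/protocol, optional aliases, and a comment. (The service name here is made up for illustration:)

```
mycoolservice   3999/tcp            # internal widget API (hypothetical)
```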

However, updates vary between OSes; the version included with the Mac OS is not the most current, and how updates are added and communicated can vary widely. Debian, for instance, furnishes /etc/services as part of the netbase package, which includes “the necessary infrastructure for basic TCP/IP based networking.” If the included version of /etc/services got out of date, one could file a bug to get it updated.

To learn how /etc/services is managed and how to contribute, the gold standard is:

RFC 6335

No, seriously. The Procedures for the Management of the Service Name and Transport Protocol Port Number Registry has most of what you would need to figure out how to request your own port. While the process is less common than it once was, it’s still regimented and robustly maintained; there’s a whole page just of statistics for port request and assignment operations.

RFC 7605, Recommendations on Using Assigned Transport Port Numbers, includes guidance on when to request a port. Together, 7605 and 6335 make up BCP 165, though 6335 is still referred to frequently and is the more commonly sought and cited of the two.

How can you get your piece of /etc/services glory?

There’s real estate left to claim; as of this writing, more than 400 ports are still available. Other entries have notes saying the claim is due to be removed, with timestamps from a few years ago, and just haven’t been cleared yet.

If you have a service in need of a port, I am pleased to tell you that there is a handy form to submit your port assignment request. You’ll need a service name, transport protocol, and your contact information, as you might guess. You can also provide comments and some other information to bolster your claim. The folks managing this are pretty efficient, so if your request is valid, you could have your own port in a couple of weeks, and then you could be hiding inside of your computer many years from now, waiting to startle nosy nerds like me.

Despite the shift to leaning more heavily on ports 80 and 443, people are absolutely still claiming ports for their services. While the last date in my computer’s /etc/services file is from about a decade ago, the master IANA list already has a number of dates from this year.

So: make your service, complete the form, and wait approximately two weeks. Then cat /etc/services | grep $YourName, and your immortality is assured (until IANA does an audit, anyway).  

And if you do, please let me know. Enabling people to do great, weird, and interesting things is one of my favorite uses of time, and it’d make my day. Because there are no computers without people (or not yet, anyway), and a slice of that 16-bit range is waiting to give your work one more piece of legitimacy and identity.

Thanks to everyone who wrote back to me for being so generous with their experiences, to Christine Spang for the details about Debian /etc/services updates, to the patient overseers of RFC 6335, and to the authors of all the RFCs and other standards that have managed to keep this weird world running.