Add Magic Numbers article from rachelbythebay.com

This commit is contained in:
lhark 2021-01-04 23:05:57 +01:00
parent eaa5fd2fdd
commit de84216928

157
magicnumbers Normal file
View file

@ -0,0 +1,157 @@
# Magic numbers from https://rachelbythebay.com/w/2020/11/26/magic/
"HTTP" is 0x48545450 (1213486160) or 0x50545448 (1347703880). If it shows up in your error logs, you're probably calling out to a poor web server instead of someone who actually speaks its binary protocol. See also: malloc("HTTP").
"GET " is 0x47455420 (1195725856) or 0x20544547 (542393671). This might show up in your server's logs if someone points a HTTP client at it - web browser, curl, that kind of thing.
"SSH-" is 0x5353482d (1397966893) or 0x2d485353 (759714643). If this shows up in your error logs, you're probably calling out to some poor SSH daemon instead of someone speaking your binary language.
1048576 is 2^20. It's a megabyte if you're talking about memory.
86400 is the number of seconds in a UTC day, assuming you aren't dealing with a leap second.
86401 would be the number when you *are* dealing with a (positive) leap second.
86399 is actually possible, if the planet speeds up and we need to *skip* a second - the still-mythical "negative leap second".
82800 (23 hours) is what happens in the spring in time zones which skip from 1:59:59 to 3:00:00 in the early morning: you lose an hour. Any scheduled jobs using this time scale in that hour... never run!
90000 (25 hours) is what you get in the fall in time zones which "fall back" from 1:59:59 summer time to 1:00:00 standard time and repeat the hour. Any scheduled jobs using this time scale in that hour... probably run twice! Hope they're idempotent, and the last one finished already! See also: Windows 95 doing it over and over and over.
10080 is the number of minutes in a regular week. You run into this one a lot if you write a scheduler program with minute-level granularity.
10140 is what happens when you have the spring DST transition.
10020 is then what you get in the winter again.
40 is the number of milliseconds of latency you might find that you can't eliminate in your network if you don't use TCP_NODELAY. See also: Nagle.
18 is the number of seconds that GPS and UTC are currently spread apart, because that's how many leap seconds have happened since GPS started. UTC includes them while GPS does not (but it does supply a correction factor). See also: bad times with certain appliances.
168 is one week in hours. Certain devices that do self-tests operate on this kind of timeframe.
336 is two weeks in hours and is another common value.
2000 is how many hours you work in a year if you work 40 hours a week for 50 weeks and the other two are handled as "vacation". This is why you can approximate a full-time yearly wage from an hourly wage by multiplying by 2: $20/hr -> $40K/year.
08 00 is the hex representation of what you get in an Ethernet packet for IPv4 traffic right after the destination and source hardware addresses. See also: EtherType.
45 00 is what comes after that, assuming a relatively boring IPv4 packet.
08 06 is what shows up in an Ethernet packet after the destination and source hardware addresses when some host is trying to do ARP for another host. You'll probably see it right before the aforementioned "08 00 45 00" when two hosts haven't talked to each other recently on the same network.
86 DD is what you'll get for IPv6, and you should see more of that in your future. Also, you won't see ARP.
2130706433 may be interpreted as 127.0.0.1 by some programs, and you can even ping it on some systems depending on how they handle network addresses for various fun and arcane historical reasons. If it works on your machine, others should, too. See also: ping and inet_aton and the first part.
33434 is the first UDP port used by the traditional (i.e., Van Jacobson style) traceroute implementation. See also: 32768 + 666.
49152 is $C000, or a block of open memory on a Commodore 64 frequently used for non-BASIC funtimes. SYS 49152 is how you jump there.
64738 is the reset vector on those same Commodore 64 machines.
64802 is the reset vector for people claiming to be even more OG than the C-64 crowd - it's from the VIC-20.
828 was the beginning of the cassette buffer on a Commodore 64, and was a place you could store a few bytes if you could be sure nobody would be using the cassette storage system while your program was running.
9090909090 might be a string of NOPs if you are looking at x86 assembly code. Someone might have rubbed out some sequence they didn't like. See also: bypassing copy protection for "software piracy".
31337 might mean you are elite, or if you like, eleet.
f0 0f c7 c8 is a sequence made famous in 1997 when people discovered they could lock up Intel machines while running as an unprivileged user on an unpatched system. See also: the Pentium f00f bug.
-1 is what you get back from fork() when it fails. Normally, it returns a pid_t which is positive if you're the parent, or zero if you're the child.
-1, when handed to kill() on Linux, means "target every process but myself and init". Therefore, when you take the return value from fork(), fail to check for an error, and later hand that value to kill(), you might just kill everything else on the machine. See also: fork() can fail - this is important.
15750 Hz is the high-pitched whine you'd get from some old analog NTSC televisions. See also: colorburst.
10.7 MHz is an intermediate frequency commonly used in FM radios. If you have two radios and at least one has an analog dial, try tuning one to a lower frequency and another to 10.7 above that (so 94.3 and 105.0 - hence the need for analog tuning). You may find that one of them squashes the signal from the other at a decent distance, even. See also: superheterodyne conversion.
455 kHz is the same idea but for AM receivers.
64000 is the signaling rate of a DS0 channel. You get it by having 8000 samples per second which are 8 bits each.
4000 Hz is the absolute max theoretical frequency of a sound you could convey during a phone call going across one of those circuits. In practice, it rolls off significantly a few hundred Hz before that point. See also: Nyquist.
1536000 therefore is the rate of a whole T1/DS1, since it's 24 DS0s. This is where we get the idea of a T1 being 1.5 Mbps.
/16 in the land of IPv4 is 2^16, or 65536 addresses. In the old days, we'd call this a "class B".
/17 in IPv4 is half of that /16, so 32768.
/15 in IPv4 would be double the /16, so 131072.
/64 is a typical IPv6 customer allocation. Some ISPs go even bigger. Just a /64 gives the customer 2^64 addresses, or four billion whole IPv4 Internets worth of space to work with. See also: IPv6 is no longer magic.
1023 is the highest file descriptor number you can monitor with select() on a Unix machine where FD_SETSIZE defaults to 1024. Any file descriptor past that point, when used with select(), will fail to be noticed *at best*, will cause a segmentation fault when you try to flip a bit in memory you don't own in the middle case, and will silently corrupt other data in the worst case. See also: poll, epoll, and friends, and broken web mailers.
493 is what you see if someone takes a common file mode from a Unix box of 0755 (-rwxr-xr-x) and renders it as decimal. The 0755 notation you'd typically see for that sort of thing is octal. See also: bad times with unit tests.
420 is likewise the decimal for 0644, the equivalent sort of mode (user gets to read and write, everyone else can't write) when it's not executable.
497 is the approximate number of days a counter will last if it is 32 bits, unsigned, starts from zero, and ticks at 100 Hz. See also: old Linux kernels and their 'uptime' display wrapping around to 0.
(2^32) - (100 * 60 * 60 * 24 * 497)
887296
(2^32) - (100 * 60 * 60 * 24 * 498)
-7752704
49.7 is the approximate number of days that same counter will last if it instead ticks at 1000 Hz. See also: Windows 95 crashing every month and a half (if you could keep it up that long).
(2^32) - (1000 * 60 * 60 * 24 * 49)
61367296
(2^32) - (1000 * 60 * 60 * 24 * 50)
-25032704
208 is the approximate number of days a counter will last if it is 64 bits, unsigned, starts from zero, ticks every nanosecond, and is scaled by a factor of 2^10. See also: Linux boxes having issues after that many days of CPU (TSC) uptime (not necessarily kernel uptime, think kexec).
(2^64) - (1000000000 * 60 * 60 * 24 * 208 * 1024)
44235273709551616
(2^64) - (1000000000 * 60 * 60 * 24 * 209 * 1024)
-44238326290448384
248 is the approximate number of days a counter will last if it is 32 bits, signed, starts from zero, and ticks at 100 Hz. See also: the 787 GCUs that have to be power-cycled at least that often lest they reboot in flight.
(2^31) - (100 * 60 * 60 * 24 * 248)
4763648
(2^31) - (100 * 60 * 60 * 24 * 249)
-3876352
128 is 2^7, and you'll approach this (but never hit it) when you fill up a tiny unsigned counter on an old system. See also: our cherished 8 bit machines of the 80s, the number of lives you could get in Super Mario Bros, and countless other places of similar vintage.
256 is 2^8, in the event that same counter was unsigned.
32768 is 2^15, and if you used a default 16 bit signed value in your SQL database schema (perhaps as "int"), this is the first number you can't jam into that field. See also: a certain company's "employee ID" column in a database that broke stuff for me as a new employee once upon a time.
65536 is 2^16, in the event that same counter was signed instead.
16777... is the first five digits of 2^24, a number you might see when talking about colors if you have 3 channels (R, G, B) and 8 bits per channel. See also: "16.7 million colors" in marketing stuff from back in the day.
2147... is the first four digits of 2^31, a number you will probably see right around the time you wrap a 32 bit signed counter (or run out of room). See also: Unix filesystems without large file support.
4294... is the first four digits of 2^32, and you'll see that one right around the time you wrap a 32 bit unsigned counter (or run out of room).
9223... is the first four digits of 2^63... 64 bit signed counter...
1844... 2^64... 64 bit unsigned counter...
"1969", "December 31, 1969", "1969-12-31" or similar, optionally with a time in the afternoon, is what you see if you live in a time zone with a negative offset of UTC and someone decodes a missing (Unix) time as 0. See also: zero is not null.
"1970", "January 1, 1970", "1970-01-01" or similar, optionally with a time in the morning... is what you get with a positive offset of UTC in the same situation. Zero is not null, null is not zero!
A spot in the middle of the ocean that shows up on a map and, once zoomed WAY out, shows Africa to the north and east of it is 0 degrees north, 0 degrees west, and it's what happens when someone treats zero as null or vice-versa in a mapping/location system. See also: Null Island.
5 PM on the west coast of US/Canada in the summertime is equivalent to midnight UTC. If things suddenly break at that point, there's probably something that kicked off using midnight UTC as the reference point. The rest of the year (like right now, in November), it's 4 PM.
2022 is when 3G service will disappear in the US for at least one cellular provider and millions of cars will no longer be able to automatically phone home when they are wrecked, stolen, or lost. A bunch of home security systems with cellular connectivity will also lose that connectivity (some as backup, some as primary) at that time. Auto and home insurance rates will go up for some of those people once things are no longer being monitored. Other "IoT" products with that flavor of baked-in cellular connectivity which have not been upgraded by then will mysteriously fall offline, too - parking meters, random telemetry loggers, you name it.
February 2036 is when the NTP era will roll over, a good two years before the Unix clock apocalypse for whoever still hasn't gotten a 64 bit time_t by then. At that point, some broken systems will go back to 1900. That'll be fun. Everyone will probably be watching for the next one and will totally miss this one. See also: ntpd and the year 2153.
January 19, 2038 UTC (or January 18, 2038 in some local timezones like those of the US/Canada) is the last day of the old-style 32 bit signed time_t "Unix time" epoch. It's what you get when you count every non-leap second since 1970. It's also probably when I will be called out of retirement for yet another Y2K style hair on fire "fix it fix it fix it" consulting situation.
1024 is how many weeks you get from the original GPS (Navstar, if you like) week number before it rolls over. It's happened twice already, and stuff broke each time, even though the most recent one was in 2019. The next one will be... in November 2038. Yes, that's another 2038 rollover event. Clearly, the years toward the end of the next decade should be rather lucrative for time nuts with debugging skills! See also: certain LTE chipsets and their offset calculations.
52874 is the current year if someone takes a numeric date that's actually using milliseconds as the unit and treats it like it's seconds, like a time_t. This changes quickly, so it won't be that for much longer, and by the time you read this, it'll be even higher. See also: Android taking me to the year 42479.
$ date -d @$(date +%s)000
Sun Mar 25 21:50:00 PDT 52874
Any of these numbers might have an "evil twin" when viewed incorrectly, like assuming it's signed when it's actually signed, so you might see it as a "very negative" number instead of a VERY BIG number. Example: -128 instead of 128, -127 instead of 129, -126 instead of 130, and so on up to -1 instead of 255. I didn't do all of them here because it's just too much stuff. See also: two's complement representation, "signed" vs. "unsigned" (everywhere - SQL schemas, programming languages, IDLs, you name it), and *printf format strings, and epic failure by underflow.
These numbers ALSO have another "evil twin" in that they might be represented in different byte orders depending on what kind of machine you're on. Are you on an Intel machine? That's one way. Are you looking at a network byte dump? That's another. Did you just get a new ARM-based Macbook and now the numbers are all backwards in your raw structs? Well, there you go. I listed them for the HTTP/GET /SSH- entries up front because I know someone will search for them and hit them some day. The rest, well, you can do yourself. See also: htons(), htonl(), ntohs(), ntohl(), swab() and a bunch more depending on what system you're on.