The logic of the world is prior to all truth and falsehood.
— Ludwig Wittgenstein[1]
A curated list of falsehoods programmers believe in. A falsehood is an idea that you initially believed to be true but that, in reality, proves to be false.
For example, take this idea: a valid email address has exactly one @ character. So you use this rule to implement your email-field validation logic. Right? Wrong! In reality, an email address can contain multiple @ characters, so your implementation should allow this. The initial idea is a falsehood you believed in.
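As a rough illustration of the email example, here is a minimal Python sketch. The helper name split_address is hypothetical, and the sketch assumes the domain part is an ordinary host name (which cannot contain @), so it splits on the last @ instead of rejecting addresses that contain more than one. It is not a full validator.

```python
def split_address(address: str) -> tuple[str, str]:
    """Hypothetical helper: split an address on its LAST '@'.

    Assumes the domain is a plain host name; a quoted local part
    such as "user@home" may legitimately contain '@'.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return local, domain


# A quoted local part with an '@' inside it still splits correctly.
print(split_address('"user@home"@example.com'))
```

Splitting on the first @, or counting @ characters, would wrongly reject this address.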
The articles listed below compile the false beliefs you should be aware of, to help you become a better programmer.
Contents
- Meta
- Arts
- Business
- Cryptocurrency
- Dates and Time
- Education
- Emails
- Geography
- Human Identity
- Internationalization
- Management
- Multimedia
- Networks
- Phone Numbers
- Postal Addresses
- Science
- Society
- Software Engineering
- Transportation
- Typography
- Video Games
- Web
Meta
- Falsehoods Programmers Believe - A brief list of common falsehoods. A great overview and quick introduction to the world of falsehoods.
- Falsehoods about Programming - A humbling and fun list on programming and programmers themselves.
- Falsehoods about Falsehoods Lists - Meta commentary on how these falsehoods shouldn't be handled.
Arts
- Falsehoods about Music - False assumptions that might be made when codifying music.
- Falsehoods about Art - Common misconceptions about art.
Business
- Falsehoods about Online Shopping - Covers prices, currencies and inventory.
- Falsehoods about Prices - Covers currencies, amounts and localization.
- Falsehoods about IBANs - International Bank Account Numbers are not international.
- Falsehoods about Economics - Economics are not simple or rational.
- Decimal Point Error in Etsy's Accounting System - The importance of types in accounting software: missing the decimal point ends up with 100x over-charges.
- Twenty five thousand dollars of funny money - Same error as above at Google Ads, or the danger of separating your pennies from your dollars, where $250 internal coupons turned into $25,000. My advice: get rid of integers and floats for monetary values. Use decimals. Or fall back to strings and parse them, don't validate.
- Characters < and > in company names lead to XSS attacks - Because the UK allows companies to be registered with special characters, a hacker leveraged them to register \"><SCRIPT SRC=MJT.XSS.HT></SCRIPT> LTD, but also ; DROP TABLE "COMPANIES";-- LTD, BETTS & TWINE LTD and SAFDASD & SFSAF \' SFDAASF\" LTD.
- Minutiae of company names - How the rules of the State of Delaware and the IRS do not intersect.
- CLDR currency definitions - Currency validity date ranges overlap due to revolts, invasions, new constitutions, and slow planned adoption.
- tax - A PHP 5.4+ tax management library.
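The advice above on monetary values (use decimals, and parse strings rather than merely validating them) can be sketched in Python with the standard decimal module. The helper parse_amount and its places parameter are hypothetical, and the default of two decimal places is itself an assumption that does not hold for every currency.

```python
from decimal import Decimal, ROUND_HALF_UP


def parse_amount(text: str, places: int = 2) -> Decimal:
    """Hypothetical helper: parse an amount string into an exact Decimal.

    `places` is an assumption about the currency's minor units.
    """
    amount = Decimal(text)  # raises decimal.InvalidOperation on garbage input
    if not amount.is_finite():
        raise ValueError(f"not a finite amount: {text!r}")
    # Decimal(1).scaleb(-2) is Decimal("0.01"), i.e. the rounding step.
    return amount.quantize(Decimal(1).scaleb(-places), rounding=ROUND_HALF_UP)


# Binary floats cannot represent these values exactly; decimals can.
print(0.1 + 0.2 == 0.3)
print(parse_amount("0.10") + parse_amount("0.20") == parse_amount("0.30"))
print(parse_amount("250"))
```

Keeping the amount as one decimal value, rather than separate dollar and cent integers, avoids the class of 100x errors described in the two incidents above.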
Cryptocurrency
- Falsehoods about Bitcoin - A list of mistaken perspectives on Bitcoin.
- Falsehoods about Ethereum - Misconceptions and common pitfalls in contract programming.
Dates and Time
- Falsehoods about Time - Seminal article on dates and time.
- More Falsehoods about Time - Part 2 of the article above.
- Falsehoods about Time and Time Zones - Another take on time-related falsehoods, with an emphasis on time zones.
- Critique of Falsehoods about Time - Takes on the first article above and provides an explanation of each falsehood, with more context and external resources.
- Falsehoods about Unix Time - Mind the leap second!
- Falsehoods about Time Zones - Has some nice points regarding the edge-cases of DST transitions.
- Your Calendrical Fallacy Is Thinking… - List covering intercalation and cultural influence, made by a community of iOS and macOS developers.
- Time Zone Database - Code and data that represent the history of local time for many representative locations around the globe.
- The Long, Painful History of Time - Most of the idiosyncrasies in timekeeping can find an explanation in history.
- You Advocate a Calendar Reform - Your idea will not work. This article tells you why.
- So You Want to Abolish Time Zones - Abolishing timezones may sound like a good idea, but there are quite a few complications that make it not quite so.
- The Problem with Time & Timezones - A video about why you should never, ever deal with timezones if you can help it.
- $26,000 Overcollection by Labor Department - The consequence of wrong calendar accounting.
- RFC-3339 vs ISO-8601 - A giant list of formats from the two standards, how they overlap, and live examples.
- ISO-8601, YYYY, yyyy, and why your year may be wrong - String formatting of dates is hard.
- UTC is Enough for everyone, right? - There are edge cases about dates and time (specifically UTC) that you probably haven't thought of.
- Storing UTC is not a silver bullet - “Just store dates in UTC” is not always the right approach.
- How to choose between UT1, TAI and UTC - Depends on your priorities between SI seconds, earth rotation sync, leap seconds avoidance.
- Why is subtracting these two times (in 1927) giving a strange result? - Infamous Stack Overflow answer about both complicated historical timezones, and how historical dates can be re-interpreted by newer versions of software.
- Critical and Significant Dates - From Y2K to the overflow of 32-bit seconds from Unix epoch, a list of special dates to watch for depending on the system.
- “I'm going to a commune in Vermont and will deal with no unit of time shorter than a season.” - The note left on his terminal by a quitting engineer in the 70s, after too much effort toiling away on sub-second timing concerns. Source: The Soul of a New Machine.
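To make the YYYY versus yyyy entry above concrete: in Unicode-style date patterns, YYYY denotes the ISO week-based year while yyyy denotes the calendar year, and the two disagree around New Year. The following Python sketch only illustrates that disagreement with the standard library; the chosen date is an example, not taken from the linked article.

```python
from datetime import date

# Monday 30 December 2019 already belongs to ISO week 1 of 2020.
d = date(2019, 12, 30)

calendar_year = d.year
iso_year, iso_week, iso_weekday = d.isocalendar()

print(calendar_year)                    # calendar year
print(iso_year, iso_week, iso_weekday)  # week-based year, week, weekday
```

Formatting this date with a week-based year pattern would print 2020 a few days early, which is the bug the article title alludes to.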
Education
- Falsehoods CS Students (Still) Believe Upon Graduating - A list of things (not only) computer science students tend to erroneously and at times surprisingly believe even though they (probably) should know better.
- Postdoc myths - “Lots of things are said, written and believed about postdoctoral researchers that are simply not true.”
Emails
- Falsehoods about Email - On addresses, content and delivery.
- I Knew How to Validate an Email Address Until I Read the RFC - Provides intricate examples of unexpectedly valid email addresses according to RFC-822.
- So you think you can validate email addresses (FOSDEM 2018) - Presentation of edge-case email addresses and why you should not use regex to parse them.
- Your E-Mail Validation Logic is Wrong - A summary of the various, surprising things that are allowed in an email address.
Geography
- Falsehoods about Geography - Takes on places, their names and locations.
- Falsehoods about Maps - Covers coordinates, projection and GIS.
- I Hate Coordinate Systems - A guide for geospatial practitioners on diagnosing and fixing common issues with coordinate systems.
- Top 5 most insane kanji place names in Japan - “There's one special group of kanji that's hard even for Japanese people to read: place names.”
Human Identity
- Falsehoods about Names - The article that started it all.
- Falsehoods about Names – With Examples - A revisited version of the article above, this time with detailed explanations.
- Falsehoods about Biometrics - Fingerprints are not unique.
- Falsehoods about Families - You can't really define a family with strict rules.
- Falsehoods about Gender: #1 & #2 - Gender is part of human identity and has its own subtleties.
- Falsehoods about Me - Issues at the intersection of names and gender and internationalization.
- Gay Marriage: The Database Engineering Perspective - How to store a marriage in a database while addressing most of the falsehoods about gender, naming and relationships.
- Personal Names Around the World - How do people's names differ around the world, and what are the implications for the Web?
- XKCD #327: Exploits of a Mom - Funny take on how implementation of a falsehood might lead to security holes.
- Hello, I'm Mr. Null. My Name Makes Me Invisible to Computers - Real-life example of how an implemented falsehood can have a negative impact on someone's life.
- HL7 v3 RIM - A flexible data model for representing human names.
- Apple iOS NSPersonNameComponentsFormatter - Localized representations of the components of a person's name.
Internationalization
On character encoding, string formatting, Unicode and internationalization.
- Falsehoods about Language - Translating software from English is not as straightforward as it seems to be.
- Falsehoods about Plain Text - Plain text can't cut it, which makes Unicode even more incredible for its ability to just work well.
- Falsehoods about text - A subset of the falsehoods from above, illustrated with some examples.
- Internationalis(z)ing Code - A video about things you need to keep in mind when internationalizing your code.
- Minimum to Know About Unicode and Character Sets - A good introduction to Unicode, its historical context and origins, followed by an overview of its inner workings.
- Awesome Unicode - A curated list of delightful Unicode tidbits, packages and resources.
- Dark corners of Unicode - Unicode is extensive, here be dragons.
- Let's Stop Ascribing Meaning to Code Points - Dives deeper into Unicode and dispels myths about code points.
- Breaking Our Latin-1 Assumptions - Most programmers spend so much time with Latin-1 that they forget about other scripts' quirks.
- Ode to a shipping label - Character encoding is hard, more so when each broken layer of data input adds its own spice.
- i18n Testing Data - Compilation of real-world international and diverse name data for unit testing and QA.
- Big List of Naughty Strings - A huge corpus of strings which have a high probability of causing issues when used as user-input data. A must have set of practical edge-cases to test your software against.
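As a small illustration of why code points should not be treated as characters, the Python sketch below compares two spellings of the same visible letter. It uses only the standard unicodedata module; the sample strings are chosen for illustration and do not come from the linked articles.

```python
import unicodedata

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

# Both render as the same letter, yet they differ as strings.
print(composed == decomposed)
print(len(composed), len(decomposed))

# Normalizing to the same form (NFC here) makes them comparable.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Comparing or measuring user-supplied text without normalizing it first is one way such assumptions leak into code.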
Management
- Falsehoods about Job Applicants - Assumptions about job applicants and their job histories aren't necessarily true.
Multimedia
- Falsehoods about Video - Covers it all: video decoding and playback, files, image scaling, color spaces and conversion, displays and subtitles.
- Horrible edge cases to consider when dealing with music - Music catalog data is full of crazy stuff.
- MusicBrainz database schema - An open-source project and database that seems to have solved the complexity of music catalog management.
- DDEX - The industry standard for music metadata, including archiving, sound recording, sales and usage reporting, royalties and license deals.
- Apple Music Style Guide - Quality assurance guidelines to format music, art, and metadata to increase discoverability.
Networks
- Falsehoods about Networks - Covers TCP, DHCP, DNS, VLANs and IPv4/v6.
- Fallacies of Distributed Computing - Assumptions that programmers new to distributed applications invariably make.
- There's more than one way to write an IP address - Some parts of the address are optional, mind the decimal and octal notations, and don't forget IPv6 either.
- IDN is crazy - International characters in domain names mean support of homographs and heterographs.
- hostname-validate - An attempt to validate hostnames in Python.
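The entry above on the many ways to write an IP address can be illustrated with a short Python sketch. parse_legacy_ipv4 is a hypothetical helper that mimics the classic inet_aton conventions (optional parts, octal and hexadecimal notation); it is a sketch for illustration, not a replacement for a real parser, and it ignores IPv6 entirely.

```python
import ipaddress


def parse_legacy_ipv4(text: str) -> ipaddress.IPv4Address:
    """Hypothetical helper: parse IPv4 text using the classic inet_aton rules."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4 or not all(p.isascii() and p.isalnum() for p in parts):
        raise ValueError(f"not an IPv4 address: {text!r}")
    numbers = []
    for part in parts:
        if part.lower().startswith("0x"):
            numbers.append(int(part[2:], 16))  # hexadecimal
        elif len(part) > 1 and part.startswith("0"):
            numbers.append(int(part, 8))       # a leading zero means octal
        else:
            numbers.append(int(part, 10))      # plain decimal
    *head, last = numbers
    # Every part but the last is one byte; the last part fills the remaining bytes.
    if any(n > 0xFF for n in head) or last >= 256 ** (4 - len(head)):
        raise ValueError(f"part out of range: {text!r}")
    value = last
    for index, n in enumerate(head):
        value |= n << (8 * (3 - index))
    return ipaddress.IPv4Address(value)


# Five spellings of the same loopback address.
for spelling in ("127.0.0.1", "127.1", "0177.0.0.1", "0x7f.0.0.1", "2130706433"):
    print(spelling, "->", parse_legacy_ipv4(spelling))
```

A filter that only compares the dotted-decimal string would treat these five spellings as five different hosts.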
Phone Numbers
- Falsehoods about Phone Numbers - Covers phone numbers, their representation and meaning.
- libphonenumber - Google's common Java, C++ and JavaScript library for parsing, formatting, and validating international phone numbers. Also available for C#, Objective-C, Python, Ruby and PHP.
Postal Addresses
- Falsehoods about Addresses - Covers streets, postal codes, buildings, cities and countries.
- Falsehoods about Residence - It's not only about the address itself, but the relationship between a person and their residence.
- Letter Delivered Despite No Name, No Address - Ultimate falsehood about postal addresses: you do not need one.
- What is the Most Minimal UK Address Possible? - The trick is to rely on postcodes, which in the UK are pretty specific and “often identify one or a few specific buildings, unlike countries where a postcode represents an entire neighbourhood”.
- The Bear with Its Own ZIP Code - Smokey Bear has his own ZIP Code (20252) because he gets so much mail.
- Why doesn't Costa Rica use real addresses? - Costa Rica uses an idiosyncratic system of addresses that relies on landmarks, history and quite a bit of guesswork.
- Regex and Postal Addresses - Why regular expressions and street addresses do not mix.
- Parsing the Infamous Japanese Postal CSV - “I saw many horrors, but I've never seen this particular formatting choice anywhere else.”
- USPS Postal Addressing Standards - Describes both standardized address formats and content.
- libaddressinput - Google's common C++ and Java library for parsing, formatting, and validating international postal addresses.
- addressing - A PHP 5.4+ addressing library, powered by Google's dataset.
- postal-address - Python module to parse, normalize and render postal addresses.
- address - Go library to validate and format addresses using Google's dataset.
Science
- Falsehoods about Systems of Measurement - On working with systems of measurement and converting between them.
Society
- Falsehoods about Political Appointments - Designing election systems has its own tricks.
- Falsehoods about Women In Tech - Myths about women in STEM (Science, Technology, Engineering, Math) industries.
Software Engineering
- Falsehoods about Versions - Attributing an identity to a software release might be harder than thought.
- Falsehoods about Build Systems - Building software is hard. Building software that builds software is harder.
- Falsehoods about Undefined Behavior - Invoking undefined behavior can cause anything to happen, for a much broader definition of "anything" than one might think.
- Falsehoods about CSVs - While RFC4180 exists, it is far from definitive and goes largely ignored.
- Falsehoods about Package Managers - Covers packages and their managers.
- Falsehoods about Testing - An attempt to establish a list of falsehoods about testing.
- Falsehoods about Search - Why search (including analysis, tokenization, highlighting) is deceptively complex.
- What every software engineer should know about search - A better sourced article on the difficulty of implementing search engines.
- Falsehoods about Pagination - Why your pagination algorithm is giving someone (possibly you) a headache.
- Falsehoods about garbage collection - Misconceptions about the predictability and performance of garbage collection.
- Myths about File Paths - The diversity of file systems and OSes makes file paths a little harder than we might think.
- The weird world of Windows file paths - “On any Unix-derived system, a path is an admirably simple thing: if it starts with a /, it's a path. Not so on Windows.”
- Myths about CPU Caches - Misconceptions about caches often lead to false assertions, especially when it comes to concurrency and race conditions.
- Myths about /dev/urandom - There are a few things about /dev/urandom and /dev/random that are repeated again and again. Still they are false.
- Facts about State Machines - State machines are often misunderstood and under-applied.
- Hi! My name is… - This talk could have been named falsehoods about usernames (and other identifiers).
- Popular misconceptions about mtime - Part of a post on why comparing a file's mtime could be considered harmful.
- Rules for Autocomplete - Not falsehoods per se, but still a great list of good practices to implement autocompletion.
- Floating Point Math - “Your language isn't broken, it's doing floating point math. (…) This is why, more often than not, 0.1 + 0.2 != 0.3.”
- The yaml document from hell - YAML is full of obscure complexity like accidental numbers and non-string keys.
- I am endlessly fascinated with content tagging systems - There are edge cases even in tagging systems, which are supposed to be bare-bones.
- Falsehoods about Quantum Technology - Common misconceptions about quantum technology and computers.
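The floating point entry above is easy to reproduce. The Python sketch below shows the comparison failing and one common way to compare with a tolerance instead; math.isclose with its default relative tolerance is used here as an assumption about what "close enough" means, which depends on the application.

```python
import math

total = 0.1 + 0.2

# Exact equality fails because none of these values is exactly representable in binary.
print(total == 0.3)
print(total)

# Comparing within a tolerance is usually what is actually intended.
print(math.isclose(total, 0.3))
```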
Transportation
- Falsehoods about Cars - Even something as common as defining a car is full of pitfalls.
- Falsehoods about Airline Seat Maps - Airline seat maps are far more complex than just neat rows and columns of seats.
- The Maddening Mess of Airport Codes - Having multiple international and national agencies trying to reconcile history, practicality and logistics makes codes follow arcane rules.
- My name causes an issue with any booking! - Old airline reservation systems interpret the MR suffix as Mister and drop it.
Typography
- Falsehoods about Fonts - Assumptions about typography on the web and in desktop applications.
- Truths programmers should know about case - A complete reverse of the falsehoods format, on the topic of case (as in uppercase and lowercase text).
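As one illustration of why case is harder than it looks, the Python sketch below uses the German ß, whose uppercase form is two letters long. The sample words are chosen for illustration and are not taken from the linked article.

```python
# Uppercasing can change the length of a string.
print("ß".upper())
print(len("ß"), len("ß".upper()))

# lower() is not enough for case-insensitive comparison; casefold() is designed for it.
print("Straße".lower() == "STRASSE".lower())
print("Straße".casefold() == "STRASSE".casefold())
```

Code that assumes upper() and lower() preserve length, or that they round-trip, breaks on inputs like these.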
Video Games
- The Door Problem - All the things you have not considered implementing for your doors in games.
Web
- Falsehoods about REST APIs - Pitfalls to be mindful of when creating and documenting APIs.
- URLs: It's complicated… - There are a lot of components in a URL, and all have their own logic.
- The Hidden Complexity of Downloading Favicons, Told in 15+ Edge Cases - Downloading that little icon you see in your browser tabs should be a simple exercise. It turns out to be a lot more complicated than you think. Be vigilant that you are not shaving a yak.
Contributing
Your contributions are always welcome! Please take a look at the contribution guidelines first.
Footnotes
This list has gained some popularity on social media over the past few years. See it being discussed and mentioned elsewhere.
The header image is based on a modified photo taken in February 2010 by Iza Bella, distributed under a Creative Commons BY-SA 2.0 UK license.
[1]: Notebooks, 1914-1916 (Liveright, 2022) - source: page 14e. [↑]