The Art of Readable Code
I. Code should be easy to understand
II. Packing information into names
1. Choosing specific words
Ex: a ‘getPage(url)’ method. The word "get" doesn't really say much. Does this method get a page from a local cache, from a database, or from the Internet? A more specific name might be ‘fetchPage()’ or ‘downloadPage()’.
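A minimal sketch of the difference, in Python. The function names, the in-memory dictionary standing in for a local cache, and the ‘http_get’ parameter are all hypothetical and only serve to illustrate the naming; each name states where the page comes from:

```python
# Hypothetical in-memory cache; a real application might use files or a database.
_page_cache = {"http://example.com/": "<html>cached copy</html>"}


def fetch_page_from_cache(url):
    """Return the cached page for url, or None if it was never cached."""
    return _page_cache.get(url)


def download_page(url, http_get):
    """Download the page over the network and remember it in the cache.

    http_get is an assumed stand-in for whatever HTTP client call is used.
    """
    page = http_get(url)
    _page_cache[url] = page
    return page
```

A caller reading ‘download_page(url, ...)’ can tell that network traffic is involved, which ‘get_page(url)’ would not reveal.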
Finding more colorful words
Key idea: It’s better to be clear and precise than to be cute.
Ex:
- send: deliver, dispatch, route, distribute, …
- find: search, extract, locate, …
- start: launch, begin, create, open, …
- make: create, setup, build, new, add, …
2. Avoiding generic names like tmp, retval, …
Generic names like these don’t pack much information. There are, however, some cases where generic names do carry meaning. Let’s take a look at when it makes sense to use them. Ex:
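
    if (right < left) {
        tmp = right;
        right = left;
        left = tmp;
    }

By contrast, in the following code ‘tmp’ is a weaker choice. The variable is short-lived, but being temporary storage is not the most important thing about it; a name like ‘user_info’ would be more descriptive:

    String tmp = user.name();
    tmp += " " + user.phone_number();
    tmp += " " + user.email();
    template.set("user_info", tmp);
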
Advice: The name ‘tmp’ should be used only when being short-lived and temporary is the most important fact about the variable.
Loop Iterators
Names like i, j, and k are commonly used for loop indexes, but when loops are nested, more precise names can be a better choice. For instance, the following loops find which users belong to which clubs:

    for (int i = 0; i < clubs.size(); i++)
        for (int j = 0; j < clubs[i].members.size(); j++)
            for (int k = 0; k < users.size(); k++)
                if (clubs[i].members[k] == users[j])
                    cout << "user[" << j << "] is in club[" << i << "]" << endl;

In the if statement, members[] and users[] are using the wrong index. Bugs like these are hard to spot because that line of code seems fine in isolation:
if (clubs[i].members[k] == users[j])
In this case, using more precise names may have helped. Instead of naming the loop indexes (i,j,k), another choice would be (club_i, members_i, users_i) or, more succinctly (ci, mi, ui). This approach would help the bug stand out more:
if (clubs[ci].members[ui] == users[mi]) // Bug! First letters don't match up.
When used correctly, the first letter of the index would match the first letter of the array:
if (clubs[ci].members[mi] == users[ui]) // OK. First letters match.
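The same idea can be sketched in Python. The sample data and the ‘find_memberships’ function name are made up for illustration; the point is only that each index's first letter matches the collection it indexes:

```python
# Made-up sample data for illustration.
clubs = [
    {"name": "chess", "members": ["ann", "bob"]},
    {"name": "go", "members": ["cat"]},
]
users = ["bob", "cat", "dan"]


def find_memberships(clubs, users):
    """Return (user index, club index) pairs for every user found in a club."""
    memberships = []
    for ci in range(len(clubs)):
        for mi in range(len(clubs[ci]["members"])):
            for ui in range(len(users)):
                # ci indexes clubs, mi indexes members, ui indexes users.
                if clubs[ci]["members"][mi] == users[ui]:
                    memberships.append((ui, ci))
    return memberships


print(find_memberships(clubs, users))
```

Swapping ‘mi’ and ‘ui’ in the comparison would stand out immediately, because ‘members[ui]’ and ‘users[mi]’ pair the wrong letters.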
3. Using concrete names instead of abstract names
When naming a variable, function, or other element, describe it concretely rather than abstractly. For example, suppose you have an internal method named ServerCanStart(), which tests whether the server can listen on a given TCP/IP port. The name ServerCanStart() is somewhat abstract, though. A more concrete name would be CanListenOnPort(). This name directly describes what the method will do.
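As a rough sketch of what a method with this concrete name might do, here is a Python version built on the standard ‘socket’ module. The function name ‘can_listen_on_port’, the default host, and the bind-based check are assumptions made for illustration, not the implementation the text describes:

```python
import socket


def can_listen_on_port(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
    return True
```

The name tells the reader exactly which question is answered. Note that the answer is only a snapshot: another process could take the port between this check and the real server start.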
4. Attaching extra information to a name by using a suffix or prefix
If there’s something very important about a variable that the reader must know, it’s worth attaching an extra “word” to the name. For example, suppose you had a variable that contained a hexadecimal string:
string id; // Example: "af84ef845cd8"
You might want to name it hexid instead, if it’s important for the reader to remember the ID’s format.
Values with Units
If your variable is a measurement (such as an amount of time or a number of bytes), it’s helpful to encode the units into the variable’s name. Ex:

    var start = (new Date()).getTime();

- ‘start’ should be renamed → start_ms (milliseconds)
- createCache(int size): size → size_mb
- throttleDownload(float limit): limit → max_kbps
Encoding other important attributes
Situation | Variable name | Better name |
---|---|---|
Bytes of html have been converted to UTF-8 | html | html_utf8 |
Incoming data has been url encoded | data | data_urlenc |
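A small Python sketch of both ideas, units and other attributes carried in the name. The sample values and the ‘elapsed_seconds’ helper are invented for illustration; ‘quote’ and ‘unquote’ come from the standard ‘urllib.parse’ module:

```python
from urllib.parse import quote, unquote


def elapsed_seconds(start_ms, end_ms):
    """Convert a pair of millisecond timestamps into elapsed seconds."""
    return (end_ms - start_ms) / 1000


# Units in the name: the _ms and _s suffixes make the conversion visible.
start_ms = 1_000
end_ms = 3_500
elapsed_s = elapsed_seconds(start_ms, end_ms)

# Encoding in the name: html is text, html_utf8 is the UTF-8 encoded bytes.
html = "<p>café</p>"
html_utf8 = html.encode("utf-8")

# State in the name: data_urlenc must be decoded before it is used as plain text.
data = "name=Ann Lee&lang=en"
data_urlenc = quote(data)
data_decoded = unquote(data_urlenc)
```

With these names, an expression such as ‘elapsed_s = end_ms - start_ms’ looks wrong at a glance, because the suffixes on the two sides disagree.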
5. How long should a name be?
When picking a good name, there’s an implicit constraint that the name shouldn’t be too long. The longer a name is, the harder it is to remember, and the more space it consumes on the screen, possibly causing extra lines to wrap.
On the other hand, programmers can take this advice too far, using only single-word (or single-letter) names. So how should you manage this trade-off? How do you decide between naming a variable d, days, or days_since_last_update?
This decision is a judgment call whose best answer depends on exactly how that variable is being used. One suggestion: shorter names are okay for shorter scopes.
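To illustrate the trade-off, here is a short Python sketch. The constant, the function, and the 30-day threshold are hypothetical; the contrast is between a name visible across a whole module and one that lives for two lines:

```python
from datetime import date

# Wide scope: visible to the whole module, so the name carries its full context.
MAX_DAYS_SINCE_LAST_UPDATE = 30


def is_stale(last_update, today):
    """Return True if last_update is older than the allowed number of days."""
    # Tiny scope: 'd' is defined and consumed within two lines, so a short
    # name costs the reader nothing.
    d = (today - last_update).days
    return d > MAX_DAYS_SINCE_LAST_UPDATE
```

If ‘d’ were a class member or a global, the reader would have no nearby definition to consult, and a name like ‘days_since_last_update’ would be the safer choice.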
III. Names that can’t be misconstrued
**Key idea**: Actively scrutinize your names by asking yourself, “What other meanings could someone interpret from this name?”
Ex: Filter() method
results = Database.all_objects.filter("year <= 2017");
Question: What does results now contain?
- Objects whose year is <= 2017
- Objects whose year is not <= 2017
It’s unclear whether it means “to pick out” or “to get rid of.” It’s best to avoid the name filter because it’s so easily misconstrued. If you want to pick out, a better name is ‘select()’. If you want to get rid of, a better name is ‘exclude()’.
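A minimal Python sketch of the two unambiguous alternatives. The helper functions and sample data are illustrative assumptions, not part of any database API:

```python
def select(items, predicate):
    """Keep only the items for which predicate is true ("pick out")."""
    return [item for item in items if predicate(item)]


def exclude(items, predicate):
    """Drop the items for which predicate is true ("get rid of")."""
    return [item for item in items if not predicate(item)]


years = [2015, 2017, 2019]
picked = select(years, lambda year: year <= 2017)
remaining = exclude(years, lambda year: year <= 2017)
```

Given the same condition, the two names leave no doubt about which side of it the result contains.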
Prefer min and max for (Inclusive) Limits
Ex: Let’s say your shopping cart application needs to stop people from buying more than 10 items at once:

    CART_TOO_BIG_LIMIT = 10
    if shopping_cart.number_items() > CART_TOO_BIG_LIMIT:
        Error("Too many items in cart")

The root problem is that CART_TOO_BIG_LIMIT is an ambiguous name: it’s not clear whether you mean “up to” or “up to and including”.
Advice: The clearest way to name a limit is to put max_ or min_ in front of the thing being limited.
In this case, the name should be MAX_ITEMS_IN_CART. The new code is simple and clear:

    MAX_ITEMS_IN_CART = 10
    if shopping_cart.number_items() > MAX_ITEMS_IN_CART:
        Error("Too many items in cart")

Prefer begin and end for Inclusive/Exclusive Ranges
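As a minimal sketch of the begin/end convention, assuming the common half-open form in which ‘begin’ is inclusive and ‘end’ is exclusive (the same convention Python's built-in ‘range()’ follows). The ‘events_in_range’ helper and the hour values are hypothetical:

```python
def events_in_range(event_hours, begin, end):
    """Return the hours h with begin <= h < end (begin inclusive, end exclusive)."""
    return [h for h in event_hours if begin <= h < end]


event_hours = [0, 9, 23, 24, 30]
first_day = events_in_range(event_hours, begin=0, end=24)
second_day = events_in_range(event_hours, begin=24, end=48)
```

Because ‘end’ is exclusive, adjacent ranges can share a boundary value without any event being counted twice.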
Naming Booleans
When picking a name for a boolean variable or a function that returns a boolean, be sure it’s clear what true and false really mean. Here’s a dangerous example:
boolean read_password = true;
Depending on how you read it (no pun intended), there are two different interpretations:
- we need to read the password
- the password has already been read
In this case, it’s best to avoid the word ‘read’, and name it ‘need_password’ or ‘user_is_authenticated’ instead. In general, adding words like ‘is’, ‘has’, ‘can’, or ‘should’ can make booleans clearer.
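A brief Python sketch of how the two readings become two separately named booleans. The ‘login_state’ function and its parameters are invented for illustration:

```python
def login_state(password_was_entered, password_is_correct):
    """Return (need_password, user_is_authenticated) for a login attempt."""
    # "We need to read the password": nothing has been entered yet.
    need_password = not password_was_entered
    # "The password has already been read" and it checked out.
    user_is_authenticated = password_was_entered and password_is_correct
    return need_password, user_is_authenticated
```

Each name now admits only one interpretation, so a condition such as ‘if need_password:’ reads the same way to every reader.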
Matching Expectations of Users
Some names are misleading because the user has a preconceived idea of what the name means, even though you mean something else.
Ex: get()
Many programmers are used to the convention that methods starting with get are “lightweight accessors” that simply return an internal member. Going against this convention is likely to mislead those users.
Here’s an example, in Java, of what not to do:

    public class StatisticsCollector {
        public void addSample(double x) { ... }

        public double getMean() {
            // Iterate through all samples and return total / num_samples
        }
        ...
    }

In this case, the implementation of ‘getMean()’ is to iterate over past data and calculate the mean on the fly. This step might be very expensive if there’s a lot of data. But an unsuspecting programmer might call ‘getMean()’ carelessly, assuming that it’s an inexpensive call.
Instead, the method should be renamed to something like ‘computeMean()’, which sounds more like an expensive operation.
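A minimal Python sketch of the renamed method. The class layout, the list used for storage, and the error raised on an empty collector are assumptions made for illustration:

```python
class StatisticsCollector:
    """Collects samples; the mean is computed on demand, not stored."""

    def __init__(self):
        self._samples = []

    def add_sample(self, x):
        self._samples.append(x)

    def compute_mean(self):
        """Iterate over every sample; the cost grows with the sample count."""
        if not self._samples:
            raise ValueError("no samples collected")
        return sum(self._samples) / len(self._samples)
```

The ‘compute_’ prefix signals that the call does real work each time, so a caller is less likely to invoke it inside a tight loop by accident.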