Tend the Fire: Matt Elder’s blog (http://fiddlemath.net/)

<h1 id="book-summary-dont-shoot-the-dog">Book Summary: Don’t Shoot the Dog (2013-08-07)</h1>

<p><em>Don’t Shoot the Dog</em>, by Karen Pryor, is an excellent book on training via positive conditioning. It’s explicitly about animal training, but it’s clearly applicable to training yourself and other humans, too. I kept notes when I thought I was learning something new; this is my summary of things learned.</p>
<h1 id="1-reinforcement">1. Reinforcement</h1>
<h2 id="11-reinforcement-vs-rewards">1.1. Reinforcement vs. Rewards</h2>
<p>A reinforcer is anything that tends to increase the probability that the act will occur again. You can make an action more frequent with positive reinforcement; you cannot reinforce behavior that isn’t happening.</p>
<p>An “aversive” is anything a person or animal will work to avoid. A negative reinforcer is an aversive that can be halted or avoided by changing behavior – preferably, as soon as the new behavior starts, the aversive stops.</p>
<p>The book spends a while talking about “reward” and “punishment”, and how most people’s notions of these differ widely from positive and negative reinforcement.</p>
<h2 id="12-timing-reinforcers">1.2. Timing reinforcers</h2>
<p>The timing of a reinforcer carries precise information. A “yes” at the right moment during a lesson is far more effective and informative than thorough praise five minutes later. Reinforcing too late is one of the most common difficulties – actions happen quickly, and you might be reinforcing the wrong things. It may even help to have someone else watch you for late reinforcers. “Laggardly reinforcement is the beginning trainer’s biggest problem.”</p>
<p>Reinforcing too late is common when rewarding behavior; reinforcing too early is common when trying to be encouraging – like getting kids to do schoolwork, or to think they’re smart. Both train unexpected or random behavior.</p>
<p>Timing is also important for negative reinforcement. If the negative reinforcer doesn’t cease exactly when behavior has been modified, it neither reinforces nor informs.</p>
<h2 id="13-scheduling-reinforcers">1.3. Scheduling reinforcers</h2>
<p>Reinforcers should be as small as will be noticed. A small mouthful of food for an animal or a single M&M for a human; a smile, a pat, a single “good”. If you keep the reinforcer small, the trainee will tire of it less quickly.</p>
<p>In animals, you can get about 20 reinforcements in per session, and no more than about 80 per day, before they lose interest. This varies by species and age.</p>
<h2 id="14-jackpots">1.4. Jackpots</h2>
<p>Surprising or “unearned” positive reinforcers of, say, ten times the usual size, might relieve a feeling of oppression, resentment, or sullen inaction. Apparently, this isn’t really understood.</p>
<h2 id="15-conditioned-reinforcers">1.5. Conditioned reinforcers</h2>
<p>A conditioned reinforcer is a signal deliberately presented before or during the delivery of a primary reinforcer (clickers, whistles, “good dog”). A conditioned reinforcer lets you pinpoint the timing of reinforcement – essentially, a way to deliver reinforcement at the exact moment of the action, even when delivering the primary reinforcer at that moment is impossible (e.g., training dolphins to jump).</p>
<p>This signal should be reserved for this purpose – don’t give it outside of reinforcement training. You can give all the love and affection you like outside of training; but reserve the “good job” signal to signal correctly learned behavior. (Consider: even small children resent false praise!)</p>
<p>Train a conditioned reinforcer by pairing it with multiple other reinforcers – this way, the learner reacts positively even if they’re currently satiated with the food, water, or praise you gave it in training. In training, these signals should be followed, when first possible, by a primary reinforcer.
You can tell when an animal recognizes the “good job” signal; it startles, and starts to seek the primary reinforcer.</p>
<p>The “good job” signal will end offered behavior. As such, you may need a second signal to say “good, keep going.” This keep-going signal doesn’t have to be trained directly with a primary reinforcer; the learner will soon start to recognize it as signalling an intermediate step if it’s followed often enough by the “good job” signal.</p>
<p>A conditioned negative reinforcer, if needed, should always be a warning that current behavior will lead to immediate punishment. It must precede actual punishment, not just be shouting while punishing (as with a choke chain).</p>
<h2 id="16-reinforcement-schedules">1.6. Reinforcement Schedules</h2>
<p>Constant reinforcement is only needed during learning. To maintain a behavior, it’s important to switch to variable reinforcement. Dropping reinforcement suddenly will quickly extinguish the behavior; gradually moving to longer variable reinforcement will make the behavior persist.</p>
<p>Better, once the behavior is on variable reinforcement, you can selectively reinforce the best varieties of the learned behavior, shaping it. (e.g. training dolphins to jump high)</p>
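<p>As a toy illustration (mine, not Pryor’s), here’s a minimal sketch of why gradually thinning a schedule maintains behavior while cutting reinforcement off abruptly extinguishes it. The response-strength model and all the numbers are assumptions, chosen only to make the dynamic visible:</p>

```python
# Toy model of response strength under different reinforcement schedules.
# Assumptions (mine, not the book's): each reinforced response adds 0.1 to
# strength, each unreinforced response subtracts 0.02, clamped to [0, 1].

def run_schedule(reinforced_flags, strength=1.0):
    """Update response strength over a sequence of trials."""
    for reinforced in reinforced_flags:
        strength += 0.1 if reinforced else -0.02
        strength = max(0.0, min(1.0, strength))
    return strength

# Abrupt cutoff: continuous reinforcement, then none at all.
abrupt = [True] * 20 + [False] * 100

# Gradual thinning: continuous, then every other response,
# then one reinforcer per five responses.
thinned = ([True] * 20
           + [i % 2 == 0 for i in range(20)]
           + [i % 5 == 0 for i in range(80)])

print(run_schedule(abrupt))   # extinguished: strength decays to zero
print(run_schedule(thinned))  # maintained: strength stays high
```

<p>The exact numbers don’t matter; the point is that the behavior survives as long as each reinforcement, on average, offsets the decay accumulated since the last one.</p>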
<h2 id="17-exceptions-to-variable-reinforcement">1.7. Exceptions to Variable Reinforcement</h2>
<p>Do not put puzzles or tests on variable reinforcement. Roughly speaking, the learner is uncertain that they performed correctly, and needs the feedback.</p>
<h2 id="18-long-reinforcement-schedules">1.8. Long Reinforcement Schedules</h2>
<p>Use variable, not fixed, schedules. Extremely long schedules sometimes lead to extinction, too – usually at a metabolic boundary, where the reward isn’t worth the energy to do the needed work.</p>
<p>Long reinforcement schedules also lead to “slow starts”; the behavior will happen, but even animals will put off starting the long sequence of behaviors as the schedule gets longer. Pryor seems to equate this to procrastination, and says that you can beat it by introducing a reinforcer for getting started.</p>
<h2 id="19-reinforcing-yourself">1.9. Reinforcing yourself</h2>
<p>Do it, do it. Most of us are vastly under-reinforced for our behavior.</p>
<h1 id="2-shaping">2. Shaping</h1>
<p>Shaping: teach successive approximations to desired behavior. Works because even trained behavior is variable.</p>
<p>Establish intermediate goals, starting with a behavior that already occurs sometimes.</p>
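<p>The shaping loop sketched above can be written as a toy simulation (my own model, not from the book): the subject’s behavior varies around a current tendency, responses that meet the criterion are reinforced and pull the tendency toward themselves, and the criterion is raised only within the range the subject is already achieving:</p>

```python
import random

random.seed(0)  # deterministic for illustration

# Toy shaping model (my assumptions, not Pryor's): responses are noisy
# around a current tendency; reinforcement nudges the tendency toward the
# reinforced response; the criterion rises to the median of recent behavior.

def shape(trials=500, learning_rate=0.05):
    tendency, criterion = 0.0, 0.0
    recent = []
    for _ in range(trials):
        response = random.gauss(tendency, 1.0)  # even trained behavior varies
        recent.append(response)
        if response >= criterion:               # meets criterion: reinforce
            tendency += learning_rate * (response - tendency)
        if len(recent) == 50:                   # raise the criterion within the
            recent.sort()                       # range already being achieved
            criterion = recent[25]
            recent = []
    return tendency, criterion

tendency, criterion = shape()
print(tendency, criterion)  # both drift well above the starting point
```

<p>Because the criterion only ever moves to the median of what the subject just did, every raise leaves a realistic chance of reinforcement – which is exactly the first law of shaping below.</p>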
<h2 id="21-the-ten-laws-of-shaping">2.1. The Ten Laws of Shaping</h2>
<h3 id="211-raise-criteria-in-increments-small-enough-that-the-subject-always-has-a-realistic-chance-for-reinforcement">2.1.1. Raise criteria in increments small enough that the subject always has a realistic chance for reinforcement.</h3>
<p>Raise criteria within the range that the subject is already achieving. The increment is not a function of the subject’s ability, but of the subject’s current behavior.</p>
<h3 id="212-train-one-aspect-of-a-behavior-at-a-time-dont-try-to-shape-for-two-criteria-simultaneously">2.1.2. Train one aspect of a behavior at a time; don’t try to shape for two criteria simultaneously.</h3>
<p>“At a time”, here, means to only work on one criterion at a time, to avoid confusing the subject. If the task can be broken into separate components, do that. (e.g. putting involves getting angle and distance right. You can shape these separately!)</p>
<h3 id="213-put-the-current-response-on-a-variable-schedule-of-reinforcement-before-adding-or-raising-criteria">2.1.3. Put the current response on a variable schedule of reinforcement before adding or raising criteria.</h3>
<p>If a learner has been earning reinforcers predictably, simply skipping reinforcers might be confusing. You have to train that, too – and train it before you can train other criteria. Once the subject has learned that a skipped reinforcer doesn’t mean “wrong”, just “try again”, you can start selecting for a new criterion.</p>
<h3 id="214-when-introducing-a-new-criterion-temporarily-relax-the-old-ones">2.1.4. When introducing a new criterion, temporarily relax the old ones.</h3>
<p>Well-learned behavior may fall apart when learning new skills. Shape the new skills anyway; reinforce the combination of skills once each skill has been learned and the combination sometimes occurs.</p>
<h3 id="215-plan-ahead-of-your-subject-so-that-youre-prepared">2.1.5. Plan ahead of your subject, so that you’re prepared.</h3>
<p>If your subject makes a sudden leap forward, you want to know what to reinforce. (Some especially intelligent animals, and certainly people, can learn to anticipate your program of shaping, and just go ahead and do each step immediately.) Even if these occasions are rare, they’re exciting for you and the subject, and you want to be prepared to use that excitement if it happens.</p>
<h3 id="216-stick-to-one-shaper-per-behavior">2.1.6. Stick to one shaper per behavior.</h3>
<p>Different trainers will have different ideas of the same criterion.</p>
<h3 id="217-if-one-shaping-procedure-is-not-eliciting-progress-find-another">2.1.7. If one shaping procedure is not eliciting progress, find another.</h3>
<p>These things aren’t magic; there are plenty of programs leading to the same behavior.</p>
<h3 id="218-dont-interrupt-a-training-session-gratuitously-that-constitutes-a-punishment">2.1.8. Don’t interrupt a training session gratuitously; that constitutes a punishment.</h3>
<p>This really only applies to formal training – giving lessons, training an animal – not informal settings, where smiles and fluid interaction suffice. Your attention matters; in a training setting, failing to give a reinforcer must be a considered action, and this means you have to attend to all behavior. Removing your attention is a rebuke; don’t do it frivolously.</p>
<h3 id="219-if-behavior-deteriorates-go-back-to-kindergarten-quickly-review-the-whole-shaping-process-with-a-series-of-easily-earned-reinforcers">2.1.9. If behavior deteriorates, “go back to kindergarten”; quickly review the whole shaping process with a series of easily earned reinforcers.</h3>
<p>Sometimes you’re actually working under slightly changed circumstances, and that change of context “loosens” the training. You might not even know what’s changed. If trained behavior seems to be completely shot, just review the whole process. This can be quick.</p>
<h3 id="2110-end-each-session-on-a-high-note-if-possible-but-in-any-case-quit-while-youre-ahead">2.1.10. End each session on a high note, if possible; but in any case, quit while you’re ahead.</h3>
<p>Different subjects can take different amounts of time. (An hour may be about as long as a human can usefully learn; it’s certainly a traditional period in many contexts.)
What to stop on is more important. Always quit while you’re ahead, both to end sessions and to switch behaviors. Move on as soon as some progress has been achieved. (Peak-end rule!) If you keep working on a behavior when the subject is tired, and the behavior deteriorates, you’ll untrain it! If it’s not yet a good time to end the training session, that’s a good time to move to a different behavior.
In fact, if you stop on that high note, and let that accomplishment be the most prominent memory, you’ll often see better performance in the next session than the best performance you just saw.
If a session isn’t going to hit a high note – fatigue will set in before achievement, say – then end the session with an easy, guaranteed way to earn a reinforcer, so the whole session is remembered as being reinforcing. End with easy play or other games.
Never introduce new material late in a session.</p>
<h2 id="22-shortcuts-targeting-mimicry-and-modeling">2.2. Shortcuts: Targeting, Mimicry, and Modeling</h2>
<p>Targeting: teach an animal to touch its nose to a target. Then, you can elicit lots of other behavior by moving the target appropriately.</p>
<p>Mimicry: many learners, kids especially, will learn well by copying behavior. Dogs are bad at this; cats are quite good at copying other cats, and occasionally noncats. (You may be able to train a cat by training a dog in front of the cat.)</p>
<p>If you want to demonstrate a gross physical skill to someone, do it with your back turned to them; if you want to demo a fine right-handed skill to a left-handed person, do it facing them.</p>
<p>Modeling: put the subject, manually, through the desired action. This is actually not great alone; you need to add shaping as well – reinforce effort on the subject’s part. This way, you can shape the skill while fading away the modeling.</p>
<h2 id="23-special-subjects-eg-yourself">2.3. Special Subjects (e.g. yourself)</h2>
<p>“The single most useful device in self-reinforcement, I found, was record keeping […]. I needed to record performance in such a way that improvement could be seen at a glance. I used graphs. Thus my guilt over a lapse could be assuaged by looking at the graphs and seeing that, even so, I was doing much better now than I had been six months previously.”</p>
<p>Training by computer, especially, can work well, because the reinforcement the program gives actually works.</p>
<h2 id="24-shaping-without-words">2.4. Shaping Without Words</h2>
<p>In formal training, when the subject is a willing party to being shaped, you can give instruction with words, and then shape them. This is fine, and helps.</p>
<p>In informal situations… people resent being shaped. Particularly if you’re shaping away behaviors they currently endorse for whatever reason. (Anger, sadness, ranting, whatever.) Be careful to notice tiny improvements in behavior, and be careful not to talk about the reinforcement unless you actually have the subject’s permission. (Probably an interesting conversation here.) In particular, if you talk about it, you’re bribing them; they learn to take actions for promised rewards, instead of learning the impulse preverbally. Don’t brag about it later, either.</p>
<p>(Manipulating others to make them stronger is a great idea. Manipulating others to your own ends, at their expense, is evil. Seriously fucked-up, nasty nasty evil shit. Incidentally, if I learn anyone’s using what I’ve taught here to screw with someone else’s life in destructive ways, I will go make their life interesting, see if I don’t.)</p>
<h1 id="3-stimulus-control">3. Stimulus Control</h1>
<p>Training response to stimuli – like commands, or “triggers”.</p>
<p>To establish a cue, start with the behavior first. You can’t train the behavior if it’s not already happening. Once the behavior is trained, and on a variable schedule, you can start to reinforce the behavior only when it’s cued.</p>
<p>You can introduce the cue in many ways:</p>
<ul>
<li>Produce the cue just as the behavior is starting, and reinforce completing the behavior.</li>
<li>Alternate between cue and no cue; reinforce only the behavior that follows the cue.</li>
<li>Shape response to the cue as a behavior itself; then shape that behavior into what you wish to train.</li>
</ul>

<p>“Once your learner understands the rules, new cues can be attached to new behaviors practically instantly this way.”</p>
<h2 id="31-rules-of-stimulus-control">3.1. Rules of stimulus control</h2>
<p>Perfect stimulus control is defined by four (obvious) conditions; but each of these may need to be trained independently:</p>
<ul>
<li>The behavior always occurs immediately after the stimulus.</li>
<li>The behavior never occurs, in training or work, without the stimulus.</li>
<li>The behavior never occurs in response to a different stimulus.</li>
<li>No other behavior occurs in response to this stimulus.</li>
</ul>

<p>Pick signals that can be easily perceived.</p>
<p>You can train multiple signals for the same behavior, just not multiple behaviors for the same stimulus.</p>
<p>Nonprimary signals just need to have enough magnitude to be noticed; shouting isn’t even useful, unless your unpleasantness is an intentional aversive. Once a signal is trained, you can fade it until it’s barely perceptible, and still get the right response. (e.g. Clever Hans)</p>
<p>If you wish, you can shape speed of response. But be sure you’re training to a steady response-time criterion.</p>
<p>Behavior chains: a well-reinforced signal is an opportunity for reinforcement, so it becomes a desirable event itself – so you can reinforce a behavior by presenting the stimulus for another behavior. These are common – possibly, anything that you remember linearly is like this: songs, the alphabet, taking a shower, getting dressed… and, in fact, you can debug them as you would any other behavior chain.</p>
<p>In general, you should train behavior chains backwards; don’t start training behavior n-1 until behavior n is learned and on cue. (e.g. Teaching a dog to play frisbee)</p>
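<p>The backward-chaining rule can be sketched as a hypothetical training plan (the frisbee chain and the helper function are my own illustration, not from the book): the last behavior is trained first against a primary reinforcer, and the cue for each trained behavior then serves as the reinforcer for the behavior before it:</p>

```python
# Hypothetical sketch of a backward-chaining training plan. The idea: train
# the last behavior first (with a primary reinforcer); once behavior n is on
# cue, that cue itself reinforces behavior n-1, and so on back to the start.

def backward_chaining_plan(chain):
    plan = []
    for i in reversed(range(len(chain))):
        if i == len(chain) - 1:
            reinforcer = "primary reinforcer"
        else:
            reinforcer = f"cue for {chain[i + 1]!r}"
        plan.append((chain[i], reinforcer))
    return plan

# e.g. teaching a dog to retrieve a frisbee:
frisbee = ["chase frisbee", "pick up frisbee", "bring it back", "drop it in hand"]
for behavior, reinforcer in backward_chaining_plan(frisbee):
    print(f"train {behavior!r}, reinforced by {reinforcer}")
```

<p>Each step the subject completes leads straight into a well-known, already-reinforced behavior, which is why the chain holds together.</p>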
<h1 id="4-untraining">4. Untraining</h1>
<p>Eight methods to get rid of unwanted behavior (with four good fairies.)</p>
<ul>
<li>Shoot the dog</li>
<li>Punishment (works badly!)</li>
<li>Negative reinforcement</li>
<li>Extinction</li>
<li>Train incompatible behavior</li>
<li>Put the behavior on cue, and then never give the cue (!)</li>
<li>Shape the absence – train the action’s opposite</li>
<li>Change the motivation – if you can understand why the unwanted behavior is happening, you can remove its cause. (e.g. don’t shop with hungry, tired children)</li>
</ul>
<p>Note that some problems have multiple causes, and may require multiple solutions. One can use these techniques to alleviate physical and chemical addictions, but it’s tough; these techniques address the behavioral side of the problem.</p>
<p>(e.g., biting fingernails – Train yourself to notice that you’re about to bite your fingernails; train that as a cue to do something incompatible. I now trim them when I notice that, so long as a fingernail clipper is handy. Also, try to remove the stress that causes the fingernail biting in the first place.)</p>
<h1 id="5-reinforcement-in-the-real-world-6-clicker-training">5. Reinforcement in the Real World; 6. Clicker Training</h1>
<p>Lots of examples of the above principles. Good to read to check your understanding, but I think there are no new high-level ideas here.</p>
<h1 id="general-notes">General Notes</h1>
<h2 id="reinforcement-vs-reward-and-punishment">Reinforcement vs. reward and punishment</h2>
<p>The difference between “reinforcement” and the usual notion of “reward” and “punishment” is pretty stark. To increase the likelihood of an act, a reinforcer must happen as close in time to the act as you can get it. Rewards like year-end bonuses aren’t doing that; these may make goals more valuable, but they’re doing very little indeed to shape behavior on a subverbal level. Similarly, punishment that doesn’t stop immediately when the undesired behavior stops won’t shape behavior. Punishments outside of negative reinforcers will not yield predictable results.</p>
<p>Note: self-punishment is particularly useless; you train down the act of punishing yourself more than you train down whatever behavior you’re punishing. This is unpleasant and useless!</p>
<h2 id="reinforcement-trains-the-entire-context">Reinforcement trains the entire context!</h2>
<p>You don’t get to pick what in the current context you’re reinforcing. You’re reinforcing everything in the trainee’s context at the time. If you only remember to click up a desired behavior when the trainee is in your living room, you’ll train being in the living room. Whenever you’re giving positive reinforcement for anything, interactively, you’re also training up interacting with you. For a pet this may be desirable and adorable; in other cases, less so.</p>
<p>And, vice versa – if you use lots of negative reinforcement, but only when the trainee is near you, then aside from whatever specific behaviors you think you’re training, you’re training not being near you. If this isn’t what you want to train, then you might not want to do negative reinforcement like this…</p>
<p>This is probably part of “fallout”, forward-referenced to Chapter 4. Should have a look.</p>
<h2 id="the-training-game">The Training Game</h2>
<p>Two players, subject and trainer. (Expect a group to get tired after about six rounds.)</p>
<p>Send the subject out of the room. Select a trainer, choose a behavior, and bring the subject back. Instruct the subject to “be active”: move around, be energetic. Without talking, and using only one nonverbal reinforcer (a clicker, a whistle, clapping or snapping), get the subject to perform your action.</p>
<p>Instruct the subject to return to the entry, after the first few reinforcements.</p>
<p>Quote: “The subject gets to discover that in this form of learning, brains don’t help. It doesn’t matter what you are thinking about; if you just keep moving around, collecting whistle sounds, your body will find out what to do without “your” help. This is an absolutely excruciating experience for brilliant, intellectual people.” If you’re the subject, don’t analyze too much, just go with what seems vaguely indicated.</p>

<h1 id="a-week-of-no-facebook">A Week of No Facebook (2013-07-21)</h1>

<p>I’d noticed that, lately, I was spending most of my free time alone, bouncing back and forth between Twitter, Facebook, my email, and games on Steam. Steam can just eat all of my day, if I sit down unmotivated. I wasn’t doing anything awesome, unless I’d planned in advance to be away from the internet. So, I’m taking a vacation from the biggest, most pointless consumers of my attention.</p>
<p>Today is Day 2 of a week-long vacation from Facebook and Steam. (I’d intended to also ignore Twitter, but it was too easy to convince myself that I’d hurt nothing, checking Twitter updates for 3 minutes… insidious!)</p>
<p>Yesterday, I found lots of other random things to fill my attention. I watched every episode of Adventure Time and Gravity Falls. I reread all of <a href="http://www.gunnerkrigg.com/">Gunnerkrigg Court</a>. I read some fanfiction. Basically, I spent hours desperately avoiding anything I really endorsed doing. But this became obvious after a while, and I got myself to clean my apartment, and handle a bunch of organization for an upcoming project. Today, I’ve managed to buy groceries, and arranged next week’s Less Wrong meetup, and reached out to the agents of a dozen houses that I and my friends might rent, and spent some time learning <a href="http://d3js.org/">D3</a>, and played my violin some. I never play my violin, and it’s really nice to do so! I’m really happy with the amount of stuff I’ve done today, and some of it’s fun, creative stuff. I feel better, more powerful, more capable than I usually do after a day by myself.</p>
<p>And I’m sure that, even if I stayed off of Facebook forever, and I never played another video game, I’d just find new ways to procrastinate. Or, perhaps, this is just what it feels like to desire something, without enjoying or endorsing it.</p>
<p>Right now, it’s unusually vivid that debugging my desires is a target-rich environment.</p>

<h1 id="just-so-stories">Just-So Stories (2013-07-08)</h1>

<p>Just-so stories are awfully fun to weave. They’re all the better for being maximally improbable but internally consistent. Instant fiction, but with less “story” and more “explanation”. Easier by a wide margin for me to generate off the cuff.</p>
<hr />
<p>“Why is this teabag a pair of squares, when a pair of circles could have the same volume but with less material?”</p>
<p>“Well, you have to understand that tea-brewing is, in many places, a cultural touchstone with intensely traditional roots. If this is high-quality tea — or wants to look like high quality tea — then it must take that traditional form.</p>
<p>“But why is the traditional tea bag square?”</p>
<p>“Simple! The four corners of the tea bag were, originally, intended to represent the four humors, to be respectively calmed and invigorated by the beverage itself. A circle would represent…” (ominously) “too many humors.”</p>
<hr />
<p>Ok, that’s not very good. Maybe it would be more amusing in person? Certainly, if people know it’s a game, and are interested in playing together, then it’d be fun to build ridiculous towers of faulty explanation.</p>
<p>(Inspired by a <a href="https://www.facebook.com/esynclairs/posts/10200109480074925">facebook post</a> from, oh, two months ago.)</p>

<h1 id="record-linkage">Record Linkage (2013-06-11)</h1>

<p>From <a href="http://en.wikipedia.org/wiki/Record_linkage">Wikipedia</a>:</p>
<blockquote>
<p>“Record linkage” is the term used by statisticians, epidemiologists, and historians, among others, to describe the process of joining records from one data source with another that describe the same entity. Commercial mail and database applications refer to it as “merge/purge processing” or “list washing”. Computer scientists often refer to it as “data matching” or as the “object identity problem”. […] This profusion of terminology has led to few cross-references between these research communities.</p>
</blockquote>

<h1 id="quality-of-expertise">Quality of Expertise (2012-11-01)</h1>

<p>In some fields, experts are unimaginably better than novices. Master chess players, for instance, are demonstrably better chess players than beginners. A chess master can look at a board position and immediately spot useful moves.</p>
<p>In other fields, people with experience <em>think</em> they can make expert judgments about their subject, are <em>confident</em> in those judgments, and are correct about as often as random selection.</p>
<p>Why do some types of expertise have solid evidence, while others seem fundamentally broken?</p>
<p>Much of this follows the structure of Kahneman and Klein, 2009 <sup id="fnref:conditions"><a href="#fn:conditions" class="footnote">1</a></sup>. Lots more of it follows from the details in <sup id="fnref:handbook"><a href="#fn:handbook" class="footnote">2</a></sup>.</p>
<h1 id="a-model-of-reliable-intuition">A Model of Reliable Intuition</h1>
<p>According to current models, the skilled intuition of reliable experts is precisely recognition of learned patterns. This intuition may seem mysterious to someone observing it, and possibly even to someone experiencing it. This is because the relevant pattern recognition is frequently subverbal – the expert often cannot explain the cue for that intuition.</p>
<p>So: expertise is domain specific.</p>
<p>Note that reliable, expert intuition about a situation can only arise if the situation yields valid cues, and it is possible for a would-be expert to learn those cues. Thus, skilled intuition relies on sufficiently regular environments.</p>
<p>Predicting the long-term behavior of humans or human systems is <em>not</em>
“sufficiently regular”!</p>
<p>Steps of attaining expertise:</p>
<ul>
<li>cognitive stage (system 2 action and learning)</li>
<li>associative stage (system 1 learning)</li>
<li>autonomous stage (primarily system 1 action)</li>
</ul>
<p>Thus experts can think more quickly, in parallel, via associative thought.</p>
<h1 id="ways-intuition-can-be-wrong">Ways Intuition Can Be Wrong</h1>
<p>Several biases we’ve learned: insufficient reflection, anchoring, substitution. These only happen in the “absence of skill”; a learned response to a particular task can override any of these biases. (citation?) But there’s no easy way to tell when you’re applying a learned skill, versus using a general and inappropriate heuristic.</p>
<p>“Subjective confidence is often determined by the internal consistency of the information on which a judgment is based, rather than by the quality of that information.” That is, the conjunction fallacy has real intuitive power.</p>
<p>In many cases, models of people’s judgments outperform those people. Statistically, this can be explained entirely by inconsistent judgment in those people. Across many tested tasks, assuming linear outcomes but being consistent (like a “simple” linear model) is more advantageous than letting a human be inconsistent but nonlinear (Karelaia and Hogarth, 2008). Moreover, humans are actually bad at learning nonlinear situations.</p>
<p>Confidence is a poor indicator of reliability!</p>
<h1 id="how-to-tell-the-difference">How to Tell the Difference</h1>
<p>Short of an in-depth study analyzing the outside view of expert opinion, the best way to tell the difference between reliable and unreliable expertise is to assess how tightly the expert’s intuition has been bound to results:</p>
<ul>
<li>The situation must yield reliable cues</li>
<li>The expert must have prolonged, deliberate practice with the system</li>
<li>That practice must have involved rapid, unequivocal feedback</li>
<li>The cues and situation must not shift.</li>
</ul>
<p>“If an environment provides valid cues and good feedback, skill and expert intuition will eventually develop in individuals of sufficient talent.” <sup id="fnref:conditions:1"><a href="#fn:conditions" class="footnote">1</a></sup></p>
<p>“We describe task environments as “high-validity” if there are stable relationships between objectively identifiable cues and subsequent events or between cues and the outcomes of possible actions.”</p>
<p>In low-validity environments, simple statistical approaches frequently outperform experts, by identifying and steadily using weakly-valid cues, instead of yielding the noisy outputs characteristic of human experts in such fields (Karelaia and Hogarth, 2008).</p>
<p>Highly predictable environments yield higher “linear consistency” of human judgment (Karelaia and Hogarth, 2008).</p>
<h1 id="how-to-become-an-expert">How to Become an Expert</h1>
<p>(This section is entirely about expert <em>performance</em>, rather than just expert <em>judgment</em>. That might be part of the difference between these fields.)</p>
<p>Expertise requires “deliberate, well-structured practice”.<sup id="fnref:handbook:1"><a href="#fn:handbook" class="footnote">2</a></sup></p>
<p>Deliberate practice requires:</p>
<ul>
<li>Lots of time. (on the order of 10,000 hours, in many studies)</li>
<li>Setting specific goals beyond one’s current level of performance, but which can be mastered within hours.</li>
<li>Appropriate, immediate, objective feedback.</li>
<li>Becoming able to identify errors; continuously identifying and eliminating errors.</li>
</ul>
<p>Deliberate practice requires steady concentration! It does not become automatic – rather, it is a way to <em>prevent</em> practiced actions from becoming automatic. The need for concentration limits deliberate practice to four to five hours a day; and at most an hour or so without rest.</p>
<p>Maintaining expertise requires deliberate practice; without well-designed practice, lots of practice does not yield expertise.</p>
<h1 id="for-further-thought">For Further Thought</h1>
<ul>
<li>In what domains should we expect expert judgment to be correct?</li>
<li>In what domains should we expect expert judgment to fail? Could these judgments be systematically improved?</li>
<li>What might work when some of the conditions for deliberate practice can’t be met?</li>
<li>For whatever we’re trying to improve, how can we better structure our practice?</li>
<li>In particular, what does practice towards expertise in rationality look like?</li>
<li><a href="http://lesswrong.com/lw/4ba/some_heuristics_for_evaluating_the_soundness_of/">This is relevant</a>!</li>
</ul>
<div class="footnotes">
<ol>
<li id="fn:conditions">
<p>Daniel Kahneman and Gary Klein. 2009.
<a href="/stuff/conditions-for-intuitive-expertise.pdf"><em>Conditions for Intuitive Expertise: A Failure to Disagree</em></a>. <a href="#fnref:conditions" class="reversefootnote">↩</a></p>
</li>
<li id="fn:handbook">
<p>Ericsson, Charness, Hoffman, and Feltovich. 2006.
<em>The Cambridge Handbook of Expertise and Expert Performance</em>. <a href="#fnref:handbook" class="reversefootnote">↩</a> <a href="#fnref:handbook:1" class="reversefootnote">↩<sup>2</sup></a></p>
</li>
<li id="fn:lens">
<p>Karelaia and Hogarth. 2008.
<a href="http://www.insead.edu/facultyresearch/faculty/personal/nkarelaia/research/documents/3KarelaiaDeterminantsoflinearjudgment.pdf"><em>Determinants of Linear Judgment: A Meta-Analysis of Lens Model Studies</em></a>.</p>
</li>
</ol>
</div>fiddlemathIn some fields, experts are unimaginably better than novices. Master chess players, for instance, are demonstrably better chess players than beginners. A chess master can look at a board position and immediately spot useful moves.
In other fields, people with experience in some field think they can make expert judgments about that subject, are confident in those judgments, and are correct about as often as random selection.
Why do some types of expertise have solid evidence, while others seem fundamentally broken?Prospect Theory2012-11-01T00:00:00+00:002012-11-01T00:00:00+00:00http://fiddlemath.net/2012/11/01/prospect-theory<p>This was a bunch of lecture notes I made before a quick talk explaining the basics of prospect theory to the Madison Less Wrong meetup. I haven’t even tried to make this readable, yet.</p>
<h1 id="the-allais-paradox">The Allais Paradox</h1>
<p>From WP:</p>
<p>Gamble 1:</p>
<ul>
<li>Option A: Gain $1M.</li>
<li>Option B: Gain $1M at 89%, or gain $5M at 10% (and nothing at the remaining 1%).</li>
</ul>
<p>Gamble 2:</p>
<ul>
<li>Option A: Gain $1M at 11%.</li>
<li>Option B: Gain $5M at 10%.</li>
</ul>
<p>Alternately, from LW:</p>
<p>Gamble 1:</p>
<ul>
<li>Option A: $24K at 100%</li>
<li>Option B: $27K at 33/34</li>
</ul>
<p>Gamble 2:</p>
<ul>
<li>Option A: $24K at 34%</li>
<li>Option B: $27K at 33%</li>
</ul>
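<p>A quick expected-value check makes the paradox concrete. The sketch below (my own illustration, in plain Python, using the WP version of the numbers with 1B’s implicit 1% “nothing” branch spelled out) shows why the usual choice pattern – 1A over 1B, but 2B over 2A – is inconsistent: gambles 2A and 2B are just 1A and 1B with a common 89% chance of $1M removed, so an expected-utility maximizer must choose A in both or B in both.</p>

```python
def expected_value(prospect):
    """Expected value of a prospect given as (probability, outcome) pairs."""
    return sum(p * x for p, x in prospect)

# Amounts in millions of dollars.
g1a = [(1.00, 1)]                        # $1M for certain
g1b = [(0.89, 1), (0.10, 5), (0.01, 0)]  # the 1% "nothing" branch made explicit
g2a = [(0.11, 1)]                        # 1A minus a common 89% chance of $1M
g2b = [(0.10, 5), (0.01, 0)]             # 1B minus the same common branch

print(expected_value(g1a), expected_value(g1b))  # 1.0 1.39
print(expected_value(g2a), expected_value(g2b))  # 0.11 0.5
```

<p>Note that the EV gap between the B and A options is identical in both gambles (0.39), precisely because the two gambles differ only by the shared branch.</p>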
<h1 id="the-endowment-effect">The Endowment Effect:</h1>
<p>“Pure tokens”, redeemable for amounts between $10 and $20 at the experiment’s end: markets in these worked as theory predicts.</p>
<p>Mugs: some of the group (Sellers) were given a nice mug (worth about $6); Buyers had to use their own money to buy mugs if they wanted them. The average selling price was about double the average buying price. Later, a third group (Choosers) could take either a mug or an amount of money, and named the amount at which they were indifferent.</p>
<p>Averages: Sellers $7.12, Choosers $3.12, Buyers $2.87.</p>
<h1 id="prospect-theory-evaluation">Prospect Theory: Evaluation</h1>
<p>The value function maps gains and losses to subjective value; the loss side is about twice as steep as the gain side (typical loss-aversion ratios range from 1.5 to 2.5).</p>
<p>The decision-weight function maps stated probabilities to weights along a roughly inverse-S (almost-logistic) curve; it usually crosses the line w(p) = p somewhere between p = .2 and p = .6.</p>
<p>The exact curves vary from person to person!</p>
<h1 id="prospect-theory-editing">Prospect Theory: Editing</h1>
<p>The full “prospect theory” actually has a few more moving parts. Decisions, it says, happen in two phases: “editing” and “evaluation”.</p>
<p>From <a href="http://www.econport.org/econport/request?page=man_ru_advanced_prospect_edev">here</a>:</p>
<p>Editing:</p>
<ul>
<li>Coding: outcomes become gains or losses</li>
<li>Combination: simplify prospects by combining probabilities with identical outcomes</li>
<li>Segregation: riskless components of prospects are separated from risky components. (300 @ p | 200 @ (1-p)) becomes 200 + (100 @ p).</li>
<li>Cancellation: discard common outcome-probability pairs between choices.</li>
<li>Simplification: prospects are likely to be rounded off; very unlikely outcomes are discarded.</li>
<li>Detecting dominance: prospects are scanned, and strictly dominated options are rejected without further evaluation.</li>
</ul>
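<p>Some of these editing operations are mechanical enough to sketch in code. This is my own illustration, not anything from Kahneman and Tversky; prospects are lists of (probability, outcome) pairs:</p>

```python
def combine(prospect):
    """Combination: merge the probabilities of identical outcomes."""
    merged = {}
    for p, x in prospect:
        merged[x] = merged.get(x, 0.0) + p
    return [(p, x) for x, p in merged.items()]

def segregate(prospect):
    """Segregation: split out the riskless component of an all-gains prospect.

    (300 @ p | 200 @ (1-p)) becomes 200 + (100 @ p)."""
    base = min(x for _, x in prospect)  # the guaranteed minimum gain
    return base, [(p, x - base) for p, x in prospect]

# (300 @ 0.5 | 200 @ 0.5) edits to a sure 200 plus (100 @ 0.5):
base, risky = segregate([(0.5, 300), (0.5, 200)])
print(base, risky)  # 200 [(0.5, 100), (0.5, 0)]
```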
<p>Evaluation:
The <em>edited</em> prospects are then evaluated by summing, over each outcome, the product of the probability’s decision weight and the outcome’s value.</p>
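<p>To make the evaluation step concrete, here is a sketch using the functional forms and median parameter estimates from Tversky and Kahneman’s 1992 paper (α ≈ 0.88, λ ≈ 2.25, γ ≈ 0.61) as stand-ins for the person-specific curves described above; this is the simple separable form, not the cumulative variant:</p>

```python
def value(x, alpha=0.88, lam=2.25):
    """Value function: concave for gains, steeper (loss-averse) for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def weight(p, gamma=0.61):
    """Decision weights: an inverse-S curve that overweights small
    probabilities and underweights large ones."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def evaluate(prospect):
    """Evaluate an (already edited) prospect of (probability, outcome)
    pairs by summing decision-weighted values."""
    return sum(weight(p) * value(x) for p, x in prospect)

# A symmetric coin flip looks bad: losses loom larger than gains.
print(evaluate([(0.5, 100), (0.5, -100)]) < 0)           # True
# An equal-EV long shot beats a sure $5: tiny probabilities are overweighted.
print(evaluate([(0.001, 5000)]) > evaluate([(1.0, 5)]))  # True
```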
<p>“Cumulative prospect theory” is described online, but I’m not going to explain it here.</p>
<h1 id="mental-accounting">Mental Accounting</h1>
<p>The frames by which we ascribe “gains” and “losses” are complicated. You <em>can</em> fiddle with these, but the defaults can be hard to see.</p>
<p>(Richard Thaler, “Mental Accounting”)</p>
<h1 id="what-do-we-do-about-this">What Do We Do About This?</h1>
<ul>
<li>You will regret negative outcomes less than you expect. Do not try too hard to minimize regret.</li>
<li>Avoid “What You See Is All There Is”, by evaluating relevant, similar scenarios. In particular:
<ul>
<li>Make a story where losses are neutral but gains are more positive, or vice versa, to balance risk aversion and the endowment effect.</li>
<li>Imagine buying or selling your options to someone else. How would you value them?</li>
<li>Make a story where your decision is reversed. “Would you take 1 year’s extra labor to have your current, worse system?”</li>
<li>Don’t play stopping problems; comparison shop.</li>
</ul>
</li>
<li>Find better frames. (Gallons-per-mile vs. miles-per-gallon, say. Example: Adam goes from 12 to 14 mpg; Beth goes from 30 to 40 mpg. Over the same distance, who saves more gas? Note that, starting next year, new cars will have gallons-per-100-miles info on their stickers.)</li>
<li>Shift goalposts: goals are reference points, and we can view losses and gains in terms of the goals we’ve already set.</li>
</ul>fiddlemathThis was a bunch of lecture notes I made before a quick talk explaining the basics of prospect theory to the Madison Less Wrong meetup. I haven’t even tried to make this readable, yet.People Are Irrational, and that’s OK2012-11-01T00:00:00+00:002012-11-01T00:00:00+00:00http://fiddlemath.net/2012/11/01/people-are-irrational<h1 id="26-jan-2005">26 Jan 2005</h1>
<p>People are irrational, and that’s OK.</p>
<p>For years, I thought this: “People aren’t rational beings, but I should act towards them as if they are. If I don’t, I’ll be treating them as if less than human.”</p>
<p>This is wrong. Certainly, honesty and clarity are important values in communication. Unfortunately, people have emotions, and are likely to misinterpret or disregard simply-written communication.</p>
<p>People aren’t rational. This is obvious on reflection. The thrust of much cognitive science in the past few decades is that people are not only irrational, but irrational in predictable ways. As such, it may be incorrect, or even immoral, to pretend that they are rational and act accordingly.</p>
<p>Moreover, people don’t expect rational behavior from other people. As such, there’s little or no chance of offending a random stranger by treating them as if they were irrational. What is offensive, rather, is to be blatantly patronizing.</p>
<p>The central lesson, here, for me: I can’t assume that people have psyches similar to mine, because they probably don’t. People are strange, irrational, subject to credulity, and likely delusional. If they’re different enough from me, then treating them as rational people is likely to be misinterpreted, and certain to be misunderstood.</p>
<h1 id="21-dec-2011">21 Dec 2011</h1>
<p>When I first wrote this, in 2005, I was trying to understand how to interact with a few people who were largely led by (to me) wildly emotional mood swings, and sometimes paranoia. I’d have a hard time talking with them - often leaving them crying or angry. To them, I was grossly insensitive to their feelings; to me, they had no control of themselves. They assumed, deep down, that a person’s behavior is primarily emotional; I assumed, deep down, that a person’s behavior is primarily rational.</p>
<p>I still think they were vastly overemotional people, but on the key determinants of human behavior, they were clearly less confused than I.</p>
<h1 id="1-nov-2012">1 Nov 2012</h1>
<p>I can summarize better:</p>
<ul>
<li>When I was young, I assumed that everyone was “sane”: usually reasonable, rational, and in control of themselves.</li>
<li>Shortly thereafter, I realized that everyone was mad. Mad! Ruled by emotion! Not rational at all! Obviously, I thought, it would be rude to point this out, or act as if it were true.</li>
<li>Quite a while later, I finally realized that my standards for madness were shared by almost no one; that emotion, not reason, is the main determinant of most action.</li>
<li>In the past few years, I think I’ve finally internalized that my own actions are primarily determined by my emotions, as well, and that pretending otherwise just means I have a poor mental model of myself.</li>
</ul>
<p>This last point has important consequences. If I want to get me to act, an early step is to get me to feel the right way about it. I can do this far better if I’m doing it deliberately, instead of by chance. I can’t do it deliberately at all if I pretend that I am a lucid model of crystal reason.</p>
<p>This is now painfully clear to me. Really, I’m embarrassed to admit that I once really thought differently, and have considered removing this page. But if anyone ever reaches these conclusions faster because they’ve seen this page, then the minor cost of my embarrassment is well spent.</p>fiddlemathGetting Started With Linux2012-04-21T00:00:00+00:002012-04-21T00:00:00+00:00http://fiddlemath.net/2012/04/21/intro-linux<p>So, you have a Linux system, and you want to learn how to use it well. (If you don’t already have Linux, you can <a href="http://www.ubuntu.com/download">get it</a> pretty easily. It’s free, and installation nowadays is pretty painless.) If you’re just starting out with Linux, you <em>can</em> learn how to wield the system by trial-and-error, and googling for specific commands when you recognize the need. But this requires more time and patience than is really necessary.</p>
<p>The Linux Documentation Project has a pretty decent <a href="http://tldp.org/LDP/intro-linux/html/index.html">Linux user’s guide</a>, with the caveat that some of the more specialized programs it recommends are now supplanted by better-to-use, more robust programs. (Ignore everything it says about CD recording, for instance.)</p>
<p>Actually try the exercises. Try even those that seem trivially easy; it’s always nice to get feedback telling you that you know what you’re doing. Moreover, whenever you feel like you’re getting full of uninternalized detail – and there is, I admit, a lot of detail – stop reading and fiddle with an actual computer instead.</p>
<p>If you get stuck:</p>
<ol>
<li>Use the built-in documentation to look for clues. Relevant commands
are <code>man</code>, <code>apropos</code>, and <code>info</code>.</li>
<li>Use The Google. Sometimes, the literal error message you’re seeing will be
exactly the right search term.</li>
<li>Use reductionism. In particular; see if you can focus your
confusion onto a very small, confusing thing. (This can be difficult,
but often solves the problem itself.)</li>
<li>Ask someone. In particular, <a href="http://askubuntu.com/">ask ubuntu</a> and <a href="http://superuser.com/">superuser</a> are probably the best places to get help quickly. Doubly so, if you’ve already done (3), and can therefore explain your problem succinctly and precisely.</li>
<li>If, after all this, you <em>still</em> can’t get the system to do what you want, you can try to get in-person help from a local expert, perhaps from a nearby <a href="http://en.wikipedia.org/wiki/Linux_user_group">Linux User Group</a>. Volunteerism is a big part of the Linux ethos; so you can probably find someone willing to help you solve tricky problems – especially if you’ve clearly tried the above, other ways to solve the problem.</li>
</ol>fiddlemathHow to learn to use the thing, for novicesImproving Arguments2012-03-01T00:00:00+00:002012-03-01T00:00:00+00:00http://fiddlemath.net/2012/03/01/improve-arguments<p>We probably underestimate the value of improving our arguments, and are overconfident in apparently-solid logical arguments. What can you do to improve a complex argument?</p>
<p>If an argument contains 20 inferences in sequence, and you’re wrong about such inferences 5% of the time without noticing the misstep, then you have about a 64% chance of being wrong somewhere in the argument. If you can reduce your chance of mistakes to 1% per inference, then you only have an 18% chance of being wrong, somewhere. Improving the reliability of the steps in your arguments, then, has a high value-of-information – even though 1% and 5% both feel like similar amounts of uncertainty.</p>
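<p>The arithmetic is easy to check (a two-line sketch; the 20-step chain and the 5% and 1% per-step error rates are the figures from the paragraph above):</p>

```python
def p_flawed(n_steps, p_error):
    """Chance that a chain of n_steps inferences contains at least one error,
    if each step is independently wrong with probability p_error."""
    return 1 - (1 - p_error) ** n_steps

print(round(p_flawed(20, 0.05), 2))  # 0.64
print(round(p_flawed(20, 0.01), 2))  # 0.18
```

<p>Note how a modest-feeling per-step improvement, from 5% to 1%, more than doubles the chance that the whole chain is sound.</p>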
<p>So, if being wrong about an argument is highly costly – if you would stand to lose by believing incorrectly, or win by believing correctly – then it is <em>well worth</em> spending some real effort to ensure that long arguments are correct, before you act on them. This is even true if that argument appears to be very solid, and hangs together tightly.</p>
<p>Writing an argument in detail is a good way to improve the likelihood that your argument isn’t somewhere flawed. Consider:</p>
<ul>
<li>Writing allows reduction. By pinning the argument to paper, you can separate each logical step, and make sure that each step makes sense in isolation.</li>
<li>Writing gives the argument stability. For example, the argument won’t secretly change when you think about it while you’re in a different mood. This can help to prevent you from implicitly proving different points of your argument from contradictory claims.</li>
<li>Writing makes your argument vastly easier to share. As in open-source software: given enough eyeballs, all bugs are shallow.</li>
</ul>
<p>So, if you can spot non-sequiturs in your writing, and you put a lot of weight on the conclusion the argument is pointing at, it’s a <strong>really good idea</strong> to take the time to fill in all the sequiturs.</p>
<p>Adapted from <a href="http://lesswrong.com/r/discussion/lw/a25/what_happens_when_your_beliefs_fully_propagate/5vk0">this comment</a> at <a href="http://lesswrong.com">LessWrong</a>.</p>fiddlemathWe probably underestimate the value of improving our arguments, and are overconfident in apparently-solid logical arguments. What can you do to improve a complex argument?Extra Markdown Commands for Emacs2012-02-11T00:00:00+00:002012-02-11T00:00:00+00:00http://fiddlemath.net/2012/02/11/emacs-things<p>I spend a lot of my time in emacs. But I’ve only recently really started to fiddle with things in emacs lisp. Here’s some twiddling.</p>
<p>I really like Markdown. I write practically everything in Markdown; it’s a wonderful way to just barely organize what should be mostly plain text. I even like the way that plaintext files look in Markdown. Writing in Markdown is like swinging a well-balanced hammer; it’s easy enough to use that it just seems obvious, rather than designed. (This site is just a thin wrapper around a bunch of Markdown files, for instance…)</p>
<p>Anyway, I also rather like the emacs Markdown Mode, but its title and subsection headings don’t expand to underline the <em>entire line</em> of current text. This bothered me, so I did this:</p>
<pre><code>(defun underline (c)
  "Underline the current line, repeating the string C to the line's width."
  (save-excursion
    (move-end-of-line 1)
    (let ((w (current-column)))
      (newline)
      ;; Build a string of W copies of C and insert it.
      (insert (apply 'concat (make-list w c))))))

(defun markdown-under-title ()
  "Underline the current line with = (a setext-style Markdown title)."
  (interactive)
  (underline "="))

(defun markdown-under-section ()
  "Underline the current line with - (a setext-style Markdown section)."
  (interactive)
  (underline "-"))

(add-hook 'markdown-mode-hook
          (lambda ()
            (local-set-key "\C-c\C-tt" 'markdown-under-title)
            (local-set-key "\C-c\C-ts" 'markdown-under-section)))
</code></pre>
<p>You could argue that I’m being needlessly fiddly. You’d probably be right.</p>fiddlemathI spend a lot of my time in emacs. But I’ve only recently really started to fiddle with things in emacs lisp. Here’s some twiddling.