Can I put all my rules in one big Redofile like make does?

One of my favourite features of redo is that it doesn't add any new syntax; the syntax of redo is exactly the syntax of sh... because sh is the program interpreting your .do file.

Also, it's surprisingly useful to have each build script in its own file; that way, you can declare a dependency on just that one build script instead of the entire Makefile, and you won't have to rebuild everything just because of a one-line Makefile change. (Some build tools avoid that same problem by tracking which variables and commands were used to do the build. But that's more complex, more error prone, and slower.)

See djb's Target files depend on build scripts article for more information.

However, if you really want to, you can simply create a default.do that looks something like this:

case $1 in
    *.o) ...compile a .o file... ;;
    myprog)  ...link a program... ;;
    *) echo "no rule to build '$1'" >&2; exit 1 ;;
esac

Basically, default.do is the equivalent of a central Makefile in make. As of recent versions of redo, you can use either a single toplevel default.do (which catches requests for files anywhere in the project that don't have their own .do files) or one per directory, or any combination of the above. And you can put some of your targets in default.do and some of them in their own files. Lay it out in whatever way makes sense to you.
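
For example, here's a minimal but runnable default.do sketch for a hypothetical one-program project (the names main.o, util.o, and myprog are invented for illustration):

case $1 in
    *.o)
        # build foo.o from the matching foo.c
        redo-ifchange ${1%.o}.c
        gcc -o $3 -c ${1%.o}.c
        ;;
    myprog)
        redo-ifchange main.o util.o
        gcc -o $3 main.o util.o
        ;;
    *)
        echo "no rule to build '$1'" >&2
        exit 1
        ;;
esac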

One more thing: if you put all your build rules in a single default.do, you'll soon discover that changing anything in that default.do will cause all your targets to be rebuilt - because their .do file has changed. This is technically correct, but you might find it annoying. To work around it, try making your default.do look like this:

. ./default.od

And then put the above case statement in default.od instead. Since you didn't redo-ifchange default.od, changes to default.od won't cause everything to rebuild.

What are the parameters ($1, $2, $3) to a .do file?

NOTE: These definitions have changed since the earliest (pre-0.10) versions of redo. The new definitions match what djb's original redo implementation did.

$1 is the name of the target file.

$2 is the basename of the target, minus the extension, if any.

$3 is the name of a temporary file that will be renamed to the target filename atomically if your .do file returns a zero (success) exit code.

In a file called chicken.a.b.c.do that builds a file called chicken.a.b.c, $1 and $2 are chicken.a.b.c, and $3 is a temporary name like chicken.a.b.c.tmp. You might have expected $2 to be just chicken, but that's not possible, because redo doesn't know which portion of the filename is the "extension." Is it .c, .b.c, or .a.b.c?

.do files starting with default. are special; they can build any target ending with the given extension. So let's say we have a file named default.c.do building a file called chicken.a.b.c. $1 is chicken.a.b.c, $2 is chicken.a.b, and $3 is a temporary name like chicken.a.b.c.tmp.

You should use $1 and $2 only in constructing input filenames and dependencies; never modify the file named by $1 in your script. Only ever write to the file named by $3. That way redo can guarantee proper dependency management and atomicity. (For convenience, you can write to stdout instead of $3 if you want.)

For example, you could compile a .c file into a .o file like this, from a script named default.o.do:

redo-ifchange $2.c
gcc -o $3 -c $2.c

Why not $FILE, $BASE, $OUT instead of $1, $2, $3?

That sounds tempting and easy, but one downside would be lack of backward compatibility with djb's original redo design.

Longer names aren't necessarily better. Learning the meanings of the three numbers doesn't take long, and over time, those extra few keystrokes can add up. And remember that Makefiles and Perl have had strange one-character variable names for a long time. It's not at all clear that removing them is an improvement.

What happens to stdin/stdout/stderr?

As with make, stdin is not redirected. You're probably better off not using it, though, because especially with parallel builds, it might not do anything useful. We might change this behaviour someday since it's such a terrible idea for .do scripts to read from stdin.

As with make, stderr is also not redirected. You can use it to print status messages as your build proceeds. (Eventually, we might want to capture stderr so it's easier to look at the results of parallel builds, but this is tricky to do in a user-friendly way.)

Redo treats stdout specially: it redirects it to point at $3 (see previous question). That is, if your .do file writes to stdout, then the data it writes ends up in the output file. Thus, a really simple chicken.do file that contains only this:

echo hello world

will correctly, and atomically, generate an output file named chicken only if the echo command succeeds.
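
To try it out (assuming that chicken.do sits in the current directory):

redo chicken
cat chicken    # the file "chicken" now contains: hello world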

Isn't it confusing to capture stdout by default?

Yes, it is. It's unlike what almost any other program does, especially make, and it's very easy to make a mistake. For example, if you write in your script:

echo "Hello world"

it will go to the target file rather than to the screen.

A more common mistake is to accidentally run a program that writes to stdout as it runs. When you do that, you'll produce your target on $3, but it might be intermingled with junk that was written to stdout. redo is pretty good about catching this mistake, and it'll print a message like this:

redo  zot.do wrote to stdout *and* created $3.
redo  ...you should write status messages to stderr, not stdout.
redo  zot: exit code 207

Despite the disadvantages, though, automatically capturing stdout does make certain kinds of .do scripts really elegant. The "simplest possible .do file" can be very short. For example, here's one that produces a sub-list from a list:

redo-ifchange filelist
grep ^src/ filelist

redo's simplicity is an attempt to capture the "Zen of Unix," which has a lot to do with concepts like pipelines and stdout. Why should every program have to implement its own -o (output filename) option when the shell already has a redirection operator? Maybe if redo gets more popular, more programs in the world will be able to be even simpler than they are today.

By the way, if you're running some programs that might misbehave and write garbage to stdout instead of stderr (Informational/status messages always belong on stderr, not stdout! Fix your programs!), then just add this line to the top of your .do script:

exec >&2

That will redirect your stdout to stderr, so it works more like you expect.
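
For example, here's a sketch of a .do script that might run noisy commands (the input file words.txt is invented for illustration). After exec >&2, any stray stdout output goes to stderr, so the only thing that lands in the target is what you explicitly write to $3:

exec >&2
redo-ifchange words.txt
sort words.txt >$3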

Should I run redo-ifchange in a loop?

The obvious way to write a list of dependencies might be something like this:

for d in *.c; do
    redo-ifchange ${d%.c}.o
done

But it turns out that's far from optimal. First, it forces all your dependencies to be built in order (redo-ifchange doesn't return until it has finished building), which makes -j parallelism much less useful. Second, it forks and execs redo-ifchange over and over, which wastes CPU time unnecessarily.

A better way is something like this:

for d in *.c; do
    echo ${d%.c}.o
done |
xargs redo-ifchange

That only runs redo-ifchange once (or maybe a few times, if there are really a lot of dependencies and xargs has to split it up), which saves fork/exec time and allows for parallelism.

If a target is identical after rebuilding, how do I prevent dependents from being rebuilt?

For example, running ./configure creates a bunch of files including config.h, and config.h might or might not change from one run to the next. We don't want to rebuild everything that depends on config.h if config.h is identical.

With make, which makes build decisions based on timestamps, you would simply have the ./configure script write to config.h.new, then only overwrite config.h with that if the two files are different. However, that's a bit tedious.
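
In sketch form, that workaround looks like this (assuming you've arranged for ./configure to write config.h.new instead of config.h):

if cmp -s config.h.new config.h; then
    rm config.h.new    # identical: leave config.h's old timestamp alone
else
    mv config.h.new config.h
fi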

With redo, there's an easier way. You can have a config.do script that looks like this:

redo-ifchange autogen.sh *.ac
./autogen.sh
./configure
cat config.h configure Makefile | redo-stamp

Now any of your other .do files can depend on a target called config. config gets rebuilt automatically if any of your autoconf input files are changed (or if someone does redo config to force it). But because of the call to redo-stamp, config is only considered to have changed if the contents of config.h, configure, or Makefile are different than they were before.

(Note that you might actually want to break this .do up into a few phases: for example, one that runs aclocal, one that runs autoconf, and one that runs ./configure. That way your build can always do the minimum amount of work necessary.)

Why does 'redo target' redo even unchanged targets?

When you run make target, make first checks the dependencies of target; if they've changed, then it rebuilds target. Otherwise it does nothing.

redo is a little different. It splits the build into two steps. redo target is the second step; if you run that at the command line, it just runs the .do file, whether it needs it or not.

If you really want to only rebuild targets that have changed, you can run redo-ifchange target instead.

The reasons I like this arrangement come down to semantics:

  • "make target" implies that if target exists, you're done; conversely, "redo target" in English implies you really want to redo it, not just sit around.

  • If this weren't the rule, redo and redo-ifchange would mean the same thing, which seems rather confusing.

  • If redo could refuse to run a .do script, you would have no easy one-line way to force a particular target to be rebuilt. You'd have to remove the target and then redo it, which is more typing. On the other hand, nobody actually types "redo foo.o" if they honestly think foo.o doesn't need rebuilding.

  • For "contentless" targets like "test" or "clean", it would be extremely confusing if they refused to run just because they ran successfully last time.
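
As an example of that last point, a contentless clean.do can be as simple as this (the target names are invented for illustration). It writes nothing to stdout or $3, so no output file is created, and it runs every time you ask:

rm -f *.o myprog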

In make, things get complicated because it doesn't differentiate between these two modes. Makefile rules with no dependencies run every time, unless the target exists, in which case they run never, unless the target is marked ".PHONY", in which case they run every time. But targets that do have dependencies follow totally different rules. And all this is needed because there's no way to tell make, "Listen, I just really want you to run the rules for this target right now."

With redo, the semantics are really simple to explain. If your brain has already been fried by make, you might be surprised by it at first, but once you get used to it, it's really much nicer this way.

Can I write .do files in my favourite language, not sh?

Yes. If the first line of your .do file starts with the magic "#!/" sequence (e.g. #!/usr/bin/python), then redo will execute your script using that particular interpreter.

Note that this is slightly different from normal Unix execution semantics. redo never execs your script directly; it only looks for the "#!/" line. The main reason for this is so that your .do scripts don't have to be marked executable (chmod +x). Executable .do scripts would suggest to users that they should run them directly, and they shouldn't; .do scripts should always be executed inside an instance of redo, so that dependencies can be tracked correctly.

WARNING: If your .do script is written in Unix sh, we recommend not including the #!/bin/sh line. That's because there are many variations of /bin/sh, and not all of them are POSIX compliant. redo tries pretty hard to find a good default shell that will be "as POSIXy as possible," and if you override it using #!/bin/sh, you lose this benefit and you'll have to worry more about portability.

Can a single .do script generate multiple outputs?

FIXME: Yes, but this is a bit imperfect.

For example, compiling a .java file produces a bunch of .class files, but exactly which files? It depends on the content of the .java file. Ideally, we would like to allow our .do file to compile the .java file, note which .class files were generated, and tell redo about it for dependency checking.

However, this ends up being confusing; if myprog depends on foo.class, we know that foo.class was generated from bar.java only after bar.java has been compiled. But how do you know, the first time someone asks to build myprog, where foo.class is supposed to come from?

So we haven't thought about this enough yet.

Note that it's okay for a .do file to produce targets other than the advertised one; you just have to be careful. You could have a default.javac.do that runs 'javac $2.java', and then have your program depend on a bunch of .javac files. Just be careful not to depend on the .class files themselves, since redo won't know how to regenerate them.
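
Here's a sketch of that default.javac.do idea; recording the produced class files in the .javac target is one possible convention, not a redo requirement:

redo-ifchange $2.java
javac $2.java
ls $2*.class >$3    # roughly: note which .class files appeared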

This feature would also be useful, again, with ./configure: typically running the configure script produces several output files, and it would be nice to declare dependencies on all of them.

Should I use environment variables to affect my build?

Directly using environment variables is a bad idea because you can't declare dependencies on them. Also, if there were a file that contained a set of variables that all your .do scripts need to run, then redo would have to read that file every time it starts (which is frequently, since it's recursive), and that could get slow.

Luckily, there's an alternative. Once you get used to it, this method is actually much better than environment variables, because it runs faster and it's easier to debug.

For example, djb often uses a computer-generated script called compile for compiling a .c file into a .o file. To generate the compile script, we create a file called compile.do:

redo-ifchange config.sh
. ./config.sh
echo "gcc -c -o \$3 \$2.c $CFLAGS" >$3
chmod a+x $3
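
For instance, if config.sh contained a hypothetical line like:

CFLAGS="-O2 -Wall"

then after redo compile, the generated compile file would hold a single command with the flags already baked in:

gcc -c -o $3 $2.c -O2 -Wall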

Then, your default.o.do can simply look like this:

redo-ifchange compile $2.c
./compile $1 $2 $3

This is not only elegant, it's useful too. With make, you always have to print every command it runs to stdout/stderr so you can try to figure out exactly what it was doing; because this gets noisy, some people write Makefiles that deliberately hide the output and print something friendlier, like "Compiling hello.c". But then you have to guess what the compile command looked like.

With redo, the command is a short ./compile invocation, which looks good when printed but is also completely meaningful. Because the script doesn't depend on any environment variables, you can re-run the exact same ./compile command to reproduce its output, or you can look inside the compile file to see exactly what command line is being used.

As a bonus, all the variable expansions only need to be done once: when generating the ./compile program. With make, it would be recalculating expansions every time it compiles a file. Because of the way make does expansions as macros instead of as normal variables, this can be slow.

Example default.o.do for both C and C++ source?

We can upgrade the compile.do from the previous answer to look something like this:

    redo-ifchange config.sh
    . ./config.sh
    cat >$3 <<-EOF
            [ -e "\$2.cc" ] && EXT=.cc || EXT=.c
            gcc -o "\$3" -c "\$2\$EXT" -Wall $CFLAGS
    EOF
    chmod a+x "$3"
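
For example, assuming config.sh set CFLAGS="-O2" (a hypothetical value), the generated compile script would contain:

    [ -e "$2.cc" ] && EXT=.cc || EXT=.c
    gcc -o "$3" -c "$2$EXT" -Wall -O2

Note which variables were escaped in the heredoc: \$2, \$3, and \$EXT expand when compile runs, while $CFLAGS was expanded just once, when compile.do generated the script.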

Isn't it expensive to have ./compile doing this kind of test for every single source file? Not really. Remember, if you have two implicit rules in make:

%.o: %.cc
    gcc ...

%.o: %.c
    gcc ...

Then it has to do all the same checks. Except make has even more implicit rules than that, so it ends up trying and discarding lots of possibilities before it actually builds your program. Is there a %.s? A %.cpp? A %.pas? It needs to look for all of them, and it gets slow. The more implicit rules you have, the slower make gets.

In redo, it's not implicit at all; you're specifying exactly how to decide whether it's a C program or a C++ program, and what to do in each case. Plus you can share the two gcc command lines between the two rules, which is hard in make. (In GNU make you can use macro functions, but the syntax for those is ugly.)

Can I rebuild just part of a project?

Absolutely! Although redo runs "top down" in the sense of one .do file calling into all its dependencies, you can start at any point in the dependency tree that you want.

Unlike recursive make, no matter which subdir of your project you're in when you start, redo will be able to build all the dependencies in the right order.

Unlike non-recursive make, you don't have to jump through any strange hoops (like adding, in each directory, a fake Makefile that does make -C ${TOPDIR} back up to the main non-recursive Makefile). redo just uses filename.do to build filename, or uses default*.do if the specific filename.do doesn't exist.

When running any .do file, redo makes sure its current directory is set to the directory where the .do file is located. That means you can do this:

redo ../utils/foo.o

And it will work exactly like this:

cd ../utils
redo foo.o

In make, if you run

make ../utils/foo.o

it means to look in ./Makefile for a rule called ../utils/foo.o... and it probably doesn't have such a rule. On the other hand, if you run

cd ../utils
make foo.o

it means to look in ../utils/Makefile and look for a rule called foo.o. And that might do something totally different! redo combines these two forms and does the right thing in both cases.

Note: redo will always change to the directory containing the .do file before trying to build it. So if you do

redo ../utils/foo.o

the ../utils/default.o.do file will be run with its current directory set to ../utils. Thus, the .do file's runtime environment is always reliable.

On the other hand, if you had a file called ../default.o.do, but there was no ../utils/default.o.do, redo would select ../default.o.do as the best matching .do file. It would then run with its current directory set to .., and tell default.o.do to create an output file called "utils/foo.o" (that is, foo.o, with a relative path explaining how to find foo.o when you're starting from the directory containing the .do file).

That sounds a lot more complicated than it is. The results are actually very simple: if you have a toplevel default.o.do, then all your .o files will be compiled with $PWD set to the top level, and all the .o filenames passed as relative paths from $PWD. That way, if you use relative paths in -I and -L gcc options (for example), they will always be correct no matter where in the hierarchy your source files are.
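
For instance, a toplevel default.o.do along these lines (the include directory name is invented for illustration) can safely use a relative -I path for every source file in the tree:

redo-ifchange $2.c
gcc -Iinclude -o $3 -c $2.c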

Can I put my .o files in a different directory from my .c files?

Yes. There's nothing in redo that assumes anything about the location of your source files. You can do all sorts of interesting tricks, limited only by your imagination. For example, imagine that you have a toplevel default.o.do that looks like this:

# $1 looks like out/$ARCH/path/to/foo.o; peel out the architecture name
ARCH=${1#out/}
ARCH=${ARCH%%/*}
# $2 is $1 minus the .o extension, so SRC becomes path/to/foo
SRC=${2#out/$ARCH/}
redo-ifchange $SRC.c
$ARCH-gcc -o $3 -c $SRC.c

If you run redo out/i586-mingw32msvc/path/to/foo.o, then the above script would end up running

i586-mingw32msvc-gcc -o $3 -c path/to/foo.c

You could also choose to read the compiler name or options from out/$ARCH/config.sh, or config.$ARCH.sh, or use any other arrangement you want.

You could use the same technique to have separate build directories for out/debug, out/optimized, out/profiled, and so on.
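
Here's a sketch of that flavour-based variant, with invented flag sets:

MODE=${1#out/}
MODE=${MODE%%/*}
SRC=${2#out/$MODE/}
case $MODE in
    debug)     FLAGS="-O0 -g" ;;
    optimized) FLAGS="-O2" ;;
    profiled)  FLAGS="-O2 -pg" ;;
esac
redo-ifchange $SRC.c
gcc $FLAGS -o $3 -c $SRC.c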

Can my filenames have spaces in them?

Yes, unlike with make. For historical reasons, the Makefile syntax doesn't support filenames with spaces; spaces are used to separate one filename from the next, and there's no way to escape these spaces.

Since redo just uses sh, which has working escape characters and quoting, it doesn't have this problem.

Does redo care about the differences between tabs and spaces?

No.

What if my .c file depends on a generated .h file?

This problem arises as follows. foo.c includes config.h, and config.h is created by running ./configure. The second part is easy; just write a config.h.do that depends on the existence of configure (which is created by configure.do, which probably runs autoconf).

The first part, however, is not so easy. Normally, the headers that a C file depends on are detected as part of the compilation process. That works fine if the headers, themselves, don't need to be generated first. But if you do

redo foo.o

there's no way for redo to automatically know that compiling foo.c into foo.o depends on first generating config.h.

Since most .h files are not auto-generated, the easiest thing to do is probably to just add a line like this to your default.o.do:

redo-ifchange config.h

Sometimes a specific solution is much easier than a general one.
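
Putting it together, a default.o.do that handles the generated header might be as simple as this sketch:

redo-ifchange config.h $2.c
gcc -o $3 -c $2.c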

If you really want to solve the general case, djb has a solution for his own projects, which is a simple script that looks through C files to pull out #include lines. He assumes that #include <file.h> is a system header (thus not subject to being built) and #include "file.h" is in the current directory (thus easy to find). Unfortunately this isn't really a complete solution, but at least it would be able to redo-ifchange a required header before compiling a program that requires that header.
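
A rough approximation of that idea (this is not djb's actual script) fits in one pipeline; run from a default.o.do before compiling, it declares a dependency on every locally-quoted header:

sed -n 's/^#include "\(.*\)"/\1/p' $2.c | xargs redo-ifchange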