The TeX core situation
Loading files is an important part of using TeX. At the primitive level,
reading an entire file is done using \input. As many people know, files are
found by a TeX system using the kpathsea library, which means that the
argument to \input should (usually) be the file name alone.
However, it’s often convenient to have files found in subdirectories of a
project: the LaTeX2e \graphicspath command is perhaps the classic example
where this is used. Looking in multiple places means having an approach to
searching for files. The same idea comes up again with graphics whenever you
use \includegraphics: most of the time, you don’t give the file extension but
rather let (La)TeX do some searching.
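As a small illustration of both kinds of searching, here is a minimal sketch using the graphicx package. The directory names and the file name diagram are hypothetical, chosen only for the example.

```tex
\documentclass{article}
\usepackage{graphicx}
% Hypothetical project layout: look in these subdirectories, in order.
% Each path is in its own brace group and ends with a slash.
\graphicspath{{figures/}{figures/plots/}}
\begin{document}
% No extension given: LaTeX tries the extensions known to the driver
% in each of the directories listed above.
\includegraphics{diagram}
\end{document}
```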
At the same time as this need to search for existing files, there’s the issue
of what happens when a file is missing. The \input primitive is pretty
unforgiving if the file is not found, and there are lots of times we want to
‘use this file’ only if it actually exists, or to retain control of the error
state if a file is missing.
Classical TeX offers one way to check for files before trying to input them.
That’s done by using the \openin primitive to open the file, then using an
\ifeof test to see if we have reached the end of the file. That works because
a non-existent file gives an immediate end-of-file for a read (\openin), but
does not lead to an error (in contrast to \input). The downside to this
approach is that it performs an assignment, so it is not usable in an
expansion context.
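The classical test can be sketched in plain TeX as follows. The names \checkread and \inputifexists, and the file name optional-settings, are made up for this example. Because \openin is not expandable, the macro cannot be used inside an \edef.

```tex
% Plain TeX sketch; \checkread and \inputifexists are hypothetical names.
\newread\checkread
\def\inputifexists#1{%
  \openin\checkread=#1
  \ifeof\checkread
    % A missing file reports end-of-file at once, with no error raised
    \message{(#1 not found, skipping)}%
  \else
    \closein\checkread
    \input #1
  \fi
}
\inputifexists{optional-settings}
\bye
```

The line ends after the file name in the \openin and \input lines are deliberate: each becomes a space inside the definition, which terminates the file name.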
Some years ago, pdfTeX introduced a number of ‘file information’ primitives,
including \pdffilesize. This takes a file name, and expands to the size of
the file. Importantly, it works without error with a non-existent file, and
expands to nothing at all. That means that it can be used to know if a file
exists: any value at all means that it does. As the primitive works by
expansion, it also can be used anywhere in TeX.
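To make the idea concrete, here is a minimal sketch of an expandable existence test for plain pdfTeX. It is not the expl3 implementation: the macro names \fileexistsTF, \firstoftwo and \secondoftwo are hypothetical, and story.tex is just a placeholder file name.

```tex
% Plain pdfTeX sketch; all macro names here are hypothetical.
\long\def\firstoftwo#1#2{#1}
\long\def\secondoftwo#1#2{#2}
\def\fileexistsTF#1{%
  % \if expands \pdffilesize before comparing. A missing file leaves
  % \relax\relax (test true); any digit makes the test false.
  \if\relax\pdffilesize{#1}\relax
    \expandafter\secondoftwo
  \else
    \expandafter\firstoftwo
  \fi
}
% Works inside \edef because nothing here performs an assignment
\edef\status{\fileexistsTF{story.tex}{found}{missing}}
\message{story.tex: \status}
\bye
```

An existing but empty file still counts as found, since its size expands to 0 rather than to nothing.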
Searching in expl3
For TeX Live 2019, the LaTeX team did some work to bring primitives into line
between XeTeX and other engines. That means that we can now look to exploit
\pdffilesize as a way to find files and add new functionality. (The team
looking after pTeX and upTeX had already added \pdffilesize.)
I’ve just sent an update of expl3 to CTAN which uses this new approach to
file finding.
There’s more to it than just changing the file opening primitive. To do a
search for different paths, we need to be able to check one at a time. The
older expl3 code checks each possible path using an assignment: again, not
allowed in an expansion context. So I’ve re-written all of the search code to
work by expansion: tricky but workable.
This means we have some new goodies: things like \file_size:n which can
be used inside an x-type expansion (\edef) to give the size of a file
even if it is not on the standard search path. Of course, being expl3
code, everything still handles spaces-in-filenames and active characters
correctly.
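A short sketch of how this might look in use, assuming an engine where \pdffilesize (or its equivalent) is available. The directory data/ and the file results.csv are hypothetical, and adding the directory to \l_file_search_path_seq is my assumption about how a non-standard location would be made searchable.

```tex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Hypothetical project layout: data files live in ./data/
\seq_put_right:Nn \l_file_search_path_seq { data/ }
% x-type expansion: \file_size:n is expanded while setting the variable
\tl_set:Nx \l_tmpa_tl { \file_size:n { results.csv } }
\iow_term:x { results.csv~size:~ \tl_use:N \l_tmpa_tl }
\ExplSyntaxOff
\begin{document}
\end{document}
```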
Future plans
At present, where \pdffilesize is not available we will still fall back on
the older code, so not everything can be done by expansion. However, in the
near(ish) future we will likely make \pdffilesize a required primitive for
expl3. At that point, some other code can be made expandable, most
obviously \file_if_exist:n(TF). That will lead to some changes in the minimal
engine versions: more news as and when a change happens.
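For reference, a typical use of \file_if_exist:nTF looks something like the sketch below. The file name local-settings.tex is hypothetical. Used this way, as a standalone statement, it works whether or not the test is expandable.

```tex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Load optional settings only if the (hypothetical) file is present
\file_if_exist:nTF { local-settings.tex }
  { \file_input:n { local-settings.tex } }
  { \iow_term:n { No~local~settings~found } }
\ExplSyntaxOff
\begin{document}
\end{document}
```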