Ctags is a programming tool that creates a ‘tags’ file for a set of source-code files. Various editors can use that tags file to let developers jump, for example, straight to the definition of a function, variable, or class. Personally I prefer to use Exuberant Ctags, a backwards-compatible version of the original with built-in support for more languages. It is also possible to customize Exuberant Ctags to expand its support for existing languages or even to add entirely new languages.
Today I will show you an example of how to customize Exuberant Ctags so that it creates information for traits in PHP.
The Complete Customization
First you need to find your configuration file for Exuberant Ctags, which is usually the .ctags file in your home directory. Within that file we can add command-line options that the program will apply every time we run it. The option we care about for this task is --regex-php. Let’s start with the complete line to add to our configuration file and then we will break down how it works.
--regex-php=/^[ \t]*trait[ \t]+([a-z0-9_]+)/\1/t,traits/i
With this in our configuration file we can run Exuberant Ctags in a directory of PHP files and it will create tags for any traits. So for example, in GNU Emacs we can now find traits by pressing M-. in the same way we can already find classes. Now let’s examine exactly why this works.
This is the general structure of the --regex-php option we define:
--regex-php=/regexp/replacement/kind-spec/flags
The first part is the regular expression that Exuberant Ctags uses to find traits. The program uses GNU Extended Regular Expressions like the egrep utility. That means our regular expression, /^[ \t]*trait[ \t]+([a-z0-9_]+)/, matches the following:
Zero or more whitespace characters from the start of the line.
Followed by the keyword trait.
Followed by one or more whitespace characters.
Followed by a sequence of one or more alphanumeric characters or underscores, which we capture as a group. This group is the name of the trait, so we will want access to it later when creating the tag.
The second part of our custom definition is what we want Exuberant Ctags to include in the tags file. Often we only need the name of whatever the tag points to. That is why we simply write \1 for that section. That back-reference refers to the trait name we capture in the initial regular expression. The result is that the name of the trait will appear in the resulting tags file, and that’s all we need.
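To watch the pattern and the back-reference work together outside of Exuberant Ctags, here is a minimal sketch in Python. It assumes that Python's re module treats this particular pattern the same way as the extended regular expressions Ctags uses, which holds for something this simple but not for every pattern. The function name trait_tags and the sample PHP source are made up for illustration.

```python
import re

# The same pattern as in the --regex-php option; the trailing i flag
# is expressed here as re.IGNORECASE (an approximation of Ctags' behavior).
TRAIT_PATTERN = re.compile(r"^[ \t]*trait[ \t]+([a-z0-9_]+)", re.IGNORECASE)


def trait_tags(source):
    """Return the captured name for each trait declaration in source."""
    names = []
    for line in source.splitlines():
        match = TRAIT_PATTERN.match(line)
        if match:
            # expand(r"\1") plays the role of the \1 replacement part.
            names.append(match.expand(r"\1"))
    return names


# Hypothetical PHP source, used only to exercise the pattern.
SAMPLE_PHP = """<?php
trait Loggable
{
    public function log($message) {}
}

    Trait Cache_Aware2 {
}

class Report
{
    use Loggable; // a use statement, not a declaration
}
"""

print(trait_tags(SAMPLE_PHP))  # ['Loggable', 'Cache_Aware2']
```

Only the two declarations produce names. The use statement inside the class is ignored because the pattern insists that the trait keyword comes first on the line.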
Note: The next two parts are optional.
The third part, /t,traits/, is the kind specifier. Exuberant Ctags groups tags into ‘kinds’, each named by a single-letter identifier as shorthand together with a longer, more informative name. We can see the kind specifiers for any language like so:
ctags --list-kinds=php
Some editors make use of this additional information and some do not. Personally I feel that it is always good to include.
The final part holds any flags for the initial regular expression. In this example we use the i flag to perform case-insensitive matching. That way we do not have to also include capital letters in the regular expression that finds trait names. That is, without the i flag we would have to change ([a-z0-9_]+) into the longer ([a-zA-Z0-9_]+).
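The effect of the flag is easy to check with a short experiment. The sketch below again uses Python's re module as a stand-in for the Ctags regular expression engine, which is an assumption, and the helper captured_name is a hypothetical name of my own.

```python
import re

# Case-insensitive, as with the trailing i flag.
WITH_FLAG = re.compile(r"^[ \t]*trait[ \t]+([a-z0-9_]+)", re.IGNORECASE)
# Case-sensitive, so the character class lists capital letters itself.
WITHOUT_FLAG = re.compile(r"^[ \t]*trait[ \t]+([a-zA-Z0-9_]+)")
# Case-sensitive with the short class: capital letters are not accepted.
TOO_NARROW = re.compile(r"^[ \t]*trait[ \t]+([a-z0-9_]+)")


def captured_name(pattern, line):
    """Return the trait name the pattern captures, or None if no match."""
    match = pattern.match(line)
    return match.group(1) if match else None


print(captured_name(WITH_FLAG, "trait JsonSerializer"))     # JsonSerializer
print(captured_name(WITHOUT_FLAG, "trait JsonSerializer"))  # JsonSerializer
print(captured_name(TOO_NARROW, "trait JsonSerializer"))    # None
print(captured_name(TOO_NARROW, "trait jsonSerializer"))    # json
```

One difference remains between the two working variants: the flag also makes the keyword itself case-insensitive, so a declaration written as TRAIT Foo is found only by the version with the flag.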
I hope that explains well enough how we can expand Exuberant Ctags’ knowledge about a programming language with a fairly simple line of configuration. The truth is we can do much more, such as teaching the program how to create tags for languages it knows nothing about, e.g. Rust, like so:
--langdef=rust
--langmap=rust:.rs
--regex-rust=/[ \t]*fn[ \t]+([a-zA-Z0-9_]+)/\1/f,function/
--regex-rust=/[ \t]*type[ \t]+([a-zA-Z0-9_]+)/\1/T,types/
--regex-rust=/[ \t]*enum[ \t]+([a-zA-Z0-9_]+)/\1/T,types/
--regex-rust=/[ \t]*struct[ \t]+([a-zA-Z0-9_]+)/\1/m,types/
--regex-rust=/[ \t]*class[ \t]+([a-zA-Z0-9_]+)/\1/m,types/
--regex-rust=/[ \t]*mod[ \t]+([a-zA-Z0-9_]+)/\1/m,modules/
--regex-rust=/[ \t]*const[ \t]+([a-zA-Z0-9_]+)/\1/m,consts/
--regex-rust=/[ \t]*trait[ \t]+([a-zA-Z0-9_]+)/\1/m,traits/
--regex-rust=/[ \t]*impl[ \t]+([a-zA-Z0-9_]+)/\1/m,impls/
--regex-rust=/[ \t]*impl[ \t]+of[ \t]([a-zA-Z0-9_]+)/\1/m,impls/
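If it helps to picture what a set of rules like this does, the rough Python sketch below tries three of the patterns against every line of a file. It is only a model of the idea, under the assumption that each rule is applied to each line independently; it is not how Exuberant Ctags is implemented. The names RULES and scan and the sample Rust source are invented for the example.

```python
import re

# (pattern, kind letter, kind name), copied from three of the
# --regex-rust options. None of them is anchored with ^, so we
# use re.search rather than re.match.
RULES = [
    (r"[ \t]*fn[ \t]+([a-zA-Z0-9_]+)", "f", "function"),
    (r"[ \t]*enum[ \t]+([a-zA-Z0-9_]+)", "T", "types"),
    (r"[ \t]*mod[ \t]+([a-zA-Z0-9_]+)", "m", "modules"),
]


def scan(source):
    """Return (name, kind name, line number) for every rule that matches."""
    tags = []
    for number, line in enumerate(source.splitlines(), start=1):
        for pattern, _letter, kind in RULES:
            match = re.search(pattern, line)
            if match:
                tags.append((match.group(1), kind, number))
    return tags


# Hypothetical Rust source, used only to exercise the rules.
SAMPLE_RUST = """mod shapes {
    enum Shape { Circle, Square }

    fn area(shape: Shape) -> f64 { 0.0 }
}
"""

for tag in scan(SAMPLE_RUST):
    print(tag)
```

Each tuple pairs a captured name with the kind from the rule that found it, which is essentially the information a line in the tags file carries.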
The manual for Exuberant Ctags does a good job explaining the logic behind a more complex configuration like this. But if you have any questions don’t hesitate to ask in the article comments. And if you want to test your understanding of what I’ve tried to explain then I suggest configuring Exuberant Ctags to recognize PHP namespaces, another language feature that it does not understand out of the box.