Finding PHP Traits With Exuberant Ctags

Ctags is a programming tool that creates a ‘tags’ file for a set of source-code files. Various editors can use that tags file to let developers jump, for example, to the definition of a function, variable, or class. Personally I prefer to use Exuberant Ctags, a backwards-compatible version of the original with built-in support for more languages. It is also possible to customize Exuberant Ctags, either to expand its support for existing languages or to add entirely new ones.

Today I will show you an example of how to customize Exuberant Ctags so that it creates information for traits in PHP.

The Complete Customization

First you need to find your configuration file for Exuberant Ctags, which is typically the .ctags file in your home directory. Within that file we can add command-line options that the program will apply every time we run it. The option we care about for this task is --regex-php. Let’s start with the complete line to add to our configuration file and then we will break down how it works.

--regex-php=/^[ \t]*trait[ \t]+([a-z0-9_]+)/\1/t,traits/i

With this in our configuration file we can run Exuberant Ctags in a directory of PHP files and it will create tags for any traits. So for example, in GNU Emacs we can now find traits by pressing M-. in the same way we can already find classes. Now let’s examine exactly why this works.

The Breakdown

This is the general structure of the --regex-php option we define:

--regex-php=/regex/tag-name/tag-kind/flags

The first part is the regular expression that Exuberant Ctags uses to find traits. The program uses POSIX extended regular expressions, roughly the dialect that the egrep utility accepts. That means our regular expression, /^[ \t]*trait[ \t]+([a-z0-9_]+)/, matches the following:

  1. Zero or more whitespace characters from the start of the line.

  2. Followed by the trait keyword.

  3. Followed by one or more whitespace characters.

  4. Followed by a sequence of one or more alphanumeric characters or underscores, which we capture as a group. This group is the name of the trait, so we will want access to it later when creating the tag.
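
The four steps above can be tried outside of Ctags. What follows is a minimal sketch in Python, whose re module is not the engine Exuberant Ctags uses but should treat this particular pattern the same way. The sample PHP lines and the helper name find_trait are made up for illustration, and the character class is written as [a-z0-9_] on the assumption that a digit range is what the pattern intends.

```python
import re

# Same shape as the --regex-php pattern; re.IGNORECASE stands in for the i flag.
TRAIT_PATTERN = re.compile(r"^[ \t]*trait[ \t]+([a-z0-9_]+)", re.IGNORECASE)


def find_trait(line):
    """Return the trait name declared on this line, or None if there is none."""
    match = TRAIT_PATTERN.match(line)
    return match.group(1) if match else None


# Hypothetical lines from a PHP source file.
sample_lines = [
    "trait Loggable {",
    "    trait Cache_Aware2",
    "class Loggable {",
    "// a trait is declared below",
]

for line in sample_lines:
    print(repr(line), "->", find_trait(line))
```

Only the first two lines yield a name. The class declaration and the comment fail at step 2 because the trait keyword does not directly follow the leading whitespace.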

The second part of our custom definition is what we want Exuberant Ctags to include in the tags file. Often we only need the name of whatever the tag points to. That is why we simply write \1 for that section. That back-reference refers to the trait name we capture in the initial regular expression. The result is that the name of the trait will appear in the resulting tags file, and that’s all we need.
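
To see what the back-reference does, here is a small Python sketch that expands a tag-name template against a matched line. It only imitates the substitution step; the function name tag_name and the second template are hypothetical and exist purely to show how \1 is filled in.

```python
import re

pattern = re.compile(r"^[ \t]*trait[ \t]+([a-z0-9_]+)", re.IGNORECASE)


def tag_name(line, template=r"\1"):
    """Expand a tag-name template against the line, or return None if no match."""
    match = pattern.match(line)
    return match.expand(template) if match else None


# The plain back-reference yields just the captured trait name.
print(tag_name("trait Loggable {"))
# A made-up template that mixes literal text with the captured group.
print(tag_name("trait Loggable {", r"trait_\1"))
```

With the plain \1 template the result is just the captured name, which is what ends up as the tag.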

Note: The next two parts are optional.

The third part, /t,traits/, is the kind specifier. Exuberant Ctags groups tags into ‘kinds’. Each kind has a single-letter identifier as shorthand and a longer, more informative name. We can see the kind specifiers for any language like so:

$ ctags-exuberant --list-kinds=php
c  classes 
i  interfaces 
d  constant definitions 
f  functions 
v  variables 
v  variables 
j  javascript functions 
j  javascript functions 
j  javascript functions 
t  traits 

Some editors make use of this additional information and some do not. Personally I feel that it is always good to include.

The final part holds any flags for the initial regular expression. In this example we use the i flag to perform case-insensitive matching. That way we do not have to also include capital letters in the regular expression that finds trait names. That is, without the i flag we would have to change ([a-z0-9_]+) into the longer ([a-zA-Z0-9_]+).
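
The effect of the flag is easy to check with a short Python sketch. Here re.IGNORECASE plays the role of the i flag, and the names SHORT, LONG, and captured are my own, as is the sample line. The character classes are written with the digit range 0-9, assuming that is the intent.

```python
import re

SHORT = r"^[ \t]*trait[ \t]+([a-z0-9_]+)"
LONG = r"^[ \t]*trait[ \t]+([a-zA-Z0-9_]+)"

line = "trait Loggable {"  # hypothetical PHP line


def captured(pattern, text, flags=0):
    """Return the first captured group, or None when the pattern does not match."""
    match = re.match(pattern, text, flags)
    return match.group(1) if match else None


results = {
    "short class, no flag": captured(SHORT, line),
    "short class, i flag": captured(SHORT, line, re.IGNORECASE),
    "long class, no flag": captured(LONG, line),
}
for label, name in results.items():
    print(f"{label}: {name}")
```

The short class without the flag finds nothing, because the capital L is not in [a-z0-9_]. The flag also relaxes the keyword itself, so a line written as TRAIT Loggable would match too, which suits PHP since its keywords are case-insensitive.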

Conclusion

I hope that explains well enough how we can expand what Exuberant Ctags knows about a programming language with a fairly simple line of configuration. The truth is we can do much more, such as teaching the program how to create tags for languages it knows nothing about, e.g. Rust, like so:

--langdef=rust
--langmap=rust:.rs
--regex-rust=/[ \t]*fn[ \t]+([a-zA-Z0-9_]+)/\1/f,function/
--regex-rust=/[ \t]*type[ \t]+([a-zA-Z0-9_]+)/\1/T,types/
--regex-rust=/[ \t]*enum[ \t]+([a-zA-Z0-9_]+)/\1/T,types/
--regex-rust=/[ \t]*struct[ \t]+([a-zA-Z0-9_]+)/\1/m,types/
--regex-rust=/[ \t]*class[ \t]+([a-zA-Z0-9_]+)/\1/m,types/
--regex-rust=/[ \t]*mod[ \t]+([a-zA-Z0-9_]+)/\1/m,modules/
--regex-rust=/[ \t]*const[ \t]+([a-zA-Z0-9_]+)/\1/m,consts/
--regex-rust=/[ \t]*trait[ \t]+([a-zA-Z0-9_]+)/\1/m,traits/
--regex-rust=/[ \t]*impl[ \t]+([a-zA-Z0-9_]+)/\1/m,impls/
--regex-rust=/[ \t]*impl[ \t]+of[ \t]([a-zA-Z0-9_]+)/\1/m,impls/
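
As a rough check of what one of these rules picks up, the fn rule can be mirrored in Python. This is only a sketch: the Rust lines are invented, FN_RULE is my own name, and re.search is used because the rule has no ^ anchor and so may match anywhere in a line.

```python
import re

# Mirrors the fn rule above; re.search is used because the rule has no ^ anchor.
FN_RULE = re.compile(r"[ \t]*fn[ \t]+([a-zA-Z0-9_]+)")

# Hypothetical lines from a Rust source file.
rust_lines = [
    "fn main() {",
    "    pub fn parse_args(argv: &[String]) -> Config {",
    "let total = 0;",
]

# Collect the captured function name from every line the rule matches.
names = [m.group(1) for m in map(FN_RULE.search, rust_lines) if m]
print(names)
```

Because the pattern is unanchored, the function declared after pub is found as well as the one at the start of a line.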

The manual for Exuberant Ctags does a good job explaining the logic behind a more complex configuration like this. But if you have any questions don’t hesitate to ask in the article comments. And if you want to test your understanding of what I’ve tried to explain then I suggest configuring Exuberant Ctags to recognize PHP namespaces, another language feature that it does not understand out of the box.
