CakePHP and static analyzers

In this post, I discuss the usefulness of clean coding and static analyzers used to introspect it.
I will also outline a few neat tools and tricks to get there faster.

Static Analyzers

Statically analyzing your code can be super helpful to find obvious bugs early on, even without a single test case.
Type errors, a few basic mistakes below the API can still quite easily be found by such tools introspecting the code from the outside.

In the PHP world, we mainly refer to PHPStan or Psalm here.
They both serve their use cases, aiming at a slightly different focus. Psalm, these days, seems to have left the other one behind, in terms of functionality and depth of introspection. PHPStan seems a bit easier to get started, however.

If in doubt which to chose/try, go with PHPStan. Better cost-income ratio: Most relevant issues solved with minimum effort.
Psalm can still be checked on top afterwards to see if any issues remain that have been missed. For most apps there is only a very slight benefit of doing this, you will also
have to look through quite a bit of false positives and noise to find relevant things.

PHPStan

In terms of PHPStan, your code should usually always reach at least level 5. Ideally, you are even compatible with the stricter checks and other sniffer enhancements of level 7, though. Level 8 is "nullable" topic and often not realistic for larger apps. If you can, even better!

It is usually hard to become cleaner then the core and plugin code you depend on. Cake 5 is on level 8 and probably leading in PHP frameworks on that metric. Most plugins should also be. So from that side you are safe.

Make sure to use the CakePHP extension for PHPStan for better out of the box results for this framework:

composer require --dev phpstan/extension-installer cakedc/cakephp-phpstan

Note that adding the extension-installer makes it also work right away, no further configuration necessary.

Psalm

If you prefer psalm, make sure to set up at least a psalm.xml with some configs.
Your base level should be lower (e.g. errorLevel="4"), and you can work yourself higher piece by piece until you reach level 1.
Psalm is very delayed, sometimes to a fault. Ignore certain "irrelevant" and more noisy checks.

Make sure to use the cakephp-psalm extension here to get better out of the box results for this framework.

Annotations

This is one of the easiest and most important topics: Always keep your (class) annotations up to date.
The IdeHelper plugin takes care of 99% fully automatically.
You only got to do some tweaking and manual adjustments for edge cases.

They provide valuable information not only for the IDE and you, but also for the static analyzers.
As @property/@method tags it sets important meta data for them to know what methods are available, what the possible input and output types are.

Clean Code

Here I mainly mean to code defensive, taking edge cases into consideration. Also to code as strict as possible to avoid issues along the way.
Strict interface contracting, SOLID principles and good type-hinting can help, as well. Interfaces and a certain architecture, however, are often more important for framework or plugin code. For project code you do not have to "over-engineer" certain topics. Here no one will further extend your code, or depend on it.
Also casting can be an important tool prior to passing certain "user inputted" values further into methods.

Be precise in method arguments and return values

This is also quite important here: If you expect "mixed" arguments everywhere, you basically destroy the use of the tools outlined above.
Try to always accept only one type of input, maybe "nullable" (allowing null as 2nd argument type) on top if needed. In PHP 7.1+ world, it would be ?type for type-hinting.
The same goes for return values.

With PHP 8 you can also use union types of sorts. But try to keep the amount of types always to the absolute minimum.

Setters/getters

These are especially hard to get right in CakePHP, as the entities by design are quite dirty to allow simple usage and customization.
All properties can always be null, even if the DB field is not nullable. You could not have queried the field, or hydration here was not complete.
As such, you will always either have the tools complaining here if you annotate it truthfully (type|null) or you will trip over the code in production despite proper validation and tests due to the hidden "lie".

Try to use the getOrFail pattern as much as possible where you expect a not-null return value and otherwise explicitly check on !== null and throw respective exceptions here.

Especially around entities and their fields it is often impossible to have the annotations be "correct" for all cases.
Even "required" fields can be empty (null) if not pulled (selected fields) from the DB. So for cases like these it can be useful to leverage the Shim plugin enhancements:

use Shim\Model\Entity\GetSetTrait;

class MyEntity extends Entity {

    use GetSetTrait;

Now you can use these methods on values that are expected to be present:

$entity->getOrFail('field'); // Cannot be null
$entity->getFieldOrFail(); // Cannot be null

DTOs and typed objects

Try to go away from larger associative arrays in favor of clear objects.
Ideally, those are either value objects or DTOs.

$this->Pr->displayInfo($pullRequest['head']['ref'], $pullRequest['head']['sha']);

becomes

$this->Pr->displayInfo($pullRequestDto->getHead()->getRef(), $pullRequestDto->getHead()->getSha());

Using an API with methods and a clear param/return type helps to validate the expected type.
If e.g. the type on the displayInfo() method here does not match the type of the getter, you can now get warned or the code error early.
With assoc arrays array<string, mixed> you have no such type-safety.

Generics

Whenever you return a collection of something, make sure the element type is set/known, e.g. on the returning method. If this is not possible, use an inline annotation:

/** iterable<\App\Model\Entity\Article> $articles */
$articles = $somewhere->getArticleCollection();

When you now foreach over them, the entity fields are visible and autocompleted, and static analyzers can help here, too.

Issues and workarounds

There are known issues with this rather new feature of generics in most modern IDEs, including PHPStorm (1, 2).

To workaround those, and have it work for both IDE (human devs) as well as analyzers (PHPStan/Psalm), you can also just call it an array for now:

/** array<\App\Model\Entity\Article> $articles */
$articles = $somewhere->getArticleCollection();
foreach (...)

Making mixed/unknown vars visible

Now this is a pro-tip to really deep dive into potentially "hidden" issues.

Your app might appear to be fine in PHPStan level 7/8. Everything green, all good.
But in fact, often times, it is just green because PHPStan ignored certain variables due to it being mixed or unknown.
To make those visible and clarify where an inline annotation is needed for actual protection against regressions or issues, add the following library to your stack:

composer require --dev symplify/phpstan-rules

In your PHPStan file:

rules:
    - Symplify\PHPStanRules\Rules\Explicit\NoMixedPropertyFetcherRule
    - Symplify\PHPStanRules\Rules\Explicit\NoMixedMethodCallerRule

You will now get the feedback on where PHPStan silently ignored certain lines and checks. Once you added the necessary inline annotations you should see the issues being visible again and you can fix up your code some more.

If you cannot fix all issues, you can also comment out those two lines again and only run them occasionally.

Asserts

With assert() there is another alternative to throwing exceptions available.
I mainly use it for theoretical (practically unlikely to impossible) cases, where PHPStan would think a value could be null or some other type that by code flow is not possible.
Instead of checking and throwing exceptions this can be a shortcut here.

Example:

$foreignKey = $relation->getForeignKey(); // Return type says array|string
assert(is_string($foreignKey), 'Not a string');

We know that for our code and relation it can always only return a scalar value.
So the shortcut is fine here.

It also helps developers to catch certain issues early without imposing those same checks in production (sure, often somewhat nano-optimization).

That said: Since those can (and often are) disabled for production, if those are realistic issues there, it will lead to fatal errors with sometimes not clear exception messages.
So for those scenarios it is better to handle these bail-early cases with classic Exceptions and throw meaningful errors messages to be logged. In many cases
those can also be 404s then, instead of 5xx. In my case, those go to a different log handling and thus do not necessarily alert me via message right away.

Composer command

It can be useful to have a composer command ("scripts") for the commands to execute.
For PHPStan, for example, add this to your composer.json file scripts section:

    "scripts": {
        ...
        "stan": "phpstan analyze",

Note: I would not recommend using both for an application, only for a vendor lib/plugin maybe.
The reason is simple: They are reporting similar things and you will have to work twice for some of the workaround or silencing – with little benefit for you.

As noted above, psalm is more fine-grained and detailed, which for most apps can be almost too much. So personally prefer to have it work with PHPStan and am happy with it on level 7 or 8.
It finds almost 99% of the important issues without hours of extra work spent on the topic.

Continuous Integration

Make sure to set up your static analyzer checks as CI based hook.
So not only run it locally before pushing and pull requesting, but also on that PR.
The skeleton app contains an example here you can copy and paste or adjust to your needs.

I for example use Gitlab and the included free Gitlab CI.
It uses a .gitlab-ci.yml in the repository root and basically lists:

    - composer test
    - composer stan
    - composer cs-check

I recommend this order, as the tests are the most important result, the CS issues are the least important. In most pipelines it usually

Limitations

Maybe just to complete the topic, a few things are worth mentioning that it cannot (yet) cover.

The static analyzers usually can not look into the templates and PHP code there.
So here, you need to run classic tests, usually controller based integration tests.

As they only check the type, not the content, don’t rely on them for value based contracts and asserts.
Here, use classic unit tests.

Summing it up

Using most or all of the above will make your code much more robust and bug-free. It will also make any refactoring way easier as side effects will be found out to a much higher percentage and the developer alerted immediately in the PR.
Instead of finding a lot of issues after the fact, those can be taken into consideration and addressed in the same step.

You still need to your tests and test coverage, though 🙂
This just compliments it where they don’t reach. And since coverage is usually between 20-40% (very rarely 60%) for most larger projects, this is usually an important 2nd layer, that comes pretty much for free.

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31